From patchwork Thu Jun 16 06:54:16 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ira Rosen
X-Patchwork-Id: 1963
Date: Thu, 16 Jun 2011 09:54:16 +0300
Subject: [patch] Another enhancement of widen-mult in the vectorizer
From: Ira Rosen
To: gcc-patches@gcc.gnu.org
Cc: Patch Tracking, irar@il.ibm.com

Hi,

For

  unsigned char in[N];
  int out[N];
  for (i = 0; i < N; i++)
    out[i] = in[i] * 300;

in[i] is first promoted to int and then multiplied by 300. This
over-promotion prevents the vectorizer from using the widen-mult pattern
here. This patch checks whether the constant fits an intermediate type
(short in this example) and, if so, generates a widen-mult operation on
that type. I.e., the following sequence:

  type a_t;
  TYPE a_T, prod_T;

  S1  a_t = ;
  S3  a_T = (TYPE) a_t;
  S5  prod_T = a_T * CONST;

is marked as:

  type a_t;
  interm_type a_it;
  TYPE a_T, prod_T, prod_T';

  S1  a_t = ;
  S3  a_T = (TYPE) a_t;
        '--> a_it = (interm_type) a_t;
  S5  prod_T = a_T * CONST;
        '--> prod_T' = a_it w* CONST;

by the pattern detection (and later vectorized using the new statements).

Bootstrapped and tested on powerpc64-suse-linux.
Comments are welcome.

Thanks,
Ira

ChangeLog:

	* tree-vectorizer.h (vect_recog_func_ptr): Change the first
	argument to be a VEC of statements.
	* tree-vect-loop.c (vect_determine_vectorization_factor): Remove the
	assert that pattern statements have to have their vector type set.
	* tree-vect-patterns.c (langhooks.h): Include.
	(vect_recog_widen_sum_pattern): Change the first argument to be a VEC
	of statements.  Update documentation.
	(vect_recog_dot_prod_pattern, vect_recog_pow_pattern): Likewise.
	(vect_recog_widen_mult_pattern): Likewise and support multiplication
	by a constant that fits an intermediate type.  Use int_fits_type_p
	instead of comparing to types max and min values.
	(vect_pattern_recog_1): Update vect_recog_func_ptr and its
	call.  Handle additional pattern statements if necessary.
	* Makefile.in (tree-vect-patterns.c): Add dependency on langhooks.h.

testsuite/ChangeLog:

	* gcc.dg/vect/vect-widen-mult-half-u8.c: New test.

Index: testsuite/gcc.dg/vect/vect-widen-mult-half-u8.c
===================================================================
--- testsuite/gcc.dg/vect/vect-widen-mult-half-u8.c (revision 0)
+++ testsuite/gcc.dg/vect/vect-widen-mult-half-u8.c (revision 0)
@@ -0,0 +1,59 @@
+/* { dg-require-effective-target vect_int } */
+
+#include "tree-vect.h"
+#include <stdlib.h>
+
+#define N 32
+#define COEF 32470
+
+unsigned char in[N];
+int out[N];
+
+__attribute__ ((noinline)) void
+foo ()
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    out[i] = in[i] * COEF;
+}
+
+__attribute__ ((noinline)) void
+bar ()
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    out[i] = COEF * in[i];
+}
+
+int main (void)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    {
+      in[i] = i;
+      __asm__ volatile ("");
+    }
+
+  foo ();
+
+  for (i = 0; i < N; i++)
+    if (out[i] != in[i] * COEF)
+      abort ();
+
+  bar ();
+
+  for (i = 0; i < N; i++)
+    if (out[i] != in[i] * COEF)
+      abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 2 "vect" { target vect_widen_mult_hi_to_si } } } */
+/* { dg-final { scan-tree-dump-times "vect_recog_widen_mult_pattern: detected" 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
+/* { dg-final { scan-tree-dump-times "pattern recognized" 2 "vect" { target vect_widen_mult_hi_to_si_pattern } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
+
Index: tree-vectorizer.h
===================================================================
--- tree-vectorizer.h (revision 175073)
+++ tree-vectorizer.h (working copy)
@@ -896,7 +896,7 @@ extern void vect_slp_transform_bb (basic_block);
 /* Pattern recognition functions.
    Additional pattern recognition functions can (and will) be added
    in the future.  */
-typedef gimple (* vect_recog_func_ptr) (gimple *, tree *, tree *);
+typedef gimple (* vect_recog_func_ptr) (VEC (gimple, heap) **, tree *, tree *);
 #define NUM_PATTERNS 4
 void vect_pattern_recog (loop_vec_info);
 
Index: tree-vect-loop.c
===================================================================
--- tree-vect-loop.c (revision 175074)
+++ tree-vect-loop.c (working copy)
@@ -311,9 +311,7 @@ vect_determine_vectorization_factor (loop_vec_info
             }
           else
             {
-              gcc_assert (!STMT_VINFO_DATA_REF (stmt_info)
-                          && !is_pattern_stmt_p (stmt_info));
-
+              gcc_assert (!STMT_VINFO_DATA_REF (stmt_info));
               scalar_type = TREE_TYPE (gimple_get_lhs (stmt));
               if (vect_print_dump_info (REPORT_DETAILS))
                 {
Index: tree-vect-patterns.c
===================================================================
--- tree-vect-patterns.c (revision 175074)
+++ tree-vect-patterns.c (working copy)
@@ -37,12 +37,16 @@ along with GCC; see the file COPYING3. If not see
 #include "tree-vectorizer.h"
 #include "recog.h"
 #include "diagnostic-core.h"
+#include "langhooks.h"
 
 /* Pattern recognition functions  */
-static gimple vect_recog_widen_sum_pattern (gimple *, tree *, tree *);
-static gimple vect_recog_widen_mult_pattern (gimple *, tree *, tree *);
-static gimple vect_recog_dot_prod_pattern (gimple *, tree *, tree *);
-static gimple vect_recog_pow_pattern (gimple *, tree *, tree *);
+static gimple vect_recog_widen_sum_pattern (VEC (gimple, heap) **, tree *,
+                                            tree *);
+static gimple vect_recog_widen_mult_pattern (VEC (gimple, heap) **, tree *,
+                                             tree *);
+static gimple vect_recog_dot_prod_pattern (VEC (gimple, heap) **, tree *,
+                                           tree *);
+static gimple vect_recog_pow_pattern (VEC (gimple, heap) **, tree *, tree *);
 static vect_recog_func_ptr vect_vect_recog_func_ptrs[NUM_PATTERNS] = {
         vect_recog_widen_mult_pattern,
         vect_recog_widen_sum_pattern,
@@ -142,9 +146,9 @@ vect_recog_temp_ssa_var (tree type, gimple stmt)
 
    Input:
 
-   * LAST_STMT: A stmt from which the pattern search begins.  In the example,
-   when this function is called with S7, the pattern {S3,S4,S5,S6,S7} will be
-   detected.
+   * STMTS: Contains a stmt from which the pattern search begins.  In the
+   example, when this function is called with S7, the pattern {S3,S4,S5,S6,S7}
+   will be detected.
 
    Output:
 
@@ -165,12 +169,13 @@ vect_recog_temp_ssa_var (tree type, gimple stmt)
          inner-loop nested in an outer-loop that us being vectorized).  */
 
 static gimple
-vect_recog_dot_prod_pattern (gimple *last_stmt, tree *type_in, tree *type_out)
+vect_recog_dot_prod_pattern (VEC (gimple, heap) **stmts, tree *type_in,
+                             tree *type_out)
 {
-  gimple stmt;
+  gimple stmt, last_stmt = VEC_index (gimple, *stmts, 0);
   tree oprnd0, oprnd1;
   tree oprnd00, oprnd01;
-  stmt_vec_info stmt_vinfo = vinfo_for_stmt (*last_stmt);
+  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
   tree type, half_type;
   gimple pattern_stmt;
   tree prod_type;
@@ -178,10 +183,10 @@ static gimple
   struct loop *loop = LOOP_VINFO_LOOP (loop_info);
   tree var;
 
-  if (!is_gimple_assign (*last_stmt))
+  if (!is_gimple_assign (last_stmt))
     return NULL;
 
-  type = gimple_expr_type (*last_stmt);
+  type = gimple_expr_type (last_stmt);
 
   /* Look for the following pattern
           DX = (TYPE1) X;
@@ -207,7 +212,7 @@ static gimple
   /* Starting from LAST_STMT, follow the defs of its uses in search
      of the above pattern.  */
 
-  if (gimple_assign_rhs_code (*last_stmt) != PLUS_EXPR)
+  if (gimple_assign_rhs_code (last_stmt) != PLUS_EXPR)
     return NULL;
 
   if (STMT_VINFO_IN_PATTERN_P (stmt_vinfo))
@@ -228,12 +233,12 @@ static gimple
 
   if (STMT_VINFO_DEF_TYPE (stmt_vinfo) != vect_reduction_def)
     return NULL;
-  oprnd0 = gimple_assign_rhs1 (*last_stmt);
-  oprnd1 = gimple_assign_rhs2 (*last_stmt);
+  oprnd0 = gimple_assign_rhs1 (last_stmt);
+  oprnd1 = gimple_assign_rhs2 (last_stmt);
   if (!types_compatible_p (TREE_TYPE (oprnd0), type)
       || !types_compatible_p (TREE_TYPE (oprnd1), type))
     return NULL;
-  stmt = *last_stmt;
+  stmt = last_stmt;
 
   if (widened_name_p (oprnd0, stmt, &half_type, &def_stmt, true))
     {
@@ -244,7 +249,7 @@ static gimple
       half_type = type;
     }
 
-  /* So far so good.  Since *last_stmt was detected as a (summation) reduction,
+  /* So far so good.  Since last_stmt was detected as a (summation) reduction,
      we know that oprnd1 is the reduction variable (defined by a loop-header
      phi), and oprnd0 is an ssa-name defined by a stmt in the loop body.
      Left to check that oprnd0 is defined by a (widen_)mult_expr  */
@@ -319,7 +324,7 @@ static gimple
 
   /* We don't allow changing the order of the computation in the inner-loop
      when doing outer-loop vectorization.  */
-  gcc_assert (!nested_in_vect_loop_p (loop, *last_stmt));
+  gcc_assert (!nested_in_vect_loop_p (loop, last_stmt));
 
   return pattern_stmt;
 }
@@ -361,12 +366,31 @@ static gimple
      S3  a_T = (TYPE) a_t;
      S5  prod_T = a_T * CONST;
 
-    Input:
+   A special case of multiplication by constants is when 'TYPE' is 4 times
+   bigger than 'type', but CONST fits an intermediate type 2 times smaller
+   than 'TYPE'.  In that case we create an additional pattern stmt for S3
+   to create a variable of the intermediate type, and perform widen-mult
+   on the intermediate type as well:
 
-    * LAST_STMT: A stmt from which the pattern search begins.  In the example,
-    when this function is called with S5, the pattern {S3,S4,S5,(S6)} is
-    detected.
+     type a_t;
+     interm_type a_it;
+     TYPE a_T, prod_T, prod_T';
+     S1  a_t = ;
+     S3  a_T = (TYPE) a_t;
+           '--> a_it = (interm_type) a_t;
+     S5  prod_T = a_T * CONST;
+           '--> prod_T' = a_it w* CONST;
+
+   Input/Output:
+
+   * STMTS: Contains a stmt from which the pattern search begins.  In the
+   example, when this function is called with S5, the pattern {S3,S4,S5,(S6)}
+   is detected.  In case of unsigned widen-mult, the original stmt (S5) is
+   replaced with S6 in STMTS.  In case of multiplication by a constant
+   of an intermediate type (the last case above), STMTS also contains S3
+   (inserted before S5).
+
 
    Output:
 
    * TYPE_IN: The type of the input arguments to the pattern.
@@ -379,10 +403,10 @@ static gimple
 */
 
 static gimple
-vect_recog_widen_mult_pattern (gimple *last_stmt,
-                               tree *type_in,
-                               tree *type_out)
+vect_recog_widen_mult_pattern (VEC (gimple, heap) **stmts,
+                               tree *type_in, tree *type_out)
 {
+  gimple last_stmt = VEC_pop (gimple, *stmts);
   gimple def_stmt0, def_stmt1;
   tree oprnd0, oprnd1;
   tree type, half_type0, half_type1;
@@ -394,28 +418,30 @@ static gimple
   int dummy_int;
   VEC (tree, heap) *dummy_vec;
   bool op0_ok, op1_ok;
+  tree new_type = NULL_TREE, new_oprnd0, new_oprnd1, tmp;
+  gimple new_stmt;
 
-  if (!is_gimple_assign (*last_stmt))
+  if (!is_gimple_assign (last_stmt))
     return NULL;
 
-  type = gimple_expr_type (*last_stmt);
+  type = gimple_expr_type (last_stmt);
 
   /* Starting from LAST_STMT, follow the defs of its uses in search
      of the above pattern.  */
 
-  if (gimple_assign_rhs_code (*last_stmt) != MULT_EXPR)
+  if (gimple_assign_rhs_code (last_stmt) != MULT_EXPR)
     return NULL;
 
-  oprnd0 = gimple_assign_rhs1 (*last_stmt);
-  oprnd1 = gimple_assign_rhs2 (*last_stmt);
+  oprnd0 = gimple_assign_rhs1 (last_stmt);
+  oprnd1 = gimple_assign_rhs2 (last_stmt);
   if (!types_compatible_p (TREE_TYPE (oprnd0), type)
       || !types_compatible_p (TREE_TYPE (oprnd1), type))
     return NULL;
 
   /* Check argument 0.  */
-  op0_ok = widened_name_p (oprnd0, *last_stmt, &half_type0, &def_stmt0, false);
+  op0_ok = widened_name_p (oprnd0, last_stmt, &half_type0, &def_stmt0, false);
   /* Check argument 1.  */
-  op1_ok = widened_name_p (oprnd1, *last_stmt, &half_type1, &def_stmt1, false);
+  op1_ok = widened_name_p (oprnd1, last_stmt, &half_type1, &def_stmt1, false);
 
   /* In case of multiplication by a constant one of the operands may not match
      the pattern, but not both.  */
@@ -430,27 +456,118 @@ static gimple
   else if (!op0_ok)
     {
       if (CONSTANT_CLASS_P (oprnd0)
-          && TREE_CODE (half_type1) == INTEGER_TYPE
-          && tree_int_cst_lt (oprnd0, TYPE_MAXVAL (half_type1))
-          && tree_int_cst_lt (TYPE_MINVAL (half_type1), oprnd0))
-        {
-          /* OPRND0 is a constant of HALF_TYPE1.  */
-          half_type0 = half_type1;
-          oprnd1 = gimple_assign_rhs1 (def_stmt1);
+          && TREE_CODE (half_type1) == INTEGER_TYPE)
+        {
+          if (int_fits_type_p (oprnd0, half_type1))
+            {
+              /* OPRND0 is a constant of HALF_TYPE1.  */
+              half_type0 = half_type1;
+              oprnd1 = gimple_assign_rhs1 (def_stmt1);
+            }
+          else if (TYPE_PRECISION (type) >= (TYPE_PRECISION (half_type1) * 4)
+                   && vinfo_for_stmt (def_stmt1))
+            {
+              /* TYPE is 4 times bigger than HALF_TYPE1, try widen-mult for
+                 a type 2 times bigger than HALF_TYPE1.  */
+              new_type = lang_hooks.types.type_for_size (
+                                     TYPE_PRECISION (type) / 2,
+                                     TYPE_UNSIGNED (type));
+              if (!int_fits_type_p (oprnd0, new_type))
+                return NULL;
+
+              /* Use NEW_TYPE for widen_mult.  */
+              half_type0 = half_type1 = new_type;
+              if (STMT_VINFO_RELATED_STMT (vinfo_for_stmt (def_stmt1)))
+                {
+                  new_stmt = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (
+                                                              def_stmt1));
+                  /* Check if the already created pattern stmt is what we
+                     need.  */
+                  if (!is_gimple_assign (new_stmt)
+                      || gimple_assign_rhs_code (new_stmt) != NOP_EXPR
+                      || TREE_TYPE (gimple_assign_lhs (new_stmt)) != new_type)
+                    return NULL;
+
+                  oprnd1 = gimple_assign_lhs (new_stmt);
+                }
+              else
+                {
+                  /* Create a_it = (NEW_TYPE) a_t;  */
+                  oprnd1 = gimple_assign_rhs1 (def_stmt1);
+                  tmp = create_tmp_var (new_type, NULL);
+                  add_referenced_var (tmp);
+                  new_oprnd1 = make_ssa_name (tmp, NULL);
+                  new_stmt = gimple_build_assign_with_ops (NOP_EXPR,
+                                                 new_oprnd1, oprnd1, NULL_TREE);
+                  SSA_NAME_DEF_STMT (new_oprnd1) = new_stmt;
+                  STMT_VINFO_RELATED_STMT (vinfo_for_stmt (def_stmt1))
+                    = new_stmt;
+                  VEC_safe_push (gimple, heap, *stmts, def_stmt1);
+                  oprnd1 = new_oprnd1;
+                }
+            }
+          else
+            return NULL;
         }
+
       else
         return NULL;
     }
   else if (!op1_ok)
     {
       if (CONSTANT_CLASS_P (oprnd1)
-          && TREE_CODE (half_type0) == INTEGER_TYPE
-          && tree_int_cst_lt (oprnd1, TYPE_MAXVAL (half_type0))
-          && tree_int_cst_lt (TYPE_MINVAL (half_type0), oprnd1))
+          && TREE_CODE (half_type0) == INTEGER_TYPE)
         {
-          /* OPRND1 is a constant of HALF_TYPE0.  */
-          half_type1 = half_type0;
-          oprnd0 = gimple_assign_rhs1 (def_stmt0);
+          if (int_fits_type_p (oprnd1, half_type0))
+            {
+              /* OPRND1 is a constant of HALF_TYPE0.  */
+              half_type1 = half_type0;
+              oprnd0 = gimple_assign_rhs1 (def_stmt0);
+            }
+          else if (TYPE_PRECISION (type) >= (TYPE_PRECISION (half_type0) * 4)
+                   && vinfo_for_stmt (def_stmt0))
+            {
+              /* TYPE is 4 times bigger than HALF_TYPE0, try widen-mult for
+                 a type 2 times bigger than HALF_TYPE0.  */
+              new_type = lang_hooks.types.type_for_size (
+                                     TYPE_PRECISION (type) / 2,
+                                     TYPE_UNSIGNED (type));
+              if (!int_fits_type_p (oprnd1, new_type))
+                return NULL;
+
+              /* Use NEW_TYPE for widen_mult.  */
+              half_type0 = half_type1 = new_type;
+              if (STMT_VINFO_RELATED_STMT (vinfo_for_stmt (def_stmt0)))
+                {
+                  new_stmt = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (
+                                                              def_stmt0));
+                  /* Check if the already created pattern stmt is what we
+                     need.  */
+                  if (!is_gimple_assign (new_stmt)
+                      || gimple_assign_rhs_code (new_stmt) != NOP_EXPR
+                      || TREE_TYPE (gimple_assign_lhs (new_stmt)) != new_type)
+                    return NULL;
+
+                  oprnd0 = gimple_assign_lhs (new_stmt);
+                }
+              else
+                {
+                  /* Create a_it = (NEW_TYPE) a_t;  */
+                  oprnd0 = gimple_assign_rhs1 (def_stmt0);
+                  tmp = create_tmp_var (new_type, NULL);
+                  add_referenced_var (tmp);
+                  new_oprnd0 = make_ssa_name (tmp, NULL);
+                  new_stmt = gimple_build_assign_with_ops (NOP_EXPR,
+                                                 new_oprnd0, oprnd0, NULL_TREE);
+                  SSA_NAME_DEF_STMT (new_oprnd0) = new_stmt;
+                  STMT_VINFO_RELATED_STMT (vinfo_for_stmt (def_stmt0))
+                    = new_stmt;
+                  VEC_safe_push (gimple, heap, *stmts, def_stmt0);
+                  oprnd0 = new_oprnd0;
+                }
+            }
+          else
+            return NULL;
         }
       else
         return NULL;
@@ -461,7 +578,7 @@ static gimple
      Use unsigned TYPE as the type for WIDEN_MULT_EXPR.  */
   if (TYPE_UNSIGNED (type) != TYPE_UNSIGNED (half_type0))
     {
-      tree lhs = gimple_assign_lhs (*last_stmt), use_lhs;
+      tree lhs = gimple_assign_lhs (last_stmt), use_lhs;
       imm_use_iterator imm_iter;
       use_operand_p use_p;
       int nuses = 0;
@@ -489,7 +606,7 @@ static gimple
         return NULL;
 
       type = use_type;
-      *last_stmt = use_stmt;
+      last_stmt = use_stmt;
     }
 
   if (!types_compatible_p (half_type0, half_type1))
@@ -504,7 +621,7 @@ static gimple
   vectype_out = get_vectype_for_scalar_type (type);
   if (!vectype
       || !vectype_out
-      || !supportable_widening_operation (WIDEN_MULT_EXPR, *last_stmt,
+      || !supportable_widening_operation (WIDEN_MULT_EXPR, last_stmt,
                                           vectype_out, vectype,
                                           &dummy, &dummy, &dummy_code,
                                           &dummy_code, &dummy_int, &dummy_vec))
@@ -522,6 +639,7 @@ static gimple
   if (vect_print_dump_info (REPORT_DETAILS))
     print_gimple_stmt (vect_dump, pattern_stmt, 0, TDF_SLIM);
 
+  VEC_safe_push (gimple, heap, *stmts, last_stmt);
   return pattern_stmt;
 }
 
@@ -553,16 +671,18 @@ static gimple
 */
 
 static gimple
-vect_recog_pow_pattern (gimple *last_stmt, tree *type_in, tree *type_out)
+vect_recog_pow_pattern (VEC (gimple, heap) **stmts, tree *type_in,
+                        tree *type_out)
 {
+  gimple last_stmt = VEC_index (gimple, *stmts, 0);
   tree fn, base, exp = NULL;
   gimple stmt;
   tree var;
 
-  if (!is_gimple_call (*last_stmt) || gimple_call_lhs (*last_stmt) == NULL)
+  if (!is_gimple_call (last_stmt) || gimple_call_lhs (last_stmt) == NULL)
     return NULL;
 
-  fn = gimple_call_fndecl (*last_stmt);
+  fn = gimple_call_fndecl (last_stmt);
   if (fn == NULL_TREE || DECL_BUILT_IN_CLASS (fn) != BUILT_IN_NORMAL)
    return NULL;
 
@@ -572,8 +692,8 @@ static gimple
     case BUILT_IN_POWI:
     case BUILT_IN_POWF:
     case BUILT_IN_POW:
-      base = gimple_call_arg (*last_stmt, 0);
-      exp = gimple_call_arg (*last_stmt, 1);
+      base = gimple_call_arg (last_stmt, 0);
+      exp = gimple_call_arg (last_stmt, 1);
       if (TREE_CODE (exp) != REAL_CST
           && TREE_CODE (exp) != INTEGER_CST)
         return NULL;
@@ -665,21 +785,22 @@ static gimple
          inner-loop nested in an outer-loop that us being vectorized).  */
 
 static gimple
-vect_recog_widen_sum_pattern (gimple *last_stmt, tree *type_in, tree *type_out)
+vect_recog_widen_sum_pattern (VEC (gimple, heap) **stmts, tree *type_in,
+                              tree *type_out)
 {
-  gimple stmt;
+  gimple stmt, last_stmt = VEC_index (gimple, *stmts, 0);
   tree oprnd0, oprnd1;
-  stmt_vec_info stmt_vinfo = vinfo_for_stmt (*last_stmt);
+  stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
   tree type, half_type;
   gimple pattern_stmt;
   loop_vec_info loop_info = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
   struct loop *loop = LOOP_VINFO_LOOP (loop_info);
   tree var;
 
-  if (!is_gimple_assign (*last_stmt))
+  if (!is_gimple_assign (last_stmt))
     return NULL;
 
-  type = gimple_expr_type (*last_stmt);
+  type = gimple_expr_type (last_stmt);
 
   /* Look for the following pattern
           DX = (TYPE) X;
@@ -691,25 +812,25 @@ static gimple
   /* Starting from LAST_STMT, follow the defs of its uses in search
      of the above pattern.  */
 
-  if (gimple_assign_rhs_code (*last_stmt) != PLUS_EXPR)
+  if (gimple_assign_rhs_code (last_stmt) != PLUS_EXPR)
     return NULL;
 
   if (STMT_VINFO_DEF_TYPE (stmt_vinfo) != vect_reduction_def)
     return NULL;
 
-  oprnd0 = gimple_assign_rhs1 (*last_stmt);
-  oprnd1 = gimple_assign_rhs2 (*last_stmt);
+  oprnd0 = gimple_assign_rhs1 (last_stmt);
+  oprnd1 = gimple_assign_rhs2 (last_stmt);
   if (!types_compatible_p (TREE_TYPE (oprnd0), type)
       || !types_compatible_p (TREE_TYPE (oprnd1), type))
     return NULL;
 
-  /* So far so good.  Since *last_stmt was detected as a (summation) reduction,
+  /* So far so good.  Since last_stmt was detected as a (summation) reduction,
      we know that oprnd1 is the reduction variable (defined by a loop-header
      phi), and oprnd0 is an ssa-name defined by a stmt in the loop body.
      Left to check that oprnd0 is defined by a cast from type 'type' to type
      'TYPE'.  */
 
-  if (!widened_name_p (oprnd0, *last_stmt, &half_type, &stmt, true))
+  if (!widened_name_p (oprnd0, last_stmt, &half_type, &stmt, true))
     return NULL;
 
   oprnd0 = gimple_assign_rhs1 (stmt);
@@ -730,7 +851,7 @@ static gimple
 
   /* We don't allow changing the order of the computation in the inner-loop
      when doing outer-loop vectorization.  */
-  gcc_assert (!nested_in_vect_loop_p (loop, *last_stmt));
+  gcc_assert (!nested_in_vect_loop_p (loop, last_stmt));
 
   return pattern_stmt;
 }
@@ -760,7 +881,7 @@ static gimple
 
 static void
 vect_pattern_recog_1 (
-        gimple (* vect_recog_func) (gimple *, tree *, tree *),
+        gimple (* vect_recog_func) (VEC (gimple, heap) **, tree *, tree *),
         gimple_stmt_iterator si)
 {
   gimple stmt = gsi_stmt (si), pattern_stmt;
@@ -772,12 +893,14 @@ vect_pattern_recog_1 (
   enum tree_code code;
   int i;
   gimple next;
+  VEC (gimple, heap) *stmts_to_replace = VEC_alloc (gimple, heap, 1);
 
-  pattern_stmt = (* vect_recog_func) (&stmt, &type_in, &type_out);
+  VEC_quick_push (gimple, stmts_to_replace, stmt);
+  pattern_stmt = (* vect_recog_func) (&stmts_to_replace, &type_in, &type_out);
   if (!pattern_stmt)
     return;
 
-  si = gsi_for_stmt (stmt);
+  stmt = VEC_last (gimple, stmts_to_replace);
   stmt_info = vinfo_for_stmt (stmt);
   loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
 
@@ -847,6 +970,35 @@ vect_pattern_recog_1 (
   FOR_EACH_VEC_ELT (gimple, LOOP_VINFO_REDUCTIONS (loop_vinfo), i, next)
     if (next == stmt)
       VEC_ordered_remove (gimple, LOOP_VINFO_REDUCTIONS (loop_vinfo), i);
+
+  /* In case of widen-mult by a constant, it is possible that an additional
+     pattern stmt is created and inserted in STMTS_TO_REPLACE.  We create a
+     stmt_info for it, and mark the relevant statements.  */
+  for (i = 0; VEC_iterate (gimple, stmts_to_replace, i, stmt)
+              && (unsigned) i < (VEC_length (gimple, stmts_to_replace) - 1);
+       i++)
+    {
+      stmt_info = vinfo_for_stmt (stmt);
+      pattern_stmt = STMT_VINFO_RELATED_STMT (stmt_info);
+      if (vect_print_dump_info (REPORT_DETAILS))
+        {
+          fprintf (vect_dump, "additional pattern stmt: ");
+          print_gimple_stmt (vect_dump, pattern_stmt, 0, TDF_SLIM);
+        }
+
+      set_vinfo_for_stmt (pattern_stmt,
+                      new_stmt_vec_info (pattern_stmt, loop_vinfo, NULL));
+      gimple_set_bb (pattern_stmt, gimple_bb (stmt));
+      pattern_stmt_info = vinfo_for_stmt (pattern_stmt);
+
+      STMT_VINFO_RELATED_STMT (pattern_stmt_info) = stmt;
+      STMT_VINFO_DEF_TYPE (pattern_stmt_info)
+        = STMT_VINFO_DEF_TYPE (stmt_info);
+      STMT_VINFO_VECTYPE (pattern_stmt_info) = STMT_VINFO_VECTYPE (stmt_info);
+      STMT_VINFO_IN_PATTERN_P (stmt_info) = true;
+    }
+
+  VEC_free (gimple, heap, stmts_to_replace);
 }
 
 
@@ -923,7 +1075,7 @@ vect_pattern_recog (loop_vec_info loop_vinfo)
   unsigned int nbbs = loop->num_nodes;
   gimple_stmt_iterator si;
   unsigned int i, j;
-  gimple (* vect_recog_func_ptr) (gimple *, tree *, tree *);
+  gimple (* vect_recog_func_ptr) (VEC (gimple, heap) **, tree *, tree *);
 
   if (vect_print_dump_info (REPORT_DETAILS))
     fprintf (vect_dump, "=== vect_pattern_recog ===");
Index: Makefile.in
===================================================================
--- Makefile.in (revision 175073)
+++ Makefile.in (working copy)
@@ -2776,7 +2776,7 @@ tree-vect-patterns.o: tree-vect-patterns.c $(CONFI
    $(TM_H) $(GGC_H) $(TREE_H) $(TARGET_H) $(BASIC_BLOCK_H) $(DIAGNOSTIC_H) \
    $(TREE_FLOW_H) $(TREE_DUMP_H) $(CFGLOOP_H) $(EXPR_H) $(OPTABS_H) $(PARAMS_H) \
    $(TREE_DATA_REF_H) $(TREE_VECTORIZER_H) $(RECOG_H) $(DIAGNOSTIC_CORE_H) \
-   gimple-pretty-print.h
+   gimple-pretty-print.h langhooks.h
 tree-vect-slp.o: tree-vect-slp.c $(CONFIG_H) $(SYSTEM_H) \
    coretypes.h $(TM_H) $(GGC_H) $(TREE_H) $(TARGET_H) $(BASIC_BLOCK_H) \
    $(DIAGNOSTIC_H) $(TREE_FLOW_H) $(TREE_DUMP_H) $(CFGLOOP_H) $(CFGLAYOUT_H) \
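
Illustration (not part of the patch): the intermediate-type case described at the top of this message can be checked on scalars. The sketch below assumes 8-bit char, 16-bit short and 32-bit int. The names fits_signed_bits, mult_promoted and mult_via_half are hypothetical helpers made up for this example; fits_signed_bits is only a rough scalar stand-in for the int_fits_type_p test, not GCC code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if VALUE is representable in a signed
   integer of BITS bits (2 <= BITS <= 32).  A scalar stand-in for the
   "constant fits the type" check; not the GCC implementation.  */
static int
fits_signed_bits (long long value, int bits)
{
  long long max = (1LL << (bits - 1)) - 1;
  long long min = -max - 1;
  return value >= min && value <= max;
}

/* What the source loop says: promote each element to int, then multiply.  */
static void
mult_promoted (const unsigned char *in, int *out, int n, int coef)
{
  for (int i = 0; i < n; i++)
    out[i] = (int) in[i] * coef;
}

/* What the pattern expresses: convert each element to the 16-bit
   intermediate type, then do a 16 x 16 -> 32 widening multiply.  */
static void
mult_via_half (const unsigned char *in, int *out, int n, int coef)
{
  /* Precondition: the constant must fit the intermediate type.  */
  assert (fits_signed_bits (coef, 16));

  for (int i = 0; i < n; i++)
    {
      int16_t a_it = (int16_t) in[i];         /* a_it = (interm_type) a_t  */
      int16_t c = (int16_t) coef;
      out[i] = (int32_t) a_it * (int32_t) c;  /* prod_T' = a_it w* CONST  */
    }
}
```

With coef = 300 the constant does not fit an 8-bit type but does fit 16 bits, so both functions are expected to fill out[] with the same values. That equivalence is what lets the vectorizer work on the narrower element type.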