From patchwork Mon Jun 13 12:43:32 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Ira Rosen X-Patchwork-Id: 1871 Return-Path: Delivered-To: unknown Received: from imap.gmail.com (74.125.47.109) by localhost6.localdomain6 with IMAP4-SSL; 14 Jun 2011 16:46:12 -0000 Delivered-To: patches@linaro.org Received: by 10.52.181.10 with SMTP id ds10cs452801vdc; Mon, 13 Jun 2011 05:43:35 -0700 (PDT) Received: by 10.236.137.131 with SMTP id y3mr6682611yhi.503.1307969014676; Mon, 13 Jun 2011 05:43:34 -0700 (PDT) Received: from mail-pv0-f178.google.com (mail-pv0-f178.google.com [74.125.83.178]) by mx.google.com with ESMTPS id y48si19299545yhl.134.2011.06.13.05.43.33 (version=TLSv1/SSLv3 cipher=OTHER); Mon, 13 Jun 2011 05:43:33 -0700 (PDT) Received-SPF: neutral (google.com: 74.125.83.178 is neither permitted nor denied by best guess record for domain of ira.rosen@linaro.org) client-ip=74.125.83.178; Authentication-Results: mx.google.com; spf=neutral (google.com: 74.125.83.178 is neither permitted nor denied by best guess record for domain of ira.rosen@linaro.org) smtp.mail=ira.rosen@linaro.org Received: by pvg7 with SMTP id 7so2377859pvg.37 for ; Mon, 13 Jun 2011 05:43:32 -0700 (PDT) MIME-Version: 1.0 Received: by 10.142.125.18 with SMTP id x18mr803308wfc.207.1307969012610; Mon, 13 Jun 2011 05:43:32 -0700 (PDT) Received: by 10.143.93.4 with HTTP; Mon, 13 Jun 2011 05:43:32 -0700 (PDT) Date: Mon, 13 Jun 2011 15:43:32 +0300 Message-ID: Subject: [patch] Don't insert pattern statements into the code (was Fix PR tree-optimization/49318) From: Ira Rosen To: Richard Guenther Cc: gcc-patches@gcc.gnu.org, Patch Tracking On 10 June 2011 12:14, Richard Guenther wrote: > In the end I think we should not generate the pattern stmt during > pattern matching but only mark the relevant statements with a > pattern kind.  Say, for each pattern we have a "main" statement > that has related stmts belonging to the pattern that define uses > of the "main" statement - mark those to refer to that "main" statement. > For that "main" statement simply record an enum value, like, > widening_mult.  Then only at vectorized statement > generation time actually generate the vectorized form of the > pattern statement. I ended up with the following: during pattern detection a new scalar pattern statement is created but not inserted into the code, it is only recorded as a related statement of the last statement in the detected pattern. Every time the last statement is being analyzed/transformed, we switch to the pattern statement instead. It is much more difficult just to mark the last stmt with an enum value, since we have to retrieve the relevant operands every time. I am not sure if we need to free the pattern stmt at the end. Bootstrapped and now testing on powerpc64-suse-linux (tested vectorizer testsuite on powerpc64-suse-linux and x86_64-suse-linux. What do you think? Thanks, Ira ChangeLog: * tree-vect-loop.c (vect_determine_vectorization_factor): Don't remove irrelevant pattern statements. For irrelevant statements check if it is the last statement of a detected pattern, use corresponding pattern statement instead. (destroy_loop_vec_info): No need to remove pattern statements, only free stmt_vec_info. (vect_transform_loop): For irrelevant statements check if it is the last statement of a detected pattern, use corresponding pattern statement instead. * tree-vect-patterns.c (vect_pattern_recog_1): Don't insert pattern statements. Set basic block for the new statement. (vect_pattern_recog): Update documentation. * tree-vect-stmts.c (vect_mark_stmts_to_be_vectorized): Scan operands of pattern statements. (vectorizable_call): Fix printing. In case of a pattern statement use the lhs of the original statement when creating a dummy statement to replace the original call. (vect_analyze_stmt): For irrelevant statements check if it is the last statement of a detected pattern, use corresponding pattern statement instead. * tree-vect-slp.c (vect_schedule_slp_instance): For pattern statements use gsi of the original statement. Index: tree-vect-loop.c =================================================================== --- tree-vect-loop.c (revision 174982) +++ tree-vect-loop.c (working copy) @@ -244,7 +244,7 @@ vect_determine_vectorization_factor (loop_vec_info for (si = gsi_start_bb (bb); !gsi_end_p (si); gsi_next (&si)) { tree vf_vectype; - gimple stmt = gsi_stmt (si); + gimple stmt = gsi_stmt (si), pattern_stmt; stmt_info = vinfo_for_stmt (stmt); if (vect_print_dump_info (REPORT_DETAILS)) @@ -258,20 +258,26 @@ vect_determine_vectorization_factor (loop_vec_info /* Skip stmts which do not need to be vectorized. */ if (!STMT_VINFO_RELEVANT_P (stmt_info) && !STMT_VINFO_LIVE_P (stmt_info)) - { - if (is_pattern_stmt_p (stmt_info)) + { + if (STMT_VINFO_IN_PATTERN_P (stmt_info) + && (pattern_stmt = STMT_VINFO_RELATED_STMT (stmt_info)) + && (STMT_VINFO_RELEVANT_P (vinfo_for_stmt (pattern_stmt)) + || STMT_VINFO_LIVE_P (vinfo_for_stmt (pattern_stmt)))) { - /* We are not going to vectorize this pattern statement, - therefore, remove it. */ - gimple_stmt_iterator tmp_gsi = gsi_for_stmt (stmt); - STMT_VINFO_RELATED_STMT (stmt_info) = NULL; - gsi_remove (&tmp_gsi, true); - free_stmt_vec_info (stmt); + stmt = pattern_stmt; + stmt_info = vinfo_for_stmt (pattern_stmt); + if (vect_print_dump_info (REPORT_DETAILS)) + { + fprintf (vect_dump, "==> examining pattern statement: "); + print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); + } } - - if (vect_print_dump_info (REPORT_DETAILS)) - fprintf (vect_dump, "skip."); - continue; + else + { + if (vect_print_dump_info (REPORT_DETAILS)) + fprintf (vect_dump, "skip."); + continue; + } } if (gimple_get_lhs (stmt) == NULL_TREE) @@ -828,25 +834,17 @@ destroy_loop_vec_info (loop_vec_info loop_vinfo, b if (stmt_info) { - /* Check if this is a "pattern stmt" (introduced by the - vectorizer during the pattern recognition pass). */ - bool remove_stmt_p = false; - gimple orig_stmt = STMT_VINFO_RELATED_STMT (stmt_info); - if (orig_stmt) - { - stmt_vec_info orig_stmt_info = vinfo_for_stmt (orig_stmt); - if (orig_stmt_info - && STMT_VINFO_IN_PATTERN_P (orig_stmt_info)) - remove_stmt_p = true; - } + /* Check if this statement has a related "pattern stmt" + (introduced by the vectorizer during the pattern recognition + pass). Free pattern's stmt_vec_info. */ + if (STMT_VINFO_IN_PATTERN_P (stmt_info) + && vinfo_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info))) + free_stmt_vec_info (STMT_VINFO_RELATED_STMT (stmt_info)); /* Free stmt_vec_info. */ free_stmt_vec_info (stmt); + } - /* Remove dead "pattern stmts". */ - if (remove_stmt_p) - gsi_remove (&si, true); - } gsi_next (&si); } } @@ -5131,7 +5129,7 @@ vect_transform_loop (loop_vec_info loop_vinfo) for (si = gsi_start_bb (bb); !gsi_end_p (si);) { - gimple stmt = gsi_stmt (si); + gimple stmt = gsi_stmt (si), pattern_stmt; bool is_store; if (vect_print_dump_info (REPORT_DETAILS)) @@ -5156,14 +5154,25 @@ vect_transform_loop (loop_vec_info loop_vinfo) if (!STMT_VINFO_RELEVANT_P (stmt_info) && !STMT_VINFO_LIVE_P (stmt_info)) - { - gsi_next (&si); - continue; + { + if (STMT_VINFO_IN_PATTERN_P (stmt_info) + && (pattern_stmt = STMT_VINFO_RELATED_STMT (stmt_info)) + && (STMT_VINFO_RELEVANT_P (vinfo_for_stmt (pattern_stmt)) + || STMT_VINFO_LIVE_P (vinfo_for_stmt (pattern_stmt)))) + { + stmt = pattern_stmt; + stmt_info = vinfo_for_stmt (stmt); + } + else + { + gsi_next (&si); + continue; + } } gcc_assert (STMT_VINFO_VECTYPE (stmt_info)); - nunits = - (unsigned int) TYPE_VECTOR_SUBPARTS (STMT_VINFO_VECTYPE (stmt_info)); + nunits = (unsigned int) TYPE_VECTOR_SUBPARTS ( + STMT_VINFO_VECTYPE (stmt_info)); if (!STMT_SLP_TYPE (stmt_info) && nunits != (unsigned int) vectorization_factor && vect_print_dump_info (REPORT_DETAILS)) Index: tree-vect-patterns.c =================================================================== --- tree-vect-patterns.c (revision 174981) +++ tree-vect-patterns.c (working copy) @@ -831,9 +831,9 @@ vect_pattern_recog_1 ( } /* Mark the stmts that are involved in the pattern. */ - gsi_insert_before (&si, pattern_stmt, GSI_SAME_STMT); set_vinfo_for_stmt (pattern_stmt, new_stmt_vec_info (pattern_stmt, loop_vinfo, NULL)); + gimple_set_bb (pattern_stmt, gimple_bb (stmt)); pattern_stmt_info = vinfo_for_stmt (pattern_stmt); STMT_VINFO_RELATED_STMT (pattern_stmt_info) = stmt; @@ -856,8 +856,8 @@ vect_pattern_recog_1 ( LOOP_VINFO - a struct_loop_info of a loop in which we want to look for computation idioms. - Output - for each computation idiom that is detected we insert a new stmt - that provides the same functionality and that can be vectorized. We + Output - for each computation idiom that is detected we create a new stmt + that provides the same functionality and that can be vectorized. We also record some information in the struct_stmt_info of the relevant stmts, as explained below: @@ -872,53 +872,49 @@ vect_pattern_recog_1 ( S5: ... = ..use(a_0).. - - - Say the sequence {S1,S2,S3,S4} was detected as a pattern that can be - represented by a single stmt. We then: - - create a new stmt S6 that will replace the pattern. - - insert the new stmt S6 before the last stmt in the pattern + represented by a single stmt. We then: + - create a new stmt S6 equivalent to the pattern (the stmt is not + inserted into the code) - fill in the STMT_VINFO fields as follows: in_pattern_p related_stmt vec_stmt S1: a_i = .... - - - S2: a_2 = ..use(a_i).. - - - S3: a_1 = ..use(a_2).. - - - - > S6: a_new = .... - S4 - S4: a_0 = ..use(a_1).. true S6 - + '---> S6: a_new = .... - S4 - S5: ... = ..use(a_0).. - - - (the last stmt in the pattern (S4) and the new pattern stmt (S6) point - to each other through the RELATED_STMT field). + to each other through the RELATED_STMT field). S6 will be marked as relevant in vect_mark_stmts_to_be_vectorized instead of S4 because it will replace all its uses. Stmts {S1,S2,S3} will remain irrelevant unless used by stmts other than S4. If vectorization succeeds, vect_transform_stmt will skip over {S1,S2,S3} - (because they are marked as irrelevant). It will vectorize S6, and record + (because they are marked as irrelevant). It will vectorize S6, and record a pointer to the new vector stmt VS6 both from S6 (as usual), and also - from S4. We do that so that when we get to vectorizing stmts that use the + from S4. We do that so that when we get to vectorizing stmts that use the def of S4 (like S5 that uses a_0), we'll know where to take the relevant - vector-def from. S4 will be skipped, and S5 will be vectorized as usual: + vector-def from. S4 will be skipped, and S5 will be vectorized as usual: in_pattern_p related_stmt vec_stmt S1: a_i = .... - - - S2: a_2 = ..use(a_i).. - - - S3: a_1 = ..use(a_2).. - - - > VS6: va_new = .... - - - - S6: a_new = .... - S4 VS6 S4: a_0 = ..use(a_1).. true S6 VS6 + '---> S6: a_new = .... - S4 VS6 > VS5: ... = ..vuse(va_new).. - - - S5: ... = ..use(a_0).. - - - - DCE could then get rid of {S1,S2,S3,S4,S5,S6} (if their defs are not used + DCE could then get rid of {S1,S2,S3,S4,S5} (if their defs are not used elsewhere), and we'll end up with: VS6: va_new = .... - VS5: ... = ..vuse(va_new).. + VS5: ... = ..vuse(va_new).. */ - If vectorization does not succeed, DCE will clean S6 away (its def is - not used), and we'll end up with the original sequence. -*/ - void vect_pattern_recog (loop_vec_info loop_vinfo) { Index: tree-vect-stmts.c =================================================================== --- tree-vect-stmts.c (revision 174981) +++ tree-vect-stmts.c (working copy) @@ -605,15 +605,49 @@ vect_mark_stmts_to_be_vectorized (loop_vec_info lo break; } - FOR_EACH_PHI_OR_STMT_USE (use_p, stmt, iter, SSA_OP_USE) - { - tree op = USE_FROM_PTR (use_p); - if (!process_use (stmt, op, loop_vinfo, live_p, relevant, &worklist)) - { - VEC_free (gimple, heap, worklist); - return false; - } - } + if (is_pattern_stmt_p (vinfo_for_stmt (stmt))) + { + /* Pattern statements are not inserted into the code, so + FOR_EACH_PHI_OR_STMT_USE optimizes their operands out, and we + have to scan the RHS or function arguments instead. */ + if (is_gimple_assign (stmt)) + { + for (i = 1; i < gimple_num_ops (stmt); i++) + { + tree op = gimple_op (stmt, i); + if (!process_use (stmt, op, loop_vinfo, live_p, relevant, + &worklist)) + { + VEC_free (gimple, heap, worklist); + return false; + } + } + } + else if (is_gimple_call (stmt)) + { + for (i = 0; i < gimple_call_num_args (stmt); i++) + { + tree arg = gimple_call_arg (stmt, i); + if (!process_use (stmt, arg, loop_vinfo, live_p, relevant, + &worklist)) + { + VEC_free (gimple, heap, worklist); + return false; + } + } + } + } + else + FOR_EACH_PHI_OR_STMT_USE (use_p, stmt, iter, SSA_OP_USE) + { + tree op = USE_FROM_PTR (use_p); + if (!process_use (stmt, op, loop_vinfo, live_p, relevant, + &worklist)) + { + VEC_free (gimple, heap, worklist); + return false; + } + } } /* while worklist */ VEC_free (gimple, heap, worklist); @@ -1406,6 +1440,7 @@ vectorizable_call (gimple stmt, gimple_stmt_iterat VEC(tree, heap) *vargs = NULL; enum { NARROW, NONE, WIDEN } modifier; size_t i, nargs; + tree lhs; /* FORNOW: unsupported in basic block SLP. */ gcc_assert (loop_vinfo); @@ -1543,7 +1578,7 @@ vectorizable_call (gimple stmt, gimple_stmt_iterat /** Transform. **/ if (vect_print_dump_info (REPORT_DETAILS)) - fprintf (vect_dump, "transform operation."); + fprintf (vect_dump, "transform call."); /* Handle def. */ scalar_dest = gimple_call_lhs (stmt); @@ -1662,8 +1697,11 @@ vectorizable_call (gimple stmt, gimple_stmt_iterat rhs of the statement with something harmless. */ type = TREE_TYPE (scalar_dest); - new_stmt = gimple_build_assign (gimple_call_lhs (stmt), - build_zero_cst (type)); + if (is_pattern_stmt_p (stmt_info)) + lhs = gimple_call_lhs (STMT_VINFO_RELATED_STMT (stmt_info)); + else + lhs = gimple_call_lhs (stmt); + new_stmt = gimple_build_assign (lhs, build_zero_cst (type)); set_vinfo_for_stmt (new_stmt, stmt_info); set_vinfo_for_stmt (stmt, NULL); STMT_VINFO_STMT (stmt_info) = new_stmt; @@ -4846,10 +4884,26 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vect if (!STMT_VINFO_RELEVANT_P (stmt_info) && !STMT_VINFO_LIVE_P (stmt_info)) { - if (vect_print_dump_info (REPORT_DETAILS)) - fprintf (vect_dump, "irrelevant."); + gimple pattern_stmt = STMT_VINFO_RELATED_STMT (stmt_info); + if (STMT_VINFO_IN_PATTERN_P (stmt_info) + && (STMT_VINFO_RELEVANT_P (vinfo_for_stmt (pattern_stmt)) + || STMT_VINFO_LIVE_P (vinfo_for_stmt (pattern_stmt)))) + { + stmt = pattern_stmt; + stmt_info = vinfo_for_stmt (pattern_stmt); + if (vect_print_dump_info (REPORT_DETAILS)) + { + fprintf (vect_dump, "==> examining pattern statement: "); + print_gimple_stmt (vect_dump, stmt, 0, TDF_SLIM); + } + } + else + { + if (vect_print_dump_info (REPORT_DETAILS)) + fprintf (vect_dump, "irrelevant."); - return true; + return true; + } } switch (STMT_VINFO_DEF_TYPE (stmt_info)) Index: tree-vect-slp.c =================================================================== --- tree-vect-slp.c (revision 174981) +++ tree-vect-slp.c (working copy) @@ -2546,6 +2546,8 @@ vect_schedule_slp_instance (slp_tree node, slp_ins && STMT_VINFO_STRIDED_ACCESS (stmt_info) && !REFERENCE_CLASS_P (gimple_get_lhs (stmt))) si = gsi_for_stmt (SLP_INSTANCE_FIRST_LOAD_STMT (instance)); + else if (is_pattern_stmt_p (stmt_info)) + si = gsi_for_stmt (STMT_VINFO_RELATED_STMT (stmt_info)); else si = gsi_for_stmt (stmt);