Message ID | AANLkTin7SZSErUguxdcu-upF+GRq+sUz0ooeqt7g53JX@mail.gmail.com |
---|---|
State | Accepted |
Headers | show |
On 30 March 2011 12:59, Richard Guenther <richard.guenther@gmail.com> wrote: > On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen <ira.rosen@linaro.org> wrote: >> Hi, >> >> With this patch a data-ref is marked as unconditionally read or >> written also if its adjacent field is read or written unconditionally >> in the loop. >> My concern is that this is not safe enough, even though the fields >> have to be non-pointers and non-aggregates, and this optimization is >> applied only with -ftree-loop-if-convert-stores flag. >> >> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux. >> >> OK for trunk? > > The restrictions do not make too much sense to me. For the C++ > memory model we can't do speculative stores at all, but for the > rest I'd say just checking if the data-refs access the same object > is enough, thus, instead of same_data_refs (a, b) simply check > operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0) > or operand_equal_p (get_base_address (DR_REF (a)), get_base_address > (DR_REF (b)), 0), whatever makes more sense (I always confuse what > the contents of the various DR fields are). But what about int a[10], b[100], c[100]; for (i = 0; i < 100; i++) { if (i < 10) t = a[i]; else t = b[i]; c[i] = t + a[1]; } We can't transform this to int a[10], b[100], c[100]; for (i = 0; i < 100; i++) { t1 = a[i]; t2 = b[i]; t = (i < 10) ? t1 : t2; c[i] = t + a[1]; } since we create accesses to a[10:100], right? Thanks, Ira > > Thanks, > Richard. > >> Thanks, >> Ira >> >> >> ChangeLog: >> >> * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true >> if an adjacent field of the data-ref is accessed unconditionally. >> >> testsuite/ChangeLog: >> >> * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test. >> * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with >> -ftree-loop-if-convert-stores. >> >
On Wed, Mar 30, 2011 at 2:22 PM, Ira Rosen <ira.rosen@linaro.org> wrote: > On 30 March 2011 12:59, Richard Guenther <richard.guenther@gmail.com> wrote: >> On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen <ira.rosen@linaro.org> wrote: >>> Hi, >>> >>> With this patch a data-ref is marked as unconditionally read or >>> written also if its adjacent field is read or written unconditionally >>> in the loop. >>> My concern is that this is not safe enough, even though the fields >>> have to be non-pointers and non-aggregates, and this optimization is >>> applied only with -ftree-loop-if-convert-stores flag. >>> >>> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux. >>> >>> OK for trunk? >> >> The restrictions do not make too much sense to me. For the C++ >> memory model we can't do speculative stores at all, but for the >> rest I'd say just checking if the data-refs access the same object >> is enough, thus, instead of same_data_refs (a, b) simply check >> operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0) >> or operand_equal_p (get_base_address (DR_REF (a)), get_base_address >> (DR_REF (b)), 0), whatever makes more sense (I always confuse what >> the contents of the various DR fields are). > > But what about > > int a[10], b[100], c[100]; > for (i = 0; i < 100; i++) > { > if (i < 10) > t = a[i]; > else > t = b[i]; > > c[i] = t + a[1]; > } > > We can't transform this to > > int a[10], b[100], c[100]; > for (i = 0; i < 100; i++) > { > t1 = a[i]; > t2 = b[i]; > t = (i < 10) ? t1 : t2; > c[i] = t + a[1]; > } > > since we create accesses to a[10:100], right? That's correct - we may only ever create "valid" accesses, but any valid access to a is ok after we see an unconditional access to a. So we probably have to restrict ourselves to DECL_P (get_base_address ()) objects and make sure we don't access past it. Thus, I see what you were trying to do - may I suggest sth like ref_base_a = DR_REF (a); while (TREE_CODE (ref_base_a) == COMPONENT_REF || TREE_CODE (ref_base_a) == IMAGPART_EXPR || TREE_CODE (ref_base_a) == REALPART_EXPR) ref_base_a = TREE_OPERAND (ref_base_a, 0); ... same for DR_REF (b) ... if (operand_equal_p (ref_base_a, ref_base_b, 0)) ok that is, strip all non-variable offset handled_component_refs and compare the remaining base of the two accesses. If they are equal we are ok. Any hole in that? Thanks, Richard. > Thanks, > Ira > >> >> Thanks, >> Richard. >> >>> Thanks, >>> Ira >>> >>> >>> ChangeLog: >>> >>> * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true >>> if an adjacent field of the data-ref is accessed unconditionally. >>> >>> testsuite/ChangeLog: >>> >>> * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test. >>> * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with >>> -ftree-loop-if-convert-stores. >>> >> >
On 30 March 2011 14:41, Richard Guenther <richard.guenther@gmail.com> wrote: > On Wed, Mar 30, 2011 at 2:22 PM, Ira Rosen <ira.rosen@linaro.org> wrote: >> On 30 March 2011 12:59, Richard Guenther <richard.guenther@gmail.com> wrote: >>> On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen <ira.rosen@linaro.org> wrote: >>>> Hi, >>>> >>>> With this patch a data-ref is marked as unconditionally read or >>>> written also if its adjacent field is read or written unconditionally >>>> in the loop. >>>> My concern is that this is not safe enough, even though the fields >>>> have to be non-pointers and non-aggregates, and this optimization is >>>> applied only with -ftree-loop-if-convert-stores flag. >>>> >>>> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux. >>>> >>>> OK for trunk? >>> >>> The restrictions do not make too much sense to me. For the C++ >>> memory model we can't do speculative stores at all, but for the >>> rest I'd say just checking if the data-refs access the same object >>> is enough, thus, instead of same_data_refs (a, b) simply check >>> operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0) >>> or operand_equal_p (get_base_address (DR_REF (a)), get_base_address >>> (DR_REF (b)), 0), whatever makes more sense (I always confuse what >>> the contents of the various DR fields are). >> >> But what about >> >> int a[10], b[100], c[100]; >> for (i = 0; i < 100; i++) >> { >> if (i < 10) >> t = a[i]; >> else >> t = b[i]; >> >> c[i] = t + a[1]; >> } >> >> We can't transform this to >> >> int a[10], b[100], c[100]; >> for (i = 0; i < 100; i++) >> { >> t1 = a[i]; >> t2 = b[i]; >> t = (i < 10) ? t1 : t2; >> c[i] = t + a[1]; >> } >> >> since we create accesses to a[10:100], right? > > That's correct - we may only ever create "valid" accesses, but any > valid access to a is ok after we see an unconditional access to a. > > So we probably have to restrict ourselves to DECL_P (get_base_address ()) > objects and make sure we don't access past it. > > Thus, I see what you were trying to do - may I suggest sth like > > ref_base_a = DR_REF (a); > while (TREE_CODE (ref_base_a) == COMPONENT_REF > || TREE_CODE (ref_base_a) == IMAGPART_EXPR > || TREE_CODE (ref_base_a) == REALPART_EXPR) > ref_base_a = TREE_OPERAND (ref_base_a, 0); > > ... same for DR_REF (b) ... > > if (operand_equal_p (ref_base_a, ref_base_b, 0)) > ok > > that is, strip all non-variable offset handled_component_refs and compare > the remaining base of the two accesses. If they are equal we are ok. > > Any hole in that? I don't see any :) I'll test your version. Thanks, Ira > > Thanks, > Richard. > >> Thanks, >> Ira >> >>> >>> Thanks, >>> Richard. >>> >>>> Thanks, >>>> Ira >>>> >>>> >>>> ChangeLog: >>>> >>>> * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true >>>> if an adjacent field of the data-ref is accessed unconditionally. >>>> >>>> testsuite/ChangeLog: >>>> >>>> * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test. >>>> * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with >>>> -ftree-loop-if-convert-stores. >>>> >>> >> >
Index: testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c =================================================================== --- testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c (revision 0) +++ testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c (revision 0) @@ -0,0 +1,69 @@ +/* { dg-require-effective-target vect_int } */ + +#include <stdarg.h> +#include "tree-vect.h" + +#define N 50 + +typedef struct { + short a; + short b; +} data; + +data in1[N], in2[N], out[N]; +short result[N*2] = {10,-7,11,-6,12,-5,13,-4,14,-3,15,-2,16,-1,17,0,18,1,19,2,20,3,21,4,22,5,23,6,24,7,25,8,26,9,27,10,28,11,29,12,30,13,31,14,32,15,33,16,34,17,35,18,36,19,37,20,38,21,39,22,40,23,41,24,42,25,43,26,44,27,45,28,46,29,47,30,48,31,49,32,50,33,51,34,52,35,53,36,54,37,55,38,56,39,57,40,58,41,59,42}; +short out1[N], out2[N]; + +__attribute__ ((noinline)) void +foo () +{ + int i; + short c, d; + + for (i = 0; i < N; i++) + { + c = in1[i].b; + d = in2[i].b; + + if (c >= d) + { + out[i].b = in1[i].a; + out[i].a = d + 5; + } + else + { + out[i].b = d - 12; + out[i].a = in2[i].a + d; + } + } +} + +int +main (void) +{ + int i; + + check_vect (); + + for (i = 0; i < N; i++) + { + in1[i].a = i; + in1[i].b = i + 2; + in2[i].a = 5; + in2[i].b = i + 5; + __asm__ volatile (""); + } + + foo (); + + for (i = 0; i < N; i++) + { + if (out[i].a != result[2*i] || out[i].b != result[2*i+1]) + abort (); + } + + return 0; +} + +/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" { xfail { vect_no_align || {! vect_strided } } } } } */ +/* { dg-final { cleanup-tree-dump "vect" } } */ Index: testsuite/gcc.dg/vect/vect.exp =================================================================== --- testsuite/gcc.dg/vect/vect.exp (revision 171716) +++ testsuite/gcc.dg/vect/vect.exp (working copy) @@ -210,6 +210,12 @@ lappend DEFAULT_VECTCFLAGS "--param" "gg dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/ggc-*.\[cS\]]] \ "" $DEFAULT_VECTCFLAGS +# -ftree-loop-if-convert-stores +set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS +lappend DEFAULT_VECTCFLAGS "-ftree-loop-if-convert-stores" +dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/if-cvt-stores-vect-*.\[cS\]]] \ + "" $DEFAULT_VECTCFLAGS + # With -O3. # Don't allow IPA cloning, because it throws our counts out of whack. set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS Index: tree-if-conv.c =================================================================== --- tree-if-conv.c (revision 171716) +++ tree-if-conv.c (working copy) @@ -461,11 +461,11 @@ struct ifc_dr { #define DR_WRITTEN_AT_LEAST_ONCE(DR) (IFC_DR (DR)->written_at_least_once) #define DR_RW_UNCONDITIONALLY(DR) (IFC_DR (DR)->rw_unconditionally) -/* Returns true when the memory references of STMT are read or written - unconditionally. In other words, this function returns true when - for every data reference A in STMT there exist other accesses to - the same data reference with predicates that add up (OR-up) to the - true predicate: this ensures that the data reference A is touched +/* Returns true when the memory references of STMT (or their adjacent fields) + are read or writtenunconditionally. In other words, this function + returns true when for every data reference A in STMT there exist other + accesses to the same data reference with predicates that add up (OR-up) to + the true predicate: this ensures that the data reference A is touched (read or written) on every iteration of the if-converted loop. */ static bool @@ -479,7 +479,7 @@ memrefs_read_or_written_unconditionally for (i = 0; VEC_iterate (data_reference_p, drs, i, a); i++) if (DR_STMT (a) == stmt) { - bool found = false; + bool found = false, same_comp_ref = false; int x = DR_RW_UNCONDITIONALLY (a); if (x == 0) @@ -489,22 +489,45 @@ memrefs_read_or_written_unconditionally continue; for (j = 0; VEC_iterate (data_reference_p, drs, j, b); j++) - if (DR_STMT (b) != stmt - && same_data_refs (a, b)) - { - tree cb = bb_predicate (gimple_bb (DR_STMT (b))); + { + if (DR_STMT (b) == stmt) + continue; - if (DR_RW_UNCONDITIONALLY (b) == 1 - || is_true_predicate (cb) - || is_true_predicate (ca = fold_or_predicates (EXPR_LOCATION (cb), - ca, cb))) - { - DR_RW_UNCONDITIONALLY (a) = 1; - DR_RW_UNCONDITIONALLY (b) = 1; - found = true; - break; - } - } + if (TREE_CODE (DR_REF (a)) == COMPONENT_REF + && !AGGREGATE_TYPE_P (TREE_TYPE (DR_REF (a))) + && !POINTER_TYPE_P (TREE_TYPE (DR_REF (a))) + && !AGGREGATE_TYPE_P (TREE_TYPE (DR_REF (b))) + && !POINTER_TYPE_P (TREE_TYPE (DR_REF (b))) + && operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0) + && dr_equal_offsets_p (a, b)) + { + HOST_WIDE_INT type_size_a, type_size_b, init_a, init_b; + + type_size_a = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (a)))); + type_size_b = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (b)))); + init_a = TREE_INT_CST_LOW (DR_INIT (a)); + init_b = TREE_INT_CST_LOW (DR_INIT (b)); + + if (init_a + type_size_a == init_b || init_b + type_size_b == init_a) + same_comp_ref = true; + } + + if (same_comp_ref || same_data_refs (a, b)) + { + tree cb = bb_predicate (gimple_bb (DR_STMT (b))); + + if (DR_RW_UNCONDITIONALLY (b) == 1 + || is_true_predicate (cb) + || is_true_predicate (ca = fold_or_predicates (EXPR_LOCATION (cb), + ca, cb))) + { + DR_RW_UNCONDITIONALLY (a) = 1; + DR_RW_UNCONDITIONALLY (b) = 1; + found = true; + break; + } + } + } if (!found) {