[RFC] If-conversion of COMPONENT_REFs

Message ID	AANLkTin7SZSErUguxdcu-upF+GRq+sUz0ooeqt7g53JX@mail.gmail.com
State	Accepted
Headers	show Return-Path: <ira.rosen@linaro.org> Received-SPF: neutral (google.com: 209.85.161.178 is neither permitted nor denied by best guess record for domain of ira.rosen@linaro.org) client-ip=209.85.161.178; MIME-Version: 1.0 Date: Wed, 30 Mar 2011 11:13:04 +0200 Message-ID: <AANLkTin7SZSErUguxdcu-upF+GRq+sUz0ooeqt7g53JX@mail.gmail.com> Subject: [RFC][patch] If-conversion of COMPONENT_REFs From: Ira Rosen <ira.rosen@linaro.org> To: gcc-patches@gcc.gnu.org Cc: Patch Tracking <patches@linaro.org> Content-Type: multipart/mixed; boundary=000e0cd6abb69c4b70049faf959c

Message ID

AANLkTin7SZSErUguxdcu-upF+GRq+sUz0ooeqt7g53JX@mail.gmail.com

State

Accepted

Headers

Received-SPF: neutral (google.com: 209.85.161.178 is neither permitted nor
	denied by best guess record for domain of
	ira.rosen@linaro.org) client-ip=209.85.161.178; 
MIME-Version: 1.0
Date: Wed, 30 Mar 2011 11:13:04 +0200
Message-ID: <AANLkTin7SZSErUguxdcu-upF+GRq+sUz0ooeqt7g53JX@mail.gmail.com>
Subject: [RFC][patch] If-conversion of COMPONENT_REFs
From: Ira Rosen <ira.rosen@linaro.org>
To: gcc-patches@gcc.gnu.org
Cc: Patch Tracking <patches@linaro.org>
Content-Type: multipart/mixed; boundary=000e0cd6abb69c4b70049faf959c

Commit Message

Ira Rosen March 30, 2011, 9:13 a.m. UTC

Hi,

With this patch a data-ref is marked as unconditionally read or
written also if its adjacent field is read or written unconditionally
in the loop.
My concern is that this is not safe enough, even though the fields
have to be non-pointers and non-aggregates, and this optimization is
applied only with -ftree-loop-if-convert-stores flag.

Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.

OK for trunk?

Thanks,
Ira


ChangeLog:

         * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true
         if an adjacent field of the data-ref is accessed unconditionally.

testsuite/ChangeLog:

         * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
         * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
         -ftree-loop-if-convert-stores.

Comments

Ira Rosen March 30, 2011, 12:22 p.m. UTC | #1

On 30 March 2011 12:59, Richard Guenther <richard.guenther@gmail.com> wrote:
> On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen <ira.rosen@linaro.org> wrote:
>> Hi,
>>
>> With this patch a data-ref is marked as unconditionally read or
>> written also if its adjacent field is read or written unconditionally
>> in the loop.
>> My concern is that this is not safe enough, even though the fields
>> have to be non-pointers and non-aggregates, and this optimization is
>> applied only with -ftree-loop-if-convert-stores flag.
>>
>> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.
>>
>> OK for trunk?
>
> The restrictions do not make too much sense to me.  For the C++
> memory model we can't do speculative stores at all, but for the
> rest I'd say just checking if the data-refs access the same object
> is enough, thus, instead of same_data_refs (a, b) simply check
> operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)
> or operand_equal_p (get_base_address (DR_REF (a)), get_base_address
> (DR_REF (b)), 0), whatever makes more sense (I always confuse what
> the contents of the various DR fields are).

But what about

int a[10], b[100], c[100];
for (i = 0; i < 100; i++)
  {
     if (i < 10)
       t = a[i];
    else
       t = b[i];

    c[i] = t + a[1];
   }

We can't transform this to

int a[10], b[100], c[100];
for (i = 0; i < 100; i++)
  {
     t1 = a[i];
     t2 = b[i];
     t = (i < 10) ? t1 : t2;
     c[i] = t + a[1];
   }

since we create accesses to a[10:100], right?

Thanks,
Ira

>
> Thanks,
> Richard.
>
>> Thanks,
>> Ira
>>
>>
>> ChangeLog:
>>
>>         * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true
>>         if an adjacent field of the data-ref is accessed unconditionally.
>>
>> testsuite/ChangeLog:
>>
>>         * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
>>         * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
>>         -ftree-loop-if-convert-stores.
>>
>

Richard Biener March 30, 2011, 12:41 p.m. UTC | #2

On Wed, Mar 30, 2011 at 2:22 PM, Ira Rosen <ira.rosen@linaro.org> wrote:
> On 30 March 2011 12:59, Richard Guenther <richard.guenther@gmail.com> wrote:
>> On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen <ira.rosen@linaro.org> wrote:
>>> Hi,
>>>
>>> With this patch a data-ref is marked as unconditionally read or
>>> written also if its adjacent field is read or written unconditionally
>>> in the loop.
>>> My concern is that this is not safe enough, even though the fields
>>> have to be non-pointers and non-aggregates, and this optimization is
>>> applied only with -ftree-loop-if-convert-stores flag.
>>>
>>> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.
>>>
>>> OK for trunk?
>>
>> The restrictions do not make too much sense to me.  For the C++
>> memory model we can't do speculative stores at all, but for the
>> rest I'd say just checking if the data-refs access the same object
>> is enough, thus, instead of same_data_refs (a, b) simply check
>> operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)
>> or operand_equal_p (get_base_address (DR_REF (a)), get_base_address
>> (DR_REF (b)), 0), whatever makes more sense (I always confuse what
>> the contents of the various DR fields are).
>
> But what about
>
> int a[10], b[100], c[100];
> for (i = 0; i < 100; i++)
>  {
>     if (i < 10)
>       t = a[i];
>    else
>       t = b[i];
>
>    c[i] = t + a[1];
>   }
>
> We can't transform this to
>
> int a[10], b[100], c[100];
> for (i = 0; i < 100; i++)
>  {
>     t1 = a[i];
>     t2 = b[i];
>     t = (i < 10) ? t1 : t2;
>     c[i] = t + a[1];
>   }
>
> since we create accesses to a[10:100], right?

That's correct - we may only ever create "valid" accesses, but any
valid access to a is ok after we see an unconditional access to a.

So we probably have to restrict ourselves to DECL_P (get_base_address ())
objects and make sure we don't access past it.

Thus, I see what you were trying to do - may I suggest sth like

  ref_base_a = DR_REF (a);
  while (TREE_CODE (ref_base_a) == COMPONENT_REF
           || TREE_CODE (ref_base_a) == IMAGPART_EXPR
           || TREE_CODE (ref_base_a) == REALPART_EXPR)
    ref_base_a = TREE_OPERAND (ref_base_a, 0);

... same for DR_REF (b) ...

  if (operand_equal_p (ref_base_a, ref_base_b, 0))
    ok

that is, strip all non-variable offset handled_component_refs and compare
the remaining base of the two accesses.  If they are equal we are ok.

Any hole in that?

Thanks,
Richard.

> Thanks,
> Ira
>
>>
>> Thanks,
>> Richard.
>>
>>> Thanks,
>>> Ira
>>>
>>>
>>> ChangeLog:
>>>
>>>         * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true
>>>         if an adjacent field of the data-ref is accessed unconditionally.
>>>
>>> testsuite/ChangeLog:
>>>
>>>         * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
>>>         * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
>>>         -ftree-loop-if-convert-stores.
>>>
>>
>

Ira Rosen March 30, 2011, 1:09 p.m. UTC | #3

On 30 March 2011 14:41, Richard Guenther <richard.guenther@gmail.com> wrote:
> On Wed, Mar 30, 2011 at 2:22 PM, Ira Rosen <ira.rosen@linaro.org> wrote:
>> On 30 March 2011 12:59, Richard Guenther <richard.guenther@gmail.com> wrote:
>>> On Wed, Mar 30, 2011 at 11:13 AM, Ira Rosen <ira.rosen@linaro.org> wrote:
>>>> Hi,
>>>>
>>>> With this patch a data-ref is marked as unconditionally read or
>>>> written also if its adjacent field is read or written unconditionally
>>>> in the loop.
>>>> My concern is that this is not safe enough, even though the fields
>>>> have to be non-pointers and non-aggregates, and this optimization is
>>>> applied only with -ftree-loop-if-convert-stores flag.
>>>>
>>>> Bootstrapped on powerpc64-suse-linux and tested on x86_64-suse-linux.
>>>>
>>>> OK for trunk?
>>>
>>> The restrictions do not make too much sense to me.  For the C++
>>> memory model we can't do speculative stores at all, but for the
>>> rest I'd say just checking if the data-refs access the same object
>>> is enough, thus, instead of same_data_refs (a, b) simply check
>>> operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)
>>> or operand_equal_p (get_base_address (DR_REF (a)), get_base_address
>>> (DR_REF (b)), 0), whatever makes more sense (I always confuse what
>>> the contents of the various DR fields are).
>>
>> But what about
>>
>> int a[10], b[100], c[100];
>> for (i = 0; i < 100; i++)
>>  {
>>     if (i < 10)
>>       t = a[i];
>>    else
>>       t = b[i];
>>
>>    c[i] = t + a[1];
>>   }
>>
>> We can't transform this to
>>
>> int a[10], b[100], c[100];
>> for (i = 0; i < 100; i++)
>>  {
>>     t1 = a[i];
>>     t2 = b[i];
>>     t = (i < 10) ? t1 : t2;
>>     c[i] = t + a[1];
>>   }
>>
>> since we create accesses to a[10:100], right?
>
> That's correct - we may only ever create "valid" accesses, but any
> valid access to a is ok after we see an unconditional access to a.
>
> So we probably have to restrict ourselves to DECL_P (get_base_address ())
> objects and make sure we don't access past it.
>
> Thus, I see what you were trying to do - may I suggest sth like
>
>  ref_base_a = DR_REF (a);
>  while (TREE_CODE (ref_base_a) == COMPONENT_REF
>           || TREE_CODE (ref_base_a) == IMAGPART_EXPR
>           || TREE_CODE (ref_base_a) == REALPART_EXPR)
>    ref_base_a = TREE_OPERAND (ref_base_a, 0);
>
> ... same for DR_REF (b) ...
>
>  if (operand_equal_p (ref_base_a, ref_base_b, 0))
>    ok
>
> that is, strip all non-variable offset handled_component_refs and compare
> the remaining base of the two accesses.  If they are equal we are ok.
>
> Any hole in that?

I don't see any :) I'll test your version.

Thanks,
Ira

>
> Thanks,
> Richard.
>
>> Thanks,
>> Ira
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks,
>>>> Ira
>>>>
>>>>
>>>> ChangeLog:
>>>>
>>>>         * tree-if-conv.c (memrefs_read_or_written_unconditionally): Return true
>>>>         if an adjacent field of the data-ref is accessed unconditionally.
>>>>
>>>> testsuite/ChangeLog:
>>>>
>>>>         * gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c: New test.
>>>>         * gcc.dg/vect/vect.exp: Run if-cvt-stores-vect* tests with
>>>>         -ftree-loop-if-convert-stores.
>>>>
>>>
>>
>

Index: testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c
===================================================================
--- testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c	(revision 0)
+++ testsuite/gcc.dg/vect/if-cvt-stores-vect-ifcvt-18.c	(revision 0)
@@ -0,0 +1,69 @@ 
+/* { dg-require-effective-target vect_int } */
+
+#include <stdarg.h>
+#include "tree-vect.h"
+
+#define N 50
+
+typedef struct {
+  short a;
+  short b;
+} data;
+
+data in1[N], in2[N], out[N];
+short result[N*2] = {10,-7,11,-6,12,-5,13,-4,14,-3,15,-2,16,-1,17,0,18,1,19,2,20,3,21,4,22,5,23,6,24,7,25,8,26,9,27,10,28,11,29,12,30,13,31,14,32,15,33,16,34,17,35,18,36,19,37,20,38,21,39,22,40,23,41,24,42,25,43,26,44,27,45,28,46,29,47,30,48,31,49,32,50,33,51,34,52,35,53,36,54,37,55,38,56,39,57,40,58,41,59,42};
+short out1[N], out2[N];
+
+__attribute__ ((noinline)) void
+foo ()
+{
+  int i;
+  short c, d;
+
+  for (i = 0; i < N; i++)
+    {
+      c = in1[i].b;
+      d = in2[i].b;
+
+      if (c >= d)
+        {
+          out[i].b = in1[i].a;
+          out[i].a = d + 5;
+        }
+      else
+        {
+          out[i].b = d - 12;
+          out[i].a = in2[i].a + d;
+        }
+    }
+}
+
+int
+main (void)
+{
+  int i;
+
+  check_vect ();
+
+  for (i = 0; i < N; i++)
+    {
+      in1[i].a = i;
+      in1[i].b = i + 2;
+      in2[i].a = 5;
+      in2[i].b = i + 5;
+      __asm__ volatile ("");
+    }
+
+  foo ();
+
+  for (i = 0; i < N; i++)
+    {
+      if (out[i].a != result[2*i] || out[i].b != result[2*i+1])
+        abort ();
+    }
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  { xfail { vect_no_align || {! vect_strided } } } } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */
Index: testsuite/gcc.dg/vect/vect.exp
===================================================================
--- testsuite/gcc.dg/vect/vect.exp	(revision 171716)
+++ testsuite/gcc.dg/vect/vect.exp	(working copy)
@@ -210,6 +210,12 @@  lappend DEFAULT_VECTCFLAGS "--param" "gg
 dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/ggc-*.\[cS\]]]  \
         "" $DEFAULT_VECTCFLAGS
 
+# -ftree-loop-if-convert-stores
+set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
+lappend DEFAULT_VECTCFLAGS "-ftree-loop-if-convert-stores"
+dg-runtest [lsort [glob -nocomplain $srcdir/$subdir/if-cvt-stores-vect-*.\[cS\]]]  \
+        "" $DEFAULT_VECTCFLAGS
+
 # With -O3.
 # Don't allow IPA cloning, because it throws our counts out of whack.
 set DEFAULT_VECTCFLAGS $SAVED_DEFAULT_VECTCFLAGS
Index: tree-if-conv.c
===================================================================
--- tree-if-conv.c	(revision 171716)
+++ tree-if-conv.c	(working copy)
@@ -461,11 +461,11 @@  struct ifc_dr {
 #define DR_WRITTEN_AT_LEAST_ONCE(DR) (IFC_DR (DR)->written_at_least_once)
 #define DR_RW_UNCONDITIONALLY(DR) (IFC_DR (DR)->rw_unconditionally)
 
-/* Returns true when the memory references of STMT are read or written
-   unconditionally.  In other words, this function returns true when
-   for every data reference A in STMT there exist other accesses to
-   the same data reference with predicates that add up (OR-up) to the
-   true predicate: this ensures that the data reference A is touched
+/* Returns true when the memory references of STMT (or their adjacent fields)
+   are read or writtenunconditionally.  In other words, this function
+   returns true when for every data reference A in STMT there exist other
+   accesses to the same data reference with predicates that add up (OR-up) to
+   the true predicate: this ensures that the data reference A is touched
    (read or written) on every iteration of the if-converted loop.  */
 
 static bool
@@ -479,7 +479,7 @@  memrefs_read_or_written_unconditionally 
   for (i = 0; VEC_iterate (data_reference_p, drs, i, a); i++)
     if (DR_STMT (a) == stmt)
       {
-	bool found = false;
+	bool found = false, same_comp_ref = false;
 	int x = DR_RW_UNCONDITIONALLY (a);
 
 	if (x == 0)
@@ -489,22 +489,45 @@  memrefs_read_or_written_unconditionally 
 	  continue;
 
 	for (j = 0; VEC_iterate (data_reference_p, drs, j, b); j++)
-	  if (DR_STMT (b) != stmt
-	      && same_data_refs (a, b))
-	    {
-	      tree cb = bb_predicate (gimple_bb (DR_STMT (b)));
+	  {
+            if (DR_STMT (b) == stmt)
+              continue;
 
-	      if (DR_RW_UNCONDITIONALLY (b) == 1
-		  || is_true_predicate (cb)
-		  || is_true_predicate (ca = fold_or_predicates (EXPR_LOCATION (cb),
-								 ca, cb)))
-		{
-		  DR_RW_UNCONDITIONALLY (a) = 1;
-		  DR_RW_UNCONDITIONALLY (b) = 1;
-		  found = true;
-		  break;
-		}
-	    }
+            if (TREE_CODE (DR_REF (a)) == COMPONENT_REF
+                && !AGGREGATE_TYPE_P (TREE_TYPE (DR_REF (a)))
+                && !POINTER_TYPE_P (TREE_TYPE (DR_REF (a)))
+                && !AGGREGATE_TYPE_P (TREE_TYPE (DR_REF (b)))
+                && !POINTER_TYPE_P (TREE_TYPE (DR_REF (b)))
+                && operand_equal_p (DR_BASE_ADDRESS (a), DR_BASE_ADDRESS (b), 0)
+                && dr_equal_offsets_p (a, b))
+              {
+                HOST_WIDE_INT type_size_a, type_size_b, init_a, init_b;
+
+                type_size_a = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (a))));
+                type_size_b = TREE_INT_CST_LOW (TYPE_SIZE_UNIT (TREE_TYPE (DR_REF (b))));
+                init_a = TREE_INT_CST_LOW (DR_INIT (a));
+                init_b = TREE_INT_CST_LOW (DR_INIT (b));
+
+                if (init_a + type_size_a == init_b || init_b + type_size_b == init_a)
+                  same_comp_ref = true;
+              }
+
+            if (same_comp_ref || same_data_refs (a, b))
+              {
+                tree cb = bb_predicate (gimple_bb (DR_STMT (b)));
+
+                if (DR_RW_UNCONDITIONALLY (b) == 1
+                    || is_true_predicate (cb)
+                    || is_true_predicate (ca = fold_or_predicates (EXPR_LOCATION (cb),
+								   ca, cb)))
+		  {
+		    DR_RW_UNCONDITIONALLY (a) = 1;
+		    DR_RW_UNCONDITIONALLY (b) = 1;
+		    found = true;
+		    break;
+		  }
+              }
+	  }
 
 	if (!found)
 	  {