Improve canonicalisation of TARGET_MEM_REFs

Message ID 87tvybnx0i.fsf@linaro.org
State New
Series Improve canonicalisation of TARGET_MEM_REFs

Commit Message

Richard Sandiford Nov. 3, 2017, 4:32 p.m. UTC
A general TARGET_MEM_REF is:

    BASE + STEP * INDEX + INDEX2 + OFFSET

After classifying the address in this way, the code that builds
TARGET_MEM_REFs tries to simplify the address until it's valid
for the current target and for the mode of memory being addressed.
It does this in a fixed order:

(1) add SYMBOL to BASE
(2) add INDEX * STEP to the base, if STEP != 1
(3) add OFFSET to INDEX or BASE (reverted if unsuccessful)
(4) add INDEX to BASE
(5) add OFFSET to BASE

So suppose we had an address:

    &symbol + offset + index * 8   (e.g. "a[i + 1]" for a global "a")

on a target that only allows an index or an offset, not both.  Following
the steps above, we'd first create:

    tmp = symbol
    tmp2 = tmp + index * 8

Then if the given offset value was valid for the mode being addressed,
we'd create:

    MEM[base:tmp2, offset:offset]

while if it was invalid we'd create:

    tmp3 = tmp2 + offset
    MEM[base:tmp3, offset:0]

The problem is that this could happen if ivopts had decided to use
a scaled index for an address that happens to have a constant base.
The old procedure failed to give an indexed TARGET_MEM_REF in that case,
and adding the offset last prevented later passes from being able to
fold the index back in.

The patch avoids this by skipping (2) if BASE + INDEX * STEP
is a legitimate address and if OFFSET is stopping the address
being valid.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
OK to install?

Richard


2017-10-31  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-ssa-address.c (keep_index_p): New function.
	(create_mem_ref): Use it.  Only split out the INDEX * STEP
	component if that is invalid even with the symbol and offset
	removed.

Comments

Richard Biener Nov. 7, 2017, 11:30 a.m. UTC | #1
On Fri, Nov 3, 2017 at 5:32 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> A general TARGET_MEM_REF is:
>
>     BASE + STEP * INDEX + INDEX2 + OFFSET
>
> After classifying the address in this way, the code that builds
> TARGET_MEM_REFs tries to simplify the address until it's valid
> for the current target and for the mode of memory being addressed.
> It does this in a fixed order:
>
> (1) add SYMBOL to BASE
> (2) add INDEX * STEP to the base, if STEP != 1
> (3) add OFFSET to INDEX or BASE (reverted if unsuccessful)
> (4) add INDEX to BASE
> (5) add OFFSET to BASE
>
> So suppose we had an address:
>
>     &symbol + offset + index * 8   (e.g. "a[i + 1]" for a global "a")
>
> on a target that only allows an index or an offset, not both.  Following
> the steps above, we'd first create:
>
>     tmp = symbol
>     tmp2 = tmp + index * 8
>
> Then if the given offset value was valid for the mode being addressed,
> we'd create:
>
>     MEM[base:tmp2, offset:offset]
>
> while if it was invalid we'd create:
>
>     tmp3 = tmp2 + offset
>     MEM[base:tmp3, offset:0]
>
> The problem is that this could happen if ivopts had decided to use
> a scaled index for an address that happens to have a constant base.
> The old procedure failed to give an indexed TARGET_MEM_REF in that case,
> and adding the offset last prevented later passes from being able to
> fold the index back in.
>
> The patch avoids this by skipping (2) if BASE + INDEX * STEP
> is a legitimate address and if OFFSET is stopping the address
> being valid.
>
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
> OK to install?
>
> Richard
>
>
> 2017-10-31  Richard Sandiford  <richard.sandiford@linaro.org>
>             Alan Hayward  <alan.hayward@arm.com>
>             David Sherwood  <david.sherwood@arm.com>
>
> gcc/
>         * tree-ssa-address.c (keep_index_p): New function.
>         (create_mem_ref): Use it.  Only split out the INDEX * STEP
>         component if that is invalid even with the symbol and offset
>         removed.
>
> Index: gcc/tree-ssa-address.c
> ===================================================================
> --- gcc/tree-ssa-address.c      2017-11-03 12:15:44.097060121 +0000
> +++ gcc/tree-ssa-address.c      2017-11-03 12:21:18.060359821 +0000
> @@ -746,6 +746,20 @@ gimplify_mem_ref_parts (gimple_stmt_iter
>                                              true, GSI_SAME_STMT);
>  }
>
> +/* Return true if the STEP in PARTS gives a valid BASE + INDEX * STEP
> +   address for type TYPE and if the offset is making it appear invalid.  */
> +
> +static bool
> +keep_index_p (tree type, mem_address parts)


mem_ref_valid_without_offset_p (...)

?

> +{
> +  if (!parts.base)
> +    return false;
> +
> +  gcc_assert (!parts.symbol);
> +  parts.offset = NULL_TREE;
> +  return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
> +}
> +
>  /* Creates and returns a TARGET_MEM_REF for address ADDR.  If necessary
>     computations are emitted in front of GSI.  TYPE is the mode
>     of created memory reference. IV_CAND is the selected iv candidate in ADDR,
> @@ -809,7 +823,8 @@ create_mem_ref (gimple_stmt_iterator *gs


Which means all of the following would be more naturally written as

>       into:
>         index' = index << step;
>         [... + index' + ,,,].  */
> -  if (parts.step && !integer_onep (parts.step))
> +  bool scaled_p = (parts.step && !integer_onep (parts.step));
> +  if (scaled_p && !keep_index_p (type, parts))
>      {


  if (mem_ref_valid_without_offset_p (...))
   {
     ...
     return create_mem_ref_raw (...);
   }

that said, the function should ideally be re-written to try a few more
options non-destructively.  Doesn't IVOPTs itself already verify
the TARGET_MEM_REFs it generates (and costs!) are valid?

Thanks,
Richard.

>        gcc_assert (parts.index);
>        parts.index = force_gimple_operand_gsi (gsi,
> @@ -821,6 +836,7 @@ create_mem_ref (gimple_stmt_iterator *gs
>        mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
>        if (mem_ref)
>         return mem_ref;
> +      scaled_p = false;
>      }
>
>    /* Add offset to invariant part by transforming address expression:
> @@ -832,7 +848,9 @@ create_mem_ref (gimple_stmt_iterator *gs
>         index' = index + offset;
>         [base + index']
>       depending on which one is invariant.  */
> -  if (parts.offset && !integer_zerop (parts.offset))
> +  if (parts.offset
> +      && !integer_zerop (parts.offset)
> +      && (!var_in_base || !scaled_p))
>      {
>        tree old_base = unshare_expr (parts.base);
>        tree old_index = unshare_expr (parts.index);
> @@ -882,7 +900,7 @@ create_mem_ref (gimple_stmt_iterator *gs
>    /* Transform [base + index + ...] into:
>         base' = base + index;
>         [base' + ...].  */
> -  if (parts.index)
> +  if (parts.index && !scaled_p)
>      {
>        tmp = parts.index;
>        parts.index = NULL_TREE;
Richard Sandiford Nov. 7, 2017, 6:04 p.m. UTC | #2
Richard Biener <richard.guenther@gmail.com> writes:
> On Fri, Nov 3, 2017 at 5:32 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> A general TARGET_MEM_REF is:
>>
>>     BASE + STEP * INDEX + INDEX2 + OFFSET
>>
>> After classifying the address in this way, the code that builds
>> TARGET_MEM_REFs tries to simplify the address until it's valid
>> for the current target and for the mode of memory being addressed.
>> It does this in a fixed order:
>>
>> (1) add SYMBOL to BASE
>> (2) add INDEX * STEP to the base, if STEP != 1
>> (3) add OFFSET to INDEX or BASE (reverted if unsuccessful)
>> (4) add INDEX to BASE
>> (5) add OFFSET to BASE
>>
>> So suppose we had an address:
>>
>>     &symbol + offset + index * 8   (e.g. "a[i + 1]" for a global "a")
>>
>> on a target that only allows an index or an offset, not both.  Following
>> the steps above, we'd first create:
>>
>>     tmp = symbol
>>     tmp2 = tmp + index * 8
>>
>> Then if the given offset value was valid for the mode being addressed,
>> we'd create:
>>
>>     MEM[base:tmp2, offset:offset]
>>
>> while if it was invalid we'd create:
>>
>>     tmp3 = tmp2 + offset
>>     MEM[base:tmp3, offset:0]
>>
>> The problem is that this could happen if ivopts had decided to use
>> a scaled index for an address that happens to have a constant base.
>> The old procedure failed to give an indexed TARGET_MEM_REF in that case,
>> and adding the offset last prevented later passes from being able to
>> fold the index back in.
>>
>> The patch avoids this by skipping (2) if BASE + INDEX * STEP
>> is a legitimate address and if OFFSET is stopping the address
>> being valid.
>>
>> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
>> OK to install?
>>
>> Richard
>>
>>
>> 2017-10-31  Richard Sandiford  <richard.sandiford@linaro.org>
>>             Alan Hayward  <alan.hayward@arm.com>
>>             David Sherwood  <david.sherwood@arm.com>
>>
>> gcc/
>>         * tree-ssa-address.c (keep_index_p): New function.
>>         (create_mem_ref): Use it.  Only split out the INDEX * STEP
>>         component if that is invalid even with the symbol and offset
>>         removed.
>>
>> Index: gcc/tree-ssa-address.c
>> ===================================================================
>> --- gcc/tree-ssa-address.c      2017-11-03 12:15:44.097060121 +0000
>> +++ gcc/tree-ssa-address.c      2017-11-03 12:21:18.060359821 +0000
>> @@ -746,6 +746,20 @@ gimplify_mem_ref_parts (gimple_stmt_iter
>>                                              true, GSI_SAME_STMT);
>>  }
>>
>> +/* Return true if the STEP in PARTS gives a valid BASE + INDEX * STEP
>> +   address for type TYPE and if the offset is making it appear invalid.  */
>> +
>> +static bool
>> +keep_index_p (tree type, mem_address parts)
>
> mem_ref_valid_without_offset_p (...)
>
> ?


OK.

>> +{
>> +  if (!parts.base)
>> +    return false;
>> +
>> +  gcc_assert (!parts.symbol);
>> +  parts.offset = NULL_TREE;
>> +  return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
>> +}
>> +
>>  /* Creates and returns a TARGET_MEM_REF for address ADDR.  If necessary
>>     computations are emitted in front of GSI.  TYPE is the mode
>>     of created memory reference. IV_CAND is the selected iv candidate in ADDR,
>> @@ -809,7 +823,8 @@ create_mem_ref (gimple_stmt_iterator *gs
>
> Which means all of the following would be more naturally written as
>
>>       into:
>>         index' = index << step;
>>         [... + index' + ,,,].  */
>> -  if (parts.step && !integer_onep (parts.step))
>> +  bool scaled_p = (parts.step && !integer_onep (parts.step));
>> +  if (scaled_p && !keep_index_p (type, parts))
>>      {
>
>   if (mem_ref_valid_without_offset_p (...))
>    {
>      ...
>      return create_mem_ref_raw (...);
>    }


Is this inside the test for a scale:

  if (parts.step && !integer_onep (parts.step))
    {
      if (mem_ref_valid_without_offset_p (...))
        {
          tree tmp = parts.offset;
          if (parts.base)
            {
              tmp = fold_build_pointer_plus (parts.base, tmp);
              tmp = force_gimple_operand_gsi_1 (gsi, tmp,
                                                is_gimple_mem_ref_addr,
                                                NULL_TREE, true,
                                                GSI_SAME_STMT);
            }
          parts.base = tmp;
          parts.offset = NULL_TREE;
          mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
          gcc_assert (mem_ref);
          return mem_ref;
        }
      ...current code...
    }

?  I can do that if you don't mind the cut-&-paste.  Or I could
split the code that adds the offset out into a subroutine.

If we do it outside the check for a scaled index, we'd need to
duplicate the logic about whether to add the offset to the index
or the base.

> that said, the function should ideally be re-written to try a few more
> options non-destructively.  Doesn't IVOPTs itself already verify
> the TARGET_MEM_REFs it generates (and costs!) are valid?


I think it only evaluates the cost of the address and leaves this
routine to make them valid.  (And the costs are usually right for
SVE after the ivopts patch I posted.)  We'd probably have to duplicate
the same logic if ivopts legitimised the addresses itself.

Thanks,
Richard

>
> Thanks,
> Richard.
>

>>        gcc_assert (parts.index);
>>        parts.index = force_gimple_operand_gsi (gsi,
>> @@ -821,6 +836,7 @@ create_mem_ref (gimple_stmt_iterator *gs
>>        mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
>>        if (mem_ref)
>>         return mem_ref;
>> +      scaled_p = false;
>>      }
>>
>>    /* Add offset to invariant part by transforming address expression:
>> @@ -832,7 +848,9 @@ create_mem_ref (gimple_stmt_iterator *gs
>>         index' = index + offset;
>>         [base + index']
>>       depending on which one is invariant.  */
>> -  if (parts.offset && !integer_zerop (parts.offset))
>> +  if (parts.offset
>> +      && !integer_zerop (parts.offset)
>> +      && (!var_in_base || !scaled_p))
>>      {
>>        tree old_base = unshare_expr (parts.base);
>>        tree old_index = unshare_expr (parts.index);
>> @@ -882,7 +900,7 @@ create_mem_ref (gimple_stmt_iterator *gs
>>    /* Transform [base + index + ...] into:
>>         base' = base + index;
>>         [base' + ...].  */
>> -  if (parts.index)
>> +  if (parts.index && !scaled_p)
>>      {
>>        tmp = parts.index;
>>        parts.index = NULL_TREE;
Richard Biener Nov. 20, 2017, 11:02 a.m. UTC | #3
On Tue, Nov 7, 2017 at 7:04 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
>> On Fri, Nov 3, 2017 at 5:32 PM, Richard Sandiford
>> <richard.sandiford@linaro.org> wrote:
>>> A general TARGET_MEM_REF is:
>>>
>>>     BASE + STEP * INDEX + INDEX2 + OFFSET
>>>
>>> After classifying the address in this way, the code that builds
>>> TARGET_MEM_REFs tries to simplify the address until it's valid
>>> for the current target and for the mode of memory being addressed.
>>> It does this in a fixed order:
>>>
>>> (1) add SYMBOL to BASE
>>> (2) add INDEX * STEP to the base, if STEP != 1
>>> (3) add OFFSET to INDEX or BASE (reverted if unsuccessful)
>>> (4) add INDEX to BASE
>>> (5) add OFFSET to BASE
>>>
>>> So suppose we had an address:
>>>
>>>     &symbol + offset + index * 8   (e.g. "a[i + 1]" for a global "a")
>>>
>>> on a target that only allows an index or an offset, not both.  Following
>>> the steps above, we'd first create:
>>>
>>>     tmp = symbol
>>>     tmp2 = tmp + index * 8
>>>
>>> Then if the given offset value was valid for the mode being addressed,
>>> we'd create:
>>>
>>>     MEM[base:tmp2, offset:offset]
>>>
>>> while if it was invalid we'd create:
>>>
>>>     tmp3 = tmp2 + offset
>>>     MEM[base:tmp3, offset:0]
>>>
>>> The problem is that this could happen if ivopts had decided to use
>>> a scaled index for an address that happens to have a constant base.
>>> The old procedure failed to give an indexed TARGET_MEM_REF in that case,
>>> and adding the offset last prevented later passes from being able to
>>> fold the index back in.
>>>
>>> The patch avoids this by skipping (2) if BASE + INDEX * STEP
>>> is a legitimate address and if OFFSET is stopping the address
>>> being valid.
>>>
>>> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
>>> OK to install?
>>>
>>> Richard
>>>
>>>
>>> 2017-10-31  Richard Sandiford  <richard.sandiford@linaro.org>
>>>             Alan Hayward  <alan.hayward@arm.com>
>>>             David Sherwood  <david.sherwood@arm.com>
>>>
>>> gcc/
>>>         * tree-ssa-address.c (keep_index_p): New function.
>>>         (create_mem_ref): Use it.  Only split out the INDEX * STEP
>>>         component if that is invalid even with the symbol and offset
>>>         removed.
>>>
>>> Index: gcc/tree-ssa-address.c
>>> ===================================================================
>>> --- gcc/tree-ssa-address.c      2017-11-03 12:15:44.097060121 +0000
>>> +++ gcc/tree-ssa-address.c      2017-11-03 12:21:18.060359821 +0000
>>> @@ -746,6 +746,20 @@ gimplify_mem_ref_parts (gimple_stmt_iter
>>>                                              true, GSI_SAME_STMT);
>>>  }
>>>
>>> +/* Return true if the STEP in PARTS gives a valid BASE + INDEX * STEP
>>> +   address for type TYPE and if the offset is making it appear invalid.  */
>>> +
>>> +static bool
>>> +keep_index_p (tree type, mem_address parts)
>>
>> mem_ref_valid_without_offset_p (...)
>>
>> ?
>
> OK.
>
>>> +{
>>> +  if (!parts.base)
>>> +    return false;
>>> +
>>> +  gcc_assert (!parts.symbol);
>>> +  parts.offset = NULL_TREE;
>>> +  return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
>>> +}
>>> +
>>>  /* Creates and returns a TARGET_MEM_REF for address ADDR.  If necessary
>>>     computations are emitted in front of GSI.  TYPE is the mode
>>>     of created memory reference. IV_CAND is the selected iv candidate in ADDR,
>>> @@ -809,7 +823,8 @@ create_mem_ref (gimple_stmt_iterator *gs
>>
>> Which means all of the following would be more naturally written as
>>
>>>       into:
>>>         index' = index << step;
>>>         [... + index' + ,,,].  */
>>> -  if (parts.step && !integer_onep (parts.step))
>>> +  bool scaled_p = (parts.step && !integer_onep (parts.step));
>>> +  if (scaled_p && !keep_index_p (type, parts))
>>>      {
>>
>>   if (mem_ref_valid_without_offset_p (...))
>>    {
>>      ...
>>      return create_mem_ref_raw (...);
>>    }
>
> Is this inside the test for a scale:
>
>   if (parts.step && !integer_onep (parts.step))
>     {
>       if (mem_ref_valid_without_offset_p (...))
>         {
>           tree tmp = parts.offset;
>           if (parts.base)
>             {
>               tmp = fold_build_pointer_plus (parts.base, tmp);
>               tmp = force_gimple_operand_gsi_1 (gsi, tmp,
>                                                 is_gimple_mem_ref_addr,
>                                                 NULL_TREE, true,
>                                                 GSI_SAME_STMT);
>             }
>           parts.base = tmp;
>           parts.offset = NULL_TREE;
>           mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
>           gcc_assert (mem_ref);
>           return mem_ref;
>         }
>       ...current code...
>     }
>
> ?  I can do that if you don't mind the cut-&-paste.  Or I could
> split the code that adds the offset out into a subroutine.


Sorry for the late response.  Yes, sth like that.

> If we do it outside the check for a scaled index, we'd need to
> duplicate the logic about whether to add the offset to the index
> or the base.
>
>> that said, the function should ideally be re-written to try a few more
>> options non-destructively.  Doesn't IVOPTs itself already verify
>> the TARGET_MEM_REFs it generates (and costs!) are valid?
>
> I think it only evaluates the cost of the address and leaves this
> routine to make them valid.  (And the costs are usually right for
> SVE after the ivopts patch I posted.)  We'd probably have to duplicate
> the same logic if ivopts legitimised the addresses itself.


Hmm.  Just wonder how it can compute costs when it might not
be legitimized yet.  And if legitimizing doesn't matter for costs
why we can't delegitimize during RTL expansion...

Richard.

> Thanks,
> Richard
>
>>
>> Thanks,
>> Richard.
>>
>>>        gcc_assert (parts.index);
>>>        parts.index = force_gimple_operand_gsi (gsi,
>>> @@ -821,6 +836,7 @@ create_mem_ref (gimple_stmt_iterator *gs
>>>        mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
>>>        if (mem_ref)
>>>         return mem_ref;
>>> +      scaled_p = false;
>>>      }
>>>
>>>    /* Add offset to invariant part by transforming address expression:
>>> @@ -832,7 +848,9 @@ create_mem_ref (gimple_stmt_iterator *gs
>>>         index' = index + offset;
>>>         [base + index']
>>>       depending on which one is invariant.  */
>>> -  if (parts.offset && !integer_zerop (parts.offset))
>>> +  if (parts.offset
>>> +      && !integer_zerop (parts.offset)
>>> +      && (!var_in_base || !scaled_p))
>>>      {
>>>        tree old_base = unshare_expr (parts.base);
>>>        tree old_index = unshare_expr (parts.index);
>>> @@ -882,7 +900,7 @@ create_mem_ref (gimple_stmt_iterator *gs
>>>    /* Transform [base + index + ...] into:
>>>         base' = base + index;
>>>         [base' + ...].  */
>>> -  if (parts.index)
>>> +  if (parts.index && !scaled_p)
>>>      {
>>>        tmp = parts.index;
>>>        parts.index = NULL_TREE;
Bin.Cheng Nov. 20, 2017, 12:52 p.m. UTC | #4
On Mon, Nov 20, 2017 at 11:02 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Tue, Nov 7, 2017 at 7:04 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Fri, Nov 3, 2017 at 5:32 PM, Richard Sandiford
>>> <richard.sandiford@linaro.org> wrote:
>>>> A general TARGET_MEM_REF is:
>>>>
>>>>     BASE + STEP * INDEX + INDEX2 + OFFSET
>>>>
>>>> After classifying the address in this way, the code that builds
>>>> TARGET_MEM_REFs tries to simplify the address until it's valid
>>>> for the current target and for the mode of memory being addressed.
>>>> It does this in a fixed order:
>>>>
>>>> (1) add SYMBOL to BASE
>>>> (2) add INDEX * STEP to the base, if STEP != 1
>>>> (3) add OFFSET to INDEX or BASE (reverted if unsuccessful)
>>>> (4) add INDEX to BASE
>>>> (5) add OFFSET to BASE
>>>>
>>>> So suppose we had an address:
>>>>
>>>>     &symbol + offset + index * 8   (e.g. "a[i + 1]" for a global "a")
>>>>
>>>> on a target that only allows an index or an offset, not both.  Following
>>>> the steps above, we'd first create:
>>>>
>>>>     tmp = symbol
>>>>     tmp2 = tmp + index * 8
>>>>
>>>> Then if the given offset value was valid for the mode being addressed,
>>>> we'd create:
>>>>
>>>>     MEM[base:tmp2, offset:offset]
>>>>
>>>> while if it was invalid we'd create:
>>>>
>>>>     tmp3 = tmp2 + offset
>>>>     MEM[base:tmp3, offset:0]
>>>>
>>>> The problem is that this could happen if ivopts had decided to use
>>>> a scaled index for an address that happens to have a constant base.
>>>> The old procedure failed to give an indexed TARGET_MEM_REF in that case,
>>>> and adding the offset last prevented later passes from being able to
>>>> fold the index back in.
>>>>
>>>> The patch avoids this by skipping (2) if BASE + INDEX * STEP
>>>> is a legitimate address and if OFFSET is stopping the address
>>>> being valid.
>>>>
>>>> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
>>>> OK to install?
>>>>
>>>> Richard
>>>>
>>>>
>>>> 2017-10-31  Richard Sandiford  <richard.sandiford@linaro.org>
>>>>             Alan Hayward  <alan.hayward@arm.com>
>>>>             David Sherwood  <david.sherwood@arm.com>
>>>>
>>>> gcc/
>>>>         * tree-ssa-address.c (keep_index_p): New function.
>>>>         (create_mem_ref): Use it.  Only split out the INDEX * STEP
>>>>         component if that is invalid even with the symbol and offset
>>>>         removed.
>>>>
>>>> Index: gcc/tree-ssa-address.c
>>>> ===================================================================
>>>> --- gcc/tree-ssa-address.c      2017-11-03 12:15:44.097060121 +0000
>>>> +++ gcc/tree-ssa-address.c      2017-11-03 12:21:18.060359821 +0000
>>>> @@ -746,6 +746,20 @@ gimplify_mem_ref_parts (gimple_stmt_iter
>>>>                                              true, GSI_SAME_STMT);
>>>>  }
>>>>
>>>> +/* Return true if the STEP in PARTS gives a valid BASE + INDEX * STEP
>>>> +   address for type TYPE and if the offset is making it appear invalid.  */
>>>> +
>>>> +static bool
>>>> +keep_index_p (tree type, mem_address parts)
>>>
>>> mem_ref_valid_without_offset_p (...)
>>>
>>> ?
>>
>> OK.
>>
>>>> +{
>>>> +  if (!parts.base)
>>>> +    return false;
>>>> +
>>>> +  gcc_assert (!parts.symbol);
>>>> +  parts.offset = NULL_TREE;
>>>> +  return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
>>>> +}
>>>> +
>>>>  /* Creates and returns a TARGET_MEM_REF for address ADDR.  If necessary
>>>>     computations are emitted in front of GSI.  TYPE is the mode
>>>>     of created memory reference. IV_CAND is the selected iv candidate in ADDR,
>>>> @@ -809,7 +823,8 @@ create_mem_ref (gimple_stmt_iterator *gs
>>>
>>> Which means all of the following would be more naturally written as
>>>
>>>>       into:
>>>>         index' = index << step;
>>>>         [... + index' + ,,,].  */
>>>> -  if (parts.step && !integer_onep (parts.step))
>>>> +  bool scaled_p = (parts.step && !integer_onep (parts.step));
>>>> +  if (scaled_p && !keep_index_p (type, parts))
>>>>      {
>>>
>>>   if (mem_ref_valid_without_offset_p (...))
>>>    {
>>>      ...
>>>      return create_mem_ref_raw (...);
>>>    }
>>
>> Is this inside the test for a scale:
>>
>>   if (parts.step && !integer_onep (parts.step))
>>     {
>>       if (mem_ref_valid_without_offset_p (...))
>>         {
>>           tree tmp = parts.offset;
>>           if (parts.base)
>>             {
>>               tmp = fold_build_pointer_plus (parts.base, tmp);
>>               tmp = force_gimple_operand_gsi_1 (gsi, tmp,
>>                                                 is_gimple_mem_ref_addr,
>>                                                 NULL_TREE, true,
>>                                                 GSI_SAME_STMT);
>>             }
>>           parts.base = tmp;
>>           parts.offset = NULL_TREE;
>>           mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
>>           gcc_assert (mem_ref);
>>           return mem_ref;
>>         }
>>       ...current code...
>>     }
>>
>> ?  I can do that if you don't mind the cut-&-paste.  Or I could
>> split the code that adds the offset out into a subroutine.
>
> Sorry for the late response.  Yes, sth like that.
>
>> If we do it outside the check for a scaled index, we'd need to
>> duplicate the logic about whether to add the offset to the index
>> or the base.
>>
>>> that said, the function should ideally be re-written to try a few more
>>> options non-destructively.  Doesn't IVOPTs itself already verify
>>> the TARGET_MEM_REFs it generates (and costs!) are valid?
>>
>> I think it only evaluates the cost of the address and leaves this
>> routine to make them valid.  (And the costs are usually right for
>> SVE after the ivopts patch I posted.)  We'd probably have to duplicate
>> the same logic if ivopts legitimised the addresses itself.
>
> Hmm.  Just wonder how it can compute costs when it might not
> be legitimized yet.  And if legitimizing doesn't matter for costs

I'd say IVOPTs tries to legitimize TARGET_MEM_REFs during cost
computation, but...  So the whole idea is:
1) IVOPTs computes the address expression in the form addr_invariant
+ IV * scale, in which scale could be 1.
2) IVOPTs tries to distribute addr_invariant into the different parts
of mem_address, generating the memory address as base + IV * scale
+ offset.  Note this procedure is where legitimization happens, and
it should be equivalent to what create_mem_ref does.  Unfortunately,
it only mimics create_mem_ref/addr_to_parts in a simplified way, for
the reasons listed below.  So I think in most cases IVOPTs'
get_address_cost actually gets the legitimate form of the address;
otherwise we could simply reuse it when rewriting the address in the
end.
3) When rewriting the address, IVOPTs calls create_mem_ref, which
generates a legitimate TARGET_MEM_REF with the same form (part
distribution) as the cost computation in step 2).

Ideally, the cost computation should do the legitimization work and
the result should simply be reused during rewriting.  In that case,
we could actually remove the create_mem_ref call at the end.  But:
1) Cost computation is of O(mn) complexity, so I didn't do full
legitimization.  Instead I simply use a mimic mem_address like
{symbol, base = integer_one_node, index = integer_one_node, step,
offset}.  It can be seen as legitimization with the base/index parts
parameterized.
2) Another reason the cost computation can't do full legitimization
is, as commented in rewrite_use_address, that we don't know the
memory object hint before candidates are chosen.

When rewriting IVOPTs, I tried to keep the two aforementioned
processes the same.  The most important point is that if the
invariant part of the address is costed as invariant (and that's
always the case), then it will be factored out as a loop invariant
during create_mem_ref, rather than mixed with the IV part.
This leads to a question about this patch.

+static bool
+keep_index_p (tree type, mem_address parts)
+{
+  if (!parts.base)
+    return false;
+
+  gcc_assert (!parts.symbol);
+  parts.offset = NULL_TREE;
+  return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
+}
+
Given [base + index << scale + offset], will it always be simplified into:
  base' = base + offset
  [base' + index << scale]
even when the IV is in the base part (i.e., var_in_base is true)?  Of
course, the IV is normally in the index part.

And
@@ -832,7 +848,9 @@ create_mem_ref (gimple_stmt_iterator *gs
        index' = index + offset;
        [base + index']
      depending on which one is invariant.  */
-  if (parts.offset && !integer_zerop (parts.offset))
+  if (parts.offset
+      && !integer_zerop (parts.offset)
+      && (!var_in_base || !scaled_p))
     {
       tree old_base = unshare_expr (parts.base);
       tree old_index = unshare_expr (parts.index);
IIUC, what we want to do is: if index << scale should be kept, we always
fold base + offset outside of the memory expression.
The above code seems hard to follow; would something like the below
work (didn't verify)?
  if (parts.offset && !integer_zerop (parts.offset))
    {
      tree old_base = unshare_expr (parts.base);
      tree old_index = unshare_expr (parts.index);
      tree old_offset = unshare_expr (parts.offset);

      tmp = parts.offset;
      parts.offset = NULL_TREE;
      /* Add offset to invariant part.  */
      if (!var_in_base)  /* Change to: if (!var_in_base || keep_scale_p) */
        {
          if (parts.base)
            {
              tmp = fold_build_pointer_plus (parts.base, tmp);
              tmp = force_gimple_operand_gsi_1 (gsi, tmp,
                                                is_gimple_mem_ref_addr,
                                                NULL_TREE, true,
                                                GSI_SAME_STMT);
            }
          parts.base = tmp;
        }

Or we could duplicate the code that merges base + offset as soon as
keep_index_p returns true?

Thanks,
bin

> why we can't delegitimize during RTL expansion...
>
> Richard.
>
>> Thanks,
>> Richard
>>
>>>
>>> Thanks,
>>> Richard.
>>>
>>>>        gcc_assert (parts.index);
>>>>        parts.index = force_gimple_operand_gsi (gsi,
>>>> @@ -821,6 +836,7 @@ create_mem_ref (gimple_stmt_iterator *gs
>>>>        mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
>>>>        if (mem_ref)
>>>>         return mem_ref;
>>>> +      scaled_p = false;
>>>>      }
>>>>
>>>>    /* Add offset to invariant part by transforming address expression:
>>>> @@ -832,7 +848,9 @@ create_mem_ref (gimple_stmt_iterator *gs
>>>>         index' = index + offset;
>>>>         [base + index']
>>>>       depending on which one is invariant.  */
>>>> -  if (parts.offset && !integer_zerop (parts.offset))
>>>> +  if (parts.offset
>>>> +      && !integer_zerop (parts.offset)
>>>> +      && (!var_in_base || !scaled_p))
>>>>      {
>>>>        tree old_base = unshare_expr (parts.base);
>>>>        tree old_index = unshare_expr (parts.index);
>>>> @@ -882,7 +900,7 @@ create_mem_ref (gimple_stmt_iterator *gs
>>>>    /* Transform [base + index + ...] into:
>>>>         base' = base + index;
>>>>         [base' + ...].  */
>>>> -  if (parts.index)
>>>> +  if (parts.index && !scaled_p)
>>>>      {
>>>>        tmp = parts.index;
>>>>        parts.index = NULL_TREE;
Richard Sandiford Jan. 9, 2018, 2:39 p.m. UTC | #5
Richard Biener <richard.guenther@gmail.com> writes:
> On Tue, Nov 7, 2017 at 7:04 PM, Richard Sandiford
> <richard.sandiford@linaro.org> wrote:
>> Richard Biener <richard.guenther@gmail.com> writes:
>>> On Fri, Nov 3, 2017 at 5:32 PM, Richard Sandiford
>>> <richard.sandiford@linaro.org> wrote:
>>>> [... original patch description, ChangeLog and diff header snipped;
>>>> quoted in full above ...]
>>>> @@ -746,6 +746,20 @@ gimplify_mem_ref_parts (gimple_stmt_iter
>>>>                                              true, GSI_SAME_STMT);
>>>>  }
>>>>
>>>> +/* Return true if the STEP in PARTS gives a valid BASE + INDEX * STEP
>>>> +   address for type TYPE and if the offset is making it appear invalid.  */
>>>> +
>>>> +static bool
>>>> +keep_index_p (tree type, mem_address parts)
>>>
>>> mem_ref_valid_without_offset_p (...)
>>>
>>> ?
>>
>> OK.
>>
>>>> +{
>>>> +  if (!parts.base)
>>>> +    return false;
>>>> +
>>>> +  gcc_assert (!parts.symbol);
>>>> +  parts.offset = NULL_TREE;
>>>> +  return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
>>>> +}
>>>> +
>>>>  /* Creates and returns a TARGET_MEM_REF for address ADDR.  If necessary
>>>>     computations are emitted in front of GSI.  TYPE is the mode
>>>>     of created memory reference. IV_CAND is the selected iv candidate in ADDR,
>>>> @@ -809,7 +823,8 @@ create_mem_ref (gimple_stmt_iterator *gs
>>>
>>> Which means all of the following would be more naturally written as
>>>
>>>>       into:
>>>>         index' = index << step;
>>>>         [... + index' + ,,,].  */
>>>> -  if (parts.step && !integer_onep (parts.step))
>>>> +  bool scaled_p = (parts.step && !integer_onep (parts.step));
>>>> +  if (scaled_p && !keep_index_p (type, parts))
>>>>      {
>>>
>>>   if (mem_ref_valid_without_offset_p (...))
>>>    {
>>>      ...
>>>      return create_mem_ref_raw (...);
>>>    }
>>
>> Is this inside the test for a scale:
>>
>>   if (parts.step && !integer_onep (parts.step))
>>     {
>>       if (mem_ref_valid_without_offset_p (...))
>>         {
>>           tree tmp = parts.offset;
>>           if (parts.base)
>>             {
>>               tmp = fold_build_pointer_plus (parts.base, tmp);
>>               tmp = force_gimple_operand_gsi_1 (gsi, tmp,
>>                                                 is_gimple_mem_ref_addr,
>>                                                 NULL_TREE, true,
>>                                                 GSI_SAME_STMT);
>>             }
>>           parts.base = tmp;
>>           parts.offset = NULL_TREE;
>>           mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
>>           gcc_assert (mem_ref);
>>           return mem_ref;
>>         }
>>       ...current code...
>>     }
>>
>> ?  I can do that if you don't mind the cut-&-paste.  Or I could
>> split the code that adds the offset out into a subroutine.
>
> Sorry for the late response.  Yes, sth like that.


Sorry for dropping the ball on this.  Is the version below OK?

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.

Thanks,
Richard


2018-01-09  Richard Sandiford  <richard.sandiford@linaro.org>
	    Alan Hayward  <alan.hayward@arm.com>
	    David Sherwood  <david.sherwood@arm.com>

gcc/
	* tree-ssa-address.c (mem_ref_valid_without_offset_p): New function.
	(add_offset_to_base): New function, split out from...
	(create_mem_ref): ...here.  When handling a scale other than 1,
	check first whether the address is valid without the offset.
	Add it into the base if so, leaving the index and scale as-is.

Index: gcc/tree-ssa-address.c
===================================================================
--- gcc/tree-ssa-address.c	2018-01-05 13:38:01.651810414 +0000
+++ gcc/tree-ssa-address.c	2018-01-09 14:36:36.533460131 +0000
@@ -746,6 +746,35 @@ gimplify_mem_ref_parts (gimple_stmt_iter
 					     true, GSI_SAME_STMT);
 }
 
+/* Return true if the OFFSET in PARTS is the only thing that is making
+   it an invalid address for type TYPE.  */
+
+static bool
+mem_ref_valid_without_offset_p (tree type, mem_address parts)
+{
+  if (!parts.base)
+    parts.base = parts.offset;
+  parts.offset = NULL_TREE;
+  return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
+}
+
+/* Fold PARTS->offset into PARTS->base, so that there is no longer
+   a separate offset.  Emit any new instructions before GSI.  */
+
+static void
+add_offset_to_base (gimple_stmt_iterator *gsi, mem_address *parts)
+{
+  tree tmp = parts->offset;
+  if (parts->base)
+    {
+      tmp = fold_build_pointer_plus (parts->base, tmp);
+      tmp = force_gimple_operand_gsi_1 (gsi, tmp, is_gimple_mem_ref_addr,
+					NULL_TREE, true, GSI_SAME_STMT);
+    }
+  parts->base = tmp;
+  parts->offset = NULL_TREE;
+}
+
 /* Creates and returns a TARGET_MEM_REF for address ADDR.  If necessary
    computations are emitted in front of GSI.  TYPE is the mode
    of created memory reference. IV_CAND is the selected iv candidate in ADDR,
@@ -812,6 +841,14 @@ create_mem_ref (gimple_stmt_iterator *gs
   if (parts.step && !integer_onep (parts.step))
     {
       gcc_assert (parts.index);
+      if (parts.offset && mem_ref_valid_without_offset_p (type, parts))
+	{
+	  add_offset_to_base (gsi, &parts);
+	  mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
+	  gcc_assert (mem_ref);
+	  return mem_ref;
+	}
+
       parts.index = force_gimple_operand_gsi (gsi,
 				fold_build2 (MULT_EXPR, sizetype,
 					     parts.index, parts.step),
@@ -906,18 +943,7 @@ create_mem_ref (gimple_stmt_iterator *gs
        [base'].  */
   if (parts.offset && !integer_zerop (parts.offset))
     {
-      tmp = parts.offset;
-      parts.offset = NULL_TREE;
-      /* Add offset to base.  */
-      if (parts.base)
-	{
-	  tmp = fold_build_pointer_plus (parts.base, tmp);
-	  tmp = force_gimple_operand_gsi_1 (gsi, tmp,
-					    is_gimple_mem_ref_addr,
-					    NULL_TREE, true, GSI_SAME_STMT);
-	}
-      parts.base = tmp;
-
+      add_offset_to_base (gsi, &parts);
       mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
       if (mem_ref)
 	return mem_ref;
Richard Biener Jan. 15, 2018, 10:08 a.m. UTC | #6
On Tue, Jan 9, 2018 at 3:39 PM, Richard Sandiford
<richard.sandiford@linaro.org> wrote:
> Richard Biener <richard.guenther@gmail.com> writes:
> [... earlier discussion and quoted v1 patch snipped ...]
>
> Sorry for the dropping the ball on this.  Is the version below OK?


Ok.

Richard.

> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64-linux-gnu.
>
> [... v2 patch quoted in full above, snipped ...]

Patch

Index: gcc/tree-ssa-address.c
===================================================================
--- gcc/tree-ssa-address.c	2017-11-03 12:15:44.097060121 +0000
+++ gcc/tree-ssa-address.c	2017-11-03 12:21:18.060359821 +0000
@@ -746,6 +746,20 @@  gimplify_mem_ref_parts (gimple_stmt_iter
 					     true, GSI_SAME_STMT);
 }
 
+/* Return true if the STEP in PARTS gives a valid BASE + INDEX * STEP
+   address for type TYPE and if the offset is making it appear invalid.  */
+
+static bool
+keep_index_p (tree type, mem_address parts)
+{
+  if (!parts.base)
+    return false;
+
+  gcc_assert (!parts.symbol);
+  parts.offset = NULL_TREE;
+  return valid_mem_ref_p (TYPE_MODE (type), TYPE_ADDR_SPACE (type), &parts);
+}
+
 /* Creates and returns a TARGET_MEM_REF for address ADDR.  If necessary
    computations are emitted in front of GSI.  TYPE is the mode
    of created memory reference. IV_CAND is the selected iv candidate in ADDR,
@@ -809,7 +823,8 @@  create_mem_ref (gimple_stmt_iterator *gs
      into:
        index' = index << step;
        [... + index' + ,,,].  */
-  if (parts.step && !integer_onep (parts.step))
+  bool scaled_p = (parts.step && !integer_onep (parts.step));
+  if (scaled_p && !keep_index_p (type, parts))
     {
       gcc_assert (parts.index);
       parts.index = force_gimple_operand_gsi (gsi,
@@ -821,6 +836,7 @@  create_mem_ref (gimple_stmt_iterator *gs
       mem_ref = create_mem_ref_raw (type, alias_ptr_type, &parts, true);
       if (mem_ref)
 	return mem_ref;
+      scaled_p = false;
     }
 
   /* Add offset to invariant part by transforming address expression:
@@ -832,7 +848,9 @@  create_mem_ref (gimple_stmt_iterator *gs
        index' = index + offset;
        [base + index']
      depending on which one is invariant.  */
-  if (parts.offset && !integer_zerop (parts.offset))
+  if (parts.offset
+      && !integer_zerop (parts.offset)
+      && (!var_in_base || !scaled_p))
     {
       tree old_base = unshare_expr (parts.base);
       tree old_index = unshare_expr (parts.index);
@@ -882,7 +900,7 @@  create_mem_ref (gimple_stmt_iterator *gs
   /* Transform [base + index + ...] into:
        base' = base + index;
        [base' + ...].  */
-  if (parts.index)
+  if (parts.index && !scaled_p)
     {
       tmp = parts.index;
       parts.index = NULL_TREE;