Message ID | 563CDDE0.600@arm.com |
---|---|
State | New |
Headers | show |
Ping. https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00686.html Thanks, Kyrill On 06/11/15 17:05, Kyrill Tkachov wrote: > Hi all, > > This patch introduces support for the smmul, smmla and smmls instructions that appear in armv6 architecture levels > and higher. To quote the SMMUL description from the ARMARM: > "Signed Most Significant Word Multiply multiplies two signed 32-bit values, extracts the most significant 32 bits of > the result, and writes those bits to the destination register." > > The smmla and smmls are the multiply-accumulate and multiply-subtract extensions of that multiply. > There also exists an smmulr variant that rounds the result rather than truncating it. > However, when I tried adding patterns for those forms I got an LRA ICE that I was not able to figure out. > I'll try to find a testcase for that, but in the meantime there's no reason to not have patterns for the > non-rounding variants. > > Bootstrapped and tested on arm-none-linux-gnueabihf. > I've seen this trigger in quite a few places in SPEC2006 where it always made the code better. > > Ok for trunk? > > Thanks, > Kyrill > > 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR target/49526 > * config/arm/arm.md (*mulsidi3si_v6): New pattern. > (*mulsidi3siaddsi_v6): Likewise. > (*mulsidi3sisubsi_v6): Likewise. > * config/arm/predicates.md (subreg_highpart_operator): > New predicate. > > 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> > > PR target/49526 > * gcc.target/arm/pr49526_1.c: New test. > * gcc.target/arm/pr49526_2.c: Likewise. > * gcc.target/arm/pr49526_3.c: Likewise.
Ping. Thanks, Kyrill On 13/11/15 11:50, Kyrill Tkachov wrote: > Ping. > https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00686.html > > Thanks, > Kyrill > > On 06/11/15 17:05, Kyrill Tkachov wrote: >> Hi all, >> >> This patch introduces support for the smmul, smmla and smmls instructions that appear in armv6 architecture levels >> and higher. To quote the SMMUL description from the ARMARM: >> "Signed Most Significant Word Multiply multiplies two signed 32-bit values, extracts the most significant 32 bits of >> the result, and writes those bits to the destination register." >> >> The smmla and smmls are the multiply-accumulate and multiply-subtract extensions of that multiply. >> There also exists an smmulr variant that rounds the result rather than truncating it. >> However, when I tried adding patterns for those forms I got an LRA ICE that I was not able to figure out. >> I'll try to find a testcase for that, but in the meantime there's no reason to not have patterns for the >> non-rounding variants. >> >> Bootstrapped and tested on arm-none-linux-gnueabihf. >> I've seen this trigger in quite a few places in SPEC2006 where it always made the code better. >> >> Ok for trunk? >> >> Thanks, >> Kyrill >> >> 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >> >> PR target/49526 >> * config/arm/arm.md (*mulsidi3si_v6): New pattern. >> (*mulsidi3siaddsi_v6): Likewise. >> (*mulsidi3sisubsi_v6): Likewise. >> * config/arm/predicates.md (subreg_highpart_operator): >> New predicate. >> >> 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >> >> PR target/49526 >> * gcc.target/arm/pr49526_1.c: New test. >> * gcc.target/arm/pr49526_2.c: Likewise. >> * gcc.target/arm/pr49526_3.c: Likewise. >
Hi Kyrill, On 24 November 2015 at 14:31, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: > Ping. > > Thanks, > Kyrill > > > > On 13/11/15 11:50, Kyrill Tkachov wrote: >> >> Ping. >> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00686.html >> >> Thanks, >> Kyrill >> >> On 06/11/15 17:05, Kyrill Tkachov wrote: >>> >>> Hi all, >>> >>> This patch introduces support for the smmul, smmla and smmls instructions >>> that appear in armv6 architecture levels >>> and higher. To quote the SMMUL description from the ARMARM: >>> "Signed Most Significant Word Multiply multiplies two signed 32-bit >>> values, extracts the most significant 32 bits of >>> the result, and writes those bits to the destination register." >>> >>> The smmla and smmls are the multiply-accumulate and multiply-subtract >>> extensions of that multiply. >>> There also exists an smmulr variant that rounds the result rather than >>> truncating it. >>> However, when I tried adding patterns for those forms I got an LRA ICE >>> that I was not able to figure out. >>> I'll try to find a testcase for that, but in the meantime there's no >>> reason to not have patterns for the >>> non-rounding variants. I agree, but would be interested to give a look at the LRA issue when you have a testcase. >>> Bootstrapped and tested on arm-none-linux-gnueabihf. >>> I've seen this trigger in quite a few places in SPEC2006 where it always >>> made the code better. >>> >>> Ok for trunk? Patch looks good to me (but can't approved it), I just wonder if adding a subreg_highpart_p predicate in emit-rtl.c would be useful, looking at other usage of subreg_highpart_offset it doesn't seem to be the case, but maybe for consistency with the existing subreg_lowpart_p. Cheers, Yvan >>> Thanks, >>> Kyrill >>> >>> 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>> >>> PR target/49526 >>> * config/arm/arm.md (*mulsidi3si_v6): New pattern. >>> (*mulsidi3siaddsi_v6): Likewise. >>> (*mulsidi3sisubsi_v6): Likewise. >>> * config/arm/predicates.md (subreg_highpart_operator): >>> New predicate. >>> >>> 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>> >>> PR target/49526 >>> * gcc.target/arm/pr49526_1.c: New test. >>> * gcc.target/arm/pr49526_2.c: Likewise. >>> * gcc.target/arm/pr49526_3.c: Likewise. >> >> >
Hi Yvan, On 24/11/15 15:05, Yvan Roux wrote: > Hi Kyrill, > > On 24 November 2015 at 14:31, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: >> Ping. >> >> Thanks, >> Kyrill >> >> >> >> On 13/11/15 11:50, Kyrill Tkachov wrote: >>> Ping. >>> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00686.html >>> >>> Thanks, >>> Kyrill >>> >>> On 06/11/15 17:05, Kyrill Tkachov wrote: >>>> Hi all, >>>> >>>> This patch introduces support for the smmul, smmla and smmls instructions >>>> that appear in armv6 architecture levels >>>> and higher. To quote the SMMUL description from the ARMARM: >>>> "Signed Most Significant Word Multiply multiplies two signed 32-bit >>>> values, extracts the most significant 32 bits of >>>> the result, and writes those bits to the destination register." >>>> >>>> The smmla and smmls are the multiply-accumulate and multiply-subtract >>>> extensions of that multiply. >>>> There also exists an smmulr variant that rounds the result rather than >>>> truncating it. >>>> However, when I tried adding patterns for those forms I got an LRA ICE >>>> that I was not able to figure out. >>>> I'll try to find a testcase for that, but in the meantime there's no >>>> reason to not have patterns for the >>>> non-rounding variants. > I agree, but would be interested to give a look at the LRA issue when > you have a testcase. I've filed PR 68536 for the ICE I've been seeing. Any insight is, of course, welcome. >>>> Bootstrapped and tested on arm-none-linux-gnueabihf. >>>> I've seen this trigger in quite a few places in SPEC2006 where it always >>>> made the code better. >>>> >>>> Ok for trunk? > Patch looks good to me (but can't approved it), I just wonder if > adding a subreg_highpart_p predicate in emit-rtl.c would be useful, > looking at other usage of subreg_highpart_offset it doesn't seem to be > the case, but maybe for consistency with the existing > subreg_lowpart_p. Perhaps, we can put it there if there's much scope for its reuse in other parts of the compiler. I didn't put it there in the first place so as not to extend the review scope of the patch to the midend... Thanks for looking at this, Kyrill > > Cheers, > Yvan > >>>> Thanks, >>>> Kyrill >>>> >>>> 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>>> >>>> PR target/49526 >>>> * config/arm/arm.md (*mulsidi3si_v6): New pattern. >>>> (*mulsidi3siaddsi_v6): Likewise. >>>> (*mulsidi3sisubsi_v6): Likewise. >>>> * config/arm/predicates.md (subreg_highpart_operator): >>>> New predicate. >>>> >>>> 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>>> >>>> PR target/49526 >>>> * gcc.target/arm/pr49526_1.c: New test. >>>> * gcc.target/arm/pr49526_2.c: Likewise. >>>> * gcc.target/arm/pr49526_3.c: Likewise. >>>
On 25 November 2015 at 11:36, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: > Hi Yvan, > > > On 24/11/15 15:05, Yvan Roux wrote: >> >> Hi Kyrill, >> >> On 24 November 2015 at 14:31, Kyrill Tkachov <kyrylo.tkachov@arm.com> >> wrote: >>> >>> Ping. >>> >>> Thanks, >>> Kyrill >>> >>> >>> >>> On 13/11/15 11:50, Kyrill Tkachov wrote: >>>> >>>> Ping. >>>> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00686.html >>>> >>>> Thanks, >>>> Kyrill >>>> >>>> On 06/11/15 17:05, Kyrill Tkachov wrote: >>>>> >>>>> Hi all, >>>>> >>>>> This patch introduces support for the smmul, smmla and smmls >>>>> instructions >>>>> that appear in armv6 architecture levels >>>>> and higher. To quote the SMMUL description from the ARMARM: >>>>> "Signed Most Significant Word Multiply multiplies two signed 32-bit >>>>> values, extracts the most significant 32 bits of >>>>> the result, and writes those bits to the destination register." >>>>> >>>>> The smmla and smmls are the multiply-accumulate and multiply-subtract >>>>> extensions of that multiply. >>>>> There also exists an smmulr variant that rounds the result rather than >>>>> truncating it. >>>>> However, when I tried adding patterns for those forms I got an LRA ICE >>>>> that I was not able to figure out. >>>>> I'll try to find a testcase for that, but in the meantime there's no >>>>> reason to not have patterns for the >>>>> non-rounding variants. >> >> I agree, but would be interested to give a look at the LRA issue when >> you have a testcase. > > > I've filed PR 68536 for the ICE I've been seeing. Any insight is, of course, > welcome. Ok great, I'll give a look. >>>>> Bootstrapped and tested on arm-none-linux-gnueabihf. >>>>> I've seen this trigger in quite a few places in SPEC2006 where it >>>>> always >>>>> made the code better. >>>>> >>>>> Ok for trunk? >> >> Patch looks good to me (but can't approved it), I just wonder if >> adding a subreg_highpart_p predicate in emit-rtl.c would be useful, >> looking at other usage of subreg_highpart_offset it doesn't seem to be >> the case, but maybe for consistency with the existing >> subreg_lowpart_p. > > > Perhaps, we can put it there if there's much scope for its reuse in other > parts of the compiler. I didn't put it there in the first place so as not to > extend the review scope of the patch to the midend... Yes it's what I thought, and I don't see other usage. Now on AArch64 I see that the pattern used to generate smulh and umulh (<su>muldi3_highpart), which are doing the same kind of operation in the 64bit world, is doing it in a different manner (lshift and truncate) and I'm not sure which approach is the best here. Cheers, Yvan > Thanks for looking at this, > Kyrill > > >> >> Cheers, >> Yvan >> >>>>> Thanks, >>>>> Kyrill >>>>> >>>>> 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>>>> >>>>> PR target/49526 >>>>> * config/arm/arm.md (*mulsidi3si_v6): New pattern. >>>>> (*mulsidi3siaddsi_v6): Likewise. >>>>> (*mulsidi3sisubsi_v6): Likewise. >>>>> * config/arm/predicates.md (subreg_highpart_operator): >>>>> New predicate. >>>>> >>>>> 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>>>> >>>>> PR target/49526 >>>>> * gcc.target/arm/pr49526_1.c: New test. >>>>> * gcc.target/arm/pr49526_2.c: Likewise. >>>>> * gcc.target/arm/pr49526_3.c: Likewise. >>>> >>>> >
On 25/11/15 10:36, Kyrill Tkachov wrote: > Hi Yvan, > > On 24/11/15 15:05, Yvan Roux wrote: >> Hi Kyrill, >> >> On 24 November 2015 at 14:31, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote: >>> Ping. >>> >>> Thanks, >>> Kyrill >>> >>> >>> >>> On 13/11/15 11:50, Kyrill Tkachov wrote: >>>> Ping. >>>> https://gcc.gnu.org/ml/gcc-patches/2015-11/msg00686.html >>>> >>>> Thanks, >>>> Kyrill >>>> >>>> On 06/11/15 17:05, Kyrill Tkachov wrote: >>>>> Hi all, >>>>> >>>>> This patch introduces support for the smmul, smmla and smmls instructions >>>>> that appear in armv6 architecture levels >>>>> and higher. To quote the SMMUL description from the ARMARM: >>>>> "Signed Most Significant Word Multiply multiplies two signed 32-bit >>>>> values, extracts the most significant 32 bits of >>>>> the result, and writes those bits to the destination register." >>>>> >>>>> The smmla and smmls are the multiply-accumulate and multiply-subtract >>>>> extensions of that multiply. >>>>> There also exists an smmulr variant that rounds the result rather than >>>>> truncating it. >>>>> However, when I tried adding patterns for those forms I got an LRA ICE >>>>> that I was not able to figure out. >>>>> I'll try to find a testcase for that, but in the meantime there's no >>>>> reason to not have patterns for the >>>>> non-rounding variants. >> I agree, but would be interested to give a look at the LRA issue when >> you have a testcase. > > I've filed PR 68536 for the ICE I've been seeing. Any insight is, of course, welcome. > >>>>> Bootstrapped and tested on arm-none-linux-gnueabihf. >>>>> I've seen this trigger in quite a few places in SPEC2006 where it always >>>>> made the code better. >>>>> >>>>> Ok for trunk? >> Patch looks good to me (but can't approved it), I just wonder if >> adding a subreg_highpart_p predicate in emit-rtl.c would be useful, >> looking at other usage of subreg_highpart_offset it doesn't seem to be >> the case, but maybe for consistency with the existing >> subreg_lowpart_p. > > Perhaps, we can put it there if there's much scope for its reuse in other > parts of the compiler. I didn't put it there in the first place so as not to > extend the review scope of the patch to the midend... > I have since discovered a bug in the proposed patterns for smmla and smmls. The instructions are more involved than I thought, so it's not easy to write a pattern for them. I'll propose an updated patch adding just the smmul pattern in next stage1. So, I'm withdrawing this patch. thanks, Kyrill > Thanks for looking at this, > Kyrill > >> >> Cheers, >> Yvan >> >>>>> Thanks, >>>>> Kyrill >>>>> >>>>> 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>>>> >>>>> PR target/49526 >>>>> * config/arm/arm.md (*mulsidi3si_v6): New pattern. >>>>> (*mulsidi3siaddsi_v6): Likewise. >>>>> (*mulsidi3sisubsi_v6): Likewise. >>>>> * config/arm/predicates.md (subreg_highpart_operator): >>>>> New predicate. >>>>> >>>>> 2015-11-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> >>>>> >>>>> PR target/49526 >>>>> * gcc.target/arm/pr49526_1.c: New test. >>>>> * gcc.target/arm/pr49526_2.c: Likewise. >>>>> * gcc.target/arm/pr49526_3.c: Likewise. >>>> >
commit 88a99c0776864607f91edfb2d337894b8f698dc4 Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Mon Nov 2 15:03:14 2015 +0000 [ARM] Add support for smmul,smmla,smmls instructions diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index cf4d46e..3bed3b5 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -1559,6 +1559,53 @@ (define_insn "*mulsidi3_v6" (set_attr "predicable_short_it" "no")] ) +(define_insn "*mulsidi3si_v6" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (match_operator:SI 3 "subreg_highpart_operator" + [(mult:DI + (sign_extend:DI + (match_operand:SI 1 "s_register_operand" "%r")) + (sign_extend:DI + (match_operand:SI 2 "s_register_operand" "r")))]))] + "TARGET_32BIT && arm_arch6" + "smmul%?\t%0, %1, %2" + [(set_attr "type" "smmul") + (set_attr "predicable" "yes") + (set_attr "predicable_short_it" "no")] +) + +(define_insn "*mulsidi3siaddsi_v6" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (plus:SI + (match_operator:SI 3 "subreg_highpart_operator" + [(mult:DI + (sign_extend:DI (match_operand:SI 1 "s_register_operand" "%r")) + (sign_extend:DI (match_operand:SI 2 "s_register_operand" "r")))]) + (match_operand:SI 4 "s_register_operand" "r")))] + "TARGET_32BIT && arm_arch6" + "smmla%?\t%0, %1, %2, %4" + [(set_attr "type" "smmla") + (set_attr "predicable" "yes") + (set_attr "predicable_short_it" "no")] +) + +(define_insn "*mulsidi3sisubsi_v6" + [(set (match_operand:SI 0 "s_register_operand" "=r") + (minus:SI + (match_operand:SI 4 "s_register_operand" "r") + (match_operator:SI 3 "subreg_highpart_operator" + [(mult:DI + (sign_extend:DI + (match_operand:SI 1 "s_register_operand" "%r")) + (sign_extend:DI + (match_operand:SI 2 "s_register_operand" "r")))])))] + "TARGET_32BIT && arm_arch6" + "smmls%?\t%0, %1, %2, %4" + [(set_attr "type" "smmla") + (set_attr "predicable" "yes") + (set_attr "predicable_short_it" "no")] +) + (define_expand "umulsidi3" [(set (match_operand:DI 0 "s_register_operand" "") (mult:DI diff --git a/gcc/config/arm/predicates.md b/gcc/config/arm/predicates.md index 48e4ba8..95a36e8 100644 --- a/gcc/config/arm/predicates.md +++ b/gcc/config/arm/predicates.md @@ -117,6 +117,14 @@ (define_special_predicate "subreg_lowpart_operator" (and (match_code "subreg") (match_test "subreg_lowpart_p (op)"))) +(define_special_predicate "subreg_highpart_operator" + (and (match_code "subreg") + (match_test "GET_MODE_SIZE (mode) + < GET_MODE_SIZE (GET_MODE (SUBREG_REG (op))) + && SUBREG_BYTE (op) + == subreg_highpart_offset (mode, + GET_MODE (SUBREG_REG (op)))"))) + ;; Reg, subreg(reg) or const_int. (define_predicate "reg_or_int_operand" (ior (match_code "const_int") diff --git a/gcc/testsuite/gcc.target/arm/pr49526_1.c b/gcc/testsuite/gcc.target/arm/pr49526_1.c new file mode 100644 index 0000000..c48b4ae --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr49526_1.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-require-effective-target arm_arch_v6_ok } */ +/* { dg-add-options arm_arch_v6 } */ + +int +foo (int a, int b) +{ + return ((long long) a * b) >> 32; +} + +/* { dg-final { scan-assembler "smmul\[ \t\]" } } */ diff --git a/gcc/testsuite/gcc.target/arm/pr49526_2.c b/gcc/testsuite/gcc.target/arm/pr49526_2.c new file mode 100644 index 0000000..1f83eab --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr49526_2.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-require-effective-target arm_arch_v6_ok } */ +/* { dg-add-options arm_arch_v6 } */ + +int +foo (int a, int b, int c) +{ + return c + (((long long) a * b) >> 32); +} + +/* { dg-final { scan-assembler "smmla\[ \t\]" } } */ diff --git a/gcc/testsuite/gcc.target/arm/pr49526_3.c b/gcc/testsuite/gcc.target/arm/pr49526_3.c new file mode 100644 index 0000000..b3b5f78 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/pr49526_3.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-O2" } */ +/* { dg-require-effective-target arm_arch_v6_ok } */ +/* { dg-add-options arm_arch_v6 } */ + +int +foo (int a, int b, int c) +{ + return c - (((long long) a * b) >> 32); +} + +/* { dg-final { scan-assembler "smmls\[ \t\]" } } */