Message ID | 5853DC63.3030602@foss.arm.com
---|---
State | New
On Fri, Dec 16, 2016 at 12:21:55PM +0000, Kyrill Tkachov wrote:
> 
> On 15/12/16 11:56, James Greenhalgh wrote:
> >On Thu, Dec 08, 2016 at 09:35:09AM +0000, Kyrill Tkachov wrote:
> >>Hi all,
> >>
> >>Similar to the previous patch, this transforms X-reg UBFIZ instructions into
> >>W-reg LSL instructions when the UBFIZ operands add up to 32, so we can take
> >>advantage of the implicit zero-extension to DImode when writing to a
> >>W-register.
> >>
> >>This is done by splitting the existing *andim_ashift<mode>_bfiz pattern into
> >>its two SImode and DImode specialisations and changing the DImode pattern
> >>into a define_insn_and_split that splits into a zero-extended SImode ashift
> >>when the operands match up.
> >>
> >>So for the code in the testcase we generate:
> >>	LSL	W0, W0, 5
> >>
> >>instead of:
> >>	UBFIZ	X0, X0, 5, 27
> >>
> >>Bootstrapped and tested on aarch64-none-linux-gnu.
> >>
> >>Since we're in stage 3 perhaps this is not for GCC 6, but it is fairly low
> >>risk. I'm happy for it to wait for the next release if necessary.
> >
> >My comments on the previous patch also apply here. This patch should only
> >need to add one new split pattern. OK with a small nit fixed.
> >
> >Thanks,
> >James
> 
> Thanks, here is the version adding just a single define_split.
> 
> Bootstrapped and tested on aarch64-none-linux-gnu.
> 
> 2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * config/aarch64/aarch64.md: New define_split above bswap<mode>2.
> 
> 2016-12-16  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     * gcc.target/aarch64/ubfiz_lsl_1.c: New test.
> diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
> index 5a40ee6abd5e123116aaaa478dced2207dd59478..b0f7bcbb84159fc8c0c733d0b40f2f08eea241a9 100644
> --- a/gcc/config/aarch64/aarch64.md
> +++ b/gcc/config/aarch64/aarch64.md
> @@ -4454,6 +4454,24 @@ (define_insn "*andim_ashift<mode>_bfiz"
>    [(set_attr "type" "bfx")]
>  )
>  
> +;; When the bitposition and width of the equivalent extraction add up to 32

s/bitposition/bit position/
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 5a40ee6abd5e123116aaaa478dced2207dd59478..b0f7bcbb84159fc8c0c733d0b40f2f08eea241a9 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -4454,6 +4454,24 @@ (define_insn "*andim_ashift<mode>_bfiz"
   [(set_attr "type" "bfx")]
 )
 
+;; When the bitposition and width of the equivalent extraction add up to 32
+;; we can use a W-reg LSL instruction taking advantage of the implicit
+;; zero-extension of the X-reg.
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+	(and:DI (ashift:DI (match_operand:DI 1 "register_operand")
+			   (match_operand 2 "const_int_operand"))
+		(match_operand 3 "const_int_operand")))]
+  "aarch64_mask_and_shift_for_ubfiz_p (DImode, operands[3], operands[2])
+   && (INTVAL (operands[2]) + popcount_hwi (INTVAL (operands[3])))
+       == GET_MODE_BITSIZE (SImode)"
+  [(set (match_dup 0)
+	(zero_extend:DI (ashift:SI (match_dup 4) (match_dup 2))))]
+  {
+    operands[4] = gen_lowpart (SImode, operands[1]);
+  }
+)
+
 (define_insn "bswap<mode>2"
   [(set (match_operand:GPI 0 "register_operand" "=r")
	(bswap:GPI (match_operand:GPI 1 "register_operand" "r")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c b/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c
new file mode 100644
index 0000000000000000000000000000000000000000..d3fd3f234f2324d71813298210fdcf0660ac45b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ubfiz_lsl_1.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check that an X-reg UBFIZ can be simplified into a W-reg LSL.  */
+
+long long
+f2 (long long x)
+{
+  return (x << 5) & 0xffffffff;
+}
+
+/* { dg-final { scan-assembler "lsl\tw" } } */
+/* { dg-final { scan-assembler-not "ubfiz\tx" } } */