Message ID | 87efs4c61d.fsf@linaro.org |
---|---|
State | New |
On Tue, Aug 22, 2017 at 10:20:46AM +0100, Richard Sandiford wrote:
> The AArch64 port defined x2, x3 and x4 vector modes that were only used
> in the rtl for the AdvSIMD LD{2,3,4} patterns.  It seems unlikely that
> this rtl would have led to any valid simplifications, since the values
> involved were unspecs that had a different number of operands from the
> non-dreg versions.  (The dreg UNSPEC_LD2 had a single operand, while
> the qreg one had two operands.)
>
> As it happened, the patterns led to invalid simplifications on
> big-endian targets due to a mix-up in the operand order, see Tamar's
> fix in r240271.
>
> This patch therefore replaces the rtl patterns with dedicated unspecs.
> This allows the x2, x3 and x4 modes to be removed, avoiding a clash
> with 256-bit and 512-bit SVE.
>
> Tested on aarch64-linux-gnu.  Also tested by comparing the before and after
> assembly for the testsuite at -O2 -ftree-vectorize on aarch64-linux-gnu
> and aarch64_be-linux-gnu; there were no differences.  OK to install?

This is OK.

I think we were being a bit too optimistic with the design of these RTL
patterns, though clearly at least some part of this was getting used in
rare corner cases, as we had a wrong-code bug.

I think what these would give you is a simplification if you used a
64-bit ld2/3/4 and then extracted a lane which GCC knew to be zero.  This
happened by mistake with the backwards lane indexes for big-endian,
causing us to think the DImode subreg we took was constant zero.

In reality, there is not likely to be a common case that needs
optimisation here, so I don't mind losing this.

OK.

Thanks,
James

> 2017-08-22  Richard Sandiford  <richard.sandiford@linaro.org>
>             Alan Hayward  <alan.hayward@arm.com>
>             David Sherwood  <david.sherwood@arm.com>
>
> gcc/
>     * config/aarch64/aarch64-modes.def: Remove 32-, 48- and 64-byte
>     vector modes.
>     * config/aarch64/iterators.md (VRL2, VRL3, VRL4): Delete.
>     * config/aarch64/aarch64.md (UNSPEC_LD2_DREG, UNSPEC_LD3_DREG)
>     (UNSPEC_LD4_DREG): New unspecs.
>     * config/aarch64/aarch64-simd.md (aarch64_ld2<mode>_dreg_le)
>     (aarch64_ld2<mode>_dreg_be): Replace with...
>     (aarch64_ld2<mode>_dreg): ...this pattern and use the new DREG
>     unspec.
>     (aarch64_ld3<mode>_dreg_le)
>     (aarch64_ld3<mode>_dreg_be): Replace with...
>     (aarch64_ld3<mode>_dreg): ...this pattern and use the new DREG
>     unspec.
>     (aarch64_ld4<mode>_dreg_le)
>     (aarch64_ld4<mode>_dreg_be): Replace with...
>     (aarch64_ld4<mode>_dreg): ...this pattern and use the new DREG
>     unspec.
>
> Index: gcc/config/aarch64/aarch64-modes.def
> ===================================================================
> --- gcc/config/aarch64/aarch64-modes.def  2017-02-23 19:54:24.000000000 +0000
> +++ gcc/config/aarch64/aarch64-modes.def  2017-08-22 10:11:04.724056356 +0100
> @@ -44,15 +44,5 @@ INT_MODE (OI, 32);
>  INT_MODE (CI, 48);
>  INT_MODE (XI, 64);
>
> -/* Vector modes for register lists.  */
> -VECTOR_MODES (INT, 32);    /* V32QI V16HI V8SI V4DI.  */
> -VECTOR_MODES (FLOAT, 32);  /* V8SF V4DF.  */
> -
> -VECTOR_MODES (INT, 48);    /* V32QI V16HI V8SI V4DI.  */
> -VECTOR_MODES (FLOAT, 48);  /* V8SF V4DF.  */
> -
> -VECTOR_MODES (INT, 64);    /* V32QI V16HI V8SI V4DI.  */
> -VECTOR_MODES (FLOAT, 64);  /* V8SF V4DF.  */
> -
>  /* Quad float: 128-bit floating mode for long doubles.  */
>  FLOAT_MODE (TF, 16, ieee_quad_format);
> Index: gcc/config/aarch64/iterators.md
> ===================================================================
> --- gcc/config/aarch64/iterators.md  2017-08-03 10:40:55.896279339 +0100
> +++ gcc/config/aarch64/iterators.md  2017-08-22 10:11:04.727125997 +0100
> @@ -711,21 +711,6 @@ (define_mode_attr Vendreg [(OI "T") (CI
>  ;; ld..._lane and st..._lane operations.
>  (define_mode_attr nregs [(OI "2") (CI "3") (XI "4")])
>
> -(define_mode_attr VRL2 [(V8QI "V32QI") (V4HI "V16HI")
> -                        (V4HF "V16HF")
> -                        (V2SI "V8SI")  (V2SF "V8SF")
> -                        (DI   "V4DI")  (DF   "V4DF")])
> -
> -(define_mode_attr VRL3 [(V8QI "V48QI") (V4HI "V24HI")
> -                        (V4HF "V24HF")
> -                        (V2SI "V12SI") (V2SF "V12SF")
> -                        (DI   "V6DI")  (DF   "V6DF")])
> -
> -(define_mode_attr VRL4 [(V8QI "V64QI") (V4HI "V32HI")
> -                        (V4HF "V32HF")
> -                        (V2SI "V16SI") (V2SF "V16SF")
> -                        (DI   "V8DI")  (DF   "V8DF")])
> -
>  ;; Mode for atomic operation suffixes
>  (define_mode_attr atomic_sfx
>    [(QI "b") (HI "h") (SI "") (DI "")])
> Index: gcc/config/aarch64/aarch64.md
> ===================================================================
> --- gcc/config/aarch64/aarch64.md  2017-08-16 08:50:34.060622654 +0100
> +++ gcc/config/aarch64/aarch64.md  2017-08-22 10:11:04.726102783 +0100
> @@ -98,10 +98,13 @@ (define_c_enum "unspec" [
>      UNSPEC_GOTTINYTLS
>      UNSPEC_LD1
>      UNSPEC_LD2
> +    UNSPEC_LD2_DREG
>      UNSPEC_LD2_DUP
>      UNSPEC_LD3
> +    UNSPEC_LD3_DREG
>      UNSPEC_LD3_DUP
>      UNSPEC_LD4
> +    UNSPEC_LD4_DREG
>      UNSPEC_LD4_DUP
>      UNSPEC_LD2_LANE
>      UNSPEC_LD3_LANE
> Index: gcc/config/aarch64/aarch64-simd.md
> ===================================================================
> --- gcc/config/aarch64/aarch64-simd.md  2017-08-17 17:29:27.227162205 +0100
> +++ gcc/config/aarch64/aarch64-simd.md  2017-08-22 10:11:04.726102783 +0100
> @@ -4963,278 +4963,62 @@ (define_expand "aarch64_ld<VSTRUCT:nregs
>    DONE;
>  })
>
> -(define_insn "aarch64_ld2<mode>_dreg_le"
> +(define_insn "aarch64_ld2<mode>_dreg"
>    [(set (match_operand:OI 0 "register_operand" "=w")
> -        (subreg:OI
> -          (vec_concat:<VRL2>
> -            (vec_concat:<VDBL>
> -              (unspec:VD
> -                [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                UNSPEC_LD2)
> -              (vec_duplicate:VD (const_int 0)))
> -            (vec_concat:<VDBL>
> -              (unspec:VD [(match_dup 1)]
> -                UNSPEC_LD2)
> -              (vec_duplicate:VD (const_int 0)))) 0))]
> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
> -  "ld2\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"
> -  [(set_attr "type" "neon_load2_2reg<q>")]
> -)
> -
> -(define_insn "aarch64_ld2<mode>_dreg_be"
> -  [(set (match_operand:OI 0 "register_operand" "=w")
> -        (subreg:OI
> -          (vec_concat:<VRL2>
> -            (vec_concat:<VDBL>
> -              (vec_duplicate:VD (const_int 0))
> -              (unspec:VD
> -                [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                UNSPEC_LD2))
> -            (vec_concat:<VDBL>
> -              (vec_duplicate:VD (const_int 0))
> -              (unspec:VD [(match_dup 1)]
> -                UNSPEC_LD2))) 0))]
> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +        (unspec:OI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
> +                    (unspec:VD [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
> +                   UNSPEC_LD2_DREG))]
> +  "TARGET_SIMD"
>    "ld2\\t{%S0.<Vtype> - %T0.<Vtype>}, %1"
>    [(set_attr "type" "neon_load2_2reg<q>")]
>  )
>
> -(define_insn "aarch64_ld2<mode>_dreg_le"
> -  [(set (match_operand:OI 0 "register_operand" "=w")
> -        (subreg:OI
> -          (vec_concat:<VRL2>
> -            (vec_concat:<VDBL>
> -              (unspec:DX
> -                [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                UNSPEC_LD2)
> -              (const_int 0))
> -            (vec_concat:<VDBL>
> -              (unspec:DX [(match_dup 1)]
> -                UNSPEC_LD2)
> -              (const_int 0))) 0))]
> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
> -  "ld1\\t{%S0.1d - %T0.1d}, %1"
> -  [(set_attr "type" "neon_load1_2reg<q>")]
> -)
> -
> -(define_insn "aarch64_ld2<mode>_dreg_be"
> +(define_insn "aarch64_ld2<mode>_dreg"
>    [(set (match_operand:OI 0 "register_operand" "=w")
> -        (subreg:OI
> -          (vec_concat:<VRL2>
> -            (vec_concat:<VDBL>
> -              (const_int 0)
> -              (unspec:DX
> -                [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                UNSPEC_LD2))
> -            (vec_concat:<VDBL>
> -              (const_int 0)
> -              (unspec:DX [(match_dup 1)]
> -                UNSPEC_LD2))) 0))]
> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +        (unspec:OI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
> +                    (unspec:DX [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
> +                   UNSPEC_LD2_DREG))]
> +  "TARGET_SIMD"
>    "ld1\\t{%S0.1d - %T0.1d}, %1"
>    [(set_attr "type" "neon_load1_2reg<q>")]
>  )
>
> -(define_insn "aarch64_ld3<mode>_dreg_le"
> +(define_insn "aarch64_ld3<mode>_dreg"
>    [(set (match_operand:CI 0 "register_operand" "=w")
> -        (subreg:CI
> -          (vec_concat:<VRL3>
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (unspec:VD
> -                  [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                  UNSPEC_LD3)
> -                (vec_duplicate:VD (const_int 0)))
> -              (vec_concat:<VDBL>
> -                (unspec:VD [(match_dup 1)]
> -                  UNSPEC_LD3)
> -                (vec_duplicate:VD (const_int 0))))
> -            (vec_concat:<VDBL>
> -              (unspec:VD [(match_dup 1)]
> -                UNSPEC_LD3)
> -              (vec_duplicate:VD (const_int 0)))) 0))]
> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
> -  "ld3\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"
> -  [(set_attr "type" "neon_load3_3reg<q>")]
> -)
> -
> -(define_insn "aarch64_ld3<mode>_dreg_be"
> -  [(set (match_operand:CI 0 "register_operand" "=w")
> -        (subreg:CI
> -          (vec_concat:<VRL3>
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (vec_duplicate:VD (const_int 0))
> -                (unspec:VD
> -                  [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                  UNSPEC_LD3))
> -              (vec_concat:<VDBL>
> -                (vec_duplicate:VD (const_int 0))
> -                (unspec:VD [(match_dup 1)]
> -                  UNSPEC_LD3)))
> -            (vec_concat:<VDBL>
> -              (vec_duplicate:VD (const_int 0))
> -              (unspec:VD [(match_dup 1)]
> -                UNSPEC_LD3))) 0))]
> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +        (unspec:CI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
> +                    (unspec:VD [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
> +                   UNSPEC_LD3_DREG))]
> +  "TARGET_SIMD"
>    "ld3\\t{%S0.<Vtype> - %U0.<Vtype>}, %1"
>    [(set_attr "type" "neon_load3_3reg<q>")]
>  )
>
> -(define_insn "aarch64_ld3<mode>_dreg_le"
> +(define_insn "aarch64_ld3<mode>_dreg"
>    [(set (match_operand:CI 0 "register_operand" "=w")
> -        (subreg:CI
> -          (vec_concat:<VRL3>
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (unspec:DX
> -                  [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                  UNSPEC_LD3)
> -                (const_int 0))
> -              (vec_concat:<VDBL>
> -                (unspec:DX [(match_dup 1)]
> -                  UNSPEC_LD3)
> -                (const_int 0)))
> -            (vec_concat:<VDBL>
> -              (unspec:DX [(match_dup 1)]
> -                UNSPEC_LD3)
> -              (const_int 0))) 0))]
> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
> -  "ld1\\t{%S0.1d - %U0.1d}, %1"
> -  [(set_attr "type" "neon_load1_3reg<q>")]
> -)
> -
> -(define_insn "aarch64_ld3<mode>_dreg_be"
> -  [(set (match_operand:CI 0 "register_operand" "=w")
> -        (subreg:CI
> -          (vec_concat:<VRL3>
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (const_int 0)
> -                (unspec:DX
> -                  [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                  UNSPEC_LD3))
> -              (vec_concat:<VDBL>
> -                (const_int 0)
> -                (unspec:DX [(match_dup 1)]
> -                  UNSPEC_LD3)))
> -            (vec_concat:<VDBL>
> -              (const_int 0)
> -              (unspec:DX [(match_dup 1)]
> -                UNSPEC_LD3))) 0))]
> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +        (unspec:CI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
> +                    (unspec:DX [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
> +                   UNSPEC_LD3_DREG))]
> +  "TARGET_SIMD"
>    "ld1\\t{%S0.1d - %U0.1d}, %1"
>    [(set_attr "type" "neon_load1_3reg<q>")]
>  )
>
> -(define_insn "aarch64_ld4<mode>_dreg_le"
> -  [(set (match_operand:XI 0 "register_operand" "=w")
> -        (subreg:XI
> -          (vec_concat:<VRL4>
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (unspec:VD
> -                  [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                  UNSPEC_LD4)
> -                (vec_duplicate:VD (const_int 0)))
> -              (vec_concat:<VDBL>
> -                (unspec:VD [(match_dup 1)]
> -                  UNSPEC_LD4)
> -                (vec_duplicate:VD (const_int 0))))
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (unspec:VD [(match_dup 1)]
> -                  UNSPEC_LD4)
> -                (vec_duplicate:VD (const_int 0)))
> -              (vec_concat:<VDBL>
> -                (unspec:VD [(match_dup 1)]
> -                  UNSPEC_LD4)
> -                (vec_duplicate:VD (const_int 0))))) 0))]
> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
> -  "ld4\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"
> -  [(set_attr "type" "neon_load4_4reg<q>")]
> -)
> -
> -(define_insn "aarch64_ld4<mode>_dreg_be"
> +(define_insn "aarch64_ld4<mode>_dreg"
>    [(set (match_operand:XI 0 "register_operand" "=w")
> -        (subreg:XI
> -          (vec_concat:<VRL4>
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (vec_duplicate:VD (const_int 0))
> -                (unspec:VD
> -                  [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                  UNSPEC_LD4))
> -              (vec_concat:<VDBL>
> -                (vec_duplicate:VD (const_int 0))
> -                (unspec:VD [(match_dup 1)]
> -                  UNSPEC_LD4)))
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (vec_duplicate:VD (const_int 0))
> -                (unspec:VD [(match_dup 1)]
> -                  UNSPEC_LD4))
> -              (vec_concat:<VDBL>
> -                (vec_duplicate:VD (const_int 0))
> -                (unspec:VD [(match_dup 1)]
> -                  UNSPEC_LD4)))) 0))]
> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +        (unspec:XI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
> +                    (unspec:VD [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
> +                   UNSPEC_LD4_DREG))]
> +  "TARGET_SIMD"
>    "ld4\\t{%S0.<Vtype> - %V0.<Vtype>}, %1"
>    [(set_attr "type" "neon_load4_4reg<q>")]
>  )
>
> -(define_insn "aarch64_ld4<mode>_dreg_le"
> +(define_insn "aarch64_ld4<mode>_dreg"
>    [(set (match_operand:XI 0 "register_operand" "=w")
> -        (subreg:XI
> -          (vec_concat:<VRL4>
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (unspec:DX
> -                  [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                  UNSPEC_LD4)
> -                (const_int 0))
> -              (vec_concat:<VDBL>
> -                (unspec:DX [(match_dup 1)]
> -                  UNSPEC_LD4)
> -                (const_int 0)))
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (unspec:DX [(match_dup 1)]
> -                  UNSPEC_LD4)
> -                (const_int 0))
> -              (vec_concat:<VDBL>
> -                (unspec:DX [(match_dup 1)]
> -                  UNSPEC_LD4)
> -                (const_int 0)))) 0))]
> -  "TARGET_SIMD && !BYTES_BIG_ENDIAN"
> -  "ld1\\t{%S0.1d - %V0.1d}, %1"
> -  [(set_attr "type" "neon_load1_4reg<q>")]
> -)
> -
> -(define_insn "aarch64_ld4<mode>_dreg_be"
> -  [(set (match_operand:XI 0 "register_operand" "=w")
> -        (subreg:XI
> -          (vec_concat:<VRL4>
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (const_int 0)
> -                (unspec:DX
> -                  [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")]
> -                  UNSPEC_LD4))
> -              (vec_concat:<VDBL>
> -                (const_int 0)
> -                (unspec:DX [(match_dup 1)]
> -                  UNSPEC_LD4)))
> -            (vec_concat:<VRL2>
> -              (vec_concat:<VDBL>
> -                (const_int 0)
> -                (unspec:DX [(match_dup 1)]
> -                  UNSPEC_LD4))
> -              (vec_concat:<VDBL>
> -                (const_int 0)
> -                (unspec:DX [(match_dup 1)]
> -                  UNSPEC_LD4)))) 0))]
> -  "TARGET_SIMD && BYTES_BIG_ENDIAN"
> +        (unspec:XI [(match_operand:BLK 1 "aarch64_simd_struct_operand" "Utv")
> +                    (unspec:DX [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
> +                   UNSPEC_LD4_DREG))]
> +  "TARGET_SIMD"
>    "ld1\\t{%S0.1d - %V0.1d}, %1"
>    [(set_attr "type" "neon_load1_4reg<q>")]
>  )
> @@ -5248,12 +5032,7 @@ (define_expand "aarch64_ld<VSTRUCT:nregs
>    rtx mem = gen_rtx_MEM (BLKmode, operands[1]);
>    set_mem_size (mem, <VSTRUCT:nregs> * 8);
>
> -  if (BYTES_BIG_ENDIAN)
> -    emit_insn (gen_aarch64_ld<VSTRUCT:nregs><VDC:mode>_dreg_be (operands[0],
> -                                                                mem));
> -  else
> -    emit_insn (gen_aarch64_ld<VSTRUCT:nregs><VDC:mode>_dreg_le (operands[0],
> -                                                                mem));
> +  emit_insn (gen_aarch64_ld<VSTRUCT:nregs><VDC:mode>_dreg (operands[0], mem));
>    DONE;
>  })
>
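For readers less familiar with the D-register forms of LD2/LD3/LD4, here is a rough, portable C model of what the `ld2`/`ld3`/`ld4` byte-lane variants load. It is only an illustrative sketch and is not GCC code: the names `qreg_model` and `model_ldn_dreg` are hypothetical, only the 8-bit lane case is modelled (the DX variants in the patch emit `ld1` on `.1d` registers and do not de-interleave), and lane 0 is simply taken to be the lowest-addressed byte. The model reads `nregs * 8` bytes, the same size the expander passes to `set_mem_size`, and zeroes the upper 64 bits of each 128-bit register, which is the property the removed `vec_concat`-with-zero RTL tried to spell out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one 128-bit vector register; b[0] is lane 0.  */
typedef struct { uint8_t b[16]; } qreg_model;

/* Model of "ldN {v0.8b - v(N-1).8b}, [mem]" for N = 2, 3 or 4.
   Reads nregs * 8 bytes from MEM and de-interleaves them so that
   register R receives bytes R, R + N, R + 2N, ...  */
static void
model_ldn_dreg (qreg_model *out, int nregs, const uint8_t *mem)
{
  assert (nregs >= 2 && nregs <= 4);

  /* Writing a D register clears the upper 64 bits of the Q register.  */
  memset (out, 0, (size_t) nregs * sizeof (qreg_model));

  for (int lane = 0; lane < 8; lane++)
    for (int r = 0; r < nregs; r++)
      out[r].b[lane] = mem[lane * nregs + r];
}
```

With `mem` holding the bytes 0, 1, 2, ... and `nregs == 2`, the first register's low half receives the even bytes, the second receives the odd bytes, and bytes 8 to 15 of both stay zero.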