Message ID | 20180627043328.11531-1-richard.henderson@linaro.org |
---|---|
Series | target/arm SVE patches |
Richard Henderson <richard.henderson@linaro.org> writes:

> This is the remainder of the SVE enablement patches,
> with an extra bonus patch to enable ARMv8.2-DotProd.
>
> V6 updates based on review.

One failure from the VQ3 test set:

  ../qemu.git/aarch64-linux-user/qemu-aarch64 \
    ./risu --test-sve=3 \
    sve-all-short-v8.3+sve@vq3/insn_sdiv_z_p_zz___INC.risu.bin \
    --trace=sve-all-short-v8.3+sve@vq3/insn_sdiv_z_p_zz___INC.risu.bin.trace

Gives:

  loading test image sve-all-short-v8.3+sve@vq3/insn_sdiv_z_p_zz___INC.risu.bin...
  starting apprentice image at 0x4000801000
  starting image
  fish: “../qemu.git/aarch64-linux-user/…” terminated by signal SIGFPE (Floating point exception)

From:

  http://people.linaro.org/~alex.bennee/testcases/arm64.risu/sve-all-short-v8.3+sve@vq3.tar.xz

>
> Patches with changes:
> 0002-target-arm-Implement-SVE-Contiguous-Load-first-fa.patch
> 0007-target-arm-Implement-SVE-FP-Multiply-Add-Group.patch
> 0009-target-arm-Implement-SVE-load-and-broadcast-eleme.patch
> 0010-target-arm-Implement-SVE-store-vector-predicate-r.patch
> 0011-target-arm-Implement-SVE-scatter-stores.patch
> 0013-target-arm-Implement-SVE-gather-loads.patch
> 0023-target-arm-Implement-SVE-floating-point-convert-p.patch
> 0027-target-arm-Implement-SVE-MOVPRFX.patch
> 0030-target-arm-Pass-index-to-AdvSIMD-FCMLA-indexed.patch
> 0033-target-arm-Implement-SVE-dot-product-indexed.patch
> 0034-target-arm-Enable-SVE-for-aarch64-linux-user.patch
> 0035-target-arm-Implement-ARMv8.2-DotProd.patch
>
> Patches lacking reviews:
> 0002-target-arm-Implement-SVE-Contiguous-Load-first-fa.patch
> 0007-target-arm-Implement-SVE-FP-Multiply-Add-Group.patch
> 0013-target-arm-Implement-SVE-gather-loads.patch
> 0030-target-arm-Pass-index-to-AdvSIMD-FCMLA-indexed.patch
> 0031-target-arm-Implement-SVE-fp-complex-multiply-add-.patch
> 0033-target-arm-Implement-SVE-dot-product-indexed.patch
>
>
> r~
>
>
> Richard Henderson (35):
>   target/arm: Implement SVE Memory Contiguous Load Group
>   target/arm: Implement SVE Contiguous Load, first-fault and no-fault
>   target/arm: Implement SVE Memory Contiguous Store Group
>   target/arm: Implement SVE load and broadcast quadword
>   target/arm: Implement SVE integer convert to floating-point
>   target/arm: Implement SVE floating-point arithmetic (predicated)
>   target/arm: Implement SVE FP Multiply-Add Group
>   target/arm: Implement SVE Floating Point Accumulating Reduction Group
>   target/arm: Implement SVE load and broadcast element
>   target/arm: Implement SVE store vector/predicate register
>   target/arm: Implement SVE scatter stores
>   target/arm: Implement SVE prefetches
>   target/arm: Implement SVE gather loads
>   target/arm: Implement SVE first-fault gather loads
>   target/arm: Implement SVE scatter store vector immediate
>   target/arm: Implement SVE floating-point compare vectors
>   target/arm: Implement SVE floating-point arithmetic with immediate
>   target/arm: Implement SVE Floating Point Multiply Indexed Group
>   target/arm: Implement SVE FP Fast Reduction Group
>   target/arm: Implement SVE Floating Point Unary Operations -
>     Unpredicated Group
>   target/arm: Implement SVE FP Compare with Zero Group
>   target/arm: Implement SVE floating-point trig multiply-add coefficient
>   target/arm: Implement SVE floating-point convert precision
>   target/arm: Implement SVE floating-point convert to integer
>   target/arm: Implement SVE floating-point round to integral value
>   target/arm: Implement SVE floating-point unary operations
>   target/arm: Implement SVE MOVPRFX
>   target/arm: Implement SVE floating-point complex add
>   target/arm: Implement SVE fp complex multiply add
>   target/arm: Pass index to AdvSIMD FCMLA (indexed)
>   target/arm: Implement SVE fp complex multiply add (indexed)
>   target/arm: Implement SVE dot product (vectors)
>   target/arm: Implement SVE dot product (indexed)
>   target/arm: Enable SVE for aarch64-linux-user
>   target/arm: Implement ARMv8.2-DotProd
>
>  target/arm/cpu.h           |    1 +
>  target/arm/helper-sve.h    |  682 +++++++++++++
>  target/arm/helper.h        |   44 +-
>  linux-user/elfload.c       |    2 +
>  target/arm/cpu.c           |    8 +
>  target/arm/cpu64.c         |    2 +
>  target/arm/helper.c        |    2 +-
>  target/arm/sve_helper.c    | 1855 ++++++++++++++++++++++++++++++++++++
>  target/arm/translate-a64.c |   57 +-
>  target/arm/translate-sve.c | 1688 +++++++++++++++++++++++++++++++-
>  target/arm/translate.c     |  102 +-
>  target/arm/vec_helper.c    |  311 +++++-
>  target/arm/sve.decode      |  427 +++++++++
>  13 files changed, 5116 insertions(+), 65 deletions(-)

--
Alex Bennée
Richard Henderson <richard.henderson@linaro.org> writes:

> This is the remainder of the SVE enablement patches,
> with an extra bonus patch to enable ARMv8.2-DotProd.
>
> V6 updates based on review.
>
<snip>
> Patches lacking reviews:
> 0002-target-arm-Implement-SVE-Contiguous-Load-first-fa.patch
> 0007-target-arm-Implement-SVE-FP-Multiply-Add-Group.patch
> 0013-target-arm-Implement-SVE-gather-loads.patch
> 0030-target-arm-Pass-index-to-AdvSIMD-FCMLA-indexed.patch
> 0031-target-arm-Implement-SVE-fp-complex-multiply-add-.patch
> 0033-target-arm-Implement-SVE-dot-product-indexed.patch

OK I have finished sweeping through the un-reviewed patches.

>
>
> r~
>
>
> Richard Henderson (35):
>   target/arm: Implement SVE Memory Contiguous Load Group
>   target/arm: Implement SVE Contiguous Load, first-fault and no-fault
>   target/arm: Implement SVE Memory Contiguous Store Group
>   target/arm: Implement SVE load and broadcast quadword
>   target/arm: Implement SVE integer convert to floating-point
>   target/arm: Implement SVE floating-point arithmetic (predicated)
>   target/arm: Implement SVE FP Multiply-Add Group
>   target/arm: Implement SVE Floating Point Accumulating Reduction Group
>   target/arm: Implement SVE load and broadcast element
>   target/arm: Implement SVE store vector/predicate register
>   target/arm: Implement SVE scatter stores
>   target/arm: Implement SVE prefetches
>   target/arm: Implement SVE gather loads
>   target/arm: Implement SVE first-fault gather loads
>   target/arm: Implement SVE scatter store vector immediate
>   target/arm: Implement SVE floating-point compare vectors
>   target/arm: Implement SVE floating-point arithmetic with immediate
>   target/arm: Implement SVE Floating Point Multiply Indexed Group
>   target/arm: Implement SVE FP Fast Reduction Group
>   target/arm: Implement SVE Floating Point Unary Operations -
>     Unpredicated Group
>   target/arm: Implement SVE FP Compare with Zero Group
>   target/arm: Implement SVE floating-point trig multiply-add coefficient
>   target/arm: Implement SVE floating-point convert precision
>   target/arm: Implement SVE floating-point convert to integer
>   target/arm: Implement SVE floating-point round to integral value
>   target/arm: Implement SVE floating-point unary operations
>   target/arm: Implement SVE MOVPRFX
>   target/arm: Implement SVE floating-point complex add
>   target/arm: Implement SVE fp complex multiply add
>   target/arm: Pass index to AdvSIMD FCMLA (indexed)
>   target/arm: Implement SVE fp complex multiply add (indexed)
>   target/arm: Implement SVE dot product (vectors)
>   target/arm: Implement SVE dot product (indexed)
>   target/arm: Enable SVE for aarch64-linux-user
>   target/arm: Implement ARMv8.2-DotProd
>
>  target/arm/cpu.h           |    1 +
>  target/arm/helper-sve.h    |  682 +++++++++++++
>  target/arm/helper.h        |   44 +-
>  linux-user/elfload.c       |    2 +
>  target/arm/cpu.c           |    8 +
>  target/arm/cpu64.c         |    2 +
>  target/arm/helper.c        |    2 +-
>  target/arm/sve_helper.c    | 1855 ++++++++++++++++++++++++++++++++++++
>  target/arm/translate-a64.c |   57 +-
>  target/arm/translate-sve.c | 1688 +++++++++++++++++++++++++++++++-
>  target/arm/translate.c     |  102 +-
>  target/arm/vec_helper.c    |  311 +++++-
>  target/arm/sve.decode      |  427 +++++++++
>  13 files changed, 5116 insertions(+), 65 deletions(-)

--
Alex Bennée
On 28 June 2018 at 12:30, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Richard Henderson <richard.henderson@linaro.org> writes:
>
>> This is the remainder of the SVE enablement patches,
>> with an extra bonus patch to enable ARMv8.2-DotProd.
>>
>> V6 updates based on review.
>
> One failure from the VQ3 test set:
>
>   ../qemu.git/aarch64-linux-user/qemu-aarch64 \
>     ./risu --test-sve=3 \
>     sve-all-short-v8.3+sve@vq3/insn_sdiv_z_p_zz___INC.risu.bin \
>     --trace=sve-all-short-v8.3+sve@vq3/insn_sdiv_z_p_zz___INC.risu.bin.trace
>
> Gives:
>
>   loading test image
>   sve-all-short-v8.3+sve@vq3/insn_sdiv_z_p_zz___INC.risu.bin...
>   starting apprentice image at 0x4000801000
>   starting image
>   fish: “../qemu.git/aarch64-linux-user/…” terminated by signal SIGFPE
>   (Floating point exception)

Do you have the insn that it's barfing on? In particular,
I'm guessing from the test name that this is for something
covered by one of the SDIV_zpzz lines in sve.decode, which
is already in master rather than in this test series.
If that's true, then it shouldn't block applying this set...

thanks
-- PMM
On 28 June 2018 at 15:12, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 28 June 2018 at 12:30, Alex Bennée <alex.bennee@linaro.org> wrote:
>>   loading test image
>>   sve-all-short-v8.3+sve@vq3/insn_sdiv_z_p_zz___INC.risu.bin...
>>   starting apprentice image at 0x4000801000
>>   starting image
>>   fish: “../qemu.git/aarch64-linux-user/…” terminated by signal SIGFPE
>>   (Floating point exception)
>
> Do you have the insn that it's barfing on? In particular,
> I'm guessing from the test name that this is for something
> covered by one of the SDIV_zpzz lines in sve.decode, which
> is already in master rather than in this test series.
> If that's true, then it shouldn't block applying this set...

Further discussion on IRC suggests that this is failing on
MININT idiv -1, which is an annoying special case that
x86 happens to generate SIGFPE for. Compare our HELPER(sdiv) code:

    int32_t HELPER(sdiv)(int32_t num, int32_t den)
    {
        if (den == 0)
          return 0;
        if (num == INT_MIN && den == -1)
          return INT_MIN;
        return num / den;
    }

with what we do for SVE:

    #define DO_DIV(N, M)  (M ? N / M : 0)

This is OK for unsigned division, but signed division needs to
special case INT_MIN / -1.

In any case, this is in an existing insn, so I'm going to apply
this series to target-arm.next (fixing up the patch 5 comment typo).

thanks
-- PMM
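
For readers following along, the special case described above can be sketched as a standalone C function. This is only an illustration that mirrors the HELPER(sdiv) logic quoted in the message. The name `sdiv32_guarded` is hypothetical and this is not necessarily the fix that was later applied to `DO_DIV` in sve_helper.c.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stdint.h>

/*
 * Hypothetical helper (name is an assumption, not a QEMU symbol).
 * Signed 32-bit division with the two results that the quoted
 * HELPER(sdiv) returns for the problematic inputs:
 *   - divide by zero yields 0
 *   - INT32_MIN / -1 yields INT32_MIN
 * Both checks run before the C '/' operator is reached, so the host
 * never executes a division that overflows (undefined behaviour in C,
 * and a SIGFPE from idiv on x86).
 */
static inline int32_t sdiv32_guarded(int32_t num, int32_t den)
{
    if (den == 0) {
        return 0;
    }
    if (num == INT32_MIN && den == -1) {
        return INT32_MIN;
    }
    return num / den;
}
```

A few spot checks, for example `assert(sdiv32_guarded(INT32_MIN, -1) == INT32_MIN);` and `assert(sdiv32_guarded(7, 0) == 0);`, exercise the two guarded paths. Unsigned division only needs the zero check, which is why the existing macro is fine for the unsigned variants.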
Peter Maydell <peter.maydell@linaro.org> writes:

> On 28 June 2018 at 12:30, Alex Bennée <alex.bennee@linaro.org> wrote:
>>
>> Richard Henderson <richard.henderson@linaro.org> writes:
>>
>>> This is the remainder of the SVE enablement patches,
>>> with an extra bonus patch to enable ARMv8.2-DotProd.
>>>
>>> V6 updates based on review.
>>
>> One failure from the VQ3 test set:
>>
>>   ../qemu.git/aarch64-linux-user/qemu-aarch64 \
>>     ./risu --test-sve=3 \
>>     sve-all-short-v8.3+sve@vq3/insn_sdiv_z_p_zz___INC.risu.bin \
>>     --trace=sve-all-short-v8.3+sve@vq3/insn_sdiv_z_p_zz___INC.risu.bin.trace
>>
>> Gives:
>>
>>   loading test image
>>   sve-all-short-v8.3+sve@vq3/insn_sdiv_z_p_zz___INC.risu.bin...
>>   starting apprentice image at 0x4000801000
>>   starting image
>>   fish: “../qemu.git/aarch64-linux-user/…” terminated by signal SIGFPE
>>   (Floating point exception)
>
> Do you have the insn that it's barfing on? In particular,
> I'm guessing from the test name that this is for something
> covered by one of the SDIV_zpzz lines in sve.decode, which
> is already in master rather than in this test series.
> If that's true, then it shouldn't block applying this set...

  #0  0x000055555569297f in helper_sve_sdiv_zpzz_s (vd=0x555557a522e0, vn=0x555557a522e0, vm=0x555557a51fe0, vg=0x555557a52be0, desc=<optimised out>) at /home/alex/lsrc/qemu/qemu.git/target/arm/sve_helper.c:480
  #1  0x0000555555b1283f in static_code_gen_buffer ()
  #2  0x00005555555ea0d8 in cpu_tb_exec (itb=<optimised out>, cpu=0x555557a50320) at /home/alex/lsrc/qemu/qemu.git/accel/tcg/cpu-exec.c:171
  #3  cpu_loop_exec_tb (tb_exit=<synthetic pointer>, last_tb=<synthetic pointer>, tb=<optimised out>, cpu=0x555557a50320) at /home/alex/lsrc/qemu/qemu.git/accel/tcg/cpu-exec.c:612
  #4  cpu_exec (cpu=cpu@entry=0x555557a48070) at /home/alex/lsrc/qemu/qemu.git/accel/tcg/cpu-exec.c:722
  #5  0x000055555560ad40 in cpu_loop (env=0x555557a50320) at /home/alex/lsrc/qemu/qemu.git/linux-user/aarch64/cpu_loop.c:82
  #6  0x00005555555afb0c in main (argc=<optimised out>, argv=0x7fffffffdea8, envp=<optimised out>) at /home/alex/lsrc/qemu/qemu.git/linux-user/main.c:813
  #0  0x000055555569297f in helper_sve_sdiv_zpzz_s (vd=0x555557a522e0, vn=0x555557a522e0, vm=0x555557a51fe0, vg=0x555557a52be0, desc=<optimised out>) at /home/alex/lsrc/qemu/qemu.git/target/arm/sve_helper.c:480
  480     DO_ZPZZ(sve_sdiv_zpzz_s, int32_t, H1_4, DO_DIV)
  => 0x55555569297f <helper_sve_sdiv_zpzz_s+63>:  idiv   %r10d
     0x555555692982 <helper_sve_sdiv_zpzz_s+66>:  mov    %eax,%r11d
     0x555555692985 <helper_sve_sdiv_zpzz_s+69>:  mov    %r11d,(%rdi,%r8,1)
     0x555555692989 <helper_sve_sdiv_zpzz_s+73>:  add    $0x4,%r8
     0x55555569298d <helper_sve_sdiv_zpzz_s+77>:  shr    $0x4,%r9w
  A syntax error in expression, near `./ $r10d'.
  r10d
  $6 = 0xffffffff
  rax
  $7 = 0x80000000
  rdx
  $8 = 0xffffffff

Yeah so from something already merged in.

--
Alex Bennée
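
As a reading aid for the register dump above: the 32-bit `idiv %r10d` divides the signed value in edx:eax by r10d. The sketch below decodes the printed bit patterns as signed 32-bit integers. The helper name `reg_as_i32` is hypothetical and exists only for this illustration.

```c
#include <assert.h>   /* used by the checks in the usage note */
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: reinterpret the low 32 bits of a register, as
 * printed by gdb in hex, as a signed 32-bit integer. memcpy avoids
 * relying on implementation-defined unsigned-to-signed conversion;
 * int32_t is guaranteed to be two's complement with no padding bits.
 */
static inline int32_t reg_as_i32(uint32_t bits)
{
    int32_t value;
    memcpy(&value, &bits, sizeof(value));
    return value;
}
```

With the values from the dump, `reg_as_i32(0x80000000u)` is `INT32_MIN` (the dividend in eax) and `reg_as_i32(0xffffffffu)` is `-1` (the divisor in r10d), which is the MININT / -1 case discussed earlier in the thread. The `0xffffffff` in rdx is consistent with eax having been sign-extended into edx before the divide.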