Message ID | 20180705191929.30773-2-richard.henderson@linaro.org |
---|---|
State | Superseded |
Series | target/arm: SVE fixes |
Richard Henderson <richard.henderson@linaro.org> writes:

> Normally this is automatic in the size restrictions that are placed
> on vector sizes coming from the implementation. However, for the
> legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final
> 24 bytes of the vector register. Without this check, do_dup selects
> TCG_TYPE_V128 and clears only 16 bytes.
>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>

> ---
>  tcg/tcg-op-gvec.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
> index 22db1590d5..61c25f5784 100644
> --- a/tcg/tcg-op-gvec.c
> +++ b/tcg/tcg-op-gvec.c
> @@ -287,8 +287,11 @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
>     in units of LNSZ. This limits the expansion of inline code. */
>  static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz)
>  {
> -    uint32_t lnct = oprsz / lnsz;
> -    return lnct >= 1 && lnct <= MAX_UNROLL;
> +    if (oprsz % lnsz == 0) {
> +        uint32_t lnct = oprsz / lnsz;
> +        return lnct >= 1 && lnct <= MAX_UNROLL;
> +    }
> +    return false;
>  }
>
>  static void expand_clr(uint32_t dofs, uint32_t maxsz);

--
Alex Bennée
diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c
index 22db1590d5..61c25f5784 100644
--- a/tcg/tcg-op-gvec.c
+++ b/tcg/tcg-op-gvec.c
@@ -287,8 +287,11 @@ void tcg_gen_gvec_4_ptr(uint32_t dofs, uint32_t aofs, uint32_t bofs,
    in units of LNSZ. This limits the expansion of inline code. */
 static inline bool check_size_impl(uint32_t oprsz, uint32_t lnsz)
 {
-    uint32_t lnct = oprsz / lnsz;
-    return lnct >= 1 && lnct <= MAX_UNROLL;
+    if (oprsz % lnsz == 0) {
+        uint32_t lnct = oprsz / lnsz;
+        return lnct >= 1 && lnct <= MAX_UNROLL;
+    }
+    return false;
 }

 static void expand_clr(uint32_t dofs, uint32_t maxsz);
Normally this is automatic in the size restrictions that are placed
on vector sizes coming from the implementation. However, for the
legitimate size tuple [oprsz=8, maxsz=32], we need to clear the final
24 bytes of the vector register. Without this check, do_dup selects
TCG_TYPE_V128 and clears only 16 bytes.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/tcg-op-gvec.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

--
2.17.1
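
To make the arithmetic in the commit message concrete, here is a minimal standalone sketch of the check before and after the patch. It rests on assumptions not stated in the patch: that the tail clear for [oprsz=8, maxsz=32] reaches the check as a 24-byte operation (32 - 8), and that MAX_UNROLL is 4. The names check_size_old, check_size_new, and bytes_covered are hypothetical, introduced only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration; the patch does not show the definition. */
#define MAX_UNROLL 4

/* Pre-patch logic: only the truncated lane count is examined. */
static bool check_size_old(uint32_t oprsz, uint32_t lnsz)
{
    uint32_t lnct = oprsz / lnsz;
    return lnct >= 1 && lnct <= MAX_UNROLL;
}

/* Post-patch logic: the size must also be an exact multiple of the lane. */
static bool check_size_new(uint32_t oprsz, uint32_t lnsz)
{
    if (oprsz % lnsz == 0) {
        uint32_t lnct = oprsz / lnsz;
        return lnct >= 1 && lnct <= MAX_UNROLL;
    }
    return false;
}

/* Hypothetical helper: bytes touched when only whole lanes are stored. */
static uint32_t bytes_covered(uint32_t oprsz, uint32_t lnsz)
{
    return (oprsz / lnsz) * lnsz;
}
```

Under these assumptions, check_size_old(24, 16) accepts the 16-byte lane even though bytes_covered(24, 16) is only 16, which matches the "clears only 16 bytes" symptom. check_size_new(24, 16) rejects it, while check_size_new(24, 8) still accepts three 8-byte lanes that cover all 24 bytes.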