Message ID | 87zi0pct6j.fsf@linaro.org |
---|---|
State | New |
Series | Add IFN_COND_{MUL,DIV,MOD,RDIV} |
On Thu, May 24, 2018 at 11:34 AM Richard Sandiford <richard.sandiford@linaro.org> wrote:

> This patch adds support for conditional multiplication and division.
> It's mostly mechanical, but a few notes:
>
> * The *_optab name and the .md names are the same as the unconditional
>   forms, just with "cond_" added to the front.  This means we still
>   have the awkward difference between sdiv and div, etc.
>
> * It was easier to retain the difference between integer and FP
>   division in the function names, given that they map to different
>   tree codes (TRUNC_DIV_EXPR and RDIV_EXPR).
>
> * SVE has no direct support for IFN_COND_MOD, but it seemed more
>   consistent to add it anyway.
>
> * Adding IFN_COND_MUL enables an extra fully-masked reduction
>   in gcc.dg/vect/pr53773.c.
>
> * In practice we don't actually use the integer division forms without
>   if-conversion support (added by a later patch).
>
> Tested on aarch64-linux-gnu (with and without SVE), aarch64_be-elf
> and x86_64-linux-gnu.  OK for the non-AArch64 bits?

OK.

Richard.

> Richard
>
> 2018-05-24  Richard Sandiford  <richard.sandiford@linaro.org>
>
> gcc/
>         * doc/sourcebuild.texi (vect_double_cond_arith): Include
>         multiplication and division.
>         * doc/md.texi (cond_mul@var{m}, cond_div@var{m}, cond_mod@var{m})
>         (cond_udiv@var{m}, cond_umod@var{m}): Document.
>         * optabs.def (cond_smul_optab, cond_sdiv_optab, cond_smod_optab)
>         (cond_udiv_optab, cond_umod_optab): New optabs.
>         * internal-fn.def (IFN_COND_MUL, IFN_COND_DIV, IFN_COND_MOD)
>         (IFN_COND_RDIV): New internal functions.
>         * internal-fn.c (get_conditional_internal_fn): Handle TRUNC_DIV_EXPR,
>         TRUNC_MOD_EXPR and RDIV_EXPR.
>         * genmatch.c (commutative_op): Handle CFN_COND_MUL.
>         * match.pd (UNCOND_BINARY, COND_BINARY): Handle them.
>         * config/aarch64/iterators.md (UNSPEC_COND_MUL, UNSPEC_COND_DIV):
>         New unspecs.
>         (SVE_INT_BINARY): Include mult.
>         (SVE_COND_FP_BINARY): Include UNSPEC_MUL and UNSPEC_DIV.
>         (optab, sve_int_op): Handle mult.
>         (optab, sve_fp_op, commutative): Handle UNSPEC_COND_MUL and
>         UNSPEC_COND_DIV.
>         * config/aarch64/aarch64-sve.md (cond_<optab><mode>): New pattern
>         for SVE_INT_BINARY_SD.
>
> gcc/testsuite/
>         * lib/target-supports.exp
>         (check_effective_target_vect_double_cond_arith): Include
>         multiplication and division.
>         * gcc.dg/vect/pr53773.c: Do not expect a scalar tail when using
>         fully-masked loops with a fixed vector length.
>         * gcc.dg/vect/vect-cond-arith-1.c: Add multiplication and division
>         tests.
>         * gcc.target/aarch64/sve/vcond_8.c: Likewise.
>         * gcc.target/aarch64/sve/vcond_9.c: Likewise.
>         * gcc.target/aarch64/sve/vcond_12.c: Add multiplication tests.
>
> Index: gcc/doc/sourcebuild.texi
> ===================================================================
> --- gcc/doc/sourcebuild.texi 2018-05-24 09:54:37.508451387 +0100
> +++ gcc/doc/sourcebuild.texi 2018-05-24 10:12:10.145352193 +0100
> @@ -1426,8 +1426,9 @@ have different type from the value opera
>  Target supports hardware vectors of @code{double}.
>  @item vect_double_cond_arith
> -Target supports conditional addition, subtraction, minimum and maximum
> -on vectors of @code{double}, via the @code{cond_} optabs.
> +Target supports conditional addition, subtraction, multiplication,
> +division, minimum and maximum on vectors of @code{double}, via the
> +@code{cond_} optabs.
>  @item vect_element_align_preferred
>  The target's preferred vector alignment is the same as the element
> Index: gcc/doc/md.texi
> ===================================================================
> --- gcc/doc/md.texi 2018-05-24 09:32:10.522816506 +0100
> +++ gcc/doc/md.texi 2018-05-24 10:12:10.142352315 +0100
> @@ -6333,6 +6333,11 @@ operand 0, otherwise (operand 2 + operan
>  @cindex @code{cond_add@var{mode}} instruction pattern
>  @cindex @code{cond_sub@var{mode}} instruction pattern
> +@cindex @code{cond_mul@var{mode}} instruction pattern
> +@cindex @code{cond_div@var{mode}} instruction pattern
> +@cindex @code{cond_udiv@var{mode}} instruction pattern
> +@cindex @code{cond_mod@var{mode}} instruction pattern
> +@cindex @code{cond_umod@var{mode}} instruction pattern
>  @cindex @code{cond_and@var{mode}} instruction pattern
>  @cindex @code{cond_ior@var{mode}} instruction pattern
>  @cindex @code{cond_xor@var{mode}} instruction pattern
> @@ -6342,6 +6347,11 @@ operand 0, otherwise (operand 2 + operan
>  @cindex @code{cond_umax@var{mode}} instruction pattern
>  @item @samp{cond_add@var{mode}}
>  @itemx @samp{cond_sub@var{mode}}
> +@itemx @samp{cond_mul@var{mode}}
> +@itemx @samp{cond_div@var{mode}}
> +@itemx @samp{cond_udiv@var{mode}}
> +@itemx @samp{cond_mod@var{mode}}
> +@itemx @samp{cond_umod@var{mode}}
>  @itemx @samp{cond_and@var{mode}}
>  @itemx @samp{cond_ior@var{mode}}
>  @itemx @samp{cond_xor@var{mode}}
> Index: gcc/optabs.def
> ===================================================================
> --- gcc/optabs.def 2018-05-16 12:48:59.194282896 +0100
> +++ gcc/optabs.def 2018-05-24 10:12:10.146352152 +0100
> @@ -222,6 +222,11 @@ OPTAB_D (notcc_optab, "not$acc")
>  OPTAB_D (movcc_optab, "mov$acc")
>  OPTAB_D (cond_add_optab, "cond_add$a")
>  OPTAB_D (cond_sub_optab, "cond_sub$a")
> +OPTAB_D (cond_smul_optab, "cond_mul$a")
> +OPTAB_D (cond_sdiv_optab, "cond_div$a")
> +OPTAB_D (cond_smod_optab, "cond_mod$a")
> +OPTAB_D (cond_udiv_optab, "cond_udiv$a")
> +OPTAB_D (cond_umod_optab, "cond_umod$a")
>  OPTAB_D (cond_and_optab, "cond_and$a")
>  OPTAB_D (cond_ior_optab, "cond_ior$a")
>  OPTAB_D (cond_xor_optab, "cond_xor$a")
> Index: gcc/internal-fn.def
> ===================================================================
> --- gcc/internal-fn.def 2018-05-24 09:32:10.522816506 +0100
> +++ gcc/internal-fn.def 2018-05-24 10:12:10.146352152 +0100
> @@ -145,6 +145,12 @@ DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST,
>  DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary)
>  DEF_INTERNAL_OPTAB_FN (COND_SUB, ECF_CONST, cond_sub, cond_binary)
> +DEF_INTERNAL_OPTAB_FN (COND_MUL, ECF_CONST, cond_smul, cond_binary)
> +DEF_INTERNAL_SIGNED_OPTAB_FN (COND_DIV, ECF_CONST, first,
> +                              cond_sdiv, cond_udiv, cond_binary)
> +DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MOD, ECF_CONST, first,
> +                              cond_smod, cond_umod, cond_binary)
> +DEF_INTERNAL_OPTAB_FN (COND_RDIV, ECF_CONST, cond_sdiv, cond_binary)
>  DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MIN, ECF_CONST, first,
>                                cond_smin, cond_umin, cond_binary)
>  DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MAX, ECF_CONST, first,
> Index: gcc/internal-fn.c
> ===================================================================
> --- gcc/internal-fn.c 2018-05-24 09:32:10.522816506 +0100
> +++ gcc/internal-fn.c 2018-05-24 10:12:10.146352152 +0100
> @@ -3246,6 +3246,12 @@ get_conditional_internal_fn (tree_code c
>        return IFN_COND_MIN;
>      case MAX_EXPR:
>        return IFN_COND_MAX;
> +    case TRUNC_DIV_EXPR:
> +      return IFN_COND_DIV;
> +    case TRUNC_MOD_EXPR:
> +      return IFN_COND_MOD;
> +    case RDIV_EXPR:
> +      return IFN_COND_RDIV;
>      case BIT_AND_EXPR:
>        return IFN_COND_AND;
>      case BIT_IOR_EXPR:
> Index: gcc/genmatch.c
> ===================================================================
> --- gcc/genmatch.c 2018-05-24 09:54:37.508451387 +0100
> +++ gcc/genmatch.c 2018-05-24 10:12:10.145352193 +0100
> @@ -487,6 +487,7 @@ commutative_op (id_base *id)
>        case CFN_COND_ADD:
>        case CFN_COND_SUB:
> +      case CFN_COND_MUL:
>        case CFN_COND_MAX:
>        case CFN_COND_MIN:
>        case CFN_COND_AND:
> Index: gcc/match.pd
> ===================================================================
> --- gcc/match.pd 2018-05-24 09:54:37.509451356 +0100
> +++ gcc/match.pd 2018-05-24 10:12:10.146352152 +0100
> @@ -78,10 +78,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  /* Binary operations and their associated IFN_COND_* function.  */
>  (define_operator_list UNCOND_BINARY
>    plus minus
> +  mult trunc_div trunc_mod rdiv
>    min max
>    bit_and bit_ior bit_xor)
>  (define_operator_list COND_BINARY
>    IFN_COND_ADD IFN_COND_SUB
> +  IFN_COND_MUL IFN_COND_DIV IFN_COND_MOD IFN_COND_RDIV
>    IFN_COND_MIN IFN_COND_MAX
>    IFN_COND_AND IFN_COND_IOR IFN_COND_XOR)
> Index: gcc/config/aarch64/iterators.md
> ===================================================================
> --- gcc/config/aarch64/iterators.md 2018-05-24 09:54:37.508451387 +0100
> +++ gcc/config/aarch64/iterators.md 2018-05-24 10:12:10.142352315 +0100
> @@ -464,6 +464,8 @@ (define_c_enum "unspec"
>      UNSPEC_UMUL_HIGHPART ; Used in aarch64-sve.md.
>      UNSPEC_COND_ADD ; Used in aarch64-sve.md.
>      UNSPEC_COND_SUB ; Used in aarch64-sve.md.
> +    UNSPEC_COND_MUL ; Used in aarch64-sve.md.
> +    UNSPEC_COND_DIV ; Used in aarch64-sve.md.
>      UNSPEC_COND_MAX ; Used in aarch64-sve.md.
>      UNSPEC_COND_MIN ; Used in aarch64-sve.md.
>      UNSPEC_COND_LT ; Used in aarch64-sve.md.
> @@ -1202,7 +1204,7 @@ (define_code_iterator SVE_INT_UNARY [neg
>  ;; SVE floating-point unary operations.
>  (define_code_iterator SVE_FP_UNARY [neg abs sqrt])
> -(define_code_iterator SVE_INT_BINARY [plus minus smax umax smin umin
> +(define_code_iterator SVE_INT_BINARY [plus minus mult smax umax smin umin
>                                        and ior xor])
>  (define_code_iterator SVE_INT_BINARY_REV [minus])
> @@ -1239,6 +1241,7 @@ (define_code_attr optab [(ashift "ashl")
>                           (neg "neg")
>                           (plus "add")
>                           (minus "sub")
> +                         (mult "mul")
>                           (div "div")
>                           (udiv "udiv")
>                           (ss_plus "qadd")
> @@ -1382,6 +1385,7 @@ (define_mode_attr lconst_atomic [(QI "K"
>  ;; The integer SVE instruction that implements an rtx code.
>  (define_code_attr sve_int_op [(plus "add")
>                                (minus "sub")
> +                              (mult "mul")
>                                (div "sdiv")
>                                (udiv "udiv")
>                                (neg "neg")
> @@ -1540,9 +1544,10 @@ (define_int_iterator UNPACK_UNSIGNED [UN
>  (define_int_iterator MUL_HIGHPART [UNSPEC_SMUL_HIGHPART UNSPEC_UMUL_HIGHPART])
>  (define_int_iterator SVE_COND_FP_BINARY [UNSPEC_COND_ADD UNSPEC_COND_SUB
> +                                         UNSPEC_COND_MUL UNSPEC_COND_DIV
>                                           UNSPEC_COND_MAX UNSPEC_COND_MIN])
> -(define_int_iterator SVE_COND_FP_BINARY_REV [UNSPEC_COND_SUB])
> +(define_int_iterator SVE_COND_FP_BINARY_REV [UNSPEC_COND_SUB UNSPEC_COND_DIV])
>  (define_int_iterator SVE_COND_FP_CMP [UNSPEC_COND_LT UNSPEC_COND_LE
>                                        UNSPEC_COND_EQ UNSPEC_COND_NE
> @@ -1573,6 +1578,8 @@ (define_int_attr optab [(UNSPEC_ANDF "an
>                          (UNSPEC_XORV "xor")
>                          (UNSPEC_COND_ADD "add")
>                          (UNSPEC_COND_SUB "sub")
> +                        (UNSPEC_COND_MUL "mul")
> +                        (UNSPEC_COND_DIV "div")
>                          (UNSPEC_COND_MAX "smax")
>                          (UNSPEC_COND_MIN "smin")])
> @@ -1787,10 +1794,14 @@ (define_int_attr cmp_op [(UNSPEC_COND_LT
>  (define_int_attr sve_fp_op [(UNSPEC_COND_ADD "fadd")
>                              (UNSPEC_COND_SUB "fsub")
> +                            (UNSPEC_COND_MUL "fmul")
> +                            (UNSPEC_COND_DIV "fdiv")
>                              (UNSPEC_COND_MAX "fmaxnm")
>                              (UNSPEC_COND_MIN "fminnm")])
>  (define_int_attr commutative [(UNSPEC_COND_ADD "true")
>                                (UNSPEC_COND_SUB "false")
> +                              (UNSPEC_COND_MUL "true")
> +                              (UNSPEC_COND_DIV "false")
>                                (UNSPEC_COND_MIN "true")
>                                (UNSPEC_COND_MAX "true")])
> Index: gcc/config/aarch64/aarch64-sve.md
> ===================================================================
> --- gcc/config/aarch64/aarch64-sve.md 2018-05-24 09:54:37.506451449 +0100
> +++ gcc/config/aarch64/aarch64-sve.md 2018-05-24 10:12:10.141352356 +0100
> @@ -1803,6 +1803,21 @@ (define_expand "cond_<optab><mode>"
>    aarch64_sve_prepare_conditional_op (operands, 5, commutative_p);
>  })
> +(define_expand "cond_<optab><mode>"
> +  [(set (match_operand:SVE_SDI 0 "register_operand")
> +        (unspec:SVE_SDI
> +          [(match_operand:<VPRED> 1 "register_operand")
> +           (SVE_INT_BINARY_SD:SVE_SDI
> +             (match_operand:SVE_SDI 2 "register_operand")
> +             (match_operand:SVE_SDI 3 "register_operand"))
> +           (match_operand:SVE_SDI 4 "register_operand")]
> +          UNSPEC_SEL))]
> +  "TARGET_SVE"
> +{
> +  bool commutative_p = (GET_RTX_CLASS (<CODE>) == RTX_COMM_ARITH);
> +  aarch64_sve_prepare_conditional_op (operands, 5, commutative_p);
> +})
> +
>  ;; Predicated integer operations.
>  (define_insn "*cond_<optab><mode>"
>    [(set (match_operand:SVE_I 0 "register_operand" "=w")
> @@ -1817,6 +1832,19 @@ (define_insn "*cond_<optab><mode>"
>    "<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
>  )
> +(define_insn "*cond_<optab><mode>"
> +  [(set (match_operand:SVE_SDI 0 "register_operand" "=w")
> +        (unspec:SVE_SDI
> +          [(match_operand:<VPRED> 1 "register_operand" "Upl")
> +           (SVE_INT_BINARY_SD:SVE_SDI
> +             (match_operand:SVE_SDI 2 "register_operand" "0")
> +             (match_operand:SVE_SDI 3 "register_operand" "w"))
> +           (match_dup 2)]
> +          UNSPEC_SEL))]
> +  "TARGET_SVE"
> +  "<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
> +)
> +
>  ;; Predicated integer operations with the operands reversed.
>  (define_insn "*cond_<optab><mode>"
>    [(set (match_operand:SVE_I 0 "register_operand" "=w")
> @@ -1828,6 +1856,19 @@ (define_insn "*cond_<optab><mode>"
>             (match_dup 3)]
>            UNSPEC_SEL))]
>    "TARGET_SVE"
> +  "<sve_int_op>r\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
> +)
> +
> +(define_insn "*cond_<optab><mode>"
> +  [(set (match_operand:SVE_SDI 0 "register_operand" "=w")
> +        (unspec:SVE_SDI
> +          [(match_operand:<VPRED> 1 "register_operand" "Upl")
> +           (SVE_INT_BINARY_SD:SVE_SDI
> +             (match_operand:SVE_SDI 2 "register_operand" "w")
> +             (match_operand:SVE_SDI 3 "register_operand" "0"))
> +           (match_dup 3)]
> +          UNSPEC_SEL))]
> +  "TARGET_SVE"
>    "<sve_int_op>r\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
>  )
> Index: gcc/testsuite/lib/target-supports.exp
> ===================================================================
> --- gcc/testsuite/lib/target-supports.exp 2018-05-24 09:54:37.511451293 +0100
> +++ gcc/testsuite/lib/target-supports.exp 2018-05-24 10:12:10.148352070 +0100
> @@ -5590,8 +5590,9 @@ proc check_effective_target_vect_double
>      return $et_vect_double_saved($et_index)
>  }
> -# Return 1 if the target supports conditional addition, subtraction, minimum
> -# and maximum on vectors of double, via the cond_ optabs.  Return 0 otherwise.
> +# Return 1 if the target supports conditional addition, subtraction,
> +# multiplication, division, minimum and maximum on vectors of double,
> +# via the cond_ optabs.  Return 0 otherwise.
>  proc check_effective_target_vect_double_cond_arith { } {
>      return [check_effective_target_aarch64_sve]
> Index: gcc/testsuite/gcc.dg/vect/pr53773.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/pr53773.c 2018-05-16 12:48:59.115202362 +0100
> +++ gcc/testsuite/gcc.dg/vect/pr53773.c 2018-05-24 10:12:10.147352111 +0100
> @@ -14,5 +14,8 @@ foo (int integral, int decimal, int powe
>    return integral+decimal;
>  }
> -/* { dg-final { scan-tree-dump-times "\\* 10" 2 "optimized" } } */
> +/* We can avoid a scalar tail when using fully-masked loops with a fixed
> +   vector length.  */
> +/* { dg-final { scan-tree-dump-times "\\* 10" 2 "optimized" { target { { ! vect_fully_masked } || vect_variable_length } } } } */
> +/* { dg-final { scan-tree-dump-times "\\* 10" 0 "optimized" { target { vect_fully_masked && { ! vect_variable_length } } } } } */
> Index: gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c
> ===================================================================
> --- gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c 2018-05-24 09:54:37.509451356 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c 2018-05-24 10:12:10.147352111 +0100
> @@ -6,6 +6,8 @@ #define N (VECTOR_BITS * 11 / 64 + 3)
>  #define add(A, B) ((A) + (B))
>  #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
>  #define DEF(OP) \
>    void __attribute__ ((noipa)) \
> @@ -34,6 +36,8 @@ #define TEST(OP) \
>  #define FOR_EACH_OP(T) \
>    T (add) \
>    T (sub) \
> +  T (mul) \
> +  T (div) \
>    T (__builtin_fmax) \
>    T (__builtin_fmin)
> @@ -54,5 +58,7 @@ main (void)
>  /* { dg-final { scan-tree-dump { = \.COND_ADD} "optimized" { target vect_double_cond_arith } } } */
>  /* { dg-final { scan-tree-dump { = \.COND_SUB} "optimized" { target vect_double_cond_arith } } } */
> +/* { dg-final { scan-tree-dump { = \.COND_MUL} "optimized" { target vect_double_cond_arith } } } */
> +/* { dg-final { scan-tree-dump { = \.COND_RDIV} "optimized" { target vect_double_cond_arith } } } */
>  /* { dg-final { scan-tree-dump { = \.COND_MAX} "optimized" { target vect_double_cond_arith } } } */
>  /* { dg-final { scan-tree-dump { = \.COND_MIN} "optimized" { target vect_double_cond_arith } } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c 2018-05-24 09:54:37.510451324 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c 2018-05-24 10:12:10.147352111 +0100
> @@ -5,6 +5,8 @@
>  #define add(A, B) ((A) + (B))
>  #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
>  #define max(A, B) ((A) > (B) ? (A) : (B))
>  #define min(A, B) ((A) < (B) ? (A) : (B))
>  #define and(A, B) ((A) & (B))
> @@ -27,6 +29,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP) \
>  #define FOR_EACH_INT_TYPE(T, TYPE) \
>    T (TYPE, TYPE, add) \
>    T (TYPE, TYPE, sub) \
> +  T (TYPE, TYPE, mul) \
>    T (TYPE, TYPE, max) \
>    T (TYPE, TYPE, min) \
>    T (TYPE, TYPE, and) \
> @@ -36,6 +39,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
>  #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
>    T (TYPE, CMPTYPE, add) \
>    T (TYPE, CMPTYPE, sub) \
> +  T (TYPE, CMPTYPE, mul) \
> +  T (TYPE, CMPTYPE, div) \
>    T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
>    T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)
> @@ -67,6 +72,11 @@ FOR_EACH_LOOP (DEF_LOOP)
>  /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +
>  /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> @@ -110,6 +120,14 @@ FOR_EACH_LOOP (DEF_LOOP)
>  /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
>  /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c 2018-05-24 09:54:37.510451324 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c 2018-05-24 10:12:10.148352070 +0100
> @@ -5,6 +5,8 @@
>  #define add(A, B) ((A) + (B))
>  #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
>  #define max(A, B) ((A) > (B) ? (A) : (B))
>  #define min(A, B) ((A) < (B) ? (A) : (B))
>  #define and(A, B) ((A) & (B))
> @@ -27,6 +29,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP) \
>  #define FOR_EACH_INT_TYPE(T, TYPE) \
>    T (TYPE, TYPE, add) \
>    T (TYPE, TYPE, sub) \
> +  T (TYPE, TYPE, mul) \
>    T (TYPE, TYPE, max) \
>    T (TYPE, TYPE, min) \
>    T (TYPE, TYPE, and) \
> @@ -36,6 +39,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
>  #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
>    T (TYPE, CMPTYPE, add) \
>    T (TYPE, CMPTYPE, sub) \
> +  T (TYPE, CMPTYPE, mul) \
> +  T (TYPE, CMPTYPE, div) \
>    T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
>    T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)
> @@ -67,6 +72,11 @@ FOR_EACH_LOOP (DEF_LOOP)
>  /* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +
>  /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> @@ -110,6 +120,14 @@ FOR_EACH_LOOP (DEF_LOOP)
>  /* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
> +/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
>  /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c
> ===================================================================
> --- gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c 2018-05-24 09:54:37.510451324 +0100
> +++ gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c 2018-05-24 10:12:10.147352111 +0100
> @@ -5,6 +5,8 @@
>  #define add(A, B) ((A) + (B))
>  #define sub(A, B) ((A) - (B))
> +#define mul(A, B) ((A) * (B))
> +#define div(A, B) ((A) / (B))
>  #define max(A, B) ((A) > (B) ? (A) : (B))
>  #define min(A, B) ((A) < (B) ? (A) : (B))
>  #define and(A, B) ((A) & (B))
> @@ -29,6 +31,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP) \
>  #define FOR_EACH_INT_TYPE(T, TYPE) \
>    T (TYPE, TYPE, add) \
>    T (TYPE, TYPE, sub) \
> +  T (TYPE, TYPE, mul) \
>    T (TYPE, TYPE, max) \
>    T (TYPE, TYPE, min) \
>    T (TYPE, TYPE, and) \
> @@ -38,6 +41,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
>  #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
>    T (TYPE, CMPTYPE, add) \
>    T (TYPE, CMPTYPE, sub) \
> +  T (TYPE, CMPTYPE, mul) \
> +  /* No div because that gets converted into a mul anyway.  */ \
>    T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
>    T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)
> @@ -58,10 +63,10 @@ FOR_EACH_LOOP (DEF_LOOP)
>  /* { dg-final { scan-assembler-not {\tmov\tz[0-9]+\.., z[0-9]+} } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b,} 14 } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h,} 18 } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s,} 18 } } */
> -/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d,} 18 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b,} 16 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h,} 21 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s,} 21 } } */
> +/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d,} 21 } } */
>  /* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> @@ -73,6 +78,11 @@ FOR_EACH_LOOP (DEF_LOOP)
>  /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
>  /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
> +/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
> +
>  /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> @@ -116,6 +126,10 @@ FOR_EACH_LOOP (DEF_LOOP)
>  /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
> +/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
> +
>  /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
>  /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
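
For readers unfamiliar with the conditional internal functions, the following is a rough scalar model of what IFN_COND_MUL and IFN_COND_DIV compute per lane, going by the five-operand cond_<optab><mode> expander in the patch (result, predicate, two inputs, fallback). It is only a sketch: model_cond_mul and model_cond_div are hypothetical names used for illustration, not GCC functions, and the real operations act on vectors under a predicate rather than on arrays of bool.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical scalar model of IFN_COND_MUL on doubles: active lanes
   (cond[i] true) get a[i] * b[i]; inactive lanes get fallback[i].  */
void
model_cond_mul (double *res, const bool *cond, const double *a,
                const double *b, const double *fallback, size_t n)
{
  assert (res && cond && a && b && fallback);
  for (size_t i = 0; i < n; ++i)
    res[i] = cond[i] ? a[i] * b[i] : fallback[i];
}

/* Hypothetical scalar model of IFN_COND_DIV on signed integers.  The
   division is only evaluated for active lanes, so in this model a zero
   divisor in an inactive lane is harmless.  */
void
model_cond_div (int *res, const bool *cond, const int *a, const int *b,
                const int *fallback, size_t n)
{
  assert (res && cond && a && b && fallback);
  for (size_t i = 0; i < n; ++i)
    res[i] = cond[i] ? a[i] / b[i] : fallback[i];
}
```

The integer model also shows why division needs care: evaluating a[i] / b[i] unconditionally and selecting afterwards would not be equivalent when an inactive lane has a zero divisor.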
Index: gcc/doc/sourcebuild.texi =================================================================== --- gcc/doc/sourcebuild.texi 2018-05-24 09:54:37.508451387 +0100 +++ gcc/doc/sourcebuild.texi 2018-05-24 10:12:10.145352193 +0100 @@ -1426,8 +1426,9 @@ have different type from the value opera Target supports hardware vectors of @code{double}. @item vect_double_cond_arith -Target supports conditional addition, subtraction, minimum and maximum -on vectors of @code{double}, via the @code{cond_} optabs. +Target supports conditional addition, subtraction, multiplication, +division, minimum and maximum on vectors of @code{double}, via the +@code{cond_} optabs. @item vect_element_align_preferred The target's preferred vector alignment is the same as the element Index: gcc/doc/md.texi =================================================================== --- gcc/doc/md.texi 2018-05-24 09:32:10.522816506 +0100 +++ gcc/doc/md.texi 2018-05-24 10:12:10.142352315 +0100 @@ -6333,6 +6333,11 @@ operand 0, otherwise (operand 2 + operan @cindex @code{cond_add@var{mode}} instruction pattern @cindex @code{cond_sub@var{mode}} instruction pattern +@cindex @code{cond_mul@var{mode}} instruction pattern +@cindex @code{cond_div@var{mode}} instruction pattern +@cindex @code{cond_udiv@var{mode}} instruction pattern +@cindex @code{cond_mod@var{mode}} instruction pattern +@cindex @code{cond_umod@var{mode}} instruction pattern @cindex @code{cond_and@var{mode}} instruction pattern @cindex @code{cond_ior@var{mode}} instruction pattern @cindex @code{cond_xor@var{mode}} instruction pattern @@ -6342,6 +6347,11 @@ operand 0, otherwise (operand 2 + operan @cindex @code{cond_umax@var{mode}} instruction pattern @item @samp{cond_add@var{mode}} @itemx @samp{cond_sub@var{mode}} +@itemx @samp{cond_mul@var{mode}} +@itemx @samp{cond_div@var{mode}} +@itemx @samp{cond_udiv@var{mode}} +@itemx @samp{cond_mod@var{mode}} +@itemx @samp{cond_umod@var{mode}} @itemx @samp{cond_and@var{mode}} @itemx @samp{cond_ior@var{mode}} 
@itemx @samp{cond_xor@var{mode}} Index: gcc/optabs.def =================================================================== --- gcc/optabs.def 2018-05-16 12:48:59.194282896 +0100 +++ gcc/optabs.def 2018-05-24 10:12:10.146352152 +0100 @@ -222,6 +222,11 @@ OPTAB_D (notcc_optab, "not$acc") OPTAB_D (movcc_optab, "mov$acc") OPTAB_D (cond_add_optab, "cond_add$a") OPTAB_D (cond_sub_optab, "cond_sub$a") +OPTAB_D (cond_smul_optab, "cond_mul$a") +OPTAB_D (cond_sdiv_optab, "cond_div$a") +OPTAB_D (cond_smod_optab, "cond_mod$a") +OPTAB_D (cond_udiv_optab, "cond_udiv$a") +OPTAB_D (cond_umod_optab, "cond_umod$a") OPTAB_D (cond_and_optab, "cond_and$a") OPTAB_D (cond_ior_optab, "cond_ior$a") OPTAB_D (cond_xor_optab, "cond_xor$a") Index: gcc/internal-fn.def =================================================================== --- gcc/internal-fn.def 2018-05-24 09:32:10.522816506 +0100 +++ gcc/internal-fn.def 2018-05-24 10:12:10.146352152 +0100 @@ -145,6 +145,12 @@ DEF_INTERNAL_OPTAB_FN (FNMS, ECF_CONST, DEF_INTERNAL_OPTAB_FN (COND_ADD, ECF_CONST, cond_add, cond_binary) DEF_INTERNAL_OPTAB_FN (COND_SUB, ECF_CONST, cond_sub, cond_binary) +DEF_INTERNAL_OPTAB_FN (COND_MUL, ECF_CONST, cond_smul, cond_binary) +DEF_INTERNAL_SIGNED_OPTAB_FN (COND_DIV, ECF_CONST, first, + cond_sdiv, cond_udiv, cond_binary) +DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MOD, ECF_CONST, first, + cond_smod, cond_umod, cond_binary) +DEF_INTERNAL_OPTAB_FN (COND_RDIV, ECF_CONST, cond_sdiv, cond_binary) DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MIN, ECF_CONST, first, cond_smin, cond_umin, cond_binary) DEF_INTERNAL_SIGNED_OPTAB_FN (COND_MAX, ECF_CONST, first, Index: gcc/internal-fn.c =================================================================== --- gcc/internal-fn.c 2018-05-24 09:32:10.522816506 +0100 +++ gcc/internal-fn.c 2018-05-24 10:12:10.146352152 +0100 @@ -3246,6 +3246,12 @@ get_conditional_internal_fn (tree_code c return IFN_COND_MIN; case MAX_EXPR: return IFN_COND_MAX; + case TRUNC_DIV_EXPR: + return IFN_COND_DIV; + case 
TRUNC_MOD_EXPR: + return IFN_COND_MOD; + case RDIV_EXPR: + return IFN_COND_RDIV; case BIT_AND_EXPR: return IFN_COND_AND; case BIT_IOR_EXPR: Index: gcc/genmatch.c =================================================================== --- gcc/genmatch.c 2018-05-24 09:54:37.508451387 +0100 +++ gcc/genmatch.c 2018-05-24 10:12:10.145352193 +0100 @@ -487,6 +487,7 @@ commutative_op (id_base *id) case CFN_COND_ADD: case CFN_COND_SUB: + case CFN_COND_MUL: case CFN_COND_MAX: case CFN_COND_MIN: case CFN_COND_AND: Index: gcc/match.pd =================================================================== --- gcc/match.pd 2018-05-24 09:54:37.509451356 +0100 +++ gcc/match.pd 2018-05-24 10:12:10.146352152 +0100 @@ -78,10 +78,12 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Binary operations and their associated IFN_COND_* function. */ (define_operator_list UNCOND_BINARY plus minus + mult trunc_div trunc_mod rdiv min max bit_and bit_ior bit_xor) (define_operator_list COND_BINARY IFN_COND_ADD IFN_COND_SUB + IFN_COND_MUL IFN_COND_DIV IFN_COND_MOD IFN_COND_RDIV IFN_COND_MIN IFN_COND_MAX IFN_COND_AND IFN_COND_IOR IFN_COND_XOR) Index: gcc/config/aarch64/iterators.md =================================================================== --- gcc/config/aarch64/iterators.md 2018-05-24 09:54:37.508451387 +0100 +++ gcc/config/aarch64/iterators.md 2018-05-24 10:12:10.142352315 +0100 @@ -464,6 +464,8 @@ (define_c_enum "unspec" UNSPEC_UMUL_HIGHPART ; Used in aarch64-sve.md. UNSPEC_COND_ADD ; Used in aarch64-sve.md. UNSPEC_COND_SUB ; Used in aarch64-sve.md. + UNSPEC_COND_MUL ; Used in aarch64-sve.md. + UNSPEC_COND_DIV ; Used in aarch64-sve.md. UNSPEC_COND_MAX ; Used in aarch64-sve.md. UNSPEC_COND_MIN ; Used in aarch64-sve.md. UNSPEC_COND_LT ; Used in aarch64-sve.md. @@ -1202,7 +1204,7 @@ (define_code_iterator SVE_INT_UNARY [neg ;; SVE floating-point unary operations. 
 (define_code_iterator SVE_FP_UNARY [neg abs sqrt])

-(define_code_iterator SVE_INT_BINARY [plus minus smax umax smin umin
+(define_code_iterator SVE_INT_BINARY [plus minus mult smax umax smin umin
                                       and ior xor])

 (define_code_iterator SVE_INT_BINARY_REV [minus])
@@ -1239,6 +1241,7 @@ (define_code_attr optab [(ashift "ashl")
                          (neg "neg")
                          (plus "add")
                          (minus "sub")
+                         (mult "mul")
                          (div "div")
                          (udiv "udiv")
                          (ss_plus "qadd")
@@ -1382,6 +1385,7 @@ (define_mode_attr lconst_atomic [(QI "K"
 ;; The integer SVE instruction that implements an rtx code.
 (define_code_attr sve_int_op [(plus "add")
                               (minus "sub")
+                              (mult "mul")
                               (div "sdiv")
                               (udiv "udiv")
                               (neg "neg")
@@ -1540,9 +1544,10 @@ (define_int_iterator UNPACK_UNSIGNED [UN
 (define_int_iterator MUL_HIGHPART [UNSPEC_SMUL_HIGHPART UNSPEC_UMUL_HIGHPART])

 (define_int_iterator SVE_COND_FP_BINARY [UNSPEC_COND_ADD UNSPEC_COND_SUB
+                                         UNSPEC_COND_MUL UNSPEC_COND_DIV
                                          UNSPEC_COND_MAX UNSPEC_COND_MIN])

-(define_int_iterator SVE_COND_FP_BINARY_REV [UNSPEC_COND_SUB])
+(define_int_iterator SVE_COND_FP_BINARY_REV [UNSPEC_COND_SUB UNSPEC_COND_DIV])

 (define_int_iterator SVE_COND_FP_CMP [UNSPEC_COND_LT UNSPEC_COND_LE
                                       UNSPEC_COND_EQ UNSPEC_COND_NE
@@ -1573,6 +1578,8 @@ (define_int_attr optab [(UNSPEC_ANDF "an
                         (UNSPEC_XORV "xor")
                         (UNSPEC_COND_ADD "add")
                         (UNSPEC_COND_SUB "sub")
+                        (UNSPEC_COND_MUL "mul")
+                        (UNSPEC_COND_DIV "div")
                         (UNSPEC_COND_MAX "smax")
                         (UNSPEC_COND_MIN "smin")])

@@ -1787,10 +1794,14 @@ (define_int_attr cmp_op [(UNSPEC_COND_LT
 (define_int_attr sve_fp_op [(UNSPEC_COND_ADD "fadd")
                             (UNSPEC_COND_SUB "fsub")
+                            (UNSPEC_COND_MUL "fmul")
+                            (UNSPEC_COND_DIV "fdiv")
                             (UNSPEC_COND_MAX "fmaxnm")
                             (UNSPEC_COND_MIN "fminnm")])

 (define_int_attr commutative [(UNSPEC_COND_ADD "true")
                               (UNSPEC_COND_SUB "false")
+                              (UNSPEC_COND_MUL "true")
+                              (UNSPEC_COND_DIV "false")
                               (UNSPEC_COND_MIN "true")
                               (UNSPEC_COND_MAX "true")])
Index: gcc/config/aarch64/aarch64-sve.md
===================================================================
--- gcc/config/aarch64/aarch64-sve.md 2018-05-24 09:54:37.506451449 +0100
+++ gcc/config/aarch64/aarch64-sve.md 2018-05-24 10:12:10.141352356 +0100
@@ -1803,6 +1803,21 @@ (define_expand "cond_<optab><mode>"
   aarch64_sve_prepare_conditional_op (operands, 5, commutative_p);
 })

+(define_expand "cond_<optab><mode>"
+  [(set (match_operand:SVE_SDI 0 "register_operand")
+        (unspec:SVE_SDI
+          [(match_operand:<VPRED> 1 "register_operand")
+           (SVE_INT_BINARY_SD:SVE_SDI
+             (match_operand:SVE_SDI 2 "register_operand")
+             (match_operand:SVE_SDI 3 "register_operand"))
+           (match_operand:SVE_SDI 4 "register_operand")]
+          UNSPEC_SEL))]
+  "TARGET_SVE"
+{
+  bool commutative_p = (GET_RTX_CLASS (<CODE>) == RTX_COMM_ARITH);
+  aarch64_sve_prepare_conditional_op (operands, 5, commutative_p);
+})
+
 ;; Predicated integer operations.
 (define_insn "*cond_<optab><mode>"
   [(set (match_operand:SVE_I 0 "register_operand" "=w")
@@ -1817,6 +1832,19 @@ (define_insn "*cond_<optab><mode>"
   "<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
 )

+(define_insn "*cond_<optab><mode>"
+  [(set (match_operand:SVE_SDI 0 "register_operand" "=w")
+        (unspec:SVE_SDI
+          [(match_operand:<VPRED> 1 "register_operand" "Upl")
+           (SVE_INT_BINARY_SD:SVE_SDI
+             (match_operand:SVE_SDI 2 "register_operand" "0")
+             (match_operand:SVE_SDI 3 "register_operand" "w"))
+           (match_dup 2)]
+          UNSPEC_SEL))]
+  "TARGET_SVE"
+  "<sve_int_op>\t%0.<Vetype>, %1/m, %0.<Vetype>, %3.<Vetype>"
+)
+
 ;; Predicated integer operations with the operands reversed.
 (define_insn "*cond_<optab><mode>"
   [(set (match_operand:SVE_I 0 "register_operand" "=w")
@@ -1828,6 +1856,19 @@ (define_insn "*cond_<optab><mode>"
            (match_dup 3)]
           UNSPEC_SEL))]
   "TARGET_SVE"
+  "<sve_int_op>r\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
+)
+
+(define_insn "*cond_<optab><mode>"
+  [(set (match_operand:SVE_SDI 0 "register_operand" "=w")
+        (unspec:SVE_SDI
+          [(match_operand:<VPRED> 1 "register_operand" "Upl")
+           (SVE_INT_BINARY_SD:SVE_SDI
+             (match_operand:SVE_SDI 2 "register_operand" "w")
+             (match_operand:SVE_SDI 3 "register_operand" "0"))
+           (match_dup 3)]
+          UNSPEC_SEL))]
+  "TARGET_SVE"
   "<sve_int_op>r\t%0.<Vetype>, %1/m, %0.<Vetype>, %2.<Vetype>"
 )

Index: gcc/testsuite/lib/target-supports.exp
===================================================================
--- gcc/testsuite/lib/target-supports.exp 2018-05-24 09:54:37.511451293 +0100
+++ gcc/testsuite/lib/target-supports.exp 2018-05-24 10:12:10.148352070 +0100
@@ -5590,8 +5590,9 @@ proc check_effective_target_vect_double
     return $et_vect_double_saved($et_index)
 }

-# Return 1 if the target supports conditional addition, subtraction, minimum
-# and maximum on vectors of double, via the cond_ optabs. Return 0 otherwise.
+# Return 1 if the target supports conditional addition, subtraction,
+# multiplication, division, minimum and maximum on vectors of double,
+# via the cond_ optabs. Return 0 otherwise.

 proc check_effective_target_vect_double_cond_arith { } {
     return [check_effective_target_aarch64_sve]
Index: gcc/testsuite/gcc.dg/vect/pr53773.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/pr53773.c 2018-05-16 12:48:59.115202362 +0100
+++ gcc/testsuite/gcc.dg/vect/pr53773.c 2018-05-24 10:12:10.147352111 +0100
@@ -14,5 +14,8 @@ foo (int integral, int decimal, int powe
   return integral+decimal;
 }

-/* { dg-final { scan-tree-dump-times "\\* 10" 2 "optimized" } } */
+/* We can avoid a scalar tail when using fully-masked loops with a fixed
+   vector length. */
+/* { dg-final { scan-tree-dump-times "\\* 10" 2 "optimized" { target { { ! vect_fully_masked } || vect_variable_length } } } } */
+/* { dg-final { scan-tree-dump-times "\\* 10" 0 "optimized" { target { vect_fully_masked && { ! vect_variable_length } } } } } */

Index: gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c
===================================================================
--- gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c 2018-05-24 09:54:37.509451356 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-cond-arith-1.c 2018-05-24 10:12:10.147352111 +0100
@@ -6,6 +6,8 @@ #define N (VECTOR_BITS * 11 / 64 + 3)

 #define add(A, B) ((A) + (B))
 #define sub(A, B) ((A) - (B))
+#define mul(A, B) ((A) * (B))
+#define div(A, B) ((A) / (B))

 #define DEF(OP) \
   void __attribute__ ((noipa)) \
@@ -34,6 +36,8 @@ #define TEST(OP) \
 #define FOR_EACH_OP(T) \
   T (add) \
   T (sub) \
+  T (mul) \
+  T (div) \
   T (__builtin_fmax) \
   T (__builtin_fmin)

@@ -54,5 +58,7 @@ main (void)

 /* { dg-final { scan-tree-dump { = \.COND_ADD} "optimized" { target vect_double_cond_arith } } } */
 /* { dg-final { scan-tree-dump { = \.COND_SUB} "optimized" { target vect_double_cond_arith } } } */
+/* { dg-final { scan-tree-dump { = \.COND_MUL} "optimized" { target vect_double_cond_arith } } } */
+/* { dg-final { scan-tree-dump { = \.COND_RDIV} "optimized" { target vect_double_cond_arith } } } */
 /* { dg-final { scan-tree-dump { = \.COND_MAX} "optimized" { target vect_double_cond_arith } } } */
 /* { dg-final { scan-tree-dump { = \.COND_MIN} "optimized" { target vect_double_cond_arith } } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c 2018-05-24 09:54:37.510451324 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/vcond_8.c 2018-05-24 10:12:10.147352111 +0100
@@ -5,6 +5,8 @@

 #define add(A, B) ((A) + (B))
 #define sub(A, B) ((A) - (B))
+#define mul(A, B) ((A) * (B))
+#define div(A, B) ((A) / (B))
 #define max(A, B) ((A) > (B) ? (A) : (B))
 #define min(A, B) ((A) < (B) ? (A) : (B))
 #define and(A, B) ((A) & (B))
@@ -27,6 +29,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP) \
 #define FOR_EACH_INT_TYPE(T, TYPE) \
   T (TYPE, TYPE, add) \
   T (TYPE, TYPE, sub) \
+  T (TYPE, TYPE, mul) \
   T (TYPE, TYPE, max) \
   T (TYPE, TYPE, min) \
   T (TYPE, TYPE, and) \
@@ -36,6 +39,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
 #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
   T (TYPE, CMPTYPE, add) \
   T (TYPE, CMPTYPE, sub) \
+  T (TYPE, CMPTYPE, mul) \
+  T (TYPE, CMPTYPE, div) \
   T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
   T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)

@@ -67,6 +72,11 @@ FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
 /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, p[0-7]/m,} 2 } } */

+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
+
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
@@ -110,6 +120,14 @@ FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */

+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfdiv\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
+
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c 2018-05-24 09:54:37.510451324 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/vcond_9.c 2018-05-24 10:12:10.148352070 +0100
@@ -5,6 +5,8 @@

 #define add(A, B) ((A) + (B))
 #define sub(A, B) ((A) - (B))
+#define mul(A, B) ((A) * (B))
+#define div(A, B) ((A) / (B))
 #define max(A, B) ((A) > (B) ? (A) : (B))
 #define min(A, B) ((A) < (B) ? (A) : (B))
 #define and(A, B) ((A) & (B))
@@ -27,6 +29,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP) \
 #define FOR_EACH_INT_TYPE(T, TYPE) \
   T (TYPE, TYPE, add) \
   T (TYPE, TYPE, sub) \
+  T (TYPE, TYPE, mul) \
   T (TYPE, TYPE, max) \
   T (TYPE, TYPE, min) \
   T (TYPE, TYPE, and) \
@@ -36,6 +39,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
 #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
   T (TYPE, CMPTYPE, add) \
   T (TYPE, CMPTYPE, sub) \
+  T (TYPE, CMPTYPE, mul) \
+  T (TYPE, CMPTYPE, div) \
   T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
   T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)

@@ -67,6 +72,11 @@ FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
 /* { dg-final { scan-assembler-times {\tsubr\tz[0-9]+\.d, p[0-7]/m,} 2 } } */

+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
+
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
@@ -110,6 +120,14 @@ FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfsubr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */

+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
+
+/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfdivr\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
+
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
Index: gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c
===================================================================
--- gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c 2018-05-24 09:54:37.510451324 +0100
+++ gcc/testsuite/gcc.target/aarch64/sve/vcond_12.c 2018-05-24 10:12:10.147352111 +0100
@@ -5,6 +5,8 @@

 #define add(A, B) ((A) + (B))
 #define sub(A, B) ((A) - (B))
+#define mul(A, B) ((A) * (B))
+#define div(A, B) ((A) / (B))
 #define max(A, B) ((A) > (B) ? (A) : (B))
 #define min(A, B) ((A) < (B) ? (A) : (B))
 #define and(A, B) ((A) & (B))
@@ -29,6 +31,7 @@ #define DEF_LOOP(TYPE, CMPTYPE, OP) \
 #define FOR_EACH_INT_TYPE(T, TYPE) \
   T (TYPE, TYPE, add) \
   T (TYPE, TYPE, sub) \
+  T (TYPE, TYPE, mul) \
   T (TYPE, TYPE, max) \
   T (TYPE, TYPE, min) \
   T (TYPE, TYPE, and) \
@@ -38,6 +41,8 @@ #define FOR_EACH_INT_TYPE(T, TYPE) \
 #define FOR_EACH_FP_TYPE(T, TYPE, CMPTYPE, SUFFIX) \
   T (TYPE, CMPTYPE, add) \
   T (TYPE, CMPTYPE, sub) \
+  T (TYPE, CMPTYPE, mul) \
+  /* No div because that gets converted into a mul anyway. */ \
   T (TYPE, CMPTYPE, __builtin_fmax##SUFFIX) \
   T (TYPE, CMPTYPE, __builtin_fmin##SUFFIX)

@@ -58,10 +63,10 @@ FOR_EACH_LOOP (DEF_LOOP)

 /* { dg-final { scan-assembler-not {\tmov\tz[0-9]+\.., z[0-9]+} } } */

-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b,} 14 } } */
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h,} 18 } } */
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s,} 18 } } */
-/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d,} 18 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.b,} 16 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.h,} 21 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.s,} 21 } } */
+/* { dg-final { scan-assembler-times {\tsel\tz[0-9]+\.d,} 21 } } */

 /* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
 /* { dg-final { scan-assembler-times {\tadd\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
@@ -73,6 +78,11 @@ FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
 /* { dg-final { scan-assembler-times {\tsub\tz[0-9]+\.d, p[0-7]/m,} 2 } } */

+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.b, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.h, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.s, p[0-7]/m,} 2 } } */
+/* { dg-final { scan-assembler-times {\tmul\tz[0-9]+\.d, p[0-7]/m,} 2 } } */
+
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.b, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tsmax\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
@@ -116,6 +126,10 @@ FOR_EACH_LOOP (DEF_LOOP)
 /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfsub\tz[0-9]+\.d, p[0-7]/m,} 1 } } */

+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
+/* { dg-final { scan-assembler-times {\tfmul\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
+
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.h, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.s, p[0-7]/m,} 1 } } */
 /* { dg-final { scan-assembler-times {\tfmaxnm\tz[0-9]+\.d, p[0-7]/m,} 1 } } */
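
For readers following the new patterns, here is a scalar sketch of what the conditional functions compute per lane. It is not part of the patch. It assumes the operand order visible in the cond_<optab><mode> expanders (predicate, two inputs, fallback value selected through UNSPEC_SEL). The macro REF_COND_BINARY and all ref_cond_* names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative scalar model only; nothing here is GCC code.
   Each generated function computes, for every lane i:
     res[i] = pred[i] ? a[i] OP b[i] : fallback[i]
   The operation is evaluated only for active lanes, so an inactive
   lane whose divisor is zero never performs the division.  */
#define REF_COND_BINARY(NAME, TYPE, OP)                          \
  void                                                           \
  NAME (TYPE *res, const unsigned char *pred, const TYPE *a,     \
        const TYPE *b, const TYPE *fallback, size_t n)           \
  {                                                              \
    for (size_t i = 0; i < n; ++i)                               \
      res[i] = pred[i] ? a[i] OP b[i] : fallback[i];             \
  }

REF_COND_BINARY (ref_cond_mul, int, *)     /* models IFN_COND_MUL */
REF_COND_BINARY (ref_cond_div, int, /)     /* models IFN_COND_DIV (TRUNC_DIV_EXPR) */
REF_COND_BINARY (ref_cond_mod, int, %)     /* models IFN_COND_MOD (TRUNC_MOD_EXPR) */
REF_COND_BINARY (ref_cond_rdiv, double, /) /* models IFN_COND_RDIV (RDIV_EXPR) */

/* Small usage example with assumed input values.  */
void
ref_cond_example (void)
{
  const unsigned char pred[4] = { 1, 1, 0, 0 };
  const int a[4] = { 7, -7, 9, 5 };
  const int b[4] = { 2, 2, 0, 3 };
  const int fallback[4] = { 100, 200, 300, 400 };
  int res[4];

  ref_cond_div (res, pred, a, b, fallback, 4);
  /* Active lanes truncate toward zero.  */
  assert (res[0] == 3 && res[1] == -3);
  /* Inactive lanes take the fallback, including the lane with b == 0.  */
  assert (res[2] == 300 && res[3] == 400);
}
```

Calling ref_cond_example () from a test driver exercises the integer division case. The separate ref_cond_div and ref_cond_rdiv models mirror the patch's decision to keep integer and FP division as distinct functions because they map to different tree codes.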