Message ID | 87r2tthiuk.fsf@linaro.org |
---|---|
State | New |
Series | [076/nnn] poly_int: vectorizable_conversion |
On 10/23/2017 11:30 AM, Richard Sandiford wrote:
> This patch makes vectorizable_conversion cope with variable-length
> vectors. We already require the number of elements in one vector
> to be a multiple of the number of elements in the other vector,
> so the patch uses that to choose between widening and narrowing.
>
>
> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>            Alan Hayward <alan.hayward@arm.com>
>            David Sherwood <david.sherwood@arm.com>
>
> gcc/
>     * tree-vect-stmts.c (vectorizable_conversion): Treat the number
>     of units as polynomial. Choose between WIDEN and NARROW based
>     on multiple_p.
If I'm reading this right, if nunits_in < nunits_out, but the latter is
not a multiple of the former, we'll choose WIDEN, which is the opposite
of what we'd do before this patch. Was that intentional?

jeff
Jeff Law <law@redhat.com> writes:
> On 10/23/2017 11:30 AM, Richard Sandiford wrote:
>> This patch makes vectorizable_conversion cope with variable-length
>> vectors. We already require the number of elements in one vector
>> to be a multiple of the number of elements in the other vector,
>> so the patch uses that to choose between widening and narrowing.
>>
>>
>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>>            Alan Hayward <alan.hayward@arm.com>
>>            David Sherwood <david.sherwood@arm.com>
>>
>> gcc/
>>     * tree-vect-stmts.c (vectorizable_conversion): Treat the number
>>     of units as polynomial. Choose between WIDEN and NARROW based
>>     on multiple_p.
> If I'm reading this right, if nunits_in < nunits_out, but the latter is
> not a multiple of the former, we'll choose WIDEN, which is the opposite
> of what we'd do before this patch. Was that intentional?

That case isn't possible, so we'd assert:

  if (must_eq (nunits_out, nunits_in))
    modifier = NONE;
  else if (multiple_p (nunits_out, nunits_in))
    modifier = NARROW;
  else
    {
      gcc_checking_assert (multiple_p (nunits_in, nunits_out));
      modifier = WIDEN;
    }

We already implicitly rely on this, since we either widen one full
vector to N full vectors or narrow N full vectors to one vector.

Structurally this is enforced by all vectors having the same number of
bytes (current_vector_size) and the number of vector elements being a
power of 2 (or in the case of poly_int, a power of 2 times a runtime
invariant, but that's good enough, since the runtime invariant is the same
in both cases).

Thanks,
Richard
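The structural argument above can be checked for fixed-length vectors with a small standalone sketch. It is not GCC code: `nunits_for`, `one_divides_other` and `invariant_holds` are hypothetical helpers, and the vector size is assumed to be a constant power of 2. For every pair of power-of-2 element sizes that fit in the vector, it checks that one element count is a multiple of the other.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of elements in a vector of VECTOR_BYTES bytes
// whose elements are ELEM_BYTES bytes each.  Both values are assumed to be
// powers of 2, mirroring the constraint described in the thread.
static uint64_t
nunits_for (uint64_t vector_bytes, uint64_t elem_bytes)
{
  assert (elem_bytes != 0 && vector_bytes % elem_bytes == 0);
  return vector_bytes / elem_bytes;
}

// True if one of A and B is a multiple of the other (both nonzero).
static bool
one_divides_other (uint64_t a, uint64_t b)
{
  return a % b == 0 || b % a == 0;
}

// Check every pair of power-of-2 element sizes up to VECTOR_BYTES.
static bool
invariant_holds (uint64_t vector_bytes)
{
  for (uint64_t in = 1; in <= vector_bytes; in *= 2)
    for (uint64_t out = 1; out <= vector_bytes; out *= 2)
      if (!one_divides_other (nunits_for (vector_bytes, in),
                              nunits_for (vector_bytes, out)))
        return false;
  return true;
}
```

Calling `invariant_holds (16)` covers 1-, 2-, 4-, 8- and 16-byte elements in a 16-byte vector. Scaling both element counts by the same runtime factor leaves the divisibility relation unchanged, which is why the same reasoning carries over to the variable-length case.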
On 11/28/2017 11:09 AM, Richard Sandiford wrote:
> Jeff Law <law@redhat.com> writes:
>> On 10/23/2017 11:30 AM, Richard Sandiford wrote:
>>> This patch makes vectorizable_conversion cope with variable-length
>>> vectors. We already require the number of elements in one vector
>>> to be a multiple of the number of elements in the other vector,
>>> so the patch uses that to choose between widening and narrowing.
>>>
>>>
>>> 2017-10-23 Richard Sandiford <richard.sandiford@linaro.org>
>>>            Alan Hayward <alan.hayward@arm.com>
>>>            David Sherwood <david.sherwood@arm.com>
>>>
>>> gcc/
>>>     * tree-vect-stmts.c (vectorizable_conversion): Treat the number
>>>     of units as polynomial. Choose between WIDEN and NARROW based
>>>     on multiple_p.
>> If I'm reading this right, if nunits_in < nunits_out, but the latter is
>> not a multiple of the former, we'll choose WIDEN, which is the opposite
>> of what we'd do before this patch. Was that intentional?
>
> That case isn't possible, so we'd assert:
>
>   if (must_eq (nunits_out, nunits_in))
>     modifier = NONE;
>   else if (multiple_p (nunits_out, nunits_in))
>     modifier = NARROW;
>   else
>     {
>       gcc_checking_assert (multiple_p (nunits_in, nunits_out));
>       modifier = WIDEN;
>     }
>
> We already implicitly rely on this, since we either widen one full
> vector to N full vectors or narrow N full vectors to one vector.
>
> Structurally this is enforced by all vectors having the same number of
> bytes (current_vector_size) and the number of vector elements being a
> power of 2 (or in the case of poly_int, a power of 2 times a runtime
> invariant, but that's good enough, since the runtime invariant is the same
> in both cases).
OK. Thanks for clarifying.

jeff
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2017-10-23 17:22:40.906378704 +0100
+++ gcc/tree-vect-stmts.c	2017-10-23 17:22:41.879277786 +0100
@@ -4102,8 +4102,8 @@ vectorizable_conversion (gimple *stmt, g
   int ndts = 2;
   gimple *new_stmt = NULL;
   stmt_vec_info prev_stmt_info;
-  int nunits_in;
-  int nunits_out;
+  poly_uint64 nunits_in;
+  poly_uint64 nunits_out;
   tree vectype_out, vectype_in;
   int ncopies, i, j;
   tree lhs_type, rhs_type;
@@ -4238,12 +4238,15 @@ vectorizable_conversion (gimple *stmt, g
 
   nunits_in = TYPE_VECTOR_SUBPARTS (vectype_in);
   nunits_out = TYPE_VECTOR_SUBPARTS (vectype_out);
-  if (nunits_in < nunits_out)
-    modifier = NARROW;
-  else if (nunits_out == nunits_in)
+  if (must_eq (nunits_out, nunits_in))
     modifier = NONE;
+  else if (multiple_p (nunits_out, nunits_in))
+    modifier = NARROW;
   else
-    modifier = WIDEN;
+    {
+      gcc_checking_assert (multiple_p (nunits_in, nunits_out));
+      modifier = WIDEN;
+    }
 
   /* Multiple types in SLP are handled by creating the appropriate number of
      vectorized stmts for each SLP node.  Hence, NCOPIES is always 1 in
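The selection logic in the second hunk can be tried outside GCC with a toy model. This is only a sketch: `toy_poly`, `toy_must_eq`, `toy_multiple_p` and `choose_modifier` are hypothetical stand-ins, not GCC's poly_int implementation. The toy type models a value of the form c0 + c1 * x, where x is a non-negative integer known only at run time, and `toy_multiple_p` is deliberately conservative: it accepts only a constant divisor or a constant integer ratio.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient poly_uint64:
// the value is c0 + c1 * x for a runtime x >= 0.
struct toy_poly
{
  uint64_t c0;
  uint64_t c1;
};

enum modifier_kind { NONE, NARROW, WIDEN };

// True if A and B are equal for every possible x.
static bool
toy_must_eq (toy_poly a, toy_poly b)
{
  return a.c0 == b.c0 && a.c1 == b.c1;
}

// Conservative test for "A is a multiple of B for every x".
static bool
toy_multiple_p (toy_poly a, toy_poly b)
{
  // Constant divisor: both coefficients of A must be multiples of it.
  if (b.c1 == 0)
    return b.c0 != 0 && a.c0 % b.c0 == 0 && a.c1 % b.c0 == 0;
  // Variable divisor: require A == k * B for a constant integer k.
  if (a.c1 % b.c1 != 0)
    return false;
  uint64_t k = a.c1 / b.c1;
  return a.c0 == k * b.c0;
}

// Mirrors the shape of the post-patch code: equality first, then the
// two multiple checks, with an assertion guarding the final case.
static modifier_kind
choose_modifier (toy_poly nunits_in, toy_poly nunits_out)
{
  if (toy_must_eq (nunits_out, nunits_in))
    return NONE;
  if (toy_multiple_p (nunits_out, nunits_in))
    return NARROW;
  assert (toy_multiple_p (nunits_in, nunits_out));
  return WIDEN;
}
```

For example, with an input of 4 + 4x elements and an output of 8 + 8x elements, `choose_modifier` returns NARROW; swapping the two returns WIDEN. A pair where neither count is a multiple of the other would trip the assertion rather than silently pick WIDEN, which is the behaviour discussed in the thread.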