From patchwork Tue Apr 19 16:29:02 2011 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Sandiford X-Patchwork-Id: 1096 Return-Path: Delivered-To: unknown Received: from imap.gmail.com (74.125.159.109) by localhost6.localdomain6 with IMAP4-SSL; 08 Jun 2011 14:49:07 -0000 Delivered-To: patches@linaro.org Received: by 10.224.67.148 with SMTP id r20cs77678qai; Tue, 19 Apr 2011 09:29:05 -0700 (PDT) Received: by 10.216.80.25 with SMTP id j25mr1238777wee.35.1303230545001; Tue, 19 Apr 2011 09:29:05 -0700 (PDT) Received: from mail-wy0-f178.google.com (mail-wy0-f178.google.com [74.125.82.178]) by mx.google.com with ESMTPS id p14si167743wej.31.2011.04.19.09.29.04 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 19 Apr 2011 09:29:04 -0700 (PDT) Received-SPF: neutral (google.com: 74.125.82.178 is neither permitted nor denied by best guess record for domain of richard.sandiford@linaro.org) client-ip=74.125.82.178; Authentication-Results: mx.google.com; spf=neutral (google.com: 74.125.82.178 is neither permitted nor denied by best guess record for domain of richard.sandiford@linaro.org) smtp.mail=richard.sandiford@linaro.org Received: by wyb33 with SMTP id 33so6097593wyb.37 for ; Tue, 19 Apr 2011 09:29:04 -0700 (PDT) Received: by 10.227.130.203 with SMTP id u11mr6428354wbs.219.1303230544451; Tue, 19 Apr 2011 09:29:04 -0700 (PDT) Received: from richards-thinkpad (gbibp9ph1--blueice2n1.emea.ibm.com [195.212.29.75]) by mx.google.com with ESMTPS id z13sm40349wbd.46.2011.04.19.09.29.03 (version=TLSv1/SSLv3 cipher=OTHER); Tue, 19 Apr 2011 09:29:03 -0700 (PDT) From: Richard Sandiford To: patches@linaro.org Mail-Followup-To: patches@linaro.org, richard.sandiford@linaro.org Subject: [Richard Sandiford] Make vectoriser use operand_equal_p to compare base addresses Date: Tue, 19 Apr 2011 17:29:02 +0100 Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/23.1 (gnu/linux) MIME-Version: 1.0 Content-Disposition: inline In the attached testcase, we treat: sum += x[i][0] * x[i][0] + x[i][1] * x[i][1]; as being two independent strided loads: x[i][0] and x[i][1]. On targets with appropriate support, we therefore use two interleaved loads rather than one, then discard one half of each result. This does not happen if bar() is compiled on its own. The code before vectorisation looks like this: : i.0_5 = (unsigned int) i_32; D.3532_6 = i.0_5 * 320; D.3533_8 = x_7(D) + D.3532_6; : i.1_14 = (unsigned int) i_34; D.3559_15 = i.1_14 * 8; D.3558_16 = D.3533_8 + D.3559_15; D.3557_17 = *D.3558_16[0]; D.3555_19 = *D.3558_16[1]; ... The two DR_BASE_ADDRESSES are therefore the expansion of D.3558_8: x_7(D) + (unsigned int) i_32 * 320 which we don't think are equal. The comment in tree-data-refs.h made me think that (unsigned int) i_32 * 320 ought really to be in the DR_OFFSET, but that only seems to apply to offsets that occur in the memory reference itself. Fixed by using operand_equal_p to compare base addresses. Tested on x86_64-linux-gnu and arm-linux-gnueabi. OK to install? I've not made the test dependent on a target selector because it seemed like the message should be printed regardless. Richard gcc/ * tree-vect-data-refs.c (vect_drs_dependent_in_basic_block): Use operand_equal_p to compare DR_BASE_ADDRESSes. (vect_check_interleaving): Likewise. gcc/testsuite/ * gcc.dg/vect/vect-119.c: New test. Index: gcc/tree-vect-data-refs.c =================================================================== --- gcc/tree-vect-data-refs.c 2011-04-19 14:59:58.000000000 +0100 +++ gcc/tree-vect-data-refs.c 2011-04-19 14:59:58.000000000 +0100 @@ -353,11 +353,7 @@ vect_drs_dependent_in_basic_block (struc /* Check that the data-refs have same bases and offsets. If not, we can't determine if they are dependent. */ - if ((DR_BASE_ADDRESS (dra) != DR_BASE_ADDRESS (drb) - && (TREE_CODE (DR_BASE_ADDRESS (dra)) != ADDR_EXPR - || TREE_CODE (DR_BASE_ADDRESS (drb)) != ADDR_EXPR - || TREE_OPERAND (DR_BASE_ADDRESS (dra), 0) - != TREE_OPERAND (DR_BASE_ADDRESS (drb),0))) + if (!operand_equal_p (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb), 0) || !dr_equal_offsets_p (dra, drb)) return true; @@ -403,11 +399,7 @@ vect_check_interleaving (struct data_ref /* Check that the data-refs have same first location (except init) and they are both either store or load (not load and store). */ - if ((DR_BASE_ADDRESS (dra) != DR_BASE_ADDRESS (drb) - && (TREE_CODE (DR_BASE_ADDRESS (dra)) != ADDR_EXPR - || TREE_CODE (DR_BASE_ADDRESS (drb)) != ADDR_EXPR - || TREE_OPERAND (DR_BASE_ADDRESS (dra), 0) - != TREE_OPERAND (DR_BASE_ADDRESS (drb),0))) + if (!operand_equal_p (DR_BASE_ADDRESS (dra), DR_BASE_ADDRESS (drb), 0) || !dr_equal_offsets_p (dra, drb) || !tree_int_cst_compare (DR_INIT (dra), DR_INIT (drb)) || DR_IS_READ (dra) != DR_IS_READ (drb)) Index: gcc/testsuite/gcc.dg/vect/vect-119.c =================================================================== --- /dev/null 2011-03-23 08:42:11.268792848 +0000 +++ gcc/testsuite/gcc.dg/vect/vect-119.c 2011-04-19 14:59:58.000000000 +0100 @@ -0,0 +1,28 @@ +/* { dg-do compile } */ + +#define OUTER 32 +#define INNER 40 + +static unsigned int +bar (const unsigned int x[INNER][2], unsigned int sum) +{ + int i; + + for (i = 0; i < INNER; i++) + sum += x[i][0] * x[i][0] + x[i][1] * x[i][1]; + return sum; +} + +unsigned int foo (const unsigned int x[OUTER][INNER][2]) +{ + int i; + unsigned int sum; + + sum = 0.0f; + for (i = 0; i < OUTER; i++) + sum = bar (x[i], sum); + return sum; +} + +/* { dg-final { scan-tree-dump-times "Detected interleaving of size 2" 1 "vect" } } */ +/* { dg-final { cleanup-tree-dump "vect" } } */