From patchwork Fri Oct  2 08:12:36 2015
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: James Greenhalgh <james.greenhalgh@arm.com>
X-Patchwork-Id: 54410
Return-Path: <patchwork-forward+bncBC76NPM6UIOBBDPZXCYAKGQEAIOYJHA@linaro.org>
X-Original-To: linaro@patches.linaro.org
Delivered-To: linaro@patches.linaro.org
Received: from mail-la0-f71.google.com (mail-la0-f71.google.com
 [209.85.215.71])
 by patches.linaro.org (Postfix) with ESMTPS id DB98523009
 for <linaro@patches.linaro.org>; Fri,  2 Oct 2015 08:13:02 +0000 (UTC)
Received: by lana8 with SMTP id a8sf13888457lan.1
 for <linaro@patches.linaro.org>; Fri, 02 Oct 2015 01:13:01 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:delivered-to:mailing-list:precedence:list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender
 :delivered-to:from:to:cc:subject:date:message-id:mime-version
 :content-type:x-original-sender:x-original-authentication-results;
 bh=qgrX77+ZnlM8pimKXPV12NKPQQICskGju3MHx+8lkBY=;
 b=W0x63hPMpxBMNubIGW3QODk+lPuTa0NN7VgDVsgNMbxBtlSSkGFgYvzZ4fhPegQmnJ
 5hTCdubJ1UXGlltQQZ5vVRts38BiBO6AkhuuVA7MbBgiUT1oNZiwddNmMbVCJ+0Da3tC
 K1VZ+esMXsc0fOGSCaX39L/Ob6YYK7+rNtvf/l2sv0qfq95SL/QJRqxYdnf8rM6QeSe5
 YZ9AwUBQDw/LkHw273vbqs3PofWeoaom5Y8o016649rnFRSDzoSSno9c1y9YiHnfm0/C
 pe7pNl2pzo6WYpKEWc3Z4gLnwV049X9D9FESre2BH8SnaMTPhLCNRgaKaFNrwLyzlCVU
 fcZQ==
X-Gm-Message-State: ALoCoQmd4X6giCXCN8D8DG+Ki6fMGvrwDtGwDCo4lr8kR1JPY700sfV1QZPVDjyKNd7xqUbS5feA
X-Received: by 10.194.156.193 with SMTP id wg1mr2301013wjb.3.1443773581849; 
 Fri, 02 Oct 2015 01:13:01 -0700 (PDT)
X-BeenThere: patchwork-forward@linaro.org
Received: by 10.25.157.136 with SMTP id g130ls202889lfe.73.gmail; Fri, 02 Oct
 2015 01:13:01 -0700 (PDT)
X-Received: by 10.25.166.139 with SMTP id p133mr3210117lfe.51.1443773581479; 
 Fri, 02 Oct 2015 01:13:01 -0700 (PDT)
Received: from mail-la0-x22a.google.com (mail-la0-x22a.google.com.
 [2a00:1450:4010:c03::22a]) by mx.google.com with ESMTPS id
 pq6si5447997lbb.61.2015.10.02.01.13.01
 for <patchwork-forward@linaro.org>
 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Fri, 02 Oct 2015 01:13:01 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 2a00:1450:4010:c03::22a as permitted sender)
 client-ip=2a00:1450:4010:c03::22a; 
Received: by lafb9 with SMTP id b9so163306laf.0
 for <patchwork-forward@linaro.org>;
 Fri, 02 Oct 2015 01:13:01 -0700 (PDT)
X-Received: by 10.112.159.136 with SMTP id xc8mr2583680lbb.76.1443773581160; 
 Fri, 02 Oct 2015 01:13:01 -0700 (PDT)
X-Forwarded-To: patchwork-forward@linaro.org
X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org
Delivered-To: patch@linaro.org
Received: by 10.112.59.35 with SMTP id w3csp1061903lbq;
 Fri, 2 Oct 2015 01:13:00 -0700 (PDT)
X-Received: by 10.107.18.167 with SMTP id 39mr17000180ios.34.1443773580010; 
 Fri, 02 Oct 2015 01:13:00 -0700 (PDT)
Received: from sourceware.org (server1.sourceware.org. [209.132.180.131])
 by mx.google.com with ESMTPS id
 t32si7768157ioi.154.2015.10.02.01.12.59 for <patch@linaro.org>
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Fri, 02 Oct 2015 01:12:59 -0700 (PDT)
Received-SPF: pass (google.com: domain of
 gcc-patches-return-408946-patch=linaro.org@gcc.gnu.org
 designates 209.132.180.131 as permitted sender)
 client-ip=209.132.180.131; 
Received: (qmail 104875 invoked by alias); 2 Oct 2015 08:12:48 -0000
Mailing-List: list patchwork-forward@linaro.org;
 contact patchwork-forward+owners@linaro.org
Precedence: list
List-Id: <patchwork-forward.linaro.org>
List-Unsubscribe: <mailto:googlegroups-manage+836684582541+unsubscribe@googlegroups.com>, 
 <http://groups.google.com/a/linaro.org/group/patchwork-forward/subscribe>
List-Archive: <http://groups.google.com/a/linaro.org/group/patchwork-forward/>
List-Post: <http://groups.google.com/a/linaro.org/group/patchwork-forward/post>, 
 <mailto:patchwork-forward@linaro.org>
List-Help: <http://support.google.com/a/linaro.org/bin/topic.py?topic=25838>, 
 <mailto:patchwork-forward+help@linaro.org>
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 104866 invoked by uid 89); 2 Oct 2015 08:12:47 -0000
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-1.8 required=5.0 tests=AWL, BAYES_00,
 SPF_PASS autolearn=ham version=3.3.2
X-HELO: eu-smtp-delivery-143.mimecast.com
Received: from eu-smtp-delivery-143.mimecast.com (HELO
 eu-smtp-delivery-143.mimecast.com) (207.82.80.143) by
 sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
 Fri, 02 Oct 2015 08:12:46 +0000
Received: from cam-owa2.Emea.Arm.com (fw-tnat.cambridge.arm.com
 [217.140.96.140]) by eu-smtp-1.mimecast.com with ESMTP id
 uk-mta-24-PNaypRbKQVeZhV08xHcWiA-1; Fri, 02 Oct 2015 09:12:40 +0100
Received: from e107456-lin.cambridge.arm.com ([10.1.2.79]) by
 cam-owa2.Emea.Arm.com with Microsoft SMTPSVC(6.0.3790.3959);
 Fri, 2 Oct 2015 09:12:40 +0100
From: James Greenhalgh <james.greenhalgh@arm.com>
To: gcc-patches@gcc.gnu.org
Cc: marcus.shawcroft@arm.com,	richard.earnshaw@arm.com
Subject: [Patch AArch64] Improve SIMD concatenation with zeroes
Date: Fri,  2 Oct 2015 09:12:36 +0100
Message-Id: <1443773556-11626-1-git-send-email-james.greenhalgh@arm.com>
MIME-Version: 1.0
X-MC-Unique: PNaypRbKQVeZhV08xHcWiA-1
X-IsSubscribed: yes
X-Original-Sender: james.greenhalgh@arm.com
X-Original-Authentication-Results: mx.google.com; spf=pass (google.com:
 domain of
 patch+caf_=patchwork-forward=linaro.org@linaro.org designates
 2a00:1450:4010:c03::22a as permitted sender)
 smtp.mailfrom=patch+caf_=patchwork-forward=linaro.org@linaro.org;
 dkim=pass header.i=@gcc.gnu.org
X-Google-Group-Id: 836684582541

Hi,

In AArch64, SIMD instructions which only touch the bottom 64-bits of a
vector register write zeroes to the upper 64-bits. In other words, we have
a cheap way to implement a "zero extend" of a SIMD operation, and can
generate efficient code for:

  [(set (match_operand 0)
        (vec_concat:128-bit mode
	   (other vector operations in a 64-bit mode)
	   (match_operand 2 [zeroes])))]

And for the big-endian equivalent of this.

This small patch catches two important cases of this, namely loading a
64-bit vector and moving a 64-bit vector from general purpose registers to
vector registers.

Bootstrapped on aarch64-none-linux-gnu with no issues, and aarch64.exp run
for aarch64_be-none-elf.

Ok for trunk?

Thanks,
James
---
gcc/

2015-10-01  James Greenhalgh  <james.greenhalgh@arm.com>

	* config/aarch64/aarch64-simd.md (*aarch64_combinez<mode>): Add
	alternatives for reads from memory and moves from general-purpose
	registers.
	(*aarch64_combinez_be<mode>): Likewise.

2015-10-01  James Greenhalgh  <james.greenhalgh@arm.com>

	* gcc.target/aarch64/vect_combine_zeroes_1.c: New.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 541faf9..6a2ab61 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -2530,23 +2530,33 @@
 ;; dest vector.
 
 (define_insn "*aarch64_combinez<mode>"
-  [(set (match_operand:<VDBL> 0 "register_operand" "=&w")
+  [(set (match_operand:<VDBL> 0 "register_operand" "=w,w,w")
         (vec_concat:<VDBL>
-	   (match_operand:VD_BHSI 1 "register_operand" "w")
-	   (match_operand:VD_BHSI 2 "aarch64_simd_imm_zero" "Dz")))]
+	   (match_operand:VD_BHSI 1 "general_operand" "w,r,m")
+	   (match_operand:VD_BHSI 2 "aarch64_simd_imm_zero" "Dz,Dz,Dz")))]
   "TARGET_SIMD && !BYTES_BIG_ENDIAN"
-  "mov\\t%0.8b, %1.8b"
-  [(set_attr "type" "neon_move<q>")]
+  "@
+   mov\\t%0.8b, %1.8b
+   fmov\t%d0, %1
+   ldr\\t%d0, %1"
+  [(set_attr "type" "neon_move<q>, neon_from_gp, neon_load1_1reg")
+   (set_attr "simd" "yes,*,yes")
+   (set_attr "fp" "*,yes,*")]
 )
 
 (define_insn "*aarch64_combinez_be<mode>"
-  [(set (match_operand:<VDBL> 0 "register_operand" "=&w")
+  [(set (match_operand:<VDBL> 0 "register_operand" "=w,w,w")
         (vec_concat:<VDBL>
-	   (match_operand:VD_BHSI 2 "aarch64_simd_imm_zero" "Dz")
-	   (match_operand:VD_BHSI 1 "register_operand" "w")))]
+	   (match_operand:VD_BHSI 2 "aarch64_simd_imm_zero" "Dz,Dz,Dz")
+	   (match_operand:VD_BHSI 1 "general_operand" "w,r,m")))]
   "TARGET_SIMD && BYTES_BIG_ENDIAN"
-  "mov\\t%0.8b, %1.8b"
-  [(set_attr "type" "neon_move<q>")]
+  "@
+   mov\\t%0.8b, %1.8b
+   fmov\t%d0, %1
+   ldr\\t%d0, %1"
+  [(set_attr "type" "neon_move<q>, neon_from_gp, neon_load1_1reg")
+   (set_attr "simd" "yes,*,yes")
+   (set_attr "fp" "*,yes,*")]
 )
 
 (define_expand "aarch64_combine<mode>"
diff --git a/gcc/testsuite/gcc.target/aarch64/vect_combine_zeroes_1.c b/gcc/testsuite/gcc.target/aarch64/vect_combine_zeroes_1.c
new file mode 100644
index 0000000..6257fa9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect_combine_zeroes_1.c
@@ -0,0 +1,24 @@
+/* { dg-options "-O2 --save-temps" } */
+
+#include "arm_neon.h"
+
+int32x4_t
+foo (int32x2_t *x)
+{
+  int32x2_t i = *x;
+  int32x2_t zeroes = vcreate_s32 (0l);
+  int32x4_t ret = vcombine_s32 (i, zeroes);
+  return ret;
+}
+
+int32x4_t
+bar (int64_t x)
+{
+  int32x2_t i = vcreate_s32 (x);
+  int32x2_t zeroes = vcreate_s32 (0l);
+  int32x4_t ret = vcombine_s32 (i, zeroes);
+  return ret;
+}
+
+/* { dg-final { scan-assembler-not "mov\tv\[0-9\]+.8b, v\[0-9\]+.8b" } } */
+