From patchwork Fri Jan 19 04:54:25 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 125077 Delivered-To: patch@linaro.org Received: by 10.46.66.141 with SMTP id h13csp131166ljf; Thu, 18 Jan 2018 20:59:50 -0800 (PST) X-Google-Smtp-Source: ACJfBosaE4yfC2vpaEFcLUG06XB4s2HcwxoS3KoBpXzp8xoeK9ZSq6rZN7FC12qo+dwvfez3oYhC X-Received: by 10.129.172.19 with SMTP id k19mr8381453ywh.274.1516337990186; Thu, 18 Jan 2018 20:59:50 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1516337990; cv=none; d=google.com; s=arc-20160816; b=Kg22xBy2eUhNEcTYb4d3AgcEkyqhg+tMWRi95ROXU2OxeCHLm2mgyLU9OolyoiBUuB 77mdsf7Z1VIwQDGsq0t8S9Bpwnwc2/C501ei3DKsYdtJ53nHi8XDCoRPNlxsMaBYH6bJ v5J2QjOCmhSqUi4dRmkTlaHDXr4JMp9yBt5pkEX1rxB8B6L3ZbF58rQIULQnqIAOPKwL La6EORmx6s5xD7OIr4n7nPHkNnptjT6FfCH0l06u4MrsIXto9midaY/z/3vA07WXd9+V km8BFLNARAIrLyEjjLENYblHLp2exXPMMTv38druAc9pOlIG6j3ISy4/e3jxF52Ck8z3 nSrA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=sender:errors-to:cc:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:subject:references:in-reply-to :message-id:date:to:from:dkim-signature:arc-authentication-results; bh=cqrA8CU4eKHFUueqSZ+9jYOSEtrX5Z7lO22SsZclNa8=; b=b7/Y40R6ToTW9NdKSo144todJyaaYvoaKeUmQBFZ/gPTHp3RKfaUw+gocVKumZM1lx WSUa98wc0Jc0LbOqcJLLMh4SdUu5ZT7/lNT9f4QLvVZMuqIeDNv8uidp9/YjpEpwdz1Z kAdRPiAv7EPSsM7Lhclq4IlIhuGaHUC6QZRawxIWZPPnHaS5taTaye0SjMf0hChAkk1a BvwRfn/kMZiyV2UGymlG5BV5pHCqb9ABPUJ1vGgktoiEWQHsh2htaqiXwj01X5JhC8/s 7rpE6Cgdfy2A9pFSGC6AF+pBVgWwSsun+6MlwAi04+qyTDZNUbo9rCKIofjlTIBmaTgj bCdg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=hKZcTZti; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Return-Path: Received: from lists.gnu.org (lists.gnu.org. [2001:4830:134:3::11]) by mx.google.com with ESMTPS id q2si2066849ywi.122.2018.01.18.20.59.49 for (version=TLS1 cipher=AES128-SHA bits=128/128); Thu, 18 Jan 2018 20:59:50 -0800 (PST) Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) client-ip=2001:4830:134:3::11; Authentication-Results: mx.google.com; dkim=fail header.i=@linaro.org header.s=google header.b=hKZcTZti; spf=pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 2001:4830:134:3::11 as permitted sender) smtp.mailfrom=qemu-devel-bounces+patch=linaro.org@nongnu.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=linaro.org Received: from localhost ([::1]:52336 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ecOmL-0005mM-L0 for patch@linaro.org; Thu, 18 Jan 2018 23:59:49 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:58614) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ecOhY-00028h-99 for qemu-devel@nongnu.org; Thu, 18 Jan 2018 23:54:54 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ecOhW-0008Te-Ap for qemu-devel@nongnu.org; Thu, 18 Jan 2018 23:54:52 -0500 Received: from mail-pf0-x243.google.com ([2607:f8b0:400e:c00::243]:33532) by eggs.gnu.org with esmtps (TLS1.0:RSA_AES_128_CBC_SHA1:16) (Exim 4.71) (envelope-from ) id 1ecOhW-0008TI-2u for qemu-devel@nongnu.org; Thu, 18 Jan 2018 23:54:50 -0500 Received: by mail-pf0-x243.google.com with SMTP id t5so537828pfi.0 for ; Thu, 18 Jan 2018 20:54:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linaro.org; s=google; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=cqrA8CU4eKHFUueqSZ+9jYOSEtrX5Z7lO22SsZclNa8=; b=hKZcTZtipnou93Si/P8M65jD2QZkAz+Gzk3k/kXmIAIJwNaLbrpgv8/y16ZE6AhPFf iKRr+gr2MzKcFrMW5jAE2TubvN4bOdet3OdxEgRMNzNYOx6h6mPF2pTEMtiqm/OF6QrW BrCCpjbcWcywqG9gEpd2wwMhFAhRlIUIoZCxI= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=cqrA8CU4eKHFUueqSZ+9jYOSEtrX5Z7lO22SsZclNa8=; b=SQW9tCpm4WQey4UyF0YZmMIFp+k7qtJViiXGUZfQplqZKSBHXPUwrx1q1cumbG/cYf suT2+rYMo46KFeo0rrONCD9FTX37JCx0nOYlohKFitScFXtuqG3Zbuj/IqxRTQ2KhM7z TDEHAa7iOLbPM/RnH/pIOr1yZNh4vzemrbFV+5Gwbf3z0O7CtYcgwHrq/kHm+jSvXXG0 oGo6gaYoGa0Z4gDyh5ZNJmgh+wkwvBZlQ+WNrVFrV1ymDKRBe7F9wPg4iGUVuQ61hsFb jwSyW06gN5IbacorKDYpvxeHQzM9GYwRerYhRtJeG++BfAdI4A9puuXTxyEe7LVAOyc/ icqA== X-Gm-Message-State: AKGB3mKjoqN95I3Y4m7Sli1FB9Qp3sMr2hABtKCbK0M6OoPYgA5RoJZe lqZ6mzSWVF76z/xQ3XNlmiiFEa/l4sQ= X-Received: by 10.99.175.66 with SMTP id s2mr39379313pgo.168.1516337688489; Thu, 18 Jan 2018 20:54:48 -0800 (PST) Received: from cloudburst.twiddle.net (97-113-183-164.tukw.qwest.net. [97.113.183.164]) by smtp.gmail.com with ESMTPSA id m12sm13690022pga.68.2018.01.18.20.54.47 (version=TLS1_2 cipher=ECDHE-RSA-CHACHA20-POLY1305 bits=256/256); Thu, 18 Jan 2018 20:54:47 -0800 (PST) From: Richard Henderson To: qemu-devel@nongnu.org Date: Thu, 18 Jan 2018 20:54:25 -0800 Message-Id: <20180119045438.28582-4-richard.henderson@linaro.org> X-Mailer: git-send-email 2.14.3 In-Reply-To: <20180119045438.28582-1-richard.henderson@linaro.org> References: <20180119045438.28582-1-richard.henderson@linaro.org> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 2607:f8b0:400e:c00::243 Subject: [Qemu-devel] [PATCH v2 03/16] target/arm: Use pointers in neon zip/uzp helpers X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: peter.maydell@linaro.org Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: "Qemu-devel" Rather than passing regnos to the helpers, pass pointers to the vector registers directly. This eliminates the need to pass in the environment pointer and reduces the number of places that directly access env->vfp.regs[]. Reviewed-by: Peter Maydell Signed-off-by: Richard Henderson --- target/arm/helper.h | 20 +++--- target/arm/neon_helper.c | 162 +++++++++++++++++++++++++---------------------- target/arm/translate.c | 42 ++++++------ 3 files changed, 120 insertions(+), 104 deletions(-) -- 2.14.3 Reviewed-by: Alex Bennée Reviewed-by: Alex Bennée diff --git a/target/arm/helper.h b/target/arm/helper.h index 688380af6b..dbdc38fcb7 100644 --- a/target/arm/helper.h +++ b/target/arm/helper.h @@ -511,16 +511,16 @@ DEF_HELPER_3(iwmmxt_muladdsl, i64, i64, i32, i32) DEF_HELPER_3(iwmmxt_muladdsw, i64, i64, i32, i32) DEF_HELPER_3(iwmmxt_muladdswl, i64, i64, i32, i32) -DEF_HELPER_3(neon_unzip8, void, env, i32, i32) -DEF_HELPER_3(neon_unzip16, void, env, i32, i32) -DEF_HELPER_3(neon_qunzip8, void, env, i32, i32) -DEF_HELPER_3(neon_qunzip16, void, env, i32, i32) -DEF_HELPER_3(neon_qunzip32, void, env, i32, i32) -DEF_HELPER_3(neon_zip8, void, env, i32, i32) -DEF_HELPER_3(neon_zip16, void, env, i32, i32) -DEF_HELPER_3(neon_qzip8, void, env, i32, i32) -DEF_HELPER_3(neon_qzip16, void, env, i32, i32) -DEF_HELPER_3(neon_qzip32, void, env, i32, i32) +DEF_HELPER_FLAGS_2(neon_unzip8, TCG_CALL_NO_RWG, void, ptr, ptr) +DEF_HELPER_FLAGS_2(neon_unzip16, TCG_CALL_NO_RWG, void, ptr, ptr) +DEF_HELPER_FLAGS_2(neon_qunzip8, TCG_CALL_NO_RWG, void, ptr, ptr) +DEF_HELPER_FLAGS_2(neon_qunzip16, TCG_CALL_NO_RWG, void, ptr, ptr) +DEF_HELPER_FLAGS_2(neon_qunzip32, TCG_CALL_NO_RWG, void, ptr, ptr) +DEF_HELPER_FLAGS_2(neon_zip8, TCG_CALL_NO_RWG, void, ptr, ptr) +DEF_HELPER_FLAGS_2(neon_zip16, TCG_CALL_NO_RWG, void, ptr, ptr) +DEF_HELPER_FLAGS_2(neon_qzip8, TCG_CALL_NO_RWG, void, ptr, ptr) +DEF_HELPER_FLAGS_2(neon_qzip16, TCG_CALL_NO_RWG, void, ptr, ptr) +DEF_HELPER_FLAGS_2(neon_qzip32, TCG_CALL_NO_RWG, void, ptr, ptr) DEF_HELPER_FLAGS_3(crypto_aese, TCG_CALL_NO_RWG, void, ptr, ptr, i32) DEF_HELPER_FLAGS_3(crypto_aesmc, TCG_CALL_NO_RWG, void, ptr, ptr, i32) diff --git a/target/arm/neon_helper.c b/target/arm/neon_helper.c index ebdf7c9b10..689491cad3 100644 --- a/target/arm/neon_helper.c +++ b/target/arm/neon_helper.c @@ -2027,12 +2027,12 @@ uint64_t HELPER(neon_acgt_f64)(uint64_t a, uint64_t b, void *fpstp) #define ELEM(V, N, SIZE) (((V) >> ((N) * (SIZE))) & ((1ull << (SIZE)) - 1)) -void HELPER(neon_qunzip8)(CPUARMState *env, uint32_t rd, uint32_t rm) +void HELPER(neon_qunzip8)(void *vd, void *vm) { - uint64_t zm0 = float64_val(env->vfp.regs[rm]); - uint64_t zm1 = float64_val(env->vfp.regs[rm + 1]); - uint64_t zd0 = float64_val(env->vfp.regs[rd]); - uint64_t zd1 = float64_val(env->vfp.regs[rd + 1]); + uint64_t *rd = vd, *rm = vm; + uint64_t zd0 = rd[0], zd1 = rd[1]; + uint64_t zm0 = rm[0], zm1 = rm[1]; + uint64_t d0 = ELEM(zd0, 0, 8) | (ELEM(zd0, 2, 8) << 8) | (ELEM(zd0, 4, 8) << 16) | (ELEM(zd0, 6, 8) << 24) | (ELEM(zd1, 0, 8) << 32) | (ELEM(zd1, 2, 8) << 40) @@ -2049,18 +2049,19 @@ void HELPER(neon_qunzip8)(CPUARMState *env, uint32_t rd, uint32_t rm) | (ELEM(zm0, 5, 8) << 16) | (ELEM(zm0, 7, 8) << 24) | (ELEM(zm1, 1, 8) << 32) | (ELEM(zm1, 3, 8) << 40) | (ELEM(zm1, 5, 8) << 48) | (ELEM(zm1, 7, 8) << 56); - env->vfp.regs[rm] = make_float64(m0); - env->vfp.regs[rm + 1] = make_float64(m1); - env->vfp.regs[rd] = make_float64(d0); - env->vfp.regs[rd + 1] = make_float64(d1); + + rm[0] = m0; + rm[1] = m1; + rd[0] = d0; + rd[1] = d1; } -void HELPER(neon_qunzip16)(CPUARMState *env, uint32_t rd, uint32_t rm) +void HELPER(neon_qunzip16)(void *vd, void *vm) { - uint64_t zm0 = float64_val(env->vfp.regs[rm]); - uint64_t zm1 = float64_val(env->vfp.regs[rm + 1]); - uint64_t zd0 = float64_val(env->vfp.regs[rd]); - uint64_t zd1 = float64_val(env->vfp.regs[rd + 1]); + uint64_t *rd = vd, *rm = vm; + uint64_t zd0 = rd[0], zd1 = rd[1]; + uint64_t zm0 = rm[0], zm1 = rm[1]; + uint64_t d0 = ELEM(zd0, 0, 16) | (ELEM(zd0, 2, 16) << 16) | (ELEM(zd1, 0, 16) << 32) | (ELEM(zd1, 2, 16) << 48); uint64_t d1 = ELEM(zm0, 0, 16) | (ELEM(zm0, 2, 16) << 16) @@ -2069,32 +2070,35 @@ void HELPER(neon_qunzip16)(CPUARMState *env, uint32_t rd, uint32_t rm) | (ELEM(zd1, 1, 16) << 32) | (ELEM(zd1, 3, 16) << 48); uint64_t m1 = ELEM(zm0, 1, 16) | (ELEM(zm0, 3, 16) << 16) | (ELEM(zm1, 1, 16) << 32) | (ELEM(zm1, 3, 16) << 48); - env->vfp.regs[rm] = make_float64(m0); - env->vfp.regs[rm + 1] = make_float64(m1); - env->vfp.regs[rd] = make_float64(d0); - env->vfp.regs[rd + 1] = make_float64(d1); + + rm[0] = m0; + rm[1] = m1; + rd[0] = d0; + rd[1] = d1; } -void HELPER(neon_qunzip32)(CPUARMState *env, uint32_t rd, uint32_t rm) +void HELPER(neon_qunzip32)(void *vd, void *vm) { - uint64_t zm0 = float64_val(env->vfp.regs[rm]); - uint64_t zm1 = float64_val(env->vfp.regs[rm + 1]); - uint64_t zd0 = float64_val(env->vfp.regs[rd]); - uint64_t zd1 = float64_val(env->vfp.regs[rd + 1]); + uint64_t *rd = vd, *rm = vm; + uint64_t zd0 = rd[0], zd1 = rd[1]; + uint64_t zm0 = rm[0], zm1 = rm[1]; + uint64_t d0 = ELEM(zd0, 0, 32) | (ELEM(zd1, 0, 32) << 32); uint64_t d1 = ELEM(zm0, 0, 32) | (ELEM(zm1, 0, 32) << 32); uint64_t m0 = ELEM(zd0, 1, 32) | (ELEM(zd1, 1, 32) << 32); uint64_t m1 = ELEM(zm0, 1, 32) | (ELEM(zm1, 1, 32) << 32); - env->vfp.regs[rm] = make_float64(m0); - env->vfp.regs[rm + 1] = make_float64(m1); - env->vfp.regs[rd] = make_float64(d0); - env->vfp.regs[rd + 1] = make_float64(d1); + + rm[0] = m0; + rm[1] = m1; + rd[0] = d0; + rd[1] = d1; } -void HELPER(neon_unzip8)(CPUARMState *env, uint32_t rd, uint32_t rm) +void HELPER(neon_unzip8)(void *vd, void *vm) { - uint64_t zm = float64_val(env->vfp.regs[rm]); - uint64_t zd = float64_val(env->vfp.regs[rd]); + uint64_t *rd = vd, *rm = vm; + uint64_t zd = rd[0], zm = rm[0]; + uint64_t d0 = ELEM(zd, 0, 8) | (ELEM(zd, 2, 8) << 8) | (ELEM(zd, 4, 8) << 16) | (ELEM(zd, 6, 8) << 24) | (ELEM(zm, 0, 8) << 32) | (ELEM(zm, 2, 8) << 40) @@ -2103,28 +2107,31 @@ void HELPER(neon_unzip8)(CPUARMState *env, uint32_t rd, uint32_t rm) | (ELEM(zd, 5, 8) << 16) | (ELEM(zd, 7, 8) << 24) | (ELEM(zm, 1, 8) << 32) | (ELEM(zm, 3, 8) << 40) | (ELEM(zm, 5, 8) << 48) | (ELEM(zm, 7, 8) << 56); - env->vfp.regs[rm] = make_float64(m0); - env->vfp.regs[rd] = make_float64(d0); + + rm[0] = m0; + rd[0] = d0; } -void HELPER(neon_unzip16)(CPUARMState *env, uint32_t rd, uint32_t rm) +void HELPER(neon_unzip16)(void *vd, void *vm) { - uint64_t zm = float64_val(env->vfp.regs[rm]); - uint64_t zd = float64_val(env->vfp.regs[rd]); + uint64_t *rd = vd, *rm = vm; + uint64_t zd = rd[0], zm = rm[0]; + uint64_t d0 = ELEM(zd, 0, 16) | (ELEM(zd, 2, 16) << 16) | (ELEM(zm, 0, 16) << 32) | (ELEM(zm, 2, 16) << 48); uint64_t m0 = ELEM(zd, 1, 16) | (ELEM(zd, 3, 16) << 16) | (ELEM(zm, 1, 16) << 32) | (ELEM(zm, 3, 16) << 48); - env->vfp.regs[rm] = make_float64(m0); - env->vfp.regs[rd] = make_float64(d0); + + rm[0] = m0; + rd[0] = d0; } -void HELPER(neon_qzip8)(CPUARMState *env, uint32_t rd, uint32_t rm) +void HELPER(neon_qzip8)(void *vd, void *vm) { - uint64_t zm0 = float64_val(env->vfp.regs[rm]); - uint64_t zm1 = float64_val(env->vfp.regs[rm + 1]); - uint64_t zd0 = float64_val(env->vfp.regs[rd]); - uint64_t zd1 = float64_val(env->vfp.regs[rd + 1]); + uint64_t *rd = vd, *rm = vm; + uint64_t zd0 = rd[0], zd1 = rd[1]; + uint64_t zm0 = rm[0], zm1 = rm[1]; + uint64_t d0 = ELEM(zd0, 0, 8) | (ELEM(zm0, 0, 8) << 8) | (ELEM(zd0, 1, 8) << 16) | (ELEM(zm0, 1, 8) << 24) | (ELEM(zd0, 2, 8) << 32) | (ELEM(zm0, 2, 8) << 40) @@ -2141,18 +2148,19 @@ void HELPER(neon_qzip8)(CPUARMState *env, uint32_t rd, uint32_t rm) | (ELEM(zd1, 5, 8) << 16) | (ELEM(zm1, 5, 8) << 24) | (ELEM(zd1, 6, 8) << 32) | (ELEM(zm1, 6, 8) << 40) | (ELEM(zd1, 7, 8) << 48) | (ELEM(zm1, 7, 8) << 56); - env->vfp.regs[rm] = make_float64(m0); - env->vfp.regs[rm + 1] = make_float64(m1); - env->vfp.regs[rd] = make_float64(d0); - env->vfp.regs[rd + 1] = make_float64(d1); + + rm[0] = m0; + rm[1] = m1; + rd[0] = d0; + rd[1] = d1; } -void HELPER(neon_qzip16)(CPUARMState *env, uint32_t rd, uint32_t rm) +void HELPER(neon_qzip16)(void *vd, void *vm) { - uint64_t zm0 = float64_val(env->vfp.regs[rm]); - uint64_t zm1 = float64_val(env->vfp.regs[rm + 1]); - uint64_t zd0 = float64_val(env->vfp.regs[rd]); - uint64_t zd1 = float64_val(env->vfp.regs[rd + 1]); + uint64_t *rd = vd, *rm = vm; + uint64_t zd0 = rd[0], zd1 = rd[1]; + uint64_t zm0 = rm[0], zm1 = rm[1]; + uint64_t d0 = ELEM(zd0, 0, 16) | (ELEM(zm0, 0, 16) << 16) | (ELEM(zd0, 1, 16) << 32) | (ELEM(zm0, 1, 16) << 48); uint64_t d1 = ELEM(zd0, 2, 16) | (ELEM(zm0, 2, 16) << 16) @@ -2161,32 +2169,35 @@ void HELPER(neon_qzip16)(CPUARMState *env, uint32_t rd, uint32_t rm) | (ELEM(zd1, 1, 16) << 32) | (ELEM(zm1, 1, 16) << 48); uint64_t m1 = ELEM(zd1, 2, 16) | (ELEM(zm1, 2, 16) << 16) | (ELEM(zd1, 3, 16) << 32) | (ELEM(zm1, 3, 16) << 48); - env->vfp.regs[rm] = make_float64(m0); - env->vfp.regs[rm + 1] = make_float64(m1); - env->vfp.regs[rd] = make_float64(d0); - env->vfp.regs[rd + 1] = make_float64(d1); + + rm[0] = m0; + rm[1] = m1; + rd[0] = d0; + rd[1] = d1; } -void HELPER(neon_qzip32)(CPUARMState *env, uint32_t rd, uint32_t rm) +void HELPER(neon_qzip32)(void *vd, void *vm) { - uint64_t zm0 = float64_val(env->vfp.regs[rm]); - uint64_t zm1 = float64_val(env->vfp.regs[rm + 1]); - uint64_t zd0 = float64_val(env->vfp.regs[rd]); - uint64_t zd1 = float64_val(env->vfp.regs[rd + 1]); + uint64_t *rd = vd, *rm = vm; + uint64_t zd0 = rd[0], zd1 = rd[1]; + uint64_t zm0 = rm[0], zm1 = rm[1]; + uint64_t d0 = ELEM(zd0, 0, 32) | (ELEM(zm0, 0, 32) << 32); uint64_t d1 = ELEM(zd0, 1, 32) | (ELEM(zm0, 1, 32) << 32); uint64_t m0 = ELEM(zd1, 0, 32) | (ELEM(zm1, 0, 32) << 32); uint64_t m1 = ELEM(zd1, 1, 32) | (ELEM(zm1, 1, 32) << 32); - env->vfp.regs[rm] = make_float64(m0); - env->vfp.regs[rm + 1] = make_float64(m1); - env->vfp.regs[rd] = make_float64(d0); - env->vfp.regs[rd + 1] = make_float64(d1); + + rm[0] = m0; + rm[1] = m1; + rd[0] = d0; + rd[1] = d1; } -void HELPER(neon_zip8)(CPUARMState *env, uint32_t rd, uint32_t rm) +void HELPER(neon_zip8)(void *vd, void *vm) { - uint64_t zm = float64_val(env->vfp.regs[rm]); - uint64_t zd = float64_val(env->vfp.regs[rd]); + uint64_t *rd = vd, *rm = vm; + uint64_t zd = rd[0], zm = rm[0]; + uint64_t d0 = ELEM(zd, 0, 8) | (ELEM(zm, 0, 8) << 8) | (ELEM(zd, 1, 8) << 16) | (ELEM(zm, 1, 8) << 24) | (ELEM(zd, 2, 8) << 32) | (ELEM(zm, 2, 8) << 40) @@ -2195,20 +2206,23 @@ void HELPER(neon_zip8)(CPUARMState *env, uint32_t rd, uint32_t rm) | (ELEM(zd, 5, 8) << 16) | (ELEM(zm, 5, 8) << 24) | (ELEM(zd, 6, 8) << 32) | (ELEM(zm, 6, 8) << 40) | (ELEM(zd, 7, 8) << 48) | (ELEM(zm, 7, 8) << 56); - env->vfp.regs[rm] = make_float64(m0); - env->vfp.regs[rd] = make_float64(d0); + + rm[0] = m0; + rd[0] = d0; } -void HELPER(neon_zip16)(CPUARMState *env, uint32_t rd, uint32_t rm) +void HELPER(neon_zip16)(void *vd, void *vm) { - uint64_t zm = float64_val(env->vfp.regs[rm]); - uint64_t zd = float64_val(env->vfp.regs[rd]); + uint64_t *rd = vd, *rm = vm; + uint64_t zd = rd[0], zm = rm[0]; + uint64_t d0 = ELEM(zd, 0, 16) | (ELEM(zm, 0, 16) << 16) | (ELEM(zd, 1, 16) << 32) | (ELEM(zm, 1, 16) << 48); uint64_t m0 = ELEM(zd, 2, 16) | (ELEM(zm, 2, 16) << 16) | (ELEM(zd, 3, 16) << 32) | (ELEM(zm, 3, 16) << 48); - env->vfp.regs[rm] = make_float64(m0); - env->vfp.regs[rd] = make_float64(d0); + + rm[0] = m0; + rd[0] = d0; } /* Helper function for 64 bit polynomial multiply case: diff --git a/target/arm/translate.c b/target/arm/translate.c index 7b5db15861..6f02c56abb 100644 --- a/target/arm/translate.c +++ b/target/arm/translate.c @@ -4687,22 +4687,23 @@ static inline TCGv_i32 neon_get_scalar(int size, int reg) static int gen_neon_unzip(int rd, int rm, int size, int q) { - TCGv_i32 tmp, tmp2; + TCGv_ptr pd, pm; + if (!q && size == 2) { return 1; } - tmp = tcg_const_i32(rd); - tmp2 = tcg_const_i32(rm); + pd = vfp_reg_ptr(true, rd); + pm = vfp_reg_ptr(true, rm); if (q) { switch (size) { case 0: - gen_helper_neon_qunzip8(cpu_env, tmp, tmp2); + gen_helper_neon_qunzip8(pd, pm); break; case 1: - gen_helper_neon_qunzip16(cpu_env, tmp, tmp2); + gen_helper_neon_qunzip16(pd, pm); break; case 2: - gen_helper_neon_qunzip32(cpu_env, tmp, tmp2); + gen_helper_neon_qunzip32(pd, pm); break; default: abort(); @@ -4710,38 +4711,39 @@ static int gen_neon_unzip(int rd, int rm, int size, int q) } else { switch (size) { case 0: - gen_helper_neon_unzip8(cpu_env, tmp, tmp2); + gen_helper_neon_unzip8(pd, pm); break; case 1: - gen_helper_neon_unzip16(cpu_env, tmp, tmp2); + gen_helper_neon_unzip16(pd, pm); break; default: abort(); } } - tcg_temp_free_i32(tmp); - tcg_temp_free_i32(tmp2); + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(pm); return 0; } static int gen_neon_zip(int rd, int rm, int size, int q) { - TCGv_i32 tmp, tmp2; + TCGv_ptr pd, pm; + if (!q && size == 2) { return 1; } - tmp = tcg_const_i32(rd); - tmp2 = tcg_const_i32(rm); + pd = vfp_reg_ptr(true, rd); + pm = vfp_reg_ptr(true, rm); if (q) { switch (size) { case 0: - gen_helper_neon_qzip8(cpu_env, tmp, tmp2); + gen_helper_neon_qzip8(pd, pm); break; case 1: - gen_helper_neon_qzip16(cpu_env, tmp, tmp2); + gen_helper_neon_qzip16(pd, pm); break; case 2: - gen_helper_neon_qzip32(cpu_env, tmp, tmp2); + gen_helper_neon_qzip32(pd, pm); break; default: abort(); @@ -4749,17 +4751,17 @@ static int gen_neon_zip(int rd, int rm, int size, int q) } else { switch (size) { case 0: - gen_helper_neon_zip8(cpu_env, tmp, tmp2); + gen_helper_neon_zip8(pd, pm); break; case 1: - gen_helper_neon_zip16(cpu_env, tmp, tmp2); + gen_helper_neon_zip16(pd, pm); break; default: abort(); } } - tcg_temp_free_i32(tmp); - tcg_temp_free_i32(tmp2); + tcg_temp_free_ptr(pd); + tcg_temp_free_ptr(pm); return 0; }