From patchwork Sun Dec 22 16:24:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 852930
From: Richard Henderson
To: qemu-devel@nongnu.org
Subject: [PATCH v3 06/51] tcg/optimize: Change representation of s_mask
Date: Sun, 22 Dec 2024 08:24:01 -0800
Message-ID: <20241222162446.2415717-7-richard.henderson@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20241222162446.2415717-1-richard.henderson@linaro.org>
References: <20241222162446.2415717-1-richard.henderson@linaro.org>

Change the representation from sign bit repetitions to all bits equal
to the sign bit, including the sign bit itself.

The previous format has a problem in that it is difficult to recreate
a valid sign mask after a shift operation: the "repetitions" part of
the previous format meant that applying the same shift as for the value
led to an off-by-one value.

The new format, including the sign bit itself, means that the sign mask
can be manipulated in exactly the same way as the value, and that
canonicalization is easier.

Canonicalize the s_mask in fold_masks_zs, rather than requiring callers
to do so.
Treat 0 as a non-canonical but typeless input for no sign
information, which will be reset as appropriate for the data type.
We can easily fold in the data from z_mask while canonicalizing.

Temporarily disable optimizations using s_mask while each operation is
converted to use fold_masks_zs and to the new form.

Signed-off-by: Richard Henderson
Reviewed-by: Pierrick Bouvier
---
 tcg/optimize.c | 64 ++++++++++++--------------------------------------
 1 file changed, 15 insertions(+), 49 deletions(-)

diff --git a/tcg/optimize.c b/tcg/optimize.c
index d8f6542c4f..fbc0dc5588 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -52,7 +52,7 @@ typedef struct TempOptInfo {
     QSIMPLEQ_HEAD(, MemCopyInfo) mem_copy;
     uint64_t val;
     uint64_t z_mask;  /* mask bit is 0 if and only if value bit is 0 */
-    uint64_t s_mask;  /* a left-aligned mask of clrsb(value) bits. */
+    uint64_t s_mask;  /* mask bit is 1 if value bit matches msb */
 } TempOptInfo;
 
 typedef struct OptContext {
@@ -65,49 +65,10 @@ typedef struct OptContext {
 
     /* In flight values from optimization. */
     uint64_t z_mask;  /* mask bit is 0 iff value bit is 0 */
-    uint64_t s_mask;  /* mask of clrsb(value) bits */
+    uint64_t s_mask;  /* mask bit is 1 if value bit matches msb */
     TCGType type;
 } OptContext;
 
-/* Calculate the smask for a specific value. */
-static uint64_t smask_from_value(uint64_t value)
-{
-    int rep = clrsb64(value);
-    return ~(~0ull >> rep);
-}
-
-/*
- * Calculate the smask for a given set of known-zeros.
- * If there are lots of zeros on the left, we can consider the remainder
- * an unsigned field, and thus the corresponding signed field is one bit
- * larger.
- */
-static uint64_t smask_from_zmask(uint64_t zmask)
-{
-    /*
-     * Only the 0 bits are significant for zmask, thus the msb itself
-     * must be zero, else we have no sign information.
-     */
-    int rep = clz64(zmask);
-    if (rep == 0) {
-        return 0;
-    }
-    rep -= 1;
-    return ~(~0ull >> rep);
-}
-
-/*
- * Recreate a properly left-aligned smask after manipulation.
- * Some bit-shuffling, particularly shifts and rotates, may
- * retain sign bits on the left, but may scatter disconnected
- * sign bits on the right.  Retain only what remains to the left.
- */
-static uint64_t smask_from_smask(int64_t smask)
-{
-    /* Only the 1 bits are significant for smask */
-    return smask_from_zmask(~smask);
-}
-
 static inline TempOptInfo *ts_info(TCGTemp *ts)
 {
     return ts->state_ptr;
@@ -173,7 +134,7 @@ static void init_ts_info(OptContext *ctx, TCGTemp *ts)
         ti->is_const = true;
         ti->val = ts->val;
         ti->z_mask = ts->val;
-        ti->s_mask = smask_from_value(ts->val);
+        ti->s_mask = INT64_MIN >> clrsb64(ts->val);
     } else {
         ti->is_const = false;
         ti->z_mask = -1;
@@ -992,7 +953,6 @@ static void finish_folding(OptContext *ctx, TCGOp *op)
          */
         if (i == 0) {
             ts_info(ts)->z_mask = ctx->z_mask;
-            ts_info(ts)->s_mask = ctx->s_mask;
         }
     }
 }
@@ -1051,11 +1011,12 @@ static bool fold_const2_commutative(OptContext *ctx, TCGOp *op)
  * The passed s_mask may be augmented by z_mask.
  */
 static bool fold_masks_zs(OptContext *ctx, TCGOp *op,
-                          uint64_t z_mask, uint64_t s_mask)
+                          uint64_t z_mask, int64_t s_mask)
 {
     const TCGOpDef *def = &tcg_op_defs[op->opc];
     TCGTemp *ts;
     TempOptInfo *ti;
+    int rep;
 
     /* Only single-output opcodes are supported here. */
     tcg_debug_assert(def->nb_oargs == 1);
@@ -1069,7 +1030,7 @@ static bool fold_masks_zs(OptContext *ctx, TCGOp *op,
      */
     if (ctx->type == TCG_TYPE_I32) {
         z_mask = (int32_t)z_mask;
-        s_mask |= MAKE_64BIT_MASK(32, 32);
+        s_mask |= INT32_MIN;
     }
 
     if (z_mask == 0) {
@@ -1081,7 +1042,13 @@ static bool fold_masks_zs(OptContext *ctx, TCGOp *op,
 
     ti = ts_info(ts);
     ti->z_mask = z_mask;
-    ti->s_mask = s_mask | smask_from_zmask(z_mask);
+
+    /* Canonicalize s_mask and incorporate data from z_mask. */
+    rep = clz64(~s_mask);
+    rep = MAX(rep, clz64(z_mask));
+    rep = MAX(rep - 1, 0);
+    ti->s_mask = INT64_MIN >> rep;
+
     return true;
 }
 
@@ -1807,7 +1774,7 @@ static bool fold_exts(OptContext *ctx, TCGOp *op)
 
     ctx->z_mask = z_mask;
     ctx->s_mask = s_mask;
-    if (!type_change && fold_affected_mask(ctx, op, s_mask & ~s_mask_old)) {
+    if (0 && !type_change && fold_affected_mask(ctx, op, s_mask & ~s_mask_old)) {
         return true;
     }
 
@@ -2509,7 +2476,7 @@ static bool fold_sextract(OptContext *ctx, TCGOp *op)
     s_mask |= MAKE_64BIT_MASK(len, 64 - len);
     ctx->s_mask = s_mask;
 
-    if (pos == 0 && fold_affected_mask(ctx, op, s_mask & ~s_mask_old)) {
+    if (0 && pos == 0 && fold_affected_mask(ctx, op, s_mask & ~s_mask_old)) {
         return true;
     }
 
@@ -2535,7 +2502,6 @@ static bool fold_shift(OptContext *ctx, TCGOp *op)
         ctx->z_mask = do_constant_folding(op->opc, ctx->type, z_mask, sh);
 
         s_mask = do_constant_folding(op->opc, ctx->type, s_mask, sh);
-        ctx->s_mask = smask_from_smask(s_mask);
 
         return fold_masks(ctx, op);
     }
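
For illustration only, and not part of the patch: a minimal standalone
sketch of the new s_mask format, assuming that clz64(0) returns 64 and
that right-shifting a negative int64_t is an arithmetic shift (which is
implementation-defined in ISO C). The names s_mask_from_const and
canonicalize_s_mask are hypothetical; clz64 and clrsb64 are local
stand-ins written for this example, not QEMU's own helpers.

```c
#include <stdint.h>

/* Count leading zero bits; returns 64 for an input of 0. */
static int clz64(uint64_t v)
{
    int n = 0;

    if (v == 0) {
        return 64;
    }
    while (!(v >> 63)) {
        v <<= 1;
        n++;
    }
    return n;
}

/* Count redundant sign bits: copies of the msb directly below it. */
static int clrsb64(uint64_t v)
{
    /* XOR with the sign replicated to all bits turns sign copies into 0. */
    uint64_t sign = (v >> 63) ? UINT64_MAX : 0;

    return clz64(v ^ sign) - 1;
}

static int max_int(int a, int b)
{
    return a > b ? a : b;
}

/*
 * New-format s_mask of a constant: every bit equal to the sign bit,
 * including the sign bit itself (hypothetical helper name).
 * Relies on arithmetic right shift of a negative value.
 */
static int64_t s_mask_from_const(uint64_t val)
{
    return INT64_MIN >> clrsb64(val);
}

/*
 * Same steps as the canonicalization added to fold_masks_zs:
 * keep only the contiguous run of 1s at the top of s_mask, and widen
 * it using the known-zero bits at the top of z_mask.
 * An s_mask of 0 means "no sign information" and yields INT64_MIN
 * unless z_mask contributes more.
 */
static int64_t canonicalize_s_mask(uint64_t z_mask, int64_t s_mask)
{
    int rep = clz64(~(uint64_t)s_mask);

    rep = max_int(rep, clz64(z_mask));
    rep = max_int(rep - 1, 0);
    return INT64_MIN >> rep;
}
```

With these definitions, s_mask_from_const(0xff) is 0xffffffffffffff00,
and shifting that mask right arithmetically by 4 gives
0xfffffffffffffff0, which equals s_mask_from_const(0xff >> 4): the mask
is shifted exactly like the value. Likewise,
canonicalize_s_mask(0xff, 0) recovers 0xffffffffffffff00 from z_mask
alone.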