[0/5] target/i386: CCOp cleanups

Message ID	20240701025115.1265117-1-richard.henderson@linaro.org
Headers	show Delivered-To: patch@linaro.org Received-SPF: pass (google.com: domain of qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; From: Richard Henderson <richard.henderson@linaro.org> To: qemu-devel@nongnu.org Cc: pbonzini@redhat.com Subject: [PATCH 0/5] target/i386: CCOp cleanups Date: Sun, 30 Jun 2024 19:51:10 -0700 Message-Id: <20240701025115.1265117-1-richard.henderson@linaro.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Received-SPF: pass client-ip=2607:f8b0:4864:20::434; envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x434.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action Precedence: list Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org
Series	target/i386: CCOp cleanups \| expand [0/5] target/i386: CCOp cleanups [1/5] target/i386: Tidy cc_op_str usage [2/5] target/i386: Convert cc_op_live to a function [3/5] target/i386: Rearrange CCOp [4/5] target/i386: Remove default in cc_op_live [5/5] target/i386: Introduce cc_op_size

Message ID

20240701025115.1265117-1-richard.henderson@linaro.org

Headers

Received-SPF: pass (google.com: domain of
 qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as
 permitted sender) client-ip=209.51.188.17;
From: Richard Henderson <richard.henderson@linaro.org>
To: qemu-devel@nongnu.org
Cc: pbonzini@redhat.com
Subject: [PATCH 0/5] target/i386: CCOp cleanups
Date: Sun, 30 Jun 2024 19:51:10 -0700
Message-Id: <20240701025115.1265117-1-richard.henderson@linaro.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=2607:f8b0:4864:20::434;
 envelope-from=richard.henderson@linaro.org; helo=mail-pf1-x434.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org

Series

target/i386: CCOp cleanups | expand

Message

Richard Henderson July 1, 2024, 2:51 a.m. UTC

While debugging #2413, I spent quite a bit of time trying to work
out if the CCOp value was incorrect.  I think the following is a
worthwhile cleanup, isolating potential problems to asserts.


r~


Richard Henderson (5):
  target/i386: Tidy cc_op_str usage
  target/i386: Convert cc_op_live to a function
  target/i386: Rearrange CCOp
  target/i386: Remove default in cc_op_live
  target/i386: Introduce cc_op_size

 target/i386/cpu.h                | 44 +++++++++--------
 target/i386/cpu-dump.c           | 17 ++++---
 target/i386/tcg/translate.c      | 84 +++++++++++++++++++-------------
 target/i386/tcg/decode-new.c.inc |  2 +-
 target/i386/tcg/emit.c.inc       |  9 ++--
 5 files changed, 93 insertions(+), 63 deletions(-)

Comments

Paolo Bonzini July 1, 2024, 5:48 p.m. UTC | #1

On Mon, Jul 1, 2024 at 4:51 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
> While debugging #2413, I spent quite a bit of time trying to work
> out if the CCOp value was incorrect.  I think the following is a
> worthwhile cleanup, isolating potential problems to asserts.

Hi Richard,

no objections at all to introducing more asserts. I think keeping the
array is a better underlying implementation for cc_op_live() however.

I'm also not very fond of mixing "sized" and "unsized" CCOps in the
4..7 range, there's no real reason why CC_OP_DYNAMIC and CC_OP_CLR
must be close to CC_OP_EFLAGS and the ADCOX CCOps.  I also think it's
clearer to keep CC_OP_POPCNT[BWLQ] (even though in practice only one
will be used because popcnt needs zero extension anyway).

As an aside, I'm wondering if CC_OP_CLR is particularly important; I
expect "xor reg, reg" to be followed by more ALU operations most of
the time and to not be followed by a jump, so it only saves a spill if
xor reg, reg is followed by a lot or store. If gen_XOR used either
CC_OP_LOGICn or CC_OP_EFLAGS for "xor reg, reg", the values in
decode->cc_op_* (CC_OP_DST=0 for CC_OP_LOGICn; CC_OP_SRC=CC_Z|CC_P for
CC_OP_EFLAGS) would be constant and wouldn't add to register pressure.

Paolo

Richard Henderson July 1, 2024, 7:05 p.m. UTC | #2

On 7/1/24 10:48, Paolo Bonzini wrote:
> On Mon, Jul 1, 2024 at 4:51 AM Richard Henderson
> <richard.henderson@linaro.org> wrote:
>> While debugging #2413, I spent quite a bit of time trying to work
>> out if the CCOp value was incorrect.  I think the following is a
>> worthwhile cleanup, isolating potential problems to asserts.
> 
> Hi Richard,
> 
> no objections at all to introducing more asserts. I think keeping the
> array is a better underlying implementation for cc_op_live() however.

Hmm.  I had an implementation that would detect missing entries at runtime, but this one 
detects missing entries at compile time.

> 
> I'm also not very fond of mixing "sized" and "unsized" CCOps in the
> 4..7 range, there's no real reason why CC_OP_DYNAMIC and CC_OP_CLR
> must be close to CC_OP_EFLAGS and the ADCOX CCOps.  I also think it's
> clearer to keep CC_OP_POPCNT[BWLQ] (even though in practice only one
> will be used because popcnt needs zero extension anyway).

My objection to keeping the unused POPCNT* enumerators is that it interferes with proper 
cooperation with -Wswitch, to diagnose missing enumerators.  This is also why I removed 
CC_OP_NB.

If those extra enumerators are present, you either have to include them explicitly in the 
switch, which is less than desirable, or add a default case, which disables -Wswitch entirely.

If you don't like abusing 4..7, then we can use 8..11, placing POPCNT at 10 or 11.

> As an aside, I'm wondering if CC_OP_CLR is particularly important; I
> expect "xor reg, reg" to be followed by more ALU operations most of
> the time and to not be followed by a jump, so it only saves a spill if
> xor reg, reg is followed by a lot or store. If gen_XOR used either
> CC_OP_LOGICn or CC_OP_EFLAGS for "xor reg, reg", the values in
> decode->cc_op_* (CC_OP_DST=0 for CC_OP_LOGICn; CC_OP_SRC=CC_Z|CC_P for
> CC_OP_EFLAGS) would be constant and wouldn't add to register pressure.

You could easily be right.  Improvements to tcg in the last 11 years may have made it 
redundant, or it might have been wishful thinking even at the time.

r~

Paolo Bonzini July 1, 2024, 7:30 p.m. UTC | #3

On Mon, Jul 1, 2024 at 9:05 PM Richard Henderson
<richard.henderson@linaro.org> wrote:
> > no objections at all to introducing more asserts. I think keeping the
> > array is a better underlying implementation for cc_op_live() however.
>
> Hmm.  I had an implementation that would detect missing entries at runtime, but this one
> detects missing entries at compile time.

But how common is it to add new CCOps? I find the array more readable
(and it can also be printed quickly from gdb), while the switch
statement optimizes for a really rare case.

> > I'm also not very fond of mixing "sized" and "unsized" CCOps in the
> > 4..7 range, there's no real reason why CC_OP_DYNAMIC and CC_OP_CLR
> > must be close to CC_OP_EFLAGS and the ADCOX CCOps.  I also think it's
> > clearer to keep CC_OP_POPCNT[BWLQ] (even though in practice only one
> > will be used because popcnt needs zero extension anyway).
>
> My objection to keeping the unused POPCNT* enumerators is that it interferes with proper
> cooperation with -Wswitch, to diagnose missing enumerators.  This is also why I removed
> CC_OP_NB.

Yes, I agree with removing CC_OP_NB. However the unused POPCNT[BWLQ]
can be implemented trivially (they're all the same).

> > As an aside, I'm wondering if CC_OP_CLR is particularly important; I
> > expect "xor reg, reg" to be followed by more ALU operations most of
> > the time and to not be followed by a jump, so it only saves a spill if
> > xor reg, reg is followed by a lot or store. If gen_XOR used either
> > CC_OP_LOGICn or CC_OP_EFLAGS for "xor reg, reg", the values in
> > decode->cc_op_* (CC_OP_DST=0 for CC_OP_LOGICn; CC_OP_SRC=CC_Z|CC_P for
> > CC_OP_EFLAGS) would be constant and wouldn't add to register pressure.
>
> You could easily be right.  Improvements to tcg in the last 11 years may have made it
> redundant, or it might have been wishful thinking even at the time.

Maybe. Just looking at the last couple years, with PCREL the cost of
translation has decreased substantially, and with the new decoder the
number of tcg ops has increased a bit. In both cases that means that
counting the tcg ops becomes less important.

BTW I found an easy way to implement X86_SPECIAL_BitTest without
crashes (just use cpu_regs[op->n] when computing the displacement
since you cannot have ah/bh/ch/dh). But I think it will be for 9.2.
Maybe these patches can wait too?

Paolo

Richard Henderson July 1, 2024, 7:51 p.m. UTC | #4

On 7/1/24 12:30, Paolo Bonzini wrote:
> BTW I found an easy way to implement X86_SPECIAL_BitTest without
> crashes (just use cpu_regs[op->n] when computing the displacement
> since you cannot have ah/bh/ch/dh). But I think it will be for 9.2.
> Maybe these patches can wait too?

Certainly.


r~