Message ID | 20250217125055.160887-7-peter.maydell@linaro.org |
---|---|
State | New |
Headers | show |
Series | fpu: Remove remaining target ifdefs and build only once | expand |
On 2/17/25 04:50, Peter Maydell wrote: > Currently we compile-time set an 'm68k_denormal' flag in the FloatFmt > for floatx80 for m68k. This controls our handling of what the Intel > documentation calls a "pseudo-denormal": a value where the exponent > field is zero and the explicit integer bit is set. > > For x86, the x87 FPU is supposed to accept a pseudo-denormal as > input, but never generate one on output. For m68k, these values are > permitted on input and may be produced on output. > > Replace the flag in the FloatFmt with a flag indicating whether the > float format has an explicit bit (which will be true for floatx80 for > all targets, and false for every other float type). Then we can gate > the handling of these pseudo-denormals on the setting of a > floatx80_behaviour flag. > > As far as I can see from the code we don't actually handle the > x86-mandated "accept on input but don't generate" behaviour, because > the handling in partsN(canonicalize) looked at fmt->m68k_denormal. > So I have added TODO comments to that effect. > > This commit doesn't change any behaviour for any target. > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > --- > I'm confident this commit preserves existing behaviour, but > somewhat less confident that I've correctly analysed what our > current code does, in particular that it doesn't do the x86 > mandated "handle pseudo-denormals on input" part. Test case: #include <stdio.h> int main() { union { struct { unsigned long long m; unsigned short e; } i; long double f; } u[2] = { }; unsigned short sw; asm volatile("fnclex" : : : "memory"); u[0].i.m = 0xc000000000000000ull; u[0].f += u[1].f; /* denormal + zero -> renormalize */ asm volatile("fstsw %w0" : "=a"(sw) : : "memory"); printf("%04x %016llx %04x\n", u[0].i.e, u[0].i.m, sw); return 0; } Expected behaviour is setting DE for consuming a denormal. As you say, this patch preserves existing behaviour so, Reviewed-by: Richard Henderson <richard.henderson@linaro.org> r~
On Mon, 17 Feb 2025 at 19:14, Richard Henderson <richard.henderson@linaro.org> wrote: > > On 2/17/25 04:50, Peter Maydell wrote: > > Currently we compile-time set an 'm68k_denormal' flag in the FloatFmt > > for floatx80 for m68k. This controls our handling of what the Intel > > documentation calls a "pseudo-denormal": a value where the exponent > > field is zero and the explicit integer bit is set. > > > > For x86, the x87 FPU is supposed to accept a pseudo-denormal as > > input, but never generate one on output. For m68k, these values are > > permitted on input and may be produced on output. > > > > Replace the flag in the FloatFmt with a flag indicating whether the > > float format has an explicit bit (which will be true for floatx80 for > > all targets, and false for every other float type). Then we can gate > > the handling of these pseudo-denormals on the setting of a > > floatx80_behaviour flag. > > > > As far as I can see from the code we don't actually handle the > > x86-mandated "accept on input but don't generate" behaviour, because > > the handling in partsN(canonicalize) looked at fmt->m68k_denormal. > > So I have added TODO comments to that effect. > > > > This commit doesn't change any behaviour for any target. > > > > Signed-off-by: Peter Maydell <peter.maydell@linaro.org> > > --- > > I'm confident this commit preserves existing behaviour, but > > somewhat less confident that I've correctly analysed what our > > current code does, in particular that it doesn't do the x86 > > mandated "handle pseudo-denormals on input" part. > > Test case: > > #include <stdio.h> > > int main() > { > union { > struct { > unsigned long long m; > unsigned short e; > } i; > long double f; > } u[2] = { }; > unsigned short sw; > > asm volatile("fnclex" : : : "memory"); > > u[0].i.m = 0xc000000000000000ull; > u[0].f += u[1].f; /* denormal + zero -> renormalize */ > > asm volatile("fstsw %w0" : "=a"(sw) : : "memory"); > > printf("%04x %016llx %04x\n", u[0].i.e, u[0].i.m, sw); > return 0; > } > > Expected behaviour is setting DE for consuming a denormal. So, on real hardware I see 0001 c000000000000000 0002 and on QEMU, with the patchset that fixes the DE bit handling https://patchew.org/QEMU/20250213142613.151308-1-peter.maydell@linaro.org/ I also see the same thing. That suggests that we are correctly implementing the x87 required behaviour in QEMU, and so that the TODO comment I add in this patch isn't right. But then I'm a bit confused about what the code is actually doing. Why do we need to look at fmt->m68k_denormal in the input (canonicalize) code (i.e. to have different behaviour here for x86 and m68k), if both x86 and m68k accept these pseudodenormals as input? Is the difference that for x86 we accept but canonicalize into the equivalent normal number immediately on input, whereas for m68k we accept and leave the pseudodenormal as a pseudodenormal (well, m68k calls these a kind of normal number) ? thanks -- PMM
On 2/20/25 09:12, Peter Maydell wrote: > That suggests that we are correctly implementing the x87 > required behaviour in QEMU, and so that the TODO comment > I add in this patch isn't right. But then I'm a bit confused > about what the code is actually doing. Why do we need to look > at fmt->m68k_denormal in the input (canonicalize) code (i.e. > to have different behaviour here for x86 and m68k), if > both x86 and m68k accept these pseudodenormals as input? > > Is the difference that for x86 we accept but canonicalize > into the equivalent normal number immediately on input, > whereas for m68k we accept and leave the pseudodenormal > as a pseudodenormal (well, m68k calls these a kind of > normal number) ? The difference is in interpretation: x86 ignores the explicit integer bit of the pseudo-denormal, m68k considers it part of the input value. This gives m68k one extra bit of range in their denormal, which allows representation of smaller numbers. r~
On Thu, 20 Feb 2025 at 18:39, Richard Henderson <richard.henderson@linaro.org> wrote: > > On 2/20/25 09:12, Peter Maydell wrote: > > That suggests that we are correctly implementing the x87 > > required behaviour in QEMU, and so that the TODO comment > > I add in this patch isn't right. But then I'm a bit confused > > about what the code is actually doing. Why do we need to look > > at fmt->m68k_denormal in the input (canonicalize) code (i.e. > > to have different behaviour here for x86 and m68k), if > > both x86 and m68k accept these pseudodenormals as input? > > > > Is the difference that for x86 we accept but canonicalize > > into the equivalent normal number immediately on input, > > whereas for m68k we accept and leave the pseudodenormal > > as a pseudodenormal (well, m68k calls these a kind of > > normal number) ? > The difference is in interpretation: x86 ignores the explicit integer bit of the > pseudo-denormal, m68k considers it part of the input value. This gives m68k one extra bit > of range in their denormal, which allows representation of smaller numbers. Ah, I see. So I suppose: (1) we should call the floatx80_status flag "floatx80_pseudo_denormal_valid" since it affects both inputs and outputs, and document it in the enum like: + /* + * If the exponent is 0 and the Integer bit is set, Intel call + * this a "pseudo-denormal"; x86 supports that only on input + * (treating them as denormals by ignoring the Integer bit). + * For m68k, the integer bit is considered validly part of the + * input value when the exponent is 0, and may be 0 or 1, + * giving extra range. They may also be generated as outputs. + * (The m68k manual actually calls these values part of the + * normalized number range, not the denormalized number range.) + * + * By default you get the Intel behaviour where the Integer + * bit is ignored; if this is set then the Integer bit value + * is honoured, m68k-style. + * + * Either way, floatx80_invalid_encoding() will always accept + * pseudo-denormals. + */ + floatx80_pseudo_denormal_valid = 16, (2) the comment I add in canonicalize should instead read: + /* + * It's target-dependent how to handle the case of exponent 0 + * and Integer bit set. Intel calls these "pseudodenormals", + * and treats them as if the integer bit was 0, and never + * produces them on output. This is the default behaviour for QEMU. + * For m68k, the integer bit is considered validly part of the + * input value when the exponent is 0, and may be 0 or 1, + * giving extra range. They may also be generated as outputs. + * (The m68k manual actually calls these values part of the + * normalized number range, not the denormalized number range, + * but that distinction is not important for us, because + * m68k doesn't care about the input_denormal_used status flag.) + * floatx80_pseudo_denormal_valid selects the m68k behaviour, + * which changes both how we canonicalize such a value and + * how we uncanonicalize results. + */ ? thanks -- PMM
On 20/2/25 19:54, Peter Maydell wrote: > On Thu, 20 Feb 2025 at 18:39, Richard Henderson > <richard.henderson@linaro.org> wrote: >> >> On 2/20/25 09:12, Peter Maydell wrote: >>> That suggests that we are correctly implementing the x87 >>> required behaviour in QEMU, and so that the TODO comment >>> I add in this patch isn't right. But then I'm a bit confused >>> about what the code is actually doing. Why do we need to look >>> at fmt->m68k_denormal in the input (canonicalize) code (i.e. >>> to have different behaviour here for x86 and m68k), if >>> both x86 and m68k accept these pseudodenormals as input? >>> >>> Is the difference that for x86 we accept but canonicalize >>> into the equivalent normal number immediately on input, >>> whereas for m68k we accept and leave the pseudodenormal >>> as a pseudodenormal (well, m68k calls these a kind of >>> normal number) ? >> The difference is in interpretation: x86 ignores the explicit integer bit of the >> pseudo-denormal, m68k considers it part of the input value. This gives m68k one extra bit >> of range in their denormal, which allows representation of smaller numbers. > > Ah, I see. So I suppose: > > (1) we should call the floatx80_status flag > "floatx80_pseudo_denormal_valid" since it affects both inputs > and outputs, and document it in the enum like: > > + /* > + * If the exponent is 0 and the Integer bit is set, Intel call > + * this a "pseudo-denormal"; x86 supports that only on input > + * (treating them as denormals by ignoring the Integer bit). > + * For m68k, the integer bit is considered validly part of the > + * input value when the exponent is 0, and may be 0 or 1, > + * giving extra range. They may also be generated as outputs. > + * (The m68k manual actually calls these values part of the > + * normalized number range, not the denormalized number range.) > + * > + * By default you get the Intel behaviour where the Integer > + * bit is ignored; if this is set then the Integer bit value > + * is honoured, m68k-style. > + * > + * Either way, floatx80_invalid_encoding() will always accept > + * pseudo-denormals. > + */ > + floatx80_pseudo_denormal_valid = 16, > > > (2) the comment I add in canonicalize should instead read: > > + /* > + * It's target-dependent how to handle the case of exponent 0 > + * and Integer bit set. Intel calls these "pseudodenormals", > + * and treats them as if the integer bit was 0, and never > + * produces them on output. This is the default behaviour for QEMU. > + * For m68k, the integer bit is considered validly part of the > + * input value when the exponent is 0, and may be 0 or 1, > + * giving extra range. They may also be generated as outputs. > + * (The m68k manual actually calls these values part of the > + * normalized number range, not the denormalized number range, > + * but that distinction is not important for us, because > + * m68k doesn't care about the input_denormal_used status flag.) > + * floatx80_pseudo_denormal_valid selects the m68k behaviour, > + * which changes both how we canonicalize such a value and > + * how we uncanonicalize results. > + */ Both changes LGTM but I'm no expert here ;) To the best of my FPU knowledge: Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h index b1941384aef..f57c7193653 100644 --- a/include/fpu/softfloat-types.h +++ b/include/fpu/softfloat-types.h @@ -349,6 +349,16 @@ typedef enum __attribute__((__packed__)) { * and using them as inputs to a float op will raise Invalid. */ floatx80_unnormal_valid = 8, + /* + * If the exponent is 0 and the Integer bit is set, Intel call + * this a "pseudo-denormal"; x86 supports that only on input + * (treating them as denormals). m68k calls these a type of + * normalized number, and may both input and output them. + * TODO: currently we only support "handle on both input and + * output" or "don't handle and don't generate". + * floatx80_invalid_encoding() will always accept them. + */ + floatx80_pseudo_denormal_output_valid = 16, } FloatX80Behaviour; /* diff --git a/fpu/softfloat.c b/fpu/softfloat.c index 2a20ae871eb..b299cfaf860 100644 --- a/fpu/softfloat.c +++ b/fpu/softfloat.c @@ -537,7 +537,8 @@ typedef struct { * round_mask: bits below lsb which must be rounded * The following optional modifiers are available: * arm_althp: handle ARM Alternative Half Precision - * m68k_denormal: explicit integer bit for extended precision may be 1 + * has_explicit_bit: has an explicit integer bit; this affects whether + * the float_status floatx80_behaviour handling applies */ typedef struct { int exp_size; @@ -547,7 +548,7 @@ typedef struct { int frac_size; int frac_shift; bool arm_althp; - bool m68k_denormal; + bool has_explicit_bit; uint64_t round_mask; } FloatFmt; @@ -600,9 +601,7 @@ static const FloatFmt floatx80_params[3] = { [floatx80_precision_d] = { FLOATX80_PARAMS(52) }, [floatx80_precision_x] = { FLOATX80_PARAMS(64), -#ifdef TARGET_M68K - .m68k_denormal = true, -#endif + .has_explicit_bit = true, }, }; diff --git a/target/m68k/cpu.c b/target/m68k/cpu.c index 505fa97a53f..b21813a1b96 100644 --- a/target/m68k/cpu.c +++ b/target/m68k/cpu.c @@ -139,7 +139,8 @@ static void m68k_cpu_reset_hold(Object *obj, ResetType type) set_floatx80_behaviour(floatx80_default_inf_int_bit_is_zero | floatx80_pseudo_inf_valid | floatx80_pseudo_nan_valid | - floatx80_unnormal_valid, + floatx80_unnormal_valid | + floatx80_pseudo_denormal_output_valid, &env->fp_status); nan = floatx80_default_nan(&env->fp_status); diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc index 1d09f066c5d..6abd6b28b85 100644 --- a/fpu/softfloat-parts.c.inc +++ b/fpu/softfloat-parts.c.inc @@ -195,6 +195,18 @@ static FloatPartsN *partsN(pick_nan_muladd)(FloatPartsN *a, FloatPartsN *b, static void partsN(canonicalize)(FloatPartsN *p, float_status *status, const FloatFmt *fmt) { + /* + * It's target-dependent whether exponent 0 and Integer bit set + * is valid. TODO: the x86 manual says that the x87 FPU will + * accept these on inputs, treating them as denormals, but will + * never output them. However we seem to not handle them on + * either input or output. Also we think of them as denormal + * numbers for m68k, though that doesn't matter since m68k doesn't + * use the input_denormal_used flag. + */ + bool has_pseudo_denormals = fmt->has_explicit_bit && + (status->floatx80_behaviour & floatx80_pseudo_denormal_output_valid); + if (unlikely(p->exp == 0)) { if (likely(frac_eqz(p))) { p->cls = float_class_zero; @@ -206,7 +218,7 @@ static void partsN(canonicalize)(FloatPartsN *p, float_status *status, int shift = frac_normalize(p); p->cls = float_class_denormal; p->exp = fmt->frac_shift - fmt->exp_bias - - shift + !fmt->m68k_denormal; + - shift + !has_pseudo_denormals; } } else if (likely(p->exp < fmt->exp_max) || fmt->arm_althp) { p->cls = float_class_normal; @@ -342,13 +354,15 @@ static void partsN(uncanon_normal)(FloatPartsN *p, float_status *s, frac_clear(p); } else { bool is_tiny = s->tininess_before_rounding || exp < 0; + bool has_pseudo_denormals = fmt->has_explicit_bit && + (s->floatx80_behaviour & floatx80_pseudo_denormal_output_valid); if (!is_tiny) { FloatPartsN discard; is_tiny = !frac_addi(&discard, p, inc); } - frac_shrjam(p, !fmt->m68k_denormal - exp); + frac_shrjam(p, !has_pseudo_denormals - exp); if (p->frac_lo & round_mask) { /* Need to recompute round-to-even/round-to-odd. */ @@ -379,7 +393,7 @@ static void partsN(uncanon_normal)(FloatPartsN *p, float_status *s, p->frac_lo &= ~round_mask; } - exp = (p->frac_hi & DECOMPOSED_IMPLICIT_BIT) && !fmt->m68k_denormal; + exp = (p->frac_hi & DECOMPOSED_IMPLICIT_BIT) && !has_pseudo_denormals; frac_shr(p, frac_shift); if (is_tiny) {
Currently we compile-time set an 'm68k_denormal' flag in the FloatFmt for floatx80 for m68k. This controls our handling of what the Intel documentation calls a "pseudo-denormal": a value where the exponent field is zero and the explicit integer bit is set. For x86, the x87 FPU is supposed to accept a pseudo-denormal as input, but never generate one on output. For m68k, these values are permitted on input and may be produced on output. Replace the flag in the FloatFmt with a flag indicating whether the float format has an explicit bit (which will be true for floatx80 for all targets, and false for every other float type). Then we can gate the handling of these pseudo-denormals on the setting of a floatx80_behaviour flag. As far as I can see from the code we don't actually handle the x86-mandated "accept on input but don't generate" behaviour, because the handling in partsN(canonicalize) looked at fmt->m68k_denormal. So I have added TODO comments to that effect. This commit doesn't change any behaviour for any target. Signed-off-by: Peter Maydell <peter.maydell@linaro.org> --- I'm confident this commit preserves existing behaviour, but somewhat less confident that I've correctly analysed what our current code does, in particular that it doesn't do the x86 mandated "handle pseudo-denormals on input" part. --- include/fpu/softfloat-types.h | 10 ++++++++++ fpu/softfloat.c | 9 ++++----- target/m68k/cpu.c | 3 ++- fpu/softfloat-parts.c.inc | 20 +++++++++++++++++--- 4 files changed, 33 insertions(+), 9 deletions(-)