[PULL,03/68] fpu: Implement float_flag_input_denormal_used

Message ID	20250211162554.4135349-4-peter.maydell@linaro.org
State	Not Applicable

Headers

Received-SPF: pass (google.com: domain of
 qemu-devel-bounces+patch=linaro.org@nongnu.org designates 209.51.188.17 as
 permitted sender) client-ip=209.51.188.17;
From: Peter Maydell <peter.maydell@linaro.org>
To: qemu-devel@nongnu.org
Subject: [PULL 03/68] fpu: Implement float_flag_input_denormal_used
Date: Tue, 11 Feb 2025 16:24:49 +0000
Message-Id: <20250211162554.4135349-4-peter.maydell@linaro.org>
In-Reply-To: <20250211162554.4135349-1-peter.maydell@linaro.org>
References: <20250211162554.4135349-1-peter.maydell@linaro.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Received-SPF: pass client-ip=2a00:1450:4864:20::32c;
 envelope-from=peter.maydell@linaro.org; helo=mail-wm1-x32c.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1,
 RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+patch=linaro.org@nongnu.org
Sender: qemu-devel-bounces+patch=linaro.org@nongnu.org

Series

[PULL,01/68] target/alpha: Don't corrupt error_code with unknown softfloat flags

Commit Message

Peter Maydell Feb. 11, 2025, 4:24 p.m. UTC

For the x86 and the Arm FEAT_AFP semantics, we need to be able to
tell the target code that the FPU operation has used an input
denormal.  Implement this; when it happens we set the new
float_flag_input_denormal_used.
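
As a rough illustration of what "telling the target code" can look like,
the sketch below folds a softfloat-style flag word into an architectural
status register.  It is not code from this series: demo_update_mxcsr()
and the DEMO_ names are hypothetical.  Only the flag value 0x4000 (from
this patch) and the position of the x86 MXCSR.DE bit (bit 1) are taken
as given.

```c
#include <stdint.h>

/* Value of float_flag_input_denormal_used in this patch; the macro name is hypothetical. */
#define DEMO_FLAG_INPUT_DENORMAL_USED 0x4000u
/* x86 MXCSR denormal-operand exception flag (DE), bit 1. */
#define DEMO_MXCSR_DE 0x0002u

/* Hypothetical helper: merge accumulated softfloat-style flags into MXCSR. */
static uint32_t demo_update_mxcsr(uint32_t mxcsr, unsigned softfloat_flags)
{
    if (softfloat_flags & DEMO_FLAG_INPUT_DENORMAL_USED) {
        mxcsr |= DEMO_MXCSR_DE;   /* sticky: this helper never clears it */
    }
    return mxcsr;
}
```

A real target would also have to decide when to clear the accumulated
softfloat flags; that is outside the scope of this sketch.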

Note that we only set this when an input denormal is actually used by
the operation: if the operation results in Invalid Operation or
Divide By Zero or the result is a NaN because some other input was a
NaN then we never needed to look at the input denormal and do not set
float_flag_input_denormal_used.
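
The rule above can be modelled on input classes alone.  The following is
a minimal standalone sketch, not the softfloat code itself: the
demo_class enum and DEMO_CMASK macros are hypothetical stand-ins for the
float_class and float_cmask machinery, and only add/sub and division are
covered.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for softfloat's float classes, not QEMU's definitions. */
enum demo_class {
    DEMO_ZERO, DEMO_NORMAL, DEMO_DENORMAL, DEMO_INF, DEMO_QNAN, DEMO_SNAN
};

#define DEMO_CMASK(c)        (1u << (c))
#define DEMO_CMASK_DENORMAL  DEMO_CMASK(DEMO_DENORMAL)
#define DEMO_CMASK_ANYNAN    (DEMO_CMASK(DEMO_QNAN) | DEMO_CMASK(DEMO_SNAN))

/* Add/sub: a denormal input is consumed unless either input is a NaN. */
static bool demo_addsub_uses_denormal(enum demo_class a, enum demo_class b)
{
    unsigned ab_mask = DEMO_CMASK(a) | DEMO_CMASK(b);

    return (ab_mask & (DEMO_CMASK_DENORMAL | DEMO_CMASK_ANYNAN))
           == DEMO_CMASK_DENORMAL;
}

/* Division: additionally not consumed when the divisor is zero (Divide By Zero). */
static bool demo_div_uses_denormal(enum demo_class a, enum demo_class b)
{
    unsigned ab_mask = DEMO_CMASK(a) | DEMO_CMASK(b);

    if (!(ab_mask & DEMO_CMASK_DENORMAL) || (ab_mask & DEMO_CMASK_ANYNAN)) {
        return false;
    }
    return b != DEMO_ZERO;
}
```

For example, denormal + normal reports the flag, denormal + qNaN does
not, and denormal / zero does not because the result is Divide By Zero.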

We mostly do not need to adjust the hardfloat codepaths to deal with
this flag, because almost all hardfloat operations are already gated
on the input not being a denormal, and will fall back to softfloat
for a denormal input.  The only exception is the comparison
operations, where we need to add the check for input denormals, which
must now fall back to softfloat where they did not before.
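
The comparison change can be pictured with the standalone sketch below.
It assumes host float arithmetic with subnormals enabled, and every
demo_ name is hypothetical; demo_soft_compare() only stands in for the
softfloat slow path and is not the real implementation.

```c
#include <float.h>
#include <stdbool.h>

/* Value of float_flag_input_denormal_used in this patch; the macro name is hypothetical. */
#define DEMO_FLAG_INPUT_DENORMAL_USED 0x4000u

enum demo_relation {
    DEMO_LESS = -1, DEMO_EQUAL = 0, DEMO_GREATER = 1, DEMO_UNORDERED = 2
};

typedef struct { unsigned flags; } demo_status;

static bool demo_is_nan(float x) { return x != x; }

static bool demo_is_denormal(float x)
{
    float mag = x < 0 ? -x : x;
    return mag != 0.0f && mag < FLT_MIN;   /* nonzero but below smallest normal */
}

/* Stand-in for the slow path: the only place that can raise the flag. */
static enum demo_relation demo_soft_compare(float a, float b, demo_status *s)
{
    if (demo_is_nan(a) || demo_is_nan(b)) {
        return DEMO_UNORDERED;             /* the denormal was never looked at */
    }
    if (demo_is_denormal(a) || demo_is_denormal(b)) {
        s->flags |= DEMO_FLAG_INPUT_DENORMAL_USED;
    }
    return (enum demo_relation)((a > b) - (a < b));
}

/* Fast path: host comparison, taken only when no input is a NaN or denormal. */
static enum demo_relation demo_compare(float a, float b, demo_status *s)
{
    if (demo_is_nan(a) || demo_is_nan(b) ||
        demo_is_denormal(a) || demo_is_denormal(b)) {
        return demo_soft_compare(a, b, s);
    }
    return (enum demo_relation)((a > b) - (a < b));
}

/* Convenience wrapper: return the flags raised by a single comparison. */
static unsigned demo_compare_flags(float a, float b)
{
    demo_status s = { 0 };
    (void)demo_compare(a, b, &s);
    return s.flags;
}
```

The point of the gate is that the fast path has no way to report the
flag, so any input that might need it has to take the slow path.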

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 include/fpu/softfloat-types.h |  7 ++++
 fpu/softfloat.c               | 38 +++++++++++++++++---
 fpu/softfloat-parts.c.inc     | 68 ++++++++++++++++++++++++++++++++++-
 3 files changed, 107 insertions(+), 6 deletions(-)

diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h
index 2e43d1dd9e6..bba1c397bb7 100644
--- a/include/fpu/softfloat-types.h
+++ b/include/fpu/softfloat-types.h
@@ -165,6 +165,13 @@  enum {
     float_flag_invalid_sqrt    = 0x0800,  /* sqrt(-x) */
     float_flag_invalid_cvti    = 0x1000,  /* non-nan to integer */
     float_flag_invalid_snan    = 0x2000,  /* any operand was snan */
+    /*
+     * An input was denormal and we used it (without flushing it to zero).
+     * Not set if we do not actually use the denormal input (e.g.
+     * because some other input was a NaN, or because the operation
+     * wasn't actually carried out (divide-by-zero; invalid))
+     */
+    float_flag_input_denormal_used = 0x4000,
 };
 
 /*
diff --git a/fpu/softfloat.c b/fpu/softfloat.c
index 03a604c38ec..f4fed9bfda9 100644
--- a/fpu/softfloat.c
+++ b/fpu/softfloat.c
@@ -2718,8 +2718,10 @@  static void parts_float_to_ahp(FloatParts64 *a, float_status *s)
                                   float16_params_ahp.frac_size + 1);
         break;
 
-    case float_class_normal:
     case float_class_denormal:
+        float_raise(float_flag_input_denormal_used, s);
+        break;
+    case float_class_normal:
     case float_class_zero:
         break;
 
@@ -2733,6 +2735,9 @@  static void parts64_float_to_float(FloatParts64 *a, float_status *s)
     if (is_nan(a->cls)) {
         parts_return_nan(a, s);
     }
+    if (a->cls == float_class_denormal) {
+        float_raise(float_flag_input_denormal_used, s);
+    }
 }
 
 static void parts128_float_to_float(FloatParts128 *a, float_status *s)
@@ -2740,6 +2745,9 @@  static void parts128_float_to_float(FloatParts128 *a, float_status *s)
     if (is_nan(a->cls)) {
         parts_return_nan(a, s);
     }
+    if (a->cls == float_class_denormal) {
+        float_raise(float_flag_input_denormal_used, s);
+    }
 }
 
 #define parts_float_to_float(P, S) \
@@ -2752,12 +2760,21 @@  static void parts_float_to_float_narrow(FloatParts64 *a, FloatParts128 *b,
     a->sign = b->sign;
     a->exp = b->exp;
 
-    if (is_anynorm(a->cls)) {
+    switch (a->cls) {
+    case float_class_denormal:
+        float_raise(float_flag_input_denormal_used, s);
+        /* fall through */
+    case float_class_normal:
         frac_truncjam(a, b);
-    } else if (is_nan(a->cls)) {
+        break;
+    case float_class_snan:
+    case float_class_qnan:
         /* Discard the low bits of the NaN. */
         a->frac = b->frac_hi;
         parts_return_nan(a, s);
+        break;
+    default:
+        break;
     }
 }
 
@@ -2772,6 +2789,9 @@  static void parts_float_to_float_widen(FloatParts128 *a, FloatParts64 *b,
     if (is_nan(a->cls)) {
         parts_return_nan(a, s);
     }
+    if (a->cls == float_class_denormal) {
+        float_raise(float_flag_input_denormal_used, s);
+    }
 }
 
 float32 float16_to_float32(float16 a, bool ieee, float_status *s)
@@ -4411,7 +4431,11 @@  float32_hs_compare(float32 xa, float32 xb, float_status *s, bool is_quiet)
         goto soft;
     }
 
-    float32_input_flush2(&ua.s, &ub.s, s);
+    if (unlikely(float32_is_denormal(ua.s) || float32_is_denormal(ub.s))) {
+        /* We may need to set the input_denormal_used flag */
+        goto soft;
+    }
+
     if (isgreaterequal(ua.h, ub.h)) {
         if (isgreater(ua.h, ub.h)) {
             return float_relation_greater;
@@ -4461,7 +4485,11 @@  float64_hs_compare(float64 xa, float64 xb, float_status *s, bool is_quiet)
         goto soft;
     }
 
-    float64_input_flush2(&ua.s, &ub.s, s);
+    if (unlikely(float64_is_denormal(ua.s) || float64_is_denormal(ub.s))) {
+        /* We may need to set the input_denormal_used flag */
+        goto soft;
+    }
+
     if (isgreaterequal(ua.h, ub.h)) {
         if (isgreater(ua.h, ub.h)) {
             return float_relation_greater;
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc
index 8621cb87185..0122b35008a 100644
--- a/fpu/softfloat-parts.c.inc
+++ b/fpu/softfloat-parts.c.inc
@@ -433,6 +433,15 @@  static FloatPartsN *partsN(addsub)(FloatPartsN *a, FloatPartsN *b,
     bool b_sign = b->sign ^ subtract;
     int ab_mask = float_cmask(a->cls) | float_cmask(b->cls);
 
+    /*
+     * For addition and subtraction, we will consume an
+     * input denormal unless the other input is a NaN.
+     */
+    if ((ab_mask & (float_cmask_denormal | float_cmask_anynan)) ==
+        float_cmask_denormal) {
+        float_raise(float_flag_input_denormal_used, s);
+    }
+
     if (a->sign != b_sign) {
         /* Subtraction */
         if (likely(cmask_is_only_normals(ab_mask))) {
@@ -516,6 +525,10 @@  static FloatPartsN *partsN(mul)(FloatPartsN *a, FloatPartsN *b,
     if (likely(cmask_is_only_normals(ab_mask))) {
         FloatPartsW tmp;
 
+        if (ab_mask & float_cmask_denormal) {
+            float_raise(float_flag_input_denormal_used, s);
+        }
+
         frac_mulw(&tmp, a, b);
         frac_truncjam(a, &tmp);
 
@@ -541,6 +554,10 @@  static FloatPartsN *partsN(mul)(FloatPartsN *a, FloatPartsN *b,
     }
 
     /* Multiply by 0 or Inf */
+    if (ab_mask & float_cmask_denormal) {
+        float_raise(float_flag_input_denormal_used, s);
+    }
+
     if (ab_mask & float_cmask_inf) {
         a->cls = float_class_inf;
         a->sign = sign;
@@ -664,6 +681,16 @@  static FloatPartsN *partsN(muladd_scalbn)(FloatPartsN *a, FloatPartsN *b,
     if (flags & float_muladd_negate_result) {
         a->sign ^= 1;
     }
+
+    /*
+     * All result types except for "return the default NaN
+     * because this is an Invalid Operation" go through here;
+     * this matches the set of cases where we consumed a
+     * denormal input.
+     */
+    if (abc_mask & float_cmask_denormal) {
+        float_raise(float_flag_input_denormal_used, s);
+    }
     return a;
 
  return_sub_zero:
@@ -693,6 +720,9 @@  static FloatPartsN *partsN(div)(FloatPartsN *a, FloatPartsN *b,
     bool sign = a->sign ^ b->sign;
 
     if (likely(cmask_is_only_normals(ab_mask))) {
+        if (ab_mask & float_cmask_denormal) {
+            float_raise(float_flag_input_denormal_used, s);
+        }
         a->sign = sign;
         a->exp -= b->exp + frac_div(a, b);
         return a;
@@ -713,6 +743,10 @@  static FloatPartsN *partsN(div)(FloatPartsN *a, FloatPartsN *b,
         return parts_pick_nan(a, b, s);
     }
 
+    if ((ab_mask & float_cmask_denormal) && b->cls != float_class_zero) {
+        float_raise(float_flag_input_denormal_used, s);
+    }
+
     a->sign = sign;
 
     /* Inf / X */
@@ -751,6 +785,9 @@  static FloatPartsN *partsN(modrem)(FloatPartsN *a, FloatPartsN *b,
     int ab_mask = float_cmask(a->cls) | float_cmask(b->cls);
 
     if (likely(cmask_is_only_normals(ab_mask))) {
+        if (ab_mask & float_cmask_denormal) {
+            float_raise(float_flag_input_denormal_used, s);
+        }
         frac_modrem(a, b, mod_quot);
         return a;
     }
@@ -771,6 +808,10 @@  static FloatPartsN *partsN(modrem)(FloatPartsN *a, FloatPartsN *b,
         return a;
     }
 
+    if (ab_mask & float_cmask_denormal) {
+        float_raise(float_flag_input_denormal_used, s);
+    }
+
     /* N % Inf; 0 % N */
     g_assert(b->cls == float_class_inf || a->cls == float_class_zero);
     return a;
@@ -801,6 +842,10 @@  static void partsN(sqrt)(FloatPartsN *a, float_status *status,
     if (unlikely(a->cls != float_class_normal)) {
         switch (a->cls) {
         case float_class_denormal:
+            if (!a->sign) {
+                /* -ve denormal will be InvalidOperation */
+                float_raise(float_flag_input_denormal_used, status);
+            }
             break;
         case float_class_snan:
         case float_class_qnan:
@@ -1431,6 +1476,9 @@  static FloatPartsN *partsN(minmax)(FloatPartsN *a, FloatPartsN *b,
         if ((flags & (minmax_isnum | minmax_isnumber))
             && !(ab_mask & float_cmask_snan)
             && (ab_mask & ~float_cmask_qnan)) {
+            if (ab_mask & float_cmask_denormal) {
+                float_raise(float_flag_input_denormal_used, s);
+            }
             return is_nan(a->cls) ? b : a;
         }
 
@@ -1455,6 +1503,10 @@  static FloatPartsN *partsN(minmax)(FloatPartsN *a, FloatPartsN *b,
         return parts_pick_nan(a, b, s);
     }
 
+    if (ab_mask & float_cmask_denormal) {
+        float_raise(float_flag_input_denormal_used, s);
+    }
+
     a_exp = a->exp;
     b_exp = b->exp;
 
@@ -1524,6 +1576,10 @@  static FloatRelation partsN(compare)(FloatPartsN *a, FloatPartsN *b,
     if (likely(cmask_is_only_normals(ab_mask))) {
         FloatRelation cmp;
 
+        if (ab_mask & float_cmask_denormal) {
+            float_raise(float_flag_input_denormal_used, s);
+        }
+
         if (a->sign != b->sign) {
             goto a_sign;
         }
@@ -1549,6 +1605,10 @@  static FloatRelation partsN(compare)(FloatPartsN *a, FloatPartsN *b,
         return float_relation_unordered;
     }
 
+    if (ab_mask & float_cmask_denormal) {
+        float_raise(float_flag_input_denormal_used, s);
+    }
+
     if (ab_mask & float_cmask_zero) {
         if (ab_mask == float_cmask_zero) {
             return float_relation_equal;
@@ -1588,8 +1648,10 @@  static void partsN(scalbn)(FloatPartsN *a, int n, float_status *s)
     case float_class_zero:
     case float_class_inf:
         break;
-    case float_class_normal:
     case float_class_denormal:
+        float_raise(float_flag_input_denormal_used, s);
+        /* fall through */
+    case float_class_normal:
         a->exp += MIN(MAX(n, -0x10000), 0x10000);
         break;
     default:
@@ -1609,6 +1671,10 @@  static void partsN(log2)(FloatPartsN *a, float_status *s, const FloatFmt *fmt)
     if (unlikely(a->cls != float_class_normal)) {
         switch (a->cls) {
         case float_class_denormal:
+            if (!a->sign) {
+                /* -ve denormal will be InvalidOperation */
+                float_raise(float_flag_input_denormal_used, s);
+            }
             break;
         case float_class_snan:
         case float_class_qnan:
