From patchwork Mon Apr 28 17:03:41 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 885485
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Wilco Dijkstra
Subject: [PATCH v2 1/4] math: Remove UB and optimize double ilogb
Date: Mon, 28 Apr 2025 14:03:41 -0300
Message-ID: <20250428170430.2030400-2-adhemerval.zanella@linaro.org>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20250428170430.2030400-1-adhemerval.zanella@linaro.org>
References: <20250428170430.2030400-1-adhemerval.zanella@linaro.org>
List-Id: Libc-alpha mailing list

The subnormal exponent calculation invokes UB by left shifting the
signed exponent to find the first leading bit.  The implementation
also uses 32-bit operations, which generates suboptimal code on
64-bit architectures.

The patch reimplements ilogb using the math_config.h macros and uses
the new stdbit function stdc_leading_zeros to simplify the subnormal
handling.
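The approach described above can be sketched outside of glibc.  The following is a minimal standalone illustration, not the code from the patch: bits_of and sketch_ilogb are hypothetical names, the literal constants 52, 12 and 0x7ff are written out instead of the math_config.h macros, and the GCC/Clang builtin __builtin_clzll stands in for stdc_leading_zeros so that a C23 <stdbit.h> is not required.

```c
#include <limits.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the bits of a double as uint64_t.
   memcpy avoids type-punning problems.  */
static uint64_t
bits_of (double x)
{
  uint64_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

/* Sketch of the technique, assuming IEEE 754 binary64 doubles.  */
static int
sketch_ilogb (double x)
{
  uint64_t ux = bits_of (x);
  /* Biased exponent: 11 bits above the 52 fraction bits.  */
  int ex = (int) ((ux >> 52) & 0x7ff);
  /* Shifting by 12 drops the sign and exponent and leaves the fraction
     in the top 52 bits.  The value is unsigned, so the shift is well
     defined.  */
  uint64_t frac = ux << 12;

  if (ex == 0)
    {
      /* Zero or subnormal.  */
      if (frac == 0)
        return FP_ILOGB0;
      /* __builtin_clzll is undefined for 0, which was excluded above.  */
      return -1023 - __builtin_clzll (frac);
    }
  if (ex == 0x7ff)
    /* NaN if any fraction bit is set, otherwise infinity.  */
    return frac != 0 ? FP_ILOGBNAN : INT_MAX;
  return ex - 1023;
}
```

For the smallest subnormal only fraction bit 0 is set, so frac has 51 leading zeros and the result is -1023 - 51 = -1074.  A single count-leading-zeros operation replaces the two shift loops of the old code.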

On aarch64 it generates better code (16 instructions and no loops,
versus 33 instructions with two shift loops on master):

* master:

0000000000000000 <__ieee754_ilogb>:
   0:   9e660000        fmov    x0, d0
   4:   d360fc02        lsr     x2, x0, #32
   8:   d360f801        ubfx    x1, x0, #32, #31
   c:   f26c285f        tst     x2, #0x7ff00000
  10:   540001a1        b.ne    44 <__ieee754_ilogb+0x44>  // b.any
  14:   2a000022        orr     w2, w1, w0
  18:   34000322        cbz     w2, 7c <__ieee754_ilogb+0x7c>
  1c:   35000221        cbnz    w1, 60 <__ieee754_ilogb+0x60>
  20:   2a0003e1        mov     w1, w0
  24:   7100001f        cmp     w0, #0x0
  28:   12808240        mov     w0, #0xfffffbed                 // #-1043
  2c:   540000ad        b.le    40 <__ieee754_ilogb+0x40>
  30:   531f7821        lsl     w1, w1, #1
  34:   51000400        sub     w0, w0, #0x1
  38:   7100003f        cmp     w1, #0x0
  3c:   54ffffac        b.gt    30 <__ieee754_ilogb+0x30>
  40:   d65f03c0        ret
  44:   13147c20        asr     w0, w1, #20
  48:   12b00202        mov     w2, #0x7fefffff                 // #2146435071
  4c:   510ffc00        sub     w0, w0, #0x3ff
  50:   6b02003f        cmp     w1, w2
  54:   12b00001        mov     w1, #0x7fffffff                 // #2147483647
  58:   1a819000        csel    w0, w0, w1, ls  // ls = plast
  5c:   d65f03c0        ret
  60:   53155021        lsl     w1, w1, #11
  64:   12807fa0        mov     w0, #0xfffffc02                 // #-1022
  68:   531f7821        lsl     w1, w1, #1
  6c:   51000400        sub     w0, w0, #0x1
  70:   7100003f        cmp     w1, #0x0
  74:   54ffffac        b.gt    68 <__ieee754_ilogb+0x68>
  78:   d65f03c0        ret
  7c:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  80:   d65f03c0        ret

* patch:

0000000000000000 <__ieee754_ilogb>:
   0:   9e660001        fmov    x1, d0
   4:   d374f820        ubfx    x0, x1, #52, #11
   8:   350000e0        cbnz    w0, 24 <__ieee754_ilogb+0x24>
   c:   d374cc21        lsl     x1, x1, #12
  10:   b4000141        cbz     x1, 38 <__ieee754_ilogb+0x38>
  14:   dac01021        clz     x1, x1
  18:   12807fc0        mov     w0, #0xfffffc01                 // #-1023
  1c:   4b010000        sub     w0, w0, w1
  20:   d65f03c0        ret
  24:   711ffc1f        cmp     w0, #0x7ff
  28:   510ffc00        sub     w0, w0, #0x3ff
  2c:   12b00001        mov     w1, #0x7fffffff                 // #2147483647
  30:   1a811000        csel    w0, w0, w1, ne  // ne = any
  34:   d65f03c0        ret
  38:   320107e0        mov     w0, #0x80000001                 // #-2147483647
  3c:   d65f03c0        ret

Other architectures with support for stdc_leading_zeros and/or
__builtin_clzll should see similar improvements.

Checked on aarch64-linux-gnu and x86_64-linux-gnu.
---
 sysdeps/ieee754/dbl-64/e_ilogb.c | 80 ++++++++++++--------------------
 1 file changed, 29 insertions(+), 51 deletions(-)

diff --git a/sysdeps/ieee754/dbl-64/e_ilogb.c b/sysdeps/ieee754/dbl-64/e_ilogb.c
index 1e338a59c1..89e7498266 100644
--- a/sysdeps/ieee754/dbl-64/e_ilogb.c
+++ b/sysdeps/ieee754/dbl-64/e_ilogb.c
@@ -1,63 +1,41 @@
-/* @(#)s_ilogb.c 5.1 93/09/24 */
-/*
- * ====================================================
- * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
- *
- * Developed at SunPro, a Sun Microsystems, Inc. business.
- * Permission to use, copy, modify, and distribute this
- * software is freely granted, provided that this notice
- * is preserved.
- * ====================================================
- */
+/* Get integer exponent of a floating-point value.
+   Copyright (C) 1999-2025 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
 
-#if defined(LIBM_SCCS) && !defined(lint)
-static char rcsid[] = "$NetBSD: s_ilogb.c,v 1.9 1995/05/10 20:47:28 jtc Exp $";
-#endif
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
 
-/* ilogb(double x)
- * return the binary exponent of non-zero x
- * ilogb(0) = FP_ILOGB0
- * ilogb(NaN) = FP_ILOGBNAN (no signal is raised)
- * ilogb(+-Inf) = INT_MAX (no signal is raised)
- */
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
 
 #include
 #include
-#include
+#include
+#include "math_config.h"
 
 int
 __ieee754_ilogb (double x)
 {
-  int32_t hx, lx, ix;
-
-  GET_HIGH_WORD (hx, x);
-  hx &= 0x7fffffff;
-  if (hx < 0x00100000)
+  uint64_t ux = asuint64 (x);
+  int ex = (ux & ~SIGN_MASK) >> MANTISSA_WIDTH;
+  if (ex == 0) /* zero or subnormal */
     {
-      GET_LOW_WORD (lx, x);
-      if ((hx | lx) == 0)
-        return FP_ILOGB0;       /* ilogb(0) = FP_ILOGB0 */
-      else                      /* subnormal x */
-      if (hx == 0)
-        {
-          for (ix = -1043; lx > 0; lx <<= 1)
-            ix -= 1;
-        }
-      else
-        {
-          for (ix = -1022, hx <<= 11; hx > 0; hx <<= 1)
-            ix -= 1;
-        }
-      return ix;
+      /* Clear sign and exponent */
+      ux <<= 12;
+      if (ux == 0)
+        return FP_ILOGB0;
+      /* subnormal */
+      return -1023 - stdc_leading_zeros (ux);
     }
-  else if (hx < 0x7ff00000)
-    return (hx >> 20) - 1023;
-  else if (FP_ILOGBNAN != INT_MAX)
-    {
-      /* ISO C99 requires ilogb(+-Inf) == INT_MAX.  */
-      GET_LOW_WORD (lx, x);
-      if (((hx ^ 0x7ff00000) | lx) == 0)
-        return INT_MAX;
-    }
-  return FP_ILOGBNAN;
+  if (ex == EXPONENT_MASK >> MANTISSA_WIDTH) /* NaN or Inf */
+    return ux << 12 ? FP_ILOGBNAN : INT_MAX;
+  return ex - 1023;
 }
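
The removed subnormal loops shift a signed int32_t left until it stops being positive, which is where the undefined behaviour comes from.  As a rough illustration of why the count-leading-zeros form computes the same exponent, the sketch below writes the normalization loop on an unsigned 64-bit value (where the shift is well defined) next to the one-step form.  Both function names are hypothetical, __builtin_clzll is assumed to be available (GCC or Clang), and the argument is assumed to be a nonzero 52-bit fraction field.

```c
#include <stdint.h>

/* Loop form: shift the fraction left until its top bit is set,
   decrementing the exponent once per shift.  FRAC must be nonzero and
   fit in 52 bits, otherwise the loop would not terminate.  */
static int
subnormal_exp_loop (uint64_t frac)
{
  uint64_t v = frac << 12;      /* fraction moved to the top 52 bits */
  int e = -1023;
  while ((v & (UINT64_C (1) << 63)) == 0)
    {
      v <<= 1;
      e -= 1;
    }
  return e;
}

/* One-step form: the number of loop iterations above equals the number
   of leading zero bits.  Same precondition on FRAC, since
   __builtin_clzll is undefined for 0.  */
static int
subnormal_exp_clz (uint64_t frac)
{
  return -1023 - __builtin_clzll (frac << 12);
}
```

With only fraction bit i set (0 <= i <= 51), both functions return i - 1074, matching the value 2^(i - 1074) of that subnormal.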