From patchwork Tue Jun 3 19:51:54 2025
X-Patchwork-Submitter: Adhemerval Zanella Netto
X-Patchwork-Id: 893872
From: Adhemerval Zanella
To: libc-alpha@sourceware.org
Cc: Carlos O'Donell, Andrew Pinski
Subject: [PATCH 3/3] resolv: Optimize inet_ntop
Date: Tue, 3 Jun 2025 16:51:54 -0300
Message-ID: <20250603195332.2822499-4-adhemerval.zanella@linaro.org>
In-Reply-To: <20250603195332.2822499-1-adhemerval.zanella@linaro.org>
References: <20250603195332.2822499-1-adhemerval.zanella@linaro.org>

The benchtests/inet_ntop_ipv4 and benchtests/inet_ntop_ipv6 profiles
show that most of the time is spent in costly sprintf operations:

$ perf record ./benchtests/bench-inet_ntop_ipv4 && perf report --stdio
[...]
    38.53%  bench-inet_ntop  libc.so               [.] __printf_buffer
    18.69%  bench-inet_ntop  libc.so               [.] __printf_buffer_write
    11.01%  bench-inet_ntop  libc.so               [.] _itoa_word
     8.02%  bench-inet_ntop  bench-inet_ntop_ipv4  [.] bench_start
     6.99%  bench-inet_ntop  libc.so               [.] __memmove_avx_unaligned_erms
     3.86%  bench-inet_ntop  libc.so               [.] __strchrnul_avx2
     2.82%  bench-inet_ntop  libc.so               [.] __strcpy_avx2
     1.90%  bench-inet_ntop  libc.so               [.] inet_ntop4
     1.78%  bench-inet_ntop  libc.so               [.] __vsprintf_internal
     1.55%  bench-inet_ntop  libc.so               [.] __sprintf_chk
     1.18%  bench-inet_ntop  libc.so               [.] __GI___inet_ntop

$ perf record ./benchtests/bench-inet_ntop_ipv6 && perf report --stdio
    35.44%  bench-inet_ntop  libc.so               [.] __printf_buffer
    14.35%  bench-inet_ntop  libc.so               [.] __printf_buffer_write
    10.27%  bench-inet_ntop  libc.so               [.] __GI___inet_ntop
     7.93%  bench-inet_ntop  libc.so               [.] _itoa_word
     7.00%  bench-inet_ntop  libc.so               [.] __sprintf_chk
     6.20%  bench-inet_ntop  libc.so               [.] __vsprintf_internal
     5.26%  bench-inet_ntop  libc.so               [.] __strchrnul_avx2
     5.05%  bench-inet_ntop  bench-inet_ntop_ipv6  [.] bench_start
     3.70%  bench-inet_ntop  libc.so               [.] __memmove_avx_unaligned_erms
     2.11%  bench-inet_ntop  libc.so               [.] __printf_buffer_done

The printf usage is replaced with an expanded function that prints
either an IPv4 octet or an IPv6 quartet, the strcpy is replaced with a
memcpy (since ABIs usually optimize the symbol), and inline is used
for both inet_ntop4 and inet_ntop6.

The performance results on aarch64 Neoverse1 with gcc 14.2.1:

* master

aarch64-linux-gnu-master$ ./benchtests/bench-inet_ntop_ipv4
  "inet_ntop_ipv4": {
   "workload-ipv4-random": {
    "duration": 1.43067e+09,
    "iterations": 8e+06,
    "reciprocal-throughput": 178.572,
    "latency": 179.096,
    "max-throughput": 5.59997e+06,
    "min-throughput": 5.58359e+06
   }
aarch64-linux-gnu-master$ ./benchtests/bench-inet_ntop_ipv6
  "inet_ntop_ipv6": {
   "workload-ipv6-random": {
    "duration": 1.68539e+09,
    "iterations": 4e+06,
    "reciprocal-throughput": 421.307,
    "latency": 421.388,
    "max-throughput": 2.37357e+06,
    "min-throughput": 2.37311e+06
   }
  }

* patched

aarch64-linux-gnu$ ./benchtests/bench-inet_ntop_ipv4
  "inet_ntop_ipv4": {
   "workload-ipv4-random": {
    "duration": 1.06324e+09,
    "iterations": 4.4e+07,
    "reciprocal-throughput": 24.1509,
    "latency": 24.178,
    "max-throughput": 4.14063e+07,
    "min-throughput": 4.13599e+07
   }
aarch64-linux-gnu$ ./benchtests/bench-inet_ntop_ipv6
  "inet_ntop_ipv6": {
   "workload-ipv6-random": {
    "duration": 1.08476e+09,
    "iterations": 2.4e+07,
    "reciprocal-throughput": 45.2794,
    "latency": 45.1174,
    "max-throughput": 2.20851e+07,
    "min-throughput": 2.21644e+07
   }
  }

Checked on aarch64-linux-gnu and x86_64-linux-gnu.
Reviewed-by: Collin Funk
---
 resolv/inet_ntop.c | 134 +++++++++++++++++++++++++++++----------------
 1 file changed, 87 insertions(+), 47 deletions(-)

diff --git a/resolv/inet_ntop.c b/resolv/inet_ntop.c
index acf5f3cb88..3f2a9ddd87 100644
--- a/resolv/inet_ntop.c
+++ b/resolv/inet_ntop.c
@@ -26,45 +26,44 @@
 #include
 #include
 #include
-
-#ifdef SPRINTF_CHAR
-# define SPRINTF(x) strlen(sprintf/**/x)
-#else
-# define SPRINTF(x) ((size_t)sprintf x)
-#endif
+#include
+#include <_itoa.h>
 
 /*
  * WARNING: Don't even consider trying to compile this on a system where
  * sizeof(int) < 4.  sizeof(int) > 4 is fine; all the world's not a VAX.
  */
 
-static const char *inet_ntop4 (const u_char *src, char *dst, socklen_t size);
-static const char *inet_ntop6 (const u_char *src, char *dst, socklen_t size);
-
-/* char *
- * __inet_ntop(af, src, dst, size)
- *	convert a network format address to presentation format.
- * return:
- *	pointer to presentation format address (`dst'), or NULL (see errno).
- * author:
- *	Paul Vixie, 1996.
- */
-const char *
-__inet_ntop (int af, const void *src, char *dst, socklen_t size)
+static inline char *put_uint8 (uint8_t word, char *tp)
 {
-	switch (af) {
-	case AF_INET:
-		return (inet_ntop4(src, dst, size));
-	case AF_INET6:
-		return (inet_ntop6(src, dst, size));
-	default:
-		__set_errno (EAFNOSUPPORT);
-		return (NULL);
-	}
-	/* NOTREACHED */
+  intptr_t s = 1;
+  if (word >= 100)
+    {
+      tp[2] = _itoa_lower_digits[word % 10];
+      word /= 10;
+      s += 1;
+    }
+  if (word >= 10)
+    {
+      tp[1] = _itoa_lower_digits[word % 10];
+      word /= 10;
+      s += 1;
+    }
+  *tp = _itoa_lower_digits[word % 10];
+  return tp + s;
+}
+
+static inline char *put_uint16 (uint16_t word, char *tp)
+{
+  if (word >= 0x1000)
+    *tp++ = _itoa_lower_digits[(word >> 12) & 0xf];
+  if (word >= 0x100)
+    *tp++ = _itoa_lower_digits[(word >> 8) & 0xf];
+  if (word >= 0x10)
+    *tp++ = _itoa_lower_digits[(word >> 4) & 0xf];
+  *tp++ = _itoa_lower_digits[word & 0xf];
+  return tp;
 }
-libc_hidden_def (__inet_ntop)
-weak_alias (__inet_ntop, inet_ntop)
 
 /* const char *
  * inet_ntop4(src, dst, size)
@@ -74,20 +73,36 @@ weak_alias (__inet_ntop, inet_ntop)
  * notes:
  *	(1) uses no statics
  *	(2) takes a u_char* not an in_addr as input
- * author:
- *	Paul Vixie, 1996.
  */
-static const char *
+static __always_inline const char *
 inet_ntop4 (const u_char *src, char *dst, socklen_t size)
 {
-	static const char fmt[] = "%u.%u.%u.%u";
-	char tmp[sizeof "255.255.255.255"];
+  enum
+    {
+      oct_size = INT_BUFSIZE_BOUND (u_char),
+      tmp_size = 4 * oct_size + 3 /* '.' */ + 1 /* '\0' */
+    };
+  char tmp[tmp_size];
+  char *tmp_r = tmp;
 
-	if (SPRINTF((tmp, fmt, src[0], src[1], src[2], src[3])) >= size) {
-		__set_errno (ENOSPC);
-		return (NULL);
-	}
-	return strcpy(dst, tmp);
+  tmp_r = put_uint8 (src[0], tmp_r);
+  *(tmp_r++) = '.';
+  tmp_r = put_uint8 (src[1], tmp_r);
+  *(tmp_r++) = '.';
+  tmp_r = put_uint8 (src[2], tmp_r);
+  *(tmp_r++) = '.';
+  tmp_r = put_uint8 (src[3], tmp_r);
+  *tmp_r++ = '\0';
+
+  socklen_t tmp_s = tmp_r - tmp;
+  if (tmp_s > size)
+    {
+      __set_errno (ENOSPC);
+      return NULL;
+    }
+  memcpy (dst, tmp, tmp_s);
+
+  return dst;
 }
 
 /* const char *
@@ -96,7 +111,7 @@ inet_ntop4 (const u_char *src, char *dst, socklen_t size)
  * author:
  *	Paul Vixie, 1996.
  */
-static const char *
+static inline const char *
 inet_ntop6 (const u_char *src, char *dst, socklen_t size)
 {
 	/*
@@ -108,7 +123,7 @@ inet_ntop6 (const u_char *src, char *dst, socklen_t size)
 	 */
 	char tmp[sizeof "ffff:ffff:ffff:ffff:ffff:ffff:255.255.255.255"], *tp;
 	struct { int base, len; } best, cur;
-	u_int words[NS_IN6ADDRSZ / NS_INT16SZ];
+	uint16_t words[NS_IN6ADDRSZ / NS_INT16SZ] = { 0 };
 	int i;
 
 	/*
@@ -116,7 +131,6 @@ inet_ntop6 (const u_char *src, char *dst, socklen_t size)
 	 *	Copy the input (bytewise) array into a wordwise array.
 	 *	Find the longest run of 0x00's in src[] for :: shorthanding.
 	 */
-	memset(words, '\0', sizeof words);
 	for (i = 0; i < NS_IN6ADDRSZ; i += 2)
 		words[i / 2] = (src[i] << 8) | src[i + 1];
 	best.base = -1;
@@ -167,7 +181,7 @@ inet_ntop6 (const u_char *src, char *dst, socklen_t size)
 			tp += strlen(tp);
 			break;
 		}
-		tp += SPRINTF((tp, "%x", words[i]));
+		tp = put_uint16 (words[i], tp);
 	}
 	/* Was it a trailing run of 0x00's? */
 	if (best.base != -1 && (best.base + best.len) ==
@@ -178,9 +192,35 @@ inet_ntop6 (const u_char *src, char *dst, socklen_t size)
 	/*
 	 * Check for overflow, copy, and we're done.
 	 */
-	if ((socklen_t)(tp - tmp) > size) {
+	socklen_t tmp_s = tp - tmp;
+	if (tmp_s > size) {
 		__set_errno (ENOSPC);
 		return (NULL);
 	}
-	return strcpy(dst, tmp);
+	return memcpy(dst, tmp, tmp_s);
 }
+
+/* char *
+ * __inet_ntop(af, src, dst, size)
+ *	convert a network format address to presentation format.
+ * return:
+ *	pointer to presentation format address (`dst'), or NULL (see errno).
+ * author:
+ *	Paul Vixie, 1996.
+ */
+const char *
+__inet_ntop (int af, const void *src, char *dst, socklen_t size)
+{
+	switch (af) {
+	case AF_INET:
+		return (inet_ntop4(src, dst, size));
+	case AF_INET6:
+		return (inet_ntop6(src, dst, size));
+	default:
+		__set_errno (EAFNOSUPPORT);
+		return (NULL);
+	}
+	/* NOTREACHED */
+}
+libc_hidden_def (__inet_ntop)
+weak_alias (__inet_ntop, inet_ntop)
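
For readers who want to try the digit-emission idea outside of glibc, the
following is a minimal standalone sketch of the two helpers from the patch.
It is not the glibc code: lower_digits is a local stand-in for the internal
_itoa_lower_digits table (assumed to map 0..15 to "0123456789abcdef"), and
format_ipv4 and count_mismatches are hypothetical helpers added only for
illustration. count_mismatches compares every possible input against
snprintf with "%u" and "%x", which is the behaviour the old SPRINTF calls
relied on.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Local stand-in for glibc's internal _itoa_lower_digits (assumption).  */
static const char lower_digits[] = "0123456789abcdef";

/* Write WORD in decimal without leading zeros (1 to 3 digits) and
   return the position just past the last digit.  */
static char *
put_uint8 (uint8_t word, char *tp)
{
  int len = 1;
  if (word >= 100)
    {
      tp[2] = lower_digits[word % 10];
      word /= 10;
      len++;
    }
  if (word >= 10)
    {
      tp[1] = lower_digits[word % 10];
      word /= 10;
      len++;
    }
  tp[0] = lower_digits[word];
  return tp + len;
}

/* Write WORD in lowercase hex without leading zeros (1 to 4 digits).  */
static char *
put_uint16 (uint16_t word, char *tp)
{
  if (word >= 0x1000)
    *tp++ = lower_digits[(word >> 12) & 0xf];
  if (word >= 0x100)
    *tp++ = lower_digits[(word >> 8) & 0xf];
  if (word >= 0x10)
    *tp++ = lower_digits[(word >> 4) & 0xf];
  *tp++ = lower_digits[word & 0xf];
  return tp;
}

/* Hypothetical helper: dotted-quad formatting built on put_uint8.
   Returns DST, or NULL when SIZE cannot hold the string and its NUL.  */
static const char *
format_ipv4 (const uint8_t src[4], char *dst, size_t size)
{
  char tmp[sizeof "255.255.255.255"];
  char *p = tmp;
  for (int i = 0; i < 4; i++)
    {
      p = put_uint8 (src[i], p);
      *p++ = i < 3 ? '.' : '\0';
    }
  size_t len = p - tmp;		/* Length including the trailing NUL.  */
  assert (len <= sizeof tmp);
  if (len > size)
    return NULL;
  memcpy (dst, tmp, len);
  return dst;
}

/* Hypothetical helper: count inputs where the helpers disagree with
   snprintf's "%u" (octets) or "%x" (16-bit groups).  */
static int
count_mismatches (void)
{
  int bad = 0;
  char got[8], want[8];
  for (unsigned int v = 0; v <= 0xffff; v++)
    {
      *put_uint16 ((uint16_t) v, got) = '\0';
      snprintf (want, sizeof want, "%x", v);
      bad += strcmp (got, want) != 0;
      if (v <= 0xff)
	{
	  *put_uint8 ((uint8_t) v, got) = '\0';
	  snprintf (want, sizeof want, "%u", v);
	  bad += strcmp (got, want) != 0;
	}
    }
  return bad;
}
```

Calling format_ipv4 on the octets 192, 168, 0, 1 with a 16-byte buffer
fills it with "192.168.0.1"; passing a size of 11 returns NULL because the
string needs 12 bytes including the terminator. The sketch reports failure
through the return value only and does not set errno.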