From patchwork Mon Nov 30 18:51:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 335686 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98674C83017 for ; Mon, 30 Nov 2020 18:53:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5F3DA20725 for ; Mon, 30 Nov 2020 18:53:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="ld4A59NS" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729441AbgK3SxV (ORCPT ); Mon, 30 Nov 2020 13:53:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60658 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727994AbgK3SxR (ORCPT ); Mon, 30 Nov 2020 13:53:17 -0500 Received: from mail-pj1-x1041.google.com (mail-pj1-x1041.google.com [IPv6:2607:f8b0:4864:20::1041]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1687C0613D2; Mon, 30 Nov 2020 10:52:36 -0800 (PST) Received: by mail-pj1-x1041.google.com with SMTP id j13so140663pjz.3; Mon, 30 Nov 2020 10:52:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=51BwrRYGJ7LQ1zXcHPZuY4NmBhlhOfnyJEatKNAr0LE=; b=ld4A59NSIcdCL5BqxfmJX609ykQLz2znY/jawB1RJT/ClafD1yR8ruWkUQEklA0RYs xWdaJoH6WF3yn8NUyi5b/PUGfpLWvvT9it5xBSxDUz/1ykVFQeh52kp6/l3wWfWJtVGL VLNXPedA+OA3qyc+A0xpiCXJ9zK2vxaP4duS/0qBO6t55gJnt3PuhxCUkiF6Nd6PNhnV kFMK5yTx6BMGnCxXVgQEOGzXp0xS1nwDWI+wXmh/Pzh0TclYrQZmMeXTivMx6Ug7LoKZ 0MfZq0MAjcBk/pPdAGYqpMmv48t/zFVVXR92LfN0UwVic4N0bXHnDrA/uGQUXP21MBr+ tixw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=51BwrRYGJ7LQ1zXcHPZuY4NmBhlhOfnyJEatKNAr0LE=; b=E9mE7Ap6maHjGCRf6MHr2uYxNQV5SvhlmrxiuhQc2AgxUjksGFYVyyZLmBMIXEjbvS +lGIWAhfDIxyqKZNZ2znGPpdNI6x4QDryRkxFfQD7hyZaEuKNLEIhW6hSswPdUxLy0AG bhIx/XNzXhNJYzVFRCc8KAEZDnlPrKZXDBpobLvOhi4fI9ivWxzbnrbLOo4YDwEeVELF tNvSB2BqP6/QPYNwEAsV98YMNom0r5yFLiOk9s9xp7SiTvQWXXD5FJErLrd7ym0B/YbF vAtEkfuH7YLi9bMYlA2Mh0QrE9R6toNPe1ngC40fAt2jJUqjUYnSIh7rErM4XSa5DSDh gFpg== X-Gm-Message-State: AOAM532r1bPuqHb1CDSpotuuCb+DDXK6VLzhi4/ImypI4iTkeTFf7Qf3 cYLNoyRugHoHD50DTavA05n+MSBn8nj3vzob X-Google-Smtp-Source: ABdhPJw3TNUs+PJ2n3Roq3n9RB38UwJBJRcLlk5uSknkqAf4wt/E8VRWIkzSdYHgMiotrVsPXc4Z1A== X-Received: by 2002:a17:90a:4d84:: with SMTP id m4mr196221pjh.145.1606762355503; Mon, 30 Nov 2020 10:52:35 -0800 (PST) Received: from btopel-mobl.ger.intel.com ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id i3sm12005978pgq.12.2020.11.30.10.52.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Nov 2020 10:52:34 -0800 (PST) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , magnus.karlsson@intel.com, ast@kernel.org, daniel@iogearbox.net, maciej.fijalkowski@intel.com, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, qi.z.zhang@intel.com, kuba@kernel.org, edumazet@google.com, jonathan.lemon@gmail.com, maximmi@nvidia.com Subject: [PATCH bpf-next v4 01/10] net: introduce preferred busy-polling Date: Mon, 30 Nov 2020 19:51:56 +0100 Message-Id: <20201130185205.196029-2-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201130185205.196029-1-bjorn.topel@gmail.com> References: <20201130185205.196029-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel The existing busy-polling mode, enabled by the SO_BUSY_POLL socket option or system-wide using the /proc/sys/net/core/busy_read knob, is an opportunistic. That means that if the NAPI context is not scheduled, it will poll it. If, after busy-polling, the budget is exceeded the busy-polling logic will schedule the NAPI onto the regular softirq handling. One implication of the behavior above is that a busy/heavy loaded NAPI context will never enter/allow for busy-polling. Some applications prefer that most NAPI processing would be done by busy-polling. This series adds a new socket option, SO_PREFER_BUSY_POLL, that works in concert with the napi_defer_hard_irqs and gro_flush_timeout knobs. The napi_defer_hard_irqs and gro_flush_timeout knobs were introduced in commit 6f8b12d661d0 ("net: napi: add hard irqs deferral feature"), and allows for a user to defer interrupts to be enabled and instead schedule the NAPI context from a watchdog timer. When a user enables the SO_PREFER_BUSY_POLL, again with the other knobs enabled, and the NAPI context is being processed by a softirq, the softirq NAPI processing will exit early to allow the busy-polling to be performed. If the application stops performing busy-polling via a system call, the watchdog timer defined by gro_flush_timeout will timeout, and regular softirq handling will resume. In summary; Heavy traffic applications that prefer busy-polling over softirq processing should use this option. Example usage: $ echo 2 | sudo tee /sys/class/net/ens785f1/napi_defer_hard_irqs $ echo 200000 | sudo tee /sys/class/net/ens785f1/gro_flush_timeout Note that the timeout should be larger than the userspace processing window, otherwise the watchdog will timeout and fall back to regular softirq processing. Enable the SO_BUSY_POLL/SO_PREFER_BUSY_POLL options on your socket. Reviewed-by: Jakub Kicinski Signed-off-by: Björn Töpel --- arch/alpha/include/uapi/asm/socket.h | 2 + arch/mips/include/uapi/asm/socket.h | 2 + arch/parisc/include/uapi/asm/socket.h | 2 + arch/sparc/include/uapi/asm/socket.h | 2 + fs/eventpoll.c | 2 +- include/linux/netdevice.h | 35 +++++++----- include/net/busy_poll.h | 5 +- include/net/sock.h | 4 ++ include/uapi/asm-generic/socket.h | 2 + net/core/dev.c | 78 +++++++++++++++++++++------ net/core/sock.c | 9 ++++ 11 files changed, 111 insertions(+), 32 deletions(-) diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h index de6c4df61082..538359642554 100644 --- a/arch/alpha/include/uapi/asm/socket.h +++ b/arch/alpha/include/uapi/asm/socket.h @@ -124,6 +124,8 @@ #define SO_DETACH_REUSEPORT_BPF 68 +#define SO_PREFER_BUSY_POLL 69 + #if !defined(__KERNEL__) #if __BITS_PER_LONG == 64 diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h index d0a9ed2ca2d6..e406e73b5e6e 100644 --- a/arch/mips/include/uapi/asm/socket.h +++ b/arch/mips/include/uapi/asm/socket.h @@ -135,6 +135,8 @@ #define SO_DETACH_REUSEPORT_BPF 68 +#define SO_PREFER_BUSY_POLL 69 + #if !defined(__KERNEL__) #if __BITS_PER_LONG == 64 diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h index 10173c32195e..1bc46200889d 100644 --- a/arch/parisc/include/uapi/asm/socket.h +++ b/arch/parisc/include/uapi/asm/socket.h @@ -116,6 +116,8 @@ #define SO_DETACH_REUSEPORT_BPF 0x4042 +#define SO_PREFER_BUSY_POLL 0x4043 + #if !defined(__KERNEL__) #if __BITS_PER_LONG == 64 diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h index 8029b681fc7c..99688cf673a4 100644 --- a/arch/sparc/include/uapi/asm/socket.h +++ b/arch/sparc/include/uapi/asm/socket.h @@ -117,6 +117,8 @@ #define SO_DETACH_REUSEPORT_BPF 0x0047 +#define SO_PREFER_BUSY_POLL 0x0048 + #if !defined(__KERNEL__) diff --git a/fs/eventpoll.c b/fs/eventpoll.c index 4df61129566d..e11fab3a0b9e 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -397,7 +397,7 @@ static void ep_busy_loop(struct eventpoll *ep, int nonblock) unsigned int napi_id = READ_ONCE(ep->napi_id); if ((napi_id >= MIN_NAPI_ID) && net_busy_loop_on()) - napi_busy_loop(napi_id, nonblock ? NULL : ep_busy_loop_end, ep); + napi_busy_loop(napi_id, nonblock ? NULL : ep_busy_loop_end, ep, false); } static inline void ep_reset_busy_poll_napi_id(struct eventpoll *ep) diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 7ce648a564f7..52d1cc2bd8a7 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -350,23 +350,25 @@ struct napi_struct { }; enum { - NAPI_STATE_SCHED, /* Poll is scheduled */ - NAPI_STATE_MISSED, /* reschedule a napi */ - NAPI_STATE_DISABLE, /* Disable pending */ - NAPI_STATE_NPSVC, /* Netpoll - don't dequeue from poll_list */ - NAPI_STATE_LISTED, /* NAPI added to system lists */ - NAPI_STATE_NO_BUSY_POLL,/* Do not add in napi_hash, no busy polling */ - NAPI_STATE_IN_BUSY_POLL,/* sk_busy_loop() owns this NAPI */ + NAPI_STATE_SCHED, /* Poll is scheduled */ + NAPI_STATE_MISSED, /* reschedule a napi */ + NAPI_STATE_DISABLE, /* Disable pending */ + NAPI_STATE_NPSVC, /* Netpoll - don't dequeue from poll_list */ + NAPI_STATE_LISTED, /* NAPI added to system lists */ + NAPI_STATE_NO_BUSY_POLL, /* Do not add in napi_hash, no busy polling */ + NAPI_STATE_IN_BUSY_POLL, /* sk_busy_loop() owns this NAPI */ + NAPI_STATE_PREFER_BUSY_POLL, /* prefer busy-polling over softirq processing*/ }; enum { - NAPIF_STATE_SCHED = BIT(NAPI_STATE_SCHED), - NAPIF_STATE_MISSED = BIT(NAPI_STATE_MISSED), - NAPIF_STATE_DISABLE = BIT(NAPI_STATE_DISABLE), - NAPIF_STATE_NPSVC = BIT(NAPI_STATE_NPSVC), - NAPIF_STATE_LISTED = BIT(NAPI_STATE_LISTED), - NAPIF_STATE_NO_BUSY_POLL = BIT(NAPI_STATE_NO_BUSY_POLL), - NAPIF_STATE_IN_BUSY_POLL = BIT(NAPI_STATE_IN_BUSY_POLL), + NAPIF_STATE_SCHED = BIT(NAPI_STATE_SCHED), + NAPIF_STATE_MISSED = BIT(NAPI_STATE_MISSED), + NAPIF_STATE_DISABLE = BIT(NAPI_STATE_DISABLE), + NAPIF_STATE_NPSVC = BIT(NAPI_STATE_NPSVC), + NAPIF_STATE_LISTED = BIT(NAPI_STATE_LISTED), + NAPIF_STATE_NO_BUSY_POLL = BIT(NAPI_STATE_NO_BUSY_POLL), + NAPIF_STATE_IN_BUSY_POLL = BIT(NAPI_STATE_IN_BUSY_POLL), + NAPIF_STATE_PREFER_BUSY_POLL = BIT(NAPI_STATE_PREFER_BUSY_POLL), }; enum gro_result { @@ -437,6 +439,11 @@ static inline bool napi_disable_pending(struct napi_struct *n) return test_bit(NAPI_STATE_DISABLE, &n->state); } +static inline bool napi_prefer_busy_poll(struct napi_struct *n) +{ + return test_bit(NAPI_STATE_PREFER_BUSY_POLL, &n->state); +} + bool napi_schedule_prep(struct napi_struct *n); /** diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h index b001fa91c14e..0292b8353d7e 100644 --- a/include/net/busy_poll.h +++ b/include/net/busy_poll.h @@ -43,7 +43,7 @@ bool sk_busy_loop_end(void *p, unsigned long start_time); void napi_busy_loop(unsigned int napi_id, bool (*loop_end)(void *, unsigned long), - void *loop_end_arg); + void *loop_end_arg, bool prefer_busy_poll); #else /* CONFIG_NET_RX_BUSY_POLL */ static inline unsigned long net_busy_loop_on(void) @@ -105,7 +105,8 @@ static inline void sk_busy_loop(struct sock *sk, int nonblock) unsigned int napi_id = READ_ONCE(sk->sk_napi_id); if (napi_id >= MIN_NAPI_ID) - napi_busy_loop(napi_id, nonblock ? NULL : sk_busy_loop_end, sk); + napi_busy_loop(napi_id, nonblock ? NULL : sk_busy_loop_end, sk, + READ_ONCE(sk->sk_prefer_busy_poll)); #endif } diff --git a/include/net/sock.h b/include/net/sock.h index a5c6ae78df77..d49b89b071b6 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -301,6 +301,7 @@ struct bpf_local_storage; * @sk_ack_backlog: current listen backlog * @sk_max_ack_backlog: listen backlog set in listen() * @sk_uid: user id of owner + * @sk_prefer_busy_poll: prefer busypolling over softirq processing * @sk_priority: %SO_PRIORITY setting * @sk_type: socket type (%SOCK_STREAM, etc) * @sk_protocol: which protocol this socket belongs in this network family @@ -479,6 +480,9 @@ struct sock { u32 sk_ack_backlog; u32 sk_max_ack_backlog; kuid_t sk_uid; +#ifdef CONFIG_NET_RX_BUSY_POLL + u8 sk_prefer_busy_poll; +#endif struct pid *sk_peer_pid; const struct cred *sk_peer_cred; long sk_rcvtimeo; diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h index 77f7c1638eb1..7dd02408b7ce 100644 --- a/include/uapi/asm-generic/socket.h +++ b/include/uapi/asm-generic/socket.h @@ -119,6 +119,8 @@ #define SO_DETACH_REUSEPORT_BPF 68 +#define SO_PREFER_BUSY_POLL 69 + #if !defined(__KERNEL__) #if __BITS_PER_LONG == 64 || (defined(__x86_64__) && defined(__ILP32__)) diff --git a/net/core/dev.c b/net/core/dev.c index 60d325bda0d7..6f8d2cffb7c5 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6458,7 +6458,8 @@ bool napi_complete_done(struct napi_struct *n, int work_done) WARN_ON_ONCE(!(val & NAPIF_STATE_SCHED)); - new = val & ~(NAPIF_STATE_MISSED | NAPIF_STATE_SCHED); + new = val & ~(NAPIF_STATE_MISSED | NAPIF_STATE_SCHED | + NAPIF_STATE_PREFER_BUSY_POLL); /* If STATE_MISSED was set, leave STATE_SCHED set, * because we will call napi->poll() one more time. @@ -6497,8 +6498,29 @@ static struct napi_struct *napi_by_id(unsigned int napi_id) #define BUSY_POLL_BUDGET 8 -static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock) +static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule) { + if (!skip_schedule) { + gro_normal_list(napi); + __napi_schedule(napi); + return; + } + + if (napi->gro_bitmask) { + /* flush too old packets + * If HZ < 1000, flush all packets. + */ + napi_gro_flush(napi, HZ >= 1000); + } + + gro_normal_list(napi); + clear_bit(NAPI_STATE_SCHED, &napi->state); +} + +static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, bool prefer_busy_poll) +{ + bool skip_schedule = false; + unsigned long timeout; int rc; /* Busy polling means there is a high chance device driver hard irq @@ -6515,6 +6537,15 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock) local_bh_disable(); + if (prefer_busy_poll) { + napi->defer_hard_irqs_count = READ_ONCE(napi->dev->napi_defer_hard_irqs); + timeout = READ_ONCE(napi->dev->gro_flush_timeout); + if (napi->defer_hard_irqs_count && timeout) { + hrtimer_start(&napi->timer, ns_to_ktime(timeout), HRTIMER_MODE_REL_PINNED); + skip_schedule = true; + } + } + /* All we really want here is to re-enable device interrupts. * Ideally, a new ndo_busy_poll_stop() could avoid another round. */ @@ -6525,19 +6556,14 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock) */ trace_napi_poll(napi, rc, BUSY_POLL_BUDGET); netpoll_poll_unlock(have_poll_lock); - if (rc == BUSY_POLL_BUDGET) { - /* As the whole budget was spent, we still own the napi so can - * safely handle the rx_list. - */ - gro_normal_list(napi); - __napi_schedule(napi); - } + if (rc == BUSY_POLL_BUDGET) + __busy_poll_stop(napi, skip_schedule); local_bh_enable(); } void napi_busy_loop(unsigned int napi_id, bool (*loop_end)(void *, unsigned long), - void *loop_end_arg) + void *loop_end_arg, bool prefer_busy_poll) { unsigned long start_time = loop_end ? busy_loop_current_time() : 0; int (*napi_poll)(struct napi_struct *napi, int budget); @@ -6565,12 +6591,18 @@ void napi_busy_loop(unsigned int napi_id, * we avoid dirtying napi->state as much as we can. */ if (val & (NAPIF_STATE_DISABLE | NAPIF_STATE_SCHED | - NAPIF_STATE_IN_BUSY_POLL)) + NAPIF_STATE_IN_BUSY_POLL)) { + if (prefer_busy_poll) + set_bit(NAPI_STATE_PREFER_BUSY_POLL, &napi->state); goto count; + } if (cmpxchg(&napi->state, val, val | NAPIF_STATE_IN_BUSY_POLL | - NAPIF_STATE_SCHED) != val) + NAPIF_STATE_SCHED) != val) { + if (prefer_busy_poll) + set_bit(NAPI_STATE_PREFER_BUSY_POLL, &napi->state); goto count; + } have_poll_lock = netpoll_poll_lock(napi); napi_poll = napi->poll; } @@ -6588,7 +6620,7 @@ void napi_busy_loop(unsigned int napi_id, if (unlikely(need_resched())) { if (napi_poll) - busy_poll_stop(napi, have_poll_lock); + busy_poll_stop(napi, have_poll_lock, prefer_busy_poll); preempt_enable(); rcu_read_unlock(); cond_resched(); @@ -6599,7 +6631,7 @@ void napi_busy_loop(unsigned int napi_id, cpu_relax(); } if (napi_poll) - busy_poll_stop(napi, have_poll_lock); + busy_poll_stop(napi, have_poll_lock, prefer_busy_poll); preempt_enable(); out: rcu_read_unlock(); @@ -6650,8 +6682,10 @@ static enum hrtimer_restart napi_watchdog(struct hrtimer *timer) * NAPI_STATE_MISSED, since we do not react to a device IRQ. */ if (!napi_disable_pending(napi) && - !test_and_set_bit(NAPI_STATE_SCHED, &napi->state)) + !test_and_set_bit(NAPI_STATE_SCHED, &napi->state)) { + clear_bit(NAPI_STATE_PREFER_BUSY_POLL, &napi->state); __napi_schedule_irqoff(napi); + } return HRTIMER_NORESTART; } @@ -6709,6 +6743,7 @@ void napi_disable(struct napi_struct *n) hrtimer_cancel(&n->timer); + clear_bit(NAPI_STATE_PREFER_BUSY_POLL, &n->state); clear_bit(NAPI_STATE_DISABLE, &n->state); } EXPORT_SYMBOL(napi_disable); @@ -6781,6 +6816,19 @@ static int napi_poll(struct napi_struct *n, struct list_head *repoll) goto out_unlock; } + /* The NAPI context has more processing work, but busy-polling + * is preferred. Exit early. + */ + if (napi_prefer_busy_poll(n)) { + if (napi_complete_done(n, work)) { + /* If timeout is not set, we need to make sure + * that the NAPI is re-scheduled. + */ + napi_schedule(n); + } + goto out_unlock; + } + if (n->gro_bitmask) { /* flush too old packets * If HZ < 1000, flush all packets. diff --git a/net/core/sock.c b/net/core/sock.c index 727ea1cc633c..e05f2e52b5a8 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1159,6 +1159,12 @@ int sock_setsockopt(struct socket *sock, int level, int optname, sk->sk_ll_usec = val; } break; + case SO_PREFER_BUSY_POLL: + if (valbool && !capable(CAP_NET_ADMIN)) + ret = -EPERM; + else + WRITE_ONCE(sk->sk_prefer_busy_poll, valbool); + break; #endif case SO_MAX_PACING_RATE: @@ -1523,6 +1529,9 @@ int sock_getsockopt(struct socket *sock, int level, int optname, case SO_BUSY_POLL: v.val = sk->sk_ll_usec; break; + case SO_PREFER_BUSY_POLL: + v.val = READ_ONCE(sk->sk_prefer_busy_poll); + break; #endif case SO_MAX_PACING_RATE: From patchwork Mon Nov 30 18:51:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 335038 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.9 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,UNWANTED_LANGUAGE_BODY, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 26A92C83019 for ; Mon, 30 Nov 2020 18:53:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D593C20725 for ; Mon, 30 Nov 2020 18:53:27 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="mtqCkvm/" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729534AbgK3SxX (ORCPT ); Mon, 30 Nov 2020 13:53:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60674 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727994AbgK3SxX (ORCPT ); Mon, 30 Nov 2020 13:53:23 -0500 Received: from mail-pg1-x541.google.com (mail-pg1-x541.google.com [IPv6:2607:f8b0:4864:20::541]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 36A76C0613D3; Mon, 30 Nov 2020 10:52:43 -0800 (PST) Received: by mail-pg1-x541.google.com with SMTP id g18so461860pgk.1; Mon, 30 Nov 2020 10:52:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=i7EkevhifqqUFdgCfgKKs/O6gCihrGJrZvSxHf0P6vg=; b=mtqCkvm/uEd+sMsnwH9uknEsRk/Q7UTkcZE/HwAjQX93oMXwjTj3MZ4bw//R8a/vVG eZDmd5j3H3yJoQg5FkG4c6aUZOF6p+eSZW8MqKnIlU/KWkmlgqBgD4FErN7fE3c7lwNd 0GwFMjRjgyI47Ln56b4pMkAn+/mxDOa6PPPamlZEBv7pvbwpIWr3gecz97DPOhHaK0iS q0uZU8dBp4w6edyYV/HxrtE4M+yCEvmaS4n2UYRU/TmZSJ9zroXHtcwtPiTx5xASOxiZ nhq5TdTyEr7ShwMt7Ru4hI5f5eJyKM20ODOVLOAXUtUbA/XKO4fGifLma6g7DmLoRbdg 09wA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=i7EkevhifqqUFdgCfgKKs/O6gCihrGJrZvSxHf0P6vg=; b=RKywTBVvLbwk2WYdOPBRL4CbKu6AWYBBE2RTrsN/lJVIOwz+FR6m5fOmPuVGvGGt2h Zs9nRydqPbgO8yWbTxq71y7mwT3JQGMFdNQrJ9vCMT2mcQupwC4ToS/pA9A1I6xW8uPV p43Hqsk5ArFb9JQQFTgTNXUVkjHPYmfqp9r1U4ngJ7NEgYT+kKu3PVjBYZEuqNpGSvI2 +dGfc9PgO0hhaOssnW9sLnsLu1HlY+A8EurEOocorafY3Xia2JWodKqsZBhU7UYu7eCJ QGrC5NFcZx7MFt1eMLQSScWGaY/L+edIDnqbxXBXMHxrFV43f3f41kNLjhYfqpeqtevK /oZQ== X-Gm-Message-State: AOAM532KwrF10HBugSoo5OfPbrbvSWHI0sFPsXm6yMF3SMteQXJbbJhs 2Y25s8Hoq5ZRemXYJo9Cr5/95V8+FAdbB/wL X-Google-Smtp-Source: ABdhPJxpswjRDVQfOrfZF3OezU9tccSEiwBqm4JkCsVv21vGfCUD7fwFJSEpEyzkIUJcpaDO+x6M+g== X-Received: by 2002:a63:3d8c:: with SMTP id k134mr19040634pga.53.1606762362048; Mon, 30 Nov 2020 10:52:42 -0800 (PST) Received: from btopel-mobl.ger.intel.com ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id i3sm12005978pgq.12.2020.11.30.10.52.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Nov 2020 10:52:41 -0800 (PST) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , magnus.karlsson@intel.com, ast@kernel.org, daniel@iogearbox.net, maciej.fijalkowski@intel.com, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, qi.z.zhang@intel.com, kuba@kernel.org, edumazet@google.com, jonathan.lemon@gmail.com, maximmi@nvidia.com Subject: [PATCH bpf-next v4 02/10] net: add SO_BUSY_POLL_BUDGET socket option Date: Mon, 30 Nov 2020 19:51:57 +0100 Message-Id: <20201130185205.196029-3-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201130185205.196029-1-bjorn.topel@gmail.com> References: <20201130185205.196029-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel This option lets a user set a per socket NAPI budget for busy-polling. If the options is not set, it will use the default of 8. Reviewed-by: Jakub Kicinski Signed-off-by: Björn Töpel --- arch/alpha/include/uapi/asm/socket.h | 1 + arch/mips/include/uapi/asm/socket.h | 1 + arch/parisc/include/uapi/asm/socket.h | 1 + arch/sparc/include/uapi/asm/socket.h | 1 + fs/eventpoll.c | 3 ++- include/net/busy_poll.h | 7 +++++-- include/net/sock.h | 2 ++ include/uapi/asm-generic/socket.h | 1 + net/core/dev.c | 21 ++++++++++----------- net/core/sock.c | 10 ++++++++++ 10 files changed, 34 insertions(+), 14 deletions(-) diff --git a/arch/alpha/include/uapi/asm/socket.h b/arch/alpha/include/uapi/asm/socket.h index 538359642554..57420356ce4c 100644 --- a/arch/alpha/include/uapi/asm/socket.h +++ b/arch/alpha/include/uapi/asm/socket.h @@ -125,6 +125,7 @@ #define SO_DETACH_REUSEPORT_BPF 68 #define SO_PREFER_BUSY_POLL 69 +#define SO_BUSY_POLL_BUDGET 70 #if !defined(__KERNEL__) diff --git a/arch/mips/include/uapi/asm/socket.h b/arch/mips/include/uapi/asm/socket.h index e406e73b5e6e..2d949969313b 100644 --- a/arch/mips/include/uapi/asm/socket.h +++ b/arch/mips/include/uapi/asm/socket.h @@ -136,6 +136,7 @@ #define SO_DETACH_REUSEPORT_BPF 68 #define SO_PREFER_BUSY_POLL 69 +#define SO_BUSY_POLL_BUDGET 70 #if !defined(__KERNEL__) diff --git a/arch/parisc/include/uapi/asm/socket.h b/arch/parisc/include/uapi/asm/socket.h index 1bc46200889d..f60904329bbc 100644 --- a/arch/parisc/include/uapi/asm/socket.h +++ b/arch/parisc/include/uapi/asm/socket.h @@ -117,6 +117,7 @@ #define SO_DETACH_REUSEPORT_BPF 0x4042 #define SO_PREFER_BUSY_POLL 0x4043 +#define SO_BUSY_POLL_BUDGET 0x4044 #if !defined(__KERNEL__) diff --git a/arch/sparc/include/uapi/asm/socket.h b/arch/sparc/include/uapi/asm/socket.h index 99688cf673a4..848a22fbac20 100644 --- a/arch/sparc/include/uapi/asm/socket.h +++ b/arch/sparc/include/uapi/asm/socket.h @@ -118,6 +118,7 @@ #define SO_DETACH_REUSEPORT_BPF 0x0047 #define SO_PREFER_BUSY_POLL 0x0048 +#define SO_BUSY_POLL_BUDGET 0x0049 #if !defined(__KERNEL__) diff --git a/fs/eventpoll.c b/fs/eventpoll.c index e11fab3a0b9e..73c346e503d7 100644 --- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -397,7 +397,8 @@ static void ep_busy_loop(struct eventpoll *ep, int nonblock) unsigned int napi_id = READ_ONCE(ep->napi_id); if ((napi_id >= MIN_NAPI_ID) && net_busy_loop_on()) - napi_busy_loop(napi_id, nonblock ? NULL : ep_busy_loop_end, ep, false); + napi_busy_loop(napi_id, nonblock ? NULL : ep_busy_loop_end, ep, false, + BUSY_POLL_BUDGET); } static inline void ep_reset_busy_poll_napi_id(struct eventpoll *ep) diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h index 0292b8353d7e..2f8f51807b83 100644 --- a/include/net/busy_poll.h +++ b/include/net/busy_poll.h @@ -23,6 +23,8 @@ */ #define MIN_NAPI_ID ((unsigned int)(NR_CPUS + 1)) +#define BUSY_POLL_BUDGET 8 + #ifdef CONFIG_NET_RX_BUSY_POLL struct napi_struct; @@ -43,7 +45,7 @@ bool sk_busy_loop_end(void *p, unsigned long start_time); void napi_busy_loop(unsigned int napi_id, bool (*loop_end)(void *, unsigned long), - void *loop_end_arg, bool prefer_busy_poll); + void *loop_end_arg, bool prefer_busy_poll, u16 budget); #else /* CONFIG_NET_RX_BUSY_POLL */ static inline unsigned long net_busy_loop_on(void) @@ -106,7 +108,8 @@ static inline void sk_busy_loop(struct sock *sk, int nonblock) if (napi_id >= MIN_NAPI_ID) napi_busy_loop(napi_id, nonblock ? NULL : sk_busy_loop_end, sk, - READ_ONCE(sk->sk_prefer_busy_poll)); + READ_ONCE(sk->sk_prefer_busy_poll), + READ_ONCE(sk->sk_busy_poll_budget) ?: BUSY_POLL_BUDGET); #endif } diff --git a/include/net/sock.h b/include/net/sock.h index d49b89b071b6..77ba2c2737db 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -302,6 +302,7 @@ struct bpf_local_storage; * @sk_max_ack_backlog: listen backlog set in listen() * @sk_uid: user id of owner * @sk_prefer_busy_poll: prefer busypolling over softirq processing + * @sk_busy_poll_budget: napi processing budget when busypolling * @sk_priority: %SO_PRIORITY setting * @sk_type: socket type (%SOCK_STREAM, etc) * @sk_protocol: which protocol this socket belongs in this network family @@ -482,6 +483,7 @@ struct sock { kuid_t sk_uid; #ifdef CONFIG_NET_RX_BUSY_POLL u8 sk_prefer_busy_poll; + u16 sk_busy_poll_budget; #endif struct pid *sk_peer_pid; const struct cred *sk_peer_cred; diff --git a/include/uapi/asm-generic/socket.h b/include/uapi/asm-generic/socket.h index 7dd02408b7ce..4dcd13d097a9 100644 --- a/include/uapi/asm-generic/socket.h +++ b/include/uapi/asm-generic/socket.h @@ -120,6 +120,7 @@ #define SO_DETACH_REUSEPORT_BPF 68 #define SO_PREFER_BUSY_POLL 69 +#define SO_BUSY_POLL_BUDGET 70 #if !defined(__KERNEL__) diff --git a/net/core/dev.c b/net/core/dev.c index 6f8d2cffb7c5..7a1e5936c67f 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6496,8 +6496,6 @@ static struct napi_struct *napi_by_id(unsigned int napi_id) #if defined(CONFIG_NET_RX_BUSY_POLL) -#define BUSY_POLL_BUDGET 8 - static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule) { if (!skip_schedule) { @@ -6517,7 +6515,8 @@ static void __busy_poll_stop(struct napi_struct *napi, bool skip_schedule) clear_bit(NAPI_STATE_SCHED, &napi->state); } -static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, bool prefer_busy_poll) +static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, bool prefer_busy_poll, + u16 budget) { bool skip_schedule = false; unsigned long timeout; @@ -6549,21 +6548,21 @@ static void busy_poll_stop(struct napi_struct *napi, void *have_poll_lock, bool /* All we really want here is to re-enable device interrupts. * Ideally, a new ndo_busy_poll_stop() could avoid another round. */ - rc = napi->poll(napi, BUSY_POLL_BUDGET); + rc = napi->poll(napi, budget); /* We can't gro_normal_list() here, because napi->poll() might have * rearmed the napi (napi_complete_done()) in which case it could * already be running on another CPU. */ - trace_napi_poll(napi, rc, BUSY_POLL_BUDGET); + trace_napi_poll(napi, rc, budget); netpoll_poll_unlock(have_poll_lock); - if (rc == BUSY_POLL_BUDGET) + if (rc == budget) __busy_poll_stop(napi, skip_schedule); local_bh_enable(); } void napi_busy_loop(unsigned int napi_id, bool (*loop_end)(void *, unsigned long), - void *loop_end_arg, bool prefer_busy_poll) + void *loop_end_arg, bool prefer_busy_poll, u16 budget) { unsigned long start_time = loop_end ? busy_loop_current_time() : 0; int (*napi_poll)(struct napi_struct *napi, int budget); @@ -6606,8 +6605,8 @@ void napi_busy_loop(unsigned int napi_id, have_poll_lock = netpoll_poll_lock(napi); napi_poll = napi->poll; } - work = napi_poll(napi, BUSY_POLL_BUDGET); - trace_napi_poll(napi, work, BUSY_POLL_BUDGET); + work = napi_poll(napi, budget); + trace_napi_poll(napi, work, budget); gro_normal_list(napi); count: if (work > 0) @@ -6620,7 +6619,7 @@ void napi_busy_loop(unsigned int napi_id, if (unlikely(need_resched())) { if (napi_poll) - busy_poll_stop(napi, have_poll_lock, prefer_busy_poll); + busy_poll_stop(napi, have_poll_lock, prefer_busy_poll, budget); preempt_enable(); rcu_read_unlock(); cond_resched(); @@ -6631,7 +6630,7 @@ void napi_busy_loop(unsigned int napi_id, cpu_relax(); } if (napi_poll) - busy_poll_stop(napi, have_poll_lock, prefer_busy_poll); + busy_poll_stop(napi, have_poll_lock, prefer_busy_poll, budget); preempt_enable(); out: rcu_read_unlock(); diff --git a/net/core/sock.c b/net/core/sock.c index e05f2e52b5a8..d422a6808405 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -1165,6 +1165,16 @@ int sock_setsockopt(struct socket *sock, int level, int optname, else WRITE_ONCE(sk->sk_prefer_busy_poll, valbool); break; + case SO_BUSY_POLL_BUDGET: + if (val > READ_ONCE(sk->sk_busy_poll_budget) && !capable(CAP_NET_ADMIN)) { + ret = -EPERM; + } else { + if (val < 0 || val > U16_MAX) + ret = -EINVAL; + else + WRITE_ONCE(sk->sk_busy_poll_budget, val); + } + break; #endif case SO_MAX_PACING_RATE: From patchwork Mon Nov 30 18:51:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 335685 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E887EC64E7B for ; Mon, 30 Nov 2020 18:54:14 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 876B32073C for ; Mon, 30 Nov 2020 18:54:14 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="f/N8v4Hy" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387735AbgK3Sxa (ORCPT ); Mon, 30 Nov 2020 13:53:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60692 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727994AbgK3Sx3 (ORCPT ); Mon, 30 Nov 2020 13:53:29 -0500 Received: from mail-pg1-x541.google.com (mail-pg1-x541.google.com [IPv6:2607:f8b0:4864:20::541]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7F141C0613D4; Mon, 30 Nov 2020 10:52:49 -0800 (PST) Received: by mail-pg1-x541.google.com with SMTP id l4so4555164pgu.5; Mon, 30 Nov 2020 10:52:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ROQPewlDIZikhUQm8VWG2wYdta6lahk0LWlwgo5VsUU=; b=f/N8v4HywGPMyIs5jYMKdayJDHXJ+XxuqdxmWZ8SBOcSkyoYPFe7FzSxb3WREwvkPQ rjDZi2+M7F1V9OWwRmSn/SxyWOpeT80yTWUG1FNfOYqzsp5Vi+DbYolDNvsHOnl3swSx ClQPvYGUsCLhQAHAzGVUKMfcdbn2sxIFvssiPts5gi9GQuvSy+znDPjI/z9VCjo+U439 ZhIJMwLSatlxAdi4jdOdl09pfBX2bVNck8Bgr1MRpQkNg8LNt2AloRO9N0Mgcry/rGA7 7/vdeDeYVIAfjq6w1GVxhfx8ZPaMjVoBL6S3A8Nmfq93WAEZCsDTDBuIoqtdwzcytZE9 4jug== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ROQPewlDIZikhUQm8VWG2wYdta6lahk0LWlwgo5VsUU=; b=AsKgYTjDArXdMv6wBhcYvz3vvUlbkWeZIbI9reXa30eS7j3C1XIn2wP1pz+/giZOIU BaglR6YvR4mhwPEBEoSS5+iBPJJz4Vk88TavCv/ywQQ2RKr/xe37e+2tKF/81nR1lnwi Be/1c4Aba1zPYaV9LfcjeNRIXdps3Hxgeigcj10ZEVDaTvFLwkf2Y5EDy+6aUbPn5fwu ZUVcSzi4MGyP0XIHvr7p0sacp4L2FEuhygqLQBfrIu+vxQs2/cUbtfuj/Bdut99tIRKk yZZjXTr5IIIR9wM535w++83pe1AcJgnpH3hDRKbgFR4Cn6VTuVIVIjI8ZZKzTVoZ7VdN zoJw== X-Gm-Message-State: AOAM532z769rIEGf+bG/EWfjYUW6f9lJUfxKV+BoYi6r+m3Wmj3kqwge jdu8viFkwwdj6yU05CGHocyzgI9k0oS4V1/i X-Google-Smtp-Source: ABdhPJxwGTC1uc2X1fuPER+0GYkFTroSL/pkerjOYqTcT4Mdjjs64SYS1bLFcZlbnP52+MoF5biX8g== X-Received: by 2002:a63:5f49:: with SMTP id t70mr13091194pgb.288.1606762368611; Mon, 30 Nov 2020 10:52:48 -0800 (PST) Received: from btopel-mobl.ger.intel.com ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id i3sm12005978pgq.12.2020.11.30.10.52.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Nov 2020 10:52:47 -0800 (PST) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , magnus.karlsson@intel.com, ast@kernel.org, daniel@iogearbox.net, maciej.fijalkowski@intel.com, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, qi.z.zhang@intel.com, kuba@kernel.org, edumazet@google.com, jonathan.lemon@gmail.com, maximmi@nvidia.com Subject: [PATCH bpf-next v4 03/10] xsk: add support for recvmsg() Date: Mon, 30 Nov 2020 19:51:58 +0100 Message-Id: <20201130185205.196029-4-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201130185205.196029-1-bjorn.topel@gmail.com> References: <20201130185205.196029-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Add support for non-blocking recvmsg() to XDP sockets. Previously, only sendmsg() was supported by XDP socket. Now, for symmetry and the upcoming busy-polling support, recvmsg() is added. Acked-by: Magnus Karlsson Signed-off-by: Björn Töpel --- net/xdp/xsk.c | 22 +++++++++++++++++++++- 1 file changed, 21 insertions(+), 1 deletion(-) diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 3e33cb6e68e6..1d1e0cd51a95 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -531,6 +531,26 @@ static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len) return __xsk_sendmsg(sk); } +static int xsk_recvmsg(struct socket *sock, struct msghdr *m, size_t len, int flags) +{ + bool need_wait = !(flags & MSG_DONTWAIT); + struct sock *sk = sock->sk; + struct xdp_sock *xs = xdp_sk(sk); + + if (unlikely(!(xs->dev->flags & IFF_UP))) + return -ENETDOWN; + if (unlikely(!xs->rx)) + return -ENOBUFS; + if (unlikely(!xsk_is_bound(xs))) + return -ENXIO; + if (unlikely(need_wait)) + return -EOPNOTSUPP; + + if (xs->pool->cached_need_wakeup & XDP_WAKEUP_RX && xs->zc) + return xsk_wakeup(xs, XDP_WAKEUP_RX); + return 0; +} + static __poll_t xsk_poll(struct file *file, struct socket *sock, struct poll_table_struct *wait) { @@ -1191,7 +1211,7 @@ static const struct proto_ops xsk_proto_ops = { .setsockopt = xsk_setsockopt, .getsockopt = xsk_getsockopt, .sendmsg = xsk_sendmsg, - .recvmsg = sock_no_recvmsg, + .recvmsg = xsk_recvmsg, .mmap = xsk_mmap, .sendpage = sock_no_sendpage, }; From patchwork Mon Nov 30 18:51:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 335037 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 45AB9C71155 for ; Mon, 30 Nov 2020 18:54:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0E5AD2073C for ; Mon, 30 Nov 2020 18:54:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="UZBDYAGf" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387909AbgK3Sxh (ORCPT ); Mon, 30 Nov 2020 13:53:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60714 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727345AbgK3Sxg (ORCPT ); Mon, 30 Nov 2020 13:53:36 -0500 Received: from mail-pj1-x1041.google.com (mail-pj1-x1041.google.com [IPv6:2607:f8b0:4864:20::1041]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6B956C0613D6; Mon, 30 Nov 2020 10:52:56 -0800 (PST) Received: by mail-pj1-x1041.google.com with SMTP id hk16so135676pjb.4; Mon, 30 Nov 2020 10:52:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=YxY+jVb6DuD2Q2vdVgYXmx6Ypq4tBvL2ZuLpqRZo/Yk=; b=UZBDYAGf3Lr9iCEeIKHLoE7/AMHhgOXcVw/smAqPm+CNZmuU+YHI9psuTprGgbzUik 3sVTevmgThkK8tRegzvVjw190dShXZM8I9IGsidACic/AMo4HjhZtL5/pvIx7VCKErNY Lsn2m7dkqoHVrt5Lfm3p82IG3K8Gcrj3fHUTaQc+X6m6p96Vsj1H4rinEljdWsAWv5H/ VzpN8YJXg/xBkTl+vJBk+cIXjAf9kjCStJBgzte4G1Z1e1ZwARjVurIZtLzlxtVMGnqX spjGEamQ7vYBZ/4FrIZTACjotU/zjDlnVusIF6455CjoRc0lPM2MlfJpGUwd8iwH8sXr pn8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=YxY+jVb6DuD2Q2vdVgYXmx6Ypq4tBvL2ZuLpqRZo/Yk=; b=W4tDKPP+QvubWlrU1BkTs/TyKf7gRrnFpXEg9cSlvicMES182xm5iDUbVtaCK0LryR vnVdd7fBPuJF55kHHnT12XM0ikEWXHLmpdDxVcM5X2yTtbLD6oQXr1eMRSh2xIn0Lc6w zI3g2rsM5TpGWrE3CisWEMWfLnZ0vNbpCc93iuk5MjQKtMNeGwr2qeFo9uGc0zK5V4BS kzlcgJjjCif6gh9y178YdL12CsGKL3/8kGRFcqKSF4pKV7mEhPTnkl4+OE964bRghemi guuxDG8Ux5nUxdkC+Yya9M3xNEVXrvrWjUTh2/w4UywNCa9uc1kK14aJVneR+CIpbIXA 2E0Q== X-Gm-Message-State: AOAM531NBO84NGV4B89vYwvyc/co2bFKVqenlKpkrkoFX0ylykznazlq FioX613fRnNrAREGYM2fPSU/oT+F8m4O5dU3 X-Google-Smtp-Source: ABdhPJwoRGJmxX0D2k4SdQnRrskU5crp5uOhAIMO0ZHJqAczz4BuY8kV0UGFxISi+IVFXwr/txldTg== X-Received: by 2002:a17:902:59dc:b029:da:84c7:f175 with SMTP id d28-20020a17090259dcb02900da84c7f175mr3577991plj.75.1606762375387; Mon, 30 Nov 2020 10:52:55 -0800 (PST) Received: from btopel-mobl.ger.intel.com ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id i3sm12005978pgq.12.2020.11.30.10.52.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Nov 2020 10:52:54 -0800 (PST) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , magnus.karlsson@intel.com, ast@kernel.org, daniel@iogearbox.net, maciej.fijalkowski@intel.com, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, qi.z.zhang@intel.com, kuba@kernel.org, edumazet@google.com, jonathan.lemon@gmail.com, maximmi@nvidia.com Subject: [PATCH bpf-next v4 04/10] xsk: check need wakeup flag in sendmsg() Date: Mon, 30 Nov 2020 19:51:59 +0100 Message-Id: <20201130185205.196029-5-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201130185205.196029-1-bjorn.topel@gmail.com> References: <20201130185205.196029-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Add a check for need wake up in sendmsg(), so that if a user calls sendmsg() when no wakeup is needed, do not trigger a wakeup. To simplify the need wakeup check in the syscall, unconditionally enable the need wakeup flag for Tx. This has a side-effect for poll(); If poll() is called for a socket without enabled need wakeup, a Tx wakeup is unconditionally performed. The wakeup matrix for AF_XDP now looks like: need wakeup | poll() | sendmsg() | recvmsg() ------------+--------------+-------------+------------ disabled | wake Tx | wake Tx | nop enabled | check flag; | check flag; | check flag; | wake Tx/Rx | wake Tx | wake Rx Acked-by: Magnus Karlsson Signed-off-by: Björn Töpel --- net/xdp/xsk.c | 6 +++++- net/xdp/xsk_buff_pool.c | 13 ++++++------- 2 files changed, 11 insertions(+), 8 deletions(-) diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 1d1e0cd51a95..c72b4dba2878 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -522,13 +522,17 @@ static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len) bool need_wait = !(m->msg_flags & MSG_DONTWAIT); struct sock *sk = sock->sk; struct xdp_sock *xs = xdp_sk(sk); + struct xsk_buff_pool *pool; if (unlikely(!xsk_is_bound(xs))) return -ENXIO; if (unlikely(need_wait)) return -EOPNOTSUPP; - return __xsk_sendmsg(sk); + pool = xs->pool; + if (pool->cached_need_wakeup & XDP_WAKEUP_TX) + return __xsk_sendmsg(sk); + return 0; } static int xsk_recvmsg(struct socket *sock, struct msghdr *m, size_t len, int flags) diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index 8a3bf4e1318e..96bb607853ad 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -144,14 +144,13 @@ static int __xp_assign_dev(struct xsk_buff_pool *pool, if (err) return err; - if (flags & XDP_USE_NEED_WAKEUP) { + if (flags & XDP_USE_NEED_WAKEUP) pool->uses_need_wakeup = true; - /* Tx needs to be explicitly woken up the first time. - * Also for supporting drivers that do not implement this - * feature. They will always have to call sendto(). - */ - pool->cached_need_wakeup = XDP_WAKEUP_TX; - } + /* Tx needs to be explicitly woken up the first time. Also + * for supporting drivers that do not implement this + * feature. They will always have to call sendto() or poll(). + */ + pool->cached_need_wakeup = XDP_WAKEUP_TX; dev_hold(netdev); From patchwork Mon Nov 30 18:52:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 335684 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E5A4CC83013 for ; Mon, 30 Nov 2020 18:54:15 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 989B4207F7 for ; Mon, 30 Nov 2020 18:54:15 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="uNhAZky2" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388040AbgK3Sxu (ORCPT ); Mon, 30 Nov 2020 13:53:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60746 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387973AbgK3Sxu (ORCPT ); Mon, 30 Nov 2020 13:53:50 -0500 Received: from mail-pf1-x442.google.com (mail-pf1-x442.google.com [IPv6:2607:f8b0:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2F314C0617A6; Mon, 30 Nov 2020 10:53:03 -0800 (PST) Received: by mail-pf1-x442.google.com with SMTP id n137so11018652pfd.3; Mon, 30 Nov 2020 10:53:03 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=LH7Sw3V4LJlwW+1kWBdhvDdansinX+voulSeYgx/7Lg=; b=uNhAZky2zYu+v3GclfTq8LqVHaF+bdWnvmUrOKBgl2ap333LZFiqfYbxaU30Q66+iA 1Qum1y4rz44gyjSV5N+z5SzzGuP8zGKL3liWiPTTJZKE6eylB+itNdCiCt+0oIXeOlGm dPIheOUW5D1+zL+nVC0CAJ1YGp6GIkzdaru0048d2E8h0X//Dww/8jzQVcuQjW0ZvMPv C90Udk8w5kAOx8KYROSfT0TqembDYcHD8K9CWsZttIzo9aEd2qEXB5jLD/1WtL5UYlef 8gAEE0RVCFpW1/1fE6+qCC79TPXgSmp0sHWAEyEMe2Gsfger3xVONCp5ltmKDVWnkpnV wksA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=LH7Sw3V4LJlwW+1kWBdhvDdansinX+voulSeYgx/7Lg=; b=cwRrqZCLBgWXjRsmrrWgt3K4ZmnPRBVjtXXU2ULcjn0iHPDPjhh/iYDgdW9No1mFHf 4rzojJKjMwNlUNxQnhrcmcEdAROc/qrG/qLzUa8atp8N4AZmBONLIMjnvhu4IvuU04Ph q9Drm1ZHTCkEDaTN6l3wl+x2Ubfc7KETp8j+RYDxyjJB8WXfmrJJiXXi/6JmFw5l2rbH Z2VZDzsl43tcM53GDssXJnBRT1gLPS7swCgcU3tcHgxIZlSlsQMULgfzcpSHhbt0KM8G X7g8LPbPv8IQSKlByGwRlWwtFe5CL71Tomr/uIOULrgaYLx4mkVPbWfGNPDHoWni39Io 8m/Q== X-Gm-Message-State: AOAM531rdz0KvMJ8zVErzc9SMs7qyIiZsP71AftDxIP2QKrOTc51k7bu 3IH6UlW3kYZHmNCi+uYvELcFbJH2ALmtDwZ7 X-Google-Smtp-Source: ABdhPJxSE+kIDXb53rslbZiaeJSanGiMwErLImrKX10TaqiWmSbuon6pUF2ujbj3wh7oB6DuzhAZHA== X-Received: by 2002:a63:2351:: with SMTP id u17mr2834290pgm.72.1606762382018; Mon, 30 Nov 2020 10:53:02 -0800 (PST) Received: from btopel-mobl.ger.intel.com ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id i3sm12005978pgq.12.2020.11.30.10.52.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Nov 2020 10:53:01 -0800 (PST) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , magnus.karlsson@intel.com, ast@kernel.org, daniel@iogearbox.net, maciej.fijalkowski@intel.com, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, qi.z.zhang@intel.com, kuba@kernel.org, edumazet@google.com, jonathan.lemon@gmail.com, maximmi@nvidia.com Subject: [PATCH bpf-next v4 05/10] xsk: add busy-poll support for {recv, send}msg() Date: Mon, 30 Nov 2020 19:52:00 +0100 Message-Id: <20201130185205.196029-6-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201130185205.196029-1-bjorn.topel@gmail.com> References: <20201130185205.196029-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Wire-up XDP socket busy-poll support for recvmsg() and sendmsg(). If the XDP socket prefers busy-polling, make sure that no wakeup/IPI is performed. Acked-by: Magnus Karlsson Signed-off-by: Björn Töpel --- net/xdp/xsk.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+) diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index c72b4dba2878..a8501a5477cf 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -23,6 +23,7 @@ #include #include #include +#include #include #include "xsk_queue.h" @@ -517,6 +518,17 @@ static int __xsk_sendmsg(struct sock *sk) return xs->zc ? xsk_zc_xmit(xs) : xsk_generic_xmit(sk); } +static bool xsk_no_wakeup(struct sock *sk) +{ +#ifdef CONFIG_NET_RX_BUSY_POLL + /* Prefer busy-polling, skip the wakeup. */ + return READ_ONCE(sk->sk_prefer_busy_poll) && READ_ONCE(sk->sk_ll_usec) && + READ_ONCE(sk->sk_napi_id) >= MIN_NAPI_ID; +#else + return false; +#endif +} + static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len) { bool need_wait = !(m->msg_flags & MSG_DONTWAIT); @@ -529,6 +541,12 @@ static int xsk_sendmsg(struct socket *sock, struct msghdr *m, size_t total_len) if (unlikely(need_wait)) return -EOPNOTSUPP; + if (sk_can_busy_loop(sk)) + sk_busy_loop(sk, 1); /* only support non-blocking sockets */ + + if (xsk_no_wakeup(sk)) + return 0; + pool = xs->pool; if (pool->cached_need_wakeup & XDP_WAKEUP_TX) return __xsk_sendmsg(sk); @@ -550,6 +568,12 @@ static int xsk_recvmsg(struct socket *sock, struct msghdr *m, size_t len, int fl if (unlikely(need_wait)) return -EOPNOTSUPP; + if (sk_can_busy_loop(sk)) + sk_busy_loop(sk, 1); /* only support non-blocking sockets */ + + if (xsk_no_wakeup(sk)) + return 0; + if (xs->pool->cached_need_wakeup & XDP_WAKEUP_RX && xs->zc) return xsk_wakeup(xs, XDP_WAKEUP_RX); return 0; From patchwork Mon Nov 30 18:52:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 335683 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7830EC8301D for ; Mon, 30 Nov 2020 18:54:18 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2B1632085B for ; Mon, 30 Nov 2020 18:54:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="DSBklMAh" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388193AbgK3SyC (ORCPT ); Mon, 30 Nov 2020 13:54:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60770 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729784AbgK3Sx5 (ORCPT ); Mon, 30 Nov 2020 13:53:57 -0500 Received: from mail-pg1-x544.google.com (mail-pg1-x544.google.com [IPv6:2607:f8b0:4864:20::544]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EAFE8C0617A7; Mon, 30 Nov 2020 10:53:15 -0800 (PST) Received: by mail-pg1-x544.google.com with SMTP id e23so5766762pgk.12; Mon, 30 Nov 2020 10:53:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Q8ZzzbiqFjy9hOmCojei++SQ6pwejjwqhlAmiEMrlCE=; b=DSBklMAh4WZv4etJMpMugIDDc+jfy/4DroJZSJen+3kWpW2b4PhunE4OnvbBhUI8P1 1GErMk1ZZPww2a5ZGa7K7x1nTlXQBaIhEX66h8h6axPhsoN+uSYAdAHZFbsqSU8B/84w vmfyPHo4868jZgmbM4bM32w4KPzQ8eNEsSnEz8T+XQQPDHmATp9fpaG3xpWWur3BgUqp T4dowH7u76h1BxYL6+jtMuPN1Fdb3xe7oIz5wUlzBgqwCrv3c21FQVH1htm7d7vBtH8Q kDOikFuZWByzk6I7sJjMeeGlhzdBHUjSPA9XmbEv8njJvbjGj9kUIcvc5BL/le8eKLGj Ecvg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Q8ZzzbiqFjy9hOmCojei++SQ6pwejjwqhlAmiEMrlCE=; b=W3z3dnMAM29dyomOSZiZvAz5Y0McDZii/7qbNpk/FocjxYatY9ifF/lOhRs+gcdv+K I4lIYiSS4eYl/eQsbOMVGUWfNUl+4U52GWSRyMXvznuYDIIp8rtqI9N8VHIRmuYr+wBY QyBqEk960wF/um/gjrSDxp4pygw1DwvfFmiuiZNcfWGJgzTIvceI5T2a1dyDsGioNgYO snQhpYS1YxQAhFfuTAJnXa/Gtnj2awIcYHyn0aXc+hiKXhMUsk1Cc5Y8gUM1A+8eb3RQ xtlV24ekhoG6BfNrUilejyAWrTTBDhXkK3dDk4iTGiiejIm2taarZUh+ZSwAGkux9jGE 9Ukg== X-Gm-Message-State: AOAM532oOAAzDRwRA1rj35TqRofY05B6OCrlCRYYaldfMq76lxo2BNhU BrIpOI+4tIVowTPUoN36z6N/cm50vYY6txqN7Ns= X-Google-Smtp-Source: ABdhPJyKflzbnRzPAyZIxgistNBztBFbjODlUVgkTzRZ9Xr5SrnWYYZGBUvQ5V5bwx7+roSidNIo7w== X-Received: by 2002:a65:6299:: with SMTP id f25mr18300617pgv.402.1606762394664; Mon, 30 Nov 2020 10:53:14 -0800 (PST) Received: from btopel-mobl.ger.intel.com ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id i3sm12005978pgq.12.2020.11.30.10.53.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Nov 2020 10:53:13 -0800 (PST) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , magnus.karlsson@intel.com, ast@kernel.org, daniel@iogearbox.net, maciej.fijalkowski@intel.com, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, qi.z.zhang@intel.com, kuba@kernel.org, edumazet@google.com, jonathan.lemon@gmail.com, maximmi@nvidia.com, intel-wired-lan@lists.osuosl.org, netanel@amazon.com, akiyano@amazon.com, michael.chan@broadcom.com, sgoutham@marvell.com, ioana.ciornei@nxp.com, ruxandra.radulescu@nxp.com, thomas.petazzoni@bootlin.com, mcroce@microsoft.com, saeedm@nvidia.com, tariqt@nvidia.com, aelior@marvell.com, ecree@solarflare.com, ilias.apalodimas@linaro.org, grygorii.strashko@ti.com, sthemmin@microsoft.com, mst@redhat.com, kda@linux-powerpc.org Subject: [PATCH bpf-next v4 06/10] xsk: propagate napi_id to XDP socket Rx path Date: Mon, 30 Nov 2020 19:52:01 +0100 Message-Id: <20201130185205.196029-7-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201130185205.196029-1-bjorn.topel@gmail.com> References: <20201130185205.196029-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Add napi_id to the xdp_rxq_info structure, and make sure the XDP socket pick up the napi_id in the Rx path. The napi_id is used to find the corresponding NAPI structure for socket busy polling. Acked-by: Ilias Apalodimas Acked-by: Michael S. Tsirkin Acked-by: Tariq Toukan Signed-off-by: Björn Töpel --- drivers/net/ethernet/amazon/ena/ena_netdev.c | 2 +- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 2 +- .../ethernet/cavium/thunder/nicvf_queues.c | 2 +- .../net/ethernet/freescale/dpaa2/dpaa2-eth.c | 2 +- drivers/net/ethernet/intel/i40e/i40e_txrx.c | 2 +- drivers/net/ethernet/intel/ice/ice_base.c | 4 ++-- drivers/net/ethernet/intel/ice/ice_txrx.c | 2 +- drivers/net/ethernet/intel/igb/igb_main.c | 2 +- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 2 +- .../net/ethernet/intel/ixgbevf/ixgbevf_main.c | 2 +- drivers/net/ethernet/marvell/mvneta.c | 2 +- .../net/ethernet/marvell/mvpp2/mvpp2_main.c | 4 ++-- drivers/net/ethernet/mellanox/mlx4/en_rx.c | 2 +- .../net/ethernet/mellanox/mlx5/core/en_main.c | 2 +- .../ethernet/netronome/nfp/nfp_net_common.c | 2 +- drivers/net/ethernet/qlogic/qede/qede_main.c | 2 +- drivers/net/ethernet/sfc/rx_common.c | 2 +- drivers/net/ethernet/socionext/netsec.c | 2 +- drivers/net/ethernet/ti/cpsw_priv.c | 2 +- drivers/net/hyperv/netvsc.c | 2 +- drivers/net/tun.c | 2 +- drivers/net/veth.c | 12 ++++++++---- drivers/net/virtio_net.c | 2 +- drivers/net/xen-netfront.c | 2 +- include/net/busy_poll.h | 19 +++++++++++++++---- include/net/xdp.h | 3 ++- net/core/dev.c | 2 +- net/core/xdp.c | 3 ++- net/xdp/xsk.c | 1 + 29 files changed, 54 insertions(+), 36 deletions(-) diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c index e8131dadc22c..6ad59f0068f6 100644 --- a/drivers/net/ethernet/amazon/ena/ena_netdev.c +++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c @@ -416,7 +416,7 @@ static int ena_xdp_register_rxq_info(struct ena_ring *rx_ring) { int rc; - rc = xdp_rxq_info_reg(&rx_ring->xdp_rxq, rx_ring->netdev, rx_ring->qid); + rc = xdp_rxq_info_reg(&rx_ring->xdp_rxq, rx_ring->netdev, rx_ring->qid, 0); if (rc) { netif_err(rx_ring->adapter, ifup, rx_ring->netdev, diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 7975f59735d6..725d929eddb1 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -2884,7 +2884,7 @@ static int bnxt_alloc_rx_rings(struct bnxt *bp) if (rc) return rc; - rc = xdp_rxq_info_reg(&rxr->xdp_rxq, bp->dev, i); + rc = xdp_rxq_info_reg(&rxr->xdp_rxq, bp->dev, i, 0); if (rc < 0) return rc; diff --git a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c index 7a141ce32e86..f782e6af45e9 100644 --- a/drivers/net/ethernet/cavium/thunder/nicvf_queues.c +++ b/drivers/net/ethernet/cavium/thunder/nicvf_queues.c @@ -770,7 +770,7 @@ static void nicvf_rcv_queue_config(struct nicvf *nic, struct queue_set *qs, rq->caching = 1; /* Driver have no proper error path for failed XDP RX-queue info reg */ - WARN_ON(xdp_rxq_info_reg(&rq->xdp_rxq, nic->netdev, qidx) < 0); + WARN_ON(xdp_rxq_info_reg(&rq->xdp_rxq, nic->netdev, qidx, 0) < 0); /* Send a mailbox msg to PF to config RQ */ mbx.rq.msg = NIC_MBOX_MSG_RQ_CFG; diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c index cf9400a9886d..40953980e846 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-eth.c @@ -3334,7 +3334,7 @@ static int dpaa2_eth_setup_rx_flow(struct dpaa2_eth_priv *priv, return 0; err = xdp_rxq_info_reg(&fq->channel->xdp_rxq, priv->net_dev, - fq->flowid); + fq->flowid, 0); if (err) { dev_err(dev, "xdp_rxq_info_reg failed\n"); return err; diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c index c21548c71bb1..9f73cd7aee09 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c @@ -1447,7 +1447,7 @@ int i40e_setup_rx_descriptors(struct i40e_ring *rx_ring) /* XDP RX-queue info only needed for RX rings exposed to XDP */ if (rx_ring->vsi->type == I40E_VSI_MAIN) { err = xdp_rxq_info_reg(&rx_ring->xdp_rxq, rx_ring->netdev, - rx_ring->queue_index); + rx_ring->queue_index, rx_ring->q_vector->napi.napi_id); if (err < 0) return err; } diff --git a/drivers/net/ethernet/intel/ice/ice_base.c b/drivers/net/ethernet/intel/ice/ice_base.c index fe4320e2d1f2..3124a3bf519a 100644 --- a/drivers/net/ethernet/intel/ice/ice_base.c +++ b/drivers/net/ethernet/intel/ice/ice_base.c @@ -306,7 +306,7 @@ int ice_setup_rx_ctx(struct ice_ring *ring) if (!xdp_rxq_info_is_reg(&ring->xdp_rxq)) /* coverity[check_return] */ xdp_rxq_info_reg(&ring->xdp_rxq, ring->netdev, - ring->q_index); + ring->q_index, ring->q_vector->napi.napi_id); ring->xsk_pool = ice_xsk_pool(ring); if (ring->xsk_pool) { @@ -333,7 +333,7 @@ int ice_setup_rx_ctx(struct ice_ring *ring) /* coverity[check_return] */ xdp_rxq_info_reg(&ring->xdp_rxq, ring->netdev, - ring->q_index); + ring->q_index, ring->q_vector->napi.napi_id); err = xdp_rxq_info_reg_mem_model(&ring->xdp_rxq, MEM_TYPE_PAGE_SHARED, diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c index eae75260fe20..77d5eae6b4c2 100644 --- a/drivers/net/ethernet/intel/ice/ice_txrx.c +++ b/drivers/net/ethernet/intel/ice/ice_txrx.c @@ -483,7 +483,7 @@ int ice_setup_rx_ring(struct ice_ring *rx_ring) if (rx_ring->vsi->type == ICE_VSI_PF && !xdp_rxq_info_is_reg(&rx_ring->xdp_rxq)) if (xdp_rxq_info_reg(&rx_ring->xdp_rxq, rx_ring->netdev, - rx_ring->q_index)) + rx_ring->q_index, rx_ring->q_vector->napi.napi_id)) goto err; return 0; diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 5fc2c381da55..6a4ef4934fcf 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -4352,7 +4352,7 @@ int igb_setup_rx_resources(struct igb_ring *rx_ring) /* XDP RX-queue info */ if (xdp_rxq_info_reg(&rx_ring->xdp_rxq, rx_ring->netdev, - rx_ring->queue_index) < 0) + rx_ring->queue_index, 0) < 0) goto err; return 0; diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 45ae33e15303..50e6b8b6ba7b 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -6577,7 +6577,7 @@ int ixgbe_setup_rx_resources(struct ixgbe_adapter *adapter, /* XDP RX-queue info */ if (xdp_rxq_info_reg(&rx_ring->xdp_rxq, adapter->netdev, - rx_ring->queue_index) < 0) + rx_ring->queue_index, rx_ring->q_vector->napi.napi_id) < 0) goto err; rx_ring->xdp_prog = adapter->xdp_prog; diff --git a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c index 82fce27f682b..4061cd7db5dd 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c +++ b/drivers/net/ethernet/intel/ixgbevf/ixgbevf_main.c @@ -3493,7 +3493,7 @@ int ixgbevf_setup_rx_resources(struct ixgbevf_adapter *adapter, /* XDP RX-queue info */ if (xdp_rxq_info_reg(&rx_ring->xdp_rxq, adapter->netdev, - rx_ring->queue_index) < 0) + rx_ring->queue_index, 0) < 0) goto err; rx_ring->xdp_prog = adapter->xdp_prog; diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c index 183530ed4d1d..ba6dcb19bb1d 100644 --- a/drivers/net/ethernet/marvell/mvneta.c +++ b/drivers/net/ethernet/marvell/mvneta.c @@ -3227,7 +3227,7 @@ static int mvneta_create_page_pool(struct mvneta_port *pp, return err; } - err = xdp_rxq_info_reg(&rxq->xdp_rxq, pp->dev, rxq->id); + err = xdp_rxq_info_reg(&rxq->xdp_rxq, pp->dev, rxq->id, 0); if (err < 0) goto err_free_pp; diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c index 3069e192d773..5504cbc24970 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c @@ -2614,11 +2614,11 @@ static int mvpp2_rxq_init(struct mvpp2_port *port, mvpp2_rxq_status_update(port, rxq->id, 0, rxq->size); if (priv->percpu_pools) { - err = xdp_rxq_info_reg(&rxq->xdp_rxq_short, port->dev, rxq->id); + err = xdp_rxq_info_reg(&rxq->xdp_rxq_short, port->dev, rxq->id, 0); if (err < 0) goto err_free_dma; - err = xdp_rxq_info_reg(&rxq->xdp_rxq_long, port->dev, rxq->id); + err = xdp_rxq_info_reg(&rxq->xdp_rxq_long, port->dev, rxq->id, 0); if (err < 0) goto err_unregister_rxq_short; diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c index b0f79a5151cf..40775cb8fb2a 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c @@ -283,7 +283,7 @@ int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv, ring->log_stride = ffs(ring->stride) - 1; ring->buf_size = ring->size * ring->stride + TXBB_SIZE; - if (xdp_rxq_info_reg(&ring->xdp_rxq, priv->dev, queue_index) < 0) + if (xdp_rxq_info_reg(&ring->xdp_rxq, priv->dev, queue_index, 0) < 0) goto err_ring; tmp = size * roundup_pow_of_two(MLX4_EN_MAX_RX_FRAGS * diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 527c5f12c5af..427fc376fe1a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -434,7 +434,7 @@ static int mlx5e_alloc_rq(struct mlx5e_channel *c, rq_xdp_ix = rq->ix; if (xsk) rq_xdp_ix += params->num_channels * MLX5E_RQ_GROUP_XSK; - err = xdp_rxq_info_reg(&rq->xdp_rxq, rq->netdev, rq_xdp_ix); + err = xdp_rxq_info_reg(&rq->xdp_rxq, rq->netdev, rq_xdp_ix, 0); if (err < 0) goto err_rq_xdp_prog; diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c index b150da43adb2..b4acf2f41e84 100644 --- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c +++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c @@ -2533,7 +2533,7 @@ nfp_net_rx_ring_alloc(struct nfp_net_dp *dp, struct nfp_net_rx_ring *rx_ring) if (dp->netdev) { err = xdp_rxq_info_reg(&rx_ring->xdp_rxq, dp->netdev, - rx_ring->idx); + rx_ring->idx, rx_ring->r_vec->napi.napi_id); if (err < 0) return err; } diff --git a/drivers/net/ethernet/qlogic/qede/qede_main.c b/drivers/net/ethernet/qlogic/qede/qede_main.c index 05e3a3b60269..9cf960a6d007 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_main.c +++ b/drivers/net/ethernet/qlogic/qede/qede_main.c @@ -1762,7 +1762,7 @@ static void qede_init_fp(struct qede_dev *edev) /* Driver have no error path from here */ WARN_ON(xdp_rxq_info_reg(&fp->rxq->xdp_rxq, edev->ndev, - fp->rxq->rxq_id) < 0); + fp->rxq->rxq_id, 0) < 0); if (xdp_rxq_info_reg_mem_model(&fp->rxq->xdp_rxq, MEM_TYPE_PAGE_ORDER0, diff --git a/drivers/net/ethernet/sfc/rx_common.c b/drivers/net/ethernet/sfc/rx_common.c index 19cf7cac1e6e..68fc7d317693 100644 --- a/drivers/net/ethernet/sfc/rx_common.c +++ b/drivers/net/ethernet/sfc/rx_common.c @@ -262,7 +262,7 @@ void efx_init_rx_queue(struct efx_rx_queue *rx_queue) /* Initialise XDP queue information */ rc = xdp_rxq_info_reg(&rx_queue->xdp_rxq_info, efx->net_dev, - rx_queue->core_index); + rx_queue->core_index, 0); if (rc) { netif_err(efx, rx_err, efx->net_dev, diff --git a/drivers/net/ethernet/socionext/netsec.c b/drivers/net/ethernet/socionext/netsec.c index 1503cc9ec6e2..27d3c9d9210e 100644 --- a/drivers/net/ethernet/socionext/netsec.c +++ b/drivers/net/ethernet/socionext/netsec.c @@ -1304,7 +1304,7 @@ static int netsec_setup_rx_dring(struct netsec_priv *priv) goto err_out; } - err = xdp_rxq_info_reg(&dring->xdp_rxq, priv->ndev, 0); + err = xdp_rxq_info_reg(&dring->xdp_rxq, priv->ndev, 0, priv->napi.napi_id); if (err) goto err_out; diff --git a/drivers/net/ethernet/ti/cpsw_priv.c b/drivers/net/ethernet/ti/cpsw_priv.c index 31c5e36ff706..6dd73bd0f458 100644 --- a/drivers/net/ethernet/ti/cpsw_priv.c +++ b/drivers/net/ethernet/ti/cpsw_priv.c @@ -1186,7 +1186,7 @@ static int cpsw_ndev_create_xdp_rxq(struct cpsw_priv *priv, int ch) pool = cpsw->page_pool[ch]; rxq = &priv->xdp_rxq[ch]; - ret = xdp_rxq_info_reg(rxq, priv->ndev, ch); + ret = xdp_rxq_info_reg(rxq, priv->ndev, ch, 0); if (ret) return ret; diff --git a/drivers/net/hyperv/netvsc.c b/drivers/net/hyperv/netvsc.c index 0c3de94b5178..fa8341f8359a 100644 --- a/drivers/net/hyperv/netvsc.c +++ b/drivers/net/hyperv/netvsc.c @@ -1499,7 +1499,7 @@ struct netvsc_device *netvsc_device_add(struct hv_device *device, u64_stats_init(&nvchan->tx_stats.syncp); u64_stats_init(&nvchan->rx_stats.syncp); - ret = xdp_rxq_info_reg(&nvchan->xdp_rxq, ndev, i); + ret = xdp_rxq_info_reg(&nvchan->xdp_rxq, ndev, i, 0); if (ret) { netdev_err(ndev, "xdp_rxq_info_reg fail: %d\n", ret); diff --git a/drivers/net/tun.c b/drivers/net/tun.c index 3d45d56172cb..8867d39db6ac 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -780,7 +780,7 @@ static int tun_attach(struct tun_struct *tun, struct file *file, } else { /* Setup XDP RX-queue info, for new tfile getting attached */ err = xdp_rxq_info_reg(&tfile->xdp_rxq, - tun->dev, tfile->queue_index); + tun->dev, tfile->queue_index, 0); if (err < 0) goto out; err = xdp_rxq_info_reg_mem_model(&tfile->xdp_rxq, diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 8c737668008a..9bd37c7151f8 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -884,7 +884,6 @@ static int veth_napi_add(struct net_device *dev) for (i = 0; i < dev->real_num_rx_queues; i++) { struct veth_rq *rq = &priv->rq[i]; - netif_napi_add(dev, &rq->xdp_napi, veth_poll, NAPI_POLL_WEIGHT); napi_enable(&rq->xdp_napi); } @@ -926,7 +925,8 @@ static int veth_enable_xdp(struct net_device *dev) for (i = 0; i < dev->real_num_rx_queues; i++) { struct veth_rq *rq = &priv->rq[i]; - err = xdp_rxq_info_reg(&rq->xdp_rxq, dev, i); + netif_napi_add(dev, &rq->xdp_napi, veth_poll, NAPI_POLL_WEIGHT); + err = xdp_rxq_info_reg(&rq->xdp_rxq, dev, i, rq->xdp_napi.napi_id); if (err < 0) goto err_rxq_reg; @@ -952,8 +952,12 @@ static int veth_enable_xdp(struct net_device *dev) err_reg_mem: xdp_rxq_info_unreg(&priv->rq[i].xdp_rxq); err_rxq_reg: - for (i--; i >= 0; i--) - xdp_rxq_info_unreg(&priv->rq[i].xdp_rxq); + for (i--; i >= 0; i--) { + struct veth_rq *rq = &priv->rq[i]; + + xdp_rxq_info_unreg(&rq->xdp_rxq); + netif_napi_del(&rq->xdp_napi); + } return err; } diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c index 21b71148c532..052975ea0af4 100644 --- a/drivers/net/virtio_net.c +++ b/drivers/net/virtio_net.c @@ -1485,7 +1485,7 @@ static int virtnet_open(struct net_device *dev) if (!try_fill_recv(vi, &vi->rq[i], GFP_KERNEL)) schedule_delayed_work(&vi->refill, 0); - err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i); + err = xdp_rxq_info_reg(&vi->rq[i].xdp_rxq, dev, i, vi->rq[i].napi.napi_id); if (err < 0) return err; diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c index 920cac4385bf..b01848ef4649 100644 --- a/drivers/net/xen-netfront.c +++ b/drivers/net/xen-netfront.c @@ -2014,7 +2014,7 @@ static int xennet_create_page_pool(struct netfront_queue *queue) } err = xdp_rxq_info_reg(&queue->xdp_rxq, queue->info->netdev, - queue->id); + queue->id, 0); if (err) { netdev_err(queue->info->netdev, "xdp_rxq_info_reg failed\n"); goto err_free_pp; diff --git a/include/net/busy_poll.h b/include/net/busy_poll.h index 2f8f51807b83..45b3e04b99d3 100644 --- a/include/net/busy_poll.h +++ b/include/net/busy_poll.h @@ -135,14 +135,25 @@ static inline void sk_mark_napi_id(struct sock *sk, const struct sk_buff *skb) sk_rx_queue_set(sk, skb); } -/* variant used for unconnected sockets */ -static inline void sk_mark_napi_id_once(struct sock *sk, - const struct sk_buff *skb) +static inline void __sk_mark_napi_id_once_xdp(struct sock *sk, unsigned int napi_id) { #ifdef CONFIG_NET_RX_BUSY_POLL if (!READ_ONCE(sk->sk_napi_id)) - WRITE_ONCE(sk->sk_napi_id, skb->napi_id); + WRITE_ONCE(sk->sk_napi_id, napi_id); #endif } +/* variant used for unconnected sockets */ +static inline void sk_mark_napi_id_once(struct sock *sk, + const struct sk_buff *skb) +{ + __sk_mark_napi_id_once_xdp(sk, skb->napi_id); +} + +static inline void sk_mark_napi_id_once_xdp(struct sock *sk, + const struct xdp_buff *xdp) +{ + __sk_mark_napi_id_once_xdp(sk, xdp->rxq->napi_id); +} + #endif /* _LINUX_NET_BUSY_POLL_H */ diff --git a/include/net/xdp.h b/include/net/xdp.h index 7d48b2ae217a..700ad5db7f5d 100644 --- a/include/net/xdp.h +++ b/include/net/xdp.h @@ -59,6 +59,7 @@ struct xdp_rxq_info { u32 queue_index; u32 reg_state; struct xdp_mem_info mem; + unsigned int napi_id; } ____cacheline_aligned; /* perf critical, avoid false-sharing */ struct xdp_txq_info { @@ -226,7 +227,7 @@ static inline void xdp_release_frame(struct xdp_frame *xdpf) } int xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq, - struct net_device *dev, u32 queue_index); + struct net_device *dev, u32 queue_index, unsigned int napi_id); void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq); void xdp_rxq_info_unused(struct xdp_rxq_info *xdp_rxq); bool xdp_rxq_info_is_reg(struct xdp_rxq_info *xdp_rxq); diff --git a/net/core/dev.c b/net/core/dev.c index 7a1e5936c67f..3b6b0e175fe7 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -9810,7 +9810,7 @@ static int netif_alloc_rx_queues(struct net_device *dev) rx[i].dev = dev; /* XDP RX-queue setup */ - err = xdp_rxq_info_reg(&rx[i].xdp_rxq, dev, i); + err = xdp_rxq_info_reg(&rx[i].xdp_rxq, dev, i, 0); if (err < 0) goto err_rxq_info; } diff --git a/net/core/xdp.c b/net/core/xdp.c index 3d330ebda893..17ffd33c6b18 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -158,7 +158,7 @@ static void xdp_rxq_info_init(struct xdp_rxq_info *xdp_rxq) /* Returns 0 on success, negative on failure */ int xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq, - struct net_device *dev, u32 queue_index) + struct net_device *dev, u32 queue_index, unsigned int napi_id) { if (xdp_rxq->reg_state == REG_STATE_UNUSED) { WARN(1, "Driver promised not to register this"); @@ -179,6 +179,7 @@ int xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq, xdp_rxq_info_init(xdp_rxq); xdp_rxq->dev = dev; xdp_rxq->queue_index = queue_index; + xdp_rxq->napi_id = napi_id; xdp_rxq->reg_state = REG_STATE_REGISTERED; return 0; diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index a8501a5477cf..7588e599a048 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -233,6 +233,7 @@ static int xsk_rcv(struct xdp_sock *xs, struct xdp_buff *xdp, if (xs->dev != xdp->rxq->dev || xs->queue_id != xdp->rxq->queue_index) return -EINVAL; + sk_mark_napi_id_once_xdp(&xs->sk, xdp); len = xdp->data_end - xdp->data; return xdp->rxq->mem.type == MEM_TYPE_XSK_BUFF_POOL ? From patchwork Mon Nov 30 18:52:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 335036 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 04FCAC83023 for ; Mon, 30 Nov 2020 18:54:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C14132074A for ; Mon, 30 Nov 2020 18:54:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="mLWsM3dg" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388246AbgK3SyF (ORCPT ); Mon, 30 Nov 2020 13:54:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60790 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388215AbgK3SyD (ORCPT ); Mon, 30 Nov 2020 13:54:03 -0500 Received: from mail-pj1-x1044.google.com (mail-pj1-x1044.google.com [IPv6:2607:f8b0:4864:20::1044]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E5E2C0613D2; Mon, 30 Nov 2020 10:53:22 -0800 (PST) Received: by mail-pj1-x1044.google.com with SMTP id r9so133595pjl.5; Mon, 30 Nov 2020 10:53:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=fmqhingHLo6a3b7AT4xKvWss37atmCmXSS2JFS2qwt8=; b=mLWsM3dgojb+LvnMZdQrQaRRXrS0CPK0KYxTsh1ET1ITv5UOfwPa5v8lmvpE0zAUHO Ltf6T8kfYScaqRkrf1khDQqRg8K+si3jrln5C0bP33QNNfEixeCLQNKOzF4L+E5o/r3R 5u+iZ9z2JRL8CsgBzMzxl1xEdRMdvD77Ejtu07MWF6mBOSVZ6T5Fzu+gv+21wE8St1nA zd8GOHZZnEhPu4VcLZy0HkmgfGSWo0549V9OQjypmKlCqUTOiYXchYfQie2ctiS7hVgB ZCKoKUfSRNh6jhxl2U0CdfXs3v6UHav1Ix6crkmZ5L2meoFo3DoJLEYR6gnz8yPlh0N+ cDJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=fmqhingHLo6a3b7AT4xKvWss37atmCmXSS2JFS2qwt8=; b=tErBh1/gezi+Q7XCl0dTK3LpqWLgMNlYic3Ln3k4/Kx0WLsfADKdA13BzG/wfwhy/Y aAPkkIqjXhmwpgKsX80dwXNZTVT2yj1afOfM2goNT37eX1z0xc9S8IrncIad7wYhW2zl 4hc2NlpW/udgu0q/nsFmjLKNO7vo/iGe3pcINrusfVSi5n6qL8WBD9p+OlR7t5YF3k+e YXluySbOiCjtRoU1zNUEuBZftoBVryJmHtxm6LkMvfCmH6YHA+DJKeOZ5TNZCmuL4Fb1 pirCZOHPIOUxL1+81BxO46h5bIw1PUag/NQg6iKwbppx86eew7/+Jn1FFBxOwRxxZ0r8 Nonw== X-Gm-Message-State: AOAM530cLqdWHpFcFCpSth5DY2Po7cHJrE6bJMSoorqIG2FrqbHYztQ0 0HG+xoZSG+lHJflkwcv0mAPpBoTNXQc8qJRn7OY= X-Google-Smtp-Source: ABdhPJxJp+GRWsvXDUjngQkjuJUrcfigF+espLaDp5UTTjcihuycguUANO1ZCtChnG9dLSn004YVhg== X-Received: by 2002:a17:90b:3658:: with SMTP id nh24mr256052pjb.80.1606762401345; Mon, 30 Nov 2020 10:53:21 -0800 (PST) Received: from btopel-mobl.ger.intel.com ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id i3sm12005978pgq.12.2020.11.30.10.53.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Nov 2020 10:53:20 -0800 (PST) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , magnus.karlsson@intel.com, ast@kernel.org, daniel@iogearbox.net, maciej.fijalkowski@intel.com, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, qi.z.zhang@intel.com, kuba@kernel.org, edumazet@google.com, jonathan.lemon@gmail.com, maximmi@nvidia.com Subject: [PATCH bpf-next v4 07/10] samples/bpf: use recvfrom() in xdpsock/rxdrop Date: Mon, 30 Nov 2020 19:52:02 +0100 Message-Id: <20201130185205.196029-8-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201130185205.196029-1-bjorn.topel@gmail.com> References: <20201130185205.196029-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Start using recvfrom() the rxdrop scenario. Acked-by: Magnus Karlsson Signed-off-by: Björn Töpel --- samples/bpf/xdpsock_user.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c index 2567f0db5aca..f90111b95b2e 100644 --- a/samples/bpf/xdpsock_user.c +++ b/samples/bpf/xdpsock_user.c @@ -1170,7 +1170,7 @@ static inline void complete_tx_only(struct xsk_socket_info *xsk, } } -static void rx_drop(struct xsk_socket_info *xsk, struct pollfd *fds) +static void rx_drop(struct xsk_socket_info *xsk) { unsigned int rcvd, i; u32 idx_rx = 0, idx_fq = 0; @@ -1180,7 +1180,7 @@ static void rx_drop(struct xsk_socket_info *xsk, struct pollfd *fds) if (!rcvd) { if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { xsk->app_stats.rx_empty_polls++; - ret = poll(fds, num_socks, opt_timeout); + recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, NULL); } return; } @@ -1191,7 +1191,7 @@ static void rx_drop(struct xsk_socket_info *xsk, struct pollfd *fds) exit_with_error(-ret); if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { xsk->app_stats.fill_fail_polls++; - ret = poll(fds, num_socks, opt_timeout); + recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, NULL); } ret = xsk_ring_prod__reserve(&xsk->umem->fq, rcvd, &idx_fq); } @@ -1233,7 +1233,7 @@ static void rx_drop_all(void) } for (i = 0; i < num_socks; i++) - rx_drop(xsks[i], fds); + rx_drop(xsks[i]); if (benchmark_done) break; From patchwork Mon Nov 30 18:52:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 335035 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9123CC64E8A for ; Mon, 30 Nov 2020 18:54:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 475CB20725 for ; Mon, 30 Nov 2020 18:54:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="Ev+I42/W" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388255AbgK3SyJ (ORCPT ); Mon, 30 Nov 2020 13:54:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60806 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388189AbgK3SyJ (ORCPT ); Mon, 30 Nov 2020 13:54:09 -0500 Received: from mail-pg1-x542.google.com (mail-pg1-x542.google.com [IPv6:2607:f8b0:4864:20::542]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D5385C0613D3; Mon, 30 Nov 2020 10:53:28 -0800 (PST) Received: by mail-pg1-x542.google.com with SMTP id g18so463162pgk.1; Mon, 30 Nov 2020 10:53:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=ifPzIxfVrGp0MkZKK1c0SVJ6+sKps1Wx5g+mcmrszg4=; b=Ev+I42/WDjgKUzgGtnANClHQ9gR9WDUZnmBy9KkrbcSt3QcylXsfL0V6ER4k8QAj3A uptoYE24Uzb9huFBiktrYfn1lLSv+IsAklDvvwnmX5Gx4y4Rq+Uyfk04FBYD+O+SK0id ZCY0ULZLuv+N7sT3O4VnQBuT2VIC5kSSX/hvtsg9sArb3CU3CtDLzY7JB44LPfXPGYBv LEC6Aao5FtJWrCitLIJ1etzWF/teGntq76Si/t+O6xlr5hZAcgSkoSND/HvpJ+zypPBm soPac9db49ZBZT/FVpfYmlLXQzawoRrEALTq9Z4uAi/hmXuYo34rjW/kBerWgMkzoe0p tAmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=ifPzIxfVrGp0MkZKK1c0SVJ6+sKps1Wx5g+mcmrszg4=; b=qlahen5rjN939KYTISzvaC5YxAhxKT9btwQg65R6oDjAVzkA+QhUgsb2cj9zlucswm 2F1psE6LH2T0IO+uDlyN6pj1muExzXE4raGflgrw0G9f0lJ3lzLVtzwWzz0xabz07EqS u0ALr02IRx1Ods8edoTznWOp/P+PoppA5AQKEbEtpMOvzaPqMjD2cznh3yIt+VeaIkt4 qcL2iAeE0Y07xiNtIZVMCsuCYEKsciz7IyAQF7dwxca7mrTsHGnqz1nuTm8lIGHbQtfT dXTaVtCnHv337VgkV0pMKn4Ce0JNOez4pEij6HH0q7y9qO360QRp3oxicZRYuUE+SAeX Y/Ng== X-Gm-Message-State: AOAM531ywvOkqLV+C5XHo9cSwY8FejnDqf51DHuWGgZPowzIhO5ybu3o bH6Pu/N2MMfK/0yH2Q/EAqTTgKgDCYv3SQhvCL4= X-Google-Smtp-Source: ABdhPJzFJI8III8i86/rrPJj8GhR259X/PPDwsEFS+WdKScE+bnODdG5zyDLCbnEbObXNm51WPJ6Ew== X-Received: by 2002:a63:3d8c:: with SMTP id k134mr19043360pga.53.1606762407800; Mon, 30 Nov 2020 10:53:27 -0800 (PST) Received: from btopel-mobl.ger.intel.com ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id i3sm12005978pgq.12.2020.11.30.10.53.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Nov 2020 10:53:26 -0800 (PST) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , magnus.karlsson@intel.com, ast@kernel.org, daniel@iogearbox.net, maciej.fijalkowski@intel.com, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, qi.z.zhang@intel.com, kuba@kernel.org, edumazet@google.com, jonathan.lemon@gmail.com, maximmi@nvidia.com Subject: [PATCH bpf-next v4 08/10] samples/bpf: use recvfrom() in xdpsock/l2fwd Date: Mon, 30 Nov 2020 19:52:03 +0100 Message-Id: <20201130185205.196029-9-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201130185205.196029-1-bjorn.topel@gmail.com> References: <20201130185205.196029-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Start using recvfrom() the l2fwd scenario, instead of poll() which is more expensive and need additional knobs for busy-polling. Acked-by: Magnus Karlsson Signed-off-by: Björn Töpel --- samples/bpf/xdpsock_user.c | 26 ++++++++++++-------------- 1 file changed, 12 insertions(+), 14 deletions(-) diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c index f90111b95b2e..a1a3d6f02ba9 100644 --- a/samples/bpf/xdpsock_user.c +++ b/samples/bpf/xdpsock_user.c @@ -1098,8 +1098,7 @@ static void kick_tx(struct xsk_socket_info *xsk) exit_with_error(errno); } -static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk, - struct pollfd *fds) +static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk) { struct xsk_umem_info *umem = xsk->umem; u32 idx_cq = 0, idx_fq = 0; @@ -1134,7 +1133,8 @@ static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk, exit_with_error(-ret); if (xsk_ring_prod__needs_wakeup(&umem->fq)) { xsk->app_stats.fill_fail_polls++; - ret = poll(fds, num_socks, opt_timeout); + recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, + NULL); } ret = xsk_ring_prod__reserve(&umem->fq, rcvd, &idx_fq); } @@ -1331,19 +1331,19 @@ static void tx_only_all(void) complete_tx_only_all(); } -static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds) +static void l2fwd(struct xsk_socket_info *xsk) { unsigned int rcvd, i; u32 idx_rx = 0, idx_tx = 0; int ret; - complete_tx_l2fwd(xsk, fds); + complete_tx_l2fwd(xsk); rcvd = xsk_ring_cons__peek(&xsk->rx, opt_batch_size, &idx_rx); if (!rcvd) { if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { xsk->app_stats.rx_empty_polls++; - ret = poll(fds, num_socks, opt_timeout); + recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, NULL); } return; } @@ -1353,7 +1353,7 @@ static void l2fwd(struct xsk_socket_info *xsk, struct pollfd *fds) while (ret != rcvd) { if (ret < 0) exit_with_error(-ret); - complete_tx_l2fwd(xsk, fds); + complete_tx_l2fwd(xsk); if (xsk_ring_prod__needs_wakeup(&xsk->tx)) { xsk->app_stats.tx_wakeup_sendtos++; kick_tx(xsk); @@ -1388,22 +1388,20 @@ static void l2fwd_all(void) struct pollfd fds[MAX_SOCKS] = {}; int i, ret; - for (i = 0; i < num_socks; i++) { - fds[i].fd = xsk_socket__fd(xsks[i]->xsk); - fds[i].events = POLLOUT | POLLIN; - } - for (;;) { if (opt_poll) { - for (i = 0; i < num_socks; i++) + for (i = 0; i < num_socks; i++) { + fds[i].fd = xsk_socket__fd(xsks[i]->xsk); + fds[i].events = POLLOUT | POLLIN; xsks[i]->app_stats.opt_polls++; + } ret = poll(fds, num_socks, opt_timeout); if (ret <= 0) continue; } for (i = 0; i < num_socks; i++) - l2fwd(xsks[i], fds); + l2fwd(xsks[i]); if (benchmark_done) break; From patchwork Mon Nov 30 18:52:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 335682 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B093C8302C for ; Mon, 30 Nov 2020 18:54:20 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 41FD220725 for ; Mon, 30 Nov 2020 18:54:20 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="oELndJ2H" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388285AbgK3SyR (ORCPT ); Mon, 30 Nov 2020 13:54:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60824 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388189AbgK3SyP (ORCPT ); Mon, 30 Nov 2020 13:54:15 -0500 Received: from mail-pf1-x441.google.com (mail-pf1-x441.google.com [IPv6:2607:f8b0:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8110DC0613D4; Mon, 30 Nov 2020 10:53:35 -0800 (PST) Received: by mail-pf1-x441.google.com with SMTP id s21so10982072pfu.13; Mon, 30 Nov 2020 10:53:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=7nxznByoRK661Y5sKA3lOE1hFkbu17g9h/tzfUN4O8U=; b=oELndJ2HGY7KT3MSqF6sOrP2oktK0nbVDHPOLqVFWQ89PU3eoWVh8MHs7GUQ9B+Na+ E4ATTNBUEs3x+xUQdZWaVzJRsclaTRnivuKsTfxBI4cGfvVU4QRN7Py+9zOCLhXcikwf /Zn5atxbwsvdzUMSIwed+Egnn2cFXaCHTPLi2X6EiVIyuMm/bkqSXXEboZtLRk6QWdM2 VVpF5jcd8T2KChaoTBsDfm/hdph/HTviSX1O1KIRlphVBH3dc5xo8kQGRq8khcWPzsJE wQndJBNr7k9ZlF4I51FW+VZhbDEotv78g/pvh86BWdQi2dL7yVASLuGJSUoQhx3McbQA cmMw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=7nxznByoRK661Y5sKA3lOE1hFkbu17g9h/tzfUN4O8U=; b=MYS8AQgHQrSpRqwKABCJsPf2yrF+7Q/8F8M4SDLL5fceNSbSf3fBeTExMR/MEjx4RY APBsWsmm2h6F4dI5xnp1CLlT+7xXYUBlXaKto6npwvtVY0tA6LZsX2xg/hSBUuPFjHmY TEt0/VTHihR23XzQYwiaT5LWmckELDoWz2ui4vy1uhjOD3mX/nLGdHmgNj8a7lxk0VGp 7Fwz3nz+48FuX+A+JAVa+NH8ywqBtcmKSI0JjfWZJXFFl0or6J+fL9F97JpgFdh/5iZH +kfXQlCbLAMUwoG6PqZh5HeaM5jbD0APc9RJkjSVxzhzZkWGk2wgdxyHS8ergnBkBfUo d72g== X-Gm-Message-State: AOAM532rKcbwr/3fDNQuJKdiJ6m3oC1QCW7sm6BXLA6tXDTePEQGysLm 8x04qnwnZICy3TnZKdW8o0e0Q5dxH5w287ZBjM4= X-Google-Smtp-Source: ABdhPJwUeUxTq6/xnaNF5YUbPN0/gffBUovwwhhdE0ZsoJbhgAeURjx7CYH7je5Cb6GvQvADs+z1xQ== X-Received: by 2002:a63:5f49:: with SMTP id t70mr13093837pgb.288.1606762414353; Mon, 30 Nov 2020 10:53:34 -0800 (PST) Received: from btopel-mobl.ger.intel.com ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id i3sm12005978pgq.12.2020.11.30.10.53.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Nov 2020 10:53:33 -0800 (PST) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , magnus.karlsson@intel.com, ast@kernel.org, daniel@iogearbox.net, maciej.fijalkowski@intel.com, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, qi.z.zhang@intel.com, kuba@kernel.org, edumazet@google.com, jonathan.lemon@gmail.com, maximmi@nvidia.com Subject: [PATCH bpf-next v4 09/10] samples/bpf: add busy-poll support to xdpsock Date: Mon, 30 Nov 2020 19:52:04 +0100 Message-Id: <20201130185205.196029-10-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201130185205.196029-1-bjorn.topel@gmail.com> References: <20201130185205.196029-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Add a new option to xdpsock, 'B', for busy-polling. This option will also set the batching size, 'b' option, to the busy-poll budget. Acked-by: Magnus Karlsson Signed-off-by: Björn Töpel --- samples/bpf/xdpsock_user.c | 40 +++++++++++++++++++++++++++++++------- 1 file changed, 33 insertions(+), 7 deletions(-) diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c index a1a3d6f02ba9..4622a17fafe1 100644 --- a/samples/bpf/xdpsock_user.c +++ b/samples/bpf/xdpsock_user.c @@ -95,6 +95,7 @@ static int opt_timeout = 1000; static bool opt_need_wakeup = true; static u32 opt_num_xsks = 1; static u32 prog_id; +static bool opt_busy_poll; struct xsk_ring_stats { unsigned long rx_npkts; @@ -911,6 +912,7 @@ static struct option long_options[] = { {"quiet", no_argument, 0, 'Q'}, {"app-stats", no_argument, 0, 'a'}, {"irq-string", no_argument, 0, 'I'}, + {"busy-poll", no_argument, 0, 'B'}, {0, 0, 0, 0} }; @@ -949,6 +951,7 @@ static void usage(const char *prog) " -Q, --quiet Do not display any stats.\n" " -a, --app-stats Display application (syscall) statistics.\n" " -I, --irq-string Display driver interrupt statistics for interface associated with irq-string.\n" + " -B, --busy-poll Busy poll.\n" "\n"; fprintf(stderr, str, prog, XSK_UMEM__DEFAULT_FRAME_SIZE, opt_batch_size, MIN_PKT_SIZE, MIN_PKT_SIZE, @@ -964,7 +967,7 @@ static void parse_command_line(int argc, char **argv) opterr = 0; for (;;) { - c = getopt_long(argc, argv, "Frtli:q:pSNn:czf:muMd:b:C:s:P:xQaI:", + c = getopt_long(argc, argv, "Frtli:q:pSNn:czf:muMd:b:C:s:P:xQaI:B", long_options, &option_index); if (c == -1) break; @@ -1062,7 +1065,9 @@ static void parse_command_line(int argc, char **argv) fprintf(stderr, "ERROR: Failed to get irqs for %s\n", opt_irq_str); usage(basename(argv[0])); } - + break; + case 'B': + opt_busy_poll = 1; break; default: usage(basename(argv[0])); @@ -1131,7 +1136,7 @@ static inline void complete_tx_l2fwd(struct xsk_socket_info *xsk) while (ret != rcvd) { if (ret < 0) exit_with_error(-ret); - if (xsk_ring_prod__needs_wakeup(&umem->fq)) { + if (opt_busy_poll || xsk_ring_prod__needs_wakeup(&umem->fq)) { xsk->app_stats.fill_fail_polls++; recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, NULL); @@ -1178,7 +1183,7 @@ static void rx_drop(struct xsk_socket_info *xsk) rcvd = xsk_ring_cons__peek(&xsk->rx, opt_batch_size, &idx_rx); if (!rcvd) { - if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { + if (opt_busy_poll || xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { xsk->app_stats.rx_empty_polls++; recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, NULL); } @@ -1189,7 +1194,7 @@ static void rx_drop(struct xsk_socket_info *xsk) while (ret != rcvd) { if (ret < 0) exit_with_error(-ret); - if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { + if (opt_busy_poll || xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { xsk->app_stats.fill_fail_polls++; recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, NULL); } @@ -1341,7 +1346,7 @@ static void l2fwd(struct xsk_socket_info *xsk) rcvd = xsk_ring_cons__peek(&xsk->rx, opt_batch_size, &idx_rx); if (!rcvd) { - if (xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { + if (opt_busy_poll || xsk_ring_prod__needs_wakeup(&xsk->umem->fq)) { xsk->app_stats.rx_empty_polls++; recvfrom(xsk_socket__fd(xsk->xsk), NULL, 0, MSG_DONTWAIT, NULL, NULL); } @@ -1354,7 +1359,7 @@ static void l2fwd(struct xsk_socket_info *xsk) if (ret < 0) exit_with_error(-ret); complete_tx_l2fwd(xsk); - if (xsk_ring_prod__needs_wakeup(&xsk->tx)) { + if (opt_busy_poll || xsk_ring_prod__needs_wakeup(&xsk->tx)) { xsk->app_stats.tx_wakeup_sendtos++; kick_tx(xsk); } @@ -1459,6 +1464,24 @@ static void enter_xsks_into_map(struct bpf_object *obj) } } +static void apply_setsockopt(struct xsk_socket_info *xsk) +{ + int sock_opt; + + if (!opt_busy_poll) + return; + + sock_opt = 1; + if (setsockopt(xsk_socket__fd(xsk->xsk), SOL_SOCKET, SO_PREFER_BUSY_POLL, + (void *)&sock_opt, sizeof(sock_opt)) < 0) + exit_with_error(errno); + + sock_opt = 20; + if (setsockopt(xsk_socket__fd(xsk->xsk), SOL_SOCKET, SO_BUSY_POLL, + (void *)&sock_opt, sizeof(sock_opt)) < 0) + exit_with_error(errno); +} + int main(int argc, char **argv) { struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY}; @@ -1500,6 +1523,9 @@ int main(int argc, char **argv) for (i = 0; i < opt_num_xsks; i++) xsks[num_socks++] = xsk_configure_socket(umem, rx, tx); + for (i = 0; i < opt_num_xsks; i++) + apply_setsockopt(xsks[i]); + if (opt_bench == BENCH_TXONLY) { gen_eth_hdr_data(); From patchwork Mon Nov 30 18:52:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= X-Patchwork-Id: 335034 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-15.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI, SPF_HELO_NONE, SPF_PASS, URIBL_BLOCKED, USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C536CC64E8A for ; Mon, 30 Nov 2020 18:55:01 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7DF0B2073C for ; Mon, 30 Nov 2020 18:55:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="a6hzTvLq" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388189AbgK3SyX (ORCPT ); Mon, 30 Nov 2020 13:54:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:60844 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388288AbgK3SyW (ORCPT ); Mon, 30 Nov 2020 13:54:22 -0500 Received: from mail-pl1-x642.google.com (mail-pl1-x642.google.com [IPv6:2607:f8b0:4864:20::642]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E96EAC061A48; Mon, 30 Nov 2020 10:53:41 -0800 (PST) Received: by mail-pl1-x642.google.com with SMTP id t18so7007675plo.0; Mon, 30 Nov 2020 10:53:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=Ouf0WBHGZpA6/0T7zSjGFwwnqyhXPQ0sX/VyNuYesJo=; b=a6hzTvLqRYg7ldf1Khs4RKbCmBut9yjGjfHdb8GujHheQdE2i/oOou4SFRuLrBgg8U K5lLwMQHjHk2cZxK2OK+iUG4nInYmifQqneQ/pV5CDfnbZnL0yoCuIqQVBR71mySif90 FnQG/An25tnFZ+lb8A6HjIehHhP0/uZNyCgWJvCJ/bU2FoRp6giMF9CKvWUMA77Xm/Cq 91a1ISkixkoVOmG59WwjZJBEG+ojQRVIkswDuOQn1EyWS/A13uQM7V9VzO1Zfsld01iT d3vxg6D4FMVgJC9j6AWWtlolsaSGIQj/x4+4XRuyHJWI7pOpQKm1lbEMBbiRvhIVxJIG wfVQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=Ouf0WBHGZpA6/0T7zSjGFwwnqyhXPQ0sX/VyNuYesJo=; b=nlhN6Wwp65H9nBXZ2O64Qvw+u3MmClwnBIRVdLCSCcVeKbYZ6zh2+XBvfqS2a+0kzY v5DY8xuvIwXvHoafinfeeRpkeeH2cWPhu1s32qdUivN/ExYdzvOc0gyjOkR8yPT0NvJ3 vjvB21SlGdCdtswgdnXmHXqyDjnHmyEf2Aw3Ge60jrlTOq2sWtiwMZQRQV12rlo9qgS7 oM4PAFGQFp4L7SduQisudsXhKqMbNM4fV4UpzApw2Y2ZP6K+IVfL2kll9PQHIaVFq13S vN7+xbehV+Sdg/Y7iZYLjhRhHnoEtgM19WOICBxL5kj6wO2wSOoBYkZOXRBo9GLr8peb pYGQ== X-Gm-Message-State: AOAM531QktjrWUAJfObcX0Z2hkUSItJgVwOODX9hh+l9tInwIHSO+Skw AvTC6yNxDMSX64f6yhg9Ds2SxDpnBXkztm/vLAs= X-Google-Smtp-Source: ABdhPJyttLGEJqv8v1u3Ouay0Po85DyvLAuxQfsGNmkUMiVn4Vsp3s/W3cvAdjIkAD/udEbZWdf4UQ== X-Received: by 2002:a17:90a:6b4b:: with SMTP id x11mr229180pjl.3.1606762421076; Mon, 30 Nov 2020 10:53:41 -0800 (PST) Received: from btopel-mobl.ger.intel.com ([192.55.55.41]) by smtp.gmail.com with ESMTPSA id i3sm12005978pgq.12.2020.11.30.10.53.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 30 Nov 2020 10:53:39 -0800 (PST) From: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= To: netdev@vger.kernel.org, bpf@vger.kernel.org Cc: =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , magnus.karlsson@intel.com, ast@kernel.org, daniel@iogearbox.net, maciej.fijalkowski@intel.com, sridhar.samudrala@intel.com, jesse.brandeburg@intel.com, qi.z.zhang@intel.com, kuba@kernel.org, edumazet@google.com, jonathan.lemon@gmail.com, maximmi@nvidia.com Subject: [PATCH bpf-next v4 10/10] samples/bpf: add option to set the busy-poll budget Date: Mon, 30 Nov 2020 19:52:05 +0100 Message-Id: <20201130185205.196029-11-bjorn.topel@gmail.com> X-Mailer: git-send-email 2.27.0 In-Reply-To: <20201130185205.196029-1-bjorn.topel@gmail.com> References: <20201130185205.196029-1-bjorn.topel@gmail.com> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org From: Björn Töpel Support for the SO_BUSY_POLL_BUDGET setsockopt, via the batching option ('b'). Acked-by: Magnus Karlsson Signed-off-by: Björn Töpel --- samples/bpf/xdpsock_user.c | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c index 4622a17fafe1..036bd019e400 100644 --- a/samples/bpf/xdpsock_user.c +++ b/samples/bpf/xdpsock_user.c @@ -1480,6 +1480,11 @@ static void apply_setsockopt(struct xsk_socket_info *xsk) if (setsockopt(xsk_socket__fd(xsk->xsk), SOL_SOCKET, SO_BUSY_POLL, (void *)&sock_opt, sizeof(sock_opt)) < 0) exit_with_error(errno); + + sock_opt = opt_batch_size; + if (setsockopt(xsk_socket__fd(xsk->xsk), SOL_SOCKET, SO_BUSY_POLL_BUDGET, + (void *)&sock_opt, sizeof(sock_opt)) < 0) + exit_with_error(errno); } int main(int argc, char **argv)