From patchwork Thu Feb 6 06:04:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Yoong Siang X-Patchwork-Id: 862771 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D150B2E64A; Thu, 6 Feb 2025 06:04:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.9 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738821875; cv=none; b=q2zL0VQ50nPVnhdjGAD6008o1QTJ8HiLSAqEePSZk7Z7mIh7wG+eyOQQ6f8Z4mQB18UvpdNICtfAc9ppWUhTYrDJt4K0rVOUOe+bYySDNlx0aK8ouO9kH0FMGoRmIHXJIuQakeZ9/+Hb9w4D9RVq7jRh7gDLxHEZTtvCqhJd6Kk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738821875; c=relaxed/simple; bh=2wyIDkPbbE3D+rRyGO1+SmcWucwUizW3A2KhOWL6EtI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=K/DHrQy6u6Glu3+utTOjwUWy+FF5IYDd7OlELhsLEQqDNotvDg1zzHDZBttiHvJOQPgViA9nsprFGo2ywVf/kJtukZN9FdUS1wKV0vi1quQctb3qKL6gZjY+bsCsXHnFoeFQrjjadxPSn+nU7rtuekxSn1n3ekVYj+zRPmzQEJ8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=SNEiUW+K; arc=none smtp.client-ip=192.198.163.9 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="SNEiUW+K" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1738821873; x=1770357873; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=2wyIDkPbbE3D+rRyGO1+SmcWucwUizW3A2KhOWL6EtI=; b=SNEiUW+K2VNEMT+LeXsvxEVY+uGC6htXJ/OqCtwo2BZahZXHBwQeLjUz 2NZrhmC1Cx8fDnx7ozHQadR3y9SAqYor/BNdK+GMJdxL/3gGlEMojYT9F EUA9VQoHhwkIRIRXlpbqKOlwM5IuWpAKKWyLJQdS2bNWDPDFM2J/65tlo t1VDiSAuQQTO1lJCf9ljWNky+F656r/BzAyTuMRZt4WjDIEt6+GJ1Jt3t VVRlsmmdMNQLEeabwpyXtf9V3wPOP/Hn1+teA9e2RKAsGf5W9RMp2A+Xy zZn8WZ88DsmQ/rMKy7b0xOrs2EW4+wuyxQf0uJlp6g8lfHGDkXl5F4sH2 g==; X-CSE-ConnectionGUID: ERtRekBkQhCCnkwD79M6EQ== X-CSE-MsgGUID: RJvNERgtT/iHOBx/Z+hRGw== X-IronPort-AV: E=McAfee;i="6700,10204,11336"; a="50051002" X-IronPort-AV: E=Sophos;i="6.13,263,1732608000"; d="scan'208";a="50051002" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2025 22:04:32 -0800 X-CSE-ConnectionGUID: vo8zm1ZAR/qMXBRtmYm1Fg== X-CSE-MsgGUID: 6hhPH50JR16z5zl08IrWeg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.13,263,1732608000"; d="scan'208";a="110944355" Received: from p12ill20yoongsia.png.intel.com ([10.88.227.38]) by fmviesa006.fm.intel.com with ESMTP; 05 Feb 2025 22:04:21 -0800 From: Song Yoong Siang To: "David S . Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Willem de Bruijn , Florian Bezdeka , Donald Hunter , Jonathan Corbet , Bjorn Topel , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Andrew Lunn , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Joe Damato , Stanislav Fomichev , Xuan Zhuo , Mina Almasry , Daniel Jurgens , Song Yoong Siang , Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Martin KaFai Lau , Song Liu , Yonghong Song , KP Singh , Hao Luo , Jiri Olsa , Shuah Khan , Alexandre Torgue , Jose Abreu , Maxime Coquelin , Tony Nguyen , Przemek Kitszel , Faizal Rahim , Choong Yong Liang , Bouska Zdenek Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-stm32@st-md-mailman.stormreply.com, linux-arm-kernel@lists.infradead.org, intel-wired-lan@lists.osuosl.org, xdp-hints@xdp-project.net Subject: [PATCH bpf-next v9 1/5] xsk: Add launch time hardware offload support to XDP Tx metadata Date: Thu, 6 Feb 2025 14:04:04 +0800 Message-Id: <20250206060408.808325-2-yoong.siang.song@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250206060408.808325-1-yoong.siang.song@intel.com> References: <20250206060408.808325-1-yoong.siang.song@intel.com> Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Extend the XDP Tx metadata framework so that user can requests launch time hardware offload, where the Ethernet device will schedule the packet for transmission at a pre-determined time called launch time. The value of launch time is communicated from user space to Ethernet driver via launch_time field of struct xsk_tx_metadata. Suggested-by: Stanislav Fomichev Acked-by: Stanislav Fomichev Signed-off-by: Song Yoong Siang --- Documentation/netlink/specs/netdev.yaml | 4 ++ Documentation/networking/xsk-tx-metadata.rst | 62 ++++++++++++++++++++ include/net/xdp_sock.h | 10 ++++ include/net/xdp_sock_drv.h | 1 + include/uapi/linux/if_xdp.h | 10 ++++ include/uapi/linux/netdev.h | 3 + net/core/netdev-genl.c | 2 + net/xdp/xsk.c | 3 + tools/include/uapi/linux/if_xdp.h | 10 ++++ tools/include/uapi/linux/netdev.h | 3 + 10 files changed, 108 insertions(+) diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml index cbb544bd6c84..901b5afb3df0 100644 --- a/Documentation/netlink/specs/netdev.yaml +++ b/Documentation/netlink/specs/netdev.yaml @@ -70,6 +70,10 @@ definitions: name: tx-checksum doc: L3 checksum HW offload is supported by the driver. + - + name: tx-launch-time-fifo + doc: + Launch time HW offload is supported by the driver. - name: queue-type type: enum diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst index e76b0cfc32f7..df53a10ccac3 100644 --- a/Documentation/networking/xsk-tx-metadata.rst +++ b/Documentation/networking/xsk-tx-metadata.rst @@ -50,6 +50,10 @@ The flags field enables the particular offload: checksum. ``csum_start`` specifies byte offset of where the checksumming should start and ``csum_offset`` specifies byte offset where the device should store the computed checksum. +- ``XDP_TXMD_FLAGS_LAUNCH_TIME``: requests the device to schedule the + packet for transmission at a pre-determined time called launch time. The + value of launch time is indicated by ``launch_time`` field of + ``union xsk_tx_metadata``. Besides the flags above, in order to trigger the offloads, the first packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA`` @@ -65,6 +69,63 @@ In this case, when running in ``XDK_COPY`` mode, the TX checksum is calculated on the CPU. Do not enable this option in production because it will negatively affect performance. +Launch Time +=========== + +The value of the requested launch time should be based on the device's PTP +Hardware Clock (PHC) to ensure accuracy. AF_XDP takes a different data path +compared to the ETF queuing discipline, which organizes packets and delays +their transmission. Instead, AF_XDP immediately hands off the packets to +the device driver without rearranging their order or holding them prior to +transmission. Since the driver maintains FIFO behavior and does not perform +packet reordering, a packet with a launch time request will block other +packets in the same Tx Queue until it is sent. Therefore, it is recommended +to allocate separate queue for scheduling traffic that is intended for +future transmission. + +In scenarios where the launch time offload feature is disabled, the device +driver is expected to disregard the launch time request. For correct +interpretation and meaningful operation, the launch time should never be +set to a value larger than the farthest programmable time in the future +(the horizon). Different devices have different hardware limitations on the +launch time offload feature. + +stmmac driver +------------- + +For stmmac, TSO and launch time (TBS) features are mutually exclusive for +each individual Tx Queue. By default, the driver configures Tx Queue 0 to +support TSO and the rest of the Tx Queues to support TBS. The launch time +hardware offload feature can be enabled or disabled by using the tc-etf +command to call the driver's ndo_setup_tc() callback. + +The value of the launch time that is programmed in the Enhanced Normal +Transmit Descriptors is a 32-bit value, where the most significant 8 bits +represent the time in seconds and the remaining 24 bits represent the time +in 256 ns increments. The programmed launch time is compared against the +PTP time (bits[39:8]) and rolls over after 256 seconds. Therefore, the +horizon of the launch time for dwmac4 and dwxlgmac2 is 128 seconds in the +future. + +igc driver +---------- + +For igc, all four Tx Queues support the launch time feature. The launch +time hardware offload feature can be enabled or disabled by using the +tc-etf command to call the driver's ndo_setup_tc() callback. When entering +TSN mode, the igc driver will reset the device and create a default Qbv +schedule with a 1-second cycle time, with all Tx Queues open at all times. + +The value of the launch time that is programmed in the Advanced Transmit +Context Descriptor is a relative offset to the starting time of the Qbv +transmission window of the queue. The Frst flag of the descriptor can be +set to schedule the packet for the next Qbv cycle. Therefore, the horizon +of the launch time for i225 and i226 is the ending time of the next cycle +of the Qbv transmission window of the queue. For example, when the Qbv +cycle time is set to 1 second, the horizon of the launch time ranges +from 1 second to 2 seconds, depending on where the Qbv cycle is currently +running. + Querying Device Capabilities ============================ @@ -74,6 +135,7 @@ Refer to ``xsk-flags`` features bitmask in - ``tx-timestamp``: device supports ``XDP_TXMD_FLAGS_TIMESTAMP`` - ``tx-checksum``: device supports ``XDP_TXMD_FLAGS_CHECKSUM`` +- ``tx-launch-time-fifo``: device supports ``XDP_TXMD_FLAGS_LAUNCH_TIME`` See ``tools/net/ynl/samples/netdev.c`` on how to query this information. diff --git a/include/net/xdp_sock.h b/include/net/xdp_sock.h index bfe625b55d55..a58ae7589d12 100644 --- a/include/net/xdp_sock.h +++ b/include/net/xdp_sock.h @@ -110,11 +110,16 @@ struct xdp_sock { * indicates position where checksumming should start. * csum_offset indicates position where checksum should be stored. * + * void (*tmo_request_launch_time)(u64 launch_time, void *priv) + * Called when AF_XDP frame requested launch time HW offload support. + * launch_time indicates the PTP time at which the device can schedule the + * packet for transmission. */ struct xsk_tx_metadata_ops { void (*tmo_request_timestamp)(void *priv); u64 (*tmo_fill_timestamp)(void *priv); void (*tmo_request_checksum)(u16 csum_start, u16 csum_offset, void *priv); + void (*tmo_request_launch_time)(u64 launch_time, void *priv); }; #ifdef CONFIG_XDP_SOCKETS @@ -162,6 +167,11 @@ static inline void xsk_tx_metadata_request(const struct xsk_tx_metadata *meta, if (!meta) return; + if (ops->tmo_request_launch_time) + if (meta->flags & XDP_TXMD_FLAGS_LAUNCH_TIME) + ops->tmo_request_launch_time(meta->request.launch_time, + priv); + if (ops->tmo_request_timestamp) if (meta->flags & XDP_TXMD_FLAGS_TIMESTAMP) ops->tmo_request_timestamp(priv); diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h index 784cd34f5bba..fe4afb251719 100644 --- a/include/net/xdp_sock_drv.h +++ b/include/net/xdp_sock_drv.h @@ -199,6 +199,7 @@ static inline void *xsk_buff_raw_get_data(struct xsk_buff_pool *pool, u64 addr) #define XDP_TXMD_FLAGS_VALID ( \ XDP_TXMD_FLAGS_TIMESTAMP | \ XDP_TXMD_FLAGS_CHECKSUM | \ + XDP_TXMD_FLAGS_LAUNCH_TIME | \ 0) static inline bool diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h index 42ec5ddaab8d..42869770776e 100644 --- a/include/uapi/linux/if_xdp.h +++ b/include/uapi/linux/if_xdp.h @@ -127,6 +127,12 @@ struct xdp_options { */ #define XDP_TXMD_FLAGS_CHECKSUM (1 << 1) +/* Request launch time hardware offload. The device will schedule the packet for + * transmission at a pre-determined time called launch time. The value of + * launch time is communicated via launch_time field of struct xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_LAUNCH_TIME (1 << 2) + /* AF_XDP offloads request. 'request' union member is consumed by the driver * when the packet is being transmitted. 'completion' union member is * filled by the driver when the transmit completion arrives. @@ -142,6 +148,10 @@ struct xsk_tx_metadata { __u16 csum_start; /* Offset from csum_start where checksum should be stored. */ __u16 csum_offset; + + /* XDP_TXMD_FLAGS_LAUNCH_TIME */ + /* Launch time in nanosecond against the PTP HW Clock */ + __u64 launch_time; } request; struct { diff --git a/include/uapi/linux/netdev.h b/include/uapi/linux/netdev.h index e4be227d3ad6..5ab85f4af009 100644 --- a/include/uapi/linux/netdev.h +++ b/include/uapi/linux/netdev.h @@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata { * by the driver. * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the * driver. + * @NETDEV_XSK_FLAGS_LAUNCH_TIME: Launch Time HW offload is supported by the + * driver. */ enum netdev_xsk_flags { NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1, NETDEV_XSK_FLAGS_TX_CHECKSUM = 2, + NETDEV_XSK_FLAGS_LAUNCH_TIME = 4, }; enum netdev_queue_type { diff --git a/net/core/netdev-genl.c b/net/core/netdev-genl.c index 715f85c6b62e..9b1e8b846d79 100644 --- a/net/core/netdev-genl.c +++ b/net/core/netdev-genl.c @@ -52,6 +52,8 @@ XDP_METADATA_KFUNC_xxx xsk_features |= NETDEV_XSK_FLAGS_TX_TIMESTAMP; if (netdev->xsk_tx_metadata_ops->tmo_request_checksum) xsk_features |= NETDEV_XSK_FLAGS_TX_CHECKSUM; + if (netdev->xsk_tx_metadata_ops->tmo_request_launch_time) + xsk_features |= NETDEV_XSK_FLAGS_LAUNCH_TIME; } if (nla_put_u32(rsp, NETDEV_A_DEV_IFINDEX, netdev->ifindex) || diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c index 89d2bef96469..41093ea63700 100644 --- a/net/xdp/xsk.c +++ b/net/xdp/xsk.c @@ -742,6 +742,9 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs, goto free_err; } } + + if (meta->flags & XDP_TXMD_FLAGS_LAUNCH_TIME) + skb->skb_mstamp_ns = meta->request.launch_time; } } diff --git a/tools/include/uapi/linux/if_xdp.h b/tools/include/uapi/linux/if_xdp.h index 42ec5ddaab8d..42869770776e 100644 --- a/tools/include/uapi/linux/if_xdp.h +++ b/tools/include/uapi/linux/if_xdp.h @@ -127,6 +127,12 @@ struct xdp_options { */ #define XDP_TXMD_FLAGS_CHECKSUM (1 << 1) +/* Request launch time hardware offload. The device will schedule the packet for + * transmission at a pre-determined time called launch time. The value of + * launch time is communicated via launch_time field of struct xsk_tx_metadata. + */ +#define XDP_TXMD_FLAGS_LAUNCH_TIME (1 << 2) + /* AF_XDP offloads request. 'request' union member is consumed by the driver * when the packet is being transmitted. 'completion' union member is * filled by the driver when the transmit completion arrives. @@ -142,6 +148,10 @@ struct xsk_tx_metadata { __u16 csum_start; /* Offset from csum_start where checksum should be stored. */ __u16 csum_offset; + + /* XDP_TXMD_FLAGS_LAUNCH_TIME */ + /* Launch time in nanosecond against the PTP HW Clock */ + __u64 launch_time; } request; struct { diff --git a/tools/include/uapi/linux/netdev.h b/tools/include/uapi/linux/netdev.h index e4be227d3ad6..5ab85f4af009 100644 --- a/tools/include/uapi/linux/netdev.h +++ b/tools/include/uapi/linux/netdev.h @@ -59,10 +59,13 @@ enum netdev_xdp_rx_metadata { * by the driver. * @NETDEV_XSK_FLAGS_TX_CHECKSUM: L3 checksum HW offload is supported by the * driver. + * @NETDEV_XSK_FLAGS_LAUNCH_TIME: Launch Time HW offload is supported by the + * driver. */ enum netdev_xsk_flags { NETDEV_XSK_FLAGS_TX_TIMESTAMP = 1, NETDEV_XSK_FLAGS_TX_CHECKSUM = 2, + NETDEV_XSK_FLAGS_LAUNCH_TIME = 4, }; enum netdev_queue_type { From patchwork Thu Feb 6 06:04:06 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Yoong Siang X-Patchwork-Id: 862770 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9495C2248A0; Thu, 6 Feb 2025 06:05:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.9 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738821905; cv=none; b=TMh+INbpwiYXVx3gBHtIVHZ0YbXuk0SZwQJw7emgvdhPQpXzOxD+RWpVOLE74ySjGxko7LUbAdeZrteOCEvH65pxiZ1Gztaw53lkrXQFndTnxwCSN5LK8ycdSISlSHvPhlzJebYMoLsm0clB+31+71rmhQdxoncoNHYoQuzq8IA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738821905; c=relaxed/simple; bh=abljFYKEQRlJk1Api1dEJbwUN4B4ZTB76d5g1mTTLRI=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=aO5WQB+5HoGhYbMyHa5aybWl8yRDEhp1mU7hDnpDb466YoeoGq1mk6mA8nywxxbZypSZV80zhqZL0YjGdSD/EEMP05Qvq5+RWocF8D51eIJpBhMQVXKmPSkzPzloPgGCc5PEfL1pdCV2AYCFEXkPrrUB6urQT6BKcwuJLP8ujMA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=JNJfteXK; arc=none smtp.client-ip=192.198.163.9 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="JNJfteXK" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1738821903; x=1770357903; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=abljFYKEQRlJk1Api1dEJbwUN4B4ZTB76d5g1mTTLRI=; b=JNJfteXKSG3Bv/qZm8hT5ZjUVKzbBc+A+btv3kwPnuhrRffwuRe8tv5v 8nO5eXUPUFRXobPtJLcmlY1Rwhd9espDFzg8SCq8c8lO8M3yqbD5kxIk4 7WQbmrauSpC6elOPQJ63fR89VMMdLNHHWxxjYb/q/CzwOVPyYjRmiQXnZ AlcKECsyiCCW7KLXicHk9m3zEotXcnrB1PCZS/wqkAvD8VLCNKV7835/D e4v9U8Zeps5eEQ9d5yjO6IlUKyhdqAfGsZ+xmr1+98XogVpDVIpCr1bR5 2kUL/PpE+rmgxYp16jo95sg3pzPtTzWvlbLtvKV9vpnVWaQsrZSagbQZ8 w==; X-CSE-ConnectionGUID: m8KAogqmR62+D/LjBV271g== X-CSE-MsgGUID: h58UDtuiSbKOxlTV+n+yHA== X-IronPort-AV: E=McAfee;i="6700,10204,11336"; a="50051130" X-IronPort-AV: E=Sophos;i="6.13,263,1732608000"; d="scan'208";a="50051130" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2025 22:04:54 -0800 X-CSE-ConnectionGUID: N3+tOSIDSpWeaTitPrc1Cg== X-CSE-MsgGUID: g7rkif0pShSWDeWMaj1MHg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.13,263,1732608000"; d="scan'208";a="110944366" Received: from p12ill20yoongsia.png.intel.com ([10.88.227.38]) by fmviesa006.fm.intel.com with ESMTP; 05 Feb 2025 22:04:43 -0800 From: Song Yoong Siang To: "David S . Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Willem de Bruijn , Florian Bezdeka , Donald Hunter , Jonathan Corbet , Bjorn Topel , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Andrew Lunn , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Joe Damato , Stanislav Fomichev , Xuan Zhuo , Mina Almasry , Daniel Jurgens , Song Yoong Siang , Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Martin KaFai Lau , Song Liu , Yonghong Song , KP Singh , Hao Luo , Jiri Olsa , Shuah Khan , Alexandre Torgue , Jose Abreu , Maxime Coquelin , Tony Nguyen , Przemek Kitszel , Faizal Rahim , Choong Yong Liang , Bouska Zdenek Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-stm32@st-md-mailman.stormreply.com, linux-arm-kernel@lists.infradead.org, intel-wired-lan@lists.osuosl.org, xdp-hints@xdp-project.net Subject: [PATCH bpf-next v9 3/5] net: stmmac: Add launch time support to XDP ZC Date: Thu, 6 Feb 2025 14:04:06 +0800 Message-Id: <20250206060408.808325-4-yoong.siang.song@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250206060408.808325-1-yoong.siang.song@intel.com> References: <20250206060408.808325-1-yoong.siang.song@intel.com> Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Enable launch time (Time-Based Scheduling) support for XDP zero copy via the XDP Tx metadata framework. This patch has been tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel Tiger Lake platform. Below are the test steps and result. Test 1: Send a single packet with the launch time set to 1 s in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo ./xdp_hw_metadata enp0s30f4 -l 1000000000 -L 1 2. On the Link Partner, send a UDP packet with VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 1 s in the future, the delta between the launch time and the transmit hardware timestamp is 16.963 us, as shown in printout of the xdp_hw_metadata application below. 0x55b5864717a8: rx_desc[4]->addr=88100 addr=88100 comp_addr=88100 EoP No rx_hash, err=-95 HW RX-time: 1734579065767717328 (sec:1734579065.7677) delta to User RX-time sec:0.0004 (375.624 usec) XDP RX-time: 1734579065768004454 (sec:1734579065.7680) delta to User RX-time sec:0.0001 (88.498 usec) No rx_vlan_tci or rx_vlan_proto, err=-95 0x55b5864717a8: ping-pong with csum=5619 (want 0000) csum_start=34 csum_offset=6 HW RX-time: 1734579065767717328 (sec:1734579065.7677) delta to HW Launch-time sec:1.0000 (1000000.000 usec) 0x55b5864717a8: complete tx idx=4 addr=4018 HW Launch-time: 1734579066767717328 (sec:1734579066.7677) delta to HW TX-complete-time sec:0.0000 (16.963 usec) HW TX-complete-time: 1734579066767734291 (sec:1734579066.7677) delta to User TX-complete-time sec:0.0001 (130.408 usec) XDP RX-time: 1734579065768004454 (sec:1734579065.7680) delta to User TX-complete-time sec:0.9999 (999860.245 usec) HW RX-time: 1734579065767717328 (sec:1734579065.7677) delta to HW TX-complete-time sec:1.0000 (1000016.963 usec) 0x55b5864717a8: complete rx idx=132 addr=88100 Test 2: Send 1000 packets with a 10 ms interval and the launch time set to 500 us in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo chrt -f 99 ./xdp_hw_metadata enp0s30f4 -l 500000 -L 1 > \ /dev/shm/result.log 2. On the Link Partner, send 1000 UDP packets with a 10 ms interval and VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 500 us in the future, the average delta between the launch time and the transmit hardware timestamp is 13.854 us, as shown in the analysis of /dev/shm/result.log below. The XDP launch time works correctly in sending 1000 packets continuously. Min delta: 08.410 us Avr delta: 13.854 us Max delta: 17.076 us Total packets forwarded: 1000 Reviewed-by: Choong Yong Liang Signed-off-by: Song Yoong Siang --- drivers/net/ethernet/stmicro/stmmac/stmmac.h | 2 ++ drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 13 +++++++++++++ 2 files changed, 15 insertions(+) diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac.h b/drivers/net/ethernet/stmicro/stmmac/stmmac.h index f05cae103d83..925d8b97a42b 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac.h +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac.h @@ -106,6 +106,8 @@ struct stmmac_metadata_request { struct stmmac_priv *priv; struct dma_desc *tx_desc; bool *set_ic; + struct dma_edesc *edesc; + int tbs; }; struct stmmac_xsk_tx_complete { diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index d04543e5697b..5e5d24924ce7 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -2514,9 +2514,20 @@ static u64 stmmac_xsk_fill_timestamp(void *_priv) return 0; } +static void stmmac_xsk_request_launch_time(u64 launch_time, void *_priv) +{ + struct stmmac_metadata_request *meta_req = _priv; + struct timespec64 ts = ns_to_timespec64(launch_time); + + if (meta_req->tbs & STMMAC_TBS_EN) + stmmac_set_desc_tbs(meta_req->priv, meta_req->edesc, ts.tv_sec, + ts.tv_nsec); +} + static const struct xsk_tx_metadata_ops stmmac_xsk_tx_metadata_ops = { .tmo_request_timestamp = stmmac_xsk_request_timestamp, .tmo_fill_timestamp = stmmac_xsk_fill_timestamp, + .tmo_request_launch_time = stmmac_xsk_request_launch_time, }; static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) @@ -2600,6 +2611,8 @@ static bool stmmac_xdp_xmit_zc(struct stmmac_priv *priv, u32 queue, u32 budget) meta_req.priv = priv; meta_req.tx_desc = tx_desc; meta_req.set_ic = &set_ic; + meta_req.tbs = tx_q->tbs; + meta_req.edesc = &tx_q->dma_entx[entry]; xsk_tx_metadata_request(meta, &stmmac_xsk_tx_metadata_ops, &meta_req); if (set_ic) { From patchwork Thu Feb 6 06:04:08 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Song Yoong Siang X-Patchwork-Id: 862769 Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.9]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 13F702248B5; Thu, 6 Feb 2025 06:05:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.9 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738821920; cv=none; b=dpfwtLGJO5xQsiPSpqNPQv3RpyRyjJ9+8ayAMVJrMjkEgOc4W/0xVOhr5kwuQcObxhOkKQrDENXDMnn8AIlU6ZGodt/vp+QF95X2MWPQDrlhYCph5d9Ge7aQXCOthIIb8ERJCz+/Az3L9oH1hY0BkF881Va1N14RlyvCtR03DIo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1738821920; c=relaxed/simple; bh=h5F9ULWX/IazPkL53mq2S83VDIY5FdzebPtuIVDlMQ0=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=sQpQf5m6B0+RgjMFAbMXHcNCAI45S57V2I3YGizk3rAYtXlR8psYZS0bwa+hBxXBdpBbQRiT9aIG5KEvbmt/+4e5JV58tBB9Sav/nE5DhPPi+lWoW9oKJuHYbXMR5OTFBJdAJl4m64nKrL43iMTx/pMKBH1RhyKjKoX5HVRfRKI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=mRU9rlOB; arc=none smtp.client-ip=192.198.163.9 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="mRU9rlOB" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1738821918; x=1770357918; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=h5F9ULWX/IazPkL53mq2S83VDIY5FdzebPtuIVDlMQ0=; b=mRU9rlOB/W7zYMaaRGcjn7Ok/ARLC5awt3WjQCIzEBJwfdhuf8ENas0g 6qQNJEbZfg5pP+Ze2Hj6EKFX2MiD59xSpluTBzfMERkMLbnvdyXFXDQ9N 5azKkqGX65b2QVrtGZZ59YE1H3C5qjccijy0BQyw4SHBoYKHasE+LoKCT havTQv3wMJ8qV5VFUGZj5IuFkJJjHPT6D2v0/yMPyaUUbPo3a4tmt1yku 5tSrjpISuUx3J2Kf5J4kwSugIked9Oxbtfli1NJozno/i15ydeFjR6WWW 09A4MRNxlC1Cavyokvg/Wubgmpb57JHxjSWmUMEM2vcxRCl087JAKDNnI w==; X-CSE-ConnectionGUID: CjcbOV9DTO6Lh4Wg7/VRqA== X-CSE-MsgGUID: e8C4nPi7S1GVgl1Fv1QPtQ== X-IronPort-AV: E=McAfee;i="6700,10204,11336"; a="50051192" X-IronPort-AV: E=Sophos;i="6.13,263,1732608000"; d="scan'208";a="50051192" Received: from fmviesa006.fm.intel.com ([10.60.135.146]) by fmvoesa103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Feb 2025 22:05:16 -0800 X-CSE-ConnectionGUID: PVvGlUBRQKWV9apBj3Iv9Q== X-CSE-MsgGUID: 7AIUXagISJW44G3R9VFWTg== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.13,263,1732608000"; d="scan'208";a="110944383" Received: from p12ill20yoongsia.png.intel.com ([10.88.227.38]) by fmviesa006.fm.intel.com with ESMTP; 05 Feb 2025 22:05:05 -0800 From: Song Yoong Siang To: "David S . Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Simon Horman , Willem de Bruijn , Florian Bezdeka , Donald Hunter , Jonathan Corbet , Bjorn Topel , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Andrew Lunn , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , Joe Damato , Stanislav Fomichev , Xuan Zhuo , Mina Almasry , Daniel Jurgens , Song Yoong Siang , Andrii Nakryiko , Eduard Zingerman , Mykola Lysenko , Martin KaFai Lau , Song Liu , Yonghong Song , KP Singh , Hao Luo , Jiri Olsa , Shuah Khan , Alexandre Torgue , Jose Abreu , Maxime Coquelin , Tony Nguyen , Przemek Kitszel , Faizal Rahim , Choong Yong Liang , Bouska Zdenek Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-stm32@st-md-mailman.stormreply.com, linux-arm-kernel@lists.infradead.org, intel-wired-lan@lists.osuosl.org, xdp-hints@xdp-project.net Subject: [PATCH bpf-next v9 5/5] igc: Add launch time support to XDP ZC Date: Thu, 6 Feb 2025 14:04:08 +0800 Message-Id: <20250206060408.808325-6-yoong.siang.song@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20250206060408.808325-1-yoong.siang.song@intel.com> References: <20250206060408.808325-1-yoong.siang.song@intel.com> Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Enable Launch Time Control (LTC) support for XDP zero copy via XDP Tx metadata framework. This patch has been tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel I225-LM Ethernet controller. Below are the test steps and result. Test 1: Send a single packet with the launch time set to 1 s in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo ./xdp_hw_metadata enp2s0 -l 1000000000 -L 1 2. On the Link Partner, send a UDP packet with VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 1 s in the future, the delta between the launch time and the transmit hardware timestamp is 0.016 us, as shown in printout of the xdp_hw_metadata application below. 0x562ff5dc8880: rx_desc[4]->addr=84110 addr=84110 comp_addr=84110 EoP rx_hash: 0xE343384 with RSS type:0x1 HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to User RX-time sec:0.0002 (183.103 usec) XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User RX-time sec:0.0001 (80.309 usec) No rx_vlan_tci or rx_vlan_proto, err=-95 0x562ff5dc8880: ping-pong with csum=561c (want c7dd) csum_start=34 csum_offset=6 HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW Launch-time sec:1.0000 (1000000.000 usec) 0x562ff5dc8880: complete tx idx=4 addr=4018 HW Launch-time: 1734578016467548904 (sec:1734578016.4675) delta to HW TX-complete-time sec:0.0000 (0.016 usec) HW TX-complete-time: 1734578016467548920 (sec:1734578016.4675) delta to User TX-complete-time sec:0.0000 (32.546 usec) XDP RX-time: 1734578015467651698 (sec:1734578015.4677) delta to User TX-complete-time sec:0.9999 (999929.768 usec) HW RX-time: 1734578015467548904 (sec:1734578015.4675) delta to HW TX-complete-time sec:1.0000 (1000000.016 usec) 0x562ff5dc8880: complete rx idx=132 addr=84110 Test 2: Send 1000 packets with a 10 ms interval and the launch time set to 500 us in the future. Test steps: 1. On the DUT, start the xdp_hw_metadata selftest application: $ sudo chrt -f 99 ./xdp_hw_metadata enp2s0 -l 500000 -L 1 > \ /dev/shm/result.log 2. On the Link Partner, send 1000 UDP packets with a 10 ms interval and VLAN priority 1 to port 9091 of the DUT. Result: When the launch time is set to 500 us in the future, the average delta between the launch time and the transmit hardware timestamp is 0.016 us, as shown in the analysis of /dev/shm/result.log below. The XDP launch time works correctly in sending 1000 packets continuously. Min delta: 0.005 us Avr delta: 0.016 us Max delta: 0.031 us Total packets forwarded: 1000 Signed-off-by: Song Yoong Siang --- drivers/net/ethernet/intel/igc/igc.h | 1 + drivers/net/ethernet/intel/igc/igc_main.c | 57 ++++++++++++++++++++++- 2 files changed, 56 insertions(+), 2 deletions(-) diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h index b8111ad9a9a8..cd1d7b6c1782 100644 --- a/drivers/net/ethernet/intel/igc/igc.h +++ b/drivers/net/ethernet/intel/igc/igc.h @@ -579,6 +579,7 @@ struct igc_metadata_request { struct xsk_tx_metadata *meta; struct igc_ring *tx_ring; u32 cmd_type; + u16 used_desc; }; struct igc_q_vector { diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 3df608601a4b..f239f744247d 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -2973,9 +2973,44 @@ static u64 igc_xsk_fill_timestamp(void *_priv) return *(u64 *)_priv; } +static void igc_xsk_request_launch_time(u64 launch_time, void *_priv) +{ + struct igc_metadata_request *meta_req = _priv; + struct igc_ring *tx_ring = meta_req->tx_ring; + __le32 launch_time_offset; + bool insert_empty = false; + bool first_flag = false; + + if (!tx_ring->launchtime_enable) + return; + + launch_time_offset = igc_tx_launchtime(tx_ring, + ns_to_ktime(launch_time), + &first_flag, &insert_empty); + if (insert_empty) { + /* Disregard the launch time request if the required empty frame + * fails to be inserted. + */ + if (igc_insert_empty_frame(tx_ring)) + return; + + meta_req->tx_buffer = + &tx_ring->tx_buffer_info[tx_ring->next_to_use]; + /* Inserting an empty packet requires two descriptors: + * one data descriptor and one context descriptor. + */ + meta_req->used_desc += 2; + } + + /* Use one context descriptor to specify launch time and first flag. */ + igc_tx_ctxtdesc(tx_ring, launch_time_offset, first_flag, 0, 0, 0); + meta_req->used_desc += 1; +} + const struct xsk_tx_metadata_ops igc_xsk_tx_metadata_ops = { .tmo_request_timestamp = igc_xsk_request_timestamp, .tmo_fill_timestamp = igc_xsk_fill_timestamp, + .tmo_request_launch_time = igc_xsk_request_launch_time, }; static void igc_xdp_xmit_zc(struct igc_ring *ring) @@ -2998,7 +3033,13 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) ntu = ring->next_to_use; budget = igc_desc_unused(ring); - while (xsk_tx_peek_desc(pool, &xdp_desc) && budget--) { + /* Packets with launch time require one data descriptor and one context + * descriptor. When the launch time falls into the next Qbv cycle, we + * may need to insert an empty packet, which requires two more + * descriptors. Therefore, to be safe, we always ensure we have at least + * 4 descriptors available. + */ + while (xsk_tx_peek_desc(pool, &xdp_desc) && budget >= 4) { struct igc_metadata_request meta_req; struct xsk_tx_metadata *meta = NULL; struct igc_tx_buffer *bi; @@ -3019,9 +3060,19 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) meta_req.tx_ring = ring; meta_req.tx_buffer = bi; meta_req.meta = meta; + meta_req.used_desc = 0; xsk_tx_metadata_request(meta, &igc_xsk_tx_metadata_ops, &meta_req); + /* xsk_tx_metadata_request() may have updated next_to_use */ + ntu = ring->next_to_use; + + /* xsk_tx_metadata_request() may have updated Tx buffer info */ + bi = meta_req.tx_buffer; + + /* xsk_tx_metadata_request() may use a few descriptors */ + budget -= meta_req.used_desc; + tx_desc = IGC_TX_DESC(ring, ntu); tx_desc->read.cmd_type_len = cpu_to_le32(meta_req.cmd_type); tx_desc->read.olinfo_status = cpu_to_le32(olinfo_status); @@ -3039,9 +3090,11 @@ static void igc_xdp_xmit_zc(struct igc_ring *ring) ntu++; if (ntu == ring->count) ntu = 0; + + ring->next_to_use = ntu; + budget--; } - ring->next_to_use = ntu; if (tx_desc) { igc_flush_tx_descriptors(ring); xsk_tx_release(pool);