From patchwork Thu May 29 05:34:44 2025
X-Patchwork-Submitter: Xu Yilun
X-Patchwork-Id: 893239
From: Xu Yilun
To: kvm@vger.kernel.org, sumit.semwal@linaro.org, christian.koenig@amd.com, pbonzini@redhat.com, seanjc@google.com, alex.williamson@redhat.com, jgg@nvidia.com, dan.j.williams@intel.com, aik@amd.com, linux-coco@lists.linux.dev
Cc: dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, vivek.kasireddy@intel.com, yilun.xu@intel.com, yilun.xu@linux.intel.com, linux-kernel@vger.kernel.org, lukas@wunner.de, yan.y.zhao@intel.com, daniel.vetter@ffwll.ch, leon@kernel.org, baolu.lu@linux.intel.com, zhenzhong.duan@intel.com, tao1.su@intel.com, linux-pci@vger.kernel.org, zhiw@nvidia.com, simona.vetter@ffwll.ch, shameerali.kolothum.thodi@huawei.com, aneesh.kumar@kernel.org, iommu@lists.linux.dev, kevin.tian@intel.com
Subject: [RFC PATCH 01/30] HACK:
dma-buf: Introduce dma_buf_get_pfn_unlocked() kAPI
Date: Thu, 29 May 2025 13:34:44 +0800
Message-Id: <20250529053513.1592088-2-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

This is just to illustrate the idea that dma-buf provides a new buffer-sharing mode: importer mapping. The exporter provides a description of the target memory resource, and the importer decides the best way to map the memory based on the properties of the target memory and the importing device.

The get_pfn() kAPI is an initial attempt at this idea. It is obviously not a full description for all kinds of memory types, but it is enough to enable the FD-based MMIO mapping in KVM needed to support private device assignment. There are other concerns discussed [1] with this implementation that need further investigation to work out an improved solution. For now, there is no change from the previous version.
[1] [1]: https://lore.kernel.org/all/20250107142719.179636-2-yilun.xu@linux.intel.com/ Signed-off-by: Xu Yilun --- drivers/dma-buf/dma-buf.c | 87 +++++++++++++++++++++++++++++++-------- include/linux/dma-buf.h | 13 ++++++ 2 files changed, 83 insertions(+), 17 deletions(-) diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 5baa83b85515..58752f0bee36 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -630,10 +630,10 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info) size_t alloc_size = sizeof(struct dma_buf); int ret; - if (WARN_ON(!exp_info->priv || !exp_info->ops - || !exp_info->ops->map_dma_buf - || !exp_info->ops->unmap_dma_buf - || !exp_info->ops->release)) + if (WARN_ON(!exp_info->priv || !exp_info->ops || + (!!exp_info->ops->map_dma_buf != !!exp_info->ops->unmap_dma_buf) || + (!exp_info->ops->map_dma_buf && !exp_info->ops->get_pfn) || + !exp_info->ops->release)) return ERR_PTR(-EINVAL); if (WARN_ON(exp_info->ops->cache_sgt_mapping && @@ -909,7 +909,7 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, struct dma_buf_attachment *attach; int ret; - if (WARN_ON(!dmabuf || !dev)) + if (WARN_ON(!dmabuf)) return ERR_PTR(-EINVAL); if (WARN_ON(importer_ops && !importer_ops->move_notify)) @@ -941,7 +941,7 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, */ if (dma_buf_attachment_is_dynamic(attach) != dma_buf_is_dynamic(dmabuf)) { - struct sg_table *sgt; + struct sg_table *sgt = NULL; dma_resv_lock(attach->dmabuf->resv, NULL); if (dma_buf_is_dynamic(attach->dmabuf)) { @@ -950,13 +950,16 @@ dma_buf_dynamic_attach(struct dma_buf *dmabuf, struct device *dev, goto err_unlock; } - sgt = __map_dma_buf(attach, DMA_BIDIRECTIONAL); - if (!sgt) - sgt = ERR_PTR(-ENOMEM); - if (IS_ERR(sgt)) { - ret = PTR_ERR(sgt); - goto err_unpin; + if (dev && dmabuf->ops->map_dma_buf) { + sgt = __map_dma_buf(attach, DMA_BIDIRECTIONAL); + if (!sgt) + sgt = ERR_PTR(-ENOMEM); + if 
(IS_ERR(sgt)) { + ret = PTR_ERR(sgt); + goto err_unpin; + } } + dma_resv_unlock(attach->dmabuf->resv); attach->sgt = sgt; attach->dir = DMA_BIDIRECTIONAL; @@ -1119,7 +1122,8 @@ struct sg_table *dma_buf_map_attachment(struct dma_buf_attachment *attach, might_sleep(); - if (WARN_ON(!attach || !attach->dmabuf)) + if (WARN_ON(!attach || !attach->dmabuf || !attach->dev || + !attach->dmabuf->ops->map_dma_buf)) return ERR_PTR(-EINVAL); dma_resv_assert_held(attach->dmabuf->resv); @@ -1195,7 +1199,8 @@ dma_buf_map_attachment_unlocked(struct dma_buf_attachment *attach, might_sleep(); - if (WARN_ON(!attach || !attach->dmabuf)) + if (WARN_ON(!attach || !attach->dmabuf || !attach->dev || + !attach->dmabuf->ops->map_dma_buf)) return ERR_PTR(-EINVAL); dma_resv_lock(attach->dmabuf->resv, NULL); @@ -1222,7 +1227,8 @@ void dma_buf_unmap_attachment(struct dma_buf_attachment *attach, { might_sleep(); - if (WARN_ON(!attach || !attach->dmabuf || !sg_table)) + if (WARN_ON(!attach || !attach->dmabuf || !attach->dev || + !attach->dmabuf->ops->unmap_dma_buf || !sg_table)) return; dma_resv_assert_held(attach->dmabuf->resv); @@ -1254,7 +1260,8 @@ void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach, { might_sleep(); - if (WARN_ON(!attach || !attach->dmabuf || !sg_table)) + if (WARN_ON(!attach || !attach->dmabuf || !attach->dev || + !attach->dmabuf->ops->unmap_dma_buf || !sg_table)) return; dma_resv_lock(attach->dmabuf->resv, NULL); @@ -1263,6 +1270,52 @@ void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach, } EXPORT_SYMBOL_NS_GPL(dma_buf_unmap_attachment_unlocked, "DMA_BUF"); +/** + * dma_buf_get_pfn_unlocked - get the pfn at a page offset of a dma-buf + * @attach: [in] attachment to get pfn from + * @pgoff: [in] page offset of the buffer against the start of dma_buf + * @pfn: [out] returns the pfn of the buffer + * @max_order: [out] returns the max mapping order of the buffer + */ +int dma_buf_get_pfn_unlocked(struct dma_buf_attachment *attach, + pgoff_t pgoff, u64 *pfn, int *max_order) +{ + 
struct dma_buf *dmabuf = attach->dmabuf; + int ret; + + if (WARN_ON(!attach || !attach->dmabuf || + !attach->dmabuf->ops->get_pfn)) + return -EINVAL; + + /* + * Open: + * + * When dma_buf is dynamic but dma_buf move is disabled, the buffer + * should be pinned before use, See dma_buf_map_attachment() for + * reference. + * + * But for now no pin is intended inside dma_buf_get_pfn(), otherwise + * need another API to unpin the dma_buf. So just fail out this case. + */ + if (dma_buf_is_dynamic(attach->dmabuf) && + !IS_ENABLED(CONFIG_DMABUF_MOVE_NOTIFY)) + return -ENOENT; + + dma_resv_lock(attach->dmabuf->resv, NULL); + ret = dmabuf->ops->get_pfn(attach, pgoff, pfn, max_order); + /* + * Open: + * + * Is dma_resv_wait_timeout() needed? I assume no. The DMA buffer + * content synchronization could be done when the buffer is to be + * mapped by importer. + */ + dma_resv_unlock(attach->dmabuf->resv); + + return ret; +} +EXPORT_SYMBOL_NS_GPL(dma_buf_get_pfn_unlocked, "DMA_BUF"); + /** * dma_buf_move_notify - notify attachments that DMA-buf is moving * @@ -1662,7 +1715,7 @@ static int dma_buf_debug_show(struct seq_file *s, void *unused) attach_count = 0; list_for_each_entry(attach_obj, &buf_obj->attachments, node) { - seq_printf(s, "\t%s\n", dev_name(attach_obj->dev)); + seq_printf(s, "\t%s\n", attach_obj->dev ? dev_name(attach_obj->dev) : NULL); attach_count++; } dma_resv_unlock(buf_obj->resv); diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h index 36216d28d8bd..b16183edfb3a 100644 --- a/include/linux/dma-buf.h +++ b/include/linux/dma-buf.h @@ -194,6 +194,17 @@ struct dma_buf_ops { * if the call would block. */ + /** + * @get_pfn: + * + * This is called by dma_buf_get_pfn(). It is used to get the pfn + * of the buffer positioned by the page offset against the start of + * the dma_buf. It can only be called if @attach has been called + * successfully. 
+ */ + int (*get_pfn)(struct dma_buf_attachment *attach, pgoff_t pgoff, + u64 *pfn, int *max_order); + /** * @release: * @@ -629,6 +640,8 @@ dma_buf_map_attachment_unlocked(struct dma_buf_attachment *attach, void dma_buf_unmap_attachment_unlocked(struct dma_buf_attachment *attach, struct sg_table *sg_table, enum dma_data_direction direction); +int dma_buf_get_pfn_unlocked(struct dma_buf_attachment *attach, + pgoff_t pgoff, u64 *pfn, int *max_order); int dma_buf_mmap(struct dma_buf *, struct vm_area_struct *, unsigned long);

From patchwork Thu May 29 05:34:45 2025
X-Patchwork-Submitter: Xu Yilun
X-Patchwork-Id: 893497
From: Xu Yilun
To: kvm@vger.kernel.org, sumit.semwal@linaro.org, christian.koenig@amd.com, pbonzini@redhat.com, seanjc@google.com, alex.williamson@redhat.com, jgg@nvidia.com, dan.j.williams@intel.com, aik@amd.com, linux-coco@lists.linux.dev
Cc: dri-devel@lists.freedesktop.org,
linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, vivek.kasireddy@intel.com, yilun.xu@intel.com, yilun.xu@linux.intel.com, linux-kernel@vger.kernel.org, lukas@wunner.de, yan.y.zhao@intel.com, daniel.vetter@ffwll.ch, leon@kernel.org, baolu.lu@linux.intel.com, zhenzhong.duan@intel.com, tao1.su@intel.com, linux-pci@vger.kernel.org, zhiw@nvidia.com, simona.vetter@ffwll.ch, shameerali.kolothum.thodi@huawei.com, aneesh.kumar@kernel.org, iommu@lists.linux.dev, kevin.tian@intel.com
Subject: [RFC PATCH 02/30] vfio: Export vfio device get and put registration helpers
Date: Thu, 29 May 2025 13:34:45 +0800
Message-Id: <20250529053513.1592088-3-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

From: Vivek Kasireddy

These helpers are useful for managing additional references taken on the device from other associated VFIO modules.
Original-patch-by: Jason Gunthorpe
Signed-off-by: Vivek Kasireddy
--- drivers/vfio/vfio_main.c | 2 ++ include/linux/vfio.h | 2 ++ 2 files changed, 4 insertions(+) diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c index 1fd261efc582..620a3ee5d04d 100644 --- a/drivers/vfio/vfio_main.c +++ b/drivers/vfio/vfio_main.c @@ -171,11 +171,13 @@ void vfio_device_put_registration(struct vfio_device *device) if (refcount_dec_and_test(&device->refcount)) complete(&device->comp); } +EXPORT_SYMBOL_GPL(vfio_device_put_registration); bool vfio_device_try_get_registration(struct vfio_device *device) { return refcount_inc_not_zero(&device->refcount); } +EXPORT_SYMBOL_GPL(vfio_device_try_get_registration); /* * VFIO driver API diff --git a/include/linux/vfio.h b/include/linux/vfio.h index 707b00772ce1..ba65bbdffd0b 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -293,6 +293,8 @@ static inline void vfio_put_device(struct vfio_device *device) int vfio_register_group_dev(struct vfio_device *device); int vfio_register_emulated_iommu_dev(struct vfio_device *device); void vfio_unregister_group_dev(struct vfio_device *device); +bool vfio_device_try_get_registration(struct vfio_device *device); +void vfio_device_put_registration(struct vfio_device *device); int vfio_assign_device_set(struct vfio_device *device, void *set_id); unsigned int vfio_device_set_open_count(struct vfio_device_set *dev_set);

From patchwork Thu May 29 05:34:46 2025
X-Patchwork-Submitter: Xu Yilun
X-Patchwork-Id: 893238
From: Xu Yilun
To: kvm@vger.kernel.org, sumit.semwal@linaro.org, christian.koenig@amd.com, pbonzini@redhat.com, seanjc@google.com, alex.williamson@redhat.com, jgg@nvidia.com, dan.j.williams@intel.com, aik@amd.com, linux-coco@lists.linux.dev
Cc: dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, vivek.kasireddy@intel.com, yilun.xu@intel.com, yilun.xu@linux.intel.com, linux-kernel@vger.kernel.org, lukas@wunner.de, yan.y.zhao@intel.com, daniel.vetter@ffwll.ch, leon@kernel.org, baolu.lu@linux.intel.com, zhenzhong.duan@intel.com, tao1.su@intel.com, linux-pci@vger.kernel.org, zhiw@nvidia.com, simona.vetter@ffwll.ch, shameerali.kolothum.thodi@huawei.com, aneesh.kumar@kernel.org, iommu@lists.linux.dev, kevin.tian@intel.com
Subject: [RFC PATCH 03/30] vfio/pci: Share the core device pointer while invoking feature functions
Date: Thu, 29 May 2025 13:34:46 +0800
Message-Id: <20250529053513.1592088-4-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

From: Vivek Kasireddy

There is no need to share the main device pointer (struct vfio_device *) with all the feature functions, as they only need
the core device pointer. Therefore, extract the core device pointer once in the caller (vfio_pci_core_ioctl_feature) and share it instead. Signed-off-by: Vivek Kasireddy --- drivers/vfio/pci/vfio_pci_core.c | 30 +++++++++++++----------------- 1 file changed, 13 insertions(+), 17 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 35f9046af315..adfcbc2231cb 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -300,11 +300,9 @@ static int vfio_pci_runtime_pm_entry(struct vfio_pci_core_device *vdev, return 0; } -static int vfio_pci_core_pm_entry(struct vfio_device *device, u32 flags, +static int vfio_pci_core_pm_entry(struct vfio_pci_core_device *vdev, u32 flags, void __user *arg, size_t argsz) { - struct vfio_pci_core_device *vdev = - container_of(device, struct vfio_pci_core_device, vdev); int ret; ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_SET, 0); @@ -321,12 +319,10 @@ static int vfio_pci_core_pm_entry(struct vfio_device *device, u32 flags, } static int vfio_pci_core_pm_entry_with_wakeup( - struct vfio_device *device, u32 flags, + struct vfio_pci_core_device *vdev, u32 flags, struct vfio_device_low_power_entry_with_wakeup __user *arg, size_t argsz) { - struct vfio_pci_core_device *vdev = - container_of(device, struct vfio_pci_core_device, vdev); struct vfio_device_low_power_entry_with_wakeup entry; struct eventfd_ctx *efdctx; int ret; @@ -377,11 +373,9 @@ static void vfio_pci_runtime_pm_exit(struct vfio_pci_core_device *vdev) up_write(&vdev->memory_lock); } -static int vfio_pci_core_pm_exit(struct vfio_device *device, u32 flags, +static int vfio_pci_core_pm_exit(struct vfio_pci_core_device *vdev, u32 flags, void __user *arg, size_t argsz) { - struct vfio_pci_core_device *vdev = - container_of(device, struct vfio_pci_core_device, vdev); int ret; ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_SET, 0); @@ -1474,11 +1468,10 @@ long vfio_pci_core_ioctl(struct 
vfio_device *core_vdev, unsigned int cmd, } EXPORT_SYMBOL_GPL(vfio_pci_core_ioctl); -static int vfio_pci_core_feature_token(struct vfio_device *device, u32 flags, - uuid_t __user *arg, size_t argsz) +static int vfio_pci_core_feature_token(struct vfio_pci_core_device *vdev, + u32 flags, uuid_t __user *arg, + size_t argsz) { - struct vfio_pci_core_device *vdev = - container_of(device, struct vfio_pci_core_device, vdev); uuid_t uuid; int ret; @@ -1505,16 +1498,19 @@ static int vfio_pci_core_feature_token(struct vfio_device *device, u32 flags, int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags, void __user *arg, size_t argsz) { + struct vfio_pci_core_device *vdev = + container_of(device, struct vfio_pci_core_device, vdev); + switch (flags & VFIO_DEVICE_FEATURE_MASK) { case VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY: - return vfio_pci_core_pm_entry(device, flags, arg, argsz); + return vfio_pci_core_pm_entry(vdev, flags, arg, argsz); case VFIO_DEVICE_FEATURE_LOW_POWER_ENTRY_WITH_WAKEUP: - return vfio_pci_core_pm_entry_with_wakeup(device, flags, + return vfio_pci_core_pm_entry_with_wakeup(vdev, flags, arg, argsz); case VFIO_DEVICE_FEATURE_LOW_POWER_EXIT: - return vfio_pci_core_pm_exit(device, flags, arg, argsz); + return vfio_pci_core_pm_exit(vdev, flags, arg, argsz); case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN: - return vfio_pci_core_feature_token(device, flags, arg, argsz); + return vfio_pci_core_feature_token(vdev, flags, arg, argsz); default: return -ENOTTY; }

From patchwork Thu May 29 05:34:47 2025
X-Patchwork-Submitter: Xu Yilun
X-Patchwork-Id: 893496
From: Xu Yilun
To: kvm@vger.kernel.org, sumit.semwal@linaro.org, christian.koenig@amd.com, pbonzini@redhat.com, seanjc@google.com, alex.williamson@redhat.com, jgg@nvidia.com, dan.j.williams@intel.com, aik@amd.com, linux-coco@lists.linux.dev
Cc: dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, vivek.kasireddy@intel.com, yilun.xu@intel.com, yilun.xu@linux.intel.com, linux-kernel@vger.kernel.org, lukas@wunner.de, yan.y.zhao@intel.com, daniel.vetter@ffwll.ch, leon@kernel.org, baolu.lu@linux.intel.com, zhenzhong.duan@intel.com, tao1.su@intel.com, linux-pci@vger.kernel.org, zhiw@nvidia.com, simona.vetter@ffwll.ch, shameerali.kolothum.thodi@huawei.com, aneesh.kumar@kernel.org, iommu@lists.linux.dev, kevin.tian@intel.com
Subject: [RFC PATCH 04/30] vfio/pci: Allow MMIO regions to be exported through dma-buf
Date: Thu, 29 May 2025 13:34:47 +0800
Message-Id: <20250529053513.1592088-5-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

From: Vivek Kasireddy

From Jason Gunthorpe: "dma-buf has become a way
to safely acquire a handle to non-struct page memory that can still have lifetime controlled by the exporter. Notably RDMA can now import dma-buf FDs and build them into MRs which allows for PCI P2P operations. Extend this to allow vfio-pci to export MMIO memory from PCI device BARs. The patch design loosely follows the pattern in commit db1a8dd916aa ("habanalabs: add support for dma-buf exporter") except this does not support pinning. Instead, this implements what, in the past, we've called a revocable attachment using move. In normal situations the attachment is pinned, as a BAR does not change physical address. However when the VFIO device is closed, or a PCI reset is issued, access to the MMIO memory is revoked. Revoked means that move occurs, but an attempt to immediately re-map the memory will fail. In the reset case a future move will be triggered when MMIO access returns. As both close and reset are under userspace control it is expected that userspace will suspend use of the dma-buf before doing these operations, the revoke is purely for kernel self-defense against a hostile userspace." 
The following enhancements are made to the original patch: - Add support for creating dmabuf from multiple areas (or ranges) Cc: Alex Williamson Cc: Simona Vetter Cc: Christian König Original-patch-by: Jason Gunthorpe Signed-off-by: Vivek Kasireddy --- drivers/vfio/pci/Makefile | 1 + drivers/vfio/pci/vfio_pci_config.c | 22 +- drivers/vfio/pci/vfio_pci_core.c | 20 +- drivers/vfio/pci/vfio_pci_dmabuf.c | 359 +++++++++++++++++++++++++++++ drivers/vfio/pci/vfio_pci_priv.h | 23 ++ include/linux/vfio_pci_core.h | 1 + include/uapi/linux/vfio.h | 25 ++ 7 files changed, 446 insertions(+), 5 deletions(-) create mode 100644 drivers/vfio/pci/vfio_pci_dmabuf.c diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile index cf00c0a7e55c..c33ec0cbe930 100644 --- a/drivers/vfio/pci/Makefile +++ b/drivers/vfio/pci/Makefile @@ -2,6 +2,7 @@ vfio-pci-core-y := vfio_pci_core.o vfio_pci_intrs.o vfio_pci_rdwr.o vfio_pci_config.o vfio-pci-core-$(CONFIG_VFIO_PCI_ZDEV_KVM) += vfio_pci_zdev.o +vfio-pci-core-$(CONFIG_DMA_SHARED_BUFFER) += vfio_pci_dmabuf.o obj-$(CONFIG_VFIO_PCI_CORE) += vfio-pci-core.o vfio-pci-y := vfio_pci.o diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c index 14437396d721..efccbb2d2a42 100644 --- a/drivers/vfio/pci/vfio_pci_config.c +++ b/drivers/vfio/pci/vfio_pci_config.c @@ -589,10 +589,12 @@ static int vfio_basic_config_write(struct vfio_pci_core_device *vdev, int pos, virt_mem = !!(le16_to_cpu(*virt_cmd) & PCI_COMMAND_MEMORY); new_mem = !!(new_cmd & PCI_COMMAND_MEMORY); - if (!new_mem) + if (!new_mem) { vfio_pci_zap_and_down_write_memory_lock(vdev); - else + vfio_pci_dma_buf_move(vdev, true); + } else { down_write(&vdev->memory_lock); + } /* * If the user is writing mem/io enable (new_mem/io) and we @@ -627,6 +629,8 @@ static int vfio_basic_config_write(struct vfio_pci_core_device *vdev, int pos, *virt_cmd &= cpu_to_le16(~mask); *virt_cmd |= cpu_to_le16(new_cmd & mask); + if (__vfio_pci_memory_enabled(vdev)) + 
vfio_pci_dma_buf_move(vdev, false); up_write(&vdev->memory_lock); } @@ -707,12 +711,16 @@ static int __init init_pci_cap_basic_perm(struct perm_bits *perm) static void vfio_lock_and_set_power_state(struct vfio_pci_core_device *vdev, pci_power_t state) { - if (state >= PCI_D3hot) + if (state >= PCI_D3hot) { vfio_pci_zap_and_down_write_memory_lock(vdev); - else + vfio_pci_dma_buf_move(vdev, true); + } else { down_write(&vdev->memory_lock); + } vfio_pci_set_power_state(vdev, state); + if (__vfio_pci_memory_enabled(vdev)) + vfio_pci_dma_buf_move(vdev, false); up_write(&vdev->memory_lock); } @@ -900,7 +908,10 @@ static int vfio_exp_config_write(struct vfio_pci_core_device *vdev, int pos, if (!ret && (cap & PCI_EXP_DEVCAP_FLR)) { vfio_pci_zap_and_down_write_memory_lock(vdev); + vfio_pci_dma_buf_move(vdev, true); pci_try_reset_function(vdev->pdev); + if (__vfio_pci_memory_enabled(vdev)) + vfio_pci_dma_buf_move(vdev, true); up_write(&vdev->memory_lock); } } @@ -982,7 +993,10 @@ static int vfio_af_config_write(struct vfio_pci_core_device *vdev, int pos, if (!ret && (cap & PCI_AF_CAP_FLR) && (cap & PCI_AF_CAP_TP)) { vfio_pci_zap_and_down_write_memory_lock(vdev); + vfio_pci_dma_buf_move(vdev, true); pci_try_reset_function(vdev->pdev); + if (__vfio_pci_memory_enabled(vdev)) + vfio_pci_dma_buf_move(vdev, true); up_write(&vdev->memory_lock); } } diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index adfcbc2231cb..116964057b0b 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -287,6 +287,8 @@ static int vfio_pci_runtime_pm_entry(struct vfio_pci_core_device *vdev, * semaphore. 
*/ vfio_pci_zap_and_down_write_memory_lock(vdev); + vfio_pci_dma_buf_move(vdev, true); + if (vdev->pm_runtime_engaged) { up_write(&vdev->memory_lock); return -EINVAL; @@ -370,6 +372,8 @@ static void vfio_pci_runtime_pm_exit(struct vfio_pci_core_device *vdev) */ down_write(&vdev->memory_lock); __vfio_pci_runtime_pm_exit(vdev); + if (__vfio_pci_memory_enabled(vdev)) + vfio_pci_dma_buf_move(vdev, false); up_write(&vdev->memory_lock); } @@ -690,6 +694,8 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev) #endif vfio_pci_core_disable(vdev); + vfio_pci_dma_buf_cleanup(vdev); + mutex_lock(&vdev->igate); if (vdev->err_trigger) { eventfd_ctx_put(vdev->err_trigger); @@ -1222,7 +1228,10 @@ static int vfio_pci_ioctl_reset(struct vfio_pci_core_device *vdev, */ vfio_pci_set_power_state(vdev, PCI_D0); + vfio_pci_dma_buf_move(vdev, true); ret = pci_try_reset_function(vdev->pdev); + if (__vfio_pci_memory_enabled(vdev)) + vfio_pci_dma_buf_move(vdev, false); up_write(&vdev->memory_lock); return ret; @@ -1511,6 +1520,8 @@ int vfio_pci_core_ioctl_feature(struct vfio_device *device, u32 flags, return vfio_pci_core_pm_exit(vdev, flags, arg, argsz); case VFIO_DEVICE_FEATURE_PCI_VF_TOKEN: return vfio_pci_core_feature_token(vdev, flags, arg, argsz); + case VFIO_DEVICE_FEATURE_DMA_BUF: + return vfio_pci_core_feature_dma_buf(vdev, flags, arg, argsz); default: return -ENOTTY; } @@ -2087,6 +2098,7 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev) INIT_LIST_HEAD(&vdev->dummy_resources_list); INIT_LIST_HEAD(&vdev->ioeventfds_list); INIT_LIST_HEAD(&vdev->sriov_pfs_item); + INIT_LIST_HEAD(&vdev->dmabufs); init_rwsem(&vdev->memory_lock); xa_init(&vdev->ctx); @@ -2469,11 +2481,17 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, * cause the PCI config space reset without restoring the original * state (saved locally in 'vdev->pm_save'). 
*/ - list_for_each_entry(vdev, &dev_set->device_list, vdev.dev_set_list) + list_for_each_entry(vdev, &dev_set->device_list, vdev.dev_set_list) { + vfio_pci_dma_buf_move(vdev, true); vfio_pci_set_power_state(vdev, PCI_D0); + } ret = pci_reset_bus(pdev); + list_for_each_entry(vdev, &dev_set->device_list, vdev.dev_set_list) + if (__vfio_pci_memory_enabled(vdev)) + vfio_pci_dma_buf_move(vdev, false); + vdev = list_last_entry(&dev_set->device_list, struct vfio_pci_core_device, vdev.dev_set_list); diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c new file mode 100644 index 000000000000..a4c313ca5bda --- /dev/null +++ b/drivers/vfio/pci/vfio_pci_dmabuf.c @@ -0,0 +1,359 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* Copyright (c) 2025, NVIDIA CORPORATION & AFFILIATES. + */ +#include <linux/dma-buf.h> +#include <linux/pci-p2pdma.h> +#include <linux/dma-resv.h> + +#include "vfio_pci_priv.h" +MODULE_IMPORT_NS("DMA_BUF"); + +struct vfio_pci_dma_buf { + struct dma_buf *dmabuf; + struct vfio_pci_core_device *vdev; + struct list_head dmabufs_elm; + unsigned int nr_ranges; + struct vfio_region_dma_range *dma_ranges; + unsigned int orig_nents; + bool revoked; +}; + +static int vfio_pci_dma_buf_attach(struct dma_buf *dmabuf, + struct dma_buf_attachment *attachment) +{ + struct vfio_pci_dma_buf *priv = dmabuf->priv; + int rc; + + rc = pci_p2pdma_distance_many(priv->vdev->pdev, &attachment->dev, 1, + true); + if (rc < 0) + attachment->peer2peer = false; + return 0; +} + +static void vfio_pci_dma_buf_unpin(struct dma_buf_attachment *attachment) +{ +} + +static int vfio_pci_dma_buf_pin(struct dma_buf_attachment *attachment) +{ + /* + * Uses the dynamic interface but must always allow for + * dma_buf_move_notify() to do revoke + */ + return -EINVAL; +} + +static int populate_sgt(struct dma_buf_attachment *attachment, + enum dma_data_direction dir, + struct sg_table *sgt, size_t sgl_size) +{ + struct vfio_pci_dma_buf *priv = attachment->dmabuf->priv; + struct vfio_region_dma_range *dma_ranges =
priv->dma_ranges; + size_t offset, chunk_size; + struct scatterlist *sgl; + dma_addr_t dma_addr; + phys_addr_t phys; + int i, j, ret; + + for_each_sgtable_sg(sgt, sgl, j) + sgl->length = 0; + + sgl = sgt->sgl; + for (i = 0; i < priv->nr_ranges; i++) { + phys = pci_resource_start(priv->vdev->pdev, + dma_ranges[i].region_index); + phys += dma_ranges[i].offset; + + /* + * Break the BAR's physical range up into max sized SGL's + * according to the device's requirement. + */ + for (offset = 0; offset != dma_ranges[i].length;) { + chunk_size = min(dma_ranges[i].length - offset, + sgl_size); + + /* + * Since the memory being mapped is a device memory + * it could never be in CPU caches. + */ + dma_addr = dma_map_resource(attachment->dev, + phys + offset, + chunk_size, dir, + DMA_ATTR_SKIP_CPU_SYNC); + ret = dma_mapping_error(attachment->dev, dma_addr); + if (ret) + goto err; + + sg_set_page(sgl, NULL, chunk_size, 0); + sg_dma_address(sgl) = dma_addr; + sg_dma_len(sgl) = chunk_size; + sgl = sg_next(sgl); + offset += chunk_size; + } + } + + return 0; +err: + for_each_sgtable_sg(sgt, sgl, j) { + if (!sg_dma_len(sgl)) + continue; + + dma_unmap_resource(attachment->dev, sg_dma_address(sgl), + sg_dma_len(sgl), + dir, DMA_ATTR_SKIP_CPU_SYNC); + } + + return ret; +} + +static struct sg_table * +vfio_pci_dma_buf_map(struct dma_buf_attachment *attachment, + enum dma_data_direction dir) +{ + size_t sgl_size = dma_get_max_seg_size(attachment->dev); + struct vfio_pci_dma_buf *priv = attachment->dmabuf->priv; + struct sg_table *sgt; + unsigned int nents; + int ret; + + dma_resv_assert_held(priv->dmabuf->resv); + + if (!attachment->peer2peer) + return ERR_PTR(-EPERM); + + if (priv->revoked) + return ERR_PTR(-ENODEV); + + sgt = kzalloc(sizeof(*sgt), GFP_KERNEL); + if (!sgt) + return ERR_PTR(-ENOMEM); + + nents = DIV_ROUND_UP(priv->dmabuf->size, sgl_size); + ret = sg_alloc_table(sgt, nents, GFP_KERNEL); + if (ret) + goto err_kfree_sgt; + + ret = populate_sgt(attachment, dir, sgt, 
sgl_size); + if (ret) + goto err_free_sgt; + + /* + * Because we are not going to include a CPU list we want to have some + * chance that other users will detect this by setting the orig_nents to + * 0 and using only nents (length of DMA list) when going over the sgl + */ + priv->orig_nents = sgt->orig_nents; + sgt->orig_nents = 0; + return sgt; + +err_free_sgt: + sg_free_table(sgt); +err_kfree_sgt: + kfree(sgt); + return ERR_PTR(ret); +} + +static void vfio_pci_dma_buf_unmap(struct dma_buf_attachment *attachment, + struct sg_table *sgt, + enum dma_data_direction dir) +{ + struct vfio_pci_dma_buf *priv = attachment->dmabuf->priv; + struct scatterlist *sgl; + int i; + + for_each_sgtable_dma_sg(sgt, sgl, i) + dma_unmap_resource(attachment->dev, + sg_dma_address(sgl), + sg_dma_len(sgl), + dir, DMA_ATTR_SKIP_CPU_SYNC); + + sgt->orig_nents = priv->orig_nents; + sg_free_table(sgt); + kfree(sgt); +} + +static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf) +{ + struct vfio_pci_dma_buf *priv = dmabuf->priv; + + /* + * Either this or vfio_pci_dma_buf_cleanup() will remove from the list. + * The refcount prevents both. + */ + if (priv->vdev) { + down_write(&priv->vdev->memory_lock); + list_del_init(&priv->dmabufs_elm); + up_write(&priv->vdev->memory_lock); + vfio_device_put_registration(&priv->vdev->vdev); + } + kfree(priv); +} + +static const struct dma_buf_ops vfio_pci_dmabuf_ops = { + .attach = vfio_pci_dma_buf_attach, + .map_dma_buf = vfio_pci_dma_buf_map, + .pin = vfio_pci_dma_buf_pin, + .unpin = vfio_pci_dma_buf_unpin, + .release = vfio_pci_dma_buf_release, + .unmap_dma_buf = vfio_pci_dma_buf_unmap, +}; + +static int check_dma_ranges(struct vfio_pci_dma_buf *priv, + uint64_t *dmabuf_size) +{ + struct vfio_region_dma_range *dma_ranges = priv->dma_ranges; + struct pci_dev *pdev = priv->vdev->pdev; + resource_size_t bar_size; + int i; + + for (i = 0; i < priv->nr_ranges; i++) { + /* + * For PCI the region_index is the BAR number like + * everything else. 
+ */ + if (dma_ranges[i].region_index >= VFIO_PCI_ROM_REGION_INDEX) + return -EINVAL; + + if (!PAGE_ALIGNED(dma_ranges[i].offset) || + !PAGE_ALIGNED(dma_ranges[i].length)) + return -EINVAL; + + bar_size = pci_resource_len(pdev, dma_ranges[i].region_index); + if (dma_ranges[i].offset > bar_size || + dma_ranges[i].offset + dma_ranges[i].length > bar_size) + return -EINVAL; + + *dmabuf_size += dma_ranges[i].length; + } + + return 0; +} + +int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, + struct vfio_device_feature_dma_buf __user *arg, + size_t argsz) +{ + struct vfio_device_feature_dma_buf get_dma_buf; + struct vfio_region_dma_range *dma_ranges; + DEFINE_DMA_BUF_EXPORT_INFO(exp_info); + struct vfio_pci_dma_buf *priv; + uint64_t dmabuf_size = 0; + int ret; + + ret = vfio_check_feature(flags, argsz, VFIO_DEVICE_FEATURE_GET, + sizeof(get_dma_buf)); + if (ret != 1) + return ret; + + if (copy_from_user(&get_dma_buf, arg, sizeof(get_dma_buf))) + return -EFAULT; + + dma_ranges = memdup_array_user(&arg->dma_ranges, + get_dma_buf.nr_ranges, + sizeof(*dma_ranges)); + if (IS_ERR(dma_ranges)) + return PTR_ERR(dma_ranges); + + priv = kzalloc(sizeof(*priv), GFP_KERNEL); + if (!priv) { + kfree(dma_ranges); + return -ENOMEM; + } + + priv->vdev = vdev; + priv->nr_ranges = get_dma_buf.nr_ranges; + priv->dma_ranges = dma_ranges; + + ret = check_dma_ranges(priv, &dmabuf_size); + if (ret) + goto err_free_priv; + + if (!vfio_device_try_get_registration(&vdev->vdev)) { + ret = -ENODEV; + goto err_free_priv; + } + + exp_info.ops = &vfio_pci_dmabuf_ops; + exp_info.size = dmabuf_size; + exp_info.flags = get_dma_buf.open_flags; + exp_info.priv = priv; + + priv->dmabuf = dma_buf_export(&exp_info); + if (IS_ERR(priv->dmabuf)) { + ret = PTR_ERR(priv->dmabuf); + goto err_dev_put; + } + + /* dma_buf_put() now frees priv */ + INIT_LIST_HEAD(&priv->dmabufs_elm); + down_write(&vdev->memory_lock); + dma_resv_lock(priv->dmabuf->resv, NULL); + priv->revoked = 
!__vfio_pci_memory_enabled(vdev); + list_add_tail(&priv->dmabufs_elm, &vdev->dmabufs); + dma_resv_unlock(priv->dmabuf->resv); + up_write(&vdev->memory_lock); + + /* + * dma_buf_fd() consumes the reference, when the file closes the dmabuf + * will be released. + */ + return dma_buf_fd(priv->dmabuf, get_dma_buf.open_flags); + +err_dev_put: + vfio_device_put_registration(&vdev->vdev); +err_free_priv: + kfree(dma_ranges); + kfree(priv); + return ret; +} + +void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked) +{ + struct vfio_pci_dma_buf *priv; + struct vfio_pci_dma_buf *tmp; + + lockdep_assert_held_write(&vdev->memory_lock); + + list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm) { + /* + * Returns true if a reference was successfully obtained. + * The caller must interlock with the dmabuf's release + * function in some way, such as RCU, to ensure that this + * is not called on freed memory. + */ + if (!get_file_rcu(&priv->dmabuf->file)) + continue; + + if (priv->revoked != revoked) { + dma_resv_lock(priv->dmabuf->resv, NULL); + priv->revoked = revoked; + dma_buf_move_notify(priv->dmabuf); + dma_resv_unlock(priv->dmabuf->resv); + } + dma_buf_put(priv->dmabuf); + } +} + +void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev) +{ + struct vfio_pci_dma_buf *priv; + struct vfio_pci_dma_buf *tmp; + + down_write(&vdev->memory_lock); + list_for_each_entry_safe(priv, tmp, &vdev->dmabufs, dmabufs_elm) { + if (!get_file_rcu(&priv->dmabuf->file)) + continue; + + dma_resv_lock(priv->dmabuf->resv, NULL); + list_del_init(&priv->dmabufs_elm); + priv->vdev = NULL; + priv->revoked = true; + dma_buf_move_notify(priv->dmabuf); + dma_resv_unlock(priv->dmabuf->resv); + vfio_device_put_registration(&vdev->vdev); + dma_buf_put(priv->dmabuf); + } + up_write(&vdev->memory_lock); +} diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h index a9972eacb293..6f3e8eafdc35 100644 --- a/drivers/vfio/pci/vfio_pci_priv.h +++ 
b/drivers/vfio/pci/vfio_pci_priv.h @@ -107,4 +107,27 @@ static inline bool vfio_pci_is_vga(struct pci_dev *pdev) return (pdev->class >> 8) == PCI_CLASS_DISPLAY_VGA; } +#ifdef CONFIG_DMA_SHARED_BUFFER +int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, + struct vfio_device_feature_dma_buf __user *arg, + size_t argsz); +void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev); +void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, bool revoked); +#else +static int +vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags, + struct vfio_device_feature_dma_buf __user *arg, + size_t argsz) +{ + return -ENOTTY; +} +static inline void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev) +{ +} +static inline void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, + bool revoked) +{ +} +#endif + #endif diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index fbb472dd99b3..da5d8955ae56 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -94,6 +94,7 @@ struct vfio_pci_core_device { struct vfio_pci_core_device *sriov_pf_core_dev; struct notifier_block nb; struct rw_semaphore memory_lock; + struct list_head dmabufs; }; /* Will be exported for vfio pci drivers usage */ diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 5764f315137f..9445fa36efd3 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -1468,6 +1468,31 @@ struct vfio_device_feature_bus_master { }; #define VFIO_DEVICE_FEATURE_BUS_MASTER 10 +/** + * Upon VFIO_DEVICE_FEATURE_GET create a dma_buf fd for the + * regions selected. + * + * open_flags are the typical flags passed to open(2), eg O_RDWR, O_CLOEXEC, + * etc. offset/length specify a slice of the region to create the dmabuf from. + * nr_ranges is the total number of (P2P DMA) ranges that comprise the dmabuf. + * + * Return: The fd number on success, -1 and errno is set on failure. 
+ */
+#define VFIO_DEVICE_FEATURE_DMA_BUF 11
+
+struct vfio_region_dma_range {
+	__u32 region_index;
+	__u32 __pad;
+	__u64 offset;
+	__u64 length;
+};
+
+struct vfio_device_feature_dma_buf {
+	__u32 open_flags;
+	__u32 nr_ranges;
+	struct vfio_region_dma_range dma_ranges[];
+};
+
 /* -------- API for Type1 VFIO IOMMU -------- */

 /**

From patchwork Thu May 29 05:34:48 2025
X-Patchwork-Submitter: Xu Yilun
X-Patchwork-Id: 893237
From: Xu Yilun
Subject: [RFC PATCH 05/30] fixup! vfio/pci: fix dma-buf revoke typo on reset
Date: Thu, 29 May 2025 13:34:48 +0800
Message-Id: <20250529053513.1592088-6-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

Fixed the patch:
  vfio/pci: Allow MMIO regions to be exported through dma-buf

Signed-off-by: Xu Yilun
---
 drivers/vfio/pci/vfio_pci_config.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index efccbb2d2a42..7ac062bd5044 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -911,7 +911,7 @@ static int vfio_exp_config_write(struct vfio_pci_core_device *vdev, int pos,
 		vfio_pci_dma_buf_move(vdev, true);
 		pci_try_reset_function(vdev->pdev);
 		if (__vfio_pci_memory_enabled(vdev))
-			vfio_pci_dma_buf_move(vdev, true);
+			vfio_pci_dma_buf_move(vdev, false);
 		up_write(&vdev->memory_lock);
 	}
 }
@@ -996,7 +996,7 @@ static int vfio_af_config_write(struct vfio_pci_core_device *vdev, int pos,
 		vfio_pci_dma_buf_move(vdev, true);
 		pci_try_reset_function(vdev->pdev);
 		if (__vfio_pci_memory_enabled(vdev))
-			vfio_pci_dma_buf_move(vdev, true);
+			vfio_pci_dma_buf_move(vdev, false);
 		up_write(&vdev->memory_lock);
 	}
 }

From patchwork Thu May 29 05:34:49 2025
X-Patchwork-Submitter: Xu Yilun
X-Patchwork-Id: 893495
From: Xu Yilun
Subject: [RFC PATCH 06/30] HACK: vfio/pci: Support get_pfn() callback for dma-buf
Date: Thu, 29 May 2025 13:34:49 +0800
Message-Id: <20250529053513.1592088-7-yilun.xu@linux.intel.com>
In-Reply-To:
<20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

This is to support private device/MMIO assignment, but it is an incomplete
implementation as discussed. In this case, VFIO PCI acts as the exporter for
MMIO regions and KVM is the importer. KVM imports the dma-buf FD and gets the
MMIO pfn through dma_buf_ops.get_pfn(), then maps the pfn in the KVM MMU. KVM
should also react to dma-buf move notify and unmap all pfns when VFIO revokes
the MMIOs, i.e. VFIO controls the lifetime of the MMIOs.

Previously, KVM used follow_pfn() to get the MMIO pfn. With dma-buf, KVM no
longer needs to first map the MMIOs into the host page table. This also
resolves the concern in Confidential Computing (CC) that the host is not
allowed to have mappings to private resources owned by the guest.

Signed-off-by: Xu Yilun
---
 drivers/vfio/pci/vfio_pci_dmabuf.c | 34 ++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index a4c313ca5bda..cf9a90448856 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -174,6 +174,39 @@ static void vfio_pci_dma_buf_unmap(struct dma_buf_attachment *attachment,
 	kfree(sgt);
 }
 
+static int vfio_pci_dma_buf_get_pfn(struct dma_buf_attachment *attachment,
+				    pgoff_t pgoff, u64 *pfn, int *max_order)
+{
+	struct vfio_pci_dma_buf *priv = attachment->dmabuf->priv;
+	struct vfio_region_dma_range *dma_ranges = priv->dma_ranges;
+	u64 offset = pgoff << PAGE_SHIFT;
+	int i;
+
+	dma_resv_assert_held(priv->dmabuf->resv);
+
+	if (priv->revoked)
+		return -ENODEV;
+
+	if (offset >= priv->dmabuf->size)
+		return -EINVAL;
+
+	for (i = 0; i < priv->nr_ranges; i++) {
+		if (offset < dma_ranges[i].length)
+			break;
+
+		offset -= dma_ranges[i].length;
+	}
+
+	*pfn =
PHYS_PFN(pci_resource_start(priv->vdev->pdev, dma_ranges[i].region_index) +
+		 dma_ranges[i].offset + offset);
+
+	/* TODO: large page mapping is yet to be supported */
+	if (max_order)
+		*max_order = 0;
+
+	return 0;
+}
+
 static void vfio_pci_dma_buf_release(struct dma_buf *dmabuf)
 {
 	struct vfio_pci_dma_buf *priv = dmabuf->priv;
@@ -198,6 +231,7 @@ static const struct dma_buf_ops vfio_pci_dmabuf_ops = {
 	.unpin = vfio_pci_dma_buf_unpin,
 	.release = vfio_pci_dma_buf_release,
 	.unmap_dma_buf = vfio_pci_dma_buf_unmap,
+	.get_pfn = vfio_pci_dma_buf_get_pfn,
 };
 
 static int check_dma_ranges(struct vfio_pci_dma_buf *priv,

From patchwork Thu May 29 05:34:50 2025
X-Patchwork-Submitter: Xu Yilun
X-Patchwork-Id: 893236
From: Xu Yilun
Subject: [RFC PATCH 07/30] KVM: Support vfio_dmabuf backed MMIO region
Date: Thu, 29 May 2025 13:34:50 +0800
Message-Id: <20250529053513.1592088-8-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

Extend KVM_SET_USER_MEMORY_REGION2 to support mapping a vfio_dmabuf backed
MMIO region into a guest. The main purpose of this change is for KVM to map
MMIO resources without first mapping them into the host, similar to what is
done in guest_memfd.

The immediate use case is for CoCo VMs to support private MMIO. Like private
guest memory, private MMIO is not intended to be accessed by the host. Host
access to private MMIO would be rejected by private devices (known as TDIs in
the TDISP spec) and would cause the TDI to exit the secure state. The further
impact on the system may vary according to the device implementation. The
TDISP spec doesn't mandate any error reporting or logging; the TLP may be
handled as an Unsupported Request, or just be dropped. In my test environment,
an AER NonFatalErr is reported with no further impact. So from the HW
perspective, disallowing host access to private MMIO is not that critical,
but nice to have.
But sticking to finding the pfn via the userspace mapping, while allowing that pfn to be privately mapped, conflicts with the private mapping concept, and it effectively allows userspace to map any address as private. Before fault-in, KVM cannot distinguish whether a userspace address is for private MMIO and safe for host access. Relying on the userspace mapping also means the private MMIO mapping would have to follow userspace mapping changes via mmu_notifier. This conflicts with the current design that mmu_notifier never impacts private mappings. It also makes no sense to support mmu_notifier just for private MMIO: the private MMIO mapping should be fixed once the CoCo-VM accepts the private MMIO, and any later mapping change without guest permission should be invalid. So the choice here is to eliminate the userspace mapping and switch to FD based MMIO resources.

There is still a need to switch the memory attribute (shared <-> private) for private MMIO when the guest switches the device attribute between shared & private. Unlike memory, an MMIO region has only one physical backend, so this is a bit like in-place conversion, which for private memory requires much effort on invalidating the user mapping when converting to private. But for MMIO, the VMM is expected to never need access to assigned MMIO for feature emulation, so always disallow userspace MMIO mapping and use FD based MMIO resources for 'private capable' MMIO regions.

dma-buf is chosen as the FD based backend. It meets KVM's need to acquire non-struct-page memory whose lifetime is still controlled by VFIO, and it provides the option to disallow userspace mmap as long as the exporter doesn't provide a dma_buf_ops.mmap() callback. The concern is that dma-buf currently only supports mapping into a device's default_domain via the DMA APIs. There are some clues about extending the dma-buf APIs for subsystems like IOMMUFD [1] or KVM; the addition of dma_buf_get_pfn_unlocked() in this series is for this purpose. An alternative is for VFIO to provide a dedicated FD for KVM.
But considering that IOMMUFD may use dma-buf for MMIO mapping [2], it is better to have a unified export mechanism in VFIO for the same purpose.

Open: The dmabuf fd parameter is currently stored in kvm_userspace_memory_region2::guest_memfd. This may be confusing, but it avoids introducing another API format for IOCTL(KVM_SET_USER_MEMORY_REGION3).

[1] https://lore.kernel.org/all/YwywgciH6BiWz4H1@nvidia.com/
[2] https://lore.kernel.org/kvm/14-v4-0de2f6c78ed0+9d1-iommufd_jgg@nvidia.com/

Signed-off-by: Xu Yilun
---
 Documentation/virt/kvm/api.rst |   7 ++
 include/linux/kvm_host.h       |  18 +++++
 include/uapi/linux/kvm.h       |   1 +
 virt/kvm/Kconfig               |   6 ++
 virt/kvm/Makefile.kvm          |   1 +
 virt/kvm/kvm_main.c            |  32 +++++++-
 virt/kvm/kvm_mm.h              |  19 +++++
 virt/kvm/vfio_dmabuf.c         | 125 +++++++++++++++++++++++++++++++++
 8 files changed, 205 insertions(+), 4 deletions(-)
 create mode 100644 virt/kvm/vfio_dmabuf.c

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 47c7c3f92314..2962b0e30f81 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6307,6 +6307,13 @@ state. At VM creation time, all memory is shared, i.e. the PRIVATE attribute
 is '0' for all gfns. Userspace can control whether memory is shared/private by
 toggling KVM_MEMORY_ATTRIBUTE_PRIVATE via KVM_SET_MEMORY_ATTRIBUTES as needed.
 
+Userspace can set KVM_MEM_VFIO_DMABUF in flags to indicate the memory region is
+backed by a userspace unmappable dma_buf exported by VFIO. The backend resource
+is one piece of MMIO region of the device. The slot is unmappable so it is
+allowed to be converted to private. KVM binds the memory region to a given
+dma_buf fd range of [0, memory_size].
+For now, the dma_buf fd is filled in the 'guest_memfd' field, and the
+guest_memfd_offset must be 0.
+
 S390:
 ^^^^^

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 291d49b9bf05..d16f47c3d008 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -612,6 +612,10 @@ struct kvm_memory_slot {
 		pgoff_t pgoff;
 	} gmem;
 #endif
+
+#ifdef CONFIG_KVM_VFIO_DMABUF
+	struct dma_buf_attachment *dmabuf_attach;
+#endif
 };
 
 static inline bool kvm_slot_can_be_private(const struct kvm_memory_slot *slot)
@@ -2571,4 +2575,18 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 				    struct kvm_pre_fault_memory *range);
 #endif
 
+#ifdef CONFIG_KVM_VFIO_DMABUF
+int kvm_vfio_dmabuf_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+			    gfn_t gfn, kvm_pfn_t *pfn, int *max_order);
+#else
+static inline int kvm_vfio_dmabuf_get_pfn(struct kvm *kvm,
+					  struct kvm_memory_slot *slot,
+					  gfn_t gfn, kvm_pfn_t *pfn,
+					  int *max_order)
+{
+	KVM_BUG_ON(1, kvm);
+	return -EIO;
+}
+#endif
+
 #endif

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b6ae8ad8934b..a4e05fe46918 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -51,6 +51,7 @@ struct kvm_userspace_memory_region2 {
 #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
 #define KVM_MEM_READONLY	(1UL << 1)
 #define KVM_MEM_GUEST_MEMFD	(1UL << 2)
+#define KVM_MEM_VFIO_DMABUF	(1UL << 3)
 
 /* for KVM_IRQ_LINE */
 struct kvm_irq_level {

diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 727b542074e7..9e6832dfa297 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -119,6 +119,7 @@ config KVM_PRIVATE_MEM
 config KVM_GENERIC_PRIVATE_MEM
 	select KVM_GENERIC_MEMORY_ATTRIBUTES
 	select KVM_PRIVATE_MEM
+	select KVM_VFIO_DMABUF
 	bool
 
 config HAVE_KVM_ARCH_GMEM_PREPARE
@@ -128,3 +129,8 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
 config HAVE_KVM_ARCH_GMEM_INVALIDATE
 	bool
 	depends on KVM_PRIVATE_MEM
+
+config KVM_VFIO_DMABUF
+	bool
+	select DMA_SHARED_BUFFER
+	select DMABUF_MOVE_NOTIFY

diff --git a/virt/kvm/Makefile.kvm b/virt/kvm/Makefile.kvm
index 724c89af78af..c08e98f13f65 100644
--- a/virt/kvm/Makefile.kvm
+++ b/virt/kvm/Makefile.kvm
@@ -13,3 +13,4 @@ kvm-$(CONFIG_HAVE_KVM_IRQ_ROUTING) += $(KVM)/irqchip.o
 kvm-$(CONFIG_HAVE_KVM_DIRTY_RING) += $(KVM)/dirty_ring.o
 kvm-$(CONFIG_HAVE_KVM_PFNCACHE) += $(KVM)/pfncache.o
 kvm-$(CONFIG_KVM_PRIVATE_MEM) += $(KVM)/guest_memfd.o
+kvm-$(CONFIG_KVM_VFIO_DMABUF) += $(KVM)/vfio_dmabuf.o

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index e85b33a92624..f2ee111038ef 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -957,6 +957,8 @@ static void kvm_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot)
 {
 	if (slot->flags & KVM_MEM_GUEST_MEMFD)
 		kvm_gmem_unbind(slot);
+	else if (slot->flags & KVM_MEM_VFIO_DMABUF)
+		kvm_vfio_dmabuf_unbind(slot);
 
 	kvm_destroy_dirty_bitmap(slot);
@@ -1529,13 +1531,19 @@ static void kvm_replace_memslot(struct kvm *kvm,
 static int check_memory_region_flags(struct kvm *kvm,
 				     const struct kvm_userspace_memory_region2 *mem)
 {
+	u32 private_mask = KVM_MEM_GUEST_MEMFD | KVM_MEM_VFIO_DMABUF;
+	u32 private_flag = mem->flags & private_mask;
 	u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES;
 
+	/* private flags are mutually exclusive. */
+	if (private_flag & (private_flag - 1))
+		return -EINVAL;
+
 	if (kvm_arch_has_private_mem(kvm))
-		valid_flags |= KVM_MEM_GUEST_MEMFD;
+		valid_flags |= private_flag;
 
 	/* Dirty logging private memory is not currently supported. */
-	if (mem->flags & KVM_MEM_GUEST_MEMFD)
+	if (private_flag)
 		valid_flags &= ~KVM_MEM_LOG_DIRTY_PAGES;
 
 	/*
@@ -1543,8 +1551,7 @@ static int check_memory_region_flags(struct kvm *kvm,
 	 * read-only memslots have emulated MMIO, not page fault, semantics,
 	 * and KVM doesn't allow emulated MMIO for private memory.
 	 */
-	if (kvm_arch_has_readonly_mem(kvm) &&
-	    !(mem->flags & KVM_MEM_GUEST_MEMFD))
+	if (kvm_arch_has_readonly_mem(kvm) && !private_flag)
 		valid_flags |= KVM_MEM_READONLY;
 
 	if (mem->flags & ~valid_flags)
@@ -2049,6 +2056,21 @@ static int kvm_set_memory_region(struct kvm *kvm,
 		r = kvm_gmem_bind(kvm, new, mem->guest_memfd, mem->guest_memfd_offset);
 		if (r)
 			goto out;
+	} else if (mem->flags & KVM_MEM_VFIO_DMABUF) {
+		if (mem->guest_memfd_offset) {
+			r = -EINVAL;
+			goto out;
+		}
+
+		/*
+		 * Open: It may be confusing to store the dmabuf fd parameter
+		 * in kvm_userspace_memory_region2::guest_memfd. But this
+		 * avoids introducing another format for
+		 * IOCTL(KVM_SET_USER_MEMORY_REGIONX).
+		 */
+		r = kvm_vfio_dmabuf_bind(kvm, new, mem->guest_memfd);
+		if (r)
+			goto out;
 	}
 
 	r = kvm_set_memslot(kvm, old, new, change);
@@ -2060,6 +2082,8 @@ static int kvm_set_memory_region(struct kvm *kvm,
 out_unbind:
 	if (mem->flags & KVM_MEM_GUEST_MEMFD)
 		kvm_gmem_unbind(new);
+	else if (mem->flags & KVM_MEM_VFIO_DMABUF)
+		kvm_vfio_dmabuf_unbind(new);
 out:
 	kfree(new);
 	return r;

diff --git a/virt/kvm/kvm_mm.h b/virt/kvm/kvm_mm.h
index acef3f5c582a..faefc252c337 100644
--- a/virt/kvm/kvm_mm.h
+++ b/virt/kvm/kvm_mm.h
@@ -93,4 +93,23 @@ static inline void kvm_gmem_unbind(struct kvm_memory_slot *slot)
 }
 #endif /* CONFIG_KVM_PRIVATE_MEM */
 
+#ifdef CONFIG_KVM_VFIO_DMABUF
+int kvm_vfio_dmabuf_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
+			 unsigned int fd);
+void kvm_vfio_dmabuf_unbind(struct kvm_memory_slot *slot);
+#else
+static inline int kvm_vfio_dmabuf_bind(struct kvm *kvm,
+				       struct kvm_memory_slot *slot,
+				       unsigned int fd)
+{
+	WARN_ON_ONCE(1);
+	return -EIO;
+}
+
+static inline void kvm_vfio_dmabuf_unbind(struct kvm_memory_slot *slot)
+{
+	WARN_ON_ONCE(1);
+}
+#endif /* CONFIG_KVM_VFIO_DMABUF */
+
 #endif /* __KVM_MM_H__ */

diff --git a/virt/kvm/vfio_dmabuf.c b/virt/kvm/vfio_dmabuf.c
new file mode 100644
index 000000000000..c427ab39c68a
--- /dev/null
+++ b/virt/kvm/vfio_dmabuf.c
@@ -0,0 +1,125 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+
+#include "kvm_mm.h"
+
+MODULE_IMPORT_NS("DMA_BUF");
+
+struct kvm_vfio_dmabuf {
+	struct kvm *kvm;
+	struct kvm_memory_slot *slot;
+};
+
+static void kv_dmabuf_move_notify(struct dma_buf_attachment *attach)
+{
+	struct kvm_vfio_dmabuf *kv_dmabuf = attach->importer_priv;
+	struct kvm_memory_slot *slot = kv_dmabuf->slot;
+	struct kvm *kvm = kv_dmabuf->kvm;
+	bool flush = false;
+
+	struct kvm_gfn_range gfn_range = {
+		.start = slot->base_gfn,
+		.end = slot->base_gfn + slot->npages,
+		.slot = slot,
+		.may_block = true,
+		.attr_filter = KVM_FILTER_PRIVATE | KVM_FILTER_SHARED,
+	};
+
+	KVM_MMU_LOCK(kvm);
+	kvm_mmu_invalidate_begin(kvm);
+	flush |= kvm_mmu_unmap_gfn_range(kvm, &gfn_range);
+	if (flush)
+		kvm_flush_remote_tlbs(kvm);
+
+	kvm_mmu_invalidate_end(kvm);
+	KVM_MMU_UNLOCK(kvm);
+}
+
+static const struct dma_buf_attach_ops kv_dmabuf_attach_ops = {
+	.allow_peer2peer = true,
+	.move_notify = kv_dmabuf_move_notify,
+};
+
+int kvm_vfio_dmabuf_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
+			 unsigned int fd)
+{
+	size_t size = slot->npages << PAGE_SHIFT;
+	struct dma_buf_attachment *attach;
+	struct kvm_vfio_dmabuf *kv_dmabuf;
+	struct dma_buf *dmabuf;
+	int ret;
+
+	dmabuf = dma_buf_get(fd);
+	if (IS_ERR(dmabuf))
+		return PTR_ERR(dmabuf);
+
+	if (size != dmabuf->size) {
+		ret = -EINVAL;
+		goto err_dmabuf;
+	}
+
+	kv_dmabuf = kzalloc(sizeof(*kv_dmabuf), GFP_KERNEL);
+	if (!kv_dmabuf) {
+		ret = -ENOMEM;
+		goto err_dmabuf;
+	}
+
+	kv_dmabuf->kvm = kvm;
+	kv_dmabuf->slot = slot;
+	attach = dma_buf_dynamic_attach(dmabuf, NULL, &kv_dmabuf_attach_ops,
+					kv_dmabuf);
+	if (IS_ERR(attach)) {
+		ret = PTR_ERR(attach);
+		goto err_kv_dmabuf;
+	}
+
+	slot->dmabuf_attach = attach;
+
+	return 0;
+
+err_kv_dmabuf:
+	kfree(kv_dmabuf);
+err_dmabuf:
+	dma_buf_put(dmabuf);
+	return ret;
+}
+
+void kvm_vfio_dmabuf_unbind(struct kvm_memory_slot *slot)
+{
+	struct dma_buf_attachment *attach = slot->dmabuf_attach;
+	struct kvm_vfio_dmabuf *kv_dmabuf;
+	struct dma_buf *dmabuf;
+
+	if (WARN_ON_ONCE(!attach))
+		return;
+
+	kv_dmabuf = attach->importer_priv;
+	dmabuf = attach->dmabuf;
+	dma_buf_detach(dmabuf, attach);
+	kfree(kv_dmabuf);
+	dma_buf_put(dmabuf);
+}
+
+/*
+ * The return value matters. If we return -EFAULT, userspace will try to do
+ * a page attribute (shared <-> private) conversion.
+ */
+int kvm_vfio_dmabuf_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+			    gfn_t gfn, kvm_pfn_t *pfn, int *max_order)
+{
+	struct dma_buf_attachment *attach = slot->dmabuf_attach;
+	pgoff_t pgoff = gfn - slot->base_gfn;
+	int ret;
+
+	if (WARN_ON_ONCE(!attach))
+		return -EFAULT;
+
+	ret = dma_buf_get_pfn_unlocked(attach, pgoff, pfn, max_order);
+	if (ret)
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(kvm_vfio_dmabuf_get_pfn);

From patchwork Thu May 29 05:34:51 2025

From: Xu Yilun
Subject: [RFC PATCH 08/30] KVM: x86/mmu: Handle page fault for vfio_dmabuf backed MMIO
Date: Thu, 29 May 2025 13:34:51 +0800
Message-Id: <20250529053513.1592088-9-yilun.xu@linux.intel.com>

Add support for resolving page faults on vfio_dmabuf backed MMIO. For now, only setting up KVM MMU mappings in shared roots is supported, i.e. vfio_dmabuf works for shared assigned devices. Further work is needed to support private MMIO for private assigned devices (known as TDIs in the TDISP spec).
Signed-off-by: Xu Yilun
---
 arch/x86/kvm/mmu/mmu.c   | 16 ++++++++++++++++
 include/linux/kvm_host.h |  5 +++++
 2 files changed, 21 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 63bb77ee1bb1..40d33bd6b532 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4523,6 +4523,22 @@ static int __kvm_mmu_faultin_pfn(struct kvm_vcpu *vcpu,
 	if (fault->is_private)
 		return kvm_mmu_faultin_pfn_private(vcpu, fault);
 
+	/* vfio_dmabuf slot is also applicable for shared mapping */
+	if (kvm_slot_is_vfio_dmabuf(fault->slot)) {
+		int max_order, r;
+
+		r = kvm_vfio_dmabuf_get_pfn(vcpu->kvm, fault->slot, fault->gfn,
+					    &fault->pfn, &max_order);
+		if (r)
+			return r;
+
+		fault->max_level = min(kvm_max_level_for_order(max_order),
+				       fault->max_level);
+		fault->map_writable = !(fault->slot->flags & KVM_MEM_READONLY);
+
+		return RET_PF_CONTINUE;
+	}
+
 	foll |= FOLL_NOWAIT;
 	fault->pfn = __kvm_faultin_pfn(fault->slot, fault->gfn, foll,
 				       &fault->map_writable, &fault->refcounted_page);

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index d16f47c3d008..b850d3cff83c 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -623,6 +623,11 @@ static inline bool kvm_slot_can_be_private(const struct kvm_memory_slot *slot)
 	return slot && (slot->flags & KVM_MEM_GUEST_MEMFD);
 }
 
+static inline bool kvm_slot_is_vfio_dmabuf(const struct kvm_memory_slot *slot)
+{
+	return slot && (slot->flags & KVM_MEM_VFIO_DMABUF);
+}
+
 static inline bool kvm_slot_dirty_track_enabled(const struct kvm_memory_slot *slot)
 {
 	return slot->flags & KVM_MEM_LOG_DIRTY_PAGES;

From patchwork Thu May 29 05:34:52 2025

From: Xu Yilun
Subject: [RFC PATCH 09/30] KVM: x86/mmu: Handle page fault for private MMIO
Date: Thu, 29 May 2025 13:34:52 +0800
Message-Id: <20250529053513.1592088-10-yilun.xu@linux.intel.com>

Add support for resolving page faults on private MMIO. This is part of the effort to enable private assigned devices (known as TDIs in the TDISP spec).

Private MMIO is set in KVM as a vfio_dmabuf typed memory slot, another type of can-be-private memory slot alongside the gmem slot. As with gmem slots, KVM needs to map a GFN as shared or private based on the current state of the GFN's memory attribute. When a page fault happens for private MMIO but a private <-> shared conversion is needed, KVM still exits to userspace with exit reason KVM_EXIT_MEMORY_FAULT and KVM_MEMORY_EXIT_FLAG_PRIVATE toggled. Unlike gmem slots, a vfio_dmabuf slot has only one backend MMIO resource, so switching the GFN's attribute doesn't change how the PFN is obtained; it is always fetched the vfio_dmabuf specific way, via kvm_vfio_dmabuf_get_pfn().

Signed-off-by: Xu Yilun
---
 arch/x86/kvm/mmu/mmu.c   | 9 +++++++--
 include/linux/kvm_host.h | 2 +-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 40d33bd6b532..547fb645692b 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -4501,8 +4501,13 @@ static int kvm_mmu_faultin_pfn_private(struct kvm_vcpu *vcpu,
 		return -EFAULT;
 	}
 
-	r = kvm_gmem_get_pfn(vcpu->kvm, fault->slot, fault->gfn, &fault->pfn,
-			     &fault->refcounted_page, &max_order);
+	if (kvm_slot_is_vfio_dmabuf(fault->slot))
+		r = kvm_vfio_dmabuf_get_pfn(vcpu->kvm, fault->slot, fault->gfn,
+					    &fault->pfn, &max_order);
+	else
+		r = kvm_gmem_get_pfn(vcpu->kvm, fault->slot, fault->gfn,
+				     &fault->pfn, &fault->refcounted_page,
+				     &max_order);
 	if (r) {
 		kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
 		return r;

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index b850d3cff83c..dd9c876374b8 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -620,7 +620,7 @@ struct kvm_memory_slot {
 
 static inline bool kvm_slot_can_be_private(const struct kvm_memory_slot *slot)
 {
-	return slot && (slot->flags & KVM_MEM_GUEST_MEMFD);
+	return slot && (slot->flags & (KVM_MEM_GUEST_MEMFD | KVM_MEM_VFIO_DMABUF));
 }
 
 static inline bool kvm_slot_is_vfio_dmabuf(const struct kvm_memory_slot *slot)

From patchwork Thu May 29 05:34:53 2025

From: Xu Yilun
Subject: [RFC PATCH 10/30] vfio/pci: Export vfio dma-buf specific info for importers
Date: Thu, 29 May 2025 13:34:53 +0800
Message-Id: <20250529053513.1592088-11-yilun.xu@linux.intel.com>

Export vfio dma-buf specific info by attaching vfio_dma_buf_data to struct dma_buf::priv, and provide a helper, vfio_dma_buf_get_data(), for importers to fetch these data. Importers identify a VFIO dma-buf by successfully getting these data.

VFIO dma-buf supports disabling host access to the exported MMIO regions when the device is converted to private. Importers like KVM need to identify this type of dma-buf to decide whether it is good to use; KVM only allows host-inaccessible MMIO regions to be mapped in private roots.

Also export the struct kvm * handle attached to the vfio device. This allows KVM to do another sanity check: MMIO should only be assigned to a CoCo VM if its owner device is already assigned to the same VM.
Signed-off-by: Xu Yilun
---
 drivers/vfio/pci/vfio_pci_dmabuf.c | 18 ++++++++++++++++++
 include/linux/vfio.h               | 18 ++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci_dmabuf.c b/drivers/vfio/pci/vfio_pci_dmabuf.c
index cf9a90448856..4011545db3ad 100644
--- a/drivers/vfio/pci/vfio_pci_dmabuf.c
+++ b/drivers/vfio/pci/vfio_pci_dmabuf.c
@@ -10,6 +10,8 @@ MODULE_IMPORT_NS("DMA_BUF");
 
 struct vfio_pci_dma_buf {
+	struct vfio_dma_buf_data export_data;
+
 	struct dma_buf *dmabuf;
 	struct vfio_pci_core_device *vdev;
 	struct list_head dmabufs_elm;
@@ -300,6 +302,8 @@ int vfio_pci_core_feature_dma_buf(struct vfio_pci_core_device *vdev, u32 flags,
 	priv->nr_ranges = get_dma_buf.nr_ranges;
 	priv->dma_ranges = dma_ranges;
 
+	priv->export_data.kvm = vdev->vdev.kvm;
+
 	ret = check_dma_ranges(priv, &dmabuf_size);
 	if (ret)
 		goto err_free_priv;
@@ -391,3 +395,17 @@ void vfio_pci_dma_buf_cleanup(struct vfio_pci_core_device *vdev)
 	}
 	up_write(&vdev->memory_lock);
 }
+
+/*
+ * Only vfio/pci implements this, so put the helper here for now.
+ */ +struct vfio_dma_buf_data *vfio_dma_buf_get_data(struct dma_buf *dmabuf) +{ + struct vfio_pci_dma_buf *priv = dmabuf->priv; + + if (dmabuf->ops != &vfio_pci_dmabuf_ops) + return ERR_PTR(-EINVAL); + + return &priv->export_data; +} +EXPORT_SYMBOL_GPL(vfio_dma_buf_get_data); diff --git a/include/linux/vfio.h b/include/linux/vfio.h index ba65bbdffd0b..d521d2c01a92 100644 --- a/include/linux/vfio.h +++ b/include/linux/vfio.h @@ -9,6 +9,7 @@ #define VFIO_H +#include #include #include #include @@ -383,4 +384,21 @@ int vfio_virqfd_enable(void *opaque, int (*handler)(void *, void *), void vfio_virqfd_disable(struct virqfd **pvirqfd); void vfio_virqfd_flush_thread(struct virqfd **pvirqfd); +/* + * DMA-buf - generic + */ +struct vfio_dma_buf_data { + struct kvm *kvm; +}; + +#if IS_ENABLED(CONFIG_DMA_SHARED_BUFFER) && IS_ENABLED(CONFIG_VFIO_PCI_CORE) +struct vfio_dma_buf_data *vfio_dma_buf_get_data(struct dma_buf *dmabuf); +#else +static inline +struct vfio_dma_buf_data *vfio_dma_buf_get_data(struct dma_buf *dmabuf) +{ + return NULL; +} +#endif + #endif /* VFIO_H */ From patchwork Thu May 29 05:34:54 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xu Yilun X-Patchwork-Id: 893234 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CD3F325E462; Thu, 29 May 2025 05:42:48 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.10 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748497370; cv=none; b=Eh5ILgtp1bMhf7bGQhOfMHRvZR9NZ5K4BFaxeMhamOrYX6utjrOqdSJ7kO7nX43ViLz2cvXGsTzzcm8nCsTDGUYTJU63vOZcIeAjkc0fBuCZD8+ZTeTBUCol2bXwBoVEMyuiW8DA+3xkhNNQh+BW86zvcxhb+3vgoQXZeljmuNk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1748497370; 
From: Xu Yilun
Subject: [RFC PATCH 11/30] KVM: vfio_dmabuf: Fetch VFIO specific dma-buf data
 for sanity check
Date: Thu, 29 May 2025 13:34:54 +0800
Message-Id: <20250529053513.1592088-12-yilun.xu@linux.intel.com>

Fetch the VFIO-specific dma-buf data to see if the dma-buf is eligible to
be assigned to a CoCo VM as private MMIO. KVM expects host-inaccessible
MMIO regions to be mapped in the private root table, so it needs to
identify a VFIO dma-buf by successfully fetching the VFIO-specific
dma-buf data.

The VFIO dma-buf also provides the struct kvm *kvm handle so that KVM can
check whether the owner device of the MMIO region is already assigned to
the same CoCo VM.
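The importer-side sanity check described above boils down to two tests: the dma-buf must really come from the expected exporter (identified by comparing its ops pointer), and the VM handle recorded by the exporter must match the importer's. Below is a minimal, single-file userspace sketch of that pattern; every type and function name here is illustrative, not the kernel API (the kernel helper returns `ERR_PTR(-EINVAL)` where this sketch returns `NULL`):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct vm { int id; };                       /* stands in for struct kvm */
struct dma_buf_ops { int unused; };
struct dma_buf { const struct dma_buf_ops *ops; void *priv; };
struct export_data { struct vm *vm; };       /* stands in for vfio_dma_buf_data */

static const struct dma_buf_ops exporter_ops; /* the one known exporter */

/* Return the exporter's private data only if the buffer really is ours. */
static struct export_data *get_export_data(struct dma_buf *buf)
{
	if (buf->ops != &exporter_ops)       /* identity check on the ops pointer */
		return NULL;
	return buf->priv;
}

/* Accept the buffer only when it was exported on behalf of the same VM. */
static int bind_check(struct dma_buf *buf, struct vm *importer_vm)
{
	struct export_data *data = get_export_data(buf);

	if (!data || data->vm != importer_vm)
		return -EINVAL;
	return 0;
}
```

The ops-pointer comparison is the same trick many kernel subsystems use to tell "one of mine" apart from a foreign object handed in through a generic interface, without adding a type field.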
Signed-off-by: Xu Yilun
---
 virt/kvm/vfio_dmabuf.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/virt/kvm/vfio_dmabuf.c b/virt/kvm/vfio_dmabuf.c
index c427ab39c68a..ef695039402f 100644
--- a/virt/kvm/vfio_dmabuf.c
+++ b/virt/kvm/vfio_dmabuf.c
@@ -12,6 +12,22 @@ struct kvm_vfio_dmabuf {
 	struct kvm_memory_slot *slot;
 };
 
+static struct vfio_dma_buf_data *kvm_vfio_dma_buf_get_data(struct dma_buf *dmabuf)
+{
+	struct vfio_dma_buf_data *(*fn)(struct dma_buf *dmabuf);
+	struct vfio_dma_buf_data *ret;
+
+	fn = symbol_get(vfio_dma_buf_get_data);
+	if (!fn)
+		return ERR_PTR(-ENOENT);
+
+	ret = fn(dmabuf);
+
+	symbol_put(vfio_dma_buf_get_data);
+
+	return ret;
+}
+
 static void kv_dmabuf_move_notify(struct dma_buf_attachment *attach)
 {
 	struct kvm_vfio_dmabuf *kv_dmabuf = attach->importer_priv;
@@ -48,6 +64,7 @@ int kvm_vfio_dmabuf_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
 	size_t size = slot->npages << PAGE_SHIFT;
 	struct dma_buf_attachment *attach;
 	struct kvm_vfio_dmabuf *kv_dmabuf;
+	struct vfio_dma_buf_data *data;
 	struct dma_buf *dmabuf;
 	int ret;
 
@@ -60,6 +77,17 @@ int kvm_vfio_dmabuf_bind(struct kvm *kvm, struct kvm_memory_slot *slot,
 		goto err_dmabuf;
 	}
 
+	data = kvm_vfio_dma_buf_get_data(dmabuf);
+	if (IS_ERR(data)) {
+		ret = PTR_ERR(data);
+		goto err_dmabuf;
+	}
+
+	if (data->kvm != kvm) {
+		ret = -EINVAL;
+		goto err_dmabuf;
+	}
+
 	kv_dmabuf = kzalloc(sizeof(*kv_dmabuf), GFP_KERNEL);
 	if (!kv_dmabuf) {
 		ret = -ENOMEM;

From patchwork Thu May 29 05:34:55 2025
From: Xu Yilun
Subject: [RFC PATCH 12/30] iommufd/device: Associate a kvm pointer to
 iommufd_device
Date: Thu, 29 May 2025 13:34:55 +0800
Message-Id: <20250529053513.1592088-13-yilun.xu@linux.intel.com>

From: Shameer Kolothum

Add a struct kvm * parameter to iommufd_device_bind() and associate it
with the idev if the bind is successful.
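The pattern in this patch is "capture the owner once, at bind time, so later paths don't have to rediscover it". A minimal userspace sketch of that shape, under the assumption that the owner handle may legitimately be NULL (the selftest passes NULL); all names are illustrative, not the iommufd API:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct vm { int id; };                  /* stands in for struct kvm */

/* A bound device records which VM (if any) its user belongs to. */
struct bound_dev {
	void *dev;
	struct vm *vm;                  /* NULL when no VM is involved */
};

/* Sketch of the bind path: the owner is captured exactly once. */
static int device_bind(struct bound_dev *bdev, void *dev, struct vm *vm)
{
	if (!dev)
		return -EINVAL;
	bdev->dev = dev;
	bdev->vm = vm;                  /* mirrors "idev->kvm = kvm" */
	return 0;
}

/* Later consumers (e.g. a viommu allocation path) just read it back. */
static struct vm *device_owner(const struct bound_dev *bdev)
{
	return bdev->vm;
}
```

Threading the pointer through bind rather than through each later ioctl keeps the association stable for the lifetime of the binding.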
Signed-off-by: Shameer Kolothum
Reviewed-by: Jason Gunthorpe
---
This patch and the next of Shameer's patches are part of the series:
https://lore.kernel.org/all/20250319173202.78988-3-shameerali.kolothum.thodi@huawei.com/
---
 drivers/iommu/iommufd/device.c          | 5 ++++-
 drivers/iommu/iommufd/iommufd_private.h | 2 ++
 drivers/vfio/iommufd.c                  | 2 +-
 include/linux/iommufd.h                 | 4 +++-
 4 files changed, 10 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 2111bad72c72..37ef6bec2009 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -152,6 +152,7 @@ void iommufd_device_destroy(struct iommufd_object *obj)
  * iommufd_device_bind - Bind a physical device to an iommu fd
  * @ictx: iommufd file descriptor
  * @dev: Pointer to a physical device struct
+ * @kvm: Pointer to struct kvm if device belongs to a KVM VM
  * @id: Output ID number to return to userspace for this device
  *
  * A successful bind establishes an ownership over the device and returns
@@ -165,7 +166,8 @@ void iommufd_device_destroy(struct iommufd_object *obj)
  * The caller must undo this with iommufd_device_unbind()
  */
 struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
-					   struct device *dev, u32 *id)
+					   struct device *dev, struct kvm *kvm,
+					   u32 *id)
 {
 	struct iommufd_device *idev;
 	struct iommufd_group *igroup;
@@ -215,6 +217,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 	if (!iommufd_selftest_is_mock_dev(dev))
 		iommufd_ctx_get(ictx);
 	idev->dev = dev;
+	idev->kvm = kvm;
 	idev->enforce_cache_coherency =
 		device_iommu_capable(dev, IOMMU_CAP_ENFORCE_CACHE_COHERENCY);
 	/* The calling driver is a user until iommufd_device_unbind() */
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 80e8c76d25f2..297e4e2a12d1 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -424,6 +424,8 @@ struct iommufd_device {
 	struct list_head group_item;
 	/* always the physical device */
 	struct device *dev;
+	/* ..and kvm if available */
+	struct kvm *kvm;
 	bool enforce_cache_coherency;
 	/* protect iopf_enabled counter */
 	struct mutex iopf_lock;
diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index c8c3a2d53f86..3441d24538a8 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -115,7 +115,7 @@ int vfio_iommufd_physical_bind(struct vfio_device *vdev,
 {
 	struct iommufd_device *idev;
 
-	idev = iommufd_device_bind(ictx, vdev->dev, out_device_id);
+	idev = iommufd_device_bind(ictx, vdev->dev, vdev->kvm, out_device_id);
 	if (IS_ERR(idev))
 		return PTR_ERR(idev);
 	vdev->iommufd_device = idev;
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 34b6e6ca4bfa..2b2d6095309c 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -24,6 +24,7 @@ struct iommufd_ctx;
 struct iommufd_device;
 struct iommufd_viommu_ops;
 struct page;
+struct kvm;
 
 enum iommufd_object_type {
 	IOMMUFD_OBJ_NONE,
@@ -52,7 +53,8 @@ struct iommufd_object {
 };
 
 struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
-					   struct device *dev, u32 *id);
+					   struct device *dev, struct kvm *kvm,
+					   u32 *id);
 void iommufd_device_unbind(struct iommufd_device *idev);
 int iommufd_device_attach(struct iommufd_device *idev, ioasid_t pasid,

From patchwork Thu May 29 05:34:56 2025
From: Xu Yilun
Subject: [RFC PATCH 13/30] fixup! iommufd/selftest: Sync iommufd_device_bind()
 change to selftest
Date: Thu, 29 May 2025 13:34:56 +0800
Message-Id: <20250529053513.1592088-14-yilun.xu@linux.intel.com>

Sync up the additional struct kvm * parameter.
Signed-off-by: Zhenzhong Duan
Signed-off-by: Xu Yilun
---
 drivers/iommu/iommufd/selftest.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 18d9a216eb30..d070807757f2 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -992,7 +992,7 @@ static int iommufd_test_mock_domain(struct iommufd_ucmd *ucmd,
 		goto out_sobj;
 	}
 
-	idev = iommufd_device_bind(ucmd->ictx, &sobj->idev.mock_dev->dev,
+	idev = iommufd_device_bind(ucmd->ictx, &sobj->idev.mock_dev->dev, NULL,
 				   &idev_id);
 	if (IS_ERR(idev)) {
 		rc = PTR_ERR(idev);

From patchwork Thu May 29 05:34:57 2025
From: Xu Yilun
Subject: [RFC PATCH 14/30] iommu/arm-smmu-v3-iommufd: Pass in kvm pointer to
 viommu_alloc
Date: Thu, 29 May 2025 13:34:57 +0800
Message-Id: <20250529053513.1592088-15-yilun.xu@linux.intel.com>

From: Shameer Kolothum

No functional changes. This will be used in a later patch to add support
for using the KVM VMID in the ARM SMMUv3 stage-2 configuration.
Signed-off-by: Shameer Kolothum
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 1 +
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h         | 1 +
 drivers/iommu/iommufd/viommu.c                      | 3 ++-
 include/linux/iommu.h                               | 4 +++-
 4 files changed, 7 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index e4fd8d522af8..5ee2b24e7bcf 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -383,6 +383,7 @@ static const struct iommufd_viommu_ops arm_vsmmu_ops = {
 };
 
 struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
+				       struct kvm *kvm,
 				       struct iommu_domain *parent,
 				       struct iommufd_ctx *ictx,
 				       unsigned int viommu_type)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index dd1ad56ce863..94b695b60c26 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1060,6 +1060,7 @@ struct arm_vsmmu {
 #if IS_ENABLED(CONFIG_ARM_SMMU_V3_IOMMUFD)
 void *arm_smmu_hw_info(struct device *dev, u32 *length, u32 *type);
 struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
+				       struct kvm *kvm,
 				       struct iommu_domain *parent,
 				       struct iommufd_ctx *ictx,
 				       unsigned int viommu_type);
diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 01df2b985f02..488905989b7c 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -47,7 +47,8 @@ int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd)
 		goto out_put_hwpt;
 	}
 
-	viommu = ops->viommu_alloc(idev->dev, hwpt_paging->common.domain,
+	viommu = ops->viommu_alloc(idev->dev, idev->kvm,
+				   hwpt_paging->common.domain,
 				   ucmd->ictx, cmd->type);
 	if (IS_ERR(viommu)) {
 		rc = PTR_ERR(viommu);
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index ccce8a751e2a..3675a5a6cea0 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -47,6 +47,7 @@ struct iommufd_ctx;
 struct iommufd_viommu;
 struct msi_desc;
 struct msi_msg;
+struct kvm;
 
 #define IOMMU_FAULT_PERM_READ	(1 << 0) /* read */
 #define IOMMU_FAULT_PERM_WRITE	(1 << 1) /* write */
@@ -661,7 +662,8 @@ struct iommu_ops {
 	int (*def_domain_type)(struct device *dev);
 
 	struct iommufd_viommu *(*viommu_alloc)(
-		struct device *dev, struct iommu_domain *parent_domain,
+		struct device *dev, struct kvm *kvm,
+		struct iommu_domain *parent_domain,
 		struct iommufd_ctx *ictx, unsigned int viommu_type);
 
 	const struct iommu_domain_ops *default_domain_ops;

From patchwork Thu May 29 05:34:58 2025
From: Xu Yilun
Subject: [RFC PATCH 15/30] fixup: iommu/selftest: Sync .viommu_alloc() change
 to selftest
Date: Thu, 29 May 2025 13:34:58 +0800
Message-Id: <20250529053513.1592088-16-yilun.xu@linux.intel.com>

Sync up the additional struct kvm * parameter.
Signed-off-by: Lu Baolu
Signed-off-by: Xu Yilun
---
 drivers/iommu/iommufd/selftest.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index d070807757f2..90e6d1d3aa62 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -734,6 +734,7 @@ static struct iommufd_viommu_ops mock_viommu_ops = {
 };
 
 static struct iommufd_viommu *mock_viommu_alloc(struct device *dev,
+						struct kvm *kvm,
 						struct iommu_domain *domain,
 						struct iommufd_ctx *ictx,
 						unsigned int viommu_type)

From patchwork Thu May 29 05:34:59 2025
From: Xu Yilun
dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org, linaro-mm-sig@lists.linaro.org, vivek.kasireddy@intel.com, yilun.xu@intel.com, yilun.xu@linux.intel.com, linux-kernel@vger.kernel.org, lukas@wunner.de, yan.y.zhao@intel.com, daniel.vetter@ffwll.ch, leon@kernel.org, baolu.lu@linux.intel.com, zhenzhong.duan@intel.com, tao1.su@intel.com, linux-pci@vger.kernel.org, zhiw@nvidia.com, simona.vetter@ffwll.ch, shameerali.kolothum.thodi@huawei.com, aneesh.kumar@kernel.org, iommu@lists.linux.dev, kevin.tian@intel.com Subject: [RFC PATCH 16/30] iommufd/viommu: track the kvm pointer & its refcount in viommu core Date: Thu, 29 May 2025 13:34:59 +0800 Message-Id: <20250529053513.1592088-17-yilun.xu@linux.intel.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com> References: <20250529053513.1592088-1-yilun.xu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-media@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Track the kvm pointer and its refcount in viommu core. The kvm pointer will be used later to support TSM Bind feature, which tells the secure firmware the connection between a vPCI device and a CoCo VM. There is existing need to reference kvm pointer in viommu [1], but in that series kvm pointer is used & tracked in platform iommu drivers. While in Confidential Computing (CC) case, viommu should manage a generic routine for TSM Bind, i.e. call pci_tsm_bind(pdev, kvm, tdi_id) So it is better the viommu core keeps and tracks the kvm pointer. 
[1] https://lore.kernel.org/all/20250319173202.78988-5-shameerali.kolothum.thodi@huawei.com/

Signed-off-by: Lu Baolu
Signed-off-by: Xu Yilun
---
 drivers/iommu/iommufd/viommu.c | 62 ++++++++++++++++++++++++++++++++++
 include/linux/iommufd.h        |  3 ++
 2 files changed, 65 insertions(+)

diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 488905989b7c..2fcef3f8d1a5 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -1,8 +1,68 @@
 // SPDX-License-Identifier: GPL-2.0-only
 /* Copyright (c) 2024, NVIDIA CORPORATION & AFFILIATES */
+#if IS_ENABLED(CONFIG_KVM)
+#include
+#endif
+
 #include "iommufd_private.h"
 
+#if IS_ENABLED(CONFIG_KVM)
+static void viommu_get_kvm_safe(struct iommufd_viommu *viommu, struct kvm *kvm)
+{
+	void (*pfn)(struct kvm *kvm);
+	bool (*fn)(struct kvm *kvm);
+	bool ret;
+
+	if (!kvm)
+		return;
+
+	pfn = symbol_get(kvm_put_kvm);
+	if (WARN_ON(!pfn))
+		return;
+
+	fn = symbol_get(kvm_get_kvm_safe);
+	if (WARN_ON(!fn)) {
+		symbol_put(kvm_put_kvm);
+		return;
+	}
+
+	ret = fn(kvm);
+	symbol_put(kvm_get_kvm_safe);
+	if (!ret) {
+		symbol_put(kvm_put_kvm);
+		return;
+	}
+
+	viommu->put_kvm = pfn;
+	viommu->kvm = kvm;
+}
+
+static void viommu_put_kvm(struct iommufd_viommu *viommu)
+{
+	if (!viommu->kvm)
+		return;
+
+	if (WARN_ON(!viommu->put_kvm))
+		goto clear;
+
+	viommu->put_kvm(viommu->kvm);
+	viommu->put_kvm = NULL;
+	symbol_put(kvm_put_kvm);
+
+clear:
+	viommu->kvm = NULL;
+}
+#else
+static void viommu_get_kvm_safe(struct iommufd_viommu *viommu, struct kvm *kvm)
+{
+}
+
+static void viommu_put_kvm(struct iommufd_viommu *viommu)
+{
+}
+#endif
+
 void iommufd_viommu_destroy(struct iommufd_object *obj)
 {
 	struct iommufd_viommu *viommu =
@@ -10,6 +70,7 @@ void iommufd_viommu_destroy(struct iommufd_object *obj)
 
 	if (viommu->ops && viommu->ops->destroy)
 		viommu->ops->destroy(viommu);
+	viommu_put_kvm(viommu);
 	refcount_dec(&viommu->hwpt->common.obj.users);
 	xa_destroy(&viommu->vdevs);
 }
@@ -68,6 +129,7 @@ int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd)
	 * on its own.
	 */
 	viommu->iommu_dev = __iommu_get_iommu_dev(idev->dev);
+	viommu_get_kvm_safe(viommu, idev->kvm);
 
 	cmd->out_viommu_id = viommu->obj.id;
 	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));

diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 2b2d6095309c..2712421802b9 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -104,6 +104,9 @@ struct iommufd_viommu {
 	struct rw_semaphore veventqs_rwsem;
 
 	unsigned int type;
+
+	struct kvm *kvm;
+	void (*put_kvm)(struct kvm *kvm);
 };
 
 /**

From patchwork Thu May 29 05:35:00 2025
From: Xu Yilun
Subject: [RFC PATCH 17/30] iommufd/device: Add TSM Bind/Unbind for TIO support
Date: Thu, 29 May 2025 13:35:00 +0800
Message-Id: <20250529053513.1592088-18-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

Add new kAPIs against iommufd_device to support the TSM Bind/Unbind
commands issued by a CoCo VM. TSM Bind means the VMM does all the
preparation for private device assignment and locks down the device by
transiting it to the TDISP CONFIG_LOCKED or RUN state (even in the RUN
state, the TSM can still block any access to/from the device), so that
the device is ready for attestation by the CoCo VM.

The interfaces are added against IOMMUFD because IOMMUFD builds several
abstract objects applicable to private device assignment, e.g. the
viommu for the secure IOMMU & KVM, and the vdevice for the vBDF.
IOMMUFD links them up to finish all configuration required by the
secure firmware. That also means the TSM Bind interface should be
called after viommu & vdevice allocation.
Suggested-by: Jason Gunthorpe
Originally-by: Alexey Kardashevskiy
Signed-off-by: Xu Yilun
---
 drivers/iommu/iommufd/device.c          | 84 +++++++++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h |  6 ++
 drivers/iommu/iommufd/viommu.c          | 44 +++++++++++++
 include/linux/iommufd.h                 |  3 +
 4 files changed, 137 insertions(+)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 37ef6bec2009..984780c66ab2 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -3,6 +3,7 @@
  */
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1561,3 +1562,86 @@ int iommufd_get_hw_info(struct iommufd_ucmd *ucmd)
 	iommufd_put_object(ucmd->ictx, &idev->obj);
 	return rc;
 }
+
+/**
+ * iommufd_device_tsm_bind - Move a device to TSM Bind state
+ * @idev: device to attach
+ * @vdevice_id: Input an IOMMUFD_OBJ_VDEVICE
+ *
+ * This configures the device for Confidential Computing (CC) and moves
+ * it to the TSM Bind state. Once this completes the device is locked
+ * down (TDISP CONFIG_LOCKED or RUN), waiting for the guest's
+ * attestation.
+ *
+ * This function is undone by calling iommufd_device_tsm_unbind().
+ */
+int iommufd_device_tsm_bind(struct iommufd_device *idev, u32 vdevice_id)
+{
+	struct iommufd_vdevice *vdev;
+	int rc;
+
+	if (!dev_is_pci(idev->dev))
+		return -ENODEV;
+
+	vdev = container_of(iommufd_get_object(idev->ictx, vdevice_id,
+					       IOMMUFD_OBJ_VDEVICE),
+			    struct iommufd_vdevice, obj);
+	if (IS_ERR(vdev))
+		return PTR_ERR(vdev);
+
+	if (vdev->dev != idev->dev) {
+		rc = -EINVAL;
+		goto out_put_vdev;
+	}
+
+	mutex_lock(&idev->igroup->lock);
+	if (idev->vdev) {
+		rc = -EEXIST;
+		goto out_unlock;
+	}
+
+	rc = iommufd_vdevice_tsm_bind(vdev);
+	if (rc)
+		goto out_unlock;
+
+	idev->vdev = vdev;
+	refcount_inc(&vdev->obj.users);
+	mutex_unlock(&idev->igroup->lock);
+
+	/*
+	 * Pairs with iommufd_device_tsm_unbind() - catches caller bugs
+	 * attempting to destroy a bound device.
+	 */
+	refcount_inc(&idev->obj.users);
+	goto out_put_vdev;
+
+out_unlock:
+	mutex_unlock(&idev->igroup->lock);
+out_put_vdev:
+	iommufd_put_object(idev->ictx, &vdev->obj);
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_device_tsm_bind, "IOMMUFD");
+
+/**
+ * iommufd_device_tsm_unbind - Move a device out of TSM Bind state
+ * @idev: device to detach
+ *
+ * Undo iommufd_device_tsm_bind(). This removes all Confidential
+ * Computing configuration. Once this completes the device is unlocked
+ * (TDISP CONFIG_UNLOCKED).
+ */
+void iommufd_device_tsm_unbind(struct iommufd_device *idev)
+{
+	mutex_lock(&idev->igroup->lock);
+	if (!idev->vdev) {
+		mutex_unlock(&idev->igroup->lock);
+		return;
+	}
+
+	iommufd_vdevice_tsm_unbind(idev->vdev);
+	refcount_dec(&idev->vdev->obj.users);
+	idev->vdev = NULL;
+	mutex_unlock(&idev->igroup->lock);
+
+	refcount_dec(&idev->obj.users);
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_device_tsm_unbind, "IOMMUFD");

diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 297e4e2a12d1..29af8616e4aa 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -430,6 +430,7 @@ struct iommufd_device {
 	/* protect iopf_enabled counter */
 	struct mutex iopf_lock;
 	unsigned int iopf_enabled;
+	struct iommufd_vdevice *vdev;
 };
 
 static inline struct iommufd_device *
@@ -615,8 +616,13 @@ struct iommufd_vdevice {
 	struct iommufd_viommu *viommu;
 	struct device *dev;
 	u64 id; /* per-vIOMMU virtual ID */
+	struct mutex tsm_lock;
+	bool tsm_bound;
 };
 
+int iommufd_vdevice_tsm_bind(struct iommufd_vdevice *vdev);
+void iommufd_vdevice_tsm_unbind(struct iommufd_vdevice *vdev);
+
 #ifdef CONFIG_IOMMUFD_TEST
 int iommufd_test(struct iommufd_ucmd *ucmd);
 void iommufd_selftest_destroy(struct iommufd_object *obj);

diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 2fcef3f8d1a5..296143e21368 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -4,6 +4,7 @@
 #if IS_ENABLED(CONFIG_KVM)
 #include
 #endif
+#include
 
 #include "iommufd_private.h"
 
@@ -193,11 +194,13 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 		goto out_put_idev;
 	}
 
+	vdev->ictx = ucmd->ictx; /* This is an unrelated fix for vdevice alloc */
 	vdev->id = virt_id;
 	vdev->dev = idev->dev;
 	get_device(idev->dev);
 	vdev->viommu = viommu;
 	refcount_inc(&viommu->obj.users);
+	mutex_init(&vdev->tsm_lock);
 
 	curr = xa_cmpxchg(&viommu->vdevs, virt_id, NULL, vdev, GFP_KERNEL);
 	if (curr) {
@@ -220,3 +223,44 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 	iommufd_put_object(ucmd->ictx, &viommu->obj);
 	return rc;
 }
+
+int iommufd_vdevice_tsm_bind(struct iommufd_vdevice *vdev)
+{
+	struct kvm *kvm;
+	int rc;
+
+	mutex_lock(&vdev->tsm_lock);
+	if (vdev->tsm_bound) {
+		rc = -EEXIST;
+		goto out_unlock;
+	}
+
+	kvm = vdev->viommu->kvm;
+	if (!kvm) {
+		rc = -ENOENT;
+		goto out_unlock;
+	}
+
+	rc = pci_tsm_bind(to_pci_dev(vdev->dev), kvm, vdev->id);
+	if (rc)
+		goto out_unlock;
+
+	vdev->tsm_bound = true;
+
+out_unlock:
+	mutex_unlock(&vdev->tsm_lock);
+	return rc;
+}
+
+void iommufd_vdevice_tsm_unbind(struct iommufd_vdevice *vdev)
+{
+	mutex_lock(&vdev->tsm_lock);
+	if (!vdev->tsm_bound)
+		goto out_unlock;
+
+	pci_tsm_unbind(to_pci_dev(vdev->dev));
+	vdev->tsm_bound = false;
+
+out_unlock:
+	mutex_unlock(&vdev->tsm_lock);
+}

diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 2712421802b9..5f9a286232ac 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -63,6 +63,9 @@ int iommufd_device_replace(struct iommufd_device *idev,
 			   ioasid_t pasid, u32 *pt_id);
 void iommufd_device_detach(struct iommufd_device *idev, ioasid_t pasid);
 
+int iommufd_device_tsm_bind(struct iommufd_device *idev, u32 vdevice_id);
+void iommufd_device_tsm_unbind(struct iommufd_device *idev);
+
 struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev);
 u32 iommufd_device_to_id(struct iommufd_device *idev);

From patchwork Thu May 29 05:35:01 2025
From: Xu Yilun
Subject: [RFC PATCH 18/30] iommufd/viommu: Add trusted IOMMU configuration handlers for vdev
Date: Thu, 29 May 2025 13:35:01 +0800
Message-Id: <20250529053513.1592088-19-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

Add handlers for setting up/removing trusted IOMMU configuration for a
vdevice. IOMMUFD calls these handlers on TSM Bind/Unbind.

Most vendors extend the trusted IOMMU engine for private device
assignment and thus require extra IOMMU configuration on TSM Bind. E.g.
Intel TDX Connect requires the host to build extra trusted Device
Context Table entries (but marked non-present), while AMD requires
clearing the Domain-ID on the non-secure DTE.

The existing DMA setup flows against IOMMUFD are driven by userspace,
usually starting with allocating a domain and then attaching the domain
to the device. Trusted DMA setup, by contrast, is embedded in the TSM
Bind/Unbind IOCTLs. This is because platform secure firmwares enforce
various constraints on trusted configuration. E.g. Intel TDX Connect
enforces trusted IOPT detach after TDI STOP but before TDI metadata
free. Using coarser uAPIs like TSM Bind/Unbind that wrap all trusted
configuration prevents these low-level complexities from propagating to
userspace.

A coarser uAPI means userspace loses the flexibility to attach
different domains to the trusted part of the device, and it cannot
operate on the trusted domain. That seems acceptable because the VMM is
out of the TCB, so the secure firmware either disallows the VMM from
touching the trusted domain or only allows a fixed configuration set.
E.g. TDX Connect enforces that all assigned devices in the same VM
share the same trusted domain, and it specifies every value of the
trusted Context Table entries. So just setting up everything for
trusted DMA in the IOMMU driver is a reasonable choice.

OPEN: Should these handlers be viommu ops or vdevice ops?
Signed-off-by: Xu Yilun
---
 drivers/iommu/iommufd/iommufd_private.h |  1 +
 drivers/iommu/iommufd/viommu.c          | 41 ++++++++++++++++++++++++-
 include/linux/iommufd.h                 |  2 ++
 3 files changed, 43 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 29af8616e4aa..0db9a0e53a77 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -618,6 +618,7 @@ struct iommufd_vdevice {
 	u64 id; /* per-vIOMMU virtual ID */
 	struct mutex tsm_lock;
 	bool tsm_bound;
+	bool trusted_dma_enabled;
 };
 
 int iommufd_vdevice_tsm_bind(struct iommufd_vdevice *vdev);

diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index 296143e21368..8437e936c278 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -224,6 +224,37 @@ int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd)
 	return rc;
 }
 
+static int iommufd_vdevice_enable_trusted_dma(struct iommufd_vdevice *vdev)
+{
+	struct iommufd_viommu *viommu = vdev->viommu;
+	int rc;
+
+	if (vdev->trusted_dma_enabled)
+		return 0;
+
+	if (viommu->ops->setup_trusted_vdev) {
+		rc = viommu->ops->setup_trusted_vdev(viommu, vdev->id);
+		if (rc)
+			return rc;
+	}
+
+	vdev->trusted_dma_enabled = true;
+	return 0;
+}
+
+static void iommufd_vdevice_disable_trusted_dma(struct iommufd_vdevice *vdev)
+{
+	struct iommufd_viommu *viommu = vdev->viommu;
+
+	if (!vdev->trusted_dma_enabled)
+		return;
+
+	if (viommu->ops->remove_trusted_vdev)
+		viommu->ops->remove_trusted_vdev(viommu, vdev->id);
+
+	vdev->trusted_dma_enabled = false;
+}
+
 int iommufd_vdevice_tsm_bind(struct iommufd_vdevice *vdev)
 {
 	struct kvm *kvm;
@@ -241,12 +272,19 @@ int iommufd_vdevice_tsm_bind(struct iommufd_vdevice *vdev)
 		goto out_unlock;
 	}
 
-	rc = pci_tsm_bind(to_pci_dev(vdev->dev), kvm, vdev->id);
+	rc = iommufd_vdevice_enable_trusted_dma(vdev);
 	if (rc)
 		goto out_unlock;
 
+	rc = pci_tsm_bind(to_pci_dev(vdev->dev), kvm, vdev->id);
+	if (rc)
+		goto out_disable_trusted_dma;
+
 	vdev->tsm_bound = true;
+	goto out_unlock;
 
+out_disable_trusted_dma:
+	iommufd_vdevice_disable_trusted_dma(vdev);
 out_unlock:
 	mutex_unlock(&vdev->tsm_lock);
 	return rc;
@@ -259,6 +297,7 @@ void iommufd_vdevice_tsm_unbind(struct iommufd_vdevice *vdev)
 		goto out_unlock;
 
 	pci_tsm_unbind(to_pci_dev(vdev->dev));
+	iommufd_vdevice_disable_trusted_dma(vdev);
 	vdev->tsm_bound = false;
 
 out_unlock:

diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 5f9a286232ac..d73e8d3b9b95 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -136,6 +136,8 @@ struct iommufd_viommu_ops {
 				  const struct iommu_user_data *user_data);
 	int (*cache_invalidate)(struct iommufd_viommu *viommu,
 				struct iommu_user_data_array *array);
+	int (*setup_trusted_vdev)(struct iommufd_viommu *viommu, u64 vdev_id);
+	void (*remove_trusted_vdev)(struct iommufd_viommu *viommu, u64 vdev_id);
 };
 
 #if IS_ENABLED(CONFIG_IOMMUFD)

From patchwork Thu May 29 05:35:02 2025
From: Xu Yilun
Subject: [RFC PATCH 19/30] vfio/pci: Add TSM TDI bind/unbind IOCTLs for TEE-IO support
Date: Thu, 29 May 2025 13:35:02 +0800
Message-Id: <20250529053513.1592088-20-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

Add new IOCTLs to do TSM-based TDI bind/unbind. These IOCTLs are
expected to be called by userspace when a CoCo VM issues a TDI
bind/unbind command to the VMM. Specifically for TDX Connect, these
commands are secure hypervisor calls defined by GHCI (Guest-Hypervisor
Communication Interface).

The TSM TDI bind/unbind operations are expected to be initiated by a
running CoCo VM, which already has the legacy assigned device in place.
The TSM bind operation requests that the VMM make all secure
configurations to support the device working as a TDI, and then issue
TDISP messages to move the TDI to the CONFIG_LOCKED or RUN state,
waiting for the guest's attestation.

Do TSM Unbind before vfio_pci_core_disable(); otherwise the device will
be moved to the TDISP ERROR state.

Suggested-by: Jason Gunthorpe
Signed-off-by: Wu Hao
Signed-off-by: Xu Yilun
---
 drivers/vfio/iommufd.c           | 22 ++++++++++
 drivers/vfio/pci/vfio_pci_core.c | 74 ++++++++++++++++++++++++++++++++
 include/linux/vfio.h             |  7 +++
 include/linux/vfio_pci_core.h    |  1 +
 include/uapi/linux/vfio.h        | 42 ++++++++++++++++++
 5 files changed, 146 insertions(+)

diff --git a/drivers/vfio/iommufd.c b/drivers/vfio/iommufd.c
index 3441d24538a8..33fd20ffaeee 100644
--- a/drivers/vfio/iommufd.c
+++ b/drivers/vfio/iommufd.c
@@ -297,3 +297,25 @@ void vfio_iommufd_emulated_detach_ioas(struct vfio_device *vdev)
 	vdev->iommufd_attached = false;
 }
 EXPORT_SYMBOL_GPL(vfio_iommufd_emulated_detach_ioas);
+
+int vfio_iommufd_tsm_bind(struct vfio_device *vdev, u32 vdevice_id)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (WARN_ON(!vdev->iommufd_device))
+		return -EINVAL;
+
+	return iommufd_device_tsm_bind(vdev->iommufd_device, vdevice_id);
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_tsm_bind);
+
+void vfio_iommufd_tsm_unbind(struct vfio_device *vdev)
+{
+	lockdep_assert_held(&vdev->dev_set->lock);
+
+	if (WARN_ON(!vdev->iommufd_device))
+		return;
+
+	iommufd_device_tsm_unbind(vdev->iommufd_device);
+}
+EXPORT_SYMBOL_GPL(vfio_iommufd_tsm_unbind);

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 116964057b0b..92544e54c9c3 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -692,6 +692,13 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev)
 #if IS_ENABLED(CONFIG_EEH)
 	eeh_dev_release(vdev->pdev);
 #endif
+
+	if (vdev->is_tsm_bound) {
+		vfio_iommufd_tsm_unbind(&vdev->vdev);
+		pci_release_regions(vdev->pdev);
+		vdev->is_tsm_bound = false;
+	}
+
 	vfio_pci_core_disable(vdev);
 
 	vfio_pci_dma_buf_cleanup(vdev);
@@ -1447,6 +1454,69 @@ static int vfio_pci_ioctl_ioeventfd(struct vfio_pci_core_device *vdev,
 				  ioeventfd.fd);
 }
 
+static int vfio_pci_ioctl_tsm_bind(struct vfio_pci_core_device *vdev,
+				   void __user *arg)
+{
+	unsigned long minsz = offsetofend(struct vfio_pci_tsm_bind, vdevice_id);
+	struct vfio_pci_tsm_bind tsm_bind;
+	struct pci_dev *pdev = vdev->pdev;
+	int ret;
+
+	if (copy_from_user(&tsm_bind, arg, minsz))
+		return -EFAULT;
+
+	if (tsm_bind.argsz < minsz || tsm_bind.flags)
+		return -EINVAL;
+
+	mutex_lock(&vdev->vdev.dev_set->lock);
+
+	/* To ensure no host side MMIO access is possible */
+	ret = pci_request_regions_exclusive(pdev, "vfio-pci-tsm");
+	if (ret)
+		goto out_unlock;
+
+	ret = vfio_iommufd_tsm_bind(&vdev->vdev, tsm_bind.vdevice_id);
+	if (ret)
+		goto out_release_region;
+
+	vdev->is_tsm_bound = true;
+	mutex_unlock(&vdev->vdev.dev_set->lock);
+
+	return 0;
+
+out_release_region:
+	pci_release_regions(pdev);
+out_unlock:
+	mutex_unlock(&vdev->vdev.dev_set->lock);
+	return ret;
+}
+
+static int vfio_pci_ioctl_tsm_unbind(struct vfio_pci_core_device *vdev,
+				     void __user *arg)
+{
+	unsigned long minsz = offsetofend(struct vfio_pci_tsm_unbind, flags);
+	struct vfio_pci_tsm_unbind tsm_unbind;
+	struct pci_dev *pdev = vdev->pdev;
+
+	if (copy_from_user(&tsm_unbind, arg, minsz))
+		return -EFAULT;
+
+	if (tsm_unbind.argsz < minsz || tsm_unbind.flags)
+		return -EINVAL;
+
+	mutex_lock(&vdev->vdev.dev_set->lock);
+
+	if (!vdev->is_tsm_bound) {
+		mutex_unlock(&vdev->vdev.dev_set->lock);
+		return 0;
+	}
+
+	vfio_iommufd_tsm_unbind(&vdev->vdev);
+	pci_release_regions(pdev);
+	vdev->is_tsm_bound = false;
+	mutex_unlock(&vdev->vdev.dev_set->lock);
+
+	return 0;
+}
+
 long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 			 unsigned long arg)
 {
@@ -1471,6 +1541,10 @@ long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,
 		return vfio_pci_ioctl_reset(vdev, uarg);
 	case VFIO_DEVICE_SET_IRQS:
 		return vfio_pci_ioctl_set_irqs(vdev, uarg);
+	case VFIO_DEVICE_TSM_BIND:
+		return vfio_pci_ioctl_tsm_bind(vdev, uarg);
+	case VFIO_DEVICE_TSM_UNBIND:
+		return vfio_pci_ioctl_tsm_unbind(vdev, uarg);
 	default:
 		return -ENOTTY;
 	}

diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index d521d2c01a92..747b94bb9758 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -70,6 +70,7 @@ struct vfio_device {
 	struct iommufd_device *iommufd_device;
 	struct ida pasids;
 	u8 iommufd_attached:1;
+	u8 iommufd_tsm_bound:1;
 #endif
 	u8 cdev_opened:1;
 #ifdef CONFIG_DEBUG_FS
@@ -155,6 +156,8 @@ int vfio_iommufd_emulated_bind(struct vfio_device *vdev,
 void vfio_iommufd_emulated_unbind(struct vfio_device *vdev);
 int vfio_iommufd_emulated_attach_ioas(struct vfio_device *vdev, u32 *pt_id);
 void vfio_iommufd_emulated_detach_ioas(struct vfio_device *vdev);
+int vfio_iommufd_tsm_bind(struct vfio_device *vdev, u32 vdevice_id);
+void vfio_iommufd_tsm_unbind(struct vfio_device *vdev);
 #else
 static inline struct iommufd_ctx *
 vfio_iommufd_device_ictx(struct vfio_device *vdev)
@@ -190,6 +193,10 @@ vfio_iommufd_get_dev_id(struct vfio_device *vdev, struct iommufd_ctx *ictx)
 	((int (*)(struct vfio_device *vdev, u32 *pt_id)) NULL)
 #define vfio_iommufd_emulated_detach_ioas \
 	((void (*)(struct vfio_device *vdev)) NULL)
+#define vfio_iommufd_tsm_bind \
+	((int (*)(struct vfio_device *vdev, u32 vdevice_id)) NULL)
+#define vfio_iommufd_tsm_unbind \
+	((void (*)(struct vfio_device *vdev)) NULL)
 #endif
 
 static inline bool vfio_device_cdev_opened(struct vfio_device *device)

diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index da5d8955ae56..b2982100221f 100644
--- a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -80,6 +80,7 @@ struct vfio_pci_core_device {
 	bool needs_pm_restore:1;
 	bool pm_intx_masked:1;
 	bool pm_runtime_engaged:1;
+	bool is_tsm_bound:1;
 	struct pci_saved_state *pci_saved_state;
 	struct pci_saved_state *pm_save;
 	int ioeventfds_nr;
diff --git
a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h index 9445fa36efd3..16bd93a5b427 100644 --- a/include/uapi/linux/vfio.h +++ b/include/uapi/linux/vfio.h @@ -1493,6 +1493,48 @@ struct vfio_device_feature_dma_buf { struct vfio_region_dma_range dma_ranges[]; }; +/* + * Upon VFIO_DEVICE_TSM_BIND, put the device in the TSM Bind state. + * + * @argsz: User filled size of this data. + * @flags: Must be 0. + * @vdevice_id: Input the target ID, which represents a vdevice allocated + * via the iommufd subsystem. + * + * The vdevice holds all virtualization information needed for TSM Bind. + * TSM Bind means the host finishes all host-side trusted configurations to + * build a TEE Device Interface (TDI), then puts the TDI in the TDISP + * CONFIG_LOCKED or RUN state, waiting for the guest's attestation. IOMMUFD + * finds all virtualization information from @vdevice_id and executes the + * TSM Bind. VFIO should be aware that some operations on the physical device + * (e.g. reset, toggling MSE, private MMIO access) impact TSM Bind, so it must + * never do them, or do them only after TSM Unbind. + * This IOCTL is only allowed on cdev fds. + */ +struct vfio_pci_tsm_bind { + __u32 argsz; + __u32 flags; + __u32 vdevice_id; + __u32 pad; +}; + +#define VFIO_DEVICE_TSM_BIND _IO(VFIO_TYPE, VFIO_BASE + 22) + +/* + * Upon VFIO_DEVICE_TSM_UNBIND, put the device in the TSM Unbind state. + * + * @argsz: User filled size of this data. + * @flags: Must be 0. + * + * TSM Unbind means the host removes all trusted configurations and puts the + * TDI in the CONFIG_UNLOCKED TDISP state.
+ */ +struct vfio_pci_tsm_unbind { + __u32 argsz; + __u32 flags; +}; + +#define VFIO_DEVICE_TSM_UNBIND _IO(VFIO_TYPE, VFIO_BASE + 23) + /* -------- API for Type1 VFIO IOMMU -------- */ /**

From patchwork Thu May 29 05:35:03 2025
From: Xu Yilun
Subject: [RFC PATCH 20/30] vfio/pci: Do TSM Unbind before zapping bars
Date: Thu, 29 May 2025 13:35:03 +0800
Message-Id: <20250529053513.1592088-21-yilun.xu@linux.intel.com>

When a device is TSM Bound, some of its MMIO regions are controlled by secure firmware. E.g. TDX Connect requires these MMIO regions to be mapped in the S-EPT and never unmapped until the device is unbound. Zapping bars irrespective of TSM Bound state may cause unexpected secure firmware errors. It is always safe to do TSM Unbind first, transitioning the device to shared, and then do whatever is needed as before.

Signed-off-by: Xu Yilun --- drivers/vfio/pci/vfio_pci_config.c | 4 +++ drivers/vfio/pci/vfio_pci_core.c | 41 +++++++++++++++++++----------- drivers/vfio/pci/vfio_pci_priv.h | 3 +++ 3 files changed, 33 insertions(+), 15 deletions(-) diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c index 7ac062bd5044..4ffe661c9e59 100644 --- a/drivers/vfio/pci/vfio_pci_config.c +++ b/drivers/vfio/pci/vfio_pci_config.c @@ -590,6 +590,7 @@ static int vfio_basic_config_write(struct vfio_pci_core_device *vdev, int pos, new_mem = !!(new_cmd & PCI_COMMAND_MEMORY); if (!new_mem) { + vfio_pci_tsm_unbind(vdev); vfio_pci_zap_and_down_write_memory_lock(vdev); vfio_pci_dma_buf_move(vdev, true); } else { @@ -712,6 +713,7 @@ static void vfio_lock_and_set_power_state(struct vfio_pci_core_device *vdev, pci_power_t state) { if (state >= PCI_D3hot) { + vfio_pci_tsm_unbind(vdev); vfio_pci_zap_and_down_write_memory_lock(vdev); vfio_pci_dma_buf_move(vdev, true); } else { @@ -907,6 +909,7 @@ static int
vfio_exp_config_write(struct vfio_pci_core_device *vdev, int pos, &cap); if (!ret && (cap & PCI_EXP_DEVCAP_FLR)) { + vfio_pci_tsm_unbind(vdev); vfio_pci_zap_and_down_write_memory_lock(vdev); vfio_pci_dma_buf_move(vdev, true); pci_try_reset_function(vdev->pdev); @@ -992,6 +995,7 @@ static int vfio_af_config_write(struct vfio_pci_core_device *vdev, int pos, &cap); if (!ret && (cap & PCI_AF_CAP_FLR) && (cap & PCI_AF_CAP_TP)) { + vfio_pci_tsm_unbind(vdev); vfio_pci_zap_and_down_write_memory_lock(vdev); vfio_pci_dma_buf_move(vdev, true); pci_try_reset_function(vdev->pdev); diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 92544e54c9c3..a8437fcecca1 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -286,6 +286,7 @@ static int vfio_pci_runtime_pm_entry(struct vfio_pci_core_device *vdev, * The vdev power related flags are protected with 'memory_lock' * semaphore. */ + vfio_pci_tsm_unbind(vdev); vfio_pci_zap_and_down_write_memory_lock(vdev); vfio_pci_dma_buf_move(vdev, true); @@ -693,11 +694,7 @@ void vfio_pci_core_close_device(struct vfio_device *core_vdev) eeh_dev_release(vdev->pdev); #endif - if (vdev->is_tsm_bound) { - vfio_iommufd_tsm_unbind(&vdev->vdev); - pci_release_regions(vdev->pdev); - vdev->is_tsm_bound = false; - } + __vfio_pci_tsm_unbind(vdev); vfio_pci_core_disable(vdev); @@ -1222,6 +1219,7 @@ static int vfio_pci_ioctl_reset(struct vfio_pci_core_device *vdev, if (!vdev->reset_works) return -EINVAL; + vfio_pci_tsm_unbind(vdev); vfio_pci_zap_and_down_write_memory_lock(vdev); /* @@ -1491,12 +1489,32 @@ static int vfio_pci_ioctl_tsm_bind(struct vfio_pci_core_device *vdev, return ret; } +void __vfio_pci_tsm_unbind(struct vfio_pci_core_device *vdev) +{ + struct pci_dev *pdev = vdev->pdev; + + lockdep_assert_held(&vdev->vdev.dev_set->lock); + + if (!vdev->is_tsm_bound) + return; + + vfio_iommufd_tsm_unbind(&vdev->vdev); + pci_release_regions(pdev); + vdev->is_tsm_bound = false; +} + +void 
vfio_pci_tsm_unbind(struct vfio_pci_core_device *vdev) +{ + mutex_lock(&vdev->vdev.dev_set->lock); + __vfio_pci_tsm_unbind(vdev); + mutex_unlock(&vdev->vdev.dev_set->lock); +} + static int vfio_pci_ioctl_tsm_unbind(struct vfio_pci_core_device *vdev, void __user *arg) { unsigned long minsz = offsetofend(struct vfio_pci_tsm_unbind, flags); struct vfio_pci_tsm_unbind tsm_unbind; - struct pci_dev *pdev = vdev->pdev; if (copy_from_user(&tsm_unbind, arg, minsz)) return -EFAULT; @@ -1504,15 +1522,7 @@ static int vfio_pci_ioctl_tsm_unbind(struct vfio_pci_core_device *vdev, if (tsm_unbind.argsz < minsz || tsm_unbind.flags) return -EINVAL; - mutex_lock(&vdev->vdev.dev_set->lock); - - if (!vdev->is_tsm_bound) - return 0; - - vfio_iommufd_tsm_unbind(&vdev->vdev); - pci_release_regions(pdev); - vdev->is_tsm_bound = false; - mutex_unlock(&vdev->vdev.dev_set->lock); + vfio_pci_tsm_unbind(vdev); return 0; } @@ -2526,6 +2536,7 @@ static int vfio_pci_dev_set_hot_reset(struct vfio_device_set *dev_set, break; } + __vfio_pci_tsm_unbind(vdev); /* * Take the memory write lock for each device and zap BAR * mappings to prevent the user accessing the device while in diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h index 6f3e8eafdc35..e5bf27f46a73 100644 --- a/drivers/vfio/pci/vfio_pci_priv.h +++ b/drivers/vfio/pci/vfio_pci_priv.h @@ -130,4 +130,7 @@ static inline void vfio_pci_dma_buf_move(struct vfio_pci_core_device *vdev, } #endif +void __vfio_pci_tsm_unbind(struct vfio_pci_core_device *vdev); +void vfio_pci_tsm_unbind(struct vfio_pci_core_device *vdev); + #endif From patchwork Thu May 29 05:35:04 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xu Yilun X-Patchwork-Id: 893229 Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.10]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) 
From: Xu Yilun
Subject: [RFC PATCH 21/30] iommufd/vdevice: Add TSM Guest request uAPI
Date: Thu, 29 May 2025 13:35:04 +0800
Message-Id: <20250529053513.1592088-22-yilun.xu@linux.intel.com>
From: Alexey Kardashevskiy

Add a TSM Guest request uAPI against iommufd_vdevice to forward various TSM attestation & acceptance requests from the guest to the TSM driver/secure firmware. This uAPI is functional only after TSM Bind. After a vPCI device is locked down by TSM Bind, the CoCo VM should attest and accept the device in its TEE. These operations need interaction with secure firmware and the device, but don't impact device management from the host's POV. They don't change the fact that the host must not touch some parts of the device (see the TDISP spec) to keep the trusted assignment, and the host can exit trusted assignment and roll everything back via TSM Unbind. So the TSM Guest request becomes a passthrough channel for the CoCo VM to exchange request/response blobs with the TSM driver/secure firmware. The definition of this IOCTL illustrates this idea.

Signed-off-by: Alexey Kardashevskiy Signed-off-by: Xu Yilun --- drivers/iommu/iommufd/iommufd_private.h | 1 + drivers/iommu/iommufd/main.c | 3 ++ drivers/iommu/iommufd/viommu.c | 39 +++++++++++++++++++++++++ include/uapi/linux/iommufd.h | 28 ++++++++++++++++++ 4 files changed, 71 insertions(+) diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 0db9a0e53a77..610dc2efcdd5 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -609,6 +609,7 @@ int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd); void iommufd_viommu_destroy(struct iommufd_object *obj); int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd); void iommufd_vdevice_destroy(struct iommufd_object *obj); +int iommufd_vdevice_tsm_guest_request_ioctl(struct iommufd_ucmd *ucmd); struct iommufd_vdevice { struct iommufd_object obj; diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index 3df468f64e7d..17c5b2cb6ab1 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -320,6 +320,7 @@ union
ucmd_buffer { struct iommu_veventq_alloc veventq; struct iommu_vfio_ioas vfio_ioas; struct iommu_viommu_alloc viommu; + struct iommu_vdevice_tsm_guest_request gr; #ifdef CONFIG_IOMMUFD_TEST struct iommu_test_cmd test; #endif @@ -379,6 +380,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = { __reserved), IOCTL_OP(IOMMU_VIOMMU_ALLOC, iommufd_viommu_alloc_ioctl, struct iommu_viommu_alloc, out_viommu_id), + IOCTL_OP(IOMMU_VDEVICE_TSM_GUEST_REQUEST, iommufd_vdevice_tsm_guest_request_ioctl, + struct iommu_vdevice_tsm_guest_request, resp_uptr), #ifdef CONFIG_IOMMUFD_TEST IOCTL_OP(IOMMU_TEST_CMD, iommufd_test, struct iommu_test_cmd, last), #endif diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c index 8437e936c278..c64ce1a9f87d 100644 --- a/drivers/iommu/iommufd/viommu.c +++ b/drivers/iommu/iommufd/viommu.c @@ -303,3 +303,42 @@ void iommufd_vdevice_tsm_unbind(struct iommufd_vdevice *vdev) out_unlock: mutex_unlock(&vdev->tsm_lock); } + +int iommufd_vdevice_tsm_guest_request_ioctl(struct iommufd_ucmd *ucmd) +{ + struct iommu_vdevice_tsm_guest_request *cmd = ucmd->cmd; + struct pci_tsm_guest_req_info info = { + .type = cmd->type, + .type_info = u64_to_user_ptr(cmd->type_info_uptr), + .type_info_len = cmd->type_info_len, + .req = u64_to_user_ptr(cmd->req_uptr), + .req_len = cmd->req_len, + .resp = u64_to_user_ptr(cmd->resp_uptr), + .resp_len = cmd->resp_len, + }; + struct iommufd_vdevice *vdev; + int rc; + + vdev = container_of(iommufd_get_object(ucmd->ictx, cmd->vdevice_id, + IOMMUFD_OBJ_VDEVICE), + struct iommufd_vdevice, obj); + if (IS_ERR(vdev)) + return PTR_ERR(vdev); + + mutex_lock(&vdev->tsm_lock); + if (!vdev->tsm_bound) { + rc = -ENOENT; + goto out_unlock; + } + + rc = pci_tsm_guest_req(to_pci_dev(vdev->dev), &info); + if (rc) + goto out_unlock; + + cmd->resp_len = info.resp_len; + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); +out_unlock: + mutex_unlock(&vdev->tsm_lock); + iommufd_put_object(ucmd->ictx, &vdev->obj); + return 
rc; +} diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index f29b6c44655e..b8170fe3d700 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -56,6 +56,7 @@ enum { IOMMUFD_CMD_VDEVICE_ALLOC = 0x91, IOMMUFD_CMD_IOAS_CHANGE_PROCESS = 0x92, IOMMUFD_CMD_VEVENTQ_ALLOC = 0x93, + IOMMUFD_CMD_VDEVICE_TSM_GUEST_REQUEST = 0x94, }; /** @@ -1141,4 +1142,31 @@ struct iommu_veventq_alloc { __u32 __reserved; }; #define IOMMU_VEVENTQ_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VEVENTQ_ALLOC) + +/** + * struct iommu_vdevice_tsm_guest_request - ioctl(IOMMU_VDEVICE_TSM_GUEST_REQUEST) + * @size: sizeof(struct iommu_vdevice_tsm_guest_request) + * @vdevice_id: vDevice ID the guest request is for + * @type: identify the format of the following blobs + * @type_info_len: the blob size for @type_info_uptr + * @req_len: the blob size for @req_uptr, filled by guest + * @resp_len: for input, the blob size for @resp_uptr, filled by guest + * for output, the size of actual response data, filled by host + * @type_info_uptr: extra input/output info, e.g. 
firmware error code + * @req_uptr: request data buffer filled by guest + * @resp_uptr: response data buffer filled by host + */ +struct iommu_vdevice_tsm_guest_request { + __u32 size; + __u32 vdevice_id; + __u32 type; + __u32 type_info_len; + __u32 req_len; + __u32 resp_len; + __aligned_u64 type_info_uptr; + __aligned_u64 req_uptr; + __aligned_u64 resp_uptr; +}; +#define IOMMU_VDEVICE_TSM_GUEST_REQUEST _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VDEVICE_TSM_GUEST_REQUEST) + #endif

From patchwork Thu May 29 05:35:05 2025
From: Xu Yilun
Subject: [RFC PATCH 22/30] fixup! PCI/TSM: Change the guest request type definition
Date: Thu, 29 May 2025 13:35:05 +0800
Message-Id: <20250529053513.1592088-23-yilun.xu@linux.intel.com>

Move enum pci_tsm_guest_req_type to the IOMMUFD uAPI header file so that userspace can use it with the IOMMUFD uAPI - IOMMU_VDEVICE_TSM_GUEST_REQUEST. Add a __user marker to all blob pointers to indicate the TSM drivers' responsibility to read out/fill in user data.
Signed-off-by: Xu Yilun --- include/linux/pci-tsm.h | 12 ++++-------- include/uapi/linux/iommufd.h | 8 ++++++++ 2 files changed, 12 insertions(+), 8 deletions(-) diff --git a/include/linux/pci-tsm.h b/include/linux/pci-tsm.h index 1920ca591a42..737767f8a9c5 100644 --- a/include/linux/pci-tsm.h +++ b/include/linux/pci-tsm.h @@ -107,10 +107,6 @@ static inline bool is_pci_tsm_pf0(struct pci_dev *pdev) return PCI_FUNC(pdev->devfn) == 0; } -enum pci_tsm_guest_req_type { - PCI_TSM_GUEST_REQ_TDXC, -}; - /** * struct pci_tsm_guest_req_info - parameter for pci_tsm_ops.guest_req() * @type: identify the format of the following blobs @@ -123,12 +119,12 @@ enum pci_tsm_guest_req_type { * for output, the size of actual response data filled by host */ struct pci_tsm_guest_req_info { - enum pci_tsm_guest_req_type type; - void *type_info; + u32 type; + void __user *type_info; size_t type_info_len; - void *req; + void __user *req; size_t req_len; - void *resp; + void __user *resp; size_t resp_len; }; diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index b8170fe3d700..7196bc295669 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -1143,6 +1143,14 @@ struct iommu_veventq_alloc { }; #define IOMMU_VEVENTQ_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VEVENTQ_ALLOC) +/** + * enum pci_tsm_guest_req_type - Specify the format of guest request blobs + * @PCI_TSM_GUEST_REQ_TDXC: Intel TDX Connect specific type + */ +enum pci_tsm_guest_req_type { + PCI_TSM_GUEST_REQ_TDXC, +}; + /** * struct iommu_vdevice_tsm_guest_request - ioctl(IOMMU_VDEVICE_TSM_GUEST_REQUEST) * @size: sizeof(struct iommu_vdevice_tsm_guest_request)

From patchwork Thu May 29 05:35:06 2025
From: Xu Yilun
Subject: [RFC PATCH 23/30] coco/tdx_tsm: Introduce a "tdx" subsystem and "tsm" device
Date: Thu, 29 May 2025 13:35:06 +0800
Message-Id: <20250529053513.1592088-24-yilun.xu@linux.intel.com>
bulk X-Mailing-List: linux-media@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Dan Williams TDX depends on a platform firmware module that is invoked via instructions similar to vmenter (i.e. enter into a new privileged "root-mode" context to manage private memory and private device mechanisms). It is a software construct that depends on the CPU vmxon state to enable invocation of TDX-module ABIs. Unlike other Trusted Execution Environment (TEE) platform implementations that employ a firmware module running on a PCI device with an MMIO mailbox for communication, TDX has no hardware device to point to as the "TSM". The "/sys/devices/virtual" hierarchy is intended for "software constructs which need sysfs interface", which aligns with what TDX needs. The new tdx_subsys will export global attributes populated by the TDX-module "sysinfo". A tdx_tsm device is published on this bus to enable a typical driver model for the low level "TEE Security Manager" (TSM) flows that talk TDISP to capable PCIe devices. For now, this is only the base tdx_subsys and tdx_tsm device registration with attribute definition and TSM driver to follow later. Recall that TDX guest would also use TSM to authenticate assigned devices and it surely needs a virtual software construct to enable guest side TSM flow. A tdx_guest_tsm device would be published on tdx_subsys to indicate the guest is capable of communicate to firmware for TIO via TDVMCALLs. Create some common helpers for TDX host/guest to create software devices on tdx_subsys. 
Signed-off-by: Dan Williams
Signed-off-by: Wu Hao
Signed-off-by: Xu Yilun
---
 arch/x86/Kconfig                     |  1 +
 drivers/virt/coco/host/Kconfig       |  3 ++
 drivers/virt/coco/host/Makefile      |  2 +
 drivers/virt/coco/host/tdx_tsm_bus.c | 70 ++++++++++++++++++++++++++++
 include/linux/tdx_tsm_bus.h          | 17 +++++++
 5 files changed, 93 insertions(+)
 create mode 100644 drivers/virt/coco/host/tdx_tsm_bus.c
 create mode 100644 include/linux/tdx_tsm_bus.h

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 4b9f378e05f6..fb6cc23b02e3 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1925,6 +1925,7 @@ config INTEL_TDX_HOST
 	depends on CONTIG_ALLOC
 	depends on !KEXEC_CORE
 	depends on X86_MCE
+	select TDX_TSM_BUS
 	help
 	  Intel Trust Domain Extensions (TDX) protects guest VMs from malicious
 	  host and certain physical attacks. This option enables necessary TDX

diff --git a/drivers/virt/coco/host/Kconfig b/drivers/virt/coco/host/Kconfig
index 4fbc6ef34f12..c04b0446cd5f 100644
--- a/drivers/virt/coco/host/Kconfig
+++ b/drivers/virt/coco/host/Kconfig
@@ -4,3 +4,6 @@
 #
 config TSM
 	tristate
+
+config TDX_TSM_BUS
+	bool

diff --git a/drivers/virt/coco/host/Makefile b/drivers/virt/coco/host/Makefile
index be0aba6007cd..ce1ab15ac8d3 100644
--- a/drivers/virt/coco/host/Makefile
+++ b/drivers/virt/coco/host/Makefile
@@ -4,3 +4,5 @@
 obj-$(CONFIG_TSM) += tsm.o
 tsm-y := tsm-core.o
+
+obj-$(CONFIG_TDX_TSM_BUS) += tdx_tsm_bus.o

diff --git a/drivers/virt/coco/host/tdx_tsm_bus.c b/drivers/virt/coco/host/tdx_tsm_bus.c
new file mode 100644
index 000000000000..9f4875ebf032
--- /dev/null
+++ b/drivers/virt/coco/host/tdx_tsm_bus.c
@@ -0,0 +1,70 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2024 Intel Corporation. All rights reserved.
+ */
+
+#include
+#include
+
+static struct tdx_tsm_dev *alloc_tdx_tsm_dev(void)
+{
+	struct tdx_tsm_dev *tsm = kzalloc(sizeof(*tsm), GFP_KERNEL);
+	struct device *dev;
+
+	if (!tsm)
+		return ERR_PTR(-ENOMEM);
+
+	dev = &tsm->dev;
+	dev->bus = &tdx_subsys;
+	device_initialize(dev);
+
+	return tsm;
+}
+
+DEFINE_FREE(tdx_tsm_dev_put, struct tdx_tsm_dev *,
+	    if (!IS_ERR_OR_NULL(_T)) put_device(&_T->dev))
+struct tdx_tsm_dev *init_tdx_tsm_dev(const char *name)
+{
+	struct device *dev;
+	int ret;
+
+	struct tdx_tsm_dev *tsm __free(tdx_tsm_dev_put) = alloc_tdx_tsm_dev();
+	if (IS_ERR(tsm))
+		return tsm;
+
+	dev = &tsm->dev;
+	ret = dev_set_name(dev, name);
+	if (ret)
+		return ERR_PTR(ret);
+
+	ret = device_add(dev);
+	if (ret)
+		return ERR_PTR(ret);
+
+	return no_free_ptr(tsm);
+}
+EXPORT_SYMBOL_GPL(init_tdx_tsm_dev);
+
+static int tdx_match(struct device *dev, const struct device_driver *drv)
+{
+	if (!strcmp(dev_name(dev), drv->name))
+		return 1;
+
+	return 0;
+}
+
+static int tdx_uevent(const struct device *dev, struct kobj_uevent_env *env)
+{
+	return add_uevent_var(env, "MODALIAS=%s", dev_name(dev));
+}
+
+const struct bus_type tdx_subsys = {
+	.name = "tdx",
+	.match = tdx_match,
+	.uevent = tdx_uevent,
+};
+EXPORT_SYMBOL_GPL(tdx_subsys);
+
+static int tdx_tsm_dev_init(void)
+{
+	return subsys_virtual_register(&tdx_subsys, NULL);
+}
+arch_initcall(tdx_tsm_dev_init);

diff --git a/include/linux/tdx_tsm_bus.h b/include/linux/tdx_tsm_bus.h
new file mode 100644
index 000000000000..ef7af97ba230
--- /dev/null
+++ b/include/linux/tdx_tsm_bus.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright(c) 2024 Intel Corporation.
+ */
+
+#ifndef __TDX_TSM_BUS_H
+#define __TDX_TSM_BUS_H
+
+#include
+
+struct tdx_tsm_dev {
+	struct device dev;
+};
+
+extern const struct bus_type tdx_subsys;
+
+struct tdx_tsm_dev *init_tdx_tsm_dev(const char *name);
+
+#endif

From patchwork Thu May 29 05:35:07 2025
X-Patchwork-Id: 893486
From: Xu Yilun
Subject: [RFC PATCH 24/30] coco/tdx_tsm: TEE Security Manager driver for TDX
Date: Thu, 29 May 2025 13:35:07 +0800
Message-Id: <20250529053513.1592088-25-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

From: Dan Williams

Recall that a TEE Security Manager (TSM) is a platform agent that speaks the TEE Device Interface Security Protocol (TDISP) to PCIe devices and manages private memory resources for the platform.

The tdx_tsm driver loads against a device of the same name registered at TDX Module initialization time. The device lives on the "tdx" bus, a virtual subsystem that hosts the TDX Module sysfs ABI. It allows device-security enumeration and initialization flows to be deferred from TDX Module init time. Crucially, if/when TDX Module init moves earlier in the x86 initialization flow, this driver is still guaranteed to run after IOMMU and PCI init (i.e. subsys_initcall() vs device_initcall()).

The ability to unload the module, or unbind the driver, is also useful for debug and for coarse-grained transitioning between PCI TSM operation and PCI CMA operation (native kernel PCI device authentication).

For now this is the basic boilerplate, with sysfs attributes and operation flows to be added later.
Signed-off-by: Dan Williams
Signed-off-by: Wu Hao
Signed-off-by: Xu Yilun
---
 drivers/virt/coco/host/Kconfig   |   7 ++
 drivers/virt/coco/host/Makefile  |   1 +
 drivers/virt/coco/host/tdx_tsm.c | 189 +++++++++++++++++++++++++++++++
 3 files changed, 197 insertions(+)
 create mode 100644 drivers/virt/coco/host/tdx_tsm.c

diff --git a/drivers/virt/coco/host/Kconfig b/drivers/virt/coco/host/Kconfig
index c04b0446cd5f..f2b05b15a24e 100644
--- a/drivers/virt/coco/host/Kconfig
+++ b/drivers/virt/coco/host/Kconfig
@@ -7,3 +7,10 @@ config TSM
 
 config TDX_TSM_BUS
 	bool
+
+config TDX_TSM
+	depends on INTEL_TDX_HOST
+	select TDX_TSM_BUS
+	select PCI_TSM
+	select TSM
+	tristate "TDX TEE Security Manager Driver"

diff --git a/drivers/virt/coco/host/Makefile b/drivers/virt/coco/host/Makefile
index ce1ab15ac8d3..38ee9c96b921 100644
--- a/drivers/virt/coco/host/Makefile
+++ b/drivers/virt/coco/host/Makefile
@@ -6,3 +6,4 @@ obj-$(CONFIG_TSM) += tsm.o
 tsm-y := tsm-core.o
 
 obj-$(CONFIG_TDX_TSM_BUS) += tdx_tsm_bus.o
+obj-$(CONFIG_TDX_TSM) += tdx_tsm.o

diff --git a/drivers/virt/coco/host/tdx_tsm.c b/drivers/virt/coco/host/tdx_tsm.c
new file mode 100644
index 000000000000..72f3705fe7bb
--- /dev/null
+++ b/drivers/virt/coco/host/tdx_tsm.c
@@ -0,0 +1,189 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright(c) 2024 Intel Corporation. All rights reserved.
+ */
+#include
+#include
+#include
+#include
+#include
+#include
+
+#define TDISP_FUNC_ID		GENMASK(15, 0)
+#define TDISP_FUNC_ID_SEGMENT	GENMASK(23, 16)
+#define TDISP_FUNC_ID_SEG_VALID	BIT(24)
+
+static inline u32 tdisp_func_id(struct pci_dev *pdev)
+{
+	u32 func_id;
+
+	func_id = FIELD_PREP(TDISP_FUNC_ID_SEGMENT, pci_domain_nr(pdev->bus));
+	if (func_id)
+		func_id |= TDISP_FUNC_ID_SEG_VALID;
+	func_id |= FIELD_PREP(TDISP_FUNC_ID,
+			      PCI_DEVID(pdev->bus->number, pdev->devfn));
+
+	return func_id;
+}
+
+struct tdx_tsm {
+	struct pci_tsm_pf0 pci;
+	u32 func_id;
+};
+
+static struct tdx_tsm *to_tdx_tsm(struct pci_tsm *tsm)
+{
+	return container_of(tsm, struct tdx_tsm, pci.tsm);
+}
+
+struct tdx_tdi {
+	struct pci_tdi tdi;
+	u32 func_id;
+};
+
+static struct tdx_tdi *to_tdx_tdi(struct pci_tdi *tdi)
+{
+	return container_of(tdi, struct tdx_tdi, tdi);
+}
+
+static struct pci_tdi *tdx_tsm_bind(struct pci_dev *pdev,
+				    struct pci_dev *dsm_dev,
+				    struct kvm *kvm, u64 tdi_id)
+{
+	struct tdx_tdi *ttdi __free(kfree) =
+		kzalloc(sizeof(*ttdi), GFP_KERNEL);
+	if (!ttdi)
+		return NULL;
+
+	ttdi->func_id = tdisp_func_id(pdev);
+	ttdi->tdi.pdev = pdev;
+	ttdi->tdi.dsm_dev = pci_dev_get(dsm_dev);
+	ttdi->tdi.kvm = kvm;
+
+	/*TODO: TDX Module required operations */
+
+	return &no_free_ptr(ttdi)->tdi;
+}
+
+static void tdx_tsm_unbind(struct pci_tdi *tdi)
+{
+	struct tdx_tdi *ttdi = to_tdx_tdi(tdi);
+
+	/*TODO: TDX Module required operations */
+
+	pci_dev_put(ttdi->tdi.dsm_dev);
+	kfree(ttdi);
+}
+
+static int tdx_tsm_guest_req(struct pci_dev *pdev,
+			     struct pci_tsm_guest_req_info *info)
+{
+	return -ENXIO;
+}
+
+static int tdx_tsm_connect(struct pci_dev *pdev)
+{
+	return -ENXIO;
+}
+
+static void tdx_tsm_disconnect(struct pci_dev *pdev)
+{
+}
+
+static struct pci_tsm *tdx_tsm_pci_probe(struct pci_dev *pdev)
+{
+	if (is_pci_tsm_pf0(pdev)) {
+		int rc;
+
+		struct tdx_tsm *ttsm __free(kfree) =
+			kzalloc(sizeof(*ttsm), GFP_KERNEL);
+		if (!ttsm)
+			return NULL;
+
+		rc = pci_tsm_pf0_initialize(pdev, &ttsm->pci);
+		if (rc)
+			return NULL;
+
+		ttsm->func_id = tdisp_func_id(pdev);
+
+		pci_info(pdev, "PF tsm enabled\n");
+		return &no_free_ptr(ttsm)->pci.tsm;
+	}
+
+	/* for VF and MFD */
+	struct pci_tsm *pci_tsm __free(kfree) =
+		kzalloc(sizeof(*pci_tsm), GFP_KERNEL);
+	if (!pci_tsm)
+		return NULL;
+
+	pci_tsm_initialize(pdev, pci_tsm);
+
+	pci_info(pdev, "VF/MFD tsm enabled\n");
+	return no_free_ptr(pci_tsm);
+}
+
+static void tdx_tsm_pci_remove(struct pci_tsm *tsm)
+{
+	if (is_pci_tsm_pf0(tsm->pdev)) {
+		struct tdx_tsm *ttsm = to_tdx_tsm(tsm);
+
+		pci_info(tsm->pdev, "PF tsm disabled\n");
+		kfree(ttsm);
+
+		return;
+	}
+
+	/* for VF and MFD */
+	kfree(tsm);
+}
+
+static const struct pci_tsm_ops tdx_pci_tsm_ops = {
+	.probe = tdx_tsm_pci_probe,
+	.remove = tdx_tsm_pci_remove,
+	.connect = tdx_tsm_connect,
+	.disconnect = tdx_tsm_disconnect,
+	.bind = tdx_tsm_bind,
+	.unbind = tdx_tsm_unbind,
+	.guest_req = tdx_tsm_guest_req,
+};
+
+static void unregister_tsm(void *tsm_core)
+{
+	tsm_unregister(tsm_core);
+}
+
+static int tdx_tsm_probe(struct device *dev)
+{
+	struct tsm_core_dev *tsm_core;
+
+	tsm_core = tsm_register(dev, NULL, &tdx_pci_tsm_ops);
+	if (IS_ERR(tsm_core)) {
+		dev_err(dev, "failed to register TSM: (%pe)\n", tsm_core);
+		return PTR_ERR(tsm_core);
+	}
+
+	return devm_add_action_or_reset(dev, unregister_tsm, tsm_core);
+}
+
+static struct device_driver tdx_tsm_driver = {
+	.probe = tdx_tsm_probe,
+	.bus = &tdx_subsys,
+	.owner = THIS_MODULE,
+	.name = KBUILD_MODNAME,
+	.mod_name = KBUILD_MODNAME,
+};
+
+static int __init tdx_tsm_init(void)
+{
+	return driver_register(&tdx_tsm_driver);
+}
+module_init(tdx_tsm_init);
+
+static void __exit tdx_tsm_exit(void)
+{
+	driver_unregister(&tdx_tsm_driver);
+}
+module_exit(tdx_tsm_exit);
+
+MODULE_IMPORT_NS("TDX");
+MODULE_LICENSE("GPL");
+MODULE_ALIAS("tdx_tsm");
+MODULE_DESCRIPTION("TDX TEE Security Manager");

From patchwork Thu May 29 05:35:08 2025
X-Patchwork-Id: 893227
From: Xu Yilun
Subject: [RFC PATCH 25/30] coco/tdx_tsm: Add connect()/disconnect() handlers prototype
Date: Thu, 29 May 2025 13:35:08 +0800
Message-Id: <20250529053513.1592088-26-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

From: Wu Hao

Add a basic skeleton for the connect()/disconnect() handlers. The major steps are SPDM session setup first, then IDE selective stream setup. No detailed TDX Connect implementation yet.

Signed-off-by: Wu Hao
Signed-off-by: Xu Yilun
---
 drivers/virt/coco/host/tdx_tsm.c | 55 +++++++++++++++++++++++++++++++-
 1 file changed, 54 insertions(+), 1 deletion(-)

diff --git a/drivers/virt/coco/host/tdx_tsm.c b/drivers/virt/coco/host/tdx_tsm.c
index 72f3705fe7bb..d1a8384d8339 100644
--- a/drivers/virt/coco/host/tdx_tsm.c
+++ b/drivers/virt/coco/host/tdx_tsm.c
@@ -79,13 +79,66 @@ static int tdx_tsm_guest_req(struct pci_dev *pdev,
 	return -ENXIO;
 }
 
+static int tdx_tsm_spdm_session_setup(struct tdx_tsm *ttsm)
+{
+	return 0;
+}
+
+static int tdx_tsm_spdm_session_teardown(struct tdx_tsm *ttsm)
+{
+	return 0;
+}
+
+static int tdx_tsm_ide_stream_setup(struct tdx_tsm *ttsm)
+{
+	return 0;
+}
+
+static int tdx_tsm_ide_stream_teardown(struct tdx_tsm *ttsm)
+{
+	return 0;
+}
+
 static int tdx_tsm_connect(struct pci_dev *pdev)
 {
-	return -ENXIO;
+	struct tdx_tsm *ttsm = to_tdx_tsm(pdev->tsm);
+	int ret;
+
+	ret = tdx_tsm_spdm_session_setup(ttsm);
+	if (ret) {
+		pci_err(pdev, "fail to setup spdm session\n");
+		return ret;
+	}
+
+	ret = tdx_tsm_ide_stream_setup(ttsm);
+	if (ret) {
+		pci_err(pdev, "fail to setup ide stream\n");
+		tdx_tsm_spdm_session_teardown(ttsm);
+		return ret;
+	}
+
+	pci_dbg(pdev, "%s complete\n", __func__);
+	return ret;
 }
 
 static void tdx_tsm_disconnect(struct pci_dev *pdev)
 {
+	struct tdx_tsm *ttsm = to_tdx_tsm(pdev->tsm);
+	int ret;
+
+	ret = tdx_tsm_ide_stream_teardown(ttsm);
+	if (ret) {
+		pci_err(pdev, "fail to teardown ide stream\n");
+		return;
+	}
+
+	ret = tdx_tsm_spdm_session_teardown(ttsm);
+	if (ret) {
+		pci_err(pdev, "fail to teardown spdm session\n");
+		return;
+	}
+
+	pci_dbg(pdev, "%s complete\n", __func__);
 }
 
 static struct pci_tsm *tdx_tsm_pci_probe(struct pci_dev *pdev)

From patchwork Thu May 29 05:35:09 2025
X-Patchwork-Id: 893485
From: Xu Yilun
Subject: [RFC PATCH 26/30] coco/tdx_tsm: Add bind()/unbind()/guest_req() handlers prototype
Date: Thu, 29 May 2025 13:35:09 +0800
Message-Id: <20250529053513.1592088-27-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

Add a basic skeleton for the bind()/unbind()/guest_req() handlers. Specifically, tdx_tdi_devifmt_create()/tdx_tdi_devif_create() declare the TDI's ownership to the TD, tdx_tdi_mmiomt_create() declares the MMIO ownership to the TD, and tdx_tdi_request(TDX_TDI_REQ_BIND) locks the TDI. No detailed TDX Connect implementation yet.
Signed-off-by: Xu Yilun
---
 drivers/virt/coco/host/tdx_tsm.c | 83 ++++++++++++++++++++++++++++++--
 1 file changed, 80 insertions(+), 3 deletions(-)

diff --git a/drivers/virt/coco/host/tdx_tsm.c b/drivers/virt/coco/host/tdx_tsm.c
index d1a8384d8339..beb65f45b478 100644
--- a/drivers/virt/coco/host/tdx_tsm.c
+++ b/drivers/virt/coco/host/tdx_tsm.c
@@ -44,10 +44,49 @@ static struct tdx_tdi *to_tdx_tdi(struct pci_tdi *tdi)
 	return container_of(tdi, struct tdx_tdi, tdi);
 }
 
+static int tdx_tdi_devifmt_create(struct tdx_tdi *ttdi)
+{
+	return 0;
+}
+
+static void tdx_tdi_devifmt_free(struct tdx_tdi *ttdi)
+{
+}
+
+static int tdx_tdi_mmiomt_create(struct tdx_tdi *ttdi)
+{
+	return 0;
+}
+
+static void tdx_tdi_mmiomt_free(struct tdx_tdi *ttdi)
+{
+}
+
+static int tdx_tdi_devif_create(struct tdx_tdi *ttdi)
+{
+	return 0;
+}
+
+static void tdx_tdi_devif_free(struct tdx_tdi *ttdi)
+{
+}
+
+#define TDX_TDI_REQ_BIND	1
+#define TDX_TDI_REQ_START	2
+#define TDX_TDI_REQ_GET_STATE	3
+#define TDX_TDI_REQ_STOP	4
+
+static int tdx_tdi_request(struct tdx_tdi *ttdi, unsigned int req)
+{
+	return 0;
+}
+
 static struct pci_tdi *tdx_tsm_bind(struct pci_dev *pdev,
 				    struct pci_dev *dsm_dev,
 				    struct kvm *kvm, u64 tdi_id)
 {
+	int ret;
+
 	struct tdx_tdi *ttdi __free(kfree) =
 		kzalloc(sizeof(*ttdi), GFP_KERNEL);
 	if (!ttdi)
@@ -58,17 +97,55 @@ static struct pci_tdi *tdx_tsm_bind(struct pci_dev *pdev,
 	ttdi->tdi.dsm_dev = pci_dev_get(dsm_dev);
 	ttdi->tdi.kvm = kvm;
 
-	/*TODO: TDX Module required operations */
+	ret = tdx_tdi_devifmt_create(ttdi);
+	if (ret) {
+		pci_err(pdev, "fail to init devifmt\n");
+		goto put_dsm_dev;
+	}
+
+	ret = tdx_tdi_devif_create(ttdi);
+	if (ret) {
+		pci_err(pdev, "%s fail to init devif\n", __func__);
+		goto devifmt_free;
+	}
+
+	ret = tdx_tdi_mmiomt_create(ttdi);
+	if (ret) {
+		pci_err(pdev, "%s fail to create mmiomt\n", __func__);
+		goto devif_free;
+	}
+
+	ret = tdx_tdi_request(ttdi, TDX_TDI_REQ_BIND);
+	if (ret) {
+		pci_err(pdev, "%s fail to request bind\n", __func__);
+		goto mmiomt_free;
+	}
 
 	return &no_free_ptr(ttdi)->tdi;
+
+mmiomt_free:
+	tdx_tdi_mmiomt_free(ttdi);
+devif_free:
+	tdx_tdi_devif_free(ttdi);
+devifmt_free:
+	tdx_tdi_devifmt_free(ttdi);
+put_dsm_dev:
+	pci_dev_put(dsm_dev);
+	return NULL;
 }
 
 static void tdx_tsm_unbind(struct pci_tdi *tdi)
 {
 	struct tdx_tdi *ttdi = to_tdx_tdi(tdi);
 
-	/*TODO: TDX Module required operations */
-
+	/*
+	 * TODO: In fact devif cannot be freed before TDI's private MMIOs and
+	 * private DMA are unmapped. Will handle this restriction later.
+	 */
+	tdx_tdi_request(ttdi, TDX_TDI_REQ_STOP);
+	tdx_tdi_mmiomt_free(ttdi);
+	tdx_tdi_devif_free(ttdi);
+	tdx_tdi_devifmt_free(ttdi);
 	pci_dev_put(ttdi->tdi.dsm_dev);
 	kfree(ttdi);
 }

From patchwork Thu May 29 05:35:10 2025
X-Patchwork-Id: 893226
(p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=H9z7fR3g; arc=none smtp.client-ip=198.175.65.10 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="H9z7fR3g" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1748497474; x=1780033474; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=cPetbcObK2zQM71pDawhMP63WT3poHVOOSqqFGJk3PI=; b=H9z7fR3g/3AZe/Wi1Czgxs6jJVmrZndIl+Xq7pB8CzOAKWmImIpCdWjG uwIgOOJgY3O8JL3ldKvjgF4BLwIv+MfVbcQYZV18/5Xs6UcmNtqi4qmpB iCTSZCZWmiuBcMlCN+4A5LHbs/vlDYUdMIGA6jPLbak/HDc8lkfKFZOyo FO+Ox7VuZDpMU7/wbiUDR9n64ySL8nCWoLVL6yNA3A83MB5RzXqC1eH6B DJyRHp2iJau36YwoirR1bbjFgVcPDCjZ1cAVBn5AvdEY4tcGpwtmJDOj7 bFqbcLG6C7/gwAGTuDtgwlRVWO2u5qsamAbgwTLodUGviblJsrSw0Ee4I Q==; X-CSE-ConnectionGUID: /2SEQrd+Rvq54qlsbqGFCw== X-CSE-MsgGUID: d4Svy8/wT/W3QiEB50juow== X-IronPort-AV: E=McAfee;i="6700,10204,11447"; a="67963452" X-IronPort-AV: E=Sophos;i="6.15,323,1739865600"; d="scan'208";a="67963452" Received: from fmviesa009.fm.intel.com ([10.60.135.149]) by orvoesa102.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 28 May 2025 22:44:34 -0700 X-CSE-ConnectionGUID: k5KQxk77QCufk/x1v32kzg== X-CSE-MsgGUID: d4qoKWHwSSek1z6G/x67rQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.15,323,1739865600"; d="scan'208";a="144443639" Received: from yilunxu-optiplex-7050.sh.intel.com ([10.239.159.165]) by fmviesa009.fm.intel.com with ESMTP; 28 May 2025 22:44:27 -0700 From: Xu Yilun To: kvm@vger.kernel.org, sumit.semwal@linaro.org, christian.koenig@amd.com, pbonzini@redhat.com, seanjc@google.com, 
	alex.williamson@redhat.com, jgg@nvidia.com, dan.j.williams@intel.com,
	aik@amd.com, linux-coco@lists.linux.dev
Cc: dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org, vivek.kasireddy@intel.com,
	yilun.xu@intel.com, yilun.xu@linux.intel.com,
	linux-kernel@vger.kernel.org, lukas@wunner.de, yan.y.zhao@intel.com,
	daniel.vetter@ffwll.ch, leon@kernel.org, baolu.lu@linux.intel.com,
	zhenzhong.duan@intel.com, tao1.su@intel.com, linux-pci@vger.kernel.org,
	zhiw@nvidia.com, simona.vetter@ffwll.ch,
	shameerali.kolothum.thodi@huawei.com, aneesh.kumar@kernel.org,
	iommu@lists.linux.dev, kevin.tian@intel.com
Subject: [RFC PATCH 27/30] PCI/TSM: Add PCI driver callbacks to handle TSM requirements
Date: Thu, 29 May 2025 13:35:10 +0800
Message-Id: <20250529053513.1592088-28-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

Add optional PCI driver callbacks for notifying TSM events. For now, these
handlers may be called during pci_tsm_unbind(). By calling them, the TSM
driver asks for external collaboration to complete the entire TSM unbind
flow. If a platform TSM driver can finish TSM bind/unbind entirely by
itself, it need not call these handlers.

The host may need to configure various system components according to the
platform trusted firmware's requirements. E.g. for Intel TDX Connect, the
host should handle private MMIO mapping in the S-EPT, trusted DMA setup,
device ownership claiming and device TDISP state transitions. Some of
these operations are outside the control of PCI TSM and need collaboration
from external components such as the IOMMU driver and KVM. Furthermore,
the trusted firmware may enforce executing these operations in a fixed
sequence. E.g. Intel TDX Connect enforces the following sequence for TSM
unbind:

 1.
STOP TDI via TDISP message STOP_INTERFACE
 2. Private MMIO unmap from Secure EPT
 3. Trusted Device Context Table cleanup for the TDI
 4. TDI ownership reclaim and metadata free

PCI TSM can do Steps 1 and 4, but needs KVM for Step 2 and the IOMMU
driver for Step 3. While it is possible for TSM to provide finer-grained
APIs like tdi_stop() & tdi_free() and let the caller ensure the sequence,
it is better that these platform-specific enforcements be managed in the
platform TSM driver. By introducing TSM handlers, the platform TSM driver
controls the operation sequence and notifies other components to do the
real work.

For now, add 3 callbacks for TDX Connect. disable_mmio() is for VFIO to
invalidate MMIO so that KVM can unmap it from the S-EPT. recover_mmio()
re-validates MMIO so that KVM can map it again for a shared assigned
device. disable_trusted_dma() cleans up the trusted IOMMU setup.

Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 include/linux/pci-tsm.h | 7 +++++++
 include/linux/pci.h     | 3 +++
 2 files changed, 10 insertions(+)

diff --git a/include/linux/pci-tsm.h b/include/linux/pci-tsm.h
index 737767f8a9c5..ed549724eb5b 100644
--- a/include/linux/pci-tsm.h
+++ b/include/linux/pci-tsm.h
@@ -157,6 +157,13 @@ struct pci_tsm_ops {
 	int (*accept)(struct pci_dev *pdev);
 };

+/* pci drivers callbacks for TSM */
+struct pci_tsm_handlers {
+	void (*disable_mmio)(struct pci_dev *dev);
+	void (*recover_mmio)(struct pci_dev *dev);
+	void (*disable_trusted_dma)(struct pci_dev *dev);
+};
+
 enum pci_doe_proto {
 	PCI_DOE_PROTO_CMA = 1,
 	PCI_DOE_PROTO_SSESSION = 2,

diff --git a/include/linux/pci.h b/include/linux/pci.h
index 5f37957da18f..4f768b4658e8 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -545,6 +545,7 @@ struct pci_dev {
 #endif
 #ifdef CONFIG_PCI_TSM
 	struct pci_tsm *tsm;		/* TSM operation state */
+	void *trusted_dma_owner;
 #endif
 	u16 acs_cap;			/* ACS Capability offset */
 	u8 supported_speeds;		/* Supported Link Speeds Vector */
@@ -957,6 +958,7 @@ struct module;
 * @sriov_get_vf_total_msix: PF driver
callback to get the total number of
 *	MSI-X vectors available for distribution to the VFs.
 * @err_handler: See Documentation/PCI/pci-error-recovery.rst
+ * @tsm_handler: Optional driver callbacks to handle TSM requirements.
 * @groups: Sysfs attribute groups.
 * @dev_groups: Attributes attached to the device that will be
 *		created once it is bound to the driver.
@@ -982,6 +984,7 @@ struct pci_driver {
 	int (*sriov_set_msix_vec_count)(struct pci_dev *vf, int msix_vec_count); /* On PF */
 	u32 (*sriov_get_vf_total_msix)(struct pci_dev *pf);
 	const struct pci_error_handlers *err_handler;
+	struct pci_tsm_handlers *tsm_handler;
 	const struct attribute_group **groups;
 	const struct attribute_group **dev_groups;
 	struct device_driver driver;

From patchwork Thu May 29 05:35:11 2025
From: Xu Yilun
To: kvm@vger.kernel.org, sumit.semwal@linaro.org, christian.koenig@amd.com,
	pbonzini@redhat.com, seanjc@google.com, alex.williamson@redhat.com,
	jgg@nvidia.com, dan.j.williams@intel.com, aik@amd.com,
	linux-coco@lists.linux.dev
Cc: dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org, vivek.kasireddy@intel.com,
	yilun.xu@intel.com, yilun.xu@linux.intel.com,
	linux-kernel@vger.kernel.org, lukas@wunner.de, yan.y.zhao@intel.com,
	daniel.vetter@ffwll.ch, leon@kernel.org, baolu.lu@linux.intel.com,
	zhenzhong.duan@intel.com, tao1.su@intel.com, linux-pci@vger.kernel.org,
	zhiw@nvidia.com, simona.vetter@ffwll.ch,
	shameerali.kolothum.thodi@huawei.com, aneesh.kumar@kernel.org,
	iommu@lists.linux.dev, kevin.tian@intel.com
Subject: [RFC PATCH 28/30] vfio/pci: Implement TSM handlers for MMIO
Date: Thu, 29 May 2025 13:35:11 +0800
Message-Id: <20250529053513.1592088-29-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

VFIO invalidates MMIO on disable_mmio() so that KVM can unmap it from the
S-EPT. VFIO re-validates MMIO on recover_mmio() so that KVM can map it
again for a shared assigned device.

For now these handlers are mainly for Intel TDX Connect; they should have
no impact elsewhere since other platform TSM drivers don't call them.
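The disable_mmio()/recover_mmio() pair described above follows an
invalidate-then-conditionally-restore pattern: MMIO is always invalidated
on disable, but only re-validated on recover if the device still has
memory decoding enabled. A minimal userspace sketch of that state machine
(hypothetical `fake_vdev` type, not the kernel code):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the relevant vfio_pci_core_device state. */
struct fake_vdev {
	bool memory_enabled;	/* PCI COMMAND memory decode bit */
	bool mmio_valid;	/* whether exported MMIO is usable */
};

/* Mirrors disable_mmio(): unconditionally invalidate. */
static void tsm_disable_mmio(struct fake_vdev *v)
{
	v->mmio_valid = false;
}

/* Mirrors recover_mmio(): re-validate only if memory decode is on. */
static void tsm_recover_mmio(struct fake_vdev *v)
{
	if (v->memory_enabled)
		v->mmio_valid = true;
}
```

This models why the real recover_mmio() checks __vfio_pci_memory_enabled()
first: re-validating MMIO for a device whose memory decoding is off would
hand out unusable mappings.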
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 drivers/vfio/pci/vfio_pci.c      |  1 +
 drivers/vfio/pci/vfio_pci_core.c | 26 ++++++++++++++++++++++++++
 include/linux/vfio_pci_core.h    |  1 +
 3 files changed, 28 insertions(+)

diff --git a/drivers/vfio/pci/vfio_pci.c b/drivers/vfio/pci/vfio_pci.c
index 5ba39f7623bb..df25a3083fb0 100644
--- a/drivers/vfio/pci/vfio_pci.c
+++ b/drivers/vfio/pci/vfio_pci.c
@@ -202,6 +202,7 @@ static struct pci_driver vfio_pci_driver = {
 	.remove			= vfio_pci_remove,
 	.sriov_configure	= vfio_pci_sriov_configure,
 	.err_handler		= &vfio_pci_core_err_handlers,
+	.tsm_handler		= &vfio_pci_core_tsm_handlers,
 	.driver_managed_dma	= true,
 };

diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index a8437fcecca1..405461583e2f 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1452,6 +1453,31 @@ static int vfio_pci_ioctl_ioeventfd(struct vfio_pci_core_device *vdev,
 					  ioeventfd.fd);
 }

+static void vfio_pci_core_tsm_disable_mmio(struct pci_dev *pdev)
+{
+	struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);
+
+	down_write(&vdev->memory_lock);
+	vfio_pci_dma_buf_move(vdev, true);
+	up_write(&vdev->memory_lock);
+}
+
+static void vfio_pci_core_tsm_recover_mmio(struct pci_dev *pdev)
+{
+	struct vfio_pci_core_device *vdev = dev_get_drvdata(&pdev->dev);
+
+	down_write(&vdev->memory_lock);
+	if (__vfio_pci_memory_enabled(vdev))
+		vfio_pci_dma_buf_move(vdev, false);
+	up_write(&vdev->memory_lock);
+}
+
+struct pci_tsm_handlers vfio_pci_core_tsm_handlers = {
+	.disable_mmio = vfio_pci_core_tsm_disable_mmio,
+	.recover_mmio = vfio_pci_core_tsm_recover_mmio,
+};
+EXPORT_SYMBOL_GPL(vfio_pci_core_tsm_handlers);
+
 static int vfio_pci_ioctl_tsm_bind(struct vfio_pci_core_device *vdev,
 				   void __user *arg)
 {

diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h
index b2982100221f..7da71b861d87 100644
---
 a/include/linux/vfio_pci_core.h
+++ b/include/linux/vfio_pci_core.h
@@ -111,6 +111,7 @@ void vfio_pci_core_release_dev(struct vfio_device *core_vdev);
 int vfio_pci_core_register_device(struct vfio_pci_core_device *vdev);
 void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev);
 extern const struct pci_error_handlers vfio_pci_core_err_handlers;
+extern struct pci_tsm_handlers vfio_pci_core_tsm_handlers;
 int vfio_pci_core_sriov_configure(struct vfio_pci_core_device *vdev,
 				  int nr_virtfn);
 long vfio_pci_core_ioctl(struct vfio_device *core_vdev, unsigned int cmd,

From patchwork Thu May 29 05:35:12 2025
From: Xu Yilun
To: kvm@vger.kernel.org, sumit.semwal@linaro.org, christian.koenig@amd.com,
	pbonzini@redhat.com, seanjc@google.com, alex.williamson@redhat.com,
	jgg@nvidia.com, dan.j.williams@intel.com, aik@amd.com,
	linux-coco@lists.linux.dev
Cc:
	dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org, vivek.kasireddy@intel.com,
	yilun.xu@intel.com, yilun.xu@linux.intel.com,
	linux-kernel@vger.kernel.org, lukas@wunner.de, yan.y.zhao@intel.com,
	daniel.vetter@ffwll.ch, leon@kernel.org, baolu.lu@linux.intel.com,
	zhenzhong.duan@intel.com, tao1.su@intel.com, linux-pci@vger.kernel.org,
	zhiw@nvidia.com, simona.vetter@ffwll.ch,
	shameerali.kolothum.thodi@huawei.com, aneesh.kumar@kernel.org,
	iommu@lists.linux.dev, kevin.tian@intel.com
Subject: [RFC PATCH 29/30] iommufd/vdevice: Implement TSM handlers for trusted DMA
Date: Thu, 29 May 2025 13:35:12 +0800
Message-Id: <20250529053513.1592088-30-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>

IOMMUFD implements the disable_trusted_dma() handler to clean up trusted
DMA configuration when the device is to be unbound.

For now this handler is mainly for Intel TDX Connect; it should have no
impact elsewhere since other platform TSM drivers don't call it.
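The handler here is installed dynamically: at bind time iommufd records
itself as the device's trusted-DMA owner and plugs its callback into the
driver's handler table; at unbind it tears down trusted DMA and clears
both again. A hedged userspace sketch of that ownership/callback
lifecycle (all `fake_*` names are hypothetical, not kernel API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_pdev;

struct fake_tsm_handlers {
	void (*disable_trusted_dma)(struct fake_pdev *pdev);
};

struct fake_pdev {
	void *trusted_dma_owner;		/* set by the owning vdevice */
	struct fake_tsm_handlers *tsm_handler;	/* driver-provided table */
};

struct fake_vdev {
	bool trusted_dma_enabled;
};

static void vdev_disable_trusted_dma(struct fake_vdev *v)
{
	v->trusted_dma_enabled = false;
}

/* Callback installed into the driver handler table at bind time. */
static void pdev_disable_trusted_dma(struct fake_pdev *pdev)
{
	vdev_disable_trusted_dma(pdev->trusted_dma_owner);
}

static void tsm_bind(struct fake_pdev *pdev, struct fake_vdev *v)
{
	pdev->trusted_dma_owner = v;
	pdev->tsm_handler->disable_trusted_dma = pdev_disable_trusted_dma;
	v->trusted_dma_enabled = true;
}

static void tsm_unbind(struct fake_pdev *pdev, struct fake_vdev *v)
{
	vdev_disable_trusted_dma(v);
	/* Clear owner and callback so a stale pointer is never invoked. */
	pdev->trusted_dma_owner = NULL;
	pdev->tsm_handler->disable_trusted_dma = NULL;
}
```

The point of the indirection is that the platform TSM driver only sees a
`pci_dev` and a function pointer; it needs no knowledge of iommufd's
vdevice object to ask for trusted-DMA teardown mid-unbind.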
Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 drivers/iommu/iommufd/viommu.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c
index c64ce1a9f87d..b7281a4422ff 100644
--- a/drivers/iommu/iommufd/viommu.c
+++ b/drivers/iommu/iommufd/viommu.c
@@ -255,8 +255,16 @@ static void iommufd_vdevice_disable_trusted_dma(struct iommufd_vdevice *vdev)
 	vdev->trusted_dma_enabled = false;
 }

+static void pci_driver_disable_trusted_dma(struct pci_dev *pdev)
+{
+	struct iommufd_vdevice *vdev = pdev->trusted_dma_owner;
+
+	iommufd_vdevice_disable_trusted_dma(vdev);
+}
+
 int iommufd_vdevice_tsm_bind(struct iommufd_vdevice *vdev)
 {
+	struct pci_dev *pdev = to_pci_dev(vdev->dev);
 	struct kvm *kvm;
 	int rc;

@@ -272,6 +280,9 @@ int iommufd_vdevice_tsm_bind(struct iommufd_vdevice *vdev)
 		goto out_unlock;
 	}

+	pdev->trusted_dma_owner = vdev;
+	pdev->driver->tsm_handler->disable_trusted_dma = pci_driver_disable_trusted_dma;
+
 	rc = iommufd_vdevice_enable_trusted_dma(vdev);
 	if (rc)
 		goto out_unlock;
@@ -292,12 +303,16 @@ int iommufd_vdevice_tsm_bind(struct iommufd_vdevice *vdev)

 void iommufd_vdevice_tsm_unbind(struct iommufd_vdevice *vdev)
 {
+	struct pci_dev *pdev = to_pci_dev(vdev->dev);
+
 	mutex_lock(&vdev->tsm_lock);
 	if (!vdev->tsm_bound)
 		goto out_unlock;

 	pci_tsm_unbind(to_pci_dev(vdev->dev));
 	iommufd_vdevice_disable_trusted_dma(vdev);
+	pdev->trusted_dma_owner = NULL;
+	pdev->driver->tsm_handler->disable_trusted_dma = NULL;
 	vdev->tsm_bound = false;

 out_unlock:

From patchwork Thu May 29 05:35:13 2025
From: Xu Yilun
To: kvm@vger.kernel.org, sumit.semwal@linaro.org, christian.koenig@amd.com,
	pbonzini@redhat.com, seanjc@google.com, alex.williamson@redhat.com,
	jgg@nvidia.com, dan.j.williams@intel.com, aik@amd.com,
	linux-coco@lists.linux.dev
Cc: dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
	linaro-mm-sig@lists.linaro.org, vivek.kasireddy@intel.com,
	yilun.xu@intel.com, yilun.xu@linux.intel.com,
	linux-kernel@vger.kernel.org, lukas@wunner.de, yan.y.zhao@intel.com,
	daniel.vetter@ffwll.ch, leon@kernel.org, baolu.lu@linux.intel.com,
	zhenzhong.duan@intel.com, tao1.su@intel.com, linux-pci@vger.kernel.org,
	zhiw@nvidia.com, simona.vetter@ffwll.ch,
	shameerali.kolothum.thodi@huawei.com, aneesh.kumar@kernel.org,
	iommu@lists.linux.dev, kevin.tian@intel.com
Subject: [RFC PATCH 30/30] coco/tdx_tsm: Manage TDX Module enforced operation sequences for Unbind
Date: Thu, 29 May 2025 13:35:13 +0800
Message-Id: <20250529053513.1592088-31-yilun.xu@linux.intel.com>
In-Reply-To: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
References: <20250529053513.1592088-1-yilun.xu@linux.intel.com>
Implement the TDX Connect enforced sequence for TSM unbind. The enforced
sequence is:

 1. STOP TDI via TDISP message STOP_INTERFACE
 2. Private MMIO unmap from Secure EPT
 3. Trusted Device Context Table cleanup for the TDI
 4. TDI ownership reclaim and metadata free

Step 2 is the responsibility of KVM; step 3 is for the IOMMU driver. So
the TDX TSM driver needs to invoke the TSM handlers for external
collaboration.

Signed-off-by: Xu Yilun <yilun.xu@linux.intel.com>
---
 drivers/virt/coco/host/tdx_tsm.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/virt/coco/host/tdx_tsm.c b/drivers/virt/coco/host/tdx_tsm.c
index beb65f45b478..66d6019812ca 100644
--- a/drivers/virt/coco/host/tdx_tsm.c
+++ b/drivers/virt/coco/host/tdx_tsm.c
@@ -87,6 +87,15 @@ static struct pci_tdi *tdx_tsm_bind(struct pci_dev *pdev,
 {
 	int ret;

+	if (!pdev->trusted_dma_owner ||
+	    !pdev->driver->tsm_handler ||
+	    !pdev->driver->tsm_handler->disable_mmio ||
+	    !pdev->driver->tsm_handler->recover_mmio ||
+	    !pdev->driver->tsm_handler->disable_trusted_dma) {
+		pci_err(pdev, "%s no driver or driver not support bind\n", __func__);
+		return NULL;
+	}
+
 	struct tdx_tdi *ttdi __free(kfree) =
 		kzalloc(sizeof(*ttdi), GFP_KERNEL);
 	if (!ttdi)
@@ -137,15 +146,15 @@ static struct pci_tdi *tdx_tsm_bind(struct pci_dev *pdev,
 static void tdx_tsm_unbind(struct pci_tdi *tdi)
 {
 	struct tdx_tdi *ttdi = to_tdx_tdi(tdi);
+	struct pci_dev *pdev = tdi->pdev;

-	/*
-	 * TODO: In fact devif cannot be freed before TDI's private MMIOs and
-	 * private DMA are unmapped. Will handle this restriction later.
-	 */
 	tdx_tdi_request(ttdi, TDX_TDI_REQ_STOP);
+	pdev->driver->tsm_handler->disable_mmio(pdev);
+	pdev->driver->tsm_handler->disable_trusted_dma(pdev);
 	tdx_tdi_mmiomt_free(ttdi);
 	tdx_tdi_devif_free(ttdi);
 	tdx_tdi_devifmt_free(ttdi);
+	pdev->driver->tsm_handler->recover_mmio(pdev);
 	pci_dev_put(ttdi->tdi.dsm_dev);
 	kfree(ttdi);
 }
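The unbind path above interleaves TDX Module requests with the driver
handlers in a fixed order: STOP the TDI, tear down private MMIO and
trusted DMA via the handlers, free the TDX metadata, then recover MMIO
for shared use. A userspace sketch that records the call order (all names
hypothetical; in the real driver the ordering is enforced by the TDX
Module, not the sketch):

```c
#include <assert.h>
#include <string.h>

static char log_buf[128];

/* Append each step name so the final order can be inspected. */
static void step(const char *name)
{
	strcat(log_buf, name);
	strcat(log_buf, ";");
}

/* Stand-ins for the handlers a PCI driver like vfio-pci would supply. */
static void disable_mmio(void)        { step("disable_mmio"); }
static void disable_trusted_dma(void) { step("disable_dma"); }
static void recover_mmio(void)        { step("recover_mmio"); }

/* Mirrors tdx_tsm_unbind(): STOP, handlers, metadata free, recover. */
static void tdx_unbind(void)
{
	step("stop");		/* TDX_TDI_REQ_STOP */
	disable_mmio();		/* let KVM unmap from S-EPT */
	disable_trusted_dma();	/* IOMMU trusted DMA teardown */
	step("free_meta");	/* mmiomt/devif/devifmt free */
	recover_mmio();		/* MMIO usable again for the shared device */
}
```

Note that recover_mmio() deliberately comes last: the device interface
metadata must be fully reclaimed before the MMIO ranges can be handed
back for shared (non-TDX) use.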