mbox series

[v2,0/4] iommufd: Add iommu hardware info reporting

Message ID 20230309075358.571567-1-yi.l.liu@intel.com
Headers show
Series iommufd: Add iommu hardware info reporting | expand

Message

Yi Liu March 9, 2023, 7:53 a.m. UTC
iommufd gives userspace the capability to manipulate iommu subsytem.
e.g. DMA map/unmap etc. In the near future, it will support iommu nested
translation. Different platform vendors have different implementation for
the nested translation. So before set up nested translation, userspace
needs to know the hardware iommu information. For example, Intel VT-d
supports using guest I/O page table as the stage-1 translation table. This
requires guest I/O page table be compatible with hardware IOMMU.

This series reports the iommu hardware information for a given iommufd_device
which has been bound to iommufd. It is preparation work for userspace to
allocate hwpt for given device. Like the nested translation support[1].

This series introduces an iommu op to report the iommu hardware info,
and an ioctl IOMMU_DEVICE_GET_HW_INFO is added to report such hardware
info to user. enum iommu_hw_info_type is defined to differentiate the
iommu hardware info reported to user hence user can decode them. This
series only adds the framework for iommu hw info reporting, the complete
reporting path needs vendor specific definition and driver support. The
full picture is available in [1] as well.

base-commit: 4c7e97cb6e65eab02991f60a5cc7a4fed5498c7a

[1] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting

Change log:

v2:
 - Drop patch 05 of v1 as it is already covered by other series
 - Rename the capability info to be iommu hardware info

v1: https://lore.kernel.org/linux-iommu/20230209041642.9346-1-yi.l.liu@intel.com/

Regards,
	Yi Liu

Lu Baolu (1):
  iommu: Add new iommu op to get iommu hardware information

Nicolin Chen (1):
  iommufd/selftest: Add coverage for IOMMU_DEVICE_GET_HW_INFO ioctl

Yi Liu (2):
  iommu: Move dev_iommu_ops() to private header
  iommufd: Add IOMMU_DEVICE_GET_HW_INFO

 drivers/iommu/iommu-priv.h                    | 11 +++
 drivers/iommu/iommu.c                         |  2 +
 drivers/iommu/iommufd/device.c                | 75 +++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h       |  1 +
 drivers/iommu/iommufd/iommufd_test.h          | 15 ++++
 drivers/iommu/iommufd/main.c                  |  3 +
 drivers/iommu/iommufd/selftest.c              | 16 ++++
 include/linux/iommu.h                         | 24 +++---
 include/uapi/linux/iommufd.h                  | 47 ++++++++++++
 tools/testing/selftests/iommu/iommufd.c       | 17 ++++-
 tools/testing/selftests/iommu/iommufd_utils.h | 26 +++++++
 11 files changed, 225 insertions(+), 12 deletions(-)

Comments

Yi Liu March 10, 2023, 8:06 a.m. UTC | #1
> From: Jason Gunthorpe <jgg@nvidia.com>
> Sent: Friday, March 10, 2023 1:21 AM
> 
> On Thu, Mar 09, 2023 at 09:50:18PM +0800, Baolu Lu wrote:
> 
> > > +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
> > > +		return -EOPNOTSUPP;
> > > +
> > > +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> > > +	if (IS_ERR(idev))
> > > +		return PTR_ERR(idev);
> > > +
> > > +	ops = dev_iommu_ops(idev->dev);
> > > +	if (!ops || !ops->hw_info) {
> >
> > dev_iommu_ops() will never return a NULL.
> >
> > Need below check
> >
> > 	dev->iommu && dev->iommu->iommu_dev
> >
> > before dev_iommu_ops(). Perhaps something like below?
> >
> > 	if (!dev->iommu || !dev->iommu->iommu_dev)
> > 		return -EINVAL;
> 
> At this point the device has become owned through the ownership API,
> it absolutely has to have an iommu and an ops. No need to check
> anything.

ok. so just needs to check hw_info callback.
Tian, Kevin March 16, 2023, 8:23 a.m. UTC | #2
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, March 9, 2023 3:54 PM
> +
> +int iommufd_device_get_hw_info(struct iommufd_ucmd *ucmd)
> +{
> +	struct iommu_hw_info *cmd = ucmd->cmd;
> +	struct iommufd_device *idev;
> +	const struct iommu_ops *ops;
> +	void *data;
> +	unsigned int length, data_len;
> +	int rc;
> +
> +	if (cmd->flags || cmd->__reserved || !cmd->data_len)
> +		return -EOPNOTSUPP;
> +
> +	idev = iommufd_get_device(ucmd, cmd->dev_id);
> +	if (IS_ERR(idev))
> +		return PTR_ERR(idev);
> +
> +	ops = dev_iommu_ops(idev->dev);
> +	if (!ops || !ops->hw_info) {
> +		rc = -EOPNOTSUPP;
> +		goto out_put;
> +	}
> +
> +	/* driver has hw_info callback should have a unique driver_type */
> +	if (WARN_ON(ops->driver_type ==
> IOMMU_HW_INFO_TYPE_DEFAULT)) {
> +		rc = -EOPNOTSUPP;
> +		goto out_put;
> +	}

ok, here is where the check is done.

> +
> +	data = ops->hw_info(idev->dev, &data_len);

if we directly return type in @hw_info, this becomes:

	data = ops->hw_info(idev->dev, &data_len, &driver_type);

> +/**
> + * struct iommu_hw_info - ioctl(IOMMU_DEVICE_GET_HW_INFO)
> + * @size: sizeof(struct iommu_hw_info)
> + * @flags: Must be 0
> + * @dev_id: The device being attached to the iommufd

"The device bound to the iommufd"

> + * @data_len: Input the length of the user buffer in bytes. Output the
> + *            length of data filled to the user buffer.

s/to/in/

> + * @data_ptr: Pointer to the type specific structure

"Pointer to the user buffer"

> + * @out_data_type: Output the iommu hardware info type, it is one of
> + *                 enum iommu_hw_info_type.

s/it is one of/as defined by/

> + * @__reserved: Must be 0
> + *
> + * Query the hardware iommu information for given device which has been
> + * bound to iommufd. @data_len is the size of the buffer which captures
> + * iommu type specific data and the data will be filled. Trailing bytes
> + * are zeroed if the user buffer is larger than the data kernel has.
> + *
> + * The type specific data would be used to sync capability between the
> + * vIOMMU and the hardware IOMMU. e.g. nested translation requires to

s/vIOMMU/virtual IOMMU/

> + * check the hardware IOMMU capability, since a stage-1 translation table
> + * is owned by user but used by hardware IOMMU.

"check ... capability so guest stage-1 page table uses a format compatible
to the hardware IOMMU"

> + *
> + * The @out_data_type will be filled if the ioctl succeeds. It would
> + * be used to decode the data filled in the buffer pointed by @data_ptr.
> + *
> + * This is only available for the physical devices bound to iommufd as
> + * only physical devices can have hardware IOMMU.

not required. User doesn't know whether it's physical or emulated
device.