Message ID: 20231227161354.67701-1-yi.l.liu@intel.com
Series: Add iommufd nesting (part 2/2)
On Wed, Dec 27, 2023 at 08:13:44AM -0800, Yi Liu wrote:
> [2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
This branch is still pointing to v7's iommufd_nesting-12212023-yi.
Likely you should update it to v8's iommufd_nesting-12272023-yi :)
Thanks
Nic
On 2023/12/28 04:58, Nicolin Chen wrote:
> On Wed, Dec 27, 2023 at 08:13:44AM -0800, Yi Liu wrote:
>> [2] https://github.com/yiliu1765/iommufd/tree/iommufd_nesting
>
> This branch is still pointing to v7's iommufd_nesting-12212023-yi.
> Likely you should update it to v8's iommufd_nesting-12272023-yi :)

yes, it is. done! :)
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, December 28, 2023 12:14 AM
>
> @@ -1376,6 +1383,8 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
>
>  restart:
>  	rc = 0;
> +	if (fault)
> +		*fault = 0;

move it to right before the loop of qi_check_fault()

>
>  	raw_spin_lock_irqsave(&qi->q_lock, flags);
>  	/*
> @@ -1430,7 +1439,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
>  	 * a deadlock where the interrupt context can wait indefinitely
>  	 * for free slots in the queue.
>  	 */
> -	rc = qi_check_fault(iommu, index, wait_index);
> +	rc = qi_check_fault(iommu, index, wait_index, fault);
>  	if (rc)
>  		break;

and as replied in another thread let's change qi_check_fault to return
-ETIMEDOUT to break the restart loop when fault pointer is valid.
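For context, the shape of the change being discussed is sketched below. This is an abridged, illustrative fragment based on qi_submit_sync() in drivers/iommu/intel/dmar.c plus the two suggestions above (clearing the reported fault right before the polling loop, and threading the caller's fault pointer into qi_check_fault()); it is not the v8 patch itself, and the elided queue management is only hinted at in comments.

/*
 * Illustrative sketch only -- not the v8 patch.  Most of the queue
 * management in qi_submit_sync() is elided; the point is where the
 * fault initialization lands and how the fault pointer is threaded
 * through to qi_check_fault().
 */
int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
		   unsigned int count, unsigned long options, u32 *fault)
{
	struct q_inval *qi = iommu->qi;
	unsigned long flags;
	int rc, index, wait_index;

restart:
	rc = 0;
	raw_spin_lock_irqsave(&qi->q_lock, flags);
	/* ... reserve slots (sets index/wait_index), write descriptors ... */

	/*
	 * Suggested placement: clear the caller-visible fault right before
	 * the polling loop, so nothing stale survives a restart.
	 */
	if (fault)
		*fault = 0;

	while (qi->desc_status[wait_index] != QI_DONE) {
		/*
		 * Poll the fault register here rather than sleeping, to
		 * avoid a deadlock where the interrupt context waits
		 * indefinitely for free slots in the queue.
		 */
		rc = qi_check_fault(iommu, index, wait_index, fault);
		if (rc)
			break;
		cpu_relax();
	}

	/* ... reclaim the used slots ... */
	raw_spin_unlock_irqrestore(&qi->q_lock, flags);

	/*
	 * With qi_check_fault() returning -ETIMEDOUT when @fault is valid,
	 * a caller that asked for the raw fault status never loops here.
	 */
	if (rc == -EAGAIN)
		goto restart;

	return rc;
}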
> From: Liu, Yi L <yi.l.liu@intel.com>
> Sent: Thursday, December 28, 2023 12:14 AM
>
> +/**
> + * struct iommu_hwpt_vtd_s1_invalidate - Intel VT-d cache invalidation
> + *                                       (IOMMU_HWPT_INVALIDATE_DATA_VTD_S1)
> + * @addr: The start address of the range to be invalidated. It needs to
> + *        be 4KB aligned.
> + * @npages: Number of contiguous 4K pages to be invalidated.
> + * @flags: Combination of enum iommu_hwpt_vtd_s1_invalidate_flags
> + * @hw_error: One of enum iommu_hwpt_vtd_s1_invalidate_error
> + *
> + * The Intel VT-d specific invalidation data for user-managed stage-1 cache
> + * invalidation in nested translation. Userspace uses this structure to
> + * tell the impacted cache scope after modifying the stage-1 page table.
> + *
> + * Invalidating all the caches related to the page table by setting @addr
> + * to be 0 and @npages to be U64_MAX.
> + *
> + * The device TLB will be invalidated automatically if ATS is enabled.
> + *
> + * The @hw_error is meaningful when the entry is handled by the kernel.
> + * Check the entry_num output of IOMMU_HWPT_INVALIDATE ioctl to know the
> + * handled entries. @hw_error only covers the errors detected by hardware.
> + * The software detected errors would go through the normal ioctl errno.
> + */

 * An entry is considered 'handled' after it passes the audit and submitted
 * to the IOMMU by the underlying driver. Check the @entry_num output of
 * struct iommu_hwpt_invalidate for the number of handled entries. A 'handled'
 * request may still fail in hardware for various reasons, e.g. due to timeout
 * on waiting for device response upon a device TLB invalidation request. In
 * such case the hardware error info is reported in the @hw_error field of the
 * handled entry.
> From: Tian, Kevin
> Sent: Thursday, December 28, 2023 2:38 PM
>
> > From: Liu, Yi L <yi.l.liu@intel.com>
> > Sent: Thursday, December 28, 2023 12:14 AM
> > +/**
> > + * struct iommu_hwpt_vtd_s1_invalidate - Intel VT-d cache invalidation
> > + *                                       (IOMMU_HWPT_INVALIDATE_DATA_VTD_S1)
> > + * @addr: The start address of the range to be invalidated. It needs to
> > + *        be 4KB aligned.
> > + * @npages: Number of contiguous 4K pages to be invalidated.
> > + * @flags: Combination of enum iommu_hwpt_vtd_s1_invalidate_flags
> > + * @hw_error: One of enum iommu_hwpt_vtd_s1_invalidate_error
> > + *
> > + * The Intel VT-d specific invalidation data for user-managed stage-1 cache
> > + * invalidation in nested translation. Userspace uses this structure to
> > + * tell the impacted cache scope after modifying the stage-1 page table.
> > + *
> > + * Invalidating all the caches related to the page table by setting @addr
> > + * to be 0 and @npages to be U64_MAX.
> > + *
> > + * The device TLB will be invalidated automatically if ATS is enabled.
> > + *
> > + * The @hw_error is meaningful when the entry is handled by the kernel.
> > + * Check the entry_num output of IOMMU_HWPT_INVALIDATE ioctl to know the
> > + * handled entries. @hw_error only covers the errors detected by hardware.
> > + * The software detected errors would go through the normal ioctl errno.
> > + */
>
> * An entry is considered 'handled' after it passes the audit and submitted
> * to the IOMMU by the underlying driver. Check the @entry_num output of
> * struct iommu_hwpt_invalidate for the number of handled entries. A 'handled'
> * request may still fail in hardware for various reasons, e.g. due to timeout
> * on waiting for device response upon a device TLB invalidation request. In
> * such case the hardware error info is reported in the @hw_error field of the
> * handled entry.

with that:

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
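To illustrate the 'handled entry' semantics being documented here, a hedged userspace sketch follows. It assumes the v8 uapi layout quoted above (in particular the per-entry @hw_error field and the kernel writing handled entries back to the user array); invalidate_s1_range() and its parameters are made up for the example, and the final uapi may differ.

/*
 * Userspace sketch of how the 'handled entries' semantics play out.
 * The VT-d stage-1 data layout (including @hw_error) follows the v8
 * patch quoted in this thread; field names may differ in the final
 * uapi header.
 */
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/iommufd.h>

static int invalidate_s1_range(int iommufd, uint32_t hwpt_id,
			       uint64_t addr, uint64_t npages)
{
	struct iommu_hwpt_vtd_s1_invalidate reqs[2] = {};
	struct iommu_hwpt_invalidate cmd = {
		.size = sizeof(cmd),
		.hwpt_id = hwpt_id,
		.data_type = IOMMU_HWPT_INVALIDATE_DATA_VTD_S1,
		.data_uptr = (uintptr_t)reqs,
		.entry_len = sizeof(reqs[0]),
		.entry_num = 2,
	};
	uint32_t i;
	int rc;

	/* Entry 0: invalidate the given range. */
	reqs[0].addr = addr;
	reqs[0].npages = npages;
	/* Entry 1: invalidate everything (addr = 0, npages = U64_MAX). */
	reqs[1].addr = 0;
	reqs[1].npages = UINT64_MAX;

	rc = ioctl(iommufd, IOMMU_HWPT_INVALIDATE, &cmd);

	/*
	 * cmd.entry_num is updated to the number of *handled* entries.
	 * A handled entry may still carry a hardware error in hw_error,
	 * e.g. a device-TLB invalidation timeout; software-detected
	 * errors show up as the usual ioctl errno instead.
	 */
	for (i = 0; i < cmd.entry_num; i++) {
		if (reqs[i].hw_error)
			fprintf(stderr, "entry %u: hw_error %u\n",
				i, reqs[i].hw_error);
	}
	return rc;
}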
On 2023/12/28 14:17, Tian, Kevin wrote:
>> From: Liu, Yi L <yi.l.liu@intel.com>
>> Sent: Thursday, December 28, 2023 12:14 AM
>>
>> @@ -1376,6 +1383,8 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
>>
>>  restart:
>>  	rc = 0;
>> +	if (fault)
>> +		*fault = 0;
>
> move it to right before the loop of qi_check_fault()

ok.

>>
>>  	raw_spin_lock_irqsave(&qi->q_lock, flags);
>>  	/*
>> @@ -1430,7 +1439,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
>>  	 * a deadlock where the interrupt context can wait indefinitely
>>  	 * for free slots in the queue.
>>  	 */
>> -	rc = qi_check_fault(iommu, index, wait_index);
>> +	rc = qi_check_fault(iommu, index, wait_index, fault);
>>  	if (rc)
>>  		break;
>
> and as replied in another thread let's change qi_check_fault to return
> -ETIMEDOUT to break the restart loop when fault pointer is valid.

sure.
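For reference, the agreed direction for qi_check_fault() might look roughly like the fragment below. It is abridged (the IQE recovery and the ITE descriptor-matching logic of the real function in drivers/iommu/intel/dmar.c are left out) and only illustrates the two agreed points: report the raw fault status through @fault, and return -ETIMEDOUT instead of asking for a restart when the caller supplied a fault pointer.

/*
 * Abridged sketch -- the real qi_check_fault() also recovers from IQE
 * and matches the timed-out descriptor before deciding whether to retry.
 */
static int qi_check_fault(struct intel_iommu *iommu, int index,
			  int wait_index, u32 *fault)
{
	u32 fsts = readl(iommu->reg + DMAR_FSTS_REG);

	/* Report the raw invalidation-related fault bits to the caller. */
	if (fault)
		*fault = fsts & (DMA_FSTS_IQE | DMA_FSTS_ICE | DMA_FSTS_ITE);

	if (fsts & (DMA_FSTS_ICE | DMA_FSTS_ITE)) {
		/* Clear the sticky error bits that were seen. */
		writel(fsts & (DMA_FSTS_ICE | DMA_FSTS_ITE),
		       iommu->reg + DMAR_FSTS_REG);

		/*
		 * The caller is able to handle the fault by itself; return
		 * -ETIMEDOUT so qi_submit_sync() breaks out of its restart
		 * loop instead of resubmitting the batch.
		 */
		if (fault)
			return -ETIMEDOUT;
	}

	return 0;
}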
On 2023/12/28 14:38, Tian, Kevin wrote:
>> From: Tian, Kevin
>> Sent: Thursday, December 28, 2023 2:38 PM
>>
>>> From: Liu, Yi L <yi.l.liu@intel.com>
>>> Sent: Thursday, December 28, 2023 12:14 AM
>>> +/**
>>> + * struct iommu_hwpt_vtd_s1_invalidate - Intel VT-d cache invalidation
>>> + *                                       (IOMMU_HWPT_INVALIDATE_DATA_VTD_S1)
>>> + * @addr: The start address of the range to be invalidated. It needs to
>>> + *        be 4KB aligned.
>>> + * @npages: Number of contiguous 4K pages to be invalidated.
>>> + * @flags: Combination of enum iommu_hwpt_vtd_s1_invalidate_flags
>>> + * @hw_error: One of enum iommu_hwpt_vtd_s1_invalidate_error
>>> + *
>>> + * The Intel VT-d specific invalidation data for user-managed stage-1 cache
>>> + * invalidation in nested translation. Userspace uses this structure to
>>> + * tell the impacted cache scope after modifying the stage-1 page table.
>>> + *
>>> + * Invalidating all the caches related to the page table by setting @addr
>>> + * to be 0 and @npages to be U64_MAX.
>>> + *
>>> + * The device TLB will be invalidated automatically if ATS is enabled.
>>> + *
>>> + * The @hw_error is meaningful when the entry is handled by the kernel.
>>> + * Check the entry_num output of IOMMU_HWPT_INVALIDATE ioctl to know the
>>> + * handled entries. @hw_error only covers the errors detected by hardware.
>>> + * The software detected errors would go through the normal ioctl errno.
>>> + */
>>
>> * An entry is considered 'handled' after it passes the audit and submitted
>> * to the IOMMU by the underlying driver. Check the @entry_num output of
>> * struct iommu_hwpt_invalidate for the number of handled entries. A 'handled'
>> * request may still fail in hardware for various reasons, e.g. due to timeout
>> * on waiting for device response upon a device TLB invalidation request. In
>> * such case the hardware error info is reported in the @hw_error field of the
>> * handled entry.
>
> with that:
>
> Reviewed-by: Kevin Tian <kevin.tian@intel.com>

yep.
On 1/1/2024 11:34 AM, Baolu Lu wrote:
> On 12/28/23 2:17 PM, Tian, Kevin wrote:
>>>  	raw_spin_lock_irqsave(&qi->q_lock, flags);
>>>  	/*
>>> @@ -1430,7 +1439,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc,
>>>  	 * a deadlock where the interrupt context can wait indefinitely
>>>  	 * for free slots in the queue.
>>>  	 */
>>> -	rc = qi_check_fault(iommu, index, wait_index);
>>> +	rc = qi_check_fault(iommu, index, wait_index, fault);
>>>  	if (rc)
>>>  		break;
>> and as replied in another thread let's change qi_check_fault to return
>> -ETIMEDOUT to break the restart loop when fault pointer is valid.
>
> It's fine to break the retry loop when fault happens and the fault
> pointer is valid. Please don't forget to add an explanation comment
> around the code. Something like:
>
> /*
>  * The caller is able to handle the fault by itself. The IOMMU driver
>  * should not attempt to retry this request.
>  */

If the caller can pass a desc batch that mixes IOTLB and device-TLB
invalidation requests, error handling becomes problematic for both the
caller and qi_submit_sync(). Imagine a case like:

1. Call qi_submit_sync() with both IOTLB and device-TLB invalidations.
2. qi_submit_sync() detects that the target device is dead.
3. It breaks out of the loop, otherwise it would block other invalidation
   submitters or hang.
4. It is hard for qi_submit_sync() to extract only the IOTLB invalidations
   and retry them.
5. It is also difficult for the caller to retry just the IOTLB
   invalidations; otherwise the IOTLB is left out of sync -- there is no
   sync at all once the device is gone.

And if only an ITE fault is hit while the target device is still present
and its configuration space reads back fine, the ITE was probably left
over from a previous request for another device rather than triggered by
this batch. The question is whether we can even tell that the device
behind the ITE is the same as the current target. If it is the same,
breaking out is reasonable; otherwise we just leave the problem to the
caller -- something in the request batch is bad, or some request
submitted earlier is bad, but that request is not from the same caller.

Thanks,
Ethan

>
> Best regards,
> baolu
>
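To make the mixed-batch scenario concrete, here is a purely hypothetical caller (flush_iotlb_and_devtlb() is not code from this series); it assumes the v8 qi_submit_sync() signature with the trailing fault argument and fills in only enough of each descriptor to show the ambiguity Ethan describes.

/*
 * Hypothetical caller (not from this series) illustrating the concern
 * above: one qi_submit_sync() batch mixes an IOTLB and a device-TLB
 * invalidation.  Address/size fields of the descriptors are elided.
 */
static int flush_iotlb_and_devtlb(struct intel_iommu *iommu, u16 did, u16 sid)
{
	struct qi_desc desc[2] = {};
	u32 fault = 0;
	int ret;

	/* desc[0]: IOTLB invalidation for the domain. */
	desc[0].qw0 = QI_IOTLB_TYPE | QI_IOTLB_DID(did);
	/* desc[1]: device-TLB invalidation for one endpoint behind it. */
	desc[1].qw0 = QI_DEV_IOTLB_TYPE | QI_DEV_IOTLB_SID(sid);

	ret = qi_submit_sync(iommu, desc, 2, 0, &fault);

	/*
	 * On an ITE the whole batch is abandoned: retrying only the IOTLB
	 * half means picking it out of the batch, while not retrying may
	 * leave the IOTLB stale.  The fault status also cannot say whether
	 * the timeout came from this sid or was left over from an earlier
	 * request by a different caller.
	 */
	if (fault & DMA_FSTS_ITE)
		pr_warn("QI timeout: IOTLB sync state for did %u unknown\n",
			did);

	return ret;
}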