Message ID | 20200414131542.25608-1-joro@8bytes.org |
---|---|
Headers | show |
Series | iommu: Move iommu_group setup to IOMMU core code | expand |
On 2020/4/14 21:15, Joerg Roedel wrote: > From: Joerg Roedel<jroedel@suse.de> > > Add call-backs to 'struct iommu_ops' as an alternative to the > add_device() and remove_device() call-backs, which will be removed when > all drivers are converted. > > The new call-backs will not setupt IOMMU groups and domains anymore, > so also add a probe_finalize() call-back where the IOMMU driver can do > per-device setup work which require the device to be set up with a > group and a domain. The subject is inaccurate. probe_device() and release_device() are added to replace the add and remove pair. Best regards, baolu
On Wed, Apr 15, 2020 at 02:36:20PM +0800, Lu Baolu wrote: > On 2020/4/14 21:15, Joerg Roedel wrote: > > From: Joerg Roedel<jroedel@suse.de> > > > > Add call-backs to 'struct iommu_ops' as an alternative to the > > add_device() and remove_device() call-backs, which will be removed when > > all drivers are converted. > > > > The new call-backs will not setupt IOMMU groups and domains anymore, > > so also add a probe_finalize() call-back where the IOMMU driver can do > > per-device setup work which require the device to be set up with a > > group and a domain. > > The subject is inaccurate. probe_device() and release_device() are > added to replace the add and remove pair. This patch does not replace them yet, it just adds the new call-backs. The removal of add_device()/remove_device() happens later in the patch-set when all drivers are converted. Regards, Joerg
Hi Joerg, > Hi, > > here is the second version of this patch-set. The first version with > some more introductory text can be found here: > > https://lore.kernel.org/lkml/20200407183742.4344-1-joro@8bytes.org/ Thanks for the continued improvements in this area! I may have spotted a problem with setups like VMD. The core PCI bus is set up during early boot. Then, for the PCI bus, we reach iommu_bus_init() -> bus_iommu_probe(). In there, we call probe_iommu_group() -> dev_iommu_get() for each PCI device, which allocates dev->iommu in each case. So far so good. The problem is that this is the last time that we'll call dev_iommu_get(). If any PCI bus devices get added after this point, they do not get passed to dev_iommu_get(). So when the vmd module gets loaded later, and creates more PCI devices, we end up in iommu_bus_notifier() -> iommu_probe_device() -> __iommu_probe_device() which does: dev->iommu->iommu_dev = iommu_dev; dev->iommu-> is a NULL dereference because dev_iommu_get() was never called for this new device. Daniel
On Tue Apr 14 20, Joerg Roedel wrote: >Hi, > >here is the second version of this patch-set. The first version with >some more introductory text can be found here: > > https://lore.kernel.org/lkml/20200407183742.4344-1-joro@8bytes.org/ > >Changes v1->v2: > > * Rebased to v5.7-rc1 > > * Re-wrote the arm-smmu changes as suggested by Robin Murphy > > * Re-worked the Exynos patches to hopefully not break the > driver anymore > > * Fixed a missing mutex_unlock() reported by Marek Szyprowski, > thanks for that. > >There is also a git-branch available with these patches applied: > > https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2 > >Please review. > >Thanks, > > Joerg > >Joerg Roedel (32): > iommu: Move default domain allocation to separate function > iommu/amd: Implement iommu_ops->def_domain_type call-back > iommu/vt-d: Wire up iommu_ops->def_domain_type > iommu/amd: Remove dma_mask check from check_device() > iommu/amd: Return -ENODEV in add_device when device is not handled by > IOMMU > iommu: Add probe_device() and remove_device() call-backs > iommu: Move default domain allocation to iommu_probe_device() > iommu: Keep a list of allocated groups in __iommu_probe_device() > iommu: Move new probe_device path to separate function > iommu: Split off default domain allocation from group assignment > iommu: Move iommu_group_create_direct_mappings() out of > iommu_group_add_device() > iommu: Export bus_iommu_probe() and make is safe for re-probing > iommu/amd: Remove dev_data->passthrough > iommu/amd: Convert to probe/release_device() call-backs > iommu/vt-d: Convert to probe/release_device() call-backs > iommu/arm-smmu: Convert to probe/release_device() call-backs > iommu/pamu: Convert to probe/release_device() call-backs > iommu/s390: Convert to probe/release_device() call-backs > iommu/virtio: Convert to probe/release_device() call-backs > iommu/msm: Convert to probe/release_device() call-backs > iommu/mediatek: Convert to probe/release_device() call-backs > iommu/mediatek-v1 Convert to probe/release_device() call-backs > iommu/qcom: Convert to probe/release_device() call-backs > iommu/rockchip: Convert to probe/release_device() call-backs > iommu/tegra: Convert to probe/release_device() call-backs > iommu/renesas: Convert to probe/release_device() call-backs > iommu/omap: Remove orphan_dev tracking > iommu/omap: Convert to probe/release_device() call-backs > iommu/exynos: Use first SYSMMU in controllers list for IOMMU core > iommu/exynos: Convert to probe/release_device() call-backs > iommu: Remove add_device()/remove_device() code-paths > iommu: Unexport iommu_group_get_for_dev() > >Sai Praneeth Prakhya (1): > iommu: Add def_domain_type() callback in iommu_ops > > drivers/iommu/amd_iommu.c | 97 ++++---- > drivers/iommu/amd_iommu_types.h | 1 - > drivers/iommu/arm-smmu-v3.c | 38 +-- > drivers/iommu/arm-smmu.c | 39 ++-- > drivers/iommu/exynos-iommu.c | 24 +- > drivers/iommu/fsl_pamu_domain.c | 22 +- > drivers/iommu/intel-iommu.c | 68 +----- > drivers/iommu/iommu.c | 393 +++++++++++++++++++++++++------- > drivers/iommu/ipmmu-vmsa.c | 60 ++--- > drivers/iommu/msm_iommu.c | 34 +-- > drivers/iommu/mtk_iommu.c | 24 +- > drivers/iommu/mtk_iommu_v1.c | 50 ++-- > drivers/iommu/omap-iommu.c | 99 ++------ > drivers/iommu/qcom_iommu.c | 24 +- > drivers/iommu/rockchip-iommu.c | 26 +-- > drivers/iommu/s390-iommu.c | 22 +- > drivers/iommu/tegra-gart.c | 24 +- > drivers/iommu/tegra-smmu.c | 31 +-- > drivers/iommu/virtio-iommu.c | 41 +--- > include/linux/iommu.h | 21 +- > 20 files changed, 533 insertions(+), 605 deletions(-) > >-- >2.17.1 > >_______________________________________________ >iommu mailing list >iommu@lists.linux-foundation.org >https://lists.linuxfoundation.org/mailman/listinfo/iommu > Hi Joerg, With this patchset, I have an epyc system where if I boot with iommu=nopt and force a dump I will see some io page faults for a nic on the system. The vmcore is harvested and the system reboots. I haven't reproduced it on other systems yet, but without the patchset I don't see the io page faults during the kdump. Regards, Jerry
On Fri May 29 20, Jerry Snitselaar wrote: >On Tue Apr 14 20, Joerg Roedel wrote: >>Hi, >> >>here is the second version of this patch-set. The first version with >>some more introductory text can be found here: >> >> https://lore.kernel.org/lkml/20200407183742.4344-1-joro@8bytes.org/ >> >>Changes v1->v2: >> >> * Rebased to v5.7-rc1 >> >> * Re-wrote the arm-smmu changes as suggested by Robin Murphy >> >> * Re-worked the Exynos patches to hopefully not break the >> driver anymore >> >> * Fixed a missing mutex_unlock() reported by Marek Szyprowski, >> thanks for that. >> >>There is also a git-branch available with these patches applied: >> >> https://git.kernel.org/pub/scm/linux/kernel/git/joro/linux.git/log/?h=iommu-probe-device-v2 >> >>Please review. >> >>Thanks, >> >> Joerg >> >>Joerg Roedel (32): >> iommu: Move default domain allocation to separate function >> iommu/amd: Implement iommu_ops->def_domain_type call-back >> iommu/vt-d: Wire up iommu_ops->def_domain_type >> iommu/amd: Remove dma_mask check from check_device() >> iommu/amd: Return -ENODEV in add_device when device is not handled by >> IOMMU >> iommu: Add probe_device() and remove_device() call-backs >> iommu: Move default domain allocation to iommu_probe_device() >> iommu: Keep a list of allocated groups in __iommu_probe_device() >> iommu: Move new probe_device path to separate function >> iommu: Split off default domain allocation from group assignment >> iommu: Move iommu_group_create_direct_mappings() out of >> iommu_group_add_device() >> iommu: Export bus_iommu_probe() and make is safe for re-probing >> iommu/amd: Remove dev_data->passthrough >> iommu/amd: Convert to probe/release_device() call-backs >> iommu/vt-d: Convert to probe/release_device() call-backs >> iommu/arm-smmu: Convert to probe/release_device() call-backs >> iommu/pamu: Convert to probe/release_device() call-backs >> iommu/s390: Convert to probe/release_device() call-backs >> iommu/virtio: Convert to probe/release_device() call-backs >> iommu/msm: Convert to probe/release_device() call-backs >> iommu/mediatek: Convert to probe/release_device() call-backs >> iommu/mediatek-v1 Convert to probe/release_device() call-backs >> iommu/qcom: Convert to probe/release_device() call-backs >> iommu/rockchip: Convert to probe/release_device() call-backs >> iommu/tegra: Convert to probe/release_device() call-backs >> iommu/renesas: Convert to probe/release_device() call-backs >> iommu/omap: Remove orphan_dev tracking >> iommu/omap: Convert to probe/release_device() call-backs >> iommu/exynos: Use first SYSMMU in controllers list for IOMMU core >> iommu/exynos: Convert to probe/release_device() call-backs >> iommu: Remove add_device()/remove_device() code-paths >> iommu: Unexport iommu_group_get_for_dev() >> >>Sai Praneeth Prakhya (1): >> iommu: Add def_domain_type() callback in iommu_ops >> >>drivers/iommu/amd_iommu.c | 97 ++++---- >>drivers/iommu/amd_iommu_types.h | 1 - >>drivers/iommu/arm-smmu-v3.c | 38 +-- >>drivers/iommu/arm-smmu.c | 39 ++-- >>drivers/iommu/exynos-iommu.c | 24 +- >>drivers/iommu/fsl_pamu_domain.c | 22 +- >>drivers/iommu/intel-iommu.c | 68 +----- >>drivers/iommu/iommu.c | 393 +++++++++++++++++++++++++------- >>drivers/iommu/ipmmu-vmsa.c | 60 ++--- >>drivers/iommu/msm_iommu.c | 34 +-- >>drivers/iommu/mtk_iommu.c | 24 +- >>drivers/iommu/mtk_iommu_v1.c | 50 ++-- >>drivers/iommu/omap-iommu.c | 99 ++------ >>drivers/iommu/qcom_iommu.c | 24 +- >>drivers/iommu/rockchip-iommu.c | 26 +-- >>drivers/iommu/s390-iommu.c | 22 +- >>drivers/iommu/tegra-gart.c | 24 +- >>drivers/iommu/tegra-smmu.c | 31 +-- >>drivers/iommu/virtio-iommu.c | 41 +--- >>include/linux/iommu.h | 21 +- >>20 files changed, 533 insertions(+), 605 deletions(-) >> >>-- >>2.17.1 >> >>_______________________________________________ >>iommu mailing list >>iommu@lists.linux-foundation.org >>https://lists.linuxfoundation.org/mailman/listinfo/iommu >> > >Hi Joerg, > >With this patchset, I have an epyc system where if I boot with >iommu=nopt and force a dump I will see some io page faults for a nic >on the system. The vmcore is harvested and the system reboots. I >haven't reproduced it on other systems yet, but without the patchset I >don't see the io page faults during the kdump. > >Regards, >Jerry I just hit an issue on a separate intel based system (kdump iommu=nopt), where it panics in during intel_iommu_attach_device, in is_aux_domain, due to device_domain_info being DEFER_DEVICE_DOMAIN_INFO. That doesn't get set to a valid address until the domain_add_dev_info call. Is it as simple as the following? diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 29d3940847d3..f1bbeed46a4c 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5053,8 +5053,8 @@ is_aux_domain(struct device *dev, struct iommu_domain *domain) { struct device_domain_info *info = dev->archdata.iommu; - return info && info->auxd_enabled && - domain->type == IOMMU_DOMAIN_UNMANAGED; + return info && info != DEFER_DEVICE_DOMAIN_INFO && + info->auxd_enabled && domain->type == IOMMU_DOMAIN_UNMANAGED; } static void auxiliary_link_device(struct dmar_domain *domain, Regards, Jerry
Hi Jerry, On 6/1/20 6:42 PM, Jerry Snitselaar wrote: >> >> Hi Joerg, >> >> With this patchset, I have an epyc system where if I boot with >> iommu=nopt and force a dump I will see some io page faults for a nic >> on the system. The vmcore is harvested and the system reboots. I >> haven't reproduced it on other systems yet, but without the patchset I >> don't see the io page faults during the kdump. >> >> Regards, >> Jerry > > I just hit an issue on a separate intel based system (kdump iommu=nopt), > where it panics in during intel_iommu_attach_device, in is_aux_domain, > due to device_domain_info being DEFER_DEVICE_DOMAIN_INFO. That doesn't > get set to a valid address until the domain_add_dev_info call. > > Is it as simple as the following? I guess you won't hit this issue if you use iommu/next branch of Joerg's tree. We've changed to use a generic helper to retrieve the valid per device iommu data or NULL (if there's no). Best regards, baolu > > diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c > index 29d3940847d3..f1bbeed46a4c 100644 > --- a/drivers/iommu/intel-iommu.c > +++ b/drivers/iommu/intel-iommu.c > @@ -5053,8 +5053,8 @@ is_aux_domain(struct device *dev, struct > iommu_domain *domain) > { > struct device_domain_info *info = dev->archdata.iommu; > > - return info && info->auxd_enabled && > - domain->type == IOMMU_DOMAIN_UNMANAGED; > + return info && info != DEFER_DEVICE_DOMAIN_INFO && > + info->auxd_enabled && domain->type == > IOMMU_DOMAIN_UNMANAGED; > } > > static void auxiliary_link_device(struct dmar_domain *domain, > > > Regards, > Jerry
On Tue Jun 02 20, Lu Baolu wrote: >Hi Jerry, > >On 6/1/20 6:42 PM, Jerry Snitselaar wrote: >>> >>>Hi Joerg, >>> >>>With this patchset, I have an epyc system where if I boot with >>>iommu=nopt and force a dump I will see some io page faults for a nic >>>on the system. The vmcore is harvested and the system reboots. I >>>haven't reproduced it on other systems yet, but without the patchset I >>>don't see the io page faults during the kdump. >>> >>>Regards, >>>Jerry >> >>I just hit an issue on a separate intel based system (kdump iommu=nopt), >>where it panics in during intel_iommu_attach_device, in is_aux_domain, >>due to device_domain_info being DEFER_DEVICE_DOMAIN_INFO. That doesn't >>get set to a valid address until the domain_add_dev_info call. >> >>Is it as simple as the following? > >I guess you won't hit this issue if you use iommu/next branch of Joerg's >tree. We've changed to use a generic helper to retrieve the valid per >device iommu data or NULL (if there's no). > >Best regards, >baolu > Yeah, that will solve the panic. >> >>diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c >>index 29d3940847d3..f1bbeed46a4c 100644 >>--- a/drivers/iommu/intel-iommu.c >>+++ b/drivers/iommu/intel-iommu.c >>@@ -5053,8 +5053,8 @@ is_aux_domain(struct device *dev, struct >>iommu_domain *domain) >> { >> struct device_domain_info *info = dev->archdata.iommu; >> >>- return info && info->auxd_enabled && >>- domain->type == IOMMU_DOMAIN_UNMANAGED; >>+ return info && info != DEFER_DEVICE_DOMAIN_INFO && >>+ info->auxd_enabled && domain->type == >>IOMMU_DOMAIN_UNMANAGED; >> } >> >> static void auxiliary_link_device(struct dmar_domain *domain, >> >> >>Regards, >>Jerry >_______________________________________________ >iommu mailing list >iommu@lists.linux-foundation.org >https://lists.linuxfoundation.org/mailman/listinfo/iommu
Hi Jerry, On Mon, Jun 01, 2020 at 05:02:36PM -0700, Jerry Snitselaar wrote: > > Yeah, that will solve the panic. > If you still see the kdump faults, can you please try with the attached diff? I was not able to reproduce them in my setup. Regards, Joerg diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index b5ea203f6c68..5a6d509f72b6 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -1680,8 +1680,12 @@ static void probe_alloc_default_domain(struct bus_type *bus, static int iommu_group_do_dma_attach(struct device *dev, void *data) { struct iommu_domain *domain = data; + int ret = 0; - return __iommu_attach_device(domain, dev); + if (!iommu_is_attach_deferred(group->domain, dev)) + ret = __iommu_attach_device(group->domain, dev); + + return ret; } static int __iommu_group_dma_attach(struct iommu_group *group)
On Tue Jun 02 20, Jerry Snitselaar wrote: >On Tue Jun 02 20, Joerg Roedel wrote: >>Hi Jerry, >> >>On Mon, Jun 01, 2020 at 05:02:36PM -0700, Jerry Snitselaar wrote: >>> >>>Yeah, that will solve the panic. >>> >> >>If you still see the kdump faults, can you please try with the attached >>diff? I was not able to reproduce them in my setup. >> >>Regards, >> >> Joerg >> > >I have another hp proliant server now, and reproduced. I will have the >patch below tested shortly. Minor change, I switched group->domain to >domain since group isn't an argument, and *data being passed in comes >from group->domain anyways. > Looks like it solves problem for both the epyc system, and the hp proliant server, >>diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c >>index b5ea203f6c68..5a6d509f72b6 100644 >>--- a/drivers/iommu/iommu.c >>+++ b/drivers/iommu/iommu.c >>@@ -1680,8 +1680,12 @@ static void probe_alloc_default_domain(struct bus_type *bus, >>static int iommu_group_do_dma_attach(struct device *dev, void *data) >>{ >> struct iommu_domain *domain = data; >>+ int ret = 0; >> >>- return __iommu_attach_device(domain, dev); >>+ if (!iommu_is_attach_deferred(group->domain, dev)) >>+ ret = __iommu_attach_device(group->domain, dev); >>+ >>+ return ret; >>} >> >>static int __iommu_group_dma_attach(struct iommu_group *group) >>_______________________________________________ >>iommu mailing list >>iommu@lists.linux-foundation.org >>https://lists.linuxfoundation.org/mailman/listinfo/iommu >>