diff mbox series

[v6,13/14] iommu/arm-smmu-v3: Report events that belong to devices attached to vIOMMU

Message ID b71a5b132e8ba771998c5b810675f10b100d4ff3.1737754129.git.nicolinc@nvidia.com
State New
Headers show
Series iommufd: Add vIOMMU infrastructure (Part-3: vEVENTQ) | expand

Commit Message

Nicolin Chen Jan. 25, 2025, 12:30 a.m. UTC
Aside from the IOPF framework, iommufd provides an additional pathway to
report hardware events, via the vEVENTQ of vIOMMU infrastructure.

Define an iommu_vevent_arm_smmuv3 uAPI structure, and report stage-1 events
in the threaded IRQ handler.

Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
---
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h   |  7 +++
 include/uapi/linux/iommufd.h                  | 15 +++++
 .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c     | 15 +++++
 drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c   | 58 +++++++++++--------
 4 files changed, 70 insertions(+), 25 deletions(-)

Comments

Tian, Kevin Feb. 18, 2025, 5:21 a.m. UTC | #1
> From: Nicolin Chen <nicolinc@nvidia.com>
> Sent: Saturday, January 25, 2025 8:31 AM
> 
> Aside from the IOPF framework, iommufd provides an additional pathway to
> report hardware events, via the vEVENTQ of vIOMMU infrastructure.
> 
> Define an iommu_vevent_arm_smmuv3 uAPI structure, and report stage-1
> events
> in the threaded IRQ handler.
> 
> Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>

Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Jason Gunthorpe Feb. 18, 2025, 5:18 p.m. UTC | #2
On Fri, Jan 24, 2025 at 04:30:42PM -0800, Nicolin Chen wrote:

> @@ -1831,31 +1831,30 @@ static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
>  		return -EOPNOTSUPP;
>  	}

There is still the filter at the top:

	switch (event->id) {
	case EVT_ID_TRANSLATION_FAULT:
	case EVT_ID_ADDR_SIZE_FAULT:
	case EVT_ID_ACCESS_FAULT:
	case EVT_ID_PERMISSION_FAULT:
		break;
	default:
		return -EOPNOTSUPP;
	}

Is that right here or should more event types be forwarded to the
guest?
  
>  	mutex_lock(&smmu->streams_mutex);
[..]
  
> -	ret = iommu_report_device_fault(master->dev, &fault_evt);
> +	if (event->stall) {
> +		ret = iommu_report_device_fault(master->dev, &fault_evt);
> +	} else {
> +		down_read(&master->vmaster_rwsem);

This already holds the streams_mutex across all of this, do you think
we should get rid of the vmaster_rwsem and hold the streams_mutex on
write instead?

Jason
Nicolin Chen Feb. 18, 2025, 6:28 p.m. UTC | #3
On Tue, Feb 18, 2025 at 01:18:21PM -0400, Jason Gunthorpe wrote:
> On Fri, Jan 24, 2025 at 04:30:42PM -0800, Nicolin Chen wrote:
> 
> > @@ -1831,31 +1831,30 @@ static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
> >  		return -EOPNOTSUPP;
> >  	}
> 
> There is still the filter at the top:
> 
> 	switch (event->id) {
> 	case EVT_ID_TRANSLATION_FAULT:
> 	case EVT_ID_ADDR_SIZE_FAULT:
> 	case EVT_ID_ACCESS_FAULT:
> 	case EVT_ID_PERMISSION_FAULT:
> 		break;
> 	default:
> 		return -EOPNOTSUPP;
> 	}
> 
> Is that right here or should more event types be forwarded to the
> guest?

That doesn't seem to be right. Something like EVT_ID_BAD_CD_CONFIG
should be forwarded too. I will go through the list.

> >  	mutex_lock(&smmu->streams_mutex);
> [..]
>   
> > -	ret = iommu_report_device_fault(master->dev, &fault_evt);
> > +	if (event->stall) {
> > +		ret = iommu_report_device_fault(master->dev, &fault_evt);
> > +	} else {
> > +		down_read(&master->vmaster_rwsem);
> 
> This already holds the streams_mutex across all of this, do you think
> we should get rid of the vmaster_rwsem and hold the streams_mutex on
> write instead?

They are per master v.s. per smmu. The latter one would make master
commits/attaches exclusive, which feels unnecessary to me, although
it would make the code here slightly cleaner..

Thanks
Nicolin
Jason Gunthorpe Feb. 18, 2025, 6:50 p.m. UTC | #4
On Tue, Feb 18, 2025 at 10:28:04AM -0800, Nicolin Chen wrote:
> On Tue, Feb 18, 2025 at 01:18:21PM -0400, Jason Gunthorpe wrote:
> > On Fri, Jan 24, 2025 at 04:30:42PM -0800, Nicolin Chen wrote:
> > 
> > > @@ -1831,31 +1831,30 @@ static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
> > >  		return -EOPNOTSUPP;
> > >  	}
> > 
> > There is still the filter at the top:
> > 
> > 	switch (event->id) {
> > 	case EVT_ID_TRANSLATION_FAULT:
> > 	case EVT_ID_ADDR_SIZE_FAULT:
> > 	case EVT_ID_ACCESS_FAULT:
> > 	case EVT_ID_PERMISSION_FAULT:
> > 		break;
> > 	default:
> > 		return -EOPNOTSUPP;
> > 	}
> > 
> > Is that right here or should more event types be forwarded to the
> > guest?
> 
> That doesn't seem to be right. Something like EVT_ID_BAD_CD_CONFIG
> should be forwarded too. I will go through the list.

I think the above should decode into a 'faultable' path because they
all decode to something with an IOVA

The rest should decode to things that include a SID and the SID decode
should always be forwarded to the VM. Maybe there are small
exclusions, but generally that is how I would see it..

> > This already holds the streams_mutex across all of this, do you think
> > we should get rid of the vmaster_rwsem and hold the streams_mutex on
> > write instead?
> 
> They are per master v.s. per smmu. The latter one would make master
> commits/attaches exclusive, which feels unnecessary to me, although
> it would make the code here slightly cleaner..

I'd pay the cost on the attach side to have a single lock on the fault
side..

Jason
Nicolin Chen Feb. 18, 2025, 7:02 p.m. UTC | #5
On Tue, Feb 18, 2025 at 02:50:46PM -0400, Jason Gunthorpe wrote:
> On Tue, Feb 18, 2025 at 10:28:04AM -0800, Nicolin Chen wrote:
> > On Tue, Feb 18, 2025 at 01:18:21PM -0400, Jason Gunthorpe wrote:
> > > On Fri, Jan 24, 2025 at 04:30:42PM -0800, Nicolin Chen wrote:
> > > 
> > > > @@ -1831,31 +1831,30 @@ static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
> > > >  		return -EOPNOTSUPP;
> > > >  	}
> > > 
> > > There is still the filter at the top:
> > > 
> > > 	switch (event->id) {
> > > 	case EVT_ID_TRANSLATION_FAULT:
> > > 	case EVT_ID_ADDR_SIZE_FAULT:
> > > 	case EVT_ID_ACCESS_FAULT:
> > > 	case EVT_ID_PERMISSION_FAULT:
> > > 		break;
> > > 	default:
> > > 		return -EOPNOTSUPP;
> > > 	}
> > > 
> > > Is that right here or should more event types be forwarded to the
> > > guest?
> > 
> > That doesn't seem to be right. Something like EVT_ID_BAD_CD_CONFIG
> > should be forwarded too. I will go through the list.
> 
> I think the above should decode into a 'faultable' path because they
> all decode to something with an IOVA
> 
> The rest should decode to things that include a SID and the SID decode
> should always be forwarded to the VM. Maybe there are small
> exclusions, but generally that is how I would see it..

Ack. SMMU spec defines three type:
"Three categories of events might be recorded into the Event queue:
 • Configuration errors.
 • Faults from the translation process.
 • Miscellaneous."

The driver cares the first two only, as you remarked here.

> > > This already holds the streams_mutex across all of this, do you think
> > > we should get rid of the vmaster_rwsem and hold the streams_mutex on
> > > write instead?
> > 
> > They are per master v.s. per smmu. The latter one would make master
> > commits/attaches exclusive, which feels unnecessary to me, although
> > it would make the code here slightly cleaner..
> 
> I'd pay the cost on the attach side to have a single lock on the fault
> side..

OK. Maybe a small patch to turn the streams_mutex to streams_rwsem?

Thanks
Nicolin
Jason Gunthorpe Feb. 18, 2025, 7:08 p.m. UTC | #6
On Tue, Feb 18, 2025 at 11:02:23AM -0800, Nicolin Chen wrote:
> > > > This already holds the streams_mutex across all of this, do you think
> > > > we should get rid of the vmaster_rwsem and hold the streams_mutex on
> > > > write instead?
> > > 
> > > They are per master v.s. per smmu. The latter one would make master
> > > commits/attaches exclusive, which feels unnecessary to me, although
> > > it would make the code here slightly cleaner..
> > 
> > I'd pay the cost on the attach side to have a single lock on the fault
> > side..
> 
> OK. Maybe a small patch to turn the streams_mutex to streams_rwsem?

I don't think the interrupt path is multithreaded, is it? So only 1
reader anyhow?

Jason
Nicolin Chen Feb. 18, 2025, 7:27 p.m. UTC | #7
On Tue, Feb 18, 2025 at 03:08:46PM -0400, Jason Gunthorpe wrote:
> On Tue, Feb 18, 2025 at 11:02:23AM -0800, Nicolin Chen wrote:
> > > > > This already holds the streams_mutex across all of this, do you think
> > > > > we should get rid of the vmaster_rwsem and hold the streams_mutex on
> > > > > write instead?
> > > > 
> > > > They are per master v.s. per smmu. The latter one would make master
> > > > commits/attaches exclusive, which feels unnecessary to me, although
> > > > it would make the code here slightly cleaner..
> > > 
> > > I'd pay the cost on the attach side to have a single lock on the fault
> > > side..
> > 
> > OK. Maybe a small patch to turn the streams_mutex to streams_rwsem?
> 
> I don't think the interrupt path is multithreaded, is it? So only 1
> reader anyhow?

Right, it's IRQF_ONESHOT. I will keep that unchanged.

Thanks
Nicolin
Nicolin Chen Feb. 20, 2025, 8:45 p.m. UTC | #8
On Tue, Feb 18, 2025 at 02:50:46PM -0400, Jason Gunthorpe wrote:
> On Tue, Feb 18, 2025 at 10:28:04AM -0800, Nicolin Chen wrote:
> > On Tue, Feb 18, 2025 at 01:18:21PM -0400, Jason Gunthorpe wrote:
> > > On Fri, Jan 24, 2025 at 04:30:42PM -0800, Nicolin Chen wrote:
> > > 
> > > > @@ -1831,31 +1831,30 @@ static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
> > > >  		return -EOPNOTSUPP;
> > > >  	}
> > > 
> > > There is still the filter at the top:
> > > 
> > > 	switch (event->id) {
> > > 	case EVT_ID_TRANSLATION_FAULT:
> > > 	case EVT_ID_ADDR_SIZE_FAULT:
> > > 	case EVT_ID_ACCESS_FAULT:
> > > 	case EVT_ID_PERMISSION_FAULT:
> > > 		break;
> > > 	default:
> > > 		return -EOPNOTSUPP;
> > > 	}
> > > 
> > > Is that right here or should more event types be forwarded to the
> > > guest?
> > 
> > That doesn't seem to be right. Something like EVT_ID_BAD_CD_CONFIG
> > should be forwarded too. I will go through the list.
> 
> I think the above should decode into a 'faultable' path because they
> all decode to something with an IOVA
> 
> The rest should decode to things that include a SID and the SID decode
> should always be forwarded to the VM. Maybe there are small
> exclusions, but generally that is how I would see it..

I think we are safe to add these:

------------------------------------------------------------
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index fd2f13a63f27..be9746ecdc65 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -1067,7 +1067,16 @@ enum iommu_veventq_type {
  * struct iommu_vevent_arm_smmuv3 - ARM SMMUv3 Virtual Event
  *                                  (IOMMU_VEVENTQ_TYPE_ARM_SMMUV3)
  * @evt: 256-bit ARM SMMUv3 Event record, little-endian.
- *       (Refer to "7.3 Event records" in SMMUv3 HW Spec)
+ *       Reported event records: (Refer to "7.3 Event records" in SMMUv3 HW Spec)
+ *       - 0x02 C_BAD_STREAMID
+ *       - 0x04 C_BAD_STE
+ *       - 0x06 F_STREAM_DISABLED
+ *       - 0x08 C_BAD_SUBSTREAMID
+ *       - 0x0a C_BAD_STE
+ *       - 0x10 F_TRANSLATION
+ *       - 0x11 F_ADDR_SIZE
+ *       - 0x12 F_ACCESS
+ *       - 0x13 F_PERMISSION
  *
  * StreamID field reports a virtual device ID. To receive a virtual event for a
  * device, a vDEVICE must be allocated via IOMMU_VDEVICE_ALLOC.
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 0efda55ad6bd..f3aa9ce16058 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1827,7 +1827,15 @@ static int arm_smmu_handle_event(struct arm_smmu_device *smmu, u64 *evt,
        case EVT_ID_ADDR_SIZE_FAULT:
        case EVT_ID_ACCESS_FAULT:
        case EVT_ID_PERMISSION_FAULT:
+       case EVT_ID_BAD_CD_CONFIG:
+       case EVT_ID_BAD_STE_CONFIG:
+       case EVT_ID_BAD_STREAMID_CONFIG:
+       case EVT_ID_BAD_SUBSTREAMID_CONFIG:
+       case EVT_ID_STREAM_DISABLED_FAULT:
                break;
+       case EVT_ID_STE_FETCH_FAULT:
+       case EVT_ID_CD_FETCH_FAULT:
+               /* FIXME need to replace fetch_addr with IPA? */
        default:
                return -EOPNOTSUPP;
        }
------------------------------------------------------------

All of the supported events require vSID replacement. Those faults
with addresses are dealing with stage-1 IOVA or IPA, i.e. IOVA and
PA for a VM. So, they could be simply forwarded.

But F_CD_FETCH and F_STE_FETCH seem to be complicated here, as both
report PA in their FetchAddr fields, although the spec does mention
both might be injected to a guest VM:
 - "Note: This event might be injected into a guest VM, as though
    from a virtual SMMU, when a hypervisor receives a stage 2
    Translation-related fault indicating CD fetch as a cause (with
    CLASS == CD)."
 - "Note: This event might be injected into a guest VM, as though
    from a virtual SMMU, when a hypervisor detects invalid guest
    configuration that would cause a guest STE fetch from an illegal
    IPA."

For F_CD_FETCH, at least the CD table pointer in the nested STE is
an IPA, and all the entries in the CD table that can be 2-level are
IPAs as well. So, we need some kinda reverse translation from a PA
to IPA using its stage-2 mapping. I am not sure what's the best way
to do that...

For F_STE_FETCH, the host prepared the nested STE, so there is no
IPA involved. We would have to ask VMM to fill the field since an
STE IPA should be just a piece of entry given the vSID. One thing
that I am not sure is whether the FetchAddr is STE-size aligned or
not, though we can carry the offset in the FetchAddr field via the
vEVENT by masking away any upper bits...

I wonder if @Robin or @Will may also shed some light on these two
events.

Otherwise, perhaps not-supporting them in this series might be a
safer bet?

Thanks
Nicolin
Jason Gunthorpe Feb. 20, 2025, 11:24 p.m. UTC | #9
On Thu, Feb 20, 2025 at 12:45:46PM -0800, Nicolin Chen wrote:
> ------------------------------------------------------------
> diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
> index fd2f13a63f27..be9746ecdc65 100644
> --- a/include/uapi/linux/iommufd.h
> +++ b/include/uapi/linux/iommufd.h
> @@ -1067,7 +1067,16 @@ enum iommu_veventq_type {
>   * struct iommu_vevent_arm_smmuv3 - ARM SMMUv3 Virtual Event
>   *                                  (IOMMU_VEVENTQ_TYPE_ARM_SMMUV3)
>   * @evt: 256-bit ARM SMMUv3 Event record, little-endian.
> - *       (Refer to "7.3 Event records" in SMMUv3 HW Spec)
> + *       Reported event records: (Refer to "7.3 Event records" in SMMUv3 HW Spec)
> + *       - 0x02 C_BAD_STREAMID

This is documented as 'Transaction StreamID out of range.' so it would
by a hypervisor kernel bug to hit it

> + *       - 0x04 C_BAD_STE

I'm not sure we do enough validation to reject all bad STE fragments
so it makes sense this could happen.

> + *       - 0x06 F_STREAM_DISABLED

This looked guest triggerable to me.. so it makes sense

> + *       - 0x08 C_BAD_SUBSTREAMID
> + *       - 0x0a C_BAD_STE

Typo, this is C_BAD_CD

> + *       - 0x10 F_TRANSLATION
> + *       - 0x11 F_ADDR_SIZE
> + *       - 0x12 F_ACCESS
> + *       - 0x13 F_PERMISSION

List makes sense to me otherwise

> But F_CD_FETCH and F_STE_FETCH seem to be complicated here, as both

F_STE_FETCH would indicate a hypervisor failure managing the stream
table so no need to forward it.

> report PA in their FetchAddr fields, although the spec does mention
> both might be injected to a guest VM:
>  - "Note: This event might be injected into a guest VM, as though
>     from a virtual SMMU, when a hypervisor receives a stage 2
>     Translation-related fault indicating CD fetch as a cause (with
>     CLASS == CD)."

That sounds like the VMM should be catching the
F_TRANSLATION and convert it for the CLASS=CD

> For F_CD_FETCH, at least the CD table pointer in the nested STE is
> an IPA, and all the entries in the CD table that can be 2-level are
> IPAs as well. So, we need some kinda reverse translation from a PA
> to IPA using its stage-2 mapping. I am not sure what's the best way
> to do that...

And if the F_TRANSLATION covers the case then maybe this just stays in
the hypervisor?

> Otherwise, perhaps not-supporting them in this series might be a
> safer bet?

Yeah, I would consider skipping F_CD_FETCH. May also just try it out
and see what events come out on a CD fetch failure..

Jason
Nicolin Chen Feb. 21, 2025, 8:10 a.m. UTC | #10
On Thu, Feb 20, 2025 at 07:24:07PM -0400, Jason Gunthorpe wrote:
> On Thu, Feb 20, 2025 at 12:45:46PM -0800, Nicolin Chen wrote:
> > ------------------------------------------------------------
> > diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
> > index fd2f13a63f27..be9746ecdc65 100644
> > --- a/include/uapi/linux/iommufd.h
> > +++ b/include/uapi/linux/iommufd.h
> > @@ -1067,7 +1067,16 @@ enum iommu_veventq_type {
> >   * struct iommu_vevent_arm_smmuv3 - ARM SMMUv3 Virtual Event
> >   *                                  (IOMMU_VEVENTQ_TYPE_ARM_SMMUV3)
> >   * @evt: 256-bit ARM SMMUv3 Event record, little-endian.
> > - *       (Refer to "7.3 Event records" in SMMUv3 HW Spec)
> > + *       Reported event records: (Refer to "7.3 Event records" in SMMUv3 HW Spec)
> > + *       - 0x02 C_BAD_STREAMID
> 
> This is documented as 'Transaction StreamID out of range.' so it would
> by a hypervisor kernel bug to hit it

I see. Dropping it.

> > + *       - 0x04 C_BAD_STE
> 
> I'm not sure we do enough validation to reject all bad STE fragments
> so it makes sense this could happen.
> 
> > + *       - 0x06 F_STREAM_DISABLED
> 
> This looked guest triggerable to me.. so it makes sense

Keeping these two.

> > + *       - 0x08 C_BAD_SUBSTREAMID
> > + *       - 0x0a C_BAD_STE
> 
> Typo, this is C_BAD_CD

Fixed.

> > But F_CD_FETCH and F_STE_FETCH seem to be complicated here, as both
> 
> F_STE_FETCH would indicate a hypervisor failure managing the stream
> table so no need to forward it.
> 
> > report PA in their FetchAddr fields, although the spec does mention
> > both might be injected to a guest VM:
> >  - "Note: This event might be injected into a guest VM, as though
> >     from a virtual SMMU, when a hypervisor receives a stage 2
> >     Translation-related fault indicating CD fetch as a cause (with
> >     CLASS == CD)."
> 
> That sounds like the VMM should be catching the
> F_TRANSLATION and convert it for the CLASS=CD
> 
> > For F_CD_FETCH, at least the CD table pointer in the nested STE is
> > an IPA, and all the entries in the CD table that can be 2-level are
> > IPAs as well. So, we need some kinda reverse translation from a PA
> > to IPA using its stage-2 mapping. I am not sure what's the best way
> > to do that...
> 
> And if the F_TRANSLATION covers the case then maybe this just stays in
> the hypervisor?

> > Otherwise, perhaps not-supporting them in this series might be a
> > safer bet?
> 
> Yeah, I would consider skipping F_CD_FETCH. May also just try it out
> and see what events come out on a CD fetch failure..

I will skip these two for now. Meanwhile, will try some hack to
trigger a FETCH fault.

Thanks
Nicolin
diff mbox series

Patch

diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
index 4435ad7db776..d24c3d8ee397 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h
@@ -1066,6 +1066,7 @@  struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
 int arm_smmu_attach_prepare_vmaster(struct arm_smmu_attach_state *state,
 				    struct iommu_domain *domain);
 void arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state);
+int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt);
 #else
 #define arm_smmu_hw_info NULL
 #define arm_vsmmu_alloc NULL
@@ -1081,6 +1082,12 @@  static inline void
 arm_smmu_attach_commit_vmaster(struct arm_smmu_attach_state *state)
 {
 }
+
+static inline int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster,
+					   u64 *evt)
+{
+	return -EOPNOTSUPP;
+}
 #endif /* CONFIG_ARM_SMMU_V3_IOMMUFD */
 
 #endif /* _ARM_SMMU_V3_H */
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h
index 08cbc6bc3725..cbc30eff302d 100644
--- a/include/uapi/linux/iommufd.h
+++ b/include/uapi/linux/iommufd.h
@@ -1058,9 +1058,24 @@  struct iommufd_vevent_header {
 /**
  * enum iommu_veventq_type - Virtual Event Queue Type
  * @IOMMU_VEVENTQ_TYPE_DEFAULT: Reserved for future use
+ * @IOMMU_VEVENTQ_TYPE_ARM_SMMUV3: ARM SMMUv3 Virtual Event Queue
  */
 enum iommu_veventq_type {
 	IOMMU_VEVENTQ_TYPE_DEFAULT = 0,
+	IOMMU_VEVENTQ_TYPE_ARM_SMMUV3 = 1,
+};
+
+/**
+ * struct iommu_vevent_arm_smmuv3 - ARM SMMUv3 Virtual Event
+ *                                  (IOMMU_VEVENTQ_TYPE_ARM_SMMUV3)
+ * @evt: 256-bit ARM SMMUv3 Event record, little-endian.
+ *       (Refer to "7.3 Event records" in SMMUv3 HW Spec)
+ *
+ * StreamID field reports a virtual device ID. To receive a virtual event for a
+ * device, a vDEVICE must be allocated via IOMMU_VDEVICE_ALLOC.
+ */
+struct iommu_vevent_arm_smmuv3 {
+	__aligned_le64 evt[4];
 };
 
 /**
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
index 98138088fd16..ceeed907a714 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c
@@ -443,4 +443,19 @@  struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev,
 	return &vsmmu->core;
 }
 
+int arm_vmaster_report_event(struct arm_smmu_vmaster *vmaster, u64 *evt)
+{
+	struct iommu_vevent_arm_smmuv3 vevt;
+	int i;
+
+	vevt.evt[0] = cpu_to_le64((evt[0] & ~EVTQ_0_SID) |
+				  FIELD_PREP(EVTQ_0_SID, vmaster->vsid));
+	for (i = 1; i < EVTQ_ENT_DWORDS; i++)
+		vevt.evt[i] = cpu_to_le64(evt[i]);
+
+	return iommufd_viommu_report_event(&vmaster->vsmmu->core,
+					   IOMMU_VEVENTQ_TYPE_ARM_SMMUV3, &vevt,
+					   sizeof(vevt));
+}
+
 MODULE_IMPORT_NS("IOMMUFD");
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 686c171dd273..59fbc342a095 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -1812,8 +1812,8 @@  static void arm_smmu_decode_event(struct arm_smmu_device *smmu, u64 *raw,
 	mutex_unlock(&smmu->streams_mutex);
 }
 
-static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
-			       struct arm_smmu_event *event)
+static int arm_smmu_handle_event(struct arm_smmu_device *smmu, u64 *evt,
+				 struct arm_smmu_event *event)
 {
 	int ret = 0;
 	u32 perm = 0;
@@ -1831,31 +1831,30 @@  static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
 		return -EOPNOTSUPP;
 	}
 
-	if (!event->stall)
-		return -EOPNOTSUPP;
-
-	if (event->read)
-		perm |= IOMMU_FAULT_PERM_READ;
-	else
-		perm |= IOMMU_FAULT_PERM_WRITE;
+	if (event->stall) {
+		if (event->read)
+			perm |= IOMMU_FAULT_PERM_READ;
+		else
+			perm |= IOMMU_FAULT_PERM_WRITE;
 
-	if (event->instruction)
-		perm |= IOMMU_FAULT_PERM_EXEC;
+		if (event->instruction)
+			perm |= IOMMU_FAULT_PERM_EXEC;
 
-	if (event->privileged)
-		perm |= IOMMU_FAULT_PERM_PRIV;
+		if (event->privileged)
+			perm |= IOMMU_FAULT_PERM_PRIV;
 
-	flt->type = IOMMU_FAULT_PAGE_REQ;
-	flt->prm = (struct iommu_fault_page_request) {
-		.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
-		.grpid = event->stag,
-		.perm = perm,
-		.addr = event->iova,
-	};
+		flt->type = IOMMU_FAULT_PAGE_REQ;
+		flt->prm = (struct iommu_fault_page_request){
+			.flags = IOMMU_FAULT_PAGE_REQUEST_LAST_PAGE,
+			.grpid = event->stag,
+			.perm = perm,
+			.addr = event->iova,
+		};
 
-	if (event->ssv) {
-		flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
-		flt->prm.pasid = event->ssid;
+		if (event->ssv) {
+			flt->prm.flags |= IOMMU_FAULT_PAGE_REQUEST_PASID_VALID;
+			flt->prm.pasid = event->ssid;
+		}
 	}
 
 	mutex_lock(&smmu->streams_mutex);
@@ -1865,7 +1864,16 @@  static int arm_smmu_handle_event(struct arm_smmu_device *smmu,
 		goto out_unlock;
 	}
 
-	ret = iommu_report_device_fault(master->dev, &fault_evt);
+	if (event->stall) {
+		ret = iommu_report_device_fault(master->dev, &fault_evt);
+	} else {
+		down_read(&master->vmaster_rwsem);
+		if (master->vmaster && !event->s2)
+			ret = arm_vmaster_report_event(master->vmaster, evt);
+		else
+			ret = -EFAULT; /* Unhandled events should be pinned */
+		up_read(&master->vmaster_rwsem);
+	}
 out_unlock:
 	mutex_unlock(&smmu->streams_mutex);
 	return ret;
@@ -1943,7 +1951,7 @@  static irqreturn_t arm_smmu_evtq_thread(int irq, void *dev)
 	do {
 		while (!queue_remove_raw(q, evt)) {
 			arm_smmu_decode_event(smmu, evt, &event);
-			if (arm_smmu_handle_event(smmu, &event))
+			if (arm_smmu_handle_event(smmu, evt, &event))
 				arm_smmu_dump_event(smmu, evt, &event, &rs);
 
 			put_device(event.dev);