diff mbox series

[2/2] drm/gpuvm: Add locking helpers

Message ID 20250613235705.28006-3-robin.clark@oss.qualcomm.com
State New
Headers show
Series drm/gpuvm: Locking helpers | expand

Commit Message

Rob Clark June 13, 2025, 11:57 p.m. UTC
For UNMAP/REMAP steps we could be needing to lock objects that are not
explicitly listed in the VM_BIND ioctl in order to tear-down unmapped
VAs.  These helpers handle locking/preparing the needed objects.

Note that these functions do not strictly require the VM changes to be
applied before the next drm_gpuvm_sm_map_lock()/_unmap_lock() call.  In
the case that VM changes from an earlier drm_gpuvm_sm_map()/_unmap()
call result in a differing sequence of steps when the VM changes are
actually applied, it will be the same set of GEM objects involved, so
the locking is still correct.

Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
---
 drivers/gpu/drm/drm_gpuvm.c | 81 +++++++++++++++++++++++++++++++++++++
 include/drm/drm_gpuvm.h     |  8 ++++
 2 files changed, 89 insertions(+)

Comments

Danilo Krummrich June 17, 2025, 9:51 a.m. UTC | #1
On Mon, Jun 16, 2025 at 03:25:08PM -0700, Rob Clark wrote:
> On Mon, Jun 16, 2025 at 2:39 PM Danilo Krummrich <dakr@redhat.com> wrote:
> >
> > On Sat, Jun 14, 2025 at 08:03:20AM -0700, Rob Clark wrote:
> > > On Sat, Jun 14, 2025 at 3:39 AM Danilo Krummrich <dakr@redhat.com> wrote:
> > > >
> > > > On Fri, Jun 13, 2025 at 04:57:03PM -0700, Rob Clark wrote:
> > > > > For UNMAP/REMAP steps we could be needing to lock objects that are not
> > > > > explicitly listed in the VM_BIND ioctl in order to tear-down unmapped
> > > > > VAs.  These helpers handle locking/preparing the needed objects.
> > > >
> > > > Yes, that's a common use-case. I think drivers typically iterate through their
> > > > drm_gpuva_ops to lock those objects.
> > > >
> > > > I had a look at you link [1] and it seems that you keep a list of ops as well by
> > > > calling vm_op_enqueue() with a new struct msm_vm_op from the callbacks.
> > > >
> > > > Please note that for exactly this case there is the op_alloc callback in
> > > > struct drm_gpuvm_ops, such that you can allocate a custom op type (i.e. struct
> > > > msm_vm_op) that embedds a struct drm_gpuva_op.
> > >
> > > I did use drm_gpuvm_sm_xyz_ops_create() in an earlier iteration of my
> > > VM_BIND series, but it wasn't quite what I was after.  I wanted to
> > > apply the VM updates immediately to avoid issues with a later
> > > map/unmap overlapping an earlier map, which
> > > drm_gpuvm_sm_xyz_ops_create() doesn't really handle.  I'm not even
> > > sure why this isn't a problem for other drivers unless userspace is
> > > providing some guarantees.
> >
> > The drm_gpuva_ops are usually used in a pattern like this.
> >
> >         vm_bind {
> >                 for_each_vm_bind_operation {
			    drm_gpuvm_sm_xyz_ops_create();
> >                         drm_gpuva_for_each_op {
> >                                 // modify drm_gpuvm's interval tree
> >                                 // pre-allocate memory
> >                                 // lock and prepare objects
> >                         }
> >                 }
> >
> >                 drm_sched_entity_push_job();
> >         }
> >
> >         run_job {
> >                 for_each_vm_bind_operation {
> >                         drm_gpuva_for_each_op {
> >                                 // modify page tables
> >                         }
> >                 }
> >         }
> >
> >         run_job {
> >                 for_each_vm_bind_operation {
> >                         drm_gpuva_for_each_op {
> >                                 // free page table structures, if any
> >                                 // free unused pre-allocated memory
> >                         }
> >                 }
> >         }
> >
> > What did you do instead to get map/unmap overlapping? Even more interesting,
> > what are you doing now?
> 
> From what I can tell, the drivers using drm_gpva_for_each_op()/etc are
> doing drm_gpuva_remove() while iterating the ops list..
> drm_gpuvm_sm_xyz_ops_create() itself does not modify the VM.  So this
> can only really work if you perform one MAP or UNMAP at a time.  Or at
> least if you process the VM modifying part of the ops list before
> proceeding to the next op.

(Added the drm_gpuvm_sm_xyz_ops_create() step above.)

I went through the code you posted [1] and conceptually you're implementing
exactly the pattern I described above, i.e. you do:

	vm_bind {
		for_each_vm_bind_operation {
			drm_gpuvm_sm_xyz_exec_lock();
		}

		for_each_vm_bind_operation {
			drm_gpuvm_sm_xyz() {
				// modify drm_gpuvm's interval tree
				// create custom ops
			}
		}

		drm_sched_entity_push_job();
	}

	run_job {
		for_each_vm_bind_operation {
			for_each_custom_op() {
				// do stuff
			}
		}
	}

However, GPUVM intends to solve your use-case with the following, semantically
identical, approach.

	vm_bind {
		for_each_vm_bind_operation {
			drm_gpuvm_sm_xyz_ops_create();

			drm_gpuva_for_each_op {
				// modify drm_gpuvm's interval tree
				// lock and prepare objects (1)
			}
		}

		drm_sched_entity_push_job();
	}

	run_job {
		for_each_vm_bind_operation {
			drm_gpuva_for_each_op() {
				// do stuff
			}
		}
	}

(Note that GPUVM already supports to extend the existing OP structures; you
should take advantage of that.)

Hence, the helper we really want is to lock and prepare the objects at (1). I.e.
a helper that takes a pointer to a struct drm_gpuva_op and locks / validates the
corresponding objects.

[1] https://gitlab.freedesktop.org/robclark/msm/-/blob/sparse-newer/drivers/gpu/drm/msm/msm_gem_vma.c
Rob Clark June 17, 2025, 1:43 p.m. UTC | #2
On Tue, Jun 17, 2025 at 5:48 AM Rob Clark <rob.clark@oss.qualcomm.com> wrote:
>
> On Tue, Jun 17, 2025 at 2:51 AM Danilo Krummrich <dakr@redhat.com> wrote:
> >
> > On Mon, Jun 16, 2025 at 03:25:08PM -0700, Rob Clark wrote:
> > > On Mon, Jun 16, 2025 at 2:39 PM Danilo Krummrich <dakr@redhat.com> wrote:
> > > >
> > > > On Sat, Jun 14, 2025 at 08:03:20AM -0700, Rob Clark wrote:
> > > > > On Sat, Jun 14, 2025 at 3:39 AM Danilo Krummrich <dakr@redhat.com> wrote:
> > > > > >
> > > > > > On Fri, Jun 13, 2025 at 04:57:03PM -0700, Rob Clark wrote:
> > > > > > > For UNMAP/REMAP steps we could be needing to lock objects that are not
> > > > > > > explicitly listed in the VM_BIND ioctl in order to tear-down unmapped
> > > > > > > VAs.  These helpers handle locking/preparing the needed objects.
> > > > > >
> > > > > > Yes, that's a common use-case. I think drivers typically iterate through their
> > > > > > drm_gpuva_ops to lock those objects.
> > > > > >
> > > > > > I had a look at you link [1] and it seems that you keep a list of ops as well by
> > > > > > calling vm_op_enqueue() with a new struct msm_vm_op from the callbacks.
> > > > > >
> > > > > > Please note that for exactly this case there is the op_alloc callback in
> > > > > > struct drm_gpuvm_ops, such that you can allocate a custom op type (i.e. struct
> > > > > > msm_vm_op) that embedds a struct drm_gpuva_op.
> > > > >
> > > > > I did use drm_gpuvm_sm_xyz_ops_create() in an earlier iteration of my
> > > > > VM_BIND series, but it wasn't quite what I was after.  I wanted to
> > > > > apply the VM updates immediately to avoid issues with a later
> > > > > map/unmap overlapping an earlier map, which
> > > > > drm_gpuvm_sm_xyz_ops_create() doesn't really handle.  I'm not even
> > > > > sure why this isn't a problem for other drivers unless userspace is
> > > > > providing some guarantees.
> > > >
> > > > The drm_gpuva_ops are usually used in a pattern like this.
> > > >
> > > >         vm_bind {
> > > >                 for_each_vm_bind_operation {
> >                             drm_gpuvm_sm_xyz_ops_create();
> > > >                         drm_gpuva_for_each_op {
> > > >                                 // modify drm_gpuvm's interval tree
> > > >                                 // pre-allocate memory
> > > >                                 // lock and prepare objects
> > > >                         }
> > > >                 }
> > > >
> > > >                 drm_sched_entity_push_job();
> > > >         }
> > > >
> > > >         run_job {
> > > >                 for_each_vm_bind_operation {
> > > >                         drm_gpuva_for_each_op {
> > > >                                 // modify page tables
> > > >                         }
> > > >                 }
> > > >         }
> > > >
> > > >         run_job {
> > > >                 for_each_vm_bind_operation {
> > > >                         drm_gpuva_for_each_op {
> > > >                                 // free page table structures, if any
> > > >                                 // free unused pre-allocated memory
> > > >                         }
> > > >                 }
> > > >         }
> > > >
> > > > What did you do instead to get map/unmap overlapping? Even more interesting,
> > > > what are you doing now?
> > >
> > > From what I can tell, the drivers using drm_gpva_for_each_op()/etc are
> > > doing drm_gpuva_remove() while iterating the ops list..
> > > drm_gpuvm_sm_xyz_ops_create() itself does not modify the VM.  So this
> > > can only really work if you perform one MAP or UNMAP at a time.  Or at
> > > least if you process the VM modifying part of the ops list before
> > > proceeding to the next op.
> >
> > (Added the drm_gpuvm_sm_xyz_ops_create() step above.)
> >
> > I went through the code you posted [1] and conceptually you're implementing
> > exactly the pattern I described above, i.e. you do:
> >
> >         vm_bind {
> >                 for_each_vm_bind_operation {
> >                         drm_gpuvm_sm_xyz_exec_lock();
> >                 }
> >
> >                 for_each_vm_bind_operation {
> >                         drm_gpuvm_sm_xyz() {
> >                                 // modify drm_gpuvm's interval tree
> >                                 // create custom ops
> >                         }
> >                 }
> >
> >                 drm_sched_entity_push_job();
> >         }
> >
> >         run_job {
> >                 for_each_vm_bind_operation {
> >                         for_each_custom_op() {
> >                                 // do stuff
> >                         }
> >                 }
> >         }
>
> Close, but by the time we get to run_job there is just a single list
> of ops covering all the vm_bind operations:
>
>         run_job {
>                 for_each_custom_op() {
>                         // do stuff
>                 }
>         }
>
> rather than a list of va ops per vm_bind op.
>
> > However, GPUVM intends to solve your use-case with the following, semantically
> > identical, approach.
> >
> >         vm_bind {
> >                 for_each_vm_bind_operation {
> >                         drm_gpuvm_sm_xyz_ops_create();
> >
> >                         drm_gpuva_for_each_op {
> >                                 // modify drm_gpuvm's interval tree
> >                                 // lock and prepare objects (1)
>
> I currently decouple lock+pin from VM modification to avoid an error
> path that leaves the VM partially modified.  Once you add this back
> in, the va-ops approach isn't simpler, IMHO.

Oh, actually scratch that.. using va-ops, it is not even possible to
decouple locking/prepare from VM modifications.  So using
DRM_EXEC_INTERRUPTIBLE_WAIT, for ex, with va-ops list would be an
actively bad idea.

BR,
-R

> >                         }
> >                 }
> >
> >                 drm_sched_entity_push_job();
> >         }
> >
> >         run_job {
> >                 for_each_vm_bind_operation {
> >                         drm_gpuva_for_each_op() {
> >                                 // do stuff
> >                         }
> >                 }
> >         }
> >
> > (Note that GPUVM already supports to extend the existing OP structures; you
> > should take advantage of that.)
> >
> > Hence, the helper we really want is to lock and prepare the objects at (1). I.e.
> > a helper that takes a pointer to a struct drm_gpuva_op and locks / validates the
> > corresponding objects.
>
> I still prefer that we don't _require_ using va-ops.  But if it makes
> it more useful for other drivers I could add a helper which
> exec_lock's based on a list of va-ops instead.
>
> BR,
> -R
>
> > [1] https://gitlab.freedesktop.org/robclark/msm/-/blob/sparse-newer/drivers/gpu/drm/msm/msm_gem_vma.c
> >
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 0ca717130541..197066dd390b 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -2390,6 +2390,87 @@  drm_gpuvm_sm_unmap(struct drm_gpuvm *gpuvm, void *priv,
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_sm_unmap);
 
+static int
+drm_gpuva_sm_step_lock(struct drm_gpuva_op *op, void *priv)
+{
+	struct drm_exec *exec = priv;
+
+	switch (op->op) {
+	case DRM_GPUVA_OP_REMAP:
+		if (op->remap.unmap->va->gem.obj)
+			return drm_exec_lock_obj(exec, op->remap.unmap->va->gem.obj);
+		return 0;
+	case DRM_GPUVA_OP_UNMAP:
+		if (op->unmap.va->gem.obj)
+			return drm_exec_lock_obj(exec, op->unmap.va->gem.obj);
+		return 0;
+	default:
+		return 0;
+	}
+}
+
+static const struct drm_gpuvm_ops lock_ops = {
+	.sm_step_map = drm_gpuva_sm_step_lock,
+	.sm_step_remap = drm_gpuva_sm_step_lock,
+	.sm_step_unmap = drm_gpuva_sm_step_lock,
+};
+
+/**
+ * drm_gpuvm_sm_map_lock() - locks the objects touched by a drm_gpuvm_sm_map()
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
+ * @exec: the &drm_exec locking context
+ * @num_fences: for newly mapped objects, the # of fences to reserve
+ * @req_addr: the start address of the range to unmap
+ * @req_range: the range of the mappings to unmap
+ * @req_obj: the &drm_gem_object to map
+ * @req_offset: the offset within the &drm_gem_object
+ *
+ * This function locks (drm_exec_lock_obj()) objects that will be unmapped/
+ * remapped, and locks+prepares (drm_exec_prepare_object()) objects that
+ * will be newly mapped.
+ *
+ * Returns: 0 on success or a negative error code
+ */
+int
+drm_gpuvm_sm_map_lock(struct drm_gpuvm *gpuvm,
+		      struct drm_exec *exec, unsigned int num_fences,
+		      u64 req_addr, u64 req_range,
+		      struct drm_gem_object *req_obj, u64 req_offset)
+{
+	if (req_obj) {
+		int ret = drm_exec_prepare_obj(exec, req_obj, num_fences);
+		if (ret)
+			return ret;
+	}
+
+	return __drm_gpuvm_sm_map(gpuvm, &lock_ops, exec,
+				  req_addr, req_range,
+				  req_obj, req_offset);
+
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_sm_map_lock);
+
+/**
+ * drm_gpuvm_sm_unmap_lock() - locks the objects touched by drm_gpuvm_sm_unmap()
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
+ * @exec: the &drm_exec locking context
+ * @req_addr: the start address of the range to unmap
+ * @req_range: the range of the mappings to unmap
+ *
+ * This function locks (drm_exec_lock_obj()) objects that will be unmapped/
+ * remapped by drm_gpuvm_sm_unmap().
+ *
+ * Returns: 0 on success or a negative error code
+ */
+int
+drm_gpuvm_sm_unmap_lock(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
+			u64 req_addr, u64 req_range)
+{
+	return __drm_gpuvm_sm_unmap(gpuvm, &lock_ops, exec,
+				    req_addr, req_range);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_sm_unmap_lock);
+
 static struct drm_gpuva_op *
 gpuva_op_alloc(struct drm_gpuvm *gpuvm)
 {
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index 0ef837a331d6..a769dccfb3ae 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -1211,6 +1211,14 @@  int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
 int drm_gpuvm_sm_unmap(struct drm_gpuvm *gpuvm, void *priv,
 		       u64 req_addr, u64 req_range);
 
+int drm_gpuvm_sm_map_lock(struct drm_gpuvm *gpuvm,
+			  struct drm_exec *exec, unsigned int num_fences,
+			  u64 req_addr, u64 req_range,
+			  struct drm_gem_object *obj, u64 offset);
+
+int drm_gpuvm_sm_unmap_lock(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
+			    u64 req_addr, u64 req_range);
+
 void drm_gpuva_map(struct drm_gpuvm *gpuvm,
 		   struct drm_gpuva *va,
 		   struct drm_gpuva_op_map *op);