Message ID | 20250613235705.28006-3-robin.clark@oss.qualcomm.com
---|---
State | New
Series | drm/gpuvm: Locking helpers
On Mon, Jun 16, 2025 at 03:25:08PM -0700, Rob Clark wrote:
> On Mon, Jun 16, 2025 at 2:39 PM Danilo Krummrich <dakr@redhat.com> wrote:
> >
> > On Sat, Jun 14, 2025 at 08:03:20AM -0700, Rob Clark wrote:
> > > On Sat, Jun 14, 2025 at 3:39 AM Danilo Krummrich <dakr@redhat.com> wrote:
> > > >
> > > > On Fri, Jun 13, 2025 at 04:57:03PM -0700, Rob Clark wrote:
> > > > > For UNMAP/REMAP steps we could be needing to lock objects that are not
> > > > > explicitly listed in the VM_BIND ioctl in order to tear-down unmapped
> > > > > VAs. These helpers handle locking/preparing the needed objects.
> > > >
> > > > Yes, that's a common use-case. I think drivers typically iterate through
> > > > their drm_gpuva_ops to lock those objects.
> > > >
> > > > I had a look at your link [1] and it seems that you keep a list of ops
> > > > as well by calling vm_op_enqueue() with a new struct msm_vm_op from the
> > > > callbacks.
> > > >
> > > > Please note that for exactly this case there is the op_alloc callback
> > > > in struct drm_gpuvm_ops, such that you can allocate a custom op type
> > > > (i.e. struct msm_vm_op) that embeds a struct drm_gpuva_op.
> > >
> > > I did use drm_gpuvm_sm_xyz_ops_create() in an earlier iteration of my
> > > VM_BIND series, but it wasn't quite what I was after. I wanted to
> > > apply the VM updates immediately to avoid issues with a later
> > > map/unmap overlapping an earlier map, which
> > > drm_gpuvm_sm_xyz_ops_create() doesn't really handle. I'm not even
> > > sure why this isn't a problem for other drivers unless userspace is
> > > providing some guarantees.
> >
> > The drm_gpuva_ops are usually used in a pattern like this.
> >
> > vm_bind {
> >     for_each_vm_bind_operation {
> >         drm_gpuvm_sm_xyz_ops_create();
> >
> >         drm_gpuva_for_each_op {
> >             // modify drm_gpuvm's interval tree
> >             // pre-allocate memory
> >             // lock and prepare objects
> >         }
> >     }
> >
> >     drm_sched_entity_push_job();
> > }
> >
> > run_job {
> >     for_each_vm_bind_operation {
> >         drm_gpuva_for_each_op {
> >             // modify page tables
> >         }
> >     }
> > }
> >
> > run_job {
> >     for_each_vm_bind_operation {
> >         drm_gpuva_for_each_op {
> >             // free page table structures, if any
> >             // free unused pre-allocated memory
> >         }
> >     }
> > }
> >
> > What did you do instead to get map/unmap overlapping? Even more
> > interesting, what are you doing now?
>
> From what I can tell, the drivers using drm_gpuva_for_each_op()/etc are
> doing drm_gpuva_remove() while iterating the ops list..
> drm_gpuvm_sm_xyz_ops_create() itself does not modify the VM. So this
> can only really work if you perform one MAP or UNMAP at a time. Or at
> least if you process the VM modifying part of the ops list before
> proceeding to the next op.

(Added the drm_gpuvm_sm_xyz_ops_create() step above.)

I went through the code you posted [1] and conceptually you're
implementing exactly the pattern I described above, i.e. you do:

vm_bind {
    for_each_vm_bind_operation {
        drm_gpuvm_sm_xyz_exec_lock();
    }

    for_each_vm_bind_operation {
        drm_gpuvm_sm_xyz() {
            // modify drm_gpuvm's interval tree
            // create custom ops
        }
    }

    drm_sched_entity_push_job();
}

run_job {
    for_each_vm_bind_operation {
        for_each_custom_op() {
            // do stuff
        }
    }
}

However, GPUVM intends to solve your use-case with the following,
semantically identical, approach.

vm_bind {
    for_each_vm_bind_operation {
        drm_gpuvm_sm_xyz_ops_create();

        drm_gpuva_for_each_op {
            // modify drm_gpuvm's interval tree
            // lock and prepare objects (1)
        }
    }

    drm_sched_entity_push_job();
}

run_job {
    for_each_vm_bind_operation {
        drm_gpuva_for_each_op() {
            // do stuff
        }
    }
}

(Note that GPUVM already supports extending the existing OP structures;
you should take advantage of that.)

Hence, the helper we really want is one that locks and prepares the
objects at (1), i.e. a helper that takes a pointer to a struct
drm_gpuva_op and locks / validates the corresponding objects.

[1] https://gitlab.freedesktop.org/robclark/msm/-/blob/sparse-newer/drivers/gpu/drm/msm/msm_gem_vma.c
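For illustration, the per-op helper described at (1) could look roughly
like the sketch below. The name drm_gpuvm_op_lock() and its exact
signature are hypothetical (GPUVM does not provide this helper); unlike
the sm_step callbacks in the patch further down, it operates on an
already-created struct drm_gpuva_op:

static int
drm_gpuvm_op_lock(struct drm_exec *exec, struct drm_gpuva_op *op,
                  unsigned int num_fences)
{
        switch (op->op) {
        case DRM_GPUVA_OP_MAP:
                /* A new mapping needs its backing object locked and
                 * prepared (fence slots reserved).
                 */
                if (op->map.gem.obj)
                        return drm_exec_prepare_obj(exec, op->map.gem.obj,
                                                    num_fences);
                return 0;
        case DRM_GPUVA_OP_REMAP:
                /* A remap touches the object backing the VA being split. */
                if (op->remap.unmap->va->gem.obj)
                        return drm_exec_lock_obj(exec,
                                                 op->remap.unmap->va->gem.obj);
                return 0;
        case DRM_GPUVA_OP_UNMAP:
                /* An unmap touches the object backing the removed VA. */
                if (op->unmap.va->gem.obj)
                        return drm_exec_lock_obj(exec, op->unmap.va->gem.obj);
                return 0;
        default:
                return 0;
        }
}

A driver would call this for each op from drm_gpuva_for_each_op() during
the locking phase at (1).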
On Tue, Jun 17, 2025 at 5:48 AM Rob Clark <rob.clark@oss.qualcomm.com> wrote:
>
> On Tue, Jun 17, 2025 at 2:51 AM Danilo Krummrich <dakr@redhat.com> wrote:
> >
> > On Mon, Jun 16, 2025 at 03:25:08PM -0700, Rob Clark wrote:
> > > On Mon, Jun 16, 2025 at 2:39 PM Danilo Krummrich <dakr@redhat.com> wrote:
> > > >
> > > > On Sat, Jun 14, 2025 at 08:03:20AM -0700, Rob Clark wrote:
> > > > > On Sat, Jun 14, 2025 at 3:39 AM Danilo Krummrich <dakr@redhat.com> wrote:
> > > > > >
> > > > > > On Fri, Jun 13, 2025 at 04:57:03PM -0700, Rob Clark wrote:
> > > > > > > For UNMAP/REMAP steps we could be needing to lock objects that
> > > > > > > are not explicitly listed in the VM_BIND ioctl in order to
> > > > > > > tear-down unmapped VAs. These helpers handle locking/preparing
> > > > > > > the needed objects.
> > > > > >
> > > > > > Yes, that's a common use-case. I think drivers typically iterate
> > > > > > through their drm_gpuva_ops to lock those objects.
> > > > > >
> > > > > > I had a look at your link [1] and it seems that you keep a list
> > > > > > of ops as well by calling vm_op_enqueue() with a new struct
> > > > > > msm_vm_op from the callbacks.
> > > > > >
> > > > > > Please note that for exactly this case there is the op_alloc
> > > > > > callback in struct drm_gpuvm_ops, such that you can allocate a
> > > > > > custom op type (i.e. struct msm_vm_op) that embeds a struct
> > > > > > drm_gpuva_op.
> > > > >
> > > > > I did use drm_gpuvm_sm_xyz_ops_create() in an earlier iteration of my
> > > > > VM_BIND series, but it wasn't quite what I was after. I wanted to
> > > > > apply the VM updates immediately to avoid issues with a later
> > > > > map/unmap overlapping an earlier map, which
> > > > > drm_gpuvm_sm_xyz_ops_create() doesn't really handle. I'm not even
> > > > > sure why this isn't a problem for other drivers unless userspace is
> > > > > providing some guarantees.
> > > >
> > > > The drm_gpuva_ops are usually used in a pattern like this.
> > > >
> > > > vm_bind {
> > > >     for_each_vm_bind_operation {
> > > >         drm_gpuvm_sm_xyz_ops_create();
> > > >
> > > >         drm_gpuva_for_each_op {
> > > >             // modify drm_gpuvm's interval tree
> > > >             // pre-allocate memory
> > > >             // lock and prepare objects
> > > >         }
> > > >     }
> > > >
> > > >     drm_sched_entity_push_job();
> > > > }
> > > >
> > > > run_job {
> > > >     for_each_vm_bind_operation {
> > > >         drm_gpuva_for_each_op {
> > > >             // modify page tables
> > > >         }
> > > >     }
> > > > }
> > > >
> > > > run_job {
> > > >     for_each_vm_bind_operation {
> > > >         drm_gpuva_for_each_op {
> > > >             // free page table structures, if any
> > > >             // free unused pre-allocated memory
> > > >         }
> > > >     }
> > > > }
> > > >
> > > > What did you do instead to get map/unmap overlapping? Even more
> > > > interesting, what are you doing now?
> > >
> > > From what I can tell, the drivers using drm_gpuva_for_each_op()/etc
> > > are doing drm_gpuva_remove() while iterating the ops list..
> > > drm_gpuvm_sm_xyz_ops_create() itself does not modify the VM. So this
> > > can only really work if you perform one MAP or UNMAP at a time. Or at
> > > least if you process the VM modifying part of the ops list before
> > > proceeding to the next op.
> >
> > (Added the drm_gpuvm_sm_xyz_ops_create() step above.)
> >
> > I went through the code you posted [1] and conceptually you're
> > implementing exactly the pattern I described above, i.e. you do:
> >
> > vm_bind {
> >     for_each_vm_bind_operation {
> >         drm_gpuvm_sm_xyz_exec_lock();
> >     }
> >
> >     for_each_vm_bind_operation {
> >         drm_gpuvm_sm_xyz() {
> >             // modify drm_gpuvm's interval tree
> >             // create custom ops
> >         }
> >     }
> >
> >     drm_sched_entity_push_job();
> > }
> >
> > run_job {
> >     for_each_vm_bind_operation {
> >         for_each_custom_op() {
> >             // do stuff
> >         }
> >     }
> > }
>
> Close, but by the time we get to run_job there is just a single list
> of ops covering all the vm_bind operations:
>
> run_job {
>     for_each_custom_op() {
>         // do stuff
>     }
> }
>
> rather than a list of va ops per vm_bind op.
>
> > However, GPUVM intends to solve your use-case with the following,
> > semantically identical, approach.
> >
> > vm_bind {
> >     for_each_vm_bind_operation {
> >         drm_gpuvm_sm_xyz_ops_create();
> >
> >         drm_gpuva_for_each_op {
> >             // modify drm_gpuvm's interval tree
> >             // lock and prepare objects (1)
>
> I currently decouple lock+pin from VM modification to avoid an error
> path that leaves the VM partially modified. Once you add this back
> in, the va-ops approach isn't simpler, IMHO.

Oh, actually scratch that.. using va-ops, it is not even possible to
decouple locking/prepare from VM modifications. So using
DRM_EXEC_INTERRUPTIBLE_WAIT, for ex, with a va-ops list would be an
actively bad idea.

BR,
-R

> >         }
> >     }
> >
> >     drm_sched_entity_push_job();
> > }
> >
> > run_job {
> >     for_each_vm_bind_operation {
> >         drm_gpuva_for_each_op() {
> >             // do stuff
> >         }
> >     }
> > }
> >
> > (Note that GPUVM already supports extending the existing OP
> > structures; you should take advantage of that.)
> >
> > Hence, the helper we really want is one that locks and prepares the
> > objects at (1), i.e. a helper that takes a pointer to a struct
> > drm_gpuva_op and locks / validates the corresponding objects.
>
> I still prefer that we don't _require_ using va-ops. But if it makes
> it more useful for other drivers I could add a helper which
> exec_lock's based on a list of va-ops instead.
>
> BR,
> -R
>
> > [1] https://gitlab.freedesktop.org/robclark/msm/-/blob/sparse-newer/drivers/gpu/drm/msm/msm_gem_vma.c
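For reference, the op_alloc/op_free mechanism mentioned in the thread
lets a driver embed struct drm_gpuva_op in its own op type, so ops on a
drm_gpuva_ops list can carry driver-private state without a separate
allocation. A minimal sketch, with illustrative names and fields (not
taken from the msm driver):

struct msm_vm_op {
        struct drm_gpuva_op base;       /* embedded GPUVM op */
        struct list_head node;          /* illustrative driver-private state */
};

static struct drm_gpuva_op *msm_vm_op_alloc(void)
{
        struct msm_vm_op *op = kzalloc(sizeof(*op), GFP_KERNEL);

        /* GPUVM only sees the embedded base op. */
        return op ? &op->base : NULL;
}

static void msm_vm_op_free(struct drm_gpuva_op *op)
{
        kfree(container_of(op, struct msm_vm_op, base));
}

static const struct drm_gpuvm_ops msm_vm_ops = {
        .op_alloc = msm_vm_op_alloc,
        .op_free  = msm_vm_op_free,
};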
diff --git a/drivers/gpu/drm/drm_gpuvm.c b/drivers/gpu/drm/drm_gpuvm.c
index 0ca717130541..197066dd390b 100644
--- a/drivers/gpu/drm/drm_gpuvm.c
+++ b/drivers/gpu/drm/drm_gpuvm.c
@@ -2390,6 +2390,87 @@ drm_gpuvm_sm_unmap(struct drm_gpuvm *gpuvm, void *priv,
 }
 EXPORT_SYMBOL_GPL(drm_gpuvm_sm_unmap);
 
+static int
+drm_gpuva_sm_step_lock(struct drm_gpuva_op *op, void *priv)
+{
+        struct drm_exec *exec = priv;
+
+        switch (op->op) {
+        case DRM_GPUVA_OP_REMAP:
+                if (op->remap.unmap->va->gem.obj)
+                        return drm_exec_lock_obj(exec, op->remap.unmap->va->gem.obj);
+                return 0;
+        case DRM_GPUVA_OP_UNMAP:
+                if (op->unmap.va->gem.obj)
+                        return drm_exec_lock_obj(exec, op->unmap.va->gem.obj);
+                return 0;
+        default:
+                return 0;
+        }
+}
+
+static const struct drm_gpuvm_ops lock_ops = {
+        .sm_step_map = drm_gpuva_sm_step_lock,
+        .sm_step_remap = drm_gpuva_sm_step_lock,
+        .sm_step_unmap = drm_gpuva_sm_step_lock,
+};
+
+/**
+ * drm_gpuvm_sm_map_lock() - locks the objects touched by a drm_gpuvm_sm_map()
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
+ * @exec: the &drm_exec locking context
+ * @num_fences: for newly mapped objects, the # of fences to reserve
+ * @req_addr: the start address of the new mapping
+ * @req_range: the range of the new mapping
+ * @req_obj: the &drm_gem_object to map
+ * @req_offset: the offset within the &drm_gem_object
+ *
+ * This function locks (drm_exec_lock_obj()) objects that will be unmapped/
+ * remapped, and locks+prepares (drm_exec_prepare_obj()) objects that
+ * will be newly mapped.
+ *
+ * Returns: 0 on success or a negative error code
+ */
+int
+drm_gpuvm_sm_map_lock(struct drm_gpuvm *gpuvm,
+                      struct drm_exec *exec, unsigned int num_fences,
+                      u64 req_addr, u64 req_range,
+                      struct drm_gem_object *req_obj, u64 req_offset)
+{
+        if (req_obj) {
+                int ret = drm_exec_prepare_obj(exec, req_obj, num_fences);
+                if (ret)
+                        return ret;
+        }
+
+        return __drm_gpuvm_sm_map(gpuvm, &lock_ops, exec,
+                                  req_addr, req_range,
+                                  req_obj, req_offset);
+
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_sm_map_lock);
+
+/**
+ * drm_gpuvm_sm_unmap_lock() - locks the objects touched by drm_gpuvm_sm_unmap()
+ * @gpuvm: the &drm_gpuvm representing the GPU VA space
+ * @exec: the &drm_exec locking context
+ * @req_addr: the start address of the range to unmap
+ * @req_range: the range of the mappings to unmap
+ *
+ * This function locks (drm_exec_lock_obj()) objects that will be unmapped/
+ * remapped by drm_gpuvm_sm_unmap().
+ *
+ * Returns: 0 on success or a negative error code
+ */
+int
+drm_gpuvm_sm_unmap_lock(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
+                        u64 req_addr, u64 req_range)
+{
+        return __drm_gpuvm_sm_unmap(gpuvm, &lock_ops, exec,
+                                    req_addr, req_range);
+}
+EXPORT_SYMBOL_GPL(drm_gpuvm_sm_unmap_lock);
+
 static struct drm_gpuva_op *
 gpuva_op_alloc(struct drm_gpuvm *gpuvm)
 {
diff --git a/include/drm/drm_gpuvm.h b/include/drm/drm_gpuvm.h
index 0ef837a331d6..a769dccfb3ae 100644
--- a/include/drm/drm_gpuvm.h
+++ b/include/drm/drm_gpuvm.h
@@ -1211,6 +1211,14 @@ int drm_gpuvm_sm_map(struct drm_gpuvm *gpuvm, void *priv,
 int drm_gpuvm_sm_unmap(struct drm_gpuvm *gpuvm, void *priv,
                        u64 req_addr, u64 req_range);
 
+int drm_gpuvm_sm_map_lock(struct drm_gpuvm *gpuvm,
+                          struct drm_exec *exec, unsigned int num_fences,
+                          u64 req_addr, u64 req_range,
+                          struct drm_gem_object *obj, u64 offset);
+
+int drm_gpuvm_sm_unmap_lock(struct drm_gpuvm *gpuvm, struct drm_exec *exec,
+                            u64 req_addr, u64 req_range);
+
 void drm_gpuva_map(struct drm_gpuvm *gpuvm,
                    struct drm_gpuva *va,
                    struct drm_gpuva_op_map *op);
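For illustration, a rough sketch of how these helpers might be used in
a driver's VM_BIND locking phase. The struct vm_bind_op description and
the surrounding function are made-up placeholders, not code from this
patch or the msm series:

/* Hypothetical per-ioctl-op description. */
struct vm_bind_op {
        u64 addr;
        u64 range;
        u64 offset;
        struct drm_gem_object *obj;     /* NULL for an unmap op */
};

static int vm_bind_lock(struct drm_gpuvm *vm, struct drm_exec *exec,
                        const struct vm_bind_op *ops, unsigned int nr_ops)
{
        int ret = 0;

        drm_exec_init(exec, DRM_EXEC_INTERRUPTIBLE_WAIT, 0);
        drm_exec_until_all_locked(exec) {
                for (unsigned int i = 0; i < nr_ops; i++) {
                        const struct vm_bind_op *op = &ops[i];

                        /* Locks the objects of VAs that would be unmapped
                         * or remapped, and locks+prepares the object being
                         * newly mapped.
                         */
                        if (op->obj)
                                ret = drm_gpuvm_sm_map_lock(vm, exec, 1,
                                                            op->addr,
                                                            op->range,
                                                            op->obj,
                                                            op->offset);
                        else
                                ret = drm_gpuvm_sm_unmap_lock(vm, exec,
                                                              op->addr,
                                                              op->range);
                        drm_exec_retry_on_contention(exec);
                        if (ret)
                                break;
                }
                if (ret)
                        break;
        }
        if (ret)
                drm_exec_fini(exec);

        return ret;
}

On success the caller would keep the exec context locked while applying
the VM changes and installing fences, and call drm_exec_fini() afterwards.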
For UNMAP/REMAP steps we may need to lock objects that are not
explicitly listed in the VM_BIND ioctl in order to tear down unmapped
VAs. These helpers handle locking/preparing the needed objects.

Note that these functions do not strictly require the VM changes to be
applied before the next drm_gpuvm_sm_map_lock()/_unmap_lock() call. In
the case that VM changes from an earlier drm_gpuvm_sm_map()/_unmap()
call result in a differing sequence of steps when the VM changes are
actually applied, it will be the same set of GEM objects involved, so
the locking is still correct.

Signed-off-by: Rob Clark <robin.clark@oss.qualcomm.com>
---
 drivers/gpu/drm/drm_gpuvm.c | 81 +++++++++++++++++++++++++++++++++++++
 include/drm/drm_gpuvm.h     |  8 ++++
 2 files changed, 89 insertions(+)
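To make the note above concrete, consider two operations in one VM_BIND
ioctl (addresses made up):

    MAP 0x100000-0x300000 gem_A     -> locks and prepares gem_A
    MAP 0x200000-0x300000 gem_B     -> locks and prepares gem_B

If the first MAP has not yet been applied to the VM when the second is
locked, the second is seen as a plain MAP; once the first has been
applied, the same second operation additionally REMAPs the tail of the
gem_A mapping. Either way, the set of GEM objects involved is exactly
{gem_A, gem_B}, both already locked, so the ordering of locking vs.
applying does not change the lock set.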