From patchwork Mon May 19 17:51:27 2025
X-Patchwork-Submitter: Rob Clark
X-Patchwork-Id: 891139
From: Rob Clark
To: dri-devel@lists.freedesktop.org
Cc: freedreno@lists.freedesktop.org, linux-arm-msm@vger.kernel.org,
 Connor Abbott, Rob Clark, Philipp Stanner, Danilo Krummrich,
 Matthew Brost, Christian König, Maarten Lankhorst, Maxime Ripard,
 Thomas Zimmermann, David Airlie, Simona Vetter,
 linux-kernel@vger.kernel.org (open list)
Subject: [PATCH v5 04/40] drm/sched: Add enqueue credit limit
Date: Mon, 19 May 2025 10:51:27 -0700
Message-ID: <20250519175348.11924-5-robdclark@gmail.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250519175348.11924-1-robdclark@gmail.com>
References: <20250519175348.11924-1-robdclark@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

From: Rob Clark

Similar to the existing credit limit mechanism, but applying to jobs
enqueued to the scheduler but not yet run.

The use case is to put an upper bound on preallocated, and potentially
unneeded, pgtable pages.  When this limit is exceeded, pushing new jobs
will block until the count drops back to the limit or below.

Cc: Philipp Stanner
Cc: Danilo Krummrich
Signed-off-by: Rob Clark
---
 drivers/gpu/drm/scheduler/sched_entity.c | 19 +++++++++++++++++--
 drivers/gpu/drm/scheduler/sched_main.c   |  3 +++
 include/drm/gpu_scheduler.h              | 24 +++++++++++++++++++++++-
 3 files changed, 43 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/scheduler/sched_entity.c b/drivers/gpu/drm/scheduler/sched_entity.c
index bd39db7bb240..8e6b12563348 100644
--- a/drivers/gpu/drm/scheduler/sched_entity.c
+++ b/drivers/gpu/drm/scheduler/sched_entity.c
@@ -579,12 +579,25 @@ void drm_sched_entity_select_rq(struct drm_sched_entity *entity)
  * fence sequence number this function should be called with drm_sched_job_arm()
  * under common lock for the struct drm_sched_entity that was set up for
  * @sched_job in drm_sched_job_init().
+ *
+ * If enqueue_credit_limit is used, this can return -ERESTARTSYS if the system
+ * call is interrupted.
  */
-void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
+int drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 {
 	struct drm_sched_entity *entity = sched_job->entity;
+	struct drm_gpu_scheduler *sched = sched_job->sched;
 	bool first;
 	ktime_t submit_ts;
+	int ret;
+
+	ret = wait_event_interruptible(
+			sched->job_scheduled,
+			atomic_read(&sched->enqueue_credit_count) <=
+			sched->enqueue_credit_limit);
+	if (ret)
+		return ret;
+	atomic_add(sched_job->enqueue_credits, &sched->enqueue_credit_count);
 
 	trace_drm_sched_job(sched_job, entity);
 	atomic_inc(entity->rq->sched->score);
@@ -609,7 +622,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 			spin_unlock(&entity->lock);
 
 			DRM_ERROR("Trying to push to a killed entity\n");
-			return;
+			return -EINVAL;
 		}
 
 		rq = entity->rq;
@@ -626,5 +639,7 @@ void drm_sched_entity_push_job(struct drm_sched_job *sched_job)
 
 		drm_sched_wakeup(sched);
 	}
+
+	return 0;
 }
 EXPORT_SYMBOL(drm_sched_entity_push_job);
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index cda1216adfa4..5f812253656a 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -1221,6 +1221,7 @@ static void drm_sched_run_job_work(struct work_struct *w)
 
 	trace_drm_run_job(sched_job, entity);
 	fence = sched->ops->run_job(sched_job);
+	atomic_sub(sched_job->enqueue_credits, &sched->enqueue_credit_count);
 	complete_all(&entity->entity_idle);
 	drm_sched_fence_scheduled(s_fence, fence);
 
@@ -1257,6 +1258,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
 
 	sched->ops = args->ops;
 	sched->credit_limit = args->credit_limit;
+	sched->enqueue_credit_limit = args->enqueue_credit_limit;
 	sched->name = args->name;
 	sched->timeout = args->timeout;
 	sched->hang_limit = args->hang_limit;
@@ -1312,6 +1314,7 @@ int drm_sched_init(struct drm_gpu_scheduler *sched, const struct drm_sched_init_
 	INIT_LIST_HEAD(&sched->pending_list);
 	spin_lock_init(&sched->job_list_lock);
 	atomic_set(&sched->credit_count, 0);
+	atomic_set(&sched->enqueue_credit_count, 0);
 	INIT_DELAYED_WORK(&sched->work_tdr, drm_sched_job_timedout);
 	INIT_WORK(&sched->work_run_job, drm_sched_run_job_work);
 	INIT_WORK(&sched->work_free_job, drm_sched_free_job_work);
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index da64232c989d..8ec5000f81e1 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -366,6 +366,19 @@ struct drm_sched_job {
 
 	enum drm_sched_priority		s_priority;
 	u32				credits;
+	/**
+	 * @enqueue_credits: the number of enqueue credits this job
+	 * contributes to the drm_gpu_scheduler.enqueue_credit_count.
+	 *
+	 * The (optional) @enqueue_credits should be set before calling
+	 * drm_sched_entity_push_job().  When the sum of all the jobs pushed
+	 * to the entity, but not yet having their run_job() callback
+	 * called exceeds @drm_gpu_scheduler.enqueue_credit_limit, the
+	 * drm_sched_entity_push_job() will block until the count drops
+	 * back to or below the limit, providing a way to throttle the
+	 * number of queued, but not yet run, jobs.
+	 */
+	u32				enqueue_credits;
 	/** @last_dependency: tracks @dependencies as they signal */
 	unsigned int			last_dependency;
 	atomic_t			karma;
@@ -485,6 +498,10 @@ struct drm_sched_backend_ops {
  * @ops: backend operations provided by the driver.
  * @credit_limit: the credit limit of this scheduler
  * @credit_count: the current credit count of this scheduler
+ * @enqueue_credit_limit: the credit limit of jobs pushed to scheduler and not
+ *                        yet run
+ * @enqueue_credit_count: the current credit count of jobs pushed to scheduler
+ *                        but not yet run
  * @timeout: the time after which a job is removed from the scheduler.
  * @name: name of the ring for which this scheduler is being used.
  * @num_rqs: Number of run-queues. This is at most DRM_SCHED_PRIORITY_COUNT,
@@ -518,6 +535,8 @@ struct drm_gpu_scheduler {
 	const struct drm_sched_backend_ops	*ops;
 	u32				credit_limit;
 	atomic_t			credit_count;
+	u32				enqueue_credit_limit;
+	atomic_t			enqueue_credit_count;
 	long				timeout;
 	const char			*name;
 	u32				num_rqs;
@@ -550,6 +569,8 @@ struct drm_gpu_scheduler {
  * @num_rqs: Number of run-queues. This may be at most DRM_SCHED_PRIORITY_COUNT,
  *	     as there's usually one run-queue per priority, but may be less.
  * @credit_limit: the number of credits this scheduler can hold from all jobs
+ * @enqueue_credit_limit: the number of credits that can be enqueued before
+ *                        drm_sched_entity_push_job() blocks
  * @hang_limit: number of times to allow a job to hang before dropping it.
  *		This mechanism is DEPRECATED. Set it to 0.
  * @timeout: timeout value in jiffies for submitted jobs.
@@ -564,6 +585,7 @@ struct drm_sched_init_args {
 	struct workqueue_struct *timeout_wq;
 	u32 num_rqs;
 	u32 credit_limit;
+	u32 enqueue_credit_limit;
 	unsigned int hang_limit;
 	long timeout;
 	atomic_t *score;
@@ -600,7 +622,7 @@ int drm_sched_job_init(struct drm_sched_job *job,
 		       struct drm_sched_entity *entity,
 		       u32 credits, void *owner);
 void drm_sched_job_arm(struct drm_sched_job *job);
-void drm_sched_entity_push_job(struct drm_sched_job *sched_job);
+int drm_sched_entity_push_job(struct drm_sched_job *sched_job);
 int drm_sched_job_add_dependency(struct drm_sched_job *job,
 				 struct dma_fence *fence);
 int drm_sched_job_add_syncobj_dependency(struct drm_sched_job *job,
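
The accounting in this patch can be illustrated outside the kernel. What
follows is only a single-threaded userspace sketch, not kernel code: the
model_* names are hypothetical, C11 atomics stand in for atomic_t, and
model_try_push() returns false at the point where the patched
drm_sched_entity_push_job() would sleep in wait_event_interruptible().
It assumes the same admission check as the diff (count <= limit) and the
same decrement once a job reaches run_job().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the two fields the patch adds. */
struct model_sched {
	uint32_t enqueue_credit_limit;
	atomic_int enqueue_credit_count;
};

/*
 * Non-blocking analogue of the push path.  Returns false where the
 * kernel code would instead sleep in wait_event_interruptible().
 */
static bool model_try_push(struct model_sched *s, uint32_t credits)
{
	/* Same condition as the patch: admit while count <= limit. */
	if ((uint32_t)atomic_load(&s->enqueue_credit_count) >
	    s->enqueue_credit_limit)
		return false;

	atomic_fetch_add(&s->enqueue_credit_count, (int)credits);
	return true;
}

/* Analogue of the atomic_sub() placed after the run_job() call. */
static void model_run_job(struct model_sched *s, uint32_t credits)
{
	atomic_fetch_sub(&s->enqueue_credit_count, (int)credits);
}

/* Walk one scenario and return the highest count observed. */
static int model_demo(void)
{
	struct model_sched s = { .enqueue_credit_limit = 4 };
	bool admitted;
	int peak;

	atomic_init(&s.enqueue_credit_count, 0);

	admitted = model_try_push(&s, 3);	/* 0 <= 4: count becomes 3 */
	assert(admitted);
	admitted = model_try_push(&s, 3);	/* 3 <= 4: count becomes 6 */
	assert(admitted);
	peak = atomic_load(&s.enqueue_credit_count);

	admitted = model_try_push(&s, 3);	/* 6 > 4: kernel would block */
	assert(!admitted);

	model_run_job(&s, 3);			/* count drops to 3 */
	admitted = model_try_push(&s, 3);	/* admitted again */
	assert(admitted);

	return peak;
}
```

Because the check happens before the job's own credits are added, the
count in this model can exceed the limit by up to one job's worth of
credits (6 against a limit of 4 above). A scheduler that leaves
enqueue_credit_limit at 0 and never sets enqueue_credits keeps the count
at 0, so the check always passes.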