Message ID | 20230531075854.703-1-johan+linaro@kernel.org |
---|---|
State | New |
Series | drm/msm/a6xx: fix uninitialised lock in init error path |
Hi,

On Wed, May 31, 2023 at 1:00 AM Johan Hovold <johan+linaro@kernel.org> wrote:
>
> A recent commit started taking the GMU lock in the GPU destroy path,
> which on GPU initialisation failure is called before the GMU and its
> lock have been initialised.
>
> Make sure that the GMU has been initialised before taking the lock in
> a6xx_destroy() and drop the now redundant check from a6xx_gmu_remove().
>
> Fixes: 4cd15a3e8b36 ("drm/msm/a6xx: Make GPU destroy a bit safer")
> Cc: stable@vger.kernel.org # 6.3
> Cc: Douglas Anderson <dianders@chromium.org>
> Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
> ---
>  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 3 ---
>  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 9 ++++++---
>  2 files changed, 6 insertions(+), 6 deletions(-)

I think Dmitry already posted a patch 1.5 months ago to fix this.

https://lore.kernel.org/r/20230410165908.3094626-1-dmitry.baryshkov@linaro.org

Can you confirm that works for you?

-Doug
On Wed, May 31, 2023 at 07:22:49AM -0700, Doug Anderson wrote:
> Hi,
>
> On Wed, May 31, 2023 at 1:00 AM Johan Hovold <johan+linaro@kernel.org> wrote:
> >
> > A recent commit started taking the GMU lock in the GPU destroy path,
> > which on GPU initialisation failure is called before the GMU and its
> > lock have been initialised.
> >
> > Make sure that the GMU has been initialised before taking the lock in
> > a6xx_destroy() and drop the now redundant check from a6xx_gmu_remove().
> >
> > Fixes: 4cd15a3e8b36 ("drm/msm/a6xx: Make GPU destroy a bit safer")
> > Cc: stable@vger.kernel.org # 6.3
> > Cc: Douglas Anderson <dianders@chromium.org>
> > Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
> > ---
> >  drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 3 ---
> >  drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 9 ++++++---
> >  2 files changed, 6 insertions(+), 6 deletions(-)
>
> I think Dmitry already posted a patch 1.5 months ago to fix this.
>
> https://lore.kernel.org/r/20230410165908.3094626-1-dmitry.baryshkov@linaro.org

Bah, I checked if Bjorn had hit this with his recent A690 v3 series and
posted a fix, but did not look further than that.

> Can you confirm that works for you?

That looks like it would work too, but I think I prefer my version, which
keeps the initialisation of the GMU struct in a6xx_gmu_init().

Dmitry or Rob, could you see to it that either version gets merged soon so
that we don't end up with even more people having to debug and fix the
same issue?

Johan
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index e16b4b3f8535..105ccf17041f 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1472,9 +1472,6 @@ void a6xx_gmu_remove(struct a6xx_gpu *a6xx_gpu)
 	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
 	struct platform_device *pdev = to_platform_device(gmu->dev);
 
-	if (!gmu->initialized)
-		return;
-
 	pm_runtime_force_suspend(gmu->dev);
 
 	/*
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
index 9fb214f150dd..ee47b95a0205 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c
@@ -1684,6 +1684,7 @@ static void a6xx_destroy(struct msm_gpu *gpu)
 {
 	struct adreno_gpu *adreno_gpu = to_adreno_gpu(gpu);
 	struct a6xx_gpu *a6xx_gpu = to_a6xx_gpu(adreno_gpu);
+	struct a6xx_gmu *gmu = &a6xx_gpu->gmu;
 
 	if (a6xx_gpu->sqe_bo) {
 		msm_gem_unpin_iova(a6xx_gpu->sqe_bo, gpu->aspace);
@@ -1697,9 +1698,11 @@ static void a6xx_destroy(struct msm_gpu *gpu)
 
 	a6xx_llc_slices_destroy(a6xx_gpu);
 
-	mutex_lock(&a6xx_gpu->gmu.lock);
-	a6xx_gmu_remove(a6xx_gpu);
-	mutex_unlock(&a6xx_gpu->gmu.lock);
+	if (gmu->initialized) {
+		mutex_lock(&gmu->lock);
+		a6xx_gmu_remove(a6xx_gpu);
+		mutex_unlock(&gmu->lock);
+	}
 
 	adreno_gpu_cleanup(adreno_gpu);
A recent commit started taking the GMU lock in the GPU destroy path,
which on GPU initialisation failure is called before the GMU and its
lock have been initialised.

Make sure that the GMU has been initialised before taking the lock in
a6xx_destroy() and drop the now redundant check from a6xx_gmu_remove().

Fixes: 4cd15a3e8b36 ("drm/msm/a6xx: Make GPU destroy a bit safer")
Cc: stable@vger.kernel.org # 6.3
Cc: Douglas Anderson <dianders@chromium.org>
Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
---
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 3 ---
 drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 9 ++++++---
 2 files changed, 6 insertions(+), 6 deletions(-)
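
For readers who want to try the pattern outside the kernel, the following
is a rough userspace sketch of the guard described in the commit message
above. It is not the driver code: struct demo_gmu, struct demo_gpu and the
demo_*() functions are hypothetical stand-ins, and a pthread mutex is
assumed in place of the kernel's struct mutex. The destroy path looks at
the initialized flag first, so it never touches a lock that the init path
did not get around to setting up.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the GMU state; not the kernel's struct a6xx_gmu. */
struct demo_gmu {
	pthread_mutex_t lock;	/* only valid once demo_gmu_init() succeeded */
	bool initialized;
	int removed;		/* counts teardown calls, for illustration only */
};

/* Hypothetical stand-in for the GPU object that embeds the GMU. */
struct demo_gpu {
	struct demo_gmu gmu;
};

/* The lock and the flag are set up together, in one place. */
static int demo_gmu_init(struct demo_gpu *gpu)
{
	struct demo_gmu *gmu = &gpu->gmu;

	if (pthread_mutex_init(&gmu->lock, NULL))
		return -1;

	gmu->initialized = true;
	return 0;
}

/* Teardown proper; the caller has checked the flag and holds the lock. */
static void demo_gmu_remove(struct demo_gpu *gpu)
{
	gpu->gmu.removed++;
	gpu->gmu.initialized = false;
}

/*
 * Destroy path. It may run after a failed init, in which case the lock
 * was never initialised, so the flag is tested before the lock is taken.
 */
static void demo_destroy(struct demo_gpu *gpu)
{
	struct demo_gmu *gmu = &gpu->gmu;

	if (gmu->initialized) {
		pthread_mutex_lock(&gmu->lock);
		demo_gmu_remove(gpu);
		pthread_mutex_unlock(&gmu->lock);
		pthread_mutex_destroy(&gmu->lock);
	}
}
```

Calling demo_destroy() on a zero-initialised struct demo_gpu (the "init
failed early" case) skips the locked section entirely, while calling it
after a successful demo_gmu_init() runs the teardown exactly once.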