drm/v3d: Don't run jobs that have errors flagged in its fence
The V3D driver still relies on `drm_sched_increase_karma()` and
`drm_sched_resubmit_jobs()` for resubmissions when a timeout occurs.
The function `drm_sched_increase_karma()` marks the job as guilty, while
`drm_sched_resubmit_jobs()` sets an error (-ECANCELED) in the DMA fence of
that guilty job.

Because of this, we must check whether the job’s DMA fence has been
flagged with an error before executing the job. Otherwise, the same guilty
job may be resubmitted indefinitely, causing repeated GPU resets.

This patch adds a check for an error on the job's fence to prevent running
a guilty job that was previously flagged when the GPU timed out.
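The guard described above can be modeled outside the kernel with a minimal userspace sketch. Every `mock_*` name here is a hypothetical stand-in invented for illustration, not part of the DRM scheduler or V3D API. The only behavior taken from the commit message is that a guilty job's fence carries -ECANCELED and that the run callback returns NULL instead of starting such a job.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a DMA fence: only the error field is modeled. */
struct mock_fence {
	int error;
};

/* Hypothetical stand-in for a scheduler job and its "finished" fence. */
struct mock_job {
	struct mock_fence finished;
	int times_run;	/* how often the job actually reached the "hardware" */
};

/* Models the effect the commit message attributes to resubmission:
 * the guilty job's fence is flagged with -ECANCELED. */
static void mock_flag_guilty(struct mock_job *job)
{
	job->finished.error = -ECANCELED;
}

/* Models a run callback with the new guard: a job whose fence already
 * carries an error is skipped and NULL is returned instead of a fence. */
static struct mock_fence *mock_job_run(struct mock_job *job,
				       struct mock_fence *hw_fence)
{
	if (job->finished.error)
		return NULL;

	job->times_run++;
	return hw_fence;
}

/* Walks through one timeout cycle: run, flag as guilty, resubmit. */
static void mock_example(void)
{
	struct mock_fence hw = { 0 };
	struct mock_job job = { { 0 }, 0 };

	assert(mock_job_run(&job, &hw) == &hw);	/* first submission runs */
	mock_flag_guilty(&job);			/* timeout marks it guilty */
	assert(mock_job_run(&job, &hw) == NULL);	/* resubmission is skipped */
	assert(job.times_run == 1);
}
```

Without the early return in `mock_job_run()`, the second call would run the job again, which is the repeated-reset loop the patch is meant to break.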

Note that the CPU and CACHE_CLEAN queues do not require this check, as
their jobs are executed synchronously once the DRM scheduler starts them.

Cc: stable@vger.kernel.org
Fixes: d223f98 ("drm/v3d: Add support for compute shader dispatch.")
Fixes: 1584f16 ("drm/v3d: Add support for submitting jobs to the TFU.")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Maíra Canal <mcanal@igalia.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20250313-v3d-gpu-reset-fixes-v4-1-c1e780d8e096@igalia.com
Maíra Canal committed Mar 13, 2025

1 parent a952f1a commit 80cbee8
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion drivers/gpu/drm/v3d/v3d_sched.c
@@ -327,11 +327,15 @@ v3d_tfu_job_run(struct drm_sched_job *sched_job)
 	struct drm_device *dev = &v3d->drm;
 	struct dma_fence *fence;

+	if (unlikely(job->base.base.s_fence->finished.error))
+		return NULL;
+
+	v3d->tfu_job = job;
+
 	fence = v3d_fence_create(v3d, V3D_TFU);
 	if (IS_ERR(fence))
 		return NULL;

-	v3d->tfu_job = job;
 	if (job->base.irq_fence)
 		dma_fence_put(job->base.irq_fence);
 	job->base.irq_fence = dma_fence_get(fence);
@@ -369,6 +373,9 @@ v3d_csd_job_run(struct drm_sched_job *sched_job)
 	struct dma_fence *fence;
 	int i, csd_cfg0_reg;

+	if (unlikely(job->base.base.s_fence->finished.error))
+		return NULL;
+
 	v3d->csd_job = job;

 	v3d_invalidate_caches(v3d);
