drm/etnaviv: bring back progress check in job timeout handler
When the hangcheck handler was replaced by the DRM scheduler timeout
handling, we dropped the forward progress check, because such a check
might allow clients to hog the GPU for a long time with a big job.

It turns out that even reasonably well-behaved clients like the
Armada Xorg driver occasionally trip over the 500ms timeout. Bring
back the forward progress check to get rid of the userspace regression.

We would still like to fix userspace to submit smaller batches
if possible, but that is for another day.

Cc: <stable@vger.kernel.org>
Fixes: 6d7a20c (drm/etnaviv: replace hangcheck with scheduler timeout)
Reported-by: Russell King <linux@armlinux.org.uk>
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Eric Anholt <eric@anholt.net>
Lucas Stach committed Jul 5, 2018
1 parent bf6ba3a commit 2c83a72
Showing 2 changed files with 27 additions and 0 deletions.
3 changes: 3 additions & 0 deletions drivers/gpu/drm/etnaviv/etnaviv_gpu.h
@@ -131,6 +131,9 @@ struct etnaviv_gpu {
	struct work_struct sync_point_work;
	int sync_point_event;

	/* hang detection */
	u32 hangcheck_dma_addr;

	void __iomem *mmio;
	int irq;

24 changes: 24 additions & 0 deletions drivers/gpu/drm/etnaviv/etnaviv_sched.c
@@ -10,6 +10,7 @@
#include "etnaviv_gem.h"
#include "etnaviv_gpu.h"
#include "etnaviv_sched.h"
#include "state.xml.h"

static int etnaviv_job_hang_limit = 0;
module_param_named(job_hang_limit, etnaviv_job_hang_limit, int, 0444);
@@ -85,6 +86,29 @@ static void etnaviv_sched_timedout_job(struct drm_sched_job *sched_job)
{
	struct etnaviv_gem_submit *submit = to_etnaviv_submit(sched_job);
	struct etnaviv_gpu *gpu = submit->gpu;
	u32 dma_addr;
	int change;

	/*
	 * If the GPU managed to complete this job's fence, the timeout is
	 * spurious. Bail out.
	 */
	if (fence_completed(gpu, submit->out_fence->seqno))
		return;

	/*
	 * If the GPU is still making forward progress on the front-end (which
	 * should never loop) we shift out the timeout to give it a chance to
	 * finish the job.
	 */
	dma_addr = gpu_read(gpu, VIVS_FE_DMA_ADDRESS);
	change = dma_addr - gpu->hangcheck_dma_addr;
	if (change < 0 || change > 16) {
		gpu->hangcheck_dma_addr = dma_addr;
		schedule_delayed_work(&sched_job->work_tdr,
				      sched_job->sched->timeout);
		return;
	}

	/* block scheduler */
	kthread_park(gpu->sched.thread);
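
The decision logic in the hunk above can be tried outside the kernel. The following is a minimal userspace sketch, not the driver code: `struct hangcheck_state`, `fe_made_progress()`, `on_timeout()` and the `timeout_action` enum are hypothetical names introduced here for illustration, and the fence check is reduced to a boolean argument. Only the signed-difference test (`change < 0 || change > 16`) and the order of the checks are taken from the patch. The patch does not say why 16 bytes is the threshold; the sketch simply treats a movement of 0 to 16 bytes as "no progress".

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the field the patch adds to struct etnaviv_gpu. */
struct hangcheck_state {
	uint32_t last_dma_addr;	/* mirrors gpu->hangcheck_dma_addr */
};

/* Hypothetical labels for the three outcomes of the timeout handler. */
enum timeout_action {
	TIMEOUT_SPURIOUS,	/* fence already signalled: nothing to do */
	TIMEOUT_EXTEND,		/* front-end moved: re-arm the timeout */
	TIMEOUT_RECOVER,	/* no progress: go on to hang recovery */
};

/*
 * Return true if the front-end DMA address moved backwards or by more
 * than 16 bytes since the last recorded value, and record the new
 * address. Return false (state unchanged) otherwise.
 */
static bool fe_made_progress(struct hangcheck_state *st, uint32_t dma_addr)
{
	/* Unsigned subtraction wraps; the cast yields a signed delta. */
	int32_t change = (int32_t)(dma_addr - st->last_dma_addr);

	if (change < 0 || change > 16) {
		st->last_dma_addr = dma_addr;
		return true;
	}
	return false;
}

/* Same order of checks as the patched handler: fence first, then progress. */
static enum timeout_action on_timeout(struct hangcheck_state *st,
				      bool fence_done, uint32_t dma_addr)
{
	if (fence_done)
		return TIMEOUT_SPURIOUS;
	if (fe_made_progress(st, dma_addr))
		return TIMEOUT_EXTEND;
	return TIMEOUT_RECOVER;
}
```

Note that a job is only declared hung after a timeout period in which the sampled address stayed within that 16-byte window, so a large but advancing batch keeps getting its timeout extended, which is the trade-off the commit message describes.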