habanalabs: rate limit error msg on waiting for CS
If a user submits a CS and the submission fails, but the user doesn't
check the return value and instead uses the error return value as a valid
sequence number of a CS and asks to wait on it, the driver will print an
error and return an error code for that wait.
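
To illustrate why such a wait fails, here is a minimal sketch in plain C. It assumes the failed submission yields a negative errno-style value that the caller stores unchecked in a 64-bit sequence variable. `submit_cs_stub()`, `seq_from_unchecked_submit()` and `seq_is_waitable()` are hypothetical names used only for this illustration and are not part of the driver or its uAPI. The last one only models the `seq >= ctx->cs_sequence` check visible in hl_ctx_get_fence().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a failed CS submission (assumption:
 * the failure surfaces as a negative errno-style value). */
static int64_t submit_cs_stub(void)
{
	return -22; /* -EINVAL */
}

/* Models the range check shown in hl_ctx_get_fence(): a sequence
 * number at or beyond the context's current counter is rejected. */
static bool seq_is_waitable(uint64_t seq, uint64_t cs_sequence)
{
	return seq < cs_sequence;
}

/* Buggy caller: reuses the return value as a sequence number
 * without checking it first. */
static uint64_t seq_from_unchecked_submit(void)
{
	return (uint64_t)submit_cs_stub();
}
```

Converted to an unsigned 64-bit value, the negative error code becomes a huge sequence number, far beyond anything the context has issued, so the range check rejects it on every wait attempt.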

The real problem happens if the user now ignores the error of the wait and
tries to wait again and again. This can lead to a flood of error messages
from the driver and even a soft lockup event.
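
As a rough illustration of what rate limiting buys here, the following is a simplified userspace model of a fixed-window limiter. It is not the kernel's implementation. The `rl_*` names and `simulate_retries()` are hypothetical, and the 5000 ms / 10 message values used in the note after the code are only meant to resemble the kernel's default ratelimit interval and burst.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fixed-window rate limiter state (illustration only). */
struct rl_state {
	uint64_t interval_ms;   /* window length */
	unsigned int burst;     /* messages allowed per window */
	uint64_t window_start;  /* start time of the current window */
	unsigned int printed;   /* messages emitted in this window */
	unsigned int missed;    /* messages suppressed in this window */
};

static void rl_init(struct rl_state *rl, uint64_t interval_ms,
		    unsigned int burst)
{
	assert(burst > 0);
	rl->interval_ms = interval_ms;
	rl->burst = burst;
	rl->window_start = 0;
	rl->printed = 0;
	rl->missed = 0;
}

/* Returns true if a message may be emitted at time now_ms. */
static bool rl_allow(struct rl_state *rl, uint64_t now_ms)
{
	if (now_ms - rl->window_start >= rl->interval_ms) {
		/* A new window begins: reset the counters. */
		rl->window_start = now_ms;
		rl->printed = 0;
		rl->missed = 0;
	}
	if (rl->printed < rl->burst) {
		rl->printed++;
		return true;
	}
	rl->missed++;
	return false;
}

/* Simulates a caller retrying a failing wait every step_ms and
 * returns how many error messages would actually be logged. */
static unsigned int simulate_retries(struct rl_state *rl,
				     unsigned int calls, uint64_t step_ms)
{
	unsigned int logged = 0;

	for (unsigned int i = 0; i < calls; i++)
		if (rl_allow(rl, (uint64_t)i * step_ms))
			logged++;
	return logged;
}
```

With an interval of 5000 ms and a burst of 10, a caller that retries 1000 times at 1 ms intervals gets 10 messages logged and 990 suppressed in this model, while a caller that fails only once per second is never suppressed.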

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Tomer Tayar <ttayar@habana.ai>
Oded Gabbay committed Dec 14, 2019
1 parent 1698174 commit 018e0e3
Showing 2 changed files with 4 additions and 3 deletions.
5 changes: 3 additions & 2 deletions drivers/misc/habanalabs/command_submission.c
@@ -824,8 +824,9 @@ int hl_cs_wait_ioctl(struct hl_fpriv *hpriv, void *data)
 	memset(args, 0, sizeof(*args));

 	if (rc < 0) {
-		dev_err(hdev->dev, "Error %ld on waiting for CS handle %llu\n",
-			rc, seq);
+		dev_err_ratelimited(hdev->dev,
+				"Error %ld on waiting for CS handle %llu\n",
+				rc, seq);
 		if (rc == -ERESTARTSYS) {
 			args->out.status = HL_WAIT_CS_STATUS_INTERRUPTED;
 			rc = -EINTR;
2 changes: 1 addition & 1 deletion drivers/misc/habanalabs/context.c
@@ -176,7 +176,7 @@ struct dma_fence *hl_ctx_get_fence(struct hl_ctx *ctx, u64 seq)
 	spin_lock(&ctx->cs_lock);

 	if (seq >= ctx->cs_sequence) {
-		dev_notice(hdev->dev,
+		dev_notice_ratelimited(hdev->dev,
 			"Can't wait on seq %llu because current CS is at seq %llu\n",
 			seq, ctx->cs_sequence);
 		spin_unlock(&ctx->cs_lock);
