drm/amd/sriov: extend NV_MAILBOX_POLL_MSG_TIMEDOUT
On MI300/MI308 UBB products, during a mode1 reset one GPU has to wait
for all 8 GPUs to finish their mode1 reset before it can re-init. As
observed, the GPU that triggered the reset sometimes needs to wait 15s
for all GPUs to finish.

If polling for the message times out, the guest driver sends the reset
message again, which may disrupt the subsequent re-init sequence on the
other GPUs.

So extend the timeout to cover the maximum time needed to recover.

Signed-off-by: Victor Zhao <Victor.Zhao@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Victor Zhao authored and Alex Deucher committed Aug 13, 2024
1 parent bbec7ce commit ef6c2cb
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion drivers/gpu/drm/amd/amdgpu/mxgpu_nv.h
@@ -25,7 +25,7 @@
 #define __MXGPU_NV_H__
 
 #define NV_MAILBOX_POLL_ACK_TIMEDOUT 500
-#define NV_MAILBOX_POLL_MSG_TIMEDOUT 6000
+#define NV_MAILBOX_POLL_MSG_TIMEDOUT 15000
 #define NV_MAILBOX_POLL_FLR_TIMEDOUT 10000
 #define NV_MAILBOX_POLL_MSG_REP_MAX 11
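
To illustrate why the larger value matters, here is a minimal userspace sketch of a deadline-based mailbox poll. It is not the driver's code: `fake_mailbox`, `mailbox_has_msg`, `poll_msg`, and `POLL_INTERVAL_MS` are hypothetical names, the clock is simulated rather than read from the kernel, and the 14000 ms reply time used in the note below is an assumed figure chosen to sit near the 15s wait described in the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical polling period; the real driver's interval may differ. */
#define POLL_INTERVAL_MS 10

/* Simulated mailbox: the host's reply becomes visible at ready_at_ms. */
struct fake_mailbox {
    uint64_t now_ms;      /* simulated current time */
    uint64_t ready_at_ms; /* simulated time at which the reply arrives */
};

/* Returns true once the simulated reply is available. */
static bool mailbox_has_msg(const struct fake_mailbox *mb)
{
    return mb->now_ms >= mb->ready_at_ms;
}

/*
 * Poll until the reply arrives or timeout_ms elapses.
 * Returns 0 on success, -1 on timeout. In the scenario from the
 * commit message, a timeout is what makes the guest resend the
 * reset request.
 */
static int poll_msg(struct fake_mailbox *mb, uint64_t timeout_ms)
{
    uint64_t deadline = mb->now_ms + timeout_ms;

    do {
        if (mailbox_has_msg(mb))
            return 0;
        /* Advance the simulated clock; stands in for sleeping. */
        mb->now_ms += POLL_INTERVAL_MS;
    } while (mb->now_ms < deadline);

    return -1;
}
```

With a reply that arrives after an assumed 14000 ms, `poll_msg(&mb, 6000)` returns -1 (the case that would trigger a resend), while `poll_msg(&mb, 15000)` returns 0.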
