Skip to content

Commit

Permalink
drm/amdgpu: Reset RAS error count and status regs
Browse files Browse the repository at this point in the history
Reset the RAS error count and error status registers after
reading to prevent over reporting error counts on Aldebaran.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Reviewed-By: John Clements <John.Clements@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
  • Loading branch information
Mukul Joshi authored and Alex Deucher committed Apr 21, 2021
1 parent 5f41741 commit 1f0d8e3
Showing 1 changed file with 6 additions and 0 deletions.
6 changes: 6 additions & 0 deletions drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
Original file line number Diff line number Diff line change
Expand Up @@ -501,6 +501,12 @@ static ssize_t amdgpu_ras_sysfs_read(struct device *dev,
if (amdgpu_ras_query_error_status(obj->adev, &info))
return -EINVAL;


if (obj->adev->asic_type == CHIP_ALDEBARAN) {
if (amdgpu_ras_reset_error_status(obj->adev, info.head.block))
DRM_WARN("Failed to reset error counter and error status");
}

return sysfs_emit(buf, "%s: %lu\n%s: %lu\n", "ue", info.ue_count,
"ce", info.ce_count);
}
Expand Down

0 comments on commit 1f0d8e3

Please sign in to comment.