drm/amdgpu: skip umc ras error count harvest
Remove the in-recovery state check from amdgpu_ras_get_ecc_info and skip
the UMC RAS error count harvest in amdgpu_ras_log_on_err_counter.

Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com>
Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
Stanley.Yang authored and Alex Deucher committed Dec 7, 2021
1 parent 30c1e39 commit cf63b70
Showing 1 changed file with 10 additions and 5 deletions.
15 changes: 10 additions & 5 deletions drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c
@@ -897,11 +897,6 @@ static void amdgpu_ras_get_ecc_info(struct amdgpu_device *adev, struct ras_err_d
 	struct amdgpu_ras *ras = amdgpu_ras_get_context(adev);
 	int ret = 0;
 
-	/* skip get ecc info during gpu recovery */
-	if (atomic_read(&ras->in_recovery) == 1 &&
-	    adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2))
-		return;
-
 	/*
 	 * choosing right query method according to
 	 * whether smu support query error information
@@ -1752,6 +1747,16 @@ static void amdgpu_ras_log_on_err_counter(struct amdgpu_device *adev)
 		if (info.head.block == AMDGPU_RAS_BLOCK__PCIE_BIF)
 			continue;
 
+		/*
+		 * this is a workaround for aldebaran, skip send msg to
+		 * smu to get ecc_info table due to smu handle get ecc
+		 * info table failed temporarily.
+		 * should be removed until smu fix handle ecc_info table.
+		 */
+		if ((info.head.block == AMDGPU_RAS_BLOCK__UMC) &&
+		    (adev->ip_versions[MP1_HWIP][0] == IP_VERSION(13, 0, 2)))
+			continue;
+
 		amdgpu_ras_query_error_status(adev, &info);
 	}
 }
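
The skip logic added in the second hunk can be sketched outside the kernel as a small predicate. This is only an illustration, not the driver code: the `sketch_*` names and the enum values are hypothetical stand-ins for the `AMDGPU_RAS_BLOCK__*` constants, and `IP_VERSION` is redefined locally on the assumption that it packs major/minor/revision into one integer as the kernel macro does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: same packing as the kernel's IP_VERSION macro; redefined so the sketch is standalone. */
#define IP_VERSION(mj, mn, rv) (((mj) << 16) | ((mn) << 8) | (rv))

/* Hypothetical stand-ins for AMDGPU_RAS_BLOCK__*; the numbering is illustrative only. */
enum sketch_ras_block {
	SKETCH_RAS_BLOCK_UMC,
	SKETCH_RAS_BLOCK_GFX,
	SKETCH_RAS_BLOCK_PCIE_BIF,
};

/*
 * Returns true when the error-counter logging loop would skip this block:
 * PCIE_BIF is always skipped, and UMC is skipped only when the MP1 IP
 * version is 13.0.2 (the aldebaran workaround from the diff).
 */
static bool sketch_skip_block(enum sketch_ras_block block, uint32_t mp1_version)
{
	if (block == SKETCH_RAS_BLOCK_PCIE_BIF)
		return true;
	if (block == SKETCH_RAS_BLOCK_UMC &&
	    mp1_version == IP_VERSION(13, 0, 2))
		return true;
	return false;
}

/* Counts how many blocks in the list would still have their error status queried. */
static size_t sketch_count_queried(const enum sketch_ras_block *blocks,
				   size_t n, uint32_t mp1_version)
{
	size_t queried = 0;

	for (size_t i = 0; i < n; i++) {
		if (sketch_skip_block(blocks[i], mp1_version))
			continue;
		queried++;
	}
	return queried;
}
```

With a block list of UMC, GFX and PCIE_BIF, only GFX would be queried when the MP1 version is 13.0.2, while UMC and GFX would both be queried on any other version. Keeping the condition in one predicate would also make the workaround easy to drop once the SMU handles the ecc_info table request.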
