selftests/eeh: Bump EEH wait time to 60s
Some newer cards supported by aacraid can take up to 40s to recover
after an EEH event. This causes spurious failures in the basic EEH
self-test since the current maximum timeout is only 30s.

Fix the immediate issue by bumping the timeout to a default of 60s,
and allow the wait time to be specified via an environment variable
(EEH_MAX_WAIT).

Reported-by: Steve Best <sbest@redhat.com>
Suggested-by: Douglas Miller <dougmill@us.ibm.com>
Signed-off-by: Oliver O'Halloran <oohall@gmail.com>
Signed-off-by: Michael Ellerman <mpe@ellerman.id.au>
Link: https://lore.kernel.org/r/20200122031125.25991-1-oohall@gmail.com
Oliver O'Halloran authored and Michael Ellerman committed Jan 25, 2020
1 parent f1dbc1c commit 414f504
Showing 1 changed file with 7 additions and 3 deletions.
10 changes: 7 additions & 3 deletions tools/testing/selftests/powerpc/eeh/eeh-functions.sh
@@ -53,9 +53,13 @@ eeh_one_dev() {
 # is a no-op.
 echo $dev >/sys/kernel/debug/powerpc/eeh_dev_check

-# Enforce a 30s timeout for recovery. Even the IPR, which is infamously
-# slow to reset, should recover within 30s.
-max_wait=30
+# Default to a 60s timeout when waiting for a device to recover. This
+# is an arbitrary default which can be overridden by setting the
+# EEH_MAX_WAIT environmental variable when required.
+
+# The current record holder for longest recovery time is:
+# "Adaptec Series 8 12G SAS/PCIe 3" at 39 seconds
+max_wait=${EEH_MAX_WAIT:=60}

 for i in `seq 0 ${max_wait}` ; do
 if pe_ok $dev ; then
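
The default-with-override pattern in this hunk can be tried outside the selftest with a minimal sketch like the one below. It is not the code from eeh-functions.sh: `device_ready` and `wait_for_recovery` are hypothetical names, and a marker file stands in for the real `pe_ok` check, which this excerpt does not show. The `${EEH_MAX_WAIT:=60}` expansion is the same one the patch uses: it yields the caller's value when the variable is set and non-empty, and otherwise assigns and yields 60.

```shell
#!/bin/sh
# Hypothetical stand-in for pe_ok: the "device" counts as recovered
# once the given marker file exists.
device_ready() {
    [ -e "$1" ]
}

# Poll once per second until the marker appears or the timeout expires.
# EEH_MAX_WAIT overrides the 60s default when set and non-empty.
wait_for_recovery() {
    marker=$1
    max_wait=${EEH_MAX_WAIT:=60}
    i=0
    while [ "$i" -le "$max_wait" ]; do
        if device_ready "$marker"; then
            echo "recovered after ${i}s"
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    echo "timed out after ${max_wait}s"
    return 1
}
```

With these functions sourced, running `EEH_MAX_WAIT=120; wait_for_recovery /tmp/marker` would poll for up to two minutes, while leaving the variable unset falls back to 60 seconds. Note that `:=` also writes the default back into `EEH_MAX_WAIT` itself, so later reads in the same shell see 60.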