nvme-pci: Fix AER reset handling
The nvme timeout handler does nothing while the PCI channel is offline,
which is the case while recovering from a PCI error event. Syncing the
controller reset in that state was therefore a bad idea: a command that
times out during the reset is not recovered, which can leave the sync
waiting. This patch instead flushes the reset work in the error_resume
callback, once the channel is back online. This keeps AER handling
serialized and lets the reset recover from timeouts.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=199757
Fixes: cc1d5e7 ("nvme/pci: Sync controller reset for AER slot_reset")
Reported-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Tested-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Keith Busch authored and Christoph Hellwig committed May 25, 2018
1 parent a8e3e0b commit 72cd4cc
Showing 1 changed file with 5 additions and 9 deletions.
14 changes: 5 additions & 9 deletions drivers/nvme/host/pci.c
@@ -2706,19 +2706,15 @@ static pci_ers_result_t nvme_slot_reset(struct pci_dev *pdev)
 
 	dev_info(dev->ctrl.device, "restart after slot reset\n");
 	pci_restore_state(pdev);
-	nvme_reset_ctrl_sync(&dev->ctrl);
-
-	switch (dev->ctrl.state) {
-	case NVME_CTRL_LIVE:
-	case NVME_CTRL_ADMIN_ONLY:
-		return PCI_ERS_RESULT_RECOVERED;
-	default:
-		return PCI_ERS_RESULT_DISCONNECT;
-	}
+	nvme_reset_ctrl(&dev->ctrl);
+	return PCI_ERS_RESULT_RECOVERED;
 }
 
 static void nvme_error_resume(struct pci_dev *pdev)
 {
+	struct nvme_dev *dev = pci_get_drvdata(pdev);
+
+	flush_work(&dev->ctrl.reset_work);
 	pci_cleanup_aer_uncorrect_error_status(pdev);
 }
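
The ordering problem this patch addresses can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every name below (`model_dev`, `model_timeout`, `model_flush_reset`, and so on) is hypothetical and is not a kernel API, and the model is single-threaded, so a wait that would block is reported as `MODEL_WOULD_HANG` rather than actually blocking. It assumes one command gets stuck during the reset and that only the timeout handler can complete it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum model_result { MODEL_DONE, MODEL_WOULD_HANG };

struct model_dev {
	bool channel_online; /* models the PCI channel state */
	bool reset_queued;   /* models the reset work being pending */
	bool cmd_stuck;      /* models a command that times out during reset */
	bool live;           /* models the controller becoming usable */
};

/* Models the timeout handler: it does nothing while the channel is offline. */
static bool model_timeout(struct model_dev *d)
{
	if (!d->channel_online)
		return false;     /* the stuck command stays stuck */
	d->cmd_stuck = false; /* timeout recovery completes the command */
	return true;
}

/* Models waiting for the reset work to finish. */
static enum model_result model_flush_reset(struct model_dev *d)
{
	if (!d->reset_queued)
		return MODEL_DONE;       /* nothing to wait for */
	if (d->cmd_stuck && !model_timeout(d))
		return MODEL_WOULD_HANG; /* a real waiter would block here */
	assert(!d->cmd_stuck);
	d->reset_queued = false;
	d->live = true;
	return MODEL_DONE;
}

/* Old ordering: queue the reset and wait for it while still offline. */
static enum model_result model_slot_reset_sync(struct model_dev *d)
{
	d->reset_queued = true;
	return model_flush_reset(d);
}

/* New ordering: the slot reset step only queues the reset... */
static enum model_result model_slot_reset(struct model_dev *d)
{
	d->reset_queued = true;
	return MODEL_DONE;
}

/*
 * ...and the resume step waits for it. The model marks the channel online
 * here for simplicity; in the kernel that happens before the resume
 * callback is invoked.
 */
static enum model_result model_error_resume(struct model_dev *d)
{
	d->channel_online = true;
	return model_flush_reset(d);
}
```

In this model, `model_slot_reset_sync` on an offline device with a stuck command reports `MODEL_WOULD_HANG`, while `model_slot_reset` followed by `model_error_resume` completes and leaves the device live.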
