Skip to content

Commit

Permalink
nvme-pci: fix rapid add remove sequence
Browse files Browse the repository at this point in the history
A surprise removal may fail to tear down request queues if it is racing
with the initial asynchronous probe. If that happens, the remove path
won't see the queue resources to tear down, and the controller reset
path may create a new request queue on a removed device, but will not
be able to make forward progress, deadlocking the pci removal.

Protect setting up non-blocking resources from a shutdown by holding the
same mutex, and transition to the CONNECTING state after these resources
are initialized so the probe path may see the dead controller state
before dispatching new IO.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=202081
Reported-by: Alex Gagniuc <Alex_Gagniuc@Dellteam.com>
Signed-off-by: Keith Busch <keith.busch@intel.com>
Tested-by: Alex Gagniuc <mr.nuke.me@gmail.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
  • Loading branch information
Keith Busch authored and Christoph Hellwig committed Feb 6, 2019
1 parent e7ad43c commit 5c959d7
Showing 1 changed file with 12 additions and 10 deletions.
22 changes: 12 additions & 10 deletions drivers/nvme/host/pci.c
Original file line number Diff line number Diff line change
Expand Up @@ -2557,16 +2557,7 @@ static void nvme_reset_work(struct work_struct *work)
if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
nvme_dev_disable(dev, false);

/*
* Introduce CONNECTING state from nvme-fc/rdma transports to mark the
* initializing procedure here.
*/
if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING)) {
dev_warn(dev->ctrl.device,
"failed to mark controller CONNECTING\n");
goto out;
}

mutex_lock(&dev->shutdown_lock);
result = nvme_pci_enable(dev);
if (result)
goto out;
Expand All @@ -2585,6 +2576,17 @@ static void nvme_reset_work(struct work_struct *work)
*/
dev->ctrl.max_hw_sectors = NVME_MAX_KB_SZ << 1;
dev->ctrl.max_segments = NVME_MAX_SEGS;
mutex_unlock(&dev->shutdown_lock);

/*
* Introduce CONNECTING state from nvme-fc/rdma transports to mark the
* initializing procedure here.
*/
if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_CONNECTING)) {
dev_warn(dev->ctrl.device,
"failed to mark controller CONNECTING\n");
goto out;
}

result = nvme_init_identify(&dev->ctrl);
if (result)
Expand Down

0 comments on commit 5c959d7

Please sign in to comment.