Skip to content

Commit

Permalink
nvme: fix possible io failures when removing multipathed ns
Browse files Browse the repository at this point in the history
When a shared namespace is removed, we call blk_cleanup_queue()
when the device can still be accessed as the current path and this can
result in submission to a dying queue. Hence, direct_make_request()
called by our mpath device may fail (propagating the failure to userspace).
Instead, we want to failover this I/O to a different path if one exists.
Thus, before we cleanup the request queue, we make sure that the device is
cleared from the current path nor it can be selected again as such.

Fix this by:
- clear the ns from the head->list and synchronize rcu to make sure there is
  no concurrent path search that restores it as the current path
- clear the mpath current path in order to trigger a subsequent path search
  and sync srcu to wait for any ongoing request submissions
- safely continue to namespace removal and blk_cleanup_queue

Signed-off-by: Anton Eidelman <anton@lightbitslabs.com>
Signed-off-by: Sagi Grimberg <sagi@grimberg.me>
Signed-off-by: Christoph Hellwig <hch@lst.de>
  • Loading branch information
Anton Eidelman authored and Christoph Hellwig committed Jun 21, 2019
1 parent 4bea364 commit 2181e45
Showing 1 changed file with 8 additions and 6 deletions.
14 changes: 8 additions & 6 deletions drivers/nvme/host/core.c
Original file line number Diff line number Diff line change
Expand Up @@ -3344,23 +3344,25 @@ static void nvme_ns_remove(struct nvme_ns *ns)
return;

nvme_fault_inject_fini(ns);

mutex_lock(&ns->ctrl->subsys->lock);
list_del_rcu(&ns->siblings);
mutex_unlock(&ns->ctrl->subsys->lock);
synchronize_rcu(); /* guarantee not available in head->list */
nvme_mpath_clear_current_path(ns);
synchronize_srcu(&ns->head->srcu); /* wait for concurrent submissions */

if (ns->disk && ns->disk->flags & GENHD_FL_UP) {
del_gendisk(ns->disk);
blk_cleanup_queue(ns->queue);
if (blk_get_integrity(ns->disk))
blk_integrity_unregister(ns->disk);
}

mutex_lock(&ns->ctrl->subsys->lock);
list_del_rcu(&ns->siblings);
nvme_mpath_clear_current_path(ns);
mutex_unlock(&ns->ctrl->subsys->lock);

down_write(&ns->ctrl->namespaces_rwsem);
list_del_init(&ns->list);
up_write(&ns->ctrl->namespaces_rwsem);

synchronize_srcu(&ns->head->srcu);
nvme_mpath_check_last_path(ns);
nvme_put_ns(ns);
}
Expand Down

0 comments on commit 2181e45

Please sign in to comment.