Skip to content

Commit

Permalink
md/raid10: fix deadlock with unaligned read during resync
Browse files Browse the repository at this point in the history
If the 'bio_split' path in raid10-read is used while
resync/recovery is happening it is possible to deadlock.
Fix this be elevating ->nr_waiting for the duration of both
parts of the split request.

This fixes a bug that has been present since 2.6.22
but has only started manifesting recently for unknown reasons.
It is suitable for and -stable since then.

Reported-by:  Justin Bronder <jsbronder@gentoo.org>
Tested-by:  Justin Bronder <jsbronder@gentoo.org>
Signed-off-by: NeilBrown <neilb@suse.de>
Cc: stable@kernel.org
  • Loading branch information
NeilBrown committed Aug 7, 2010
1 parent 69e51b4 commit 51e9ac7
Showing 1 changed file with 18 additions and 0 deletions.
18 changes: 18 additions & 0 deletions drivers/md/raid10.c
Original file line number Diff line number Diff line change
Expand Up @@ -825,11 +825,29 @@ static int make_request(mddev_t *mddev, struct bio * bio)
*/
bp = bio_split(bio,
chunk_sects - (bio->bi_sector & (chunk_sects - 1)) );

/* Each of these 'make_request' calls will call 'wait_barrier'.
* If the first succeeds but the second blocks due to the resync
* thread raising the barrier, we will deadlock because the
* IO to the underlying device will be queued in generic_make_request
* and will never complete, so will never reduce nr_pending.
* So increment nr_waiting here so no new raise_barriers will
* succeed, and so the second wait_barrier cannot block.
*/
spin_lock_irq(&conf->resync_lock);
conf->nr_waiting++;
spin_unlock_irq(&conf->resync_lock);

if (make_request(mddev, &bp->bio1))
generic_make_request(&bp->bio1);
if (make_request(mddev, &bp->bio2))
generic_make_request(&bp->bio2);

spin_lock_irq(&conf->resync_lock);
conf->nr_waiting--;
wake_up(&conf->wait_barrier);
spin_unlock_irq(&conf->resync_lock);

bio_pair_release(bp);
return 0;
bad_map:
Expand Down

0 comments on commit 51e9ac7

Please sign in to comment.