yaml
---
r: 345691
b: refs/heads/master
c: 7fb907c
h: refs/heads/master
i:
  345689: 3f381e2
  345687: f858a56
v: v3
Lars Ellenberg authored and Jens Axboe committed Oct 30, 2012
1 parent 44ad0bd commit 08dd3b4
Showing 2 changed files with 37 additions and 1 deletion.
2 changes: 1 addition & 1 deletion [refs]
@@ -1,2 +1,2 @@
---
-refs/heads/master: dbd0820c6f7b7db9a97d63ea379fc174a63ddbca
+refs/heads/master: 7fb907c15fb8d0e10e72c8566a13f6defab3f484
36 changes: 36 additions & 0 deletions trunk/drivers/block/drbd/drbd_worker.c
@@ -227,6 +227,42 @@ void drbd_endio_pri(struct bio *bio, int error)
 		error = -EIO;
 	}
 
+	/* If this request was aborted locally before,
+	 * but now was completed "successfully",
+	 * chances are that this caused arbitrary data corruption.
+	 *
+	 * "aborting" requests, or force-detaching the disk, is intended for
+	 * completely blocked/hung local backing devices which no longer
+	 * complete requests at all, not even with error completions. In this
+	 * situation, usually a hard-reset and failover is the only way out.
+	 *
+	 * By "aborting", basically faking a local error-completion,
+	 * we allow for a more graceful switchover by cleanly migrating services.
+	 * Still the affected node has to be rebooted "soon".
+	 *
+	 * By completing these requests, we allow the upper layers to re-use
+	 * the associated data pages.
+	 *
+	 * If later the local backing device "recovers", and now DMAs some data
+	 * from disk into the original request pages, in the best case it will
+	 * just put random data into unused pages; but typically it will corrupt
+	 * completely unrelated data in the meantime, causing all sorts of damage.
+	 *
+	 * Which means delayed successful completion,
+	 * especially for READ requests,
+	 * is a reason to panic().
+	 *
+	 * We assume that a delayed *error* completion is OK,
+	 * though we still will complain noisily about it.
+	 */
+	if (unlikely(req->rq_state & RQ_LOCAL_ABORTED)) {
+		if (__ratelimit(&drbd_ratelimit_state))
+			dev_emerg(DEV, "delayed completion of aborted local request; disk-timeout may be too aggressive\n");
+
+		if (!error)
+			panic("possible random memory corruption caused by delayed completion of aborted local request\n");
+	}
+
 	/* to avoid recursion in __req_mod */
 	if (unlikely(error)) {
 		what = (bio_data_dir(bio) == WRITE)
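
The decision made by the added hunk can be summarized as a three-way classification of a local completion. The following is only a userspace sketch of that logic, not the kernel code: every name prefixed with `sketch_` or `SKETCH_` is hypothetical, the flag value is made up for illustration, and where the kernel logs through `dev_emerg()` and calls `panic()`, the sketch merely returns an enum value.

```c
/* Hypothetical stand-in for the real RQ_LOCAL_ABORTED bit in req->rq_state;
 * the value here is an assumption chosen only for this sketch. */
#define SKETCH_RQ_LOCAL_ABORTED (1u << 0)

enum sketch_action {
	SKETCH_PROCEED,		/* request was never aborted: normal completion path */
	SKETCH_WARN,		/* aborted, late *error* completion: complain, then carry on */
	SKETCH_WARN_AND_PANIC	/* aborted, late *successful* completion: pages may be corrupted */
};

/*
 * Classify a completion the way the hunk does:
 *  - the aborted flag is checked first;
 *  - any late completion of an aborted request is worth a warning;
 *  - only a late completion with error == 0 is treated as fatal.
 */
static enum sketch_action sketch_classify_completion(unsigned int rq_state, int error)
{
	if (!(rq_state & SKETCH_RQ_LOCAL_ABORTED))
		return SKETCH_PROCEED;

	return error ? SKETCH_WARN : SKETCH_WARN_AND_PANIC;
}
```

For instance, `sketch_classify_completion(SKETCH_RQ_LOCAL_ABORTED, 0)` yields `SKETCH_WARN_AND_PANIC`, which corresponds to the `if (!error) panic(...)` branch in the diff, while any nonzero `error` on an aborted request yields only `SKETCH_WARN`.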