[PATCH] Three one-liners in md.c
The main problem fixed is that in certain situations stopping md arrays may
take longer than you expect, or may require multiple attempts.  This would
only happen while resync/recovery is in progress.

This patch fixes three vaguely related bugs.

1/ The recent change to use kthreads got the setting of the
   process name wrong.  This fixes it.
2/ The recent change to use kthreads lost the ability for
   md threads to be signalled with SIGKILL.  This restores that.
3/ There is a long-standing bug in that if:
    - An array needs recovery (onto a hot-spare) and
    - The recovery is being blocked because some other array being
       recovered shares a physical device and
    - The recovery thread is killed with SIGKILL
   Then the recovery will appear to have completed with no IO being
   done, which can cause data corruption.
   This patch makes sure that an incomplete recovery will be treated as
   incomplete.
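
The third bug can be modelled in user space.  The sketch below is a
simplified illustration, not the kernel code: struct array_state, do_sync()
and spare_is_in_sync() are hypothetical names, and the two flag bits only
stand in for the bits kept in mddev->recovery.  It assumes that the caller
treats a finished pass as successful unless an "interrupted" flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the recovery flag bits (not the kernel values). */
enum { RECOVERY_INTR = 1u << 0, RECOVERY_DONE = 1u << 1 };

struct array_state {
	unsigned recovery;	/* flag word, playing the role of mddev->recovery */
	size_t sectors_total;
	size_t sectors_synced;
};

/*
 * Simulated sync pass.  If a signal is pending before any IO starts, the
 * pass bails out.  With 'fixed' set, it also records that it was
 * interrupted; without it, the early exit looks like a normal completion.
 */
static void do_sync(struct array_state *a, bool signal_pending, bool fixed)
{
	assert(a != NULL);

	if (signal_pending) {
		if (fixed)
			a->recovery |= RECOVERY_INTR;
		goto skip;
	}
	a->sectors_synced = a->sectors_total;	/* pretend all IO was done */
skip:
	a->recovery |= RECOVERY_DONE;
}

/* The caller trusts the spare only if the pass finished uninterrupted. */
static bool spare_is_in_sync(const struct array_state *a)
{
	return (a->recovery & RECOVERY_DONE) && !(a->recovery & RECOVERY_INTR);
}
```

Calling do_sync() with signal_pending set and fixed clear leaves
sectors_synced at zero while spare_is_in_sync() still reports true, which is
the corruption case described above.  With fixed set, the same call reports
false.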

Note that any kernel affected by bug 2 will not suffer the problem of bug
3, as the signal can never be delivered.  Thus the current 2.6.14-rc
kernels are not susceptible to data corruption.  Note also that if arrays
are shut down (with "mdadm -S" or "raidstop") then the problem doesn't
occur.  It only happens if a SIGKILL is independently delivered, as is done
by 'init' when shutting down.

Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
NeilBrown authored and Linus Torvalds committed Oct 20, 2005
1 parent 4a9949d commit 6985c43
Showing 1 changed file with 3 additions and 1 deletion.
4 changes: 3 additions & 1 deletion drivers/md/md.c
@@ -3063,6 +3063,7 @@ static int md_thread(void * arg)
 	 * many dirty RAID5 blocks.
 	 */
 
+	allow_signal(SIGKILL);
 	complete(thread->event);
 	while (!kthread_should_stop()) {
 		void (*run)(mddev_t *);
@@ -3111,7 +3112,7 @@ mdk_thread_t *md_register_thread(void (*run) (mddev_t *), mddev_t *mddev,
 	thread->mddev = mddev;
 	thread->name = name;
 	thread->timeout = MAX_SCHEDULE_TIMEOUT;
-	thread->tsk = kthread_run(md_thread, thread, mdname(thread->mddev));
+	thread->tsk = kthread_run(md_thread, thread, name, mdname(thread->mddev));
 	if (IS_ERR(thread->tsk)) {
 		kfree(thread);
 		return NULL;
@@ -3569,6 +3570,7 @@ static void md_do_sync(mddev_t *mddev)
 	try_again:
 		if (signal_pending(current)) {
 			flush_signals(current);
+			set_bit(MD_RECOVERY_INTR, &mddev->recovery);
 			goto skip;
 		}
 		ITERATE_MDDEV(mddev2,tmp) {
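
The md_register_thread() change relies on kthread_run() taking a
printf-style name format followed by its arguments.  The user-space sketch
below illustrates why the old call produced the wrong process name.
format_thread_name() is a hypothetical helper, and the "%s_raid1" format is
an assumed example of what a caller might pass as 'name'; neither is taken
from the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space analogue of a printf-style thread namer:
 * expands 'namefmt' with its arguments into 'buf', as a name format
 * would be expanded when the thread is created.
 */
static int format_thread_name(char *buf, size_t len, const char *namefmt, ...)
{
	va_list args;
	int n;

	assert(buf != NULL && len > 0);

	va_start(args, namefmt);
	n = vsnprintf(buf, len, namefmt, args);
	va_end(args);
	return n;
}
```

Passing only the device name, as the old call did, yields a bare "md0".
Passing the format first and the device name as its argument, as the new
call does, yields "md0_raid1" under the assumed format.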
