Skip to content

Commit

Permalink
x86, MCE: Robustify mcheck_init_device
Browse files Browse the repository at this point in the history
BorisO reports that misc_register() fails often on xen. The current code
unregisters the CPU hotplug notifier in that case. If then a CPU is
offlined and onlined back again, we end up with a second timer running
on that CPU, leading to soft lockups and system hangs.

So let's leave the hotcpu notifier always registered - even if
mce_device_create failed for some cores and never unreg it so that we
can deal with the timer handling accordingly.

Reported-and-Tested-by: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Link: http://lkml.kernel.org/r/1403274493-1371-1-git-send-email-boris.ostrovsky@oracle.com
Signed-off-by: Borislav Petkov <bp@suse.de>
  • Loading branch information
Borislav Petkov committed Jul 21, 2014
1 parent 9a3c414 commit 51cbe7e
Showing 1 changed file with 6 additions and 4 deletions.
10 changes: 6 additions & 4 deletions arch/x86/kernel/cpu/mcheck/mce.c
Original file line number Diff line number Diff line change
Expand Up @@ -2451,6 +2451,12 @@ static __init int mcheck_init_device(void)
for_each_online_cpu(i) {
err = mce_device_create(i);
if (err) {
/*
* Register notifier anyway (and do not unreg it) so
* that we don't leave undeleted timers, see notifier
* callback above.
*/
__register_hotcpu_notifier(&mce_cpu_notifier);
cpu_notifier_register_done();
goto err_device_create;
}
Expand All @@ -2471,10 +2477,6 @@ static __init int mcheck_init_device(void)
err_register:
unregister_syscore_ops(&mce_syscore_ops);

cpu_notifier_register_begin();
__unregister_hotcpu_notifier(&mce_cpu_notifier);
cpu_notifier_register_done();

err_device_create:
/*
* We didn't keep track of which devices were created above, but
Expand Down

0 comments on commit 51cbe7e

Please sign in to comment.