smp/hotplug: Callback vs state-machine consistency
While the generic callback functions have an 'int' return and thus
appear to be allowed to return an error, this is not true for all states.

Specifically, the callbacks that used to be STARTING/DYING are run with
IRQs disabled from critical parts of CPU bringup/teardown and are not
allowed to fail. Add WARNs to enforce this rule.

But since some callbacks are indeed allowed to fail, a state-machine
rollback can itself encounter a failure. In that case we're stuck: we
can't go forward and we can't go back. Also add a WARN for that case.

AFAICT this is a fundamental 'problem' with no real obvious solution.
We want the 'prepare' callbacks to allow failure on either up or down.
Typically on prepare-up this would be things like -ENOMEM from
resource allocations, and the typical usage in prepare-down would be
something like -EBUSY to avoid CPUs being taken away.
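
The rollback situation described above can be sketched as a userspace toy
model. This is an illustrative sketch under stated assumptions, not the
kernel implementation: every toy_* name is hypothetical, and a counter plus
a message on stderr stands in for the kernel's WARN.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Toy model only: these names are hypothetical, not kernel APIs. */
enum toy_state {
	TOY_OFFLINE,
	TOY_PREPARE_A,
	TOY_PREPARE_B,
	TOY_ONLINE,
	TOY_NR_STATES
};

typedef int (*toy_cb)(unsigned int cpu);

struct toy_step {
	toy_cb up;	/* run when entering this state on the way up */
	toy_cb down;	/* run when leaving this state on the way down */
};

/* Sample callbacks: success, allocation failure, busy resource. */
static int toy_ok(unsigned int cpu)     { (void)cpu; return 0; }
static int toy_enomem(unsigned int cpu) { (void)cpu; return -ENOMEM; }
static int toy_ebusy(unsigned int cpu)  { (void)cpu; return -EBUSY; }

/* Counts rollback failures; stands in for the WARN the commit adds. */
static int toy_rollback_warnings;

/*
 * Walk *state up to target. If an 'up' callback fails, undo the steps
 * completed by this call in reverse order and return the original error.
 */
static int toy_bring_up(const struct toy_step *steps, enum toy_state *state,
			enum toy_state target, unsigned int cpu)
{
	enum toy_state start = *state;

	assert(target < TOY_NR_STATES);

	while (*state < target) {
		enum toy_state next = *state + 1;
		int ret = steps[next].up(cpu);

		if (!ret) {
			*state = next;
			continue;
		}

		/* A step failed: roll back what was already done. */
		while (*state > start) {
			if (steps[*state].down(cpu)) {
				/*
				 * Rollback itself failed. There is no good
				 * answer here; this toy model only records
				 * the event and keeps unwinding.
				 */
				toy_rollback_warnings++;
				fprintf(stderr,
					"WARN: rollback of state %d failed\n",
					(int)*state);
			}
			*state = *state - 1;
		}
		return ret;
	}
	return 0;
}
```

With a step table where the TOY_PREPARE_B 'up' callback is toy_enomem,
toy_bring_up() returns -ENOMEM and leaves the state at TOY_OFFLINE. If the
TOY_PREPARE_A 'down' callback is additionally set to toy_ebusy, the same
call also increments toy_rollback_warnings, which is the "can't go forward,
can't go back" case.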

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.819539119@infradead.org
Peter Zijlstra authored and Thomas Gleixner committed Sep 25, 2017
1 parent 4dddfb5 commit 724a868
Showing 1 changed file with 22 additions and 4 deletions.
26 changes: 22 additions & 4 deletions kernel/cpu.c
@@ -202,7 +202,14 @@ static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state,
 	hlist_for_each(node, &step->list) {
 		if (!cnt--)
 			break;
-		cbm(cpu, node);
+
+		trace_cpuhp_multi_enter(cpu, st->target, state, cbm, node);
+		ret = cbm(cpu, node);
+		trace_cpuhp_exit(cpu, st->state, state, ret);
+		/*
+		 * Rollback must not fail,
+		 */
+		WARN_ON_ONCE(ret);
 	}
 	return ret;
 }
@@ -659,6 +666,7 @@ static int take_cpu_down(void *_param)
 	struct cpuhp_cpu_state *st = this_cpu_ptr(&cpuhp_state);
 	enum cpuhp_state target = max((int)st->target, CPUHP_AP_OFFLINE);
 	int err, cpu = smp_processor_id();
+	int ret;
 
 	/* Ensure this CPU doesn't handle any more interrupts. */
 	err = __cpu_disable();
@@ -672,8 +680,13 @@ static int take_cpu_down(void *_param)
 	WARN_ON(st->state != CPUHP_TEARDOWN_CPU);
 	st->state--;
 	/* Invoke the former CPU_DYING callbacks */
-	for (; st->state > target; st->state--)
-		cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
+	for (; st->state > target; st->state--) {
+		ret = cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
+		/*
+		 * DYING must not fail!
+		 */
+		WARN_ON_ONCE(ret);
+	}
 
 	/* Give up timekeeping duties */
 	tick_handover_do_timer();
@@ -876,11 +889,16 @@ void notify_cpu_starting(unsigned int cpu)
 {
 	struct cpuhp_cpu_state *st = per_cpu_ptr(&cpuhp_state, cpu);
 	enum cpuhp_state target = min((int)st->target, CPUHP_AP_ONLINE);
+	int ret;
 
 	rcu_cpu_starting(cpu);	/* Enables RCU usage on this CPU. */
 	while (st->state < target) {
 		st->state++;
-		cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
+		ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
+		/*
+		 * STARTING must not fail!
+		 */
+		WARN_ON_ONCE(ret);
 	}
 }
