sched: Ensure cpu_power periodic update
With a lot of small tasks, the sched softirq is almost never called
when no_hz is enabled. In this case load_balance() is mainly called
in newly_idle mode, which doesn't update cpu_power.

Add a next_update field which ensures a maximum update period when
there is only short activity.

Having stale cpu_power information can skew the load-balancing
decisions; this is cured by the guaranteed update.

Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/1323717668-2143-1-git-send-email-vincent.guittot@linaro.org
Vincent Guittot authored and Ingo Molnar committed Jan 27, 2012
1 parent 39be350 commit 4ec4412
Showing 2 changed files with 17 additions and 8 deletions.
1 change: 1 addition & 0 deletions include/linux/sched.h
@@ -905,6 +905,7 @@ struct sched_group_power {
 	 * single CPU.
 	 */
 	unsigned int power, power_orig;
+	unsigned long next_update;
 	/*
 	 * Number of busy cpus in this group.
 	 */
24 changes: 16 additions & 8 deletions kernel/sched/fair.c
@@ -215,6 +215,8 @@ calc_delta_mine(unsigned long delta_exec, unsigned long weight,
 
 const struct sched_class fair_sched_class;
 
+static unsigned long __read_mostly max_load_balance_interval = HZ/10;
+
 /**************************************************************
  * CFS operations on generic schedulable entities:
  */
@@ -3776,6 +3778,11 @@ void update_group_power(struct sched_domain *sd, int cpu)
 	struct sched_domain *child = sd->child;
 	struct sched_group *group, *sdg = sd->groups;
 	unsigned long power;
+	unsigned long interval;
+
+	interval = msecs_to_jiffies(sd->balance_interval);
+	interval = clamp(interval, 1UL, max_load_balance_interval);
+	sdg->sgp->next_update = jiffies + interval;
 
 	if (!child) {
 		update_cpu_power(sd, cpu);
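
Note on the hunk above: the deadline is the domain's balance interval converted to jiffies, then bounded to between 1 jiffy and max_load_balance_interval. The following is a minimal userspace sketch of that arithmetic, not the kernel code. It assumes HZ = 250 for illustration, and msecs_to_jiffies_sketch(), clamp_ul() and next_update_interval() are hypothetical stand-ins for the kernel's msecs_to_jiffies() and clamp().

```c
#include <assert.h>

/* Assumed tick rate for illustration only; the real HZ is a kernel build option. */
#define HZ 250UL

/* Simplified stand-in for msecs_to_jiffies(): converts ms to ticks, rounding up. */
static unsigned long msecs_to_jiffies_sketch(unsigned long ms)
{
	return (ms * HZ + 999UL) / 1000UL;
}

/* Hypothetical helper playing the role of the kernel's clamp() macro. */
static unsigned long clamp_ul(unsigned long v, unsigned long lo, unsigned long hi)
{
	assert(lo <= hi);
	return v < lo ? lo : (v > hi ? hi : v);
}

/*
 * Number of jiffies until the next forced cpu_power refresh, following the
 * two assignments to "interval" in the hunk above.
 */
static unsigned long next_update_interval(unsigned long balance_interval_ms,
					  unsigned long max_interval)
{
	unsigned long interval = msecs_to_jiffies_sketch(balance_interval_ms);

	return clamp_ul(interval, 1UL, max_interval);
}
```

Under these assumptions, with max_interval = HZ/10 = 25, an 8 ms balance interval yields 2 jiffies, a 0 ms interval is raised to 1 jiffy, and a 512 ms interval is capped at 25 jiffies.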
@@ -3883,12 +3890,15 @@ static inline void update_sg_lb_stats(struct sched_domain *sd,
 	 * domains. In the newly idle case, we will allow all the cpu's
 	 * to do the newly idle load balance.
 	 */
-	if (idle != CPU_NEWLY_IDLE && local_group) {
-		if (balance_cpu != this_cpu) {
-			*balance = 0;
-			return;
-		}
-		update_group_power(sd, this_cpu);
+	if (local_group) {
+		if (idle != CPU_NEWLY_IDLE) {
+			if (balance_cpu != this_cpu) {
+				*balance = 0;
+				return;
+			}
+			update_group_power(sd, this_cpu);
+		} else if (time_after_eq(jiffies, group->sgp->next_update))
+			update_group_power(sd, this_cpu);
 	}
 
 	/* Adjust by relative CPU power of the group */
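
Note on the hunk above: the restructured condition keeps the old behaviour for the non-newly-idle case and adds a deadline check for the newly-idle case. The following is a rough userspace sketch of that decision, not the kernel function. All names ending in _sketch, the lb_action enum and local_group_action() are hypothetical; the comparison mirrors the signed-difference idea behind time_after_eq(), which stays correct when the jiffies counter wraps around.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's idle types. */
enum idle_type_sketch { SK_CPU_IDLE, SK_CPU_NOT_IDLE, SK_CPU_NEWLY_IDLE };

/* Outcome of the local-group check, named here for illustration only. */
enum lb_action {
	LB_CONTINUE,     /* no power update; carry on gathering statistics */
	LB_ABORT,        /* corresponds to "*balance = 0; return;" */
	LB_UPDATE_POWER  /* corresponds to update_group_power(sd, this_cpu) */
};

/*
 * Wraparound-safe "a is at or after b". Relies on the usual two's-complement
 * conversion from unsigned long to long.
 */
static bool time_after_eq_sketch(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

static enum lb_action local_group_action(bool local_group,
					 enum idle_type_sketch idle,
					 int balance_cpu, int this_cpu,
					 unsigned long now,
					 unsigned long next_update)
{
	if (!local_group)
		return LB_CONTINUE;

	if (idle != SK_CPU_NEWLY_IDLE) {
		/* Only the designated CPU balances at this level. */
		if (balance_cpu != this_cpu)
			return LB_ABORT;
		return LB_UPDATE_POWER;
	}

	/* Newly idle: refresh cpu_power only once the deadline has passed. */
	if (time_after_eq_sketch(now, next_update))
		return LB_UPDATE_POWER;

	return LB_CONTINUE;
}
```

In this sketch a newly idle CPU with now == next_update triggers an update, one tick earlier does not, and a deadline set just before the counter wraps is still recognised as expired shortly after the wrap.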
@@ -4945,8 +4955,6 @@ static int __cpuinit sched_ilb_notifier(struct notifier_block *nfb,
 
 static DEFINE_SPINLOCK(balancing);
 
-static unsigned long __read_mostly max_load_balance_interval = HZ/10;
-
 /*
  * Scale the max load_balance interval with the number of CPUs in the system.
  * This trades load-balance latency on larger machines for less cross talk.
