Skip to content

Commit

Permalink
taprio: fix panic while hw offload sched list swap
Browse files Browse the repository at this point in the history
Don't swap oper and admin schedules too early, it's not correct and
causes crash.

Steps to reproduce:

1)
tc qdisc replace dev eth0 parent root handle 100 taprio \
    num_tc 3 \
    map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
    queues 1@0 1@1 1@2 \
    base-time $SOME_BASE_TIME \
    sched-entry S 01 80000 \
    sched-entry S 02 15000 \
    sched-entry S 04 40000 \
    flags 2

2)
tc qdisc replace dev eth0 parent root handle 100 taprio \
    base-time $SOME_BASE_TIME \
    sched-entry S 01 90000 \
    sched-entry S 02 20000 \
    sched-entry S 04 40000 \
    flags 2

3)
tc qdisc replace dev eth0 parent root handle 100 taprio \
    base-time $SOME_BASE_TIME \
    sched-entry S 01 150000 \
    sched-entry S 02 200000 \
    sched-entry S 04 40000 \
    flags 2

Do 2 3 2 .. steps  more times if not happens and observe:

[  305.832319] Unable to handle kernel write to read-only memory at
virtual address ffff0000087ce7f0
[  305.910887] CPU: 0 PID: 0 Comm: swapper/0 Not tainted
[  305.919306] Hardware name: Texas Instruments AM654 Base Board (DT)

[...]

[  306.017119] x1 : ffff800848031d88 x0 : ffff800848031d80
[  306.022422] Call trace:
[  306.024866]  taprio_free_sched_cb+0x4c/0x98
[  306.029040]  rcu_process_callbacks+0x25c/0x410
[  306.033476]  __do_softirq+0x10c/0x208
[  306.037132]  irq_exit+0xb8/0xc8
[  306.040267]  __handle_domain_irq+0x64/0xb8
[  306.044352]  gic_handle_irq+0x7c/0x178
[  306.048092]  el1_irq+0xb0/0x128
[  306.051227]  arch_cpu_idle+0x10/0x18
[  306.054795]  do_idle+0x120/0x138
[  306.058015]  cpu_startup_entry+0x20/0x28
[  306.061931]  rest_init+0xcc/0xd8
[  306.065154]  start_kernel+0x3bc/0x3e4
[  306.068810] Code: f2fbd5b7 f2fbd5b6 d503201f f9400422 (f9000662)
[  306.074900] ---[ end trace 96c8e2284a9d9d6e ]---
[  306.079507] Kernel panic - not syncing: Fatal exception in interrupt
[  306.085847] SMP: stopping secondary CPUs
[  306.089765] Kernel Offset: disabled

Try to explain one of the possible crash cases:

The "real" admin list is assigned when admin_sched is set to
new_admin, it happens after "swap", that assigns to oper_sched NULL.
Thus if call qdisc show it can crash.

Farther, next second time, when sched list is updated, the admin_sched
is not NULL and becomes the oper_sched, previous oper_sched was NULL so
just skipped. But then admin_sched is assigned new_admin, but schedules
to free previous assigned admin_sched (that already became oper_sched).

Farther, next third time, when sched list is updated,
while one more swap, oper_sched is not null, but it was happy to be
freed already (while prev. admin update), so while try to free
oper_sched the kernel panic happens at taprio_free_sched_cb().

So, move the "swap emulation" where it should be according to function
comment from code.

Fixes: 9c66d15 ("taprio: Add support for hardware offloading")
Signed-off-by: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org>
Acked-by: Vinicius Costa Gomes <vinicius.gomes@intel.com>
Tested-by: Vladimir Oltean <olteanv@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
  • Loading branch information
Ivan Khoronzhuk authored and David S. Miller committed Nov 5, 2019
1 parent fc564e0 commit 0763b3e
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions net/sched/sch_taprio.c
Original file line number Diff line number Diff line change
Expand Up @@ -1224,8 +1224,6 @@ static int taprio_enable_offload(struct net_device *dev,
goto done;
}

taprio_offload_config_changed(q);

done:
taprio_offload_free(offload);

Expand Down Expand Up @@ -1505,6 +1503,9 @@ static int taprio_change(struct Qdisc *sch, struct nlattr *opt,
call_rcu(&admin->rcu, taprio_free_sched_cb);

spin_unlock_irqrestore(&q->current_entry_lock, flags);

if (FULL_OFFLOAD_IS_ENABLED(taprio_flags))
taprio_offload_config_changed(q);
}

new_admin = NULL;
Expand Down

0 comments on commit 0763b3e

Please sign in to comment.