Skip to content

Commit

Permalink
Merge tag 'core-rcu-2021-02-17' of git://git.kernel.org/pub/scm/linux…
Browse files Browse the repository at this point in the history
…/kernel/git/tip/tip

Pull RCU updates from Ingo Molnar:
 "These are the latest RCU updates for v5.12:

   - Documentation updates.

   - Miscellaneous fixes.

   - kfree_rcu() updates: Addition of mem_dump_obj() to provide
     allocator return addresses to more easily locate bugs. This has a
     couple of RCU-related commits, but is mostly MM. Was pulled in with
     akpm's agreement.

   - Per-callback-batch tracking of numbers of callbacks, which enables
     better debugging information and smarter reactions to large numbers
     of callbacks.

   - The first round of changes to allow CPUs to be runtime switched
     from and to callback-offloaded state.

   - CONFIG_PREEMPT_RT-related changes.

   - RCU CPU stall warning updates.

   - Addition of polling grace-period APIs for SRCU.

   - Torture-test and torture-test scripting updates, including a
     "torture everything" script that runs rcutorture, locktorture,
     scftorture, rcuscale, and refscale. Plus does an allmodconfig
     build.

   - nolibc fixes for the torture tests"

* tag 'core-rcu-2021-02-17' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (130 commits)
  percpu_ref: Dump mem_dump_obj() info upon reference-count underflow
  rcu: Make call_rcu() print mem_dump_obj() info for double-freed callback
  mm: Make mem_obj_dump() vmalloc() dumps include start and length
  mm: Make mem_dump_obj() handle vmalloc() memory
  mm: Make mem_dump_obj() handle NULL and zero-sized pointers
  mm: Add mem_dump_obj() to print source of memory block
  tools/rcutorture: Fix position of -lgcc in mkinitrd.sh
  tools/nolibc: Fix position of -lgcc in the documented example
  tools/nolibc: Emit detailed error for missing alternate syscall number definitions
  tools/nolibc: Remove incorrect definitions of __ARCH_WANT_*
  tools/nolibc: Get timeval, timespec and timezone from linux/time.h
  tools/nolibc: Implement poll() based on ppoll()
  tools/nolibc: Implement fork() based on clone()
  tools/nolibc: Make getpgrp() fall back to getpgid(0)
  tools/nolibc: Make dup2() rely on dup3() when available
  tools/nolibc: Add the definition for dup()
  rcutorture: Add rcutree.use_softirq=0 to RUDE01 and TASKS01
  torture: Maintain torture-specific set of CPUs-online books
  torture: Clean up after torture-test CPU hotplugging
  rcutorture: Make object_debug also double call_rcu() heap object
  ...
  • Loading branch information
Linus Torvalds committed Feb 21, 2021
2 parents 3f6ec19 + 2b392cb commit d089f48
Show file tree
Hide file tree
Showing 63 changed files with 3,108 additions and 763 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -38,7 +38,7 @@ sections.
RCU-preempt Expedited Grace Periods
===================================

``CONFIG_PREEMPT=y`` kernels implement RCU-preempt.
``CONFIG_PREEMPTION=y`` kernels implement RCU-preempt.
The overall flow of the handling of a given CPU by an RCU-preempt
expedited grace period is shown in the following diagram:

Expand Down Expand Up @@ -112,7 +112,7 @@ things.
RCU-sched Expedited Grace Periods
---------------------------------

``CONFIG_PREEMPT=n`` kernels implement RCU-sched. The overall flow of
``CONFIG_PREEMPTION=n`` kernels implement RCU-sched. The overall flow of
the handling of a given CPU by an RCU-sched expedited grace period is
shown in the following diagram:

Expand Down
732 changes: 374 additions & 358 deletions Documentation/RCU/Design/Requirements/Requirements.rst

Large diffs are not rendered by default.

10 changes: 4 additions & 6 deletions Documentation/RCU/checklist.rst
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,7 @@ over a rather long period of time, but improvements are always welcome!
is less readable and prevents lockdep from detecting locking issues.

Letting RCU-protected pointers "leak" out of an RCU read-side
critical section is every bid as bad as letting them leak out
critical section is every bit as bad as letting them leak out
from under a lock. Unless, of course, you have arranged some
other means of protection, such as a lock or a reference count
-before- letting them out of the RCU read-side critical section.
Expand Down Expand Up @@ -129,9 +129,7 @@ over a rather long period of time, but improvements are always welcome!
accesses. The rcu_dereference() primitive ensures that
the CPU picks up the pointer before it picks up the data
that the pointer points to. This really is necessary
on Alpha CPUs. If you don't believe me, see:

http://www.openvms.compaq.com/wizard/wiz_2637.html
on Alpha CPUs.

The rcu_dereference() primitive is also an excellent
documentation aid, letting the person reading the
Expand Down Expand Up @@ -214,9 +212,9 @@ over a rather long period of time, but improvements are always welcome!
the rest of the system.

7. As of v4.20, a given kernel implements only one RCU flavor,
which is RCU-sched for PREEMPT=n and RCU-preempt for PREEMPT=y.
which is RCU-sched for PREEMPTION=n and RCU-preempt for PREEMPTION=y.
If the updater uses call_rcu() or synchronize_rcu(),
then the corresponding readers my use rcu_read_lock() and
then the corresponding readers may use rcu_read_lock() and
rcu_read_unlock(), rcu_read_lock_bh() and rcu_read_unlock_bh(),
or any pair of primitives that disables and re-enables preemption,
for example, rcu_read_lock_sched() and rcu_read_unlock_sched().
Expand Down
6 changes: 3 additions & 3 deletions Documentation/RCU/rcubarrier.rst
Original file line number Diff line number Diff line change
Expand Up @@ -9,7 +9,7 @@ RCU (read-copy update) is a synchronization mechanism that can be thought
of as a replacement for read-writer locking (among other things), but with
very low-overhead readers that are immune to deadlock, priority inversion,
and unbounded latency. RCU read-side critical sections are delimited
by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPT
by rcu_read_lock() and rcu_read_unlock(), which, in non-CONFIG_PREEMPTION
kernels, generate no code whatsoever.

This means that RCU writers are unaware of the presence of concurrent
Expand Down Expand Up @@ -329,10 +329,10 @@ Answer: This cannot happen. The reason is that on_each_cpu() has its last
to smp_call_function() and further to smp_call_function_on_cpu(),
causing this latter to spin until the cross-CPU invocation of
rcu_barrier_func() has completed. This by itself would prevent
a grace period from completing on non-CONFIG_PREEMPT kernels,
a grace period from completing on non-CONFIG_PREEMPTION kernels,
since each CPU must undergo a context switch (or other quiescent
state) before the grace period can complete. However, this is
of no use in CONFIG_PREEMPT kernels.
of no use in CONFIG_PREEMPTION kernels.

Therefore, on_each_cpu() disables preemption across its call
to smp_call_function() and also across the local call to
Expand Down
27 changes: 24 additions & 3 deletions Documentation/RCU/stallwarn.rst
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@ warnings:

- A CPU looping with bottom halves disabled.

- For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
- For !CONFIG_PREEMPTION kernels, a CPU looping anywhere in the kernel
without invoking schedule(). If the looping in the kernel is
really expected and desirable behavior, you might need to add
some calls to cond_resched().
Expand All @@ -44,7 +44,7 @@ warnings:
result in the ``rcu_.*kthread starved for`` console-log message,
which will include additional debugging information.

- A CPU-bound real-time task in a CONFIG_PREEMPT kernel, which might
- A CPU-bound real-time task in a CONFIG_PREEMPTION kernel, which might
happen to preempt a low-priority task in the middle of an RCU
read-side critical section. This is especially damaging if
that low-priority task is not permitted to run on any other CPU,
Expand Down Expand Up @@ -92,7 +92,9 @@ warnings:
buggy timer hardware through bugs in the interrupt or exception
path (whether hardware, firmware, or software) through bugs
in Linux's timer subsystem through bugs in the scheduler, and,
yes, even including bugs in RCU itself.
yes, even including bugs in RCU itself. It can also result in
the ``rcu_.*timer wakeup didn't happen for`` console-log message,
which will include additional debugging information.

- A bug in the RCU implementation.

Expand Down Expand Up @@ -292,6 +294,25 @@ kthread is waiting for a short timeout, the "state" precedes value of the
task_struct ->state field, and the "cpu" indicates that the grace-period
kthread last ran on CPU 5.

If the relevant grace-period kthread does not wake from FQS wait in a
reasonable time, then the following additional line is printed::

kthread timer wakeup didn't happen for 23804 jiffies! g7076 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402

The "23804" indicates that kthread's timer expired more than 23 thousand
jiffies ago. The rest of the line has meaning similar to the kthread
starvation case.

Additionally, the following line is printed::

Possible timer handling issue on cpu=4 timer-softirq=11142

Here "cpu" indicates that the grace-period kthread last ran on CPU 4,
where it queued the fqs timer. The number following the "timer-softirq"
is the current ``TIMER_SOFTIRQ`` count on cpu 4. If this value does not
change on successive RCU CPU stall warnings, there is further reason to
suspect a timer problem.


Multiple Warnings From One Stall
================================
Expand Down
10 changes: 5 additions & 5 deletions Documentation/RCU/whatisRCU.rst
Original file line number Diff line number Diff line change
Expand Up @@ -683,7 +683,7 @@ Quick Quiz #1:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This section presents a "toy" RCU implementation that is based on
"classic RCU". It is also short on performance (but only for updates) and
on features such as hotplug CPU and the ability to run in CONFIG_PREEMPT
on features such as hotplug CPU and the ability to run in CONFIG_PREEMPTION
kernels. The definitions of rcu_dereference() and rcu_assign_pointer()
are the same as those shown in the preceding section, so they are omitted.
::
Expand Down Expand Up @@ -739,7 +739,7 @@ Quick Quiz #2:
Quick Quiz #3:
If it is illegal to block in an RCU read-side
critical section, what the heck do you do in
PREEMPT_RT, where normal spinlocks can block???
CONFIG_PREEMPT_RT, where normal spinlocks can block???

:ref:`Answers to Quick Quiz <8_whatisRCU>`

Expand Down Expand Up @@ -1093,7 +1093,7 @@ Quick Quiz #2:
overhead is **negative**.

Answer:
Imagine a single-CPU system with a non-CONFIG_PREEMPT
Imagine a single-CPU system with a non-CONFIG_PREEMPTION
kernel where a routing table is used by process-context
code, but can be updated by irq-context code (for example,
by an "ICMP REDIRECT" packet). The usual way of handling
Expand All @@ -1120,10 +1120,10 @@ Answer:
Quick Quiz #3:
If it is illegal to block in an RCU read-side
critical section, what the heck do you do in
PREEMPT_RT, where normal spinlocks can block???
CONFIG_PREEMPT_RT, where normal spinlocks can block???

Answer:
Just as PREEMPT_RT permits preemption of spinlock
Just as CONFIG_PREEMPT_RT permits preemption of spinlock
critical sections, it permits preemption of RCU
read-side critical sections. It also permits
spinlocks blocking while in RCU read-side critical
Expand Down
39 changes: 33 additions & 6 deletions Documentation/admin-guide/kernel-parameters.txt
Original file line number Diff line number Diff line change
Expand Up @@ -4078,6 +4078,10 @@
value, meaning that RCU_SOFTIRQ is used by default.
Specify rcutree.use_softirq=0 to use rcuc kthreads.

But note that CONFIG_PREEMPT_RT=y kernels disable
this kernel boot parameter, forcibly setting it
to zero.

rcutree.rcu_fanout_exact= [KNL]
Disable autobalancing of the rcu_node combining
tree. This is used by rcutorture, and might
Expand Down Expand Up @@ -4165,12 +4169,6 @@
Set wakeup interval for idle CPUs that have
RCU callbacks (RCU_FAST_NO_HZ=y).

rcutree.rcu_idle_lazy_gp_delay= [KNL]
Set wakeup interval for idle CPUs that have
only "lazy" RCU callbacks (RCU_FAST_NO_HZ=y).
Lazy RCU callbacks are those which RCU can
prove do nothing more than free memory.

rcutree.rcu_kick_kthreads= [KNL]
Cause the grace-period kthread to get an extra
wake_up() if it sleeps three times longer than
Expand Down Expand Up @@ -4324,6 +4322,14 @@
stress RCU, they don't participate in the actual
test, hence the "fake".

rcutorture.nocbs_nthreads= [KNL]
Set number of RCU callback-offload togglers.
Zero (the default) disables toggling.

rcutorture.nocbs_toggle= [KNL]
Set the delay in milliseconds between successive
callback-offload toggling attempts.

rcutorture.nreaders= [KNL]
Set number of RCU readers. The value -1 selects
N-1, where N is the number of CPUs. A value
Expand Down Expand Up @@ -4456,6 +4462,13 @@
only normal grace-period primitives. No effect
on CONFIG_TINY_RCU kernels.

But note that CONFIG_PREEMPT_RT=y kernels enables
this kernel boot parameter, forcibly setting
it to the value one, that is, converting any
post-boot attempt at an expedited RCU grace
period to instead use normal non-expedited
grace-period processing.

rcupdate.rcu_task_ipi_delay= [KNL]
Set time in jiffies during which RCU tasks will
avoid sending IPIs, starting with the beginning
Expand Down Expand Up @@ -4543,6 +4556,12 @@
refscale.verbose= [KNL]
Enable additional printk() statements.

refscale.verbose_batched= [KNL]
Batch the additional printk() statements. If zero
(the default) or negative, print everything. Otherwise,
print every Nth verbose statement, where N is the value
specified.

relax_domain_level=
[KNL, SMP] Set scheduler's default relax_domain_level.
See Documentation/admin-guide/cgroup-v1/cpusets.rst.
Expand Down Expand Up @@ -5317,6 +5336,14 @@
are running concurrently, especially on systems
with rotating-rust storage.

torture.verbose_sleep_frequency= [KNL]
Specifies how many verbose printk()s should be
emitted between each sleep. The default of zero
disables verbose-printk() sleeping.

torture.verbose_sleep_duration= [KNL]
Duration of each verbose-printk() sleep in jiffies.

tp720= [HW,PS2]

tpm_suspend_pcr=[HW,TPM]
Expand Down
2 changes: 2 additions & 0 deletions include/linux/cpu.h
Original file line number Diff line number Diff line change
Expand Up @@ -111,6 +111,8 @@ static inline void cpu_maps_update_done(void)
#endif /* CONFIG_SMP */
extern struct bus_type cpu_subsys;

extern int lockdep_is_cpus_held(void);

#ifdef CONFIG_HOTPLUG_CPU
extern void cpus_write_lock(void);
extern void cpus_write_unlock(void);
Expand Down
2 changes: 1 addition & 1 deletion include/linux/list.h
Original file line number Diff line number Diff line change
Expand Up @@ -901,7 +901,7 @@ static inline void hlist_add_before(struct hlist_node *n,
}

/**
* hlist_add_behing - add a new entry after the one specified
* hlist_add_behind - add a new entry after the one specified
* @n: new entry to be added
* @prev: hlist node to add it after, which must be non-NULL
*/
Expand Down
2 changes: 2 additions & 0 deletions include/linux/mm.h
Original file line number Diff line number Diff line change
Expand Up @@ -3177,5 +3177,7 @@ unsigned long wp_shared_mapping_range(struct address_space *mapping,

extern int sysctl_nr_trim_pages;

void mem_dump_obj(void *object);

#endif /* __KERNEL__ */
#endif /* _LINUX_MM_H */
Loading

0 comments on commit d089f48

Please sign in to comment.