From 1846a7fa767fbf8cf42d71daf75d51e30e3c8327 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 13 May 2021 11:17:02 -0700 Subject: [PATCH 1/4] tools/memory-model: Make read_foo_diagnostic() more clearly diagnostic The current definition of read_foo_diagnostic() in the "Lock Protection With Lockless Diagnostic Access" section returns a value, which could be use for any purpose. This could mislead people into incorrectly using data_race() in cases where READ_ONCE() is required. This commit therefore makes read_foo_diagnostic() simply print the value read. Reported-by: Manfred Spraul Signed-off-by: Paul E. McKenney --- tools/memory-model/Documentation/access-marking.txt | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/memory-model/Documentation/access-marking.txt b/tools/memory-model/Documentation/access-marking.txt index 1ab189f51f55d..58bff26198767 100644 --- a/tools/memory-model/Documentation/access-marking.txt +++ b/tools/memory-model/Documentation/access-marking.txt @@ -259,9 +259,9 @@ diagnostic purposes. The code might look as follows: return ret; } - int read_foo_diagnostic(void) + void read_foo_diagnostic(void) { - return data_race(foo); + pr_info("Current value of foo: %d\n", data_race(foo)); } The reader-writer lock prevents the compiler from introducing concurrency From 436eef23c41fe10dc34ed19a00caf9f1290a8689 Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Thu, 13 May 2021 14:54:58 -0700 Subject: [PATCH 2/4] tools/memory-model: Add example for heuristic lockless reads This commit adds example code for heuristic lockless reads, based loosely on the sem_lock() and sem_unlock() functions. [ paulmck: Apply Alan Stern and Manfred Spraul feedback. ] Reported-by: Manfred Spraul [ paulmck: Update per Manfred Spraul and Hillf Danton feedback. ] Signed-off-by: Paul E. McKenney --- .../Documentation/access-marking.txt | 93 +++++++++++++++++++ 1 file changed, 93 insertions(+) diff --git a/tools/memory-model/Documentation/access-marking.txt b/tools/memory-model/Documentation/access-marking.txt index 58bff26198767..d96fe20ed582a 100644 --- a/tools/memory-model/Documentation/access-marking.txt +++ b/tools/memory-model/Documentation/access-marking.txt @@ -319,6 +319,99 @@ of the ASSERT_EXCLUSIVE_WRITER() is to allow KCSAN to check for a buggy concurrent lockless write. +Lock-Protected Writes With Heuristic Lockless Reads +--------------------------------------------------- + +For another example, suppose that the code can normally make use of +a per-data-structure lock, but there are times when a global lock +is required. These times are indicated via a global flag. The code +might look as follows, and is based loosely on nf_conntrack_lock(), +nf_conntrack_all_lock(), and nf_conntrack_all_unlock(): + + bool global_flag; + DEFINE_SPINLOCK(global_lock); + struct foo { + spinlock_t f_lock; + int f_data; + }; + + /* All foo structures are in the following array. */ + int nfoo; + struct foo *foo_array; + + void do_something_locked(struct foo *fp) + { + /* This works even if data_race() returns nonsense. */ + if (!data_race(global_flag)) { + spin_lock(&fp->f_lock); + if (!smp_load_acquire(&global_flag)) { + do_something(fp); + spin_unlock(&fp->f_lock); + return; + } + spin_unlock(&fp->f_lock); + } + spin_lock(&global_lock); + /* global_lock held, thus global flag cannot be set. */ + spin_lock(&fp->f_lock); + spin_unlock(&global_lock); + /* + * global_flag might be set here, but begin_global() + * will wait for ->f_lock to be released. + */ + do_something(fp); + spin_unlock(&fp->f_lock); + } + + void begin_global(void) + { + int i; + + spin_lock(&global_lock); + WRITE_ONCE(global_flag, true); + for (i = 0; i < nfoo; i++) { + /* + * Wait for pre-existing local locks. One at + * a time to avoid lockdep limitations. + */ + spin_lock(&fp->f_lock); + spin_unlock(&fp->f_lock); + } + } + + void end_global(void) + { + smp_store_release(&global_flag, false); + spin_unlock(&global_lock); + } + +All code paths leading from the do_something_locked() function's first +read from global_flag acquire a lock, so endless load fusing cannot +happen. + +If the value read from global_flag is true, then global_flag is +rechecked while holding ->f_lock, which, if global_flag is now false, +prevents begin_global() from completing. It is therefore safe to invoke +do_something(). + +Otherwise, if either value read from global_flag is true, then after +global_lock is acquired global_flag must be false. The acquisition of +->f_lock will prevent any call to begin_global() from returning, which +means that it is safe to release global_lock and invoke do_something(). + +For this to work, only those foo structures in foo_array[] may be passed +to do_something_locked(). The reason for this is that the synchronization +with begin_global() relies on momentarily holding the lock of each and +every foo structure. + +The smp_load_acquire() and smp_store_release() are required because +changes to a foo structure between calls to begin_global() and +end_global() are carried out without holding that structure's ->f_lock. +The smp_load_acquire() and smp_store_release() ensure that the next +invocation of do_something() from do_something_locked() will see those +changes. + + Lockless Reads and Writes ------------------------- From f92975d76d537c06a2118f9c3c63432c0f7c7a88 Mon Sep 17 00:00:00 2001 From: Manfred Spraul Date: Fri, 14 May 2021 11:40:06 -0700 Subject: [PATCH 3/4] tools/memory-model: Heuristics using data_race() must handle all values Data loaded for use by some sorts of heuristics can tolerate the occasional erroneous value. In this case the loads may use data_race() to give the compiler full freedom to optimize while also informing KCSAN of the intent. However, for this to work, the heuristic needs to be able to tolerate any erroneous value that could possibly arise. This commit therefore adds a paragraph spelling this out. Signed-off-by: Manfred Spraul Signed-off-by: Paul E. McKenney --- tools/memory-model/Documentation/access-marking.txt | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/tools/memory-model/Documentation/access-marking.txt b/tools/memory-model/Documentation/access-marking.txt index d96fe20ed582a..82a4899887269 100644 --- a/tools/memory-model/Documentation/access-marking.txt +++ b/tools/memory-model/Documentation/access-marking.txt @@ -126,6 +126,11 @@ consistent errors, which in turn are quite capable of breaking heuristics. Therefore use of data_race() should be limited to cases where some other code (such as a barrier() call) will force the occasional reload. +Note that this use case requires that the heuristic be able to handle +any possible error. In contrast, if the heuristics might be fatally +confused by one or more of the possible erroneous values, use READ_ONCE() +instead of data_race(). + In theory, plain C-language loads can also be used for this use case. However, in practice this will have the disadvantage of causing KCSAN to generate false positives because KCSAN will have no way of knowing From 87859a8e3f083bd57b34e6a962544d775a76b15f Mon Sep 17 00:00:00 2001 From: "Paul E. McKenney" Date: Tue, 18 May 2021 10:47:43 -0700 Subject: [PATCH 4/4] tools/memory-model: Document data_race(READ_ONCE()) It is possible to cause KCSAN to ignore marked accesses by applying __no_kcsan to the function or applying data_race() to the marked accesses. These approaches allow the developer to restrict compiler optimizations while also causing KCSAN to ignore diagnostic accesses. This commit therefore updates the documentation accordingly. Signed-off-by: Paul E. McKenney --- .../Documentation/access-marking.txt | 49 +++++++++++++------ 1 file changed, 35 insertions(+), 14 deletions(-) diff --git a/tools/memory-model/Documentation/access-marking.txt b/tools/memory-model/Documentation/access-marking.txt index 82a4899887269..65778222183e3 100644 --- a/tools/memory-model/Documentation/access-marking.txt +++ b/tools/memory-model/Documentation/access-marking.txt @@ -37,7 +37,9 @@ compiler's use of code-motion and common-subexpression optimizations. Therefore, if a given access is involved in an intentional data race, using READ_ONCE() for loads and WRITE_ONCE() for stores is usually preferable to data_race(), which in turn is usually preferable to plain -C-language accesses. +C-language accesses. It is permissible to combine #2 and #3, for example, +data_race(READ_ONCE(a)), which will both restrict compiler optimizations +and disable KCSAN diagnostics. KCSAN will complain about many types of data races involving plain C-language accesses, but marking all accesses involved in a given data @@ -86,6 +88,10 @@ that fail to exclude the updates. In this case, it is important to use data_race() for the diagnostic reads because otherwise KCSAN would give false-positive warnings about these diagnostic reads. +If it is necessary to both restrict compiler optimizations and disable +KCSAN diagnostics, use both data_race() and READ_ONCE(), for example, +data_race(READ_ONCE(a)). + In theory, plain C-language loads can also be used for this use case. However, in practice this will have the disadvantage of causing KCSAN to generate false positives because KCSAN will have no way of knowing @@ -279,19 +285,34 @@ tells KCSAN that data races are expected, and should be silently ignored. This data_race() also tells the human reading the code that read_foo_diagnostic() might sometimes return a bogus value. -However, please note that your kernel must be built with -CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n in order for KCSAN to -detect a buggy lockless write. If you need KCSAN to detect such a -write even if that write did not change the value of foo, you also -need CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n. If you need KCSAN to -detect such a write happening in an interrupt handler running on the -same CPU doing the legitimate lock-protected write, you also need -CONFIG_KCSAN_INTERRUPT_WATCHER=y. With some or all of these Kconfig -options set properly, KCSAN can be quite helpful, although it is not -necessarily a full replacement for hardware watchpoints. On the other -hand, neither are hardware watchpoints a full replacement for KCSAN -because it is not always easy to tell hardware watchpoint to conditionally -trap on accesses. +If it is necessary to suppress compiler optimization and also detect +buggy lockless writes, read_foo_diagnostic() can be updated as follows: + + void read_foo_diagnostic(void) + { + pr_info("Current value of foo: %d\n", data_race(READ_ONCE(foo))); + } + +Alternatively, given that KCSAN is to ignore all accesses in this function, +this function can be marked __no_kcsan and the data_race() can be dropped: + + void __no_kcsan read_foo_diagnostic(void) + { + pr_info("Current value of foo: %d\n", READ_ONCE(foo)); + } + +However, in order for KCSAN to detect buggy lockless writes, your kernel +must be built with CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n. If you +need KCSAN to detect such a write even if that write did not change +the value of foo, you also need CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n. +If you need KCSAN to detect such a write happening in an interrupt handler +running on the same CPU doing the legitimate lock-protected write, you +also need CONFIG_KCSAN_INTERRUPT_WATCHER=y. With some or all of these +Kconfig options set properly, KCSAN can be quite helpful, although +it is not necessarily a full replacement for hardware watchpoints. +On the other hand, neither are hardware watchpoints a full replacement +for KCSAN because it is not always easy to tell hardware watchpoint to +conditionally trap on accesses. Lock-Protected Writes With Lockless Reads