Commit 67a4958

---
r: 22398
b: refs/heads/master
c: 5812499
h: refs/heads/master
v: v3
Tony Luck committed Mar 21, 2006
1 parent 2a8092e commit 67a4958
Showing 2,280 changed files with 52,496 additions and 35,035 deletions.
2 changes: 1 addition & 1 deletion [refs]
@@ -1,2 +1,2 @@
---
refs/heads/master: b0a06623dc4caf6dfb6a84419507643471676d20
refs/heads/master: 581249966ffeb0463bad1b0e087e1bb29ed53707
9 changes: 3 additions & 6 deletions trunk/CREDITS
@@ -120,7 +120,6 @@ D: Author of lil (Linux Interrupt Latency benchmark)
D: Fixed the shm swap deallocation at swapoff time (try_to_unuse message)
D: VM hacker
D: Various other kernel hacks
S: Via Cicalini 26
S: Imola 40026
S: Italy

@@ -3101,7 +3100,7 @@ S: Minto, NSW, 2566
S: Australia

N: Stephen Smalley
E: sds@epoch.ncsc.mil
E: sds@tycho.nsa.gov
D: portions of the Linux Security Module (LSM) framework and security modules

N: Chris Smith
@@ -3643,11 +3642,9 @@ S: Cambridge. CB1 7EG
S: England

N: Chris Wright
E: chrisw@osdl.org
E: chrisw@sous-sol.org
D: hacking on LSM framework and security modules.
S: c/o OSDL
S: 12725 SW Millikan Way, Suite 400
S: Beaverton, OR 97005
S: Portland, OR
S: USA

N: Michal Wronski
25 changes: 14 additions & 11 deletions trunk/Documentation/RCU/RTFP.txt
@@ -90,16 +90,20 @@ at OLS. The resulting abundance of RCU patches was presented the
following year [McKenney02a], and use of RCU in dcache was first
described that same year [Linder02a].

Also in 2002, Michael [Michael02b,Michael02a] presented techniques
that defer the destruction of data structures to simplify non-blocking
synchronization (wait-free synchronization, lock-free synchronization,
and obstruction-free synchronization are all examples of non-blocking
synchronization). In particular, this technique eliminates locking,
reduces contention, reduces memory latency for readers, and parallelizes
pipeline stalls and memory latency for writers. However, these
techniques still impose significant read-side overhead in the form of
memory barriers. Researchers at Sun worked along similar lines in the
same timeframe [HerlihyLM02,HerlihyLMS03].
Also in 2002, Michael [Michael02b,Michael02a] presented "hazard-pointer"
techniques that defer the destruction of data structures to simplify
non-blocking synchronization (wait-free synchronization, lock-free
synchronization, and obstruction-free synchronization are all examples of
non-blocking synchronization). In particular, this technique eliminates
locking, reduces contention, reduces memory latency for readers, and
parallelizes pipeline stalls and memory latency for writers. However,
these techniques still impose significant read-side overhead in the
form of memory barriers. Researchers at Sun worked along similar lines
in the same timeframe [HerlihyLM02,HerlihyLMS03]. These techniques
can be thought of as inside-out reference counts, where the count is
represented by the number of hazard pointers referencing a given data
structure (rather than the more conventional counter field within the
data structure itself).
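As a rough sketch of the hazard-pointer idea (illustrative only, not
taken from the cited papers; the single hp_slot and the function names
are hypothetical, and real implementations support many threads and
batch their reclamation):

	#include <stdatomic.h>
	#include <stdlib.h>

	struct node { int data; struct node *next; };

	static _Atomic(struct node *) hp_slot;	/* one hazard pointer */

	static struct node *hp_acquire(_Atomic(struct node *) *src)
	{
		struct node *p;

		do {
			p = atomic_load(src);
			atomic_store(&hp_slot, p); /* publish; implies a barrier */
		} while (p != atomic_load(src));   /* re-check reachability */
		return p;			   /* now safe to dereference */
	}

	static void hp_release(void)
	{
		atomic_store(&hp_slot, NULL);
	}

	static void hp_retire(struct node *p)
	{
		/* Destroy a removed node only when no hazard pointer
		 * covers it; real code defers the node to a retire list
		 * instead of spinning. */
		while (atomic_load(&hp_slot) == p)
			;
		free(p);
	}

The read-side atomic_store() is where the memory-barrier overhead noted
above comes from.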

In 2003, the K42 group described how RCU could be used to create
hot-pluggable implementations of operating-system functions. Later that
@@ -113,7 +117,6 @@ number of operating-system kernels [PaulEdwardMcKenneyPhD], a paper
describing how to make RCU safe for soft-realtime applications [Sarma04c],
and a paper describing SELinux performance with RCU [JamesMorris04b].


2005 has seen further adaptation of RCU to realtime use, permitting
preemption of RCU realtime critical sections [PaulMcKenney05a,
PaulMcKenney05b].
6 changes: 6 additions & 0 deletions trunk/Documentation/RCU/checklist.txt
@@ -177,3 +177,9 @@ over a rather long period of time, but improvements are always welcome!

If you want to wait for some of these other things, you might
instead need to use synchronize_irq() or synchronize_sched().

12. Any lock acquired by an RCU callback must be acquired elsewhere
with irq disabled, e.g., via spin_lock_irqsave(). Failing to
disable irq on a given acquisition of that lock will result in
deadlock as soon as the RCU callback happens to interrupt that
acquisition's critical section.
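
	As an illustrative sketch of this rule (hypothetical lock and
	callback names; RCU callbacks run from softirq context):

		static DEFINE_SPINLOCK(my_lock);

		void process_context_update(void)
		{
			unsigned long flags;

			/* Without _irqsave here, the RCU callback below
			 * could interrupt this critical section and
			 * self-deadlock attempting to acquire my_lock. */
			spin_lock_irqsave(&my_lock, flags);
			/* ... update data shared with the callback ... */
			spin_unlock_irqrestore(&my_lock, flags);
		}

		static void my_rcu_callback(struct rcu_head *head)
		{
			/* Callback context: plain spin_lock() suffices,
			 * since all other acquisitions disable irqs. */
			spin_lock(&my_lock);
			/* ... */
			spin_unlock(&my_lock);
		}
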
21 changes: 12 additions & 9 deletions trunk/Documentation/RCU/listRCU.txt
@@ -232,7 +232,7 @@ entry does not exist. For this to be helpful, the search function must
return holding the per-entry spinlock, as ipc_lock() does in fact do.

Quick Quiz: Why does the search function need to return holding the
per-entry lock for this deleted-flag technique to be helpful?

If the system-call audit module were to ever need to reject stale data,
one way to accomplish this would be to add a "deleted" flag and a "lock"
@@ -275,8 +275,8 @@ flag under the spinlock as follows:
{
	struct audit_entry *e;

	/* Do not use the _rcu iterator here, since this is the only
	 * deletion routine. */
	/* Do not need to use the _rcu iterator here, since this
	 * is the only deletion routine. */
	list_for_each_entry(e, list, list) {
		if (!audit_compare_rule(rule, &e->rule)) {
			spin_lock(&e->lock);
@@ -304,9 +304,12 @@ function to reject newly deleted data.


Answer to Quick Quiz

If the search function drops the per-entry lock before returning, then
the caller will be processing stale data in any case. If it is really
OK to be processing stale data, then you don't need a "deleted" flag.
If processing stale data really is a problem, then you need to hold the
per-entry lock across all of the code that uses the value looked up.
Why does the search function need to return holding the per-entry
lock for this deleted-flag technique to be helpful?

If the search function drops the per-entry lock before returning,
then the caller will be processing stale data in any case. If it
is really OK to be processing stale data, then you don't need a
"deleted" flag. If processing stale data really is a problem,
then you need to hold the per-entry lock across all of the code
that uses the value that was returned.
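
A sketch of such a search function, following the audit example's naming
(the function itself is illustrative, not actual kernel source):

	static struct audit_entry *audit_search(struct list_head *list,
						struct audit_rule *rule)
	{
		struct audit_entry *e;

		rcu_read_lock();
		list_for_each_entry_rcu(e, list, list) {
			if (!audit_compare_rule(rule, &e->rule)) {
				spin_lock(&e->lock);
				if (e->deleted) {
					/* Raced with deletion: stale. */
					spin_unlock(&e->lock);
					rcu_read_unlock();
					return NULL;
				}
				rcu_read_unlock();
				return e; /* per-entry lock still held */
			}
		}
		rcu_read_unlock();
		return NULL;
	}
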
5 changes: 5 additions & 0 deletions trunk/Documentation/RCU/rcu.txt
@@ -111,6 +111,11 @@ o What are all these files in this directory?

You are reading it!

rcuref.txt

Describes how to combine use of reference counts
with RCU.

whatisRCU.txt

Overview of how the RCU implementation works. Along
31 changes: 15 additions & 16 deletions trunk/Documentation/RCU/rcuref.txt
@@ -1,7 +1,7 @@
Refcounter design for elements of lists/arrays protected by RCU.
Reference-count design for elements of lists/arrays protected by RCU.

Refcounting on elements of lists which are protected by traditional
reader/writer spinlocks or semaphores are straight forward as in:
Reference counting on elements of lists which are protected by traditional
reader/writer spinlocks or semaphores is straightforward:

1. 2.
add() search_and_reference()
@@ -28,12 +28,12 @@ release_referenced() delete()
...
}

If this list/array is made lock free using rcu as in changing the
write_lock in add() and delete() to spin_lock and changing read_lock
If this list/array is made lock free using RCU as in changing the
write_lock() in add() and delete() to spin_lock and changing read_lock
in search_and_reference to rcu_read_lock(), the atomic_get in
search_and_reference could potentially hold reference to an element which
has already been deleted from the list/array. atomic_inc_not_zero takes
care of this scenario. search_and_reference should look as;
has already been deleted from the list/array. Use atomic_inc_not_zero()
in this scenario as follows:

1. 2.
add() search_and_reference()
@@ -51,17 +51,16 @@ add() search_and_reference()
release_referenced()                    delete()
{                                       {
    ...                                     write_lock(&list_lock);
    atomic_dec(&el->rc, relfunc)            ...
    ...                                     delete_element
}                                           write_unlock(&list_lock);
    if (atomic_dec_and_test(&el->rc))       ...
        call_rcu(&el->head, el_free);       delete_element
    ...                                     write_unlock(&list_lock);
}                                           ...
                                            if (atomic_dec_and_test(&el->rc))
                                                call_rcu(&el->head, el_free);
                                            ...
                                        }

Sometimes, reference to the element need to be obtained in the
update (write) stream. In such cases, atomic_inc_not_zero might be an
overkill since the spinlock serialising list updates are held. atomic_inc
is to be used in such cases.

Sometimes, a reference to the element needs to be obtained in the
update (write) stream. In such cases, atomic_inc_not_zero() might be
overkill, since we hold the update-side spinlock. One might instead
use atomic_inc() in such cases.
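
For clarity, column 2 of the second example might be written out in a
single column as follows (a sketch; search_for_element() and FAIL are
placeholders carried over from the pseudo-code above):

	int search_and_reference(void)
	{
		struct element *el;

		rcu_read_lock();
		el = search_for_element();
		/* The element may already be unlinked with its last
		 * reference gone, so a bare atomic_inc() is unsafe. */
		if (!el || !atomic_inc_not_zero(&el->rc)) {
			rcu_read_unlock();
			return FAIL;
		}
		rcu_read_unlock();
		/* ... use el, then call release_referenced() ... */
		return 0;
	}
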
29 changes: 17 additions & 12 deletions trunk/Documentation/RCU/whatisRCU.txt
@@ -200,10 +200,11 @@ rcu_assign_pointer()
the new value, and also executes any memory-barrier instructions
required for a given CPU architecture.

Perhaps more important, it serves to document which pointers
are protected by RCU. That said, rcu_assign_pointer() is most
frequently used indirectly, via the _rcu list-manipulation
primitives such as list_add_rcu().
Perhaps just as important, it serves to document (1) which
pointers are protected by RCU and (2) the point at which a
given structure becomes accessible to other CPUs. That said,
rcu_assign_pointer() is most frequently used indirectly, via
the _rcu list-manipulation primitives such as list_add_rcu().

rcu_dereference()

@@ -258,9 +259,11 @@ rcu_dereference()
locking.

As with rcu_assign_pointer(), an important function of
rcu_dereference() is to document which pointers are protected
by RCU. And, again like rcu_assign_pointer(), rcu_dereference()
is typically used indirectly, via the _rcu list-manipulation
rcu_dereference() is to document which pointers are protected by
RCU, in particular, flagging a pointer that is subject to changing
at any time, including immediately after the rcu_dereference().
And, again like rcu_assign_pointer(), rcu_dereference() is
typically used indirectly, via the _rcu list-manipulation
primitives, such as list_for_each_entry_rcu().
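
	For example, the two primitives pair up as follows (a minimal
	sketch reusing the gbl_foo pointer and struct foo from the
	example in section 3):

		void publish_foo(struct foo *new_fp)
		{
			new_fp->a = 42;	/* initialize first... */
			rcu_assign_pointer(gbl_foo, new_fp); /* ...publish */
		}

		int read_foo(void)
		{
			struct foo *fp;
			int a;

			rcu_read_lock();
			fp = rcu_dereference(gbl_foo);	/* value may change
							 * immediately after */
			a = fp->a;	/* use only within the read-side
					 * critical section */
			rcu_read_unlock();
			return a;
		}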

The following diagram shows how each API communicates among the
@@ -327,7 +330,7 @@ for specialized uses, but are relatively uncommon.
3. WHAT ARE SOME EXAMPLE USES OF CORE RCU API?

This section shows a simple use of the core RCU API to protect a
global pointer to a dynamically allocated structure. More typical
global pointer to a dynamically allocated structure. More-typical
uses of RCU may be found in listRCU.txt, arrayRCU.txt, and NMI-RCU.txt.

struct foo {
@@ -410,6 +413,8 @@ o Use synchronize_rcu() -after- removing a data element from an
data item.

See checklist.txt for additional rules to follow when using RCU.
And again, more-typical uses of RCU may be found in listRCU.txt,
arrayRCU.txt, and NMI-RCU.txt.
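
The synchronize_rcu()-before-free rule above is the pattern followed by
foo_update_a() in section 3, sketched here (gbl_foo and the foo_mutex
spinlock are assumed from that example):

	void foo_update_a(int new_a)
	{
		struct foo *new_fp;
		struct foo *old_fp;

		new_fp = kmalloc(sizeof(*new_fp), GFP_KERNEL);
		spin_lock(&foo_mutex);		/* exclude other updaters */
		old_fp = gbl_foo;
		*new_fp = *old_fp;
		new_fp->a = new_a;
		rcu_assign_pointer(gbl_foo, new_fp);
		spin_unlock(&foo_mutex);
		synchronize_rcu();	/* wait for pre-existing readers */
		kfree(old_fp);		/* no reader can now hold a reference */
	}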


4. WHAT IF MY UPDATING THREAD CANNOT BLOCK?
Expand Down Expand Up @@ -513,7 +518,7 @@ production-quality implementation, and see:

for papers describing the Linux kernel RCU implementation. The OLS'01
and OLS'02 papers are a good introduction, and the dissertation provides
more details on the current implementation.
more details on the current implementation as of early 2004.


5A. "TOY" IMPLEMENTATION #1: LOCKING
@@ -768,7 +773,6 @@ RCU pointer/list traversal:
rcu_dereference
list_for_each_rcu (to be deprecated in favor of
list_for_each_entry_rcu)
list_for_each_safe_rcu (deprecated, not used)
list_for_each_entry_rcu
list_for_each_continue_rcu (to be deprecated in favor of new
list_for_each_entry_continue_rcu)
@@ -807,7 +811,8 @@ Quick Quiz #1: Why is this argument naive? How could a deadlock
Answer: Consider the following sequence of events:

1. CPU 0 acquires some unrelated lock, call it
"problematic_lock".
"problematic_lock", disabling irq via
spin_lock_irqsave().

2. CPU 1 enters synchronize_rcu(), write-acquiring
rcu_gp_mutex.
@@ -894,7 +899,7 @@ Answer: Just as PREEMPT_RT permits preemption of spinlock
ACKNOWLEDGEMENTS

My thanks to the people who helped make this human-readable, including
Jon Walpole, Josh Triplett, Serge Hallyn, and Suzanne Wood.
Jon Walpole, Josh Triplett, Serge Hallyn, Suzanne Wood, and Alan Stern.


For more information, see http://www.rdrop.com/users/paulmck/RCU.
27 changes: 24 additions & 3 deletions trunk/Documentation/cpu-hotplug.txt
@@ -11,6 +11,8 @@
Joel Schopp <jschopp@austin.ibm.com>
ia64/x86_64:
Ashok Raj <ashok.raj@intel.com>
s390:
Heiko Carstens <heiko.carstens@de.ibm.com>

Authors: Ashok Raj <ashok.raj@intel.com>
Lots of feedback: Nathan Lynch <nathanl@austin.ibm.com>,
@@ -44,9 +46,28 @@ maxcpus=n Restrict boot time cpus to n. Say if you have 4 cpus, using
maxcpus=2 will only boot 2. You can choose to bring the
other cpus later online, read FAQ's for more info.

additional_cpus=n [x86_64 only] use this to limit hotpluggable cpus.
This option sets
cpu_possible_map = cpu_present_map + additional_cpus
additional_cpus*=n	Use this to limit hotpluggable cpus. This option sets
			cpu_possible_map = cpu_present_map + additional_cpus

(*) Option valid only for the following architectures
    - x86_64, ia64, s390

ia64 and x86_64 use the number of disabled local apics in the ACPI MADT
tables to determine the number of potentially hot-pluggable cpus. The
implementation should only rely on this to count the number of cpus, but
*MUST* not rely on the apicid values in those tables for disabled apics.
In the event the BIOS doesn't mark such hot-pluggable cpus as disabled
entries, one could use the parameter "additional_cpus=x" to represent
those cpus in the cpu_possible_map.

s390 uses the number of cpus it detects at IPL time to also set the number
of bits in cpu_possible_map. If it is desired to add additional cpus at a
later time the number should be specified using this option or the
possible_cpus option.

possible_cpus=n		[s390 only] use this to set hotpluggable cpus.
			This option sets possible_cpus bits in
			cpu_possible_map, thus keeping the number of bits set
			constant even if the machine gets rebooted.
			This option overrides additional_cpus.
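
As an illustration (values are examples only), an x86_64 box whose BIOS
does not mark its two empty cpu sockets as disabled apic entries might
reserve space for them with the boot parameter:

	additional_cpus=2

and an s390 machine that IPLs with 2 cpus but should later accept 2 more
might instead specify:

	possible_cpus=4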

CPU maps and such
-----------------
41 changes: 14 additions & 27 deletions trunk/Documentation/cpusets.txt
@@ -4,8 +4,9 @@
Copyright (C) 2004 BULL SA.
Written by Simon.Derr@bull.net

Portions Copyright (c) 2004 Silicon Graphics, Inc.
Portions Copyright (c) 2004-2006 Silicon Graphics, Inc.
Modified by Paul Jackson <pj@sgi.com>
Modified by Christoph Lameter <clameter@sgi.com>

CONTENTS:
=========
@@ -90,7 +91,8 @@ This can be especially valuable on:

These subsets, or "soft partitions" must be able to be dynamically
adjusted, as the job mix changes, without impacting other concurrently
executing jobs.
executing jobs. The location of a running job's pages may also be moved
when the job's allowed memory locations are changed.

The kernel cpuset patch provides the minimum essential kernel
mechanisms required to efficiently implement such subsets. It
@@ -102,8 +104,8 @@ memory allocator code.
1.3 How are cpusets implemented ?
---------------------------------

Cpusets provide a Linux kernel (2.6.7 and above) mechanism to constrain
which CPUs and Memory Nodes are used by a process or set of processes.
Cpusets provide a Linux kernel mechanism to constrain which CPUs and
Memory Nodes are used by a process or set of processes.

The Linux kernel already has a pair of mechanisms to specify on which
CPUs a task may be scheduled (sched_setaffinity) and on which Memory
@@ -371,22 +373,17 @@ cpusets memory placement policy 'mems' subsequently changes.
If the cpuset flag file 'memory_migrate' is set true, then when
tasks are attached to that cpuset, any pages that task had
allocated to it on nodes in its previous cpuset are migrated
to the tasks new cpuset. Depending on the implementation,
this migration may either be done by swapping the page out,
so that the next time the page is referenced, it will be paged
into the tasks new cpuset, usually on the node where it was
referenced, or this migration may be done by directly copying
the pages from the tasks previous cpuset to the new cpuset,
where possible to the same node, relative to the new cpuset,
as the node that held the page, relative to the old cpuset.
to the task's new cpuset. The relative placement of a page within
the cpuset is preserved during these migration operations if possible.
For example, if the page was on the second valid node of the prior cpuset
then the page will be placed on the second valid node of the new cpuset.

Also if 'memory_migrate' is set true, then if that cpusets
'mems' file is modified, pages allocated to tasks in that
cpuset, that were on nodes in the previous setting of 'mems',
will be moved to nodes in the new setting of 'mems.' Again,
depending on the implementation, this might be done by swapping,
or by direct copying. In either case, pages that were not in
the tasks prior cpuset, or in the cpusets prior 'mems' setting,
will not be moved.
will be moved to nodes in the new setting of 'mems.'
Pages that were not in the task's prior cpuset, or in the cpuset's
prior 'mems' setting, will not be moved.
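
For example, assuming the cpuset filesystem is mounted at /dev/cpuset as
in the example session below, page migration for the cpuset Charlie could
be enabled, and a task attached, as follows (illustrative only):

	echo 1 > /dev/cpuset/Charlie/memory_migrate
	echo $$ > /dev/cpuset/Charlie/tasks	# pages migrate on attach
	echo 2-3 > /dev/cpuset/Charlie/mems	# pages follow 'mems' changes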

There is an exception to the above. If hotplug functionality is used
to remove all the CPUs that are currently assigned to a cpuset,
@@ -434,16 +431,6 @@ and then start a subshell 'sh' in that cpuset:
# The next line should display '/Charlie'
cat /proc/self/cpuset

In the case that a change of cpuset includes wanting to move already
allocated memory pages, consider further the work of IWAMOTO
Toshihiro <iwamoto@valinux.co.jp> for page remapping and memory
hotremoval, which can be found at:

http://people.valinux.co.jp/~iwamoto/mh.html

The integration of cpusets with such memory migration is not yet
available.

In the future, a C library interface to cpusets will likely be
available. For now, the only way to query or modify cpusets is
via the cpuset file system, using the various cd, mkdir, echo, cat,