yaml
---
r: 30078
b: refs/heads/master
c: 25581ad
h: refs/heads/master
v: v3
Linus Torvalds committed Jun 25, 2006
1 parent 2cdc288 commit 309e88b
Showing 355 changed files with 13,502 additions and 9,088 deletions.
2 changes: 1 addition & 1 deletion [refs]
@@ -1,2 +1,2 @@
---
refs/heads/master: 7477ddaa4d2d69bbcd49e12990af158dbb03f2f2
refs/heads/master: 25581ad107be24b89d805da51a03d616f8f3d1be
6 changes: 1 addition & 5 deletions trunk/CREDITS
@@ -1573,12 +1573,8 @@ S: 160 00 Praha 6
S: Czech Republic

N: Niels Kristian Bech Jensen
E: nkbj@image.dk
W: http://www.image.dk/~nkbj
E: nkbj1970@hotmail.com
D: Miscellaneous kernel updates and fixes.
S: Dr. Holsts Vej 34, lejl. 164
S: DK-8230 Åbyhøj
S: Denmark

N: Michael K. Johnson
E: johnsonm@redhat.com
44 changes: 42 additions & 2 deletions trunk/Documentation/DocBook/kernel-api.tmpl
@@ -62,6 +62,8 @@
<sect1><title>Internal Functions</title>
!Ikernel/exit.c
!Ikernel/signal.c
!Iinclude/linux/kthread.h
!Ekernel/kthread.c
</sect1>

<sect1><title>Kernel objects manipulation</title>
@@ -114,6 +116,29 @@ X!Ilib/string.c
</sect1>
</chapter>

<chapter id="kernel-lib">
<title>Basic Kernel Library Functions</title>

<para>
The Linux kernel provides more basic utility functions.
</para>

<sect1><title>Bitmap Operations</title>
!Elib/bitmap.c
!Ilib/bitmap.c
</sect1>

<sect1><title>Command-line Parsing</title>
!Elib/cmdline.c
</sect1>

<sect1><title>CRC Functions</title>
!Elib/crc16.c
!Elib/crc32.c
!Elib/crc-ccitt.c
</sect1>
</chapter>

<chapter id="mm">
<title>Memory Management in Linux</title>
<sect1><title>The Slab Cache</title>
@@ -281,12 +306,13 @@ X!Ekernel/module.c
<sect1><title>MTRR Handling</title>
!Earch/i386/kernel/cpu/mtrr/main.c
</sect1>

<sect1><title>PCI Support Library</title>
!Edrivers/pci/pci.c
!Edrivers/pci/pci-driver.c
!Edrivers/pci/remove.c
!Edrivers/pci/pci-acpi.c
<!-- kerneldoc does not understand to __devinit
<!-- kerneldoc does not understand __devinit
X!Edrivers/pci/search.c
-->
!Edrivers/pci/msi.c
@@ -315,6 +341,13 @@ X!Earch/i386/kernel/mca.c
</sect1>
</chapter>

<chapter id="firmware">
<title>Firmware Interfaces</title>
<sect1><title>DMI Interfaces</title>
!Edrivers/firmware/dmi_scan.c
</sect1>
</chapter>

<chapter id="devfs">
<title>The Device File System</title>
!Efs/devfs/base.c
@@ -403,7 +436,6 @@ X!Edrivers/pnp/system.c
</sect1>
</chapter>


<chapter id="blkdev">
<title>Block Devices</title>
!Eblock/ll_rw_blk.c
@@ -414,6 +446,14 @@ X!Edrivers/pnp/system.c
!Edrivers/char/misc.c
</chapter>

<chapter id="parportdev">
<title>Parallel Port Devices</title>
!Iinclude/linux/parport.h
!Edrivers/parport/ieee1284.c
!Edrivers/parport/share.c
!Idrivers/parport/daisy.c
</chapter>

<chapter id="viddev">
<title>Video4Linux</title>
!Edrivers/media/video/videodev.c
44 changes: 41 additions & 3 deletions trunk/Documentation/RCU/checklist.txt
@@ -144,9 +144,47 @@ over a rather long period of time, but improvements are always welcome!
whether the increased speed is worth it.

8. Although synchronize_rcu() is a bit slower than is call_rcu(),
it usually results in simpler code. So, unless update performance
is important or the updaters cannot block, synchronize_rcu()
should be used in preference to call_rcu().
it usually results in simpler code. So, unless update
performance is critically important or the updaters cannot block,
synchronize_rcu() should be used in preference to call_rcu().

An especially important property of the synchronize_rcu()
primitive is that it automatically self-limits: if grace periods
are delayed for whatever reason, then the synchronize_rcu()
primitive will correspondingly delay updates. In contrast,
code using call_rcu() should explicitly limit update rate in
cases where grace periods are delayed, as failing to do so can
result in excessive realtime latencies or even OOM conditions.

Ways of gaining this self-limiting property when using call_rcu()
include:

a. Keeping a count of the number of data-structure elements
used by the RCU-protected data structure, including those
waiting for a grace period to elapse. Enforce a limit
on this number, stalling updates as needed to allow
previously deferred frees to complete.

Alternatively, limit only the number awaiting deferred
free rather than the total number of elements.

b. Limiting update rate. For example, if updates occur only
once per hour, then no explicit rate limiting is required,
unless your system is already badly broken. The dcache
subsystem takes this approach -- updates are guarded
by a global lock, limiting their rate.

c. Trusted update -- if updates can only be done manually by
superuser or some other trusted user, then it might not
be necessary to automatically limit them. The theory
here is that superuser already has lots of ways to crash
the machine.

d. Use call_rcu_bh() rather than call_rcu(), in order to take
advantage of call_rcu_bh()'s faster grace periods.

e. Periodically invoke synchronize_rcu(), permitting a limited
number of updates per grace period.

9. All RCU list-traversal primitives, which include
list_for_each_rcu(), list_for_each_entry_rcu(),
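As an illustration of approach (a) in item 8 of the checklist.txt hunk above, here is a minimal Python model. It is not kernel code. It caps the number of elements awaiting deferred free and stalls the updater when the cap is reached. The class DeferredFreeLimiter, its methods, and the max_pending parameter are hypothetical names invented for this sketch, and a real grace period is replaced by an explicit method call.

```python
from collections import deque


class DeferredFreeLimiter:
    """Toy model: cap the number of elements awaiting deferred free."""

    def __init__(self, max_pending):
        self.max_pending = max_pending
        self.pending = deque()  # elements queued until a grace period elapses
        self.freed = []         # elements whose deferred free has run
        self.stalls = 0         # number of times an update had to wait

    def grace_period_elapsed(self):
        """Stand-in for the end of a grace period: run all deferred frees."""
        while self.pending:
            self.freed.append(self.pending.popleft())

    def defer_free(self, element):
        """Stand-in for a call_rcu()-style deferred free with a limit."""
        if len(self.pending) >= self.max_pending:
            # Limit reached: the updater stalls until previously deferred
            # frees complete (modelled here by ending the grace period).
            self.stalls += 1
            self.grace_period_elapsed()
        self.pending.append(element)


limiter = DeferredFreeLimiter(max_pending=3)
for i in range(10):
    limiter.defer_free(i)
```

In this model the backlog never grows beyond max_pending, which is the self-limiting property that synchronize_rcu() provides automatically.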
12 changes: 11 additions & 1 deletion trunk/Documentation/RCU/whatisRCU.txt
@@ -184,7 +184,17 @@ synchronize_rcu()
blocking, it registers a function and argument which are invoked
after all ongoing RCU read-side critical sections have completed.
This callback variant is particularly useful in situations where
it is illegal to block.
it is illegal to block or where update-side performance is
critically important.

However, the call_rcu() API should not be used lightly, as use
of the synchronize_rcu() API generally results in simpler code.
In addition, the synchronize_rcu() API has the nice property
of automatically limiting update rate should grace periods
be delayed. This property results in system resilience in face
of denial-of-service attacks. Code using call_rcu() should limit
update rate in order to gain this same sort of resilience. See
checklist.txt for some approaches to limiting the update rate.

rcu_assign_pointer()

7 changes: 6 additions & 1 deletion trunk/Documentation/devices.txt
@@ -3,7 +3,7 @@

Maintained by Torben Mathiasen <device@lanana.org>

Last revised: 01 March 2006
Last revised: 15 May 2006

This list is the Linux Device List, the official registry of allocated
device numbers and /dev directory nodes for the Linux operating
@@ -2791,6 +2791,7 @@ Your cooperation is appreciated.
170 = /dev/ttyNX0 Hilscher netX serial port 0
...
185 = /dev/ttyNX15 Hilscher netX serial port 15
186 = /dev/ttyJ0 JTAG1 DCC protocol based serial port emulation

205 char Low-density serial ports (alternate device)
0 = /dev/culu0 Callout device for ttyLU0
@@ -3108,6 +3109,10 @@ Your cooperation is appreciated.
...
240 = /dev/rfdp 16th RFD FTL layer

257 char Phoenix Technologies Cryptographic Services Driver
0 = /dev/ptlsec Crypto Services Driver



**** ADDITIONAL /dev DIRECTORY ENTRIES

118 changes: 71 additions & 47 deletions trunk/Documentation/filesystems/fuse.txt
@@ -18,6 +18,14 @@ Non-privileged mount (or user mount):
user. NOTE: this is not the same as mounts allowed with the "user"
option in /etc/fstab, which is not discussed here.

Filesystem connection:

A connection between the filesystem daemon and the kernel. The
connection exists until either the daemon dies, or the filesystem is
umounted. Note that detaching (or lazy umounting) the filesystem
does _not_ break the connection; in this case it will exist until
the last reference to the filesystem is released.

Mount owner:

The user who does the mounting.
@@ -86,16 +94,20 @@ Mount options
The default is infinite. Note that the size of read requests is
limited anyway to 32 pages (which is 128kbyte on i386).

Sysfs
~~~~~
Control filesystem
~~~~~~~~~~~~~~~~~~

There's a control filesystem for FUSE, which can be mounted by:

FUSE sets up the following hierarchy in sysfs:
mount -t fusectl none /sys/fs/fuse/connections

/sys/fs/fuse/connections/N/
Mounting it under the '/sys/fs/fuse/connections' directory makes it
backwards compatible with earlier versions.

where N is an increasing number allocated to each new connection.
Under the fuse control filesystem each connection has a directory
named by a unique number.

For each connection the following attributes are defined:
For each connection the following files exist within this directory:

'waiting'

@@ -110,7 +122,47 @@ For each connection the following attributes are defined:
connection. This means that all waiting requests will be aborted and an
error returned for all aborted and new requests.

Only a privileged user may read or write these attributes.
Only the owner of the mount may read or write these files.
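
The per-connection files described above could be accessed from a script along these lines. This is a hedged sketch: the function names and the control_dir parameter are hypothetical, and it assumes that 'waiting' holds a decimal count and that writing any data to 'abort' triggers the abort, neither of which is spelled out in this hunk.

```python
import os


def list_connections(control_dir="/sys/fs/fuse/connections"):
    """Return {connection_name: waiting_count} for each readable connection."""
    result = {}
    for name in sorted(os.listdir(control_dir)):
        waiting_path = os.path.join(control_dir, name, "waiting")
        try:
            with open(waiting_path) as f:
                result[name] = int(f.read().strip())
        except (OSError, ValueError):
            # Not the mount owner, the connection vanished, or the file
            # did not contain a number (assumption: it normally does).
            continue
    return result


def abort_connection(name, control_dir="/sys/fs/fuse/connections"):
    """Write to the connection's 'abort' file (assumed: any write aborts)."""
    with open(os.path.join(control_dir, name, "abort"), "w") as f:
        f.write("1")
```

Both functions only succeed for connections owned by the calling user, since only the owner of the mount may read or write these files.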

Interrupting filesystem operations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

If a process issuing a FUSE filesystem request is interrupted, the
following will happen:

1) If the request is not yet sent to userspace AND the signal is
fatal (SIGKILL or unhandled fatal signal), then the request is
dequeued and returns immediately.

2) If the request is not yet sent to userspace AND the signal is not
fatal, then an 'interrupted' flag is set for the request. When
the request has been successfully transferred to userspace and
this flag is set, an INTERRUPT request is queued.

3) If the request is already sent to userspace, then an INTERRUPT
request is queued.

INTERRUPT requests take precedence over other requests, so the
userspace filesystem will receive queued INTERRUPTs before any others.
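
The three cases and the queueing precedence described above can be summarized in a small Python model. This is only a sketch of the documented behaviour, not the kernel implementation; on_signal, RequestQueue and the returned action strings are hypothetical names chosen for illustration.

```python
from collections import deque


def on_signal(sent_to_userspace, fatal):
    """Return the action taken for an interrupted request (cases 1-3)."""
    if not sent_to_userspace:
        if fatal:
            return "dequeue"             # case 1: request returns immediately
        return "set_interrupted_flag"    # case 2: INTERRUPT queued after transfer
    return "queue_interrupt"             # case 3: INTERRUPT queued right away


class RequestQueue:
    """Model of delivery order: queued INTERRUPTs go out before anything else."""

    def __init__(self):
        self.interrupts = deque()
        self.pending = deque()

    def next_request(self):
        if self.interrupts:
            return self.interrupts.popleft()
        if self.pending:
            return self.pending.popleft()
        return None
```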

The userspace filesystem may ignore the INTERRUPT requests entirely,
or may honor them by sending a reply to the _original_ request, with
the error set to EINTR.

It is also possible that there's a race between processing the
original request and its INTERRUPT request. There are two possibilities:

1) The INTERRUPT request is processed before the original request is
processed

2) The INTERRUPT request is processed after the original request has
been answered

If the filesystem cannot find the original request, it should wait for
some timeout and/or a number of new requests to arrive, after which it
should reply to the INTERRUPT request with an EAGAIN error. In case
1) the INTERRUPT request will be requeued. In case 2) the INTERRUPT
reply will be ignored.
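
On the userspace side, the handling of an INTERRUPT request described above might look roughly like the following. It is a simplified sketch for a daemon that chooses to honor interrupts: handle_interrupt, the in_flight set, the waited_enough flag and the returned tuples are hypothetical, and the "timeout and/or a number of new requests" condition is reduced to a boolean supplied by the caller.

```python
import errno


def handle_interrupt(in_flight, original_id, waited_enough):
    """Decide how a daemon might respond to an INTERRUPT request.

    in_flight is the set of ids of requests currently being processed.
    Returns (action, error), where action is one of "reply_original",
    "reply_interrupt" or "wait".
    """
    if original_id in in_flight:
        # Honor the interrupt: answer the _original_ request with EINTR.
        in_flight.discard(original_id)
        return ("reply_original", errno.EINTR)
    if not waited_enough:
        # The original request may not have arrived yet; keep waiting.
        return ("wait", None)
    # Original request not found after waiting: answer the INTERRUPT
    # itself with EAGAIN. It is then requeued (case 1) or ignored (case 2).
    return ("reply_interrupt", errno.EAGAIN)
```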

Aborting a filesystem connection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -139,8 +191,8 @@ the filesystem. There are several ways to do this:
- Use forced umount (umount -f). Works in all cases but only if
filesystem is still attached (it hasn't been lazy unmounted)

- Abort filesystem through the sysfs interface. Most powerful
method, always works.
- Abort filesystem through the FUSE control filesystem. Most
powerful method, always works.

How do non-privileged mounts work?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -304,25 +356,7 @@ Scenario 1 - Simple deadlock
| | for "file"]
| | *DEADLOCK*

The solution for this is to allow requests to be interrupted while
they are in userspace:

| [interrupted by signal] |
| <fuse_unlink() |
| [release semaphore] | [semaphore acquired]
| <sys_unlink() |
| | >fuse_unlink()
| | [queue req on fc->pending]
| | [wake up fc->waitq]
| | [sleep on req->waitq]

If the filesystem daemon was single threaded, this will stop here,
since there's no other thread to dequeue and execute the request.
In this case the solution is to kill the FUSE daemon as well. If
there are multiple serving threads, you just have to kill them as
long as any remain.

Moral: a filesystem which deadlocks, can soon find itself dead.
The solution for this is to allow the filesystem to be aborted.

Scenario 2 - Tricky deadlock
----------------------------
@@ -355,24 +389,14 @@ but is caused by a pagefault.
| | [lock page]
| | * DEADLOCK *

Solution is again to let the request be interrupted (not
elaborated further).

An additional problem is that while the write buffer is being
copied to the request, the request must not be interrupted. This
is because the destination address of the copy may not be valid
after the request is interrupted.

This is solved with doing the copy atomically, and allowing
interruption while the page(s) belonging to the write buffer are
faulted with get_user_pages(). The 'req->locked' flag indicates
when the copy is taking place, and interruption is delayed until
this flag is unset.
Solution is basically the same as above.

Scenario 3 - Tricky deadlock with asynchronous read
---------------------------------------------------
An additional problem is that while the write buffer is being copied
to the request, the request must not be interrupted/aborted. This is
because the destination address of the copy may not be valid after the
request has returned.

The same situation as above, except thread-1 will wait on page lock
and hence it will be uninterruptible as well. The solution is to
abort the connection with forced umount (if mount is attached) or
through the abort attribute in sysfs.
This is solved with doing the copy atomically, and allowing abort
while the page(s) belonging to the write buffer are faulted with
get_user_pages(). The 'req->locked' flag indicates when the copy is
taking place, and abort is delayed until this flag is unset.
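
The delayed-abort behaviour described above can be modelled in a few lines of Python. This is a conceptual sketch only: the Request class and its methods are hypothetical, the 'locked' attribute merely mirrors the role of 'req->locked', and real kernel code would need proper locking that is omitted here.

```python
class Request:
    """Toy model: an abort arriving during the copy is deferred."""

    def __init__(self):
        self.locked = False         # set while the buffer copy is in progress
        self.abort_pending = False  # abort requested while locked
        self.aborted = False

    def abort(self):
        if self.locked:
            # Copy in progress: remember the abort, apply it afterwards.
            self.abort_pending = True
        else:
            self.aborted = True

    def copy(self, do_copy):
        """Run do_copy() with the request marked as locked."""
        self.locked = True
        try:
            do_copy()
        finally:
            self.locked = False
        if self.abort_pending:
            self.abort_pending = False
            self.aborted = True
```

An abort() call made from within do_copy() takes effect only after copy() has cleared the flag, so the copy destination stays valid for the whole copy.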