---
yaml
---
r: 142247
b: refs/heads/master
c: 8e320d0
h: refs/heads/master
i:
  142245: e9ce68c
  142243: 9918b9a
  142239: 9e1f028
v: v3
Linus Torvalds committed Apr 6, 2009
1 parent 174ffc4 commit a0cad2a
Showing 1,392 changed files with 360,290 additions and 87,833 deletions.
2 changes: 1 addition & 1 deletion [refs]
@@ -1,2 +1,2 @@
---
-refs/heads/master: 303a0e11d0ee136ad8f53f747f3c377daece763b
+refs/heads/master: 8e320d02718d2872d52ef88a69a493e420494269
71 changes: 71 additions & 0 deletions trunk/Documentation/ABI/testing/debugfs-kmemtrace
@@ -0,0 +1,71 @@
What: /sys/kernel/debug/kmemtrace/
Date: July 2008
Contact: Eduard - Gabriel Munteanu <eduard.munteanu@linux360.ro>
Description:

In kmemtrace-enabled kernels, the following files are created:

/sys/kernel/debug/kmemtrace/
    cpu<n>          (0400)  Per-CPU tracing data, see below. (binary)
    total_overruns  (0400)  Total number of bytes which were dropped from
                            cpu<n> files because of full buffer condition,
                            non-binary. (text)
    abi_version     (0400)  Kernel's kmemtrace ABI version. (text)

Each per-CPU file should be read according to the relay interface. That is,
the reader should set affinity to that specific CPU and, as currently done by
the userspace application (though there are other methods), use poll() with
an infinite timeout before every read(). Otherwise, erroneous data may be
read. The binary data has the following _core_ format:

Event ID         (1 byte)   Unsigned integer, one of:
                     0 - represents an allocation (KMEMTRACE_EVENT_ALLOC)
                     1 - represents a freeing of previously allocated memory
                         (KMEMTRACE_EVENT_FREE)
Type ID          (1 byte)   Unsigned integer, one of:
                     0 - this is a kmalloc() / kfree()
                     1 - this is a kmem_cache_alloc() / kmem_cache_free()
                     2 - this is a __get_free_pages() et al.
Event size       (2 bytes)  Unsigned integer representing the
                            size of this event. Used to extend
                            kmemtrace. Discard the bytes you
                            don't know about.
Sequence number  (4 bytes)  Signed integer used to reorder data
                            logged on SMP machines. Wraparound
                            must be taken into account, although
                            it is unlikely.
Caller address   (8 bytes)  Return address to the caller.
Pointer to mem   (8 bytes)  Pointer to target memory area. Can be
                            NULL, but not all such calls might be
                            recorded.
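
The wraparound caveat on the sequence number can be handled with circular (serial-number) comparison. The following is a minimal sketch, not part of kmemtrace or its userspace tool; the helper name seq_before is hypothetical, and it assumes two events being compared are never more than 2**31 sequence numbers apart.

```python
def seq_before(a: int, b: int) -> bool:
    """Return True if 32-bit sequence number a was issued before b.

    The counter is treated as circular, so a value just past the
    wraparound point still sorts after a value just before it.
    """
    # Distance from a to b modulo 2**32. Python's & on negative ints
    # behaves like two's complement, so signed inputs are fine.
    diff = (b - a) & 0xFFFFFFFF
    # A non-zero distance in the lower half means b is ahead of a.
    return diff != 0 and diff < 0x80000000
```

A reader merging the per-CPU streams could use this predicate instead of a plain `<` when ordering events.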

In case of KMEMTRACE_EVENT_ALLOC events, the next fields follow:

Requested bytes  (8 bytes)  Total number of requested bytes,
                            unsigned, must not be zero.
Allocated bytes  (8 bytes)  Total number of actually allocated
                            bytes, unsigned, must not be lower
                            than requested bytes.
Requested flags  (4 bytes)  GFP flags supplied by the caller.
Target CPU       (4 bytes)  Signed integer, valid for event id 1.
                            If equal to -1, target CPU is the same
                            as origin CPU, but the reverse might
                            not be true.

The data is made available in the same endianness the machine has.
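
As an illustration of the layout above, here is a minimal Python sketch that decodes a buffer already read from a cpu<n> file. It is not the kmemtrace-user implementation. It assumes the fields are laid out back to back with no padding, and that "event size" counts the whole event including the core header; the function and dictionary key names are hypothetical.

```python
import struct

# Core header: event id, type id, event size, sequence number,
# caller address, pointer to memory.
# "=" selects native byte order with no implicit padding.
CORE = struct.Struct("=BBHiQQ")
# Extra fields that follow the core header for allocation events:
# requested bytes, allocated bytes, GFP flags, target CPU.
ALLOC = struct.Struct("=QQIi")

KMEMTRACE_EVENT_ALLOC = 0
KMEMTRACE_EVENT_FREE = 1


def parse_events(buf: bytes):
    """Yield one dict per complete event found in buf."""
    offset = 0
    while offset + CORE.size <= len(buf):
        event_id, type_id, size, seq, caller, ptr = CORE.unpack_from(buf, offset)
        # Assumption: event size covers the whole event, header included.
        if size < CORE.size or offset + size > len(buf):
            break  # truncated or malformed data; stop rather than guess
        event = {"event_id": event_id, "type_id": type_id, "seq": seq,
                 "caller": caller, "ptr": ptr}
        if event_id == KMEMTRACE_EVENT_ALLOC and size >= CORE.size + ALLOC.size:
            req, alloc, gfp, cpu = ALLOC.unpack_from(buf, offset + CORE.size)
            event.update(bytes_req=req, bytes_alloc=alloc,
                         gfp_flags=gfp, target_cpu=cpu)
        yield event
        # Advance by the declared size, discarding bytes we do not know about.
        offset += size
```

Advancing by the declared event size, rather than by the size of the fields the parser knows, is what lets an older reader tolerate events extended by a newer kernel.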

Other event ids and type ids may be defined and added. Other fields may be
added by increasing event size, but see below for details.
Every modification to the ABI, including new id definitions, is followed
by bumping the ABI version by one.

Adding new data to the packet (features) is done at the end of the mandatory
data:
    Feature size  (2 bytes)
    Feature ID    (1 byte)
    Feature data  (Feature size - 3 bytes)
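
The feature layout above can be walked as a sequence of size-prefixed records. The sketch below is an illustration only: it assumes the caller has already sliced off the bytes that follow the mandatory data, that features are stored back to back, and that feature size includes its own 3-byte header (as the "Feature size - 3 bytes" data length implies). The name parse_features is hypothetical.

```python
import struct

# Feature header: feature size (2 bytes), feature id (1 byte),
# native byte order, no padding.
FEATURE_HDR = struct.Struct("=HB")


def parse_features(tail: bytes):
    """Split the bytes after the mandatory data into (feature_id, data) pairs."""
    features = []
    offset = 0
    while offset + FEATURE_HDR.size <= len(tail):
        size, feature_id = FEATURE_HDR.unpack_from(tail, offset)
        if size < FEATURE_HDR.size or offset + size > len(tail):
            break  # malformed or truncated feature; stop here
        data = tail[offset + FEATURE_HDR.size:offset + size]
        features.append((feature_id, data))
        offset += size
    return features
```

A reader that does not recognise a feature id can simply ignore that pair and move on to the next one.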


Users:
kmemtrace-user - git://repo.or.cz/kmemtrace-user.git

2 changes: 1 addition & 1 deletion trunk/Documentation/DocBook/kernel-api.tmpl
@@ -259,7 +259,7 @@ X!Earch/x86/kernel/mca_32.c
!Eblock/blk-tag.c
!Iblock/blk-tag.c
!Eblock/blk-integrity.c
-!Iblock/blktrace.c
+!Ikernel/trace/blktrace.c
!Iblock/genhd.c
!Eblock/genhd.c
</chapter>
3 changes: 2 additions & 1 deletion trunk/Documentation/feature-removal-schedule.txt
@@ -354,7 +354,8 @@ Who: Krzysztof Piotr Oledzki <ole@ans.pl>

---------------------------

-What: i2c_attach_client(), i2c_detach_client(), i2c_driver->detach_client()
+What: i2c_attach_client(), i2c_detach_client(), i2c_driver->detach_client(),
+      i2c_adapter->client_register(), i2c_adapter->client_unregister
When: 2.6.30
Check: i2c_attach_client i2c_detach_client
Why: Deprecated by the new (standard) device driver binding model. Use
159 changes: 159 additions & 0 deletions trunk/Documentation/filesystems/knfsd-stats.txt
@@ -0,0 +1,159 @@

Kernel NFS Server Statistics
============================

This document describes the format and semantics of the statistics
which the kernel NFS server makes available to userspace. These
statistics are available in several text form pseudo files, each of
which is described separately below.

In most cases you don't need to know these formats, as the nfsstat(8)
program from the nfs-utils distribution provides a helpful command-line
interface for extracting and printing them.

All the files described here are formatted as a sequence of text lines,
separated by newline '\n' characters. Lines beginning with a hash
'#' character are comments intended for humans and should be ignored
by parsing routines. All other lines contain a sequence of fields
separated by whitespace.
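
A parser following these rules can be very small. The sketch below is illustrative only; the function name is hypothetical, and the sample text in the usage note is made up rather than copied from a real server.

```python
def parse_stats_text(text: str):
    """Return the non-comment lines of a stats file as lists of fields."""
    rows = []
    for line in text.split("\n"):
        # Skip blank lines and '#' comments meant for humans.
        if not line.strip() or line.startswith("#"):
            continue
        # Fields are separated by arbitrary whitespace.
        rows.append(line.split())
    return rows
```

For example, `parse_stats_text("# a comment\n0 12 3\n")` yields a single row of three string fields; converting them to integers is left to the caller.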

/proc/fs/nfsd/pool_stats
------------------------

This file is available in kernels from 2.6.30 onwards, if the
/proc/fs/nfsd filesystem is mounted (it almost always should be).

The first line is a comment which describes the fields present in
all the other lines. The other lines present the following data as
a sequence of unsigned decimal numeric fields. One line is shown
for each NFS thread pool.

All counters are 64 bits wide and wrap naturally. There is no way
to zero these counters; instead, applications should do their own
rate conversion.
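
Rate conversion means sampling a counter twice and dividing the difference by the elapsed time. A minimal sketch, assuming the application records the sampling interval itself; the name counter_rate is hypothetical:

```python
MASK64 = (1 << 64) - 1


def counter_rate(prev: int, curr: int, seconds: float) -> float:
    """Return events per second between two samples of a 64-bit counter."""
    # Subtract modulo 2**64 so a counter that wrapped between the two
    # samples still yields the correct (small, positive) difference.
    delta = (curr - prev) & MASK64
    return delta / seconds
```

The modular subtraction is only correct if the counter wrapped at most once between samples, which holds for any realistic sampling interval on a 64-bit counter.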

pool
The id number of the NFS thread pool to which this line applies.
This number does not change.

Thread pool ids are a contiguous set of small integers starting
at zero. The maximum value depends on the thread pool mode, but
currently cannot be larger than the number of CPUs in the system.
Note that in the default case there will be a single thread pool
which contains all the nfsd threads and all the CPUs in the system,
and thus this file will have a single line with a pool id of "0".

packets-arrived
Counts how many NFS packets have arrived. More precisely, this
is the number of times that the network stack has notified the
sunrpc server layer that new data may be available on a transport
(e.g. a TCP or UDP socket or an NFS/RDMA endpoint).

Depending on the NFS workload patterns and various network stack
effects (such as Large Receive Offload) which can combine packets
on the wire, this may be either more or less than the number
of NFS calls received (which statistic is available elsewhere).
However this is a more accurate and less workload-dependent measure
of how much CPU load is being placed on the sunrpc server layer
due to NFS network traffic.

sockets-enqueued
Counts how many times an NFS transport is enqueued to wait for
an nfsd thread to service it, i.e. no nfsd thread was considered
available.

The circumstance this statistic tracks indicates that there was NFS
network-facing work to be done but it couldn't be done immediately,
thus introducing a small delay in servicing NFS calls. The ideal
rate of change for this counter is zero; significantly non-zero
values may indicate a performance limitation.

This can happen either because there are too few nfsd threads in the
thread pool for the NFS workload (the workload is thread-limited),
or because the NFS workload needs more CPU time than is available in
the thread pool (the workload is CPU-limited). In the former case,
configuring more nfsd threads will probably improve the performance
of the NFS workload. In the latter case, the sunrpc server layer is
already choosing not to wake idle nfsd threads because there are too
many nfsd threads which want to run but cannot, so configuring more
nfsd threads will make no difference whatsoever. The overloads-avoided
statistic (see below) can be used to distinguish these cases.
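
The distinction drawn above can be written down as a small decision rule over the two rates. This is a simplified sketch, not something the kernel or nfsstat provides: the function name and the `threshold` parameter are hypothetical, and what counts as "significantly non-zero" is left to the caller.

```python
def classify_pool(enqueued_rate: float, overloads_rate: float,
                  threshold: float = 0.0) -> str:
    """Classify one thread pool from its per-second counter rates."""
    if enqueued_rate <= threshold:
        # Transports are rarely left waiting: nothing to tune.
        return "ok"
    if overloads_rate > threshold:
        # Idle threads exist but are deliberately not woken:
        # more nfsd threads will not help.
        return "cpu-limited"
    # Transports wait and no overload avoidance is happening:
    # more nfsd threads will probably help.
    return "thread-limited"
```

The rates passed in would be the rate of change of sockets-enqueued and overloads-avoided for the pool in question.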

threads-woken
Counts how many times an idle nfsd thread is woken to try to
receive some data from an NFS transport.

This statistic tracks the circumstance where incoming
network-facing NFS work is being handled quickly, which is a good
thing. The ideal rate of change for this counter will be close
to but less than the rate of change of the packets-arrived counter.

overloads-avoided
Counts how many times the sunrpc server layer chose not to wake an
nfsd thread, despite the presence of idle nfsd threads, because
too many nfsd threads had been recently woken but could not get
enough CPU time to actually run.

This statistic counts a circumstance where the sunrpc layer
heuristically avoids overloading the CPU scheduler with too many
runnable nfsd threads. The ideal rate of change for this counter
is zero. Significant non-zero values indicate that the workload
is CPU limited. Usually this is associated with heavy CPU usage
on all the CPUs in the nfsd thread pool.

If a sustained large overloads-avoided rate is detected on a pool,
the top(1) utility should be used to check for the following
pattern of CPU usage on all the CPUs associated with the given
nfsd thread pool.

- %us ~= 0 (as you're *NOT* running applications on your NFS server)

- %wa ~= 0

- %id ~= 0

- %sy + %hi + %si ~= 100

If this pattern is seen, configuring more nfsd threads will *not*
improve the performance of the workload. If this pattern is not
seen, then something more subtle is wrong.
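
The top(1) pattern above can also be checked mechanically once the percentages have been collected. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the `tolerance` used to interpret "~=" is an arbitrary choice, not a value given by this document.

```python
def matches_cpu_limited_pattern(us: float, sy: float, hi: float, si: float,
                                wa: float, idle: float,
                                tolerance: float = 5.0) -> bool:
    """Return True if one CPU's percentages fit the CPU-limited pattern."""
    return (us <= tolerance                           # %us ~= 0
            and wa <= tolerance                       # %wa ~= 0
            and idle <= tolerance                     # %id ~= 0
            and sy + hi + si >= 100.0 - tolerance)    # %sy + %hi + %si ~= 100
```

The check has to hold for every CPU associated with the pool before concluding that the workload is CPU-limited.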

threads-timedout
Counts how many times an nfsd thread triggered an idle timeout,
i.e. was not woken to handle any incoming network packets for
some time.

This statistic counts a circumstance where there are more nfsd
threads configured than can be used by the NFS workload. This is
a clue that the number of nfsd threads can be reduced without
affecting performance. Unfortunately, it's only a clue and not
a strong indication, for a couple of reasons:

- Currently the rate at which the counter is incremented is quite
slow; the idle timeout is 60 minutes. Unless the NFS workload
remains constant for hours at a time, this counter is unlikely
to be providing information that is still useful.

- It is usually a wise policy to provide some slack,
i.e. configure a few more nfsds than are currently needed,
to allow for future spikes in load.


Note that incoming packets on NFS transports will be dealt with in
one of three ways. An nfsd thread can be woken (threads-woken counts
this case), or the transport can be enqueued for later attention
(sockets-enqueued counts this case), or the packet can be temporarily
deferred because the transport is currently being used by an nfsd
thread. This last case is not very interesting and is not explicitly
counted, but can be inferred from the other counters thus:

packets-deferred = packets-arrived - ( sockets-enqueued + threads-woken )
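
Expressed as code, the inference is a single subtraction over one parsed pool line. The sketch assumes the line has already been turned into a dict keyed by the field names used in this document; the function name is hypothetical.

```python
def packets_deferred(pool: dict) -> int:
    """Infer the number of deferred packets for one thread pool."""
    return pool["packets-arrived"] - (
        pool["sockets-enqueued"] + pool["threads-woken"]
    )
```

Because the counters are sampled without locking against each other, a result computed from a single snapshot should be treated as approximate; this caveat is an assumption of the sketch rather than a statement from the text above.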


More
----
Descriptions of the other statistics files should go here.


Greg Banks <gnb@sgi.com>
26 Mar 2009