Skip to content

Commit

Permalink
---
Browse files Browse the repository at this point in the history
yaml
---
r: 143254
b: refs/heads/master
c: b21597d
h: refs/heads/master
v: v3
  • Loading branch information
Linus Torvalds committed Apr 14, 2009
1 parent 5ae6e9a commit 0f7286d
Show file tree
Hide file tree
Showing 106 changed files with 2,600 additions and 1,647 deletions.
2 changes: 1 addition & 1 deletion [refs]
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
---
refs/heads/master: 39826a1e17c1957bd7b5cd7815b83940e5e3a230
refs/heads/master: b21597d0268983f8f9e8b563494f75490403e948
6 changes: 3 additions & 3 deletions trunk/Documentation/ABI/testing/debugfs-pktcdvd
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
What: /debug/pktcdvd/pktcdvd[0-7]
What: /sys/kernel/debug/pktcdvd/pktcdvd[0-7]
Date: Oct. 2006
KernelVersion: 2.6.20
Contact: Thomas Maier <balagi@justmail.de>
Expand All @@ -10,10 +10,10 @@ debugfs interface
The pktcdvd module (packet writing driver) creates
these files in debugfs:

/debug/pktcdvd/pktcdvd[0-7]/
/sys/kernel/debug/pktcdvd/pktcdvd[0-7]/
info (0444) Lots of driver statistics and infos.

Example:
-------

cat /debug/pktcdvd/pktcdvd0/info
cat /sys/kernel/debug/pktcdvd/pktcdvd0/info
55 changes: 32 additions & 23 deletions trunk/Documentation/cgroups/memory.txt
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,14 @@ used here with the memory controller that is used in hardware.

Salient features

a. Enable control of both RSS (mapped) and Page Cache (unmapped) pages
a. Enable control of Anonymous, Page Cache (mapped and unmapped) and
Swap Cache memory pages.
b. The infrastructure allows easy addition of other types of memory to control
c. Provides *zero overhead* for non memory controller users
d. Provides a double LRU: global memory pressure causes reclaim from the
global LRU; a cgroup on hitting a limit, reclaims from the per
cgroup LRU

NOTE: Swap Cache (unmapped) is not accounted now.

Benefits and Purpose of the memory controller

The memory controller isolates the memory behaviour of a group of tasks
Expand Down Expand Up @@ -290,34 +289,44 @@ will be charged as a new owner of it.
moved to the parent. If you want to avoid that, force_empty will be useful.

5.2 stat file
memory.stat file includes following statistics (now)
cache - # of pages from page-cache and shmem.
rss - # of pages from anonymous memory.
pgpgin - # of event of charging
pgpgout - # of event of uncharging
active_anon - # of pages on active lru of anon, shmem.
inactive_anon - # of pages on active lru of anon, shmem
active_file - # of pages on active lru of file-cache
inactive_file - # of pages on inactive lru of file cache
unevictable - # of pages cannot be reclaimed.(mlocked etc)

Below is depend on CONFIG_DEBUG_VM.
inactive_ratio - VM internal parameter. (see mm/page_alloc.c)
recent_rotated_anon - VM internal parameter. (see mm/vmscan.c)
recent_rotated_file - VM internal parameter. (see mm/vmscan.c)
recent_scanned_anon - VM internal parameter. (see mm/vmscan.c)
recent_scanned_file - VM internal parameter. (see mm/vmscan.c)

Memo:

memory.stat file includes following statistics

cache - # of bytes of page cache memory.
rss - # of bytes of anonymous and swap cache memory.
pgpgin - # of pages paged in (equivalent to # of charging events).
pgpgout - # of pages paged out (equivalent to # of uncharging events).
active_anon - # of bytes of anonymous and swap cache memory on active
lru list.
inactive_anon - # of bytes of anonymous memory and swap cache memory on
inactive lru list.
active_file - # of bytes of file-backed memory on active lru list.
inactive_file - # of bytes of file-backed memory on inactive lru list.
unevictable - # of bytes of memory that cannot be reclaimed (mlocked etc).

The following additional stats are dependent on CONFIG_DEBUG_VM.

inactive_ratio - VM internal parameter. (see mm/page_alloc.c)
recent_rotated_anon - VM internal parameter. (see mm/vmscan.c)
recent_rotated_file - VM internal parameter. (see mm/vmscan.c)
recent_scanned_anon - VM internal parameter. (see mm/vmscan.c)
recent_scanned_file - VM internal parameter. (see mm/vmscan.c)

Memo:
recent_rotated means recent frequency of lru rotation.
recent_scanned means recent # of scans to lru.
showing for better debug please see the code for meanings.

Note:
Only anonymous and swap cache memory is listed as part of 'rss' stat.
This should not be confused with the true 'resident set size' or the
amount of physical memory used by the cgroup. Per-cgroup rss
accounting is not done yet.

5.3 swappiness
Similar to /proc/sys/vm/swappiness, but affecting a hierarchy of groups only.

Following cgroup's swapiness can't be changed.
Following cgroups' swapiness can't be changed.
- root cgroup (uses /proc/sys/vm/swappiness).
- a cgroup which uses hierarchy and it has child cgroup.
- a cgroup which uses hierarchy and not the root of hierarchy.
Expand Down
27 changes: 21 additions & 6 deletions trunk/Documentation/cgroups/resource_counter.txt
Original file line number Diff line number Diff line change
Expand Up @@ -47,13 +47,18 @@ to work with it.

2. Basic accounting routines

a. void res_counter_init(struct res_counter *rc)
a. void res_counter_init(struct res_counter *rc,
struct res_counter *rc_parent)

Initializes the resource counter. As usual, should be the first
routine called for a new counter.

b. int res_counter_charge[_locked]
(struct res_counter *rc, unsigned long val)
The struct res_counter *parent can be used to define a hierarchical
child -> parent relationship directly in the res_counter structure,
NULL can be used to define no relationship.

c. int res_counter_charge(struct res_counter *rc, unsigned long val,
struct res_counter **limit_fail_at)

When a resource is about to be allocated it has to be accounted
with the appropriate resource counter (controller should determine
Expand All @@ -67,15 +72,25 @@ to work with it.
* if the charging is performed first, then it should be uncharged
on error path (if the one is called).

c. void res_counter_uncharge[_locked]
If the charging fails and a hierarchical dependency exists, the
limit_fail_at parameter is set to the particular res_counter element
where the charging failed.

d. int res_counter_charge_locked
(struct res_counter *rc, unsigned long val)

The same as res_counter_charge(), but it must not acquire/release the
res_counter->lock internally (it must be called with res_counter->lock
held).

e. void res_counter_uncharge[_locked]
(struct res_counter *rc, unsigned long val)

When a resource is released (freed) it should be de-accounted
from the resource counter it was accounted to. This is called
"uncharging".

The _locked routines imply that the res_counter->lock is taken.

The _locked routines imply that the res_counter->lock is taken.

2.1 Other accounting routines

Expand Down
2 changes: 1 addition & 1 deletion trunk/Documentation/sysctl/net.txt
Original file line number Diff line number Diff line change
Expand Up @@ -95,7 +95,7 @@ of struct cmsghdr structures with appended data.

There is only one file in this directory.
unix_dgram_qlen limits the max number of datagrams queued in Unix domain
socket's buffer. It will not take effect unless PF_UNIX flag is spicified.
socket's buffer. It will not take effect unless PF_UNIX flag is specified.


3. /proc/sys/net/ipv4 - IPV4 settings
Expand Down
2 changes: 2 additions & 0 deletions trunk/Documentation/vm/00-INDEX
Original file line number Diff line number Diff line change
@@ -1,5 +1,7 @@
00-INDEX
- this file.
active_mm.txt
- An explanation from Linus about tsk->active_mm vs tsk->mm.
balance
- various information on memory balancing.
hugetlbpage.txt
Expand Down
83 changes: 83 additions & 0 deletions trunk/Documentation/vm/active_mm.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
List: linux-kernel
Subject: Re: active_mm
From: Linus Torvalds <torvalds () transmeta ! com>
Date: 1999-07-30 21:36:24

Cc'd to linux-kernel, because I don't write explanations all that often,
and when I do I feel better about more people reading them.

On Fri, 30 Jul 1999, David Mosberger wrote:
>
> Is there a brief description someplace on how "mm" vs. "active_mm" in
> the task_struct are supposed to be used? (My apologies if this was
> discussed on the mailing lists---I just returned from vacation and
> wasn't able to follow linux-kernel for a while).

Basically, the new setup is:

- we have "real address spaces" and "anonymous address spaces". The
difference is that an anonymous address space doesn't care about the
user-level page tables at all, so when we do a context switch into an
anonymous address space we just leave the previous address space
active.

The obvious use for a "anonymous address space" is any thread that
doesn't need any user mappings - all kernel threads basically fall into
this category, but even "real" threads can temporarily say that for
some amount of time they are not going to be interested in user space,
and that the scheduler might as well try to avoid wasting time on
switching the VM state around. Currently only the old-style bdflush
sync does that.

- "tsk->mm" points to the "real address space". For an anonymous process,
tsk->mm will be NULL, for the logical reason that an anonymous process
really doesn't _have_ a real address space at all.

- however, we obviously need to keep track of which address space we
"stole" for such an anonymous user. For that, we have "tsk->active_mm",
which shows what the currently active address space is.

The rule is that for a process with a real address space (ie tsk->mm is
non-NULL) the active_mm obviously always has to be the same as the real
one.

For a anonymous process, tsk->mm == NULL, and tsk->active_mm is the
"borrowed" mm while the anonymous process is running. When the
anonymous process gets scheduled away, the borrowed address space is
returned and cleared.

To support all that, the "struct mm_struct" now has two counters: a
"mm_users" counter that is how many "real address space users" there are,
and a "mm_count" counter that is the number of "lazy" users (ie anonymous
users) plus one if there are any real users.

Usually there is at least one real user, but it could be that the real
user exited on another CPU while a lazy user was still active, so you do
actually get cases where you have a address space that is _only_ used by
lazy users. That is often a short-lived state, because once that thread
gets scheduled away in favour of a real thread, the "zombie" mm gets
released because "mm_users" becomes zero.

Also, a new rule is that _nobody_ ever has "init_mm" as a real MM any
more. "init_mm" should be considered just a "lazy context when no other
context is available", and in fact it is mainly used just at bootup when
no real VM has yet been created. So code that used to check

if (current->mm == &init_mm)

should generally just do

if (!current->mm)

instead (which makes more sense anyway - the test is basically one of "do
we have a user context", and is generally done by the page fault handler
and things like that).

Anyway, I put a pre-patch-2.3.13-1 on ftp.kernel.org just a moment ago,
because it slightly changes the interfaces to accomodate the alpha (who
would have thought it, but the alpha actually ends up having one of the
ugliest context switch codes - unlike the other architectures where the MM
and register state is separate, the alpha PALcode joins the two, and you
need to switch both together).

(From http://marc.info/?l=linux-kernel&m=93337278602211&w=2)
Loading

0 comments on commit 0f7286d

Please sign in to comment.