Skip to content

Commit

Permalink
---
Browse files Browse the repository at this point in the history
yaml
---
r: 139570
b: refs/heads/master
c: 4fe7041
h: refs/heads/master
v: v3
  • Loading branch information
Linus Torvalds committed Apr 1, 2009
1 parent a0bc54e commit 88945cd
Show file tree
Hide file tree
Showing 48 changed files with 3,993 additions and 2,229 deletions.
2 changes: 1 addition & 1 deletion [refs]
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
---
refs/heads/master: cc85906110e26fe8537c3bdbc08a74ae8110030b
refs/heads/master: 4fe70410d9a219dabb47328effccae7e7f2a6e26
81 changes: 81 additions & 0 deletions trunk/Documentation/ABI/testing/sysfs-fs-ext4
Original file line number Diff line number Diff line change
@@ -0,0 +1,81 @@
What: /sys/fs/ext4/<disk>/mb_stats
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Controls whether the multiblock allocator should
collect statistics, which are shown during the unmount.
1 means to collect statistics, 0 means not to collect
statistics

What: /sys/fs/ext4/<disk>/mb_group_prealloc
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
The multiblock allocator will round up allocation
requests to a multiple of this tuning parameter if the
stripe size is not set in the ext4 superblock

What: /sys/fs/ext4/<disk>/mb_max_to_scan
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
The maximum number of extents the multiblock allocator
will search to find the best extent

What: /sys/fs/ext4/<disk>/mb_min_to_scan
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
The minimum number of extents the multiblock allocator
will search to find the best extent

What: /sys/fs/ext4/<disk>/mb_order2_req
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Tuning parameter which controls the minimum size for
requests (as a power of 2) where the buddy cache is
used

What: /sys/fs/ext4/<disk>/mb_stream_req
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Files which have fewer blocks than this tunable
parameter will have their blocks allocated out of a
block group specific preallocation pool, so that small
files are packed closely together. Each large file
will have its blocks allocated out of its own unique
preallocation pool.

What: /sys/fs/ext4/<disk>/inode_readahead
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
Tuning parameter which controls the maximum number of
inode table blocks that ext4's inode table readahead
algorithm will pre-read into the buffer cache

What: /sys/fs/ext4/<disk>/delayed_allocation_blocks
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
This file is read-only and shows the number of blocks
that are dirty in the page cache, but which do not
have their location in the filesystem allocated yet.

What: /sys/fs/ext4/<disk>/lifetime_write_kbytes
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
This file is read-only and shows the number of kilobytes
of data that have been written to this filesystem since it was
created.

What: /sys/fs/ext4/<disk>/session_write_kbytes
Date: March 2008
Contact: "Theodore Ts'o" <tytso@mit.edu>
Description:
This file is read-only and shows the number of
kilobytes of data that have been written to this
filesystem since it was mounted.
30 changes: 27 additions & 3 deletions trunk/Documentation/filesystems/ext4.txt
Original file line number Diff line number Diff line change
Expand Up @@ -85,7 +85,7 @@ Note: More extensive information for getting started with ext4 can be
* extent format more robust in face of on-disk corruption due to magics,
* internal redundancy in tree
* improved file allocation (multi-block alloc)
* fix 32000 subdirectory limit
* lift 32000 subdirectory limit imposed by i_links_count[1]
* nsec timestamps for mtime, atime, ctime, create time
* inode version field on disk (NFSv4, Lustre)
* reduced e2fsck time via uninit_bg feature
Expand All @@ -100,6 +100,9 @@ Note: More extensive information for getting started with ext4 can be
* efficent new ordered mode in JBD2 and ext4(avoid using buffer head to force
the ordering)

[1] Filesystems with a block size of 1k may see a limit imposed by the
directory hash tree having a maximum depth of two.

2.2 Candidate features for future inclusion

* Online defrag (patches available but not well tested)
Expand Down Expand Up @@ -180,15 +183,18 @@ commit=nrsec (*) Ext4 can be told to sync all its data and metadata
performance.

barrier=<0|1(*)> This enables/disables the use of write barriers in
the jbd code. barrier=0 disables, barrier=1 enables.
This also requires an IO stack which can support
barrier(*) the jbd code. barrier=0 disables, barrier=1 enables.
nobarrier This also requires an IO stack which can support
barriers, and if jbd gets an error on a barrier
write, it will disable again with a warning.
Write barriers enforce proper on-disk ordering
of journal commits, making volatile disk write caches
safe to use, at some performance penalty. If
your disks are battery-backed in one way or another,
disabling barriers may safely improve performance.
The mount options "barrier" and "nobarrier" can
also be used to enable or disable barriers, for
consistency with other ext4 mount options.

inode_readahead=n This tuning parameter controls the maximum
number of inode table blocks that ext4's inode
Expand Down Expand Up @@ -310,6 +316,24 @@ journal_ioprio=prio The I/O priority (from 0 to 7, where 0 is the
a slightly higher priority than the default I/O
priority.

auto_da_alloc(*) Many broken applications don't use fsync() when
noauto_da_alloc replacing existing files via patterns such as
fd = open("foo.new")/write(fd,..)/close(fd)/
rename("foo.new", "foo"), or worse yet,
fd = open("foo", O_TRUNC)/write(fd,..)/close(fd).
If auto_da_alloc is enabled, ext4 will detect
the replace-via-rename and replace-via-truncate
patterns and force that any delayed allocation
blocks are allocated such that at the next
journal commit, in the default data=ordered
mode, the data blocks of the new file are forced
to disk before the rename() operation is
commited. This provides roughly the same level
of guarantees as ext3, and avoids the
"zero-length" problem that can happen when a
system crashes before the delayed allocation
blocks are forced to disk.

Data Mode
=========
There are 3 different data modes:
Expand Down
21 changes: 0 additions & 21 deletions trunk/Documentation/filesystems/proc.txt
Original file line number Diff line number Diff line change
Expand Up @@ -940,27 +940,6 @@ Table 1-10: Files in /proc/fs/ext4/<devname>
File Content
mb_groups details of multiblock allocator buddy cache of free blocks
mb_history multiblock allocation history
stats controls whether the multiblock allocator should start
collecting statistics, which are shown during the unmount
group_prealloc the multiblock allocator will round up allocation
requests to a multiple of this tuning parameter if the
stripe size is not set in the ext4 superblock
max_to_scan The maximum number of extents the multiblock allocator
will search to find the best extent
min_to_scan The minimum number of extents the multiblock allocator
will search to find the best extent
order2_req Tuning parameter which controls the minimum size for
requests (as a power of 2) where the buddy cache is
used
stream_req Files which have fewer blocks than this tunable
parameter will have their blocks allocated out of a
block group specific preallocation pool, so that small
files are packed closely together. Each large file
will have its blocks allocated out of its own unique
preallocation pool.
inode_readahead Tuning parameter which controls the maximum number of
inode table blocks that ext4's inode table readahead
algorithm will pre-read into the buffer cache
..............................................................................


Expand Down
2 changes: 1 addition & 1 deletion trunk/fs/btrfs/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ btrfs-y := super.o ctree.o extent-tree.o print-tree.o root-tree.o dir-item.o \
extent_map.o sysfs.o struct-funcs.o xattr.o ordered-data.o \
extent_io.o volumes.o async-thread.o ioctl.o locking.o orphan.o \
ref-cache.o export.o tree-log.o acl.o free-space-cache.o zlib.o \
compression.o
compression.o delayed-ref.o
else

# Normal Makefile
Expand Down
31 changes: 25 additions & 6 deletions trunk/fs/btrfs/btrfs_inode.h
Original file line number Diff line number Diff line change
Expand Up @@ -66,6 +66,12 @@ struct btrfs_inode {
*/
struct list_head delalloc_inodes;

/*
* list for tracking inodes that must be sent to disk before a
* rename or truncate commit
*/
struct list_head ordered_operations;

/* the space_info for where this inode's data allocations are done */
struct btrfs_space_info *space_info;

Expand All @@ -86,12 +92,6 @@ struct btrfs_inode {
*/
u64 logged_trans;

/*
* trans that last made a change that should be fully fsync'd. This
* gets reset to zero each time the inode is logged
*/
u64 log_dirty_trans;

/* total number of bytes pending delalloc, used by stat to calc the
* real block usage of the file
*/
Expand Down Expand Up @@ -121,6 +121,25 @@ struct btrfs_inode {
/* the start of block group preferred for allocations. */
u64 block_group;

/* the fsync log has some corner cases that mean we have to check
* directories to see if any unlinks have been done before
* the directory was logged. See tree-log.c for all the
* details
*/
u64 last_unlink_trans;

/*
* ordered_data_close is set by truncate when a file that used
* to have good data has been truncated to zero. When it is set
* the btrfs file release call will add this inode to the
* ordered operations list so that we make sure to flush out any
* new data the application may have written before commit.
*
* yes, its silly to have a single bitflag, but we might grow more
* of these.
*/
unsigned ordered_data_close:1;

struct inode vfs_inode;
};

Expand Down
Loading

0 comments on commit 88945cd

Please sign in to comment.