Skip to content

Commit

Permalink
ext4: mballoc fix mb_normalize_request algorithm for 1KB block size f…
Browse files Browse the repository at this point in the history
…ilesystems

In case of inode preallocation, the number of blocks to allocate depends
on the file size and it is calculated in ext4_mb_normalize_request().
Each group in the filesystem is then checked to find one that can be
used for allocation; this is done in ext4_mb_good_group().

When a file bigger than 4MB is created, the requested number of blocks
to preallocate, calculated by ext4_mb_normalize_request is 4096.
However for a filesystem with 1KB block size, the maximum size of the
block buddies used by the multiblock allocator is 2048, so none of
groups in the filesystem satisfies the search criteria in
ext4_mb_good_group(). Scanning all the filesystem groups impacts
performance.

This was demonstrated by using a freshly created, 70GB, 1k block
filesystem, with caches dropped write before the test via
/proc/sys/vm/drop_caches, and with the filesystem mounted with
nodelalloc and nodealloc,nomballoc.  The time to write an 8 megabyte
file using "dd if=/dev/zero of=/mnt/test/fo bs=8k count=1k conv=fsync"
took 35.5091 seconds (236kB/s) with nodellaloc, and 0.233754 seconds
(35.9 MB/s) with the nodelloc,nomballoc options.  With a 1TB partition,
it took several minutes to write 8MB!

This patch modifies the algorithm in ext4_mb_normalize_group_request to
calculate the number of blocks to allocate by taking into account the
maximum size of free blocks chunks handled by the multiblock allocator.

It has also been tested for filesystems with 2KB and 4KB block sizes to
ensure that those cases don't regress.

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Valerie Clement <valerie.clement@bull.net>
Signed-off-by: Mingming Cao <cmm@us.ibm.com>
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
  • Loading branch information
Valerie Clement authored and Theodore Ts'o committed May 13, 2008
1 parent 2c8be6b commit 1930479
Showing 1 changed file with 9 additions and 10 deletions.
19 changes: 9 additions & 10 deletions fs/ext4/mballoc.c
Original file line number Diff line number Diff line change
Expand Up @@ -2880,12 +2880,11 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,
if (size < i_size_read(ac->ac_inode))
size = i_size_read(ac->ac_inode);

/* max available blocks in a free group */
max = EXT4_BLOCKS_PER_GROUP(ac->ac_sb) - 1 - 1 -
EXT4_SB(ac->ac_sb)->s_itb_per_group;
/* max size of free chunks */
max = 2 << bsbits;

#define NRL_CHECK_SIZE(req, size, max,bits) \
(req <= (size) || max <= ((size) >> bits))
#define NRL_CHECK_SIZE(req, size, max, chunk_size) \
(req <= (size) || max <= (chunk_size))

/* first, try to predict filesize */
/* XXX: should this table be tunable? */
Expand All @@ -2904,16 +2903,16 @@ ext4_mb_normalize_request(struct ext4_allocation_context *ac,
size = 512 * 1024;
} else if (size <= 1024 * 1024) {
size = 1024 * 1024;
} else if (NRL_CHECK_SIZE(size, 4 * 1024 * 1024, max, bsbits)) {
} else if (NRL_CHECK_SIZE(size, 4 * 1024 * 1024, max, 2 * 1024)) {
start_off = ((loff_t)ac->ac_o_ex.fe_logical >>
(20 - bsbits)) << 20;
size = 1024 * 1024;
} else if (NRL_CHECK_SIZE(size, 8 * 1024 * 1024, max, bsbits)) {
(21 - bsbits)) << 21;
size = 2 * 1024 * 1024;
} else if (NRL_CHECK_SIZE(size, 8 * 1024 * 1024, max, 4 * 1024)) {
start_off = ((loff_t)ac->ac_o_ex.fe_logical >>
(22 - bsbits)) << 22;
size = 4 * 1024 * 1024;
} else if (NRL_CHECK_SIZE(ac->ac_o_ex.fe_len,
(8<<20)>>bsbits, max, bsbits)) {
(8<<20)>>bsbits, max, 8 * 1024)) {
start_off = ((loff_t)ac->ac_o_ex.fe_logical >>
(23 - bsbits)) << 23;
size = 8 * 1024 * 1024;
Expand Down

0 comments on commit 1930479

Please sign in to comment.