Skip to content

Commit

Permalink
Btrfs: remove unnecessary delalloc mutex for inodes
Browse files Browse the repository at this point in the history
The inode delalloc mutex was added a long time ago by commit f248679
("Btrfs: add a delalloc mutex to inodes for delalloc reservations"), and
the reason for its introduction is not very clear from the change log. It
claims it solves bogus warnings from lockdep, however it lacks an example
report/warning from lockdep, or any explanation.

Since we have enough concurrentcy protection from the locks of the space
info and block reserve objects, and such lockdep warnings don't seem to
exist anymore (at least on a 5.3 kernel I couldn't get them with fstests,
ltp, fs_mark, etc), remove it, simplifying things a bit and decreasing
the size of the btrfs_inode structure. With some quick fio tests doing
direct IO and mmap writes I couldn't observe any significant performance
increase either (direct IO writes that don't increase the file's size
don't hold the inode's lock for their entire duration and mmap writes
don't hold the inode's lock at all), which are the only type of writes
that could see any performance gain due to less serialization.

Review feedback from Josef:

The problem was taking the i_mutex in mmap, which is how I was
protecting delalloc reservations originally.  The delalloc mutex didn't
come with all of the other dependencies.  That's what the lockdep
messages were about, removing the lock isn't going to make them appear
again.

We _had_ to lock around this because we used to do tricks to keep from
over-reserving, and if we didn't serialize delalloc reservations we'd
end up with ugly accounting problems when we tried to clean things up.

However with my recentish changes this isn't the case anymore.  Every
operation is responsible for reserving its space, and then adding it to
the inode.  Then cleaning up is straightforward and can't be mucked up
by other users.  So we no longer need the delalloc mutex to safe us from
ourselves.

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
  • Loading branch information
Filipe Manana authored and David Sterba committed Nov 18, 2019
1 parent bf2df5a commit 16ad3be
Show file tree
Hide file tree
Showing 3 changed files with 5 additions and 20 deletions.
3 changes: 0 additions & 3 deletions fs/btrfs/btrfs_inode.h
Original file line number Diff line number Diff line change
Expand Up @@ -63,9 +63,6 @@ struct btrfs_inode {
/* held while logging the inode in tree-log.c */
struct mutex log_mutex;

/* held while doing delalloc reservations */
struct mutex delalloc_mutex;

/* used to order data wrt metadata */
struct btrfs_ordered_inode_tree ordered_tree;

Expand Down
21 changes: 5 additions & 16 deletions fs/btrfs/delalloc-space.c
Original file line number Diff line number Diff line change
Expand Up @@ -307,7 +307,6 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes)
unsigned nr_extents;
enum btrfs_reserve_flush_enum flush = BTRFS_RESERVE_FLUSH_ALL;
int ret = 0;
bool delalloc_lock = true;

/*
* If we are a free space inode we need to not flush since we will be in
Expand All @@ -320,7 +319,6 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes)
*/
if (btrfs_is_free_space_inode(inode)) {
flush = BTRFS_RESERVE_NO_FLUSH;
delalloc_lock = false;
} else {
if (current->journal_info)
flush = BTRFS_RESERVE_FLUSH_LIMIT;
Expand All @@ -329,9 +327,6 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes)
schedule_timeout(1);
}

if (delalloc_lock)
mutex_lock(&inode->delalloc_mutex);

num_bytes = ALIGN(num_bytes, fs_info->sectorsize);

/*
Expand All @@ -348,10 +343,12 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes)
&qgroup_reserve);
ret = btrfs_qgroup_reserve_meta_prealloc(root, qgroup_reserve, true);
if (ret)
goto out_fail;
return ret;
ret = btrfs_reserve_metadata_bytes(root, block_rsv, meta_reserve, flush);
if (ret)
goto out_qgroup;
if (ret) {
btrfs_qgroup_free_meta_prealloc(root, qgroup_reserve);
return ret;
}

/*
* Now we need to update our outstanding extents and csum bytes _first_
Expand All @@ -375,15 +372,7 @@ int btrfs_delalloc_reserve_metadata(struct btrfs_inode *inode, u64 num_bytes)
block_rsv->qgroup_rsv_reserved += qgroup_reserve;
spin_unlock(&block_rsv->lock);

if (delalloc_lock)
mutex_unlock(&inode->delalloc_mutex);
return 0;
out_qgroup:
btrfs_qgroup_free_meta_prealloc(root, qgroup_reserve);
out_fail:
if (delalloc_lock)
mutex_unlock(&inode->delalloc_mutex);
return ret;
}

/**
Expand Down
1 change: 0 additions & 1 deletion fs/btrfs/inode.c
Original file line number Diff line number Diff line change
Expand Up @@ -9356,7 +9356,6 @@ struct inode *btrfs_alloc_inode(struct super_block *sb)
ei->io_failure_tree.track_uptodate = true;
atomic_set(&ei->sync_writers, 0);
mutex_init(&ei->log_mutex);
mutex_init(&ei->delalloc_mutex);
btrfs_ordered_inode_tree_init(&ei->ordered_tree);
INIT_LIST_HEAD(&ei->delalloc_inodes);
INIT_LIST_HEAD(&ei->delayed_iput);
Expand Down

0 comments on commit 16ad3be

Please sign in to comment.