Skip to content

Commit

Permalink
Btrfs: make the chunk allocator completely tree lockless
Browse files Browse the repository at this point in the history
When adjusting the enospc rules for relocation I ran into a deadlock because we
were relocating the only system chunk and that forced us to try and allocate a
new system chunk while holding locks in the chunk tree, which caused us to
deadlock.  To fix this I've moved all of the dev extent addition and chunk
addition out to the delayed chunk completion stuff.  We still keep the in-memory
stuff which makes sure everything is consistent.

One change I had to make was to search the commit root of the device tree to
find a free dev extent, and hold onto any chunk em's that we allocated in that
transaction so we do not allocate the same dev extent twice.  This has the side
effect of fixing a bug with balance that has been there ever since balance
existed.  Basically you can free a block group and it's dev extent and then
immediately allocate that dev extent for a new block group and write stuff to
that dev extent, all within the same transaction.  So if you happen to crash
during a balance you could come back to a completely broken file system.  This
patch should keep these sort of things from happening in the future since we
won't be able to allocate free'd dev extents until after the transaction
commits.  This has passed all of the xfstests and my super annoying stress test
followed by a balance.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
  • Loading branch information
Josef Bacik committed Jul 2, 2013
1 parent 68a7342 commit 6df9a95
Show file tree
Hide file tree
Showing 5 changed files with 166 additions and 169 deletions.
15 changes: 14 additions & 1 deletion fs/btrfs/extent-tree.c
Original file line number Diff line number Diff line change
Expand Up @@ -7950,6 +7950,7 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 bytenr)
struct btrfs_space_info *space_info;
struct btrfs_fs_devices *fs_devices = root->fs_info->fs_devices;
struct btrfs_device *device;
struct btrfs_trans_handle *trans;
u64 min_free;
u64 dev_min = 1;
u64 dev_nr = 0;
Expand Down Expand Up @@ -8036,6 +8037,13 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 bytenr)
do_div(min_free, dev_min);
}

/* We need to do this so that we can look at pending chunks */
trans = btrfs_join_transaction(root);
if (IS_ERR(trans)) {
ret = PTR_ERR(trans);
goto out;
}

mutex_lock(&root->fs_info->chunk_mutex);
list_for_each_entry(device, &fs_devices->alloc_list, dev_alloc_list) {
u64 dev_offset;
Expand All @@ -8046,7 +8054,7 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 bytenr)
*/
if (device->total_bytes > device->bytes_used + min_free &&
!device->is_tgtdev_for_dev_replace) {
ret = find_free_dev_extent(device, min_free,
ret = find_free_dev_extent(trans, device, min_free,
&dev_offset, NULL);
if (!ret)
dev_nr++;
Expand All @@ -8058,6 +8066,7 @@ int btrfs_can_relocate(struct btrfs_root *root, u64 bytenr)
}
}
mutex_unlock(&root->fs_info->chunk_mutex);
btrfs_end_transaction(trans, root);
out:
btrfs_put_block_group(block_group);
return ret;
Expand Down Expand Up @@ -8423,6 +8432,10 @@ void btrfs_create_pending_block_groups(struct btrfs_trans_handle *trans,
sizeof(item));
if (ret)
btrfs_abort_transaction(trans, extent_root, ret);
ret = btrfs_finish_chunk_alloc(trans, extent_root,
key.objectid, key.offset);
if (ret)
btrfs_abort_transaction(trans, extent_root, ret);
}
}

Expand Down
9 changes: 9 additions & 0 deletions fs/btrfs/transaction.c
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,14 @@ static void put_transaction(struct btrfs_transaction *transaction)
if (atomic_dec_and_test(&transaction->use_count)) {
BUG_ON(!list_empty(&transaction->list));
WARN_ON(transaction->delayed_refs.root.rb_node);
while (!list_empty(&transaction->pending_chunks)) {
struct extent_map *em;

em = list_first_entry(&transaction->pending_chunks,
struct extent_map, list);
list_del_init(&em->list);
free_extent_map(em);
}
kmem_cache_free(btrfs_transaction_cachep, transaction);
}
}
Expand Down Expand Up @@ -202,6 +210,7 @@ static noinline int join_transaction(struct btrfs_root *root, unsigned int type)

INIT_LIST_HEAD(&cur_trans->pending_snapshots);
INIT_LIST_HEAD(&cur_trans->ordered_operations);
INIT_LIST_HEAD(&cur_trans->pending_chunks);
list_add_tail(&cur_trans->list, &fs_info->trans_list);
extent_io_tree_init(&cur_trans->dirty_pages,
fs_info->btree_inode->i_mapping);
Expand Down
1 change: 1 addition & 0 deletions fs/btrfs/transaction.h
Original file line number Diff line number Diff line change
Expand Up @@ -56,6 +56,7 @@ struct btrfs_transaction {
wait_queue_head_t commit_wait;
struct list_head pending_snapshots;
struct list_head ordered_operations;
struct list_head pending_chunks;
struct btrfs_delayed_ref_root delayed_refs;
int aborted;
};
Expand Down
Loading

0 comments on commit 6df9a95

Please sign in to comment.