Merge tag 'for-5.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs updates from David Sterba:
 "This brings some long awaited changes, the send protocol bump,
  otherwise lots of small improvements and fixes. The main core part is
  reworking bio handling, cleaning up the submission and endio and
  improving error handling.

  There are some changes outside of btrfs adding helpers or updating
  API, listed at the end of the changelog.

  Features:

   - sysfs:
      - export chunk size, in debug mode add tunable for setting its size
      - show zoned among features (was only in debug mode)
      - show commit stats (number, last/max/total duration)

   - send protocol updated to 2
      - new commands:
         - ability to write data chunks larger than 64K
         - send raw compressed extents (uses the encoded data ioctls),
           i.e. no decompression on the send side, no compression needed on the
           receive side if supported
         - send 'otime' (inode creation time) among other timestamps
         - send file attributes (a.k.a file flags and xflags)
      - this is the first version bump; backward compatibility on the send
        and receive side is provided
      - there are still some known and wanted commands that will be
        implemented in the near future, another version bump will be
        needed, however we want to minimize that to avoid causing
        usability issues

   - print checksum type and implementation at mount time

   - don't print some messages at mount (mentioned here as people have
     asked about it); we want to print messages mainly for new features,
     so let's make some space for that
      - big metadata - this has been supported for a long time and is
        not a feature that's worth mentioning
      - skinny metadata - same reason, set by default by mkfs

  Performance improvements:

   - reduced amount of reserved metadata for delayed items
      - when inserted items can be batched into one leaf
      - when deleting batched directory index items
      - when deleting delayed items used for deletion
      - overall improved count of files/sec, decreased subvolume lock
        contention

   - metadata item access bounds checker micro-optimized, with a few
     percent of runtime improvement for metadata-heavy operations

   - increase direct IO limit for reads to 256 sectors, improving
     throughput by 3x on a sample workload

  Notable fixes:

   - raid56
      - reduce parity writes, skip sectors of stripe when there are no
        data updates
      - restore reading from on-disk data instead of using the stripe cache;
        this reduces the chance of damaging correct data due to an RMW cycle

   - refuse to replay log with unknown incompat read-only feature bit
     set

   - zoned
      - fix page locking when COW fails in the middle of allocation
      - improved tracking of active zones; ZNS drives may limit their
        number, and there are ENOSPC errors due to that limit rather than
        an actual lack of space
      - adjust maximum extent size for zone append so it does not cause
        late ENOSPC due to underreservation

   - mirror reading error messages show the mirror number

   - don't fall back to buffered IO for NOWAIT direct IO writes, we don't
     have the NOWAIT semantics for buffered IO yet

   - send, fix sending link commands for existing file paths when there
     are deleted and created hardlinks for the same files

   - repair all mirrors for profiles with more than 1 copy (raid1c34)

   - fix repair of compressed extents, unify where error detection and
     repair happen

  Core changes:

   - bio completion cleanups
      - don't double defer compression bios
      - simplify endio workqueues
      - add more data to btrfs_bio to avoid allocation for read requests
      - rework bio error handling so it's the same as what the block layer
        does: the submission works and errors are consumed in endio
      - when asynchronous bio offload fails, fall back to synchronous
        checksum calculation to avoid errors under writeback or memory
        pressure

   - new trace points
      - raid56 events
      - ordered extent operations

   - super block log_root_transid deprecated (never used)

   - mixed_backref and big_metadata sysfs feature files removed, they've
     been the default for a sufficiently long time, there are no known users
     and mixed_backref could be confused with mixed_groups

  Non-btrfs changes, API updates:

   - minor highmem API update to cover const arguments

   - switch all kmap/kmap_atomic to kmap_local

   - remove redundant flush_dcache_page()

   - address_space_operations::writepage callback removed

   - add bdev_max_segments() helper"
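
For reference, a minimal sketch of the two API updates called out at the end of
the changelog: converting a kmap_atomic() pair to kmap_local_page()/kunmap_local(),
and the new bdev_max_segments() helper (shown here as the thin wrapper it is
expected to be; copy_page_example() itself is purely illustrative and not a
btrfs function):

#include <linux/blkdev.h>
#include <linux/highmem.h>
#include <linux/string.h>

/* Assumed shape of the new helper: a wrapper over queue_max_segments(). */
static inline unsigned int bdev_max_segments_sketch(struct block_device *bdev)
{
	return queue_max_segments(bdev_get_queue(bdev));
}

/* Illustrative caller: kmap_local mappings are unmapped in reverse order. */
static void copy_page_example(struct page *dst, struct page *src)
{
	void *d = kmap_local_page(dst);	/* was: kmap_atomic(dst) */
	void *s = kmap_local_page(src);	/* was: kmap_atomic(src) */

	memcpy(d, s, PAGE_SIZE);

	kunmap_local(s);		/* was: kunmap_atomic(s) */
	kunmap_local(d);		/* was: kunmap_atomic(d) */
}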

* tag 'for-5.20-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux: (163 commits)
  btrfs: don't call btrfs_page_set_checked in finish_compressed_bio_read
  btrfs: fix repair of compressed extents
  btrfs: remove the start argument to check_data_csum and export
  btrfs: pass a btrfs_bio to btrfs_repair_one_sector
  btrfs: simplify the pending I/O counting in struct compressed_bio
  btrfs: repair all known bad mirrors
  btrfs: merge btrfs_dev_stat_print_on_error with its only caller
  btrfs: join running log transaction when logging new name
  btrfs: simplify error handling in btrfs_lookup_dentry
  btrfs: send: always use the rbtree based inode ref management infrastructure
  btrfs: send: fix sending link commands for existing file paths
  btrfs: send: introduce recorded_ref_alloc and recorded_ref_free
  btrfs: zoned: wait until zone is finished when allocation didn't progress
  btrfs: zoned: write out partially allocated region
  btrfs: zoned: activate necessary block group
  btrfs: zoned: activate metadata block group on flush_space
  btrfs: zoned: disable metadata overcommit for zoned
  btrfs: zoned: introduce space_info->active_total_bytes
  btrfs: zoned: finish least available block group on data bg allocation
  btrfs: let can_allocate_chunk return error
  ...
Linus Torvalds committed Aug 3, 2022
2 parents ab17c0c + 0b078d9 commit 353767e
Showing 57 changed files with 3,842 additions and 2,829 deletions.
6 changes: 3 additions & 3 deletions arch/parisc/include/asm/cacheflush.h
@@ -22,7 +22,7 @@ void flush_kernel_icache_range_asm(unsigned long, unsigned long);
void flush_user_dcache_range_asm(unsigned long, unsigned long);
void flush_kernel_dcache_range_asm(unsigned long, unsigned long);
void purge_kernel_dcache_range_asm(unsigned long, unsigned long);
void flush_kernel_dcache_page_asm(void *);
void flush_kernel_dcache_page_asm(const void *addr);
void flush_kernel_icache_page(void *);

/* Cache flush operations */
@@ -31,7 +31,7 @@ void flush_cache_all_local(void);
void flush_cache_all(void);
void flush_cache_mm(struct mm_struct *mm);

void flush_kernel_dcache_page_addr(void *addr);
void flush_kernel_dcache_page_addr(const void *addr);

#define flush_kernel_dcache_range(start,size) \
flush_kernel_dcache_range_asm((start), (start)+(size));
@@ -75,7 +75,7 @@ void flush_dcache_page_asm(unsigned long phys_addr, unsigned long vaddr);
void flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned long vmaddr);

#define ARCH_HAS_FLUSH_ON_KUNMAP
static inline void kunmap_flush_on_unmap(void *addr)
static inline void kunmap_flush_on_unmap(const void *addr)
{
flush_kernel_dcache_page_addr(addr);
}
2 changes: 1 addition & 1 deletion arch/parisc/kernel/cache.c
@@ -549,7 +549,7 @@ extern void purge_kernel_dcache_page_asm(unsigned long);
extern void clear_user_page_asm(void *, unsigned long);
extern void copy_user_page_asm(void *, void *, unsigned long);

void flush_kernel_dcache_page_addr(void *addr)
void flush_kernel_dcache_page_addr(const void *addr)
{
unsigned long flags;

1 change: 0 additions & 1 deletion fs/btrfs/async-thread.h
@@ -13,7 +13,6 @@ struct btrfs_fs_info;
struct btrfs_workqueue;
struct btrfs_work;
typedef void (*btrfs_func_t)(struct btrfs_work *arg);
typedef void (*btrfs_work_func_t)(struct work_struct *arg);

struct btrfs_work {
btrfs_func_t func;
88 changes: 49 additions & 39 deletions fs/btrfs/backref.c
@@ -2028,10 +2028,29 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info,
return ret;
}

static int build_ino_list(u64 inum, u64 offset, u64 root, void *ctx)
{
struct btrfs_data_container *inodes = ctx;
const size_t c = 3 * sizeof(u64);

if (inodes->bytes_left >= c) {
inodes->bytes_left -= c;
inodes->val[inodes->elem_cnt] = inum;
inodes->val[inodes->elem_cnt + 1] = offset;
inodes->val[inodes->elem_cnt + 2] = root;
inodes->elem_cnt += 3;
} else {
inodes->bytes_missing += c - inodes->bytes_left;
inodes->bytes_left = 0;
inodes->elem_missed += 3;
}

return 0;
}

int iterate_inodes_from_logical(u64 logical, struct btrfs_fs_info *fs_info,
struct btrfs_path *path,
iterate_extent_inodes_t *iterate, void *ctx,
bool ignore_offset)
void *ctx, bool ignore_offset)
{
int ret;
u64 extent_item_pos;
@@ -2049,17 +2068,15 @@ int iterate_inodes_from_logical(u64 logical, struct btrfs_fs_info *fs_info,
extent_item_pos = logical - found_key.objectid;
ret = iterate_extent_inodes(fs_info, found_key.objectid,
extent_item_pos, search_commit_root,
iterate, ctx, ignore_offset);
build_ino_list, ctx, ignore_offset);

return ret;
}

typedef int (iterate_irefs_t)(u64 parent, u32 name_len, unsigned long name_off,
struct extent_buffer *eb, void *ctx);
static int inode_to_path(u64 inum, u32 name_len, unsigned long name_off,
struct extent_buffer *eb, struct inode_fs_paths *ipath);

static int iterate_inode_refs(u64 inum, struct btrfs_root *fs_root,
struct btrfs_path *path,
iterate_irefs_t *iterate, void *ctx)
static int iterate_inode_refs(u64 inum, struct inode_fs_paths *ipath)
{
int ret = 0;
int slot;
@@ -2068,6 +2085,8 @@ static int iterate_inode_refs(u64 inum, struct btrfs_root *fs_root,
u32 name_len;
u64 parent = 0;
int found = 0;
struct btrfs_root *fs_root = ipath->fs_root;
struct btrfs_path *path = ipath->btrfs_path;
struct extent_buffer *eb;
struct btrfs_inode_ref *iref;
struct btrfs_key found_key;
@@ -2103,8 +2122,8 @@ static int iterate_inode_refs(u64 inum, struct btrfs_root *fs_root,
"following ref at offset %u for inode %llu in tree %llu",
cur, found_key.objectid,
fs_root->root_key.objectid);
ret = iterate(parent, name_len,
(unsigned long)(iref + 1), eb, ctx);
ret = inode_to_path(parent, name_len,
(unsigned long)(iref + 1), eb, ipath);
if (ret)
break;
len = sizeof(*iref) + name_len;
@@ -2118,15 +2137,15 @@ static int iterate_inode_refs(u64 inum, struct btrfs_root *fs_root,
return ret;
}

static int iterate_inode_extrefs(u64 inum, struct btrfs_root *fs_root,
struct btrfs_path *path,
iterate_irefs_t *iterate, void *ctx)
static int iterate_inode_extrefs(u64 inum, struct inode_fs_paths *ipath)
{
int ret;
int slot;
u64 offset = 0;
u64 parent;
int found = 0;
struct btrfs_root *fs_root = ipath->fs_root;
struct btrfs_path *path = ipath->btrfs_path;
struct extent_buffer *eb;
struct btrfs_inode_extref *extref;
u32 item_size;
@@ -2162,8 +2181,8 @@ static int iterate_inode_extrefs(u64 inum, struct btrfs_root *fs_root,
extref = (struct btrfs_inode_extref *)(ptr + cur_offset);
parent = btrfs_inode_extref_parent(eb, extref);
name_len = btrfs_inode_extref_name_len(eb, extref);
ret = iterate(parent, name_len,
(unsigned long)&extref->name, eb, ctx);
ret = inode_to_path(parent, name_len,
(unsigned long)&extref->name, eb, ipath);
if (ret)
break;

@@ -2180,34 +2199,13 @@ static int iterate_inode_extrefs(u64 inum, struct btrfs_root *fs_root,
return ret;
}

static int iterate_irefs(u64 inum, struct btrfs_root *fs_root,
struct btrfs_path *path, iterate_irefs_t *iterate,
void *ctx)
{
int ret;
int found_refs = 0;

ret = iterate_inode_refs(inum, fs_root, path, iterate, ctx);
if (!ret)
++found_refs;
else if (ret != -ENOENT)
return ret;

ret = iterate_inode_extrefs(inum, fs_root, path, iterate, ctx);
if (ret == -ENOENT && found_refs)
return 0;

return ret;
}

/*
* returns 0 if the path could be dumped (probably truncated)
* returns <0 in case of an error
*/
static int inode_to_path(u64 inum, u32 name_len, unsigned long name_off,
struct extent_buffer *eb, void *ctx)
struct extent_buffer *eb, struct inode_fs_paths *ipath)
{
struct inode_fs_paths *ipath = ctx;
char *fspath;
char *fspath_min;
int i = ipath->fspath->elem_cnt;
@@ -2248,8 +2246,20 @@ static int inode_to_path(u64 inum, u32 name_len, unsigned long name_off,
*/
int paths_from_inode(u64 inum, struct inode_fs_paths *ipath)
{
return iterate_irefs(inum, ipath->fs_root, ipath->btrfs_path,
inode_to_path, ipath);
int ret;
int found_refs = 0;

ret = iterate_inode_refs(inum, ipath);
if (!ret)
++found_refs;
else if (ret != -ENOENT)
return ret;

ret = iterate_inode_extrefs(inum, ipath);
if (ret == -ENOENT && found_refs)
return 0;

return ret;
}

struct btrfs_data_container *init_data_container(u32 total_bytes)
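
With the refactor above, the old iterate_irefs() indirection is gone and
paths_from_inode() walks both ref item types directly. A caller-side sketch of
how this entry point is typically driven, modeled on the INO_PATHS ioctl path;
the dump_paths_of_inode() wrapper and the 4096-byte buffer size are illustrative
assumptions, and the code presumes the fs/btrfs headers (ctree.h, backref.h)
are in scope:

/* Sketch: resolve every path of an inode through paths_from_inode(). */
static int dump_paths_of_inode(struct btrfs_root *fs_root, u64 inum)
{
	struct btrfs_path *path;
	struct inode_fs_paths *ipath;
	int ret;
	u32 i;

	path = btrfs_alloc_path();
	if (!path)
		return -ENOMEM;

	/* Reserve 4096 bytes for the returned path strings (arbitrary size). */
	ipath = init_ipath(4096, fs_root, path);
	if (IS_ERR(ipath)) {
		btrfs_free_path(path);
		return PTR_ERR(ipath);
	}

	/*
	 * Walks both INODE_REF and INODE_EXTREF items and records one path
	 * per name via inode_to_path().
	 */
	ret = paths_from_inode(inum, ipath);

	for (i = 0; ret == 0 && i < ipath->fspath->elem_cnt; i++)
		pr_info("path %u: %s\n", i,
			(char *)(unsigned long)ipath->fspath->val[i]);

	free_ipath(ipath);
	btrfs_free_path(path);
	return ret;
}
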
3 changes: 1 addition & 2 deletions fs/btrfs/backref.h
@@ -35,8 +35,7 @@ int iterate_extent_inodes(struct btrfs_fs_info *fs_info,
bool ignore_offset);

int iterate_inodes_from_logical(u64 logical, struct btrfs_fs_info *fs_info,
struct btrfs_path *path,
iterate_extent_inodes_t *iterate, void *ctx,
struct btrfs_path *path, void *ctx,
bool ignore_offset);

int paths_from_inode(u64 inum, struct inode_fs_paths *ipath);
34 changes: 28 additions & 6 deletions fs/btrfs/block-group.c
@@ -1051,8 +1051,13 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans,
< block_group->zone_unusable);
WARN_ON(block_group->space_info->disk_total
< block_group->length * factor);
WARN_ON(block_group->zone_is_active &&
block_group->space_info->active_total_bytes
< block_group->length);
}
block_group->space_info->total_bytes -= block_group->length;
if (block_group->zone_is_active)
block_group->space_info->active_total_bytes -= block_group->length;
block_group->space_info->bytes_readonly -=
(block_group->length - block_group->zone_unusable);
block_group->space_info->bytes_zone_unusable -=
@@ -1816,11 +1821,10 @@ int btrfs_rmap_block(struct btrfs_fs_info *fs_info, u64 chunk_start,
stripe_nr = physical - map->stripes[i].physical;
stripe_nr = div64_u64_rem(stripe_nr, map->stripe_len, &offset);

if (map->type & BTRFS_BLOCK_GROUP_RAID10) {
if (map->type & (BTRFS_BLOCK_GROUP_RAID0 |
BTRFS_BLOCK_GROUP_RAID10)) {
stripe_nr = stripe_nr * map->num_stripes + i;
stripe_nr = div_u64(stripe_nr, map->sub_stripes);
} else if (map->type & BTRFS_BLOCK_GROUP_RAID0) {
stripe_nr = stripe_nr * map->num_stripes + i;
}
/*
* The remaining case would be for RAID56, multiply by
@@ -2108,7 +2112,8 @@ static int read_one_block_group(struct btrfs_fs_info *info,
trace_btrfs_add_block_group(info, cache, 0);
btrfs_update_space_info(info, cache->flags, cache->length,
cache->used, cache->bytes_super,
cache->zone_unusable, &space_info);
cache->zone_unusable, cache->zone_is_active,
&space_info);

cache->space_info = space_info;

@@ -2178,7 +2183,7 @@ static int fill_dummy_bgs(struct btrfs_fs_info *fs_info)
}

btrfs_update_space_info(fs_info, bg->flags, em->len, em->len,
0, 0, &space_info);
0, 0, false, &space_info);
bg->space_info = space_info;
link_block_group(bg);

@@ -2559,7 +2564,7 @@ struct btrfs_block_group *btrfs_make_block_group(struct btrfs_trans_handle *tran
trace_btrfs_add_block_group(fs_info, cache, 1);
btrfs_update_space_info(fs_info, cache->flags, size, bytes_used,
cache->bytes_super, cache->zone_unusable,
&cache->space_info);
cache->zone_is_active, &cache->space_info);
btrfs_update_global_block_rsv(fs_info);

link_block_group(cache);
@@ -2659,6 +2664,14 @@ int btrfs_inc_block_group_ro(struct btrfs_block_group *cache,
ret = btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE);
if (ret < 0)
goto out;
/*
* We have allocated a new chunk. We also need to activate that chunk to
* grant metadata tickets for zoned filesystem.
*/
ret = btrfs_zoned_activate_one_bg(fs_info, cache->space_info, true);
if (ret < 0)
goto out;

ret = inc_block_group_ro(cache, 0);
if (ret == -ETXTBSY)
goto unlock_out;
@@ -3761,6 +3774,7 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags,
* attempt.
*/
wait_for_alloc = true;
force = CHUNK_ALLOC_NO_FORCE;
spin_unlock(&space_info->lock);
mutex_lock(&fs_info->chunk_mutex);
mutex_unlock(&fs_info->chunk_mutex);
@@ -3883,6 +3897,14 @@ static void reserve_chunk_space(struct btrfs_trans_handle *trans,
if (IS_ERR(bg)) {
ret = PTR_ERR(bg);
} else {
/*
* We have a new chunk. We also need to activate it for
* zoned filesystem.
*/
ret = btrfs_zoned_activate_one_bg(fs_info, info, true);
if (ret < 0)
return;

/*
* If we fail to add the chunk item here, we end up
* trying again at phase 2 of chunk allocation, at
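
The block group changes above also thread a new "active" flag into the space
info accounting. Judging only from the changed call sites (which now pass
cache->zone_is_active or false ahead of the space_info out-parameter), the
updated declaration presumably looks roughly like this; the parameter names are
assumptions, not a quote from the header:

/* Inferred (not copied) shape of the updated prototype: */
void btrfs_update_space_info(struct btrfs_fs_info *info, u64 flags,
			     u64 total_bytes, u64 bytes_used,
			     u64 bytes_readonly, u64 bytes_zone_unusable,
			     bool active, struct btrfs_space_info **space_info);
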
21 changes: 9 additions & 12 deletions fs/btrfs/block-rsv.c
@@ -118,7 +118,7 @@ static u64 block_rsv_release_bytes(struct btrfs_fs_info *fs_info,
if (block_rsv->reserved >= block_rsv->size) {
num_bytes = block_rsv->reserved - block_rsv->size;
block_rsv->reserved = block_rsv->size;
block_rsv->full = 1;
block_rsv->full = true;
} else {
num_bytes = 0;
}
@@ -142,7 +142,7 @@ static u64 block_rsv_release_bytes(struct btrfs_fs_info *fs_info,
bytes_to_add = min(num_bytes, bytes_to_add);
dest->reserved += bytes_to_add;
if (dest->reserved >= dest->size)
dest->full = 1;
dest->full = true;
num_bytes -= bytes_to_add;
}
spin_unlock(&dest->lock);
@@ -171,7 +171,7 @@ int btrfs_block_rsv_migrate(struct btrfs_block_rsv *src,
return 0;
}

void btrfs_init_block_rsv(struct btrfs_block_rsv *rsv, unsigned short type)
void btrfs_init_block_rsv(struct btrfs_block_rsv *rsv, enum btrfs_rsv_type type)
{
memset(rsv, 0, sizeof(*rsv));
spin_lock_init(&rsv->lock);
@@ -180,15 +180,15 @@ void btrfs_init_block_rsv(struct btrfs_block_rsv *rsv, unsigned short type)

void btrfs_init_metadata_block_rsv(struct btrfs_fs_info *fs_info,
struct btrfs_block_rsv *rsv,
unsigned short type)
enum btrfs_rsv_type type)
{
btrfs_init_block_rsv(rsv, type);
rsv->space_info = btrfs_find_space_info(fs_info,
BTRFS_BLOCK_GROUP_METADATA);
}

struct btrfs_block_rsv *btrfs_alloc_block_rsv(struct btrfs_fs_info *fs_info,
unsigned short type)
enum btrfs_rsv_type type)
{
struct btrfs_block_rsv *block_rsv;

@@ -304,7 +304,7 @@ int btrfs_block_rsv_use_bytes(struct btrfs_block_rsv *block_rsv, u64 num_bytes)
if (block_rsv->reserved >= num_bytes) {
block_rsv->reserved -= num_bytes;
if (block_rsv->reserved < block_rsv->size)
block_rsv->full = 0;
block_rsv->full = false;
ret = 0;
}
spin_unlock(&block_rsv->lock);
@@ -319,7 +319,7 @@ void btrfs_block_rsv_add_bytes(struct btrfs_block_rsv *block_rsv,
if (update_size)
block_rsv->size += num_bytes;
else if (block_rsv->reserved >= block_rsv->size)
block_rsv->full = 1;
block_rsv->full = true;
spin_unlock(&block_rsv->lock);
}

@@ -341,7 +341,7 @@ int btrfs_cond_migrate_bytes(struct btrfs_fs_info *fs_info,
}
global_rsv->reserved -= num_bytes;
if (global_rsv->reserved < global_rsv->size)
global_rsv->full = 0;
global_rsv->full = false;
spin_unlock(&global_rsv->lock);

btrfs_block_rsv_add_bytes(dest, num_bytes, true);
@@ -408,10 +408,7 @@ void btrfs_update_global_block_rsv(struct btrfs_fs_info *fs_info)
btrfs_try_granting_tickets(fs_info, sinfo);
}

if (block_rsv->reserved == block_rsv->size)
block_rsv->full = 1;
else
block_rsv->full = 0;
block_rsv->full = (block_rsv->reserved == block_rsv->size);

if (block_rsv->size >= sinfo->total_bytes)
sinfo->force_alloc = CHUNK_ALLOC_FORCE;
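
The reservation-type parameter above changes from unsigned short to a named
enum btrfs_rsv_type. The value names predate this series (they existed as an
anonymous enum), so the named type is expected to look roughly like the sketch
below; it is not copied from the 5.20 header:

/* Assumed layout of the named reservation-type enum used by the block_rsv API. */
enum btrfs_rsv_type {
	BTRFS_BLOCK_RSV_GLOBAL,
	BTRFS_BLOCK_RSV_DELALLOC,
	BTRFS_BLOCK_RSV_TRANS,
	BTRFS_BLOCK_RSV_CHUNK,
	BTRFS_BLOCK_RSV_DELOPS,
	BTRFS_BLOCK_RSV_DELREFS,
	BTRFS_BLOCK_RSV_EMPTY,
	BTRFS_BLOCK_RSV_TEMP,
};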
