Skip to content

Commit

Permalink
f2fs: support age threshold based garbage collection
Browse files Browse the repository at this point in the history
There are several issues in current background GC algorithm:
- valid blocks is one of key factors during cost overhead calculation,
so if segment has less valid block, however even its age is young or
it locates hot segment, CB algorithm will still choose the segment as
victim, it's not appropriate.
- GCed data/node will go to existing logs, no matter in-there datas'
update frequency is the same or not, it may mix hot and cold data
again.
- GC alloctor mainly use LFS type segment, it will cost free segment
more quickly.

This patch introduces a new algorithm named age threshold based
garbage collection to solve above issues, there are three steps
mainly:

1. select a source victim:
- set an age threshold, and select candidates beased threshold:
e.g.
 0 means youngest, 100 means oldest, if we set age threshold to 80
 then select dirty segments which has age in range of [80, 100] as
 candiddates;
- set candidate_ratio threshold, and select candidates based the
ratio, so that we can shrink candidates to those oldest segments;
- select target segment with fewest valid blocks in order to
migrate blocks with minimum cost;

2. select a target victim:
- select candidates beased age threshold;
- set candidate_radius threshold, search candidates whose age is
around source victims, searching radius should less than the
radius threshold.
- select target segment with most valid blocks in order to avoid
migrating current target segment.

3. merge valid blocks from source victim into target victim with
SSR alloctor.

Test steps:
- create 160 dirty segments:
 * half of them have 128 valid blocks per segment
 * left of them have 384 valid blocks per segment
- run background GC

Benefit: GC count and block movement count both decrease obviously:

- Before:
  - Valid: 86
  - Dirty: 1
  - Prefree: 11
  - Free: 6001 (6001)

GC calls: 162 (BG: 220)
  - data segments : 160 (160)
  - node segments : 2 (2)
Try to move 41454 blocks (BG: 41454)
  - data blocks : 40960 (40960)
  - node blocks : 494 (494)

IPU: 0 blocks
SSR: 0 blocks in 0 segments
LFS: 41364 blocks in 81 segments

- After:

  - Valid: 87
  - Dirty: 0
  - Prefree: 4
  - Free: 6008 (6008)

GC calls: 75 (BG: 76)
  - data segments : 74 (74)
  - node segments : 1 (1)
Try to move 12813 blocks (BG: 12813)
  - data blocks : 12544 (12544)
  - node blocks : 269 (269)

IPU: 0 blocks
SSR: 12032 blocks in 77 segments
LFS: 855 blocks in 2 segments

Signed-off-by: Chao Yu <yuchao0@huawei.com>
[Jaegeuk Kim: fix a bug along with pinfile in-mem segment & clean up]
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
  • Loading branch information
Chao Yu authored and Jaegeuk Kim committed Sep 11, 2020
1 parent 568d2a1 commit 093749e
Show file tree
Hide file tree
Showing 13 changed files with 632 additions and 64 deletions.
3 changes: 2 additions & 1 deletion Documentation/ABI/testing/sysfs-fs-f2fs
Original file line number Diff line number Diff line change
Expand Up @@ -22,7 +22,8 @@ Contact: "Namjae Jeon" <namjae.jeon@samsung.com>
Description: Controls the victim selection policy for garbage collection.
Setting gc_idle = 0(default) will disable this option. Setting
gc_idle = 1 will select the Cost Benefit approach & setting
gc_idle = 2 will select the greedy approach.
gc_idle = 2 will select the greedy approach & setting
gc_idle = 3 will select the age-threshold based approach.

What: /sys/fs/f2fs/<disk>/reclaim_segments
Date: October 2013
Expand Down
2 changes: 2 additions & 0 deletions Documentation/filesystems/f2fs.rst
Original file line number Diff line number Diff line change
Expand Up @@ -266,6 +266,8 @@ inlinecrypt When possible, encrypt/decrypt the contents of encrypted
inline encryption hardware. The on-disk format is
unaffected. For more details, see
Documentation/block/inline-encryption.rst.
atgc Enable age-threshold garbage collection, it provides high
effectiveness and efficiency on background GC.
======================== ============================================================

Debugfs Entries
Expand Down
4 changes: 2 additions & 2 deletions fs/f2fs/checkpoint.c
Original file line number Diff line number Diff line change
Expand Up @@ -1620,15 +1620,15 @@ int f2fs_write_checkpoint(struct f2fs_sb_info *sbi, struct cp_control *cpc)
f2fs_flush_sit_entries(sbi, cpc);

/* save inmem log status */
f2fs_save_inmem_curseg(sbi, CURSEG_COLD_DATA_PINNED);
f2fs_save_inmem_curseg(sbi);

err = do_checkpoint(sbi, cpc);
if (err)
f2fs_release_discard_addrs(sbi);
else
f2fs_clear_prefree_segments(sbi, cpc);

f2fs_restore_inmem_curseg(sbi, CURSEG_COLD_DATA_PINNED);
f2fs_restore_inmem_curseg(sbi);
stop:
unblock_operations(sbi);
stat_inc_cp_count(sbi->stat_info);
Expand Down
2 changes: 1 addition & 1 deletion fs/f2fs/data.c
Original file line number Diff line number Diff line change
Expand Up @@ -1416,7 +1416,7 @@ static int __allocate_data_block(struct dnode_of_data *dn, int seg_type)
set_summary(&sum, dn->nid, dn->ofs_in_node, ni.version);
old_blkaddr = dn->data_blkaddr;
f2fs_allocate_data_block(sbi, NULL, old_blkaddr, &dn->data_blkaddr,
&sum, seg_type, NULL, false);
&sum, seg_type, NULL);
if (GET_SEGNO(sbi, old_blkaddr) != NULL_SEGNO)
invalidate_mapping_pages(META_MAPPING(sbi),
old_blkaddr, old_blkaddr);
Expand Down
4 changes: 4 additions & 0 deletions fs/f2fs/debug.c
Original file line number Diff line number Diff line change
Expand Up @@ -397,6 +397,10 @@ static int stat_show(struct seq_file *s, void *v)
si->curseg[CURSEG_COLD_DATA_PINNED],
si->cursec[CURSEG_COLD_DATA_PINNED],
si->curzone[CURSEG_COLD_DATA_PINNED]);
seq_printf(s, " - ATGC data: %8d %8d %8d\n",
si->curseg[CURSEG_ALL_DATA_ATGC],
si->cursec[CURSEG_ALL_DATA_ATGC],
si->curzone[CURSEG_ALL_DATA_ATGC]);
seq_printf(s, "\n - Valid: %d\n - Dirty: %d\n",
si->main_area_segs - si->dirty_count -
si->prefree_count - si->free_segs,
Expand Down
29 changes: 25 additions & 4 deletions fs/f2fs/f2fs.h
Original file line number Diff line number Diff line change
Expand Up @@ -98,6 +98,7 @@ extern const char *f2fs_fault_name[FAULT_MAX];
#define F2FS_MOUNT_RESERVE_ROOT 0x01000000
#define F2FS_MOUNT_DISABLE_CHECKPOINT 0x02000000
#define F2FS_MOUNT_NORECOVERY 0x04000000
#define F2FS_MOUNT_ATGC 0x08000000

#define F2FS_OPTION(sbi) ((sbi)->mount_opt)
#define clear_opt(sbi, option) (F2FS_OPTION(sbi).opt &= ~F2FS_MOUNT_##option)
Expand Down Expand Up @@ -978,7 +979,7 @@ static inline void set_new_dnode(struct dnode_of_data *dn, struct inode *inode,
*/
#define NR_CURSEG_DATA_TYPE (3)
#define NR_CURSEG_NODE_TYPE (3)
#define NR_CURSEG_INMEM_TYPE (1)
#define NR_CURSEG_INMEM_TYPE (2)
#define NR_CURSEG_PERSIST_TYPE (NR_CURSEG_DATA_TYPE + NR_CURSEG_NODE_TYPE)
#define NR_CURSEG_TYPE (NR_CURSEG_INMEM_TYPE + NR_CURSEG_PERSIST_TYPE)

Expand All @@ -992,6 +993,7 @@ enum {
NR_PERSISTENT_LOG, /* number of persistent log */
CURSEG_COLD_DATA_PINNED = NR_PERSISTENT_LOG,
/* pinned file that needs consecutive block address */
CURSEG_ALL_DATA_ATGC, /* SSR alloctor in hot/warm/cold data area */
NO_CHECK_TYPE, /* number of persistent & inmem log */
};

Expand Down Expand Up @@ -1238,6 +1240,18 @@ struct inode_management {
unsigned long ino_num; /* number of entries */
};

/* for GC_AT */
struct atgc_management {
bool atgc_enabled; /* ATGC is enabled or not */
struct rb_root_cached root; /* root of victim rb-tree */
struct list_head victim_list; /* linked with all victim entries */
unsigned int victim_count; /* victim count in rb-tree */
unsigned int candidate_ratio; /* candidate ratio */
unsigned int max_candidate_count; /* max candidate count */
unsigned int age_weight; /* age weight, vblock_weight = 100 - age_weight */
unsigned long long age_threshold; /* age threshold */
};

/* For s_flag in struct f2fs_sb_info */
enum {
SBI_IS_DIRTY, /* dirty flag for checkpoint */
Expand Down Expand Up @@ -1270,6 +1284,7 @@ enum {
GC_NORMAL,
GC_IDLE_CB,
GC_IDLE_GREEDY,
GC_IDLE_AT,
GC_URGENT_HIGH,
GC_URGENT_LOW,
};
Expand Down Expand Up @@ -1521,6 +1536,7 @@ struct f2fs_sb_info {
* race between GC and GC or CP
*/
struct f2fs_gc_kthread *gc_thread; /* GC thread */
struct atgc_management am; /* atgc management */
unsigned int cur_victim_sec; /* current victim section num */
unsigned int gc_mode; /* current GC state */
unsigned int next_victim_seg[2]; /* next segment in victim section */
Expand Down Expand Up @@ -3338,8 +3354,11 @@ block_t f2fs_get_unusable_blocks(struct f2fs_sb_info *sbi);
int f2fs_disable_cp_again(struct f2fs_sb_info *sbi, block_t unusable);
void f2fs_release_discard_addrs(struct f2fs_sb_info *sbi);
int f2fs_npages_for_summary_flush(struct f2fs_sb_info *sbi, bool for_ra);
void f2fs_save_inmem_curseg(struct f2fs_sb_info *sbi, int type);
void f2fs_restore_inmem_curseg(struct f2fs_sb_info *sbi, int type);
void f2fs_init_inmem_curseg(struct f2fs_sb_info *sbi);
void f2fs_save_inmem_curseg(struct f2fs_sb_info *sbi);
void f2fs_restore_inmem_curseg(struct f2fs_sb_info *sbi);
void f2fs_get_new_segment(struct f2fs_sb_info *sbi,
unsigned int *newseg, bool new_sec, int dir);
void f2fs_allocate_segment_for_resize(struct f2fs_sb_info *sbi, int type,
unsigned int start, unsigned int end);
void f2fs_allocate_new_segment(struct f2fs_sb_info *sbi, int type);
Expand Down Expand Up @@ -3367,7 +3386,7 @@ void f2fs_replace_block(struct f2fs_sb_info *sbi, struct dnode_of_data *dn,
void f2fs_allocate_data_block(struct f2fs_sb_info *sbi, struct page *page,
block_t old_blkaddr, block_t *new_blkaddr,
struct f2fs_summary *sum, int type,
struct f2fs_io_info *fio, bool from_gc);
struct f2fs_io_info *fio);
void f2fs_wait_on_page_writeback(struct page *page,
enum page_type type, bool ordered, bool locked);
void f2fs_wait_on_block_writeback(struct inode *inode, block_t blkaddr);
Expand Down Expand Up @@ -3506,6 +3525,8 @@ int f2fs_gc(struct f2fs_sb_info *sbi, bool sync, bool background,
unsigned int segno);
void f2fs_build_gc_manager(struct f2fs_sb_info *sbi);
int f2fs_resize_fs(struct f2fs_sb_info *sbi, __u64 block_count);
int __init f2fs_create_garbage_collection_cache(void);
void f2fs_destroy_garbage_collection_cache(void);

/*
* recovery.c
Expand Down
Loading

0 comments on commit 093749e

Please sign in to comment.