slub: per cpu cache for partial pages
Allow filling out the rest of the kmem_cache_cpu cacheline with pointers to
partial pages. The partial page list is used in slab_free() to avoid
per node lock taking.
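
A rough userspace model of that idea (the struct and helper names below are
invented for illustration and are not the kernel code; the real fields are the
ones added in the diffs further down): the freeing CPU chains a partial page
onto its own per cpu list, keeping the pages/pobjects counters in the head
page, without touching any shared lock.

#include <stdio.h>

struct model_page {
	struct model_page *next;	/* next partial slab in the per cpu list */
	int pages;			/* nr of partial slabs (tracked in the head page) */
	int pobjects;			/* approximate free objects in the whole list */
	int free_objects;		/* free objects in this page */
};

struct model_cache_cpu {
	struct model_page *partial;	/* head of the per cpu partial list */
};

/* Free path: push a partial page onto the per cpu list.  Only this CPU
 * touches c->partial, so no per node list_lock is needed here. */
static void model_put_cpu_partial(struct model_cache_cpu *c,
				  struct model_page *page)
{
	struct model_page *old = c->partial;

	page->next = old;
	page->pages = (old ? old->pages : 0) + 1;
	page->pobjects = (old ? old->pobjects : 0) + page->free_objects;
	c->partial = page;
}

int main(void)
{
	struct model_cache_cpu c = { 0 };
	struct model_page a = { .free_objects = 3 };
	struct model_page b = { .free_objects = 5 };

	model_put_cpu_partial(&c, &a);
	model_put_cpu_partial(&c, &b);
	printf("pages=%d pobjects=%d\n", c.partial->pages, c.partial->pobjects);
	return 0;
}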

In __slab_alloc() we can then take multiple partial pages off the per
node partial list in one go, reducing node lock pressure.
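
A sketch of that batching, again with made-up names and a pthread mutex
standing in for the per node list_lock: a single lock round trip moves up to a
whole batch of partial pages to the per cpu list.

#include <pthread.h>
#include <stdio.h>

struct model_page { struct model_page *next; };

struct model_node {
	pthread_mutex_t list_lock;	/* stands in for the per node list_lock */
	struct model_page *partial;	/* per node partial list */
};

struct model_cache_cpu {
	struct model_page *partial;	/* per cpu partial list */
};

/* One lock round trip pulls up to 'batch' pages instead of a single page. */
static int model_refill_cpu_partial(struct model_node *n,
				    struct model_cache_cpu *c, int batch)
{
	int taken = 0;

	pthread_mutex_lock(&n->list_lock);
	while (n->partial && taken < batch) {
		struct model_page *page = n->partial;

		n->partial = page->next;
		page->next = c->partial;
		c->partial = page;
		taken++;
	}
	pthread_mutex_unlock(&n->list_lock);
	return taken;
}

int main(void)
{
	struct model_page pages[4] = { { &pages[1] }, { &pages[2] }, { &pages[3] }, { NULL } };
	struct model_node n = { PTHREAD_MUTEX_INITIALIZER, &pages[0] };
	struct model_cache_cpu c = { NULL };

	printf("took %d pages with one lock round trip\n",
	       model_refill_cpu_partial(&n, &c, 3));
	return 0;
}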

We can also use the per cpu partial list in slab_alloc() to avoid scanning
partial lists for pages with free objects.
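
The allocation side of that, as a minimal model (invented names, not the
kernel code): take the next page from the per cpu partial list and fall back
to the per node lists only when it is empty.

#include <stdio.h>

struct model_page { struct model_page *next; };
struct model_cache_cpu { struct model_page *partial; };

/* Returns a partial page from the per cpu cache, or NULL to signal that the
 * caller must go scan the per node partial lists. */
static struct model_page *model_get_cpu_partial(struct model_cache_cpu *c)
{
	struct model_page *page = c->partial;

	if (page)
		c->partial = page->next;	/* no node lock, no list scan */
	return page;
}

int main(void)
{
	struct model_page p = { NULL };
	struct model_cache_cpu c = { &p };

	printf("%s\n", model_get_cpu_partial(&c) ? "hit per cpu partial list"
						 : "fall back to node lists");
	return 0;
}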

The main effect of a per cpu partial list is that the per node list_lock
is taken for batches of partial pages instead of individual ones.
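
If, for instance, a refill moves ten partial pages to the per cpu list in one
go, the next ten refills of the cpu slab are served with a single list_lock
acquisition instead of ten.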

Potential future enhancements:

1. The pickup from the partial list could perhaps be done without disabling
   interrupts with some work. The free path already puts the page into the
   per cpu partial list without disabling interrupts.

2. __slab_free() may have some code paths that could use optimization.

Performance:

				Before		After
./hackbench 100 process 200000
				Time: 1953.047	1564.614
./hackbench 100 process 20000
				Time: 207.176   156.940
./hackbench 100 process 20000
				Time: 204.468	156.940
./hackbench 100 process 20000
				Time: 204.879	158.772
./hackbench 10 process 20000
				Time: 20.153	15.853
./hackbench 10 process 20000
				Time: 20.153	15.986
./hackbench 10 process 20000
				Time: 19.363	16.111
./hackbench 1 process 20000
				Time: 2.518	2.307
./hackbench 1 process 20000
				Time: 2.258	2.339
./hackbench 1 process 20000
				Time: 2.864	2.163

Signed-off-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Pekka Enberg <penberg@kernel.org>
Christoph Lameter authored and Pekka Enberg committed Aug 19, 2011
1 parent 497b66f commit 49e2258
Showing 3 changed files with 309 additions and 48 deletions.
14 changes: 13 additions & 1 deletion include/linux/mm_types.h
@@ -79,9 +79,21 @@ struct page {
 	};
 
 	/* Third double word block */
-	struct list_head lru;		/* Pageout list, eg. active_list
+	union {
+		struct list_head lru;	/* Pageout list, eg. active_list
 					 * protected by zone->lru_lock !
 					 */
+		struct {		/* slub per cpu partial pages */
+			struct page *next;	/* Next partial slab */
+#ifdef CONFIG_64BIT
+			int pages;	/* Nr of partial slabs left */
+			int pobjects;	/* Approximate # of objects */
+#else
+			short int pages;
+			short int pobjects;
+#endif
+		};
+	};
 
 	/* Remainder is not double word aligned */
 	union {
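
The #ifdef CONFIG_64BIT split keeps the new fields no larger than the struct
list_head they share a union with. A standalone check of that size argument,
using stand-in types and __LP64__ in place of the kernel's CONFIG_64BIT, could
look like this:

#include <assert.h>
#include <stdio.h>

/* Stand-in types only: a pointer plus two ints on 64-bit (or two short ints
 * on 32-bit) is no larger than the two pointers of a list_head, so the slub
 * fields fit in the same "third double word block" of struct page. */
struct list_head_model { void *next, *prev; };

struct slub_partial_model {
	void *next;		/* Next partial slab */
#if defined(__LP64__) || defined(_LP64)
	int pages;		/* Nr of partial slabs left */
	int pobjects;		/* Approximate # of objects */
#else
	short int pages;
	short int pobjects;
#endif
};

int main(void)
{
	/* The union overlay only avoids growing struct page if this holds. */
	assert(sizeof(struct slub_partial_model) <= sizeof(struct list_head_model));
	printf("ok: %zu <= %zu\n", sizeof(struct slub_partial_model),
	       sizeof(struct list_head_model));
	return 0;
}
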
4 changes: 4 additions & 0 deletions include/linux/slub_def.h
@@ -36,12 +36,15 @@ enum stat_item {
 	ORDER_FALLBACK,		/* Number of times fallback was necessary */
 	CMPXCHG_DOUBLE_CPU_FAIL,/* Failure of this_cpu_cmpxchg_double */
 	CMPXCHG_DOUBLE_FAIL,	/* Number of times that cmpxchg double did not match */
+	CPU_PARTIAL_ALLOC,	/* Used cpu partial on alloc */
+	CPU_PARTIAL_FREE,	/* Used cpu partial on free */
 	NR_SLUB_STAT_ITEMS };
 
 struct kmem_cache_cpu {
 	void **freelist;	/* Pointer to next available object */
 	unsigned long tid;	/* Globally unique transaction id */
 	struct page *page;	/* The slab from which we are allocating */
+	struct page *partial;	/* Partially allocated frozen slabs */
 	int node;		/* The node of the page (or -1 for debug) */
 #ifdef CONFIG_SLUB_STATS
 	unsigned stat[NR_SLUB_STAT_ITEMS];
@@ -79,6 +82,7 @@ struct kmem_cache {
 	int size;		/* The size of an object including meta data */
 	int objsize;		/* The size of an object without meta data */
 	int offset;		/* Free pointer offset. */
+	int cpu_partial;	/* Number of per cpu partial pages to keep around */
 	struct kmem_cache_order_objects oo;
 
 	/* Allocation and freeing of slabs */
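
The new cpu_partial field is the cap on how much may accumulate in the per cpu
partial list. The slub.c side of the change is not shown above, but one
plausible model of such a cap (sketch only, invented names) is that the free
path hands the whole accumulated batch back to the per node list once the cap
is exceeded, which is where the batched list_lock taking on the free side
comes from:

#include <stdio.h>

struct model_page { struct model_page *next; int pages; };

struct model_cache { int cpu_partial; };		/* cap on per cpu partial pages */
struct model_cache_cpu { struct model_page *partial; };

static void model_drain_to_node(struct model_page *list)
{
	/* Placeholder for handing the whole chain back to the per node
	 * partial list under a single list_lock acquisition. */
	int n = 0;

	for (; list; list = list->next)
		n++;
	printf("drained %d pages with one lock round trip\n", n);
}

static void model_put_cpu_partial(struct model_cache *s,
				  struct model_cache_cpu *c,
				  struct model_page *page)
{
	struct model_page *old = c->partial;
	int pages = (old ? old->pages : 0) + 1;

	if (old && pages > s->cpu_partial) {
		model_drain_to_node(old);	/* batched hand-off */
		old = NULL;
		pages = 1;
	}
	page->next = old;
	page->pages = pages;		/* head page tracks the list length */
	c->partial = page;
}

int main(void)
{
	struct model_cache s = { .cpu_partial = 2 };
	struct model_cache_cpu c = { NULL };
	struct model_page p[3] = { { NULL, 0 }, { NULL, 0 }, { NULL, 0 } };

	model_put_cpu_partial(&s, &c, &p[0]);
	model_put_cpu_partial(&s, &c, &p[1]);
	model_put_cpu_partial(&s, &c, &p[2]);	/* exceeds the cap -> drain */
	return 0;
}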
