Skip to content

Commit

Permalink
Merge tag 'slab-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/ke…
Browse files Browse the repository at this point in the history
…rnel/git/vbabka/slab

Pull slab fixes from Vlastimil Babka:

 - The "common kmalloc v4" series [1] by Hyeonggon Yoo.

   While the plan after LPC is to try again if it's possible to get rid
   of SLOB and SLAB (and if any critical aspect of those is not possible
   to achieve with SLUB today, modify it accordingly), it will take a
   while even in case there are no objections.

   Meanwhile this is a nice cleanup and some parts (e.g. to the
   tracepoints) will be useful even if we end up with a single slab
   implementation in the future:

      - Improves the mm/slab_common.c wrappers to allow deleting
        duplicated code between SLAB and SLUB.

      - Large kmalloc() allocations in SLAB are passed to page allocator
        like in SLUB, reducing number of kmalloc caches.

      - Removes the {kmem_cache_alloc,kmalloc}_node variants of
        tracepoints, node id parameter added to non-_node variants.

 - Addition of kmalloc_size_roundup()

   The first two patches from a series by Kees Cook [2] that introduce
   kmalloc_size_roundup(). This will allow merging of per-subsystem
   patches using the new function and ultimately stop (ab)using ksize()
   in a way that causes ongoing trouble for debugging functionality and
   static checkers.

 - Wasted kmalloc() memory tracking in debugfs alloc_traces

   A patch from Feng Tang that enhances the existing debugfs
   alloc_traces file for kmalloc caches with information about how much
   space is wasted by allocations that needs less space than the
   particular kmalloc cache provides.

 - My series [3] to fix validation races for caches with enabled
   debugging:

      - By decoupling the debug cache operation more from non-debug
        fastpaths, extra locking simplifications were possible and thus
        done afterwards.

      - Additional cleanup of PREEMPT_RT specific code on top, by Thomas
        Gleixner.

      - A late fix for slab page leaks caused by the series, by Feng
        Tang.

 - Smaller fixes and cleanups:

      - Unneeded variable removals, by ye xingchen

      - A cleanup removing a BUG_ON() in create_unique_id(), by Chao Yu

Link: https://lore.kernel.org/all/20220817101826.236819-1-42.hyeyoo@gmail.com/ [1]
Link: https://lore.kernel.org/all/20220923202822.2667581-1-keescook@chromium.org/ [2]
Link: https://lore.kernel.org/all/20220823170400.26546-1-vbabka@suse.cz/ [3]

* tag 'slab-for-6.1-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/vbabka/slab: (30 commits)
  mm/slub: fix a slab missed to be freed problem
  slab: Introduce kmalloc_size_roundup()
  slab: Remove __malloc attribute from realloc functions
  mm/slub: clean up create_unique_id()
  mm/slub: enable debugging memory wasting of kmalloc
  slub: Make PREEMPT_RT support less convoluted
  mm/slub: simplify __cmpxchg_double_slab() and slab_[un]lock()
  mm/slub: convert object_map_lock to non-raw spinlock
  mm/slub: remove slab_lock() usage for debug operations
  mm/slub: restrict sysfs validation to debug caches and make it safe
  mm/sl[au]b: check if large object is valid in __ksize()
  mm/slab_common: move declaration of __ksize() to mm/slab.h
  mm/slab_common: drop kmem_alloc & avoid dereferencing fields when not using
  mm/slab_common: unify NUMA and UMA version of tracepoints
  mm/sl[au]b: cleanup kmem_cache_alloc[_node]_trace()
  mm/sl[au]b: generalize kmalloc subsystem
  mm/slub: move free_debug_processing() further
  mm/sl[au]b: introduce common alloc/free functions without tracepoint
  mm/slab: kmalloc: pass requests larger than order-1 page to page allocator
  mm/slab_common: cleanup kmalloc_large()
  ...
  • Loading branch information
Linus Torvalds committed Oct 10, 2022
2 parents 55be608 + 00a7829 commit 52abb27
Showing 11 changed files with 854 additions and 921 deletions.
33 changes: 21 additions & 12 deletions Documentation/mm/slub.rst
Original file line number Diff line number Diff line change
@@ -400,21 +400,30 @@ information:
allocated objects. The output is sorted by frequency of each trace.

Information in the output:
Number of objects, allocating function, minimal/average/maximal jiffies since alloc,
pid range of the allocating processes, cpu mask of allocating cpus, and stack trace.
Number of objects, allocating function, possible memory wastage of
kmalloc objects(total/per-object), minimal/average/maximal jiffies
since alloc, pid range of the allocating processes, cpu mask of
allocating cpus, numa node mask of origins of memory, and stack trace.

Example:::

1085 populate_error_injection_list+0x97/0x110 age=166678/166680/166682 pid=1 cpus=1::
__slab_alloc+0x6d/0x90
kmem_cache_alloc_trace+0x2eb/0x300
populate_error_injection_list+0x97/0x110
init_error_injection+0x1b/0x71
do_one_initcall+0x5f/0x2d0
kernel_init_freeable+0x26f/0x2d7
kernel_init+0xe/0x118
ret_from_fork+0x22/0x30

338 pci_alloc_dev+0x2c/0xa0 waste=521872/1544 age=290837/291891/293509 pid=1 cpus=106 nodes=0-1
__kmem_cache_alloc_node+0x11f/0x4e0
kmalloc_trace+0x26/0xa0
pci_alloc_dev+0x2c/0xa0
pci_scan_single_device+0xd2/0x150
pci_scan_slot+0xf7/0x2d0
pci_scan_child_bus_extend+0x4e/0x360
acpi_pci_root_create+0x32e/0x3b0
pci_acpi_scan_root+0x2b9/0x2d0
acpi_pci_root_add.cold.11+0x110/0xb0a
acpi_bus_attach+0x262/0x3f0
device_for_each_child+0xb7/0x110
acpi_dev_for_each_child+0x77/0xa0
acpi_bus_attach+0x108/0x3f0
device_for_each_child+0xb7/0x110
acpi_dev_for_each_child+0x77/0xa0
acpi_bus_attach+0x108/0x3f0

2. free_traces::

3 changes: 2 additions & 1 deletion include/linux/compiler_attributes.h
Original file line number Diff line number Diff line change
@@ -35,7 +35,8 @@

/*
* Note: do not use this directly. Instead, use __alloc_size() since it is conditionally
* available and includes other attributes.
* available and includes other attributes. For GCC < 9.1, __alloc_size__ gets undefined
* in compiler-gcc.h, due to misbehaviors.
*
* gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
* clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size
8 changes: 5 additions & 3 deletions include/linux/compiler_types.h
Original file line number Diff line number Diff line change
@@ -271,14 +271,16 @@ struct ftrace_likely_data {

/*
* Any place that could be marked with the "alloc_size" attribute is also
* a place to be marked with the "malloc" attribute. Do this as part of the
* __alloc_size macro to avoid redundant attributes and to avoid missing a
* __malloc marking.
* a place to be marked with the "malloc" attribute, except those that may
* be performing a _reallocation_, as that may alias the existing pointer.
* For these, use __realloc_size().
*/
#ifdef __alloc_size__
# define __alloc_size(x, ...) __alloc_size__(x, ## __VA_ARGS__) __malloc
# define __realloc_size(x, ...) __alloc_size__(x, ## __VA_ARGS__)
#else
# define __alloc_size(x, ...) __malloc
# define __realloc_size(x, ...)
#endif

#ifndef asm_volatile_goto
188 changes: 89 additions & 99 deletions include/linux/slab.h
Original file line number Diff line number Diff line change
@@ -29,6 +29,8 @@
#define SLAB_RED_ZONE ((slab_flags_t __force)0x00000400U)
/* DEBUG: Poison objects */
#define SLAB_POISON ((slab_flags_t __force)0x00000800U)
/* Indicate a kmalloc slab */
#define SLAB_KMALLOC ((slab_flags_t __force)0x00001000U)
/* Align objs on cache lines */
#define SLAB_HWCACHE_ALIGN ((slab_flags_t __force)0x00002000U)
/* Use GFP_DMA memory */
@@ -184,11 +186,25 @@ int kmem_cache_shrink(struct kmem_cache *s);
/*
* Common kmalloc functions provided by all allocators
*/
void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __alloc_size(2);
void * __must_check krealloc(const void *objp, size_t new_size, gfp_t flags) __realloc_size(2);
void kfree(const void *objp);
void kfree_sensitive(const void *objp);
size_t __ksize(const void *objp);

/**
* ksize - Report actual allocation size of associated object
*
* @objp: Pointer returned from a prior kmalloc()-family allocation.
*
* This should not be used for writing beyond the originally requested
* allocation size. Either use krealloc() or round up the allocation size
* with kmalloc_size_roundup() prior to allocation. If this is used to
* access beyond the originally requested allocation size, UBSAN_BOUNDS
* and/or FORTIFY_SOURCE may trip, since they only know about the
* originally allocated size via the __alloc_size attribute.
*/
size_t ksize(const void *objp);

#ifdef CONFIG_PRINTK
bool kmem_valid_obj(void *object);
void kmem_dump_obj(void *object);
@@ -243,27 +259,17 @@ static inline unsigned int arch_slab_minalign(void)

#ifdef CONFIG_SLAB
/*
* The largest kmalloc size supported by the SLAB allocators is
* 32 megabyte (2^25) or the maximum allocatable page order if that is
* less than 32 MB.
*
* WARNING: Its not easy to increase this value since the allocators have
* to do various tricks to work around compiler limitations in order to
* ensure proper constant folding.
* SLAB and SLUB directly allocates requests fitting in to an order-1 page
* (PAGE_SIZE*2). Larger requests are passed to the page allocator.
*/
#define KMALLOC_SHIFT_HIGH ((MAX_ORDER + PAGE_SHIFT - 1) <= 25 ? \
(MAX_ORDER + PAGE_SHIFT - 1) : 25)
#define KMALLOC_SHIFT_MAX KMALLOC_SHIFT_HIGH
#define KMALLOC_SHIFT_HIGH (PAGE_SHIFT + 1)
#define KMALLOC_SHIFT_MAX (MAX_ORDER + PAGE_SHIFT - 1)
#ifndef KMALLOC_SHIFT_LOW
#define KMALLOC_SHIFT_LOW 5
#endif
#endif

#ifdef CONFIG_SLUB
/*
* SLUB directly allocates requests fitting in to an order-1 page
* (PAGE_SIZE*2). Larger requests are passed to the page allocator.
*/
#define KMALLOC_SHIFT_HIGH (PAGE_SHIFT + 1)
#define KMALLOC_SHIFT_MAX (MAX_ORDER + PAGE_SHIFT - 1)
#ifndef KMALLOC_SHIFT_LOW
@@ -415,10 +421,6 @@ static __always_inline unsigned int __kmalloc_index(size_t size,
if (size <= 512 * 1024) return 19;
if (size <= 1024 * 1024) return 20;
if (size <= 2 * 1024 * 1024) return 21;
if (size <= 4 * 1024 * 1024) return 22;
if (size <= 8 * 1024 * 1024) return 23;
if (size <= 16 * 1024 * 1024) return 24;
if (size <= 32 * 1024 * 1024) return 25;

if (!IS_ENABLED(CONFIG_PROFILE_ALL_BRANCHES) && size_is_constant)
BUILD_BUG_ON_MSG(1, "unexpected size in kmalloc_index()");
@@ -428,6 +430,7 @@ static __always_inline unsigned int __kmalloc_index(size_t size,
/* Will never be reached. Needed because the compiler may complain */
return -1;
}
static_assert(PAGE_SHIFT <= 20);
#define kmalloc_index(s) __kmalloc_index(s, true)
#endif /* !CONFIG_SLOB */

@@ -456,51 +459,32 @@ static __always_inline void kfree_bulk(size_t size, void **p)
kmem_cache_free_bulk(NULL, size, p);
}

#ifdef CONFIG_NUMA
void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_kmalloc_alignment
__alloc_size(1);
void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node) __assume_slab_alignment
__malloc;
#else
static __always_inline __alloc_size(1) void *__kmalloc_node(size_t size, gfp_t flags, int node)
{
return __kmalloc(size, flags);
}

static __always_inline void *kmem_cache_alloc_node(struct kmem_cache *s, gfp_t flags, int node)
{
return kmem_cache_alloc(s, flags);
}
#endif

#ifdef CONFIG_TRACING
extern void *kmem_cache_alloc_trace(struct kmem_cache *s, gfp_t flags, size_t size)
__assume_slab_alignment __alloc_size(3);

#ifdef CONFIG_NUMA
extern void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
int node, size_t size) __assume_slab_alignment
__alloc_size(4);
#else
static __always_inline __alloc_size(4) void *kmem_cache_alloc_node_trace(struct kmem_cache *s,
gfp_t gfpflags, int node, size_t size)
{
return kmem_cache_alloc_trace(s, gfpflags, size);
}
#endif /* CONFIG_NUMA */
void *kmalloc_trace(struct kmem_cache *s, gfp_t flags, size_t size)
__assume_kmalloc_alignment __alloc_size(3);

void *kmalloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
int node, size_t size) __assume_kmalloc_alignment
__alloc_size(4);
#else /* CONFIG_TRACING */
static __always_inline __alloc_size(3) void *kmem_cache_alloc_trace(struct kmem_cache *s,
gfp_t flags, size_t size)
/* Save a function call when CONFIG_TRACING=n */
static __always_inline __alloc_size(3)
void *kmalloc_trace(struct kmem_cache *s, gfp_t flags, size_t size)
{
void *ret = kmem_cache_alloc(s, flags);

ret = kasan_kmalloc(s, ret, size, flags);
return ret;
}

static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
int node, size_t size)
static __always_inline __alloc_size(4)
void *kmalloc_node_trace(struct kmem_cache *s, gfp_t gfpflags,
int node, size_t size)
{
void *ret = kmem_cache_alloc_node(s, gfpflags, node);

@@ -509,25 +493,11 @@ static __always_inline void *kmem_cache_alloc_node_trace(struct kmem_cache *s, g
}
#endif /* CONFIG_TRACING */

extern void *kmalloc_order(size_t size, gfp_t flags, unsigned int order) __assume_page_alignment
__alloc_size(1);
void *kmalloc_large(size_t size, gfp_t flags) __assume_page_alignment
__alloc_size(1);

#ifdef CONFIG_TRACING
extern void *kmalloc_order_trace(size_t size, gfp_t flags, unsigned int order)
__assume_page_alignment __alloc_size(1);
#else
static __always_inline __alloc_size(1) void *kmalloc_order_trace(size_t size, gfp_t flags,
unsigned int order)
{
return kmalloc_order(size, flags, order);
}
#endif

static __always_inline __alloc_size(1) void *kmalloc_large(size_t size, gfp_t flags)
{
unsigned int order = get_order(size);
return kmalloc_order_trace(size, flags, order);
}
void *kmalloc_large_node(size_t size, gfp_t flags, int node) __assume_page_alignment
__alloc_size(1);

/**
* kmalloc - allocate memory
@@ -597,31 +567,43 @@ static __always_inline __alloc_size(1) void *kmalloc(size_t size, gfp_t flags)
if (!index)
return ZERO_SIZE_PTR;

return kmem_cache_alloc_trace(
return kmalloc_trace(
kmalloc_caches[kmalloc_type(flags)][index],
flags, size);
#endif
}
return __kmalloc(size, flags);
}

#ifndef CONFIG_SLOB
static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node)
{
#ifndef CONFIG_SLOB
if (__builtin_constant_p(size) &&
size <= KMALLOC_MAX_CACHE_SIZE) {
unsigned int i = kmalloc_index(size);
if (__builtin_constant_p(size)) {
unsigned int index;

if (!i)
if (size > KMALLOC_MAX_CACHE_SIZE)
return kmalloc_large_node(size, flags, node);

index = kmalloc_index(size);

if (!index)
return ZERO_SIZE_PTR;

return kmem_cache_alloc_node_trace(
kmalloc_caches[kmalloc_type(flags)][i],
flags, node, size);
return kmalloc_node_trace(
kmalloc_caches[kmalloc_type(flags)][index],
flags, node, size);
}
#endif
return __kmalloc_node(size, flags, node);
}
#else
static __always_inline __alloc_size(1) void *kmalloc_node(size_t size, gfp_t flags, int node)
{
if (__builtin_constant_p(size) && size > KMALLOC_MAX_CACHE_SIZE)
return kmalloc_large_node(size, flags, node);

return __kmalloc_node(size, flags, node);
}
#endif

/**
* kmalloc_array - allocate memory for an array.
@@ -647,10 +629,10 @@ static inline __alloc_size(1, 2) void *kmalloc_array(size_t n, size_t size, gfp_
* @new_size: new size of a single member of the array
* @flags: the type of memory to allocate (see kmalloc)
*/
static inline __alloc_size(2, 3) void * __must_check krealloc_array(void *p,
size_t new_n,
size_t new_size,
gfp_t flags)
static inline __realloc_size(2, 3) void * __must_check krealloc_array(void *p,
size_t new_n,
size_t new_size,
gfp_t flags)
{
size_t bytes;

@@ -671,6 +653,12 @@ static inline __alloc_size(1, 2) void *kcalloc(size_t n, size_t size, gfp_t flag
return kmalloc_array(n, size, flags | __GFP_ZERO);
}

void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node,
unsigned long caller) __alloc_size(1);
#define kmalloc_node_track_caller(size, flags, node) \
__kmalloc_node_track_caller(size, flags, node, \
_RET_IP_)

/*
* kmalloc_track_caller is a special version of kmalloc that records the
* calling function of the routine calling it for slab leak tracking instead
@@ -679,9 +667,9 @@ static inline __alloc_size(1, 2) void *kcalloc(size_t n, size_t size, gfp_t flag
* allocator where we care about the real place the memory allocation
* request comes from.
*/
extern void *__kmalloc_track_caller(size_t size, gfp_t flags, unsigned long caller);
#define kmalloc_track_caller(size, flags) \
__kmalloc_track_caller(size, flags, _RET_IP_)
__kmalloc_node_track_caller(size, flags, \
NUMA_NO_NODE, _RET_IP_)

static inline __alloc_size(1, 2) void *kmalloc_array_node(size_t n, size_t size, gfp_t flags,
int node)
@@ -700,21 +688,6 @@ static inline __alloc_size(1, 2) void *kcalloc_node(size_t n, size_t size, gfp_t
return kmalloc_array_node(n, size, flags | __GFP_ZERO, node);
}


#ifdef CONFIG_NUMA
extern void *__kmalloc_node_track_caller(size_t size, gfp_t flags, int node,
unsigned long caller) __alloc_size(1);
#define kmalloc_node_track_caller(size, flags, node) \
__kmalloc_node_track_caller(size, flags, node, \
_RET_IP_)

#else /* CONFIG_NUMA */

#define kmalloc_node_track_caller(size, flags, node) \
kmalloc_track_caller(size, flags)

#endif /* CONFIG_NUMA */

/*
* Shortcuts
*/
@@ -774,11 +747,28 @@ static inline __alloc_size(1, 2) void *kvcalloc(size_t n, size_t size, gfp_t fla
}

extern void *kvrealloc(const void *p, size_t oldsize, size_t newsize, gfp_t flags)
__alloc_size(3);
__realloc_size(3);
extern void kvfree(const void *addr);
extern void kvfree_sensitive(const void *addr, size_t len);

unsigned int kmem_cache_size(struct kmem_cache *s);

/**
* kmalloc_size_roundup - Report allocation bucket size for the given size
*
* @size: Number of bytes to round up from.
*
* This returns the number of bytes that would be available in a kmalloc()
* allocation of @size bytes. For example, a 126 byte request would be
* rounded up to the next sized kmalloc bucket, 128 bytes. (This is strictly
* for the general-purpose kmalloc()-based allocations, and is not for the
* pre-sized kmem_cache_alloc()-based allocations.)
*
* Use this to kmalloc() the full bucket size ahead of time instead of using
* ksize() to query the size after an allocation.
*/
size_t kmalloc_size_roundup(size_t size);

void __init kmem_cache_init_late(void);

#if defined(CONFIG_SMP) && defined(CONFIG_SLAB)
Loading

0 comments on commit 52abb27

Please sign in to comment.