mm: memcg: charge memcg percpu memory to the parent cgroup
Memory cgroups use large chunks of percpu memory to store vmstat data.
Yet this memory is not accounted at all, so when there are many (dying)
cgroups, it is not clear where all the memory has gone.

Because a memory cgroup's internal structures can be dramatically larger
than the object or page that pins them in memory, it is not a good idea
to simply ignore this memory.  Leaving it unaccounted actually breaks the
isolation between cgroups.

Let's account the consumed percpu memory to the parent cgroup.

[guro@fb.com: add WARN_ON_ONCE()s, per Johannes]
  Link: http://lkml.kernel.org/r/20200811170611.GB1507044@carbon.DHCP.thefacebook.com

Signed-off-by: Roman Gushchin <guro@fb.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb@google.com>
Acked-by: Dennis Zhou <dennis@kernel.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Tobin C. Harding <tobin@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Waiman Long <longman@redhat.com>
Cc: Bixuan Cui <cuibixuan@huawei.com>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Link: http://lkml.kernel.org/r/20200623184515.4132564-5-guro@fb.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Roman Gushchin authored and Linus Torvalds committed Aug 12, 2020
1 parent 772616b commit 3e38e0a
Showing 1 changed file with 16 additions and 4 deletions.
20 changes: 16 additions & 4 deletions mm/memcontrol.c
@@ -5131,13 +5131,18 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
 	if (!pn)
 		return 1;

-	pn->lruvec_stat_local = alloc_percpu(struct lruvec_stat);
+	/* We charge the parent cgroup, never the current task */
+	WARN_ON_ONCE(!current->active_memcg);
+
+	pn->lruvec_stat_local = alloc_percpu_gfp(struct lruvec_stat,
+						 GFP_KERNEL_ACCOUNT);
 	if (!pn->lruvec_stat_local) {
 		kfree(pn);
 		return 1;
 	}

-	pn->lruvec_stat_cpu = alloc_percpu(struct lruvec_stat);
+	pn->lruvec_stat_cpu = alloc_percpu_gfp(struct lruvec_stat,
+					       GFP_KERNEL_ACCOUNT);
 	if (!pn->lruvec_stat_cpu) {
 		free_percpu(pn->lruvec_stat_local);
 		kfree(pn);
@@ -5211,11 +5216,16 @@ static struct mem_cgroup *mem_cgroup_alloc(void)
 		goto fail;
 	}

-	memcg->vmstats_local = alloc_percpu(struct memcg_vmstats_percpu);
+	/* We charge the parent cgroup, never the current task */
+	WARN_ON_ONCE(!current->active_memcg);
+
+	memcg->vmstats_local = alloc_percpu_gfp(struct memcg_vmstats_percpu,
+						GFP_KERNEL_ACCOUNT);
 	if (!memcg->vmstats_local)
 		goto fail;

-	memcg->vmstats_percpu = alloc_percpu(struct memcg_vmstats_percpu);
+	memcg->vmstats_percpu = alloc_percpu_gfp(struct memcg_vmstats_percpu,
+						 GFP_KERNEL_ACCOUNT);
 	if (!memcg->vmstats_percpu)
 		goto fail;

@@ -5264,7 +5274,9 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
 	struct mem_cgroup *memcg;
 	long error = -ENOMEM;

+	memalloc_use_memcg(parent);
 	memcg = mem_cgroup_alloc();
+	memalloc_unuse_memcg();
 	if (IS_ERR(memcg))
 		return ERR_CAST(memcg);
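
The pattern in this patch (set an explicit charge target, make accounted allocations, then clear the target) can be sketched in userspace C. This is only a toy model and not kernel code: every toy_* name, TOY_NR_CPUS, and the struct layouts are hypothetical, calloc() stands in for the percpu allocator, and uncharging on free is not modeled. Unlike the kernel, which only warns via WARN_ON_ONCE() when no target is set, the toy allocator refuses the allocation in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy cgroup: tracks only its parent and the bytes charged to it. */
struct toy_cgroup {
	struct toy_cgroup *parent;
	size_t charged;
};

/* Hypothetical stand-in for current->active_memcg. */
static struct toy_cgroup *active_cgroup;

/* Select the cgroup that accounted allocations will be charged to. */
static void toy_use_cgroup(struct toy_cgroup *cg)
{
	assert(!active_cgroup);	/* scopes do not nest in this toy model */
	active_cgroup = cg;
}

/* Clear the explicit charge target. */
static void toy_unuse_cgroup(void)
{
	active_cgroup = NULL;
}

/*
 * Accounted allocation: charge the active target, never the caller.
 * Simplification: fail when no target is set (the patch only warns).
 */
static void *toy_alloc_accounted(size_t size)
{
	void *p;

	if (!active_cgroup)
		return NULL;

	p = calloc(1, size);
	if (p)
		active_cgroup->charged += size;
	return p;
}

#define TOY_NR_CPUS 4

struct toy_stats {
	long counters[8];
};

struct toy_child {
	struct toy_cgroup cg;
	struct toy_stats *percpu_stats;	/* one slot per toy CPU */
};

/* Create a child cgroup whose statistics memory is charged to the parent. */
static struct toy_child *toy_child_create(struct toy_cgroup *parent)
{
	struct toy_child *child = calloc(1, sizeof(*child));

	if (!child)
		return NULL;
	child->cg.parent = parent;

	toy_use_cgroup(parent);
	child->percpu_stats =
		toy_alloc_accounted(TOY_NR_CPUS * sizeof(struct toy_stats));
	toy_unuse_cgroup();

	if (!child->percpu_stats) {
		free(child);
		return NULL;
	}
	return child;
}
```

Calling toy_child_create(&parent) on a zero-initialized parent leaves the statistics bytes recorded in parent.charged while the child's own counter stays at zero, which mirrors the intent of charging the parent rather than the new cgroup or the current task.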
