Merge branch 'Support stashing local kptrs with bpf_kptr_xchg'
Dave Marchevsky says:

====================

Local kptrs are kptrs allocated via bpf_obj_new with a type specified in
program BTF. A BPF program which creates a local kptr has exclusive control of
the lifetime of the kptr, and, prior to terminating, must:

  * free the kptr via bpf_obj_drop
  * If the kptr is a {list, rbtree} node, add the node to a {list, rbtree},
    thereby passing control of the lifetime to the collection

This series adds a third option:

  * stash the kptr in a map value using bpf_kptr_xchg

As indicated by the use of "stash" to describe this behavior, the intended use
of this feature is temporary storage of local kptrs. For example, a sched_ext
([0]) scheduler may want to create an rbtree node for each new cgroup on
cgroup init, but to add that node to the rbtree as part of a separate program
which runs on enqueue. Stashing the node in a map_value allows its lifetime to
outlive the execution of the cgroup_init program.

Behavior:

There is no semantic difference between adding a kptr to a graph collection
and "stashing" it in a map. In both cases exclusive ownership of the kptr's
lifetime is passed to some containing data structure, which is responsible for
bpf_obj_drop'ing it when the container goes away. Since graph collections also
expect exclusive ownership of the nodes they contain, graph nodes cannot be
both stashed in a map_value and contained by their corresponding collection.

Implementation:

Two observations simplify the verifier changes for this feature. First, kptrs
("referenced kptrs" until a recent renaming) require registration of a dtor
function as part of their acquire/release semantics, so that a referenced kptr
which is placed in a map_value is properly released when the map goes away. We
want this exact behavior for local kptrs, but with bpf_obj_drop as the dtor
instead of a per-btf_id dtor.

The second observation is that, in terms of identification, "referenced kptr"
and "local kptr" already don't interfere with one another. Consider the
following example:

  struct node_data {
          long key;
          long data;
          struct bpf_rb_node node;
  };

  struct map_value {
          struct node_data __kptr *node;
  };

  struct {
          __uint(type, BPF_MAP_TYPE_ARRAY);
          __type(key, int);
          __type(value, struct map_value);
          __uint(max_entries, 1);
  } some_nodes SEC(".maps");

  struct map_value *mapval;
  struct node_data *res;
  int key = 0;

  res = bpf_obj_new(typeof(*res));
  if (!res) { /* err handling */ }

  mapval = bpf_map_lookup_elem(&some_nodes, &key);
  if (!mapval) { /* err handling */ }

  res = bpf_kptr_xchg(&mapval->node, res);
  if (res)
          bpf_obj_drop(res);

The __kptr tag identifies map_value's node as a referenced kptr, while the
PTR_TO_BTF_ID which bpf_obj_new returns - a type in some non-vmlinux,
non-module BTF - identifies res as a local kptr. The type tag on the pointer
indicates referenced kptr, while the type of the pointee indicates local kptr.
So using existing facilities we can tell the verifier about a "referenced
kptr" pointer to a "local kptr" pointee.

When kptr_xchg'ing a kptr into a map_value, the verifier can recognize local
kptr types and treat them like referenced kptrs with a properly-typed
bpf_obj_drop as a dtor.

Other implementation notes:

  * We don't need to do anything special to enforce "graph nodes cannot be
    both stashed in a map_value and contained by their corresponding
    collection"
  * bpf_kptr_xchg both returns and takes as input a (possibly-null) owning
    reference. It does not accept non-owning references as input by virtue of
    requiring a ref_obj_id. By definition, if a program has an owning ref to a
    node, the node isn't in a collection, so it's safe to pass ownership via
    bpf_kptr_xchg.

Summary of patches:

  * Patch 1 modifies BTF plumbing to support using bpf_obj_drop as a dtor
  * Patch 2 adds verifier plumbing to support MEM_ALLOC-flagged param for
    bpf_kptr_xchg
  * Patch 3 adds selftests exercising the new behavior

Changelog:

v1 -> v2: https://lore.kernel.org/bpf/20230309180111.1618459-1-davemarchevsky@fb.com/

Patch #s used below refer to the patch's position in v1 unless otherwise
specified.

Patches 1-3 were applied and are not included in v2.

Rebase onto latest bpf-next: "libbpf: Revert poisoning of strlcpy"

Patch 4: "bpf: Support __kptr to local kptrs"
  * Remove !btf_is_kernel(btf) check, WARN_ON_ONCE instead (Alexei)

Patch 6: "selftests/bpf: Add local kptr stashing test"
  * Add test which stashes 2 nodes and later unstashes one of them using a
    separate BPF program (Alexei)
  * Fix incorrect runner subtest name for original test (was "rbtree_add_nodes")
====================

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
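The inline example above only shows the stash direction. For the reverse
direction described in the last note, where bpf_kptr_xchg hands back a
possibly-null owning reference, a minimal sketch reusing the same map_value
and some_nodes definitions (not taken from the cover letter itself) would be:

  /* Swap NULL into the stash slot and take ownership of whatever was there */
  res = bpf_kptr_xchg(&mapval->node, NULL);
  if (res) {
          /* res is an owning reference again: it must be bpf_obj_drop'd or
           * handed to a collection before the program terminates.
           */
          bpf_obj_drop(res);
  }

If a stashed node is never exchanged back out, the map value keeps ownership
and the node is dropped when the map goes away; the unstash_rb_node program
added by this commit exercises this same pattern.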
Showing 8 changed files with 234 additions and 17 deletions.
New file (selftest runner, 60 lines added):
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */

#include <test_progs.h>
#include <network_helpers.h>

#include "local_kptr_stash.skel.h"

static void test_local_kptr_stash_simple(void)
{
	LIBBPF_OPTS(bpf_test_run_opts, opts,
		    .data_in = &pkt_v4,
		    .data_size_in = sizeof(pkt_v4),
		    .repeat = 1,
	);
	struct local_kptr_stash *skel;
	int ret;

	skel = local_kptr_stash__open_and_load();
	if (!ASSERT_OK_PTR(skel, "local_kptr_stash__open_and_load"))
		return;

	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.stash_rb_nodes), &opts);
	ASSERT_OK(ret, "local_kptr_stash_add_nodes run");
	ASSERT_OK(opts.retval, "local_kptr_stash_add_nodes retval");

	local_kptr_stash__destroy(skel);
}

static void test_local_kptr_stash_unstash(void)
{
	LIBBPF_OPTS(bpf_test_run_opts, opts,
		    .data_in = &pkt_v4,
		    .data_size_in = sizeof(pkt_v4),
		    .repeat = 1,
	);
	struct local_kptr_stash *skel;
	int ret;

	skel = local_kptr_stash__open_and_load();
	if (!ASSERT_OK_PTR(skel, "local_kptr_stash__open_and_load"))
		return;

	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.stash_rb_nodes), &opts);
	ASSERT_OK(ret, "local_kptr_stash_add_nodes run");
	ASSERT_OK(opts.retval, "local_kptr_stash_add_nodes retval");

	ret = bpf_prog_test_run_opts(bpf_program__fd(skel->progs.unstash_rb_node), &opts);
	ASSERT_OK(ret, "local_kptr_stash_add_nodes run");
	ASSERT_EQ(opts.retval, 42, "local_kptr_stash_add_nodes retval");

	local_kptr_stash__destroy(skel);
}

void test_local_kptr_stash_success(void)
{
	if (test__start_subtest("local_kptr_stash_simple"))
		test_local_kptr_stash_simple();
	if (test__start_subtest("local_kptr_stash_unstash"))
		test_local_kptr_stash_unstash();
}
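These subtests are picked up through the test_local_kptr_stash_success entry
point by the BPF selftest runner; filtering on the test name (for example,
test_progs -t local_kptr_stash) is the usual way to run just this file,
though the exact invocation depends on how the selftests are built.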
New file (BPF program under test, 108 lines added):
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2023 Meta Platforms, Inc. and affiliates. */

#include <vmlinux.h>
#include <bpf/bpf_tracing.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>
#include "bpf_experimental.h"

struct node_data {
	long key;
	long data;
	struct bpf_rb_node node;
};

struct map_value {
	struct prog_test_ref_kfunc *not_kptr;
	struct prog_test_ref_kfunc __kptr *val;
	struct node_data __kptr *node;
};

/* This is necessary so that LLVM generates BTF for node_data struct
 * If it's not included, a fwd reference for node_data will be generated but
 * no struct. Example BTF of "node" field in map_value when not included:
 *
 * [10] PTR '(anon)' type_id=35
 * [34] FWD 'node_data' fwd_kind=struct
 * [35] TYPE_TAG 'kptr_ref' type_id=34
 *
 * (with no node_data struct defined)
 * Had to do the same w/ bpf_kfunc_call_test_release below
 */
struct node_data *just_here_because_btf_bug;

extern void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p) __ksym;

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__type(key, int);
	__type(value, struct map_value);
	__uint(max_entries, 2);
} some_nodes SEC(".maps");

static int create_and_stash(int idx, int val)
{
	struct map_value *mapval;
	struct node_data *res;

	mapval = bpf_map_lookup_elem(&some_nodes, &idx);
	if (!mapval)
		return 1;

	res = bpf_obj_new(typeof(*res));
	if (!res)
		return 1;
	res->key = val;

	res = bpf_kptr_xchg(&mapval->node, res);
	if (res)
		bpf_obj_drop(res);
	return 0;
}

SEC("tc")
long stash_rb_nodes(void *ctx)
{
	return create_and_stash(0, 41) ?: create_and_stash(1, 42);
}

SEC("tc")
long unstash_rb_node(void *ctx)
{
	struct map_value *mapval;
	struct node_data *res;
	long retval;
	int key = 1;

	mapval = bpf_map_lookup_elem(&some_nodes, &key);
	if (!mapval)
		return 1;

	res = bpf_kptr_xchg(&mapval->node, NULL);
	if (res) {
		retval = res->key;
		bpf_obj_drop(res);
		return retval;
	}
	return 1;
}

SEC("tc")
long stash_test_ref_kfunc(void *ctx)
{
	struct prog_test_ref_kfunc *res;
	struct map_value *mapval;
	int key = 0;

	mapval = bpf_map_lookup_elem(&some_nodes, &key);
	if (!mapval)
		return 1;

	res = bpf_kptr_xchg(&mapval->val, NULL);
	if (res)
		bpf_kfunc_call_test_release(res);
	return 0;
}

char _license[] SEC("license") = "GPL";
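The cover letter's motivating flow, stash a node when a cgroup is created and
add it to an rbtree later from a different program, is not exercised by this
selftest. As a rough sketch only, assuming the bpf_rbtree_add kfunc and the
bpf_spin_lock helper plus the __contains() and private() conventions already
used by the existing rbtree selftests (glock, groot, less() and
unstash_and_add_to_tree are illustrative names, not part of this commit), the
consuming side could look like:

#define private(name) SEC(".data." #name) __hidden __attribute__((aligned(8)))

private(A) struct bpf_spin_lock glock;
private(A) struct bpf_rb_root groot __contains(node_data, node);

static bool less(struct bpf_rb_node *a, const struct bpf_rb_node *b)
{
	struct node_data *node_a;
	struct node_data *node_b;

	node_a = container_of(a, struct node_data, node);
	node_b = container_of(b, struct node_data, node);

	return node_a->key < node_b->key;
}

SEC("tc")
long unstash_and_add_to_tree(void *ctx)
{
	struct map_value *mapval;
	struct node_data *res;
	int key = 1;

	mapval = bpf_map_lookup_elem(&some_nodes, &key);
	if (!mapval)
		return 1;

	/* Take the owning reference back out of the stash... */
	res = bpf_kptr_xchg(&mapval->node, NULL);
	if (!res)
		return 1;

	/* ...and pass ownership of the node to the rbtree instead. */
	bpf_spin_lock(&glock);
	bpf_rbtree_add(&groot, &res->node, less);
	bpf_spin_unlock(&glock);
	return 0;
}

Ownership moves from the map value to the program via bpf_kptr_xchg and then
from the program to the rbtree via bpf_rbtree_add, so the node is never owned
by both containers at once, matching the "stashed or contained, never both"
rule from the cover letter.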