Skip to content

Commit

Permalink
Browse files Browse the repository at this point in the history
…/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2023-03-06

We've added 85 non-merge commits during the last 13 day(s) which contain
a total of 131 files changed, 7102 insertions(+), 1792 deletions(-).

The main changes are:

1) Add skb and XDP typed dynptrs which allow BPF programs for more
   ergonomic and less brittle iteration through data and variable-sized
   accesses, from Joanne Koong.

2) Bigger batch of BPF verifier improvements to prepare for upcoming BPF
   open-coded iterators allowing for less restrictive looping capabilities,
   from Andrii Nakryiko.

3) Rework RCU enforcement in the verifier, add kptr_rcu and enforce BPF
   programs to NULL-check before passing such pointers into kfunc,
   from Alexei Starovoitov.

4) Add support for kptrs in percpu hashmaps, percpu LRU hashmaps and in
   local storage maps, from Kumar Kartikeya Dwivedi.

5) Add BPF verifier support for ST instructions in convert_ctx_access()
   which will help new -mcpu=v4 clang flag to start emitting them,
   from Eduard Zingerman.

6) Make uprobe attachment Android APK aware by supporting attachment
   to functions inside ELF objects contained in APKs via function names,
   from Daniel Müller.

7) Add a new flag BPF_F_TIMER_ABS flag for bpf_timer_start() helper
   to start the timer with absolute expiration value instead of relative
   one, from Tero Kristo.

8) Add a new kfunc bpf_cgroup_from_id() to look up cgroups via id,
   from Tejun Heo.

9) Extend libbpf to support users manually attaching kprobes/uprobes
   in the legacy/perf/link mode, from Menglong Dong.

10) Implement workarounds in the mips BPF JIT for DADDI/R4000,
   from Jiaxun Yang.

11) Enable mixing bpf2bpf and tailcalls for the loongarch BPF JIT,
    from Hengqi Chen.

12) Extend BPF instruction set doc with describing the encoding of BPF
    instructions in terms of how bytes are stored under big/little endian,
    from Jose E. Marchesi.

13) Follow-up to enable kfunc support for riscv BPF JIT, from Pu Lehui.

14) Fix bpf_xdp_query() backwards compatibility on old kernels,
    from Yonghong Song.

15) Fix BPF selftest cross compilation with CLANG_CROSS_FLAGS,
    from Florent Revest.

16) Improve bpf_cpumask_ma to only allocate one bpf_mem_cache,
    from Hou Tao.

17) Fix BPF verifier's check_subprogs to not unnecessarily mark
    a subprogram with has_tail_call, from Ilya Leoshkevich.

18) Fix arm syscall regs spec in libbpf's bpf_tracing.h, from Puranjay Mohan.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (85 commits)
  selftests/bpf: Add test for legacy/perf kprobe/uprobe attach mode
  selftests/bpf: Split test_attach_probe into multi subtests
  libbpf: Add support to set kprobe/uprobe attach mode
  tools/resolve_btfids: Add /libsubcmd to .gitignore
  bpf: add support for fixed-size memory pointer returns for kfuncs
  bpf: generalize dynptr_get_spi to be usable for iters
  bpf: mark PTR_TO_MEM as non-null register type
  bpf: move kfunc_call_arg_meta higher in the file
  bpf: ensure that r0 is marked scratched after any function call
  bpf: fix visit_insn()'s detection of BPF_FUNC_timer_set_callback helper
  bpf: clean up visit_insn()'s instruction processing
  selftests/bpf: adjust log_fixup's buffer size for proper truncation
  bpf: honor env->test_state_freq flag in is_state_visited()
  selftests/bpf: enhance align selftest's expected log matching
  bpf: improve regsafe() checks for PTR_TO_{MEM,BUF,TP_BUFFER}
  bpf: improve stack slot state printing
  selftests/bpf: Disassembler tests for verifier.c:convert_ctx_access()
  selftests/bpf: test if pointer type is tracked for BPF_ST_MEM
  bpf: allow ctx writes using BPF_ST_MEM instruction
  bpf: Use separate RCU callbacks for freeing selem
  ...
====================

Link: https://lore.kernel.org/r/20230307004346.27578-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
  • Loading branch information
Jakub Kicinski committed Mar 7, 2023
2 parents 5ca26d6 + 8f4c92f commit 36e5e39
Show file tree
Hide file tree
Showing 131 changed files with 7,102 additions and 1,792 deletions.
4 changes: 2 additions & 2 deletions Documentation/bpf/bpf_design_QA.rst
Original file line number Diff line number Diff line change
Expand Up @@ -314,7 +314,7 @@ Q: What is the compatibility story for special BPF types in map values?
Q: Users are allowed to embed bpf_spin_lock, bpf_timer fields in their BPF map
values (when using BTF support for BPF maps). This allows to use helpers for
such objects on these fields inside map values. Users are also allowed to embed
pointers to some kernel types (with __kptr and __kptr_ref BTF tags). Will the
pointers to some kernel types (with __kptr_untrusted and __kptr BTF tags). Will the
kernel preserve backwards compatibility for these features?

A: It depends. For bpf_spin_lock, bpf_timer: YES, for kptr and everything else:
Expand All @@ -324,7 +324,7 @@ For struct types that have been added already, like bpf_spin_lock and bpf_timer,
the kernel will preserve backwards compatibility, as they are part of UAPI.

For kptrs, they are also part of UAPI, but only with respect to the kptr
mechanism. The types that you can use with a __kptr and __kptr_ref tagged
mechanism. The types that you can use with a __kptr_untrusted and __kptr tagged
pointer in your struct are NOT part of the UAPI contract. The supported types can
and will change across kernel releases. However, operations like accessing kptr
fields and bpf_kptr_xchg() helper will continue to be supported across kernel
Expand Down
14 changes: 7 additions & 7 deletions Documentation/bpf/bpf_devel_QA.rst
Original file line number Diff line number Diff line change
Expand Up @@ -128,7 +128,7 @@ into the bpf-next tree will make their way into net-next tree. net and
net-next are both run by David S. Miller. From there, they will go
into the kernel mainline tree run by Linus Torvalds. To read up on the
process of net and net-next being merged into the mainline tree, see
the :ref:`netdev-FAQ`
the `netdev-FAQ`_.



Expand All @@ -147,7 +147,7 @@ request)::
Q: How do I indicate which tree (bpf vs. bpf-next) my patch should be applied to?
---------------------------------------------------------------------------------

A: The process is the very same as described in the :ref:`netdev-FAQ`,
A: The process is the very same as described in the `netdev-FAQ`_,
so please read up on it. The subject line must indicate whether the
patch is a fix or rather "next-like" content in order to let the
maintainers know whether it is targeted at bpf or bpf-next.
Expand Down Expand Up @@ -206,7 +206,7 @@ ii) run extensive BPF test suite and
Once the BPF pull request was accepted by David S. Miller, then
the patches end up in net or net-next tree, respectively, and
make their way from there further into mainline. Again, see the
:ref:`netdev-FAQ` for additional information e.g. on how often they are
`netdev-FAQ`_ for additional information e.g. on how often they are
merged to mainline.

Q: How long do I need to wait for feedback on my BPF patches?
Expand All @@ -230,7 +230,7 @@ Q: Are patches applied to bpf-next when the merge window is open?
-----------------------------------------------------------------
A: For the time when the merge window is open, bpf-next will not be
processed. This is roughly analogous to net-next patch processing,
so feel free to read up on the :ref:`netdev-FAQ` about further details.
so feel free to read up on the `netdev-FAQ`_ about further details.

During those two weeks of merge window, we might ask you to resend
your patch series once bpf-next is open again. Once Linus released
Expand Down Expand Up @@ -394,7 +394,7 @@ netdev kernel mailing list in Cc and ask for the fix to be queued up:
netdev@vger.kernel.org

The process in general is the same as on netdev itself, see also the
:ref:`netdev-FAQ`.
`netdev-FAQ`_.

Q: Do you also backport to kernels not currently maintained as stable?
----------------------------------------------------------------------
Expand All @@ -410,7 +410,7 @@ Q: The BPF patch I am about to submit needs to go to stable as well
What should I do?

A: The same rules apply as with netdev patch submissions in general, see
the :ref:`netdev-FAQ`.
the `netdev-FAQ`_.

Never add "``Cc: stable@vger.kernel.org``" to the patch description, but
ask the BPF maintainers to queue the patches instead. This can be done
Expand Down Expand Up @@ -685,7 +685,7 @@ when:

.. Links
.. _Documentation/process/: https://www.kernel.org/doc/html/latest/process/
.. _netdev-FAQ: Documentation/process/maintainer-netdev.rst
.. _netdev-FAQ: https://www.kernel.org/doc/html/latest/process/maintainer-netdev.html
.. _selftests:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/bpf/
.. _Documentation/dev-tools/kselftest.rst:
Expand Down
4 changes: 2 additions & 2 deletions Documentation/bpf/cpumasks.rst
Original file line number Diff line number Diff line change
Expand Up @@ -51,7 +51,7 @@ For example:
.. code-block:: c
struct cpumask_map_value {
struct bpf_cpumask __kptr_ref * cpumask;
struct bpf_cpumask __kptr * cpumask;
};
struct array_map {
Expand Down Expand Up @@ -128,7 +128,7 @@ Here is an example of a ``struct bpf_cpumask *`` being retrieved from a map:
/* struct containing the struct bpf_cpumask kptr which is stored in the map. */
struct cpumasks_kfunc_map_value {
struct bpf_cpumask __kptr_ref * bpf_cpumask;
struct bpf_cpumask __kptr * bpf_cpumask;
};
/* The map containing struct cpumasks_kfunc_map_value entries. */
Expand Down
40 changes: 27 additions & 13 deletions Documentation/bpf/instruction-set.rst
Original file line number Diff line number Diff line change
Expand Up @@ -38,14 +38,11 @@ eBPF has two instruction encodings:
* the wide instruction encoding, which appends a second 64-bit immediate (i.e.,
constant) value after the basic instruction for a total of 128 bits.

The basic instruction encoding is as follows, where MSB and LSB mean the most significant
bits and least significant bits, respectively:
The fields conforming an encoded basic instruction are stored in the
following order::

============= ======= ======= ======= ============
32 bits (MSB) 16 bits 4 bits 4 bits 8 bits (LSB)
============= ======= ======= ======= ============
imm offset src_reg dst_reg opcode
============= ======= ======= ======= ============
opcode:8 src_reg:4 dst_reg:4 offset:16 imm:32 // In little-endian BPF.
opcode:8 dst_reg:4 src_reg:4 offset:16 imm:32 // In big-endian BPF.

**imm**
signed integer immediate value
Expand All @@ -63,6 +60,18 @@ imm offset src_reg dst_reg opcode
**opcode**
operation to perform

Note that the contents of multi-byte fields ('imm' and 'offset') are
stored using big-endian byte ordering in big-endian BPF and
little-endian byte ordering in little-endian BPF.

For example::

opcode offset imm assembly
src_reg dst_reg
07 0 1 00 00 44 33 22 11 r1 += 0x11223344 // little
dst_reg src_reg
07 1 0 00 00 11 22 33 44 r1 += 0x11223344 // big

Note that most instructions do not use all of the fields.
Unused fields shall be cleared to zero.

Expand All @@ -72,18 +81,23 @@ The 64 bits following the basic instruction contain a pseudo instruction
using the same format but with opcode, dst_reg, src_reg, and offset all set to zero,
and imm containing the high 32 bits of the immediate value.

================= ==================
64 bits (MSB) 64 bits (LSB)
================= ==================
basic instruction pseudo instruction
================= ==================
This is depicted in the following figure::

basic_instruction
.-----------------------------.
| |
code:8 regs:8 offset:16 imm:32 unused:32 imm:32
| |
'--------------'
pseudo instruction

Thus the 64-bit immediate value is constructed as follows:

imm64 = (next_imm << 32) | imm

where 'next_imm' refers to the imm value of the pseudo instruction
following the basic instruction.
following the basic instruction. The unused bytes in the pseudo
instruction are reserved and shall be cleared to zero.

Instruction classes
-------------------
Expand Down
41 changes: 32 additions & 9 deletions Documentation/bpf/kfuncs.rst
Original file line number Diff line number Diff line change
Expand Up @@ -100,6 +100,23 @@ Hence, whenever a constant scalar argument is accepted by a kfunc which is not a
size parameter, and the value of the constant matters for program safety, __k
suffix should be used.

2.2.2 __uninit Annotation
-------------------------

This annotation is used to indicate that the argument will be treated as
uninitialized.

An example is given below::

__bpf_kfunc int bpf_dynptr_from_skb(..., struct bpf_dynptr_kern *ptr__uninit)
{
...
}

Here, the dynptr will be treated as an uninitialized dynptr. Without this
annotation, the verifier will reject the program if the dynptr passed in is
not initialized.

.. _BPF_kfunc_nodef:

2.3 Using an existing kernel function
Expand Down Expand Up @@ -232,11 +249,13 @@ added later.
2.4.8 KF_RCU flag
-----------------

The KF_RCU flag is used for kfuncs which have a rcu ptr as its argument.
When used together with KF_ACQUIRE, it indicates the kfunc should have a
single argument which must be a trusted argument or a MEM_RCU pointer.
The argument may have reference count of 0 and the kfunc must take this
into consideration.
The KF_RCU flag is a weaker version of KF_TRUSTED_ARGS. The kfuncs marked with
KF_RCU expect either PTR_TRUSTED or MEM_RCU arguments. The verifier guarantees
that the objects are valid and there is no use-after-free. The pointers are not
NULL, but the object's refcount could have reached zero. The kfuncs need to
consider doing refcnt != 0 check, especially when returning a KF_ACQUIRE
pointer. Note as well that a KF_ACQUIRE kfunc that is KF_RCU should very likely
also be KF_RET_NULL.

.. _KF_deprecated_flag:

Expand Down Expand Up @@ -527,7 +546,7 @@ Here's an example of how it can be used:
/* struct containing the struct task_struct kptr which is actually stored in the map. */
struct __cgroups_kfunc_map_value {
struct cgroup __kptr_ref * cgroup;
struct cgroup __kptr * cgroup;
};
/* The map containing struct __cgroups_kfunc_map_value entries. */
Expand Down Expand Up @@ -583,13 +602,17 @@ Here's an example of how it can be used:
----

Another kfunc available for interacting with ``struct cgroup *`` objects is
bpf_cgroup_ancestor(). This allows callers to access the ancestor of a cgroup,
and return it as a cgroup kptr.
Other kfuncs available for interacting with ``struct cgroup *`` objects are
bpf_cgroup_ancestor() and bpf_cgroup_from_id(), allowing callers to access
the ancestor of a cgroup and find a cgroup by its ID, respectively. Both
return a cgroup kptr.

.. kernel-doc:: kernel/bpf/helpers.c
:identifiers: bpf_cgroup_ancestor

.. kernel-doc:: kernel/bpf/helpers.c
:identifiers: bpf_cgroup_from_id

Eventually, BPF should be updated to allow this to happen with a normal memory
load in the program itself. This is currently not possible without more work in
the verifier. bpf_cgroup_ancestor() can be used as follows:
Expand Down
7 changes: 4 additions & 3 deletions Documentation/bpf/maps.rst
Original file line number Diff line number Diff line change
Expand Up @@ -11,9 +11,9 @@ maps are accessed from BPF programs via BPF helpers which are documented in the
`man-pages`_ for `bpf-helpers(7)`_.

BPF maps are accessed from user space via the ``bpf`` syscall, which provides
commands to create maps, lookup elements, update elements and delete
elements. More details of the BPF syscall are available in
:doc:`/userspace-api/ebpf/syscall` and in the `man-pages`_ for `bpf(2)`_.
commands to create maps, lookup elements, update elements and delete elements.
More details of the BPF syscall are available in `ebpf-syscall`_ and in the
`man-pages`_ for `bpf(2)`_.

Map Types
=========
Expand Down Expand Up @@ -79,3 +79,4 @@ Find and delete element by key in a given map using ``attr->map_fd``,
.. _man-pages: https://www.kernel.org/doc/man-pages/
.. _bpf(2): https://man7.org/linux/man-pages/man2/bpf.2.html
.. _bpf-helpers(7): https://man7.org/linux/man-pages/man7/bpf-helpers.7.html
.. _ebpf-syscall: https://docs.kernel.org/userspace-api/ebpf/syscall.html
6 changes: 6 additions & 0 deletions arch/loongarch/net/bpf_jit.c
Original file line number Diff line number Diff line change
Expand Up @@ -1248,3 +1248,9 @@ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog)

return prog;
}

/* Indicate the JIT backend supports mixing bpf2bpf and tailcalls. */
bool bpf_jit_supports_subprog_tailcalls(void)
{
return true;
}
5 changes: 1 addition & 4 deletions arch/mips/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -63,10 +63,7 @@ config MIPS
select HAVE_DEBUG_STACKOVERFLOW
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_EBPF_JIT if !CPU_MICROMIPS && \
!CPU_DADDI_WORKAROUNDS && \
!CPU_R4000_WORKAROUNDS && \
!CPU_R4400_WORKAROUNDS
select HAVE_EBPF_JIT if !CPU_MICROMIPS
select HAVE_EXIT_THREAD
select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
Expand Down
4 changes: 4 additions & 0 deletions arch/mips/net/bpf_jit_comp.c
Original file line number Diff line number Diff line change
Expand Up @@ -218,9 +218,13 @@ bool valid_alu_i(u8 op, s32 imm)
/* All legal eBPF values are valid */
return true;
case BPF_ADD:
if (IS_ENABLED(CONFIG_CPU_DADDI_WORKAROUNDS))
return false;
/* imm must be 16 bits */
return imm >= -0x8000 && imm <= 0x7fff;
case BPF_SUB:
if (IS_ENABLED(CONFIG_CPU_DADDI_WORKAROUNDS))
return false;
/* -imm must be 16 bits */
return imm >= -0x7fff && imm <= 0x8000;
case BPF_AND:
Expand Down
3 changes: 3 additions & 0 deletions arch/mips/net/bpf_jit_comp64.c
Original file line number Diff line number Diff line change
Expand Up @@ -228,6 +228,9 @@ static void emit_alu_r64(struct jit_context *ctx, u8 dst, u8 src, u8 op)
} else {
emit(ctx, dmultu, dst, src);
emit(ctx, mflo, dst);
/* Ensure multiplication is completed */
if (IS_ENABLED(CONFIG_CPU_R4000_WORKAROUNDS))
emit(ctx, mfhi, MIPS_R_ZERO);
}
break;
/* dst = dst / src */
Expand Down
5 changes: 5 additions & 0 deletions arch/riscv/net/bpf_jit_comp64.c
Original file line number Diff line number Diff line change
Expand Up @@ -1751,3 +1751,8 @@ void bpf_jit_build_epilogue(struct rv_jit_context *ctx)
{
__build_epilogue(false, ctx);
}

bool bpf_jit_supports_kfunc_call(void)
{
return true;
}
Loading

0 comments on commit 36e5e39

Please sign in to comment.