Merge tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ftrace updates from Steven Rostedt:

 - Have fprobes built on top of function graph infrastructure

   The fprobe logic is an optimized kprobe that uses ftrace to attach to
   functions when a probe is needed at the start or end of the function.
   The fprobe and kretprobe logic implement a method similar to the
   function graph tracer to trace the end of a function: hijack the
   return address and jump to a trampoline that does the tracing when
   the function exits. To do this, a shadow stack is needed to store the
   original return address. Fprobes and function graph do this slightly
   differently. Fprobes (and kretprobes) have slots per callsite that
   are reserved to save the return address. This is fine when just a few
   points are traced. But users of fprobes, such as BPF programs, are
   starting to add many more locations, and this method does not scale.

   The function graph tracer was created to trace all functions in the
   kernel. In order to do this, when function graph tracing is started,
   every task gets its own shadow stack to hold the return addresses
   that are going to be traced. The function graph tracer has been
   updated to allow multiple users to use its infrastructure, and
   fprobes are now one of those users. This also allows the fprobe and
   kretprobe methods of tracing the return address to become obsolete.
   With new technologies like CFI that need to know about these methods
   of hijacking the return address, moving toward a solution that has
   only one way of doing this makes the kernel less complex.

 - Cleanup with guard() and free() helpers

   There were several places in the code that had a lot of "goto out"
   statements in the error paths to either unlock a lock or free memory
   that was allocated. This is error prone. Convert the code over to
   use the guard() and free() helpers, which let the compiler unlock
   locks or free memory automatically when the function exits.
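
   As a minimal sketch of what such a conversion looks like (the
   my_mutex lock and the do_thing()/use_buffer() names are made up
   for illustration and are not taken from this series):

      #include <linux/cleanup.h>
      #include <linux/mutex.h>
      #include <linux/slab.h>

      static DEFINE_MUTEX(my_mutex);

      static int do_thing(size_t size)
      {
              /* Unlocked automatically when the function returns */
              guard(mutex)(&my_mutex);

              /* Freed automatically on every return path */
              void *buf __free(kfree) = kzalloc(size, GFP_KERNEL);
              if (!buf)
                      return -ENOMEM;

              return use_buffer(buf);
      }

   The error paths no longer need a "goto out" label, so a missed
   unlock or a leaked allocation cannot slip in.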

 - Remove disabling of interrupts in the function graph tracer

   When the function graph tracer was first introduced, it could race
   with interrupts and NMIs. To prevent that race, it would disable
   interrupts and not trace NMIs. But the code has since changed to
   allow both NMIs and interrupts. That change was done a long time ago,
   but the disabling of interrupts was never removed. Remove the
   disabling of interrupts in the function graph tracer as it is not
   needed. This greatly improves its performance.

 - Allow the :mod: command to enable tracing module functions on the
   kernel command line.

   The function tracer already has a way to enable functions to be
   traced in modules by writing ":mod:<module>" into set_ftrace_filter.
   That will enable all the functions for the module if it is already
   loaded; if it is not, the command is cached, and when a module
   matching <module> is loaded, its functions will be enabled. This
   also allows init functions to be traced. But currently events do
   not have that feature.

   Because enabling function tracing can be done very early at boot up
   (before scheduling is enabled), the commands that can be run when
   function tracing is started are limited. Having the ":mod:" command
   to trace module functions as they are loaded is very useful, so
   update the kernel command line function filtering to allow it (see
   the example below).
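
   For example (the module name "ext4" below is only an illustration
   of the syntax, not something specific to this series), the runtime
   interface already accepts:

      echo ':mod:ext4' > /sys/kernel/tracing/set_ftrace_filter

   and with this update the equivalent command can be given on the
   kernel command line, so the module's functions are enabled as soon
   as the module is loaded:

      ftrace_filter=:mod:ext4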

* tag 'ftrace-v6.14' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (26 commits)
  ftrace: Implement :mod: cache filtering on kernel command line
  tracing: Adopt __free() and guard() for trace_fprobe.c
  bpf: Use ftrace_get_symaddr() for kprobe_multi probes
  ftrace: Add ftrace_get_symaddr to convert fentry_ip to symaddr
  Documentation: probes: Update fprobe on function-graph tracer
  selftests/ftrace: Add a test case for repeating register/unregister fprobe
  selftests: ftrace: Remove obsolate maxactive syntax check
  tracing/fprobe: Remove nr_maxactive from fprobe
  fprobe: Add fprobe_header encoding feature
  fprobe: Rewrite fprobe on function-graph tracer
  s390/tracing: Enable HAVE_FTRACE_GRAPH_FUNC
  ftrace: Add CONFIG_HAVE_FTRACE_GRAPH_FUNC
  bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
  tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  tracing: Add ftrace_fill_perf_regs() for perf event
  tracing: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  fprobe: Use ftrace_regs in fprobe exit handler
  fprobe: Use ftrace_regs in fprobe entry handler
  fgraph: Pass ftrace_regs to retfunc
  fgraph: Replace fgraph_ret_regs with ftrace_regs
  ...
Linus Torvalds committed Jan 21, 2025
2 parents 0074ade + 31f505d commit 2e04247
Showing 57 changed files with 1,504 additions and 852 deletions.
42 changes: 27 additions & 15 deletions Documentation/trace/fprobe.rst
@@ -9,9 +9,10 @@ Fprobe - Function entry/exit probe
Introduction
============

Fprobe is a function entry/exit probe mechanism based on ftrace.
Instead of using ftrace full feature, if you only want to attach callbacks
on function entry and exit, similar to the kprobes and kretprobes, you can
Fprobe is a function entry/exit probe based on the function-graph tracing
feature in ftrace.
Instead of tracing all functions, if you want to attach callbacks to specific
function entry and exit points, similar to kprobes and kretprobes, you can
use fprobe. Compared with kprobes and kretprobes, fprobe gives faster
instrumentation for multiple functions with a single handler. This document
describes how to use fprobe.
@@ -91,12 +92,14 @@ The prototype of the entry/exit callback function are as follows:

.. code-block:: c
int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data);
void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data);
Note that the @entry_ip is saved at function entry and passed to exit handler.
If the entry callback function returns !0, the corresponding exit callback will be cancelled.

@fp
This is the address of `fprobe` data structure related to this handler.
@@ -112,19 +115,28 @@ If the entry callback function returns !0, the corresponding exit callback will
This is the return address that the traced function will return to,
somewhere in the caller. This can be used at both entry and exit.

@regs
This is the `pt_regs` data structure at the entry and exit. Note that
the instruction pointer of @regs may be different from the @entry_ip
in the entry_handler. If you need traced instruction pointer, you need
to use @entry_ip. On the other hand, in the exit_handler, the instruction
pointer of @regs is set to the current return address.
@fregs
This is the `ftrace_regs` data structure at the entry and exit. This
includes the function parameters and the return values, so users can
access those values via the appropriate `ftrace_regs_*` APIs.

@entry_data
This is local storage to share data between the entry and exit handlers.
This storage is NULL by default. If the user specifies the `exit_handler`
and `entry_data_size` fields when registering the fprobe, the storage is
allocated and passed to both `entry_handler` and `exit_handler`.

Entry data size and exit handlers on the same function
======================================================

Since the entry data is passed via a per-task stack of limited size, the
entry data size per probe is limited to `15 * sizeof(long)`. Also take care
when different fprobes are probing the same function, because this limit
becomes smaller: the entry data size is aligned to `sizeof(long)`, and each
fprobe which has an exit handler uses an additional `sizeof(long)` of space
on the stack. So keep the number of fprobes on the same function as small
as possible.
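
For example, here is a minimal sketch of how the ftrace_regs-based callbacks
and `entry_data` described above fit together (the probed function `vfs_read`
and the handler names are only illustrations, not part of this document):

.. code-block:: c

 #include <linux/fprobe.h>
 #include <linux/ftrace.h>
 #include <linux/printk.h>

 /* Entry handler: stash the first argument into the shared entry_data. */
 static int my_entry(struct fprobe *fp, unsigned long entry_ip,
                     unsigned long ret_ip, struct ftrace_regs *fregs,
                     void *entry_data)
 {
         *(unsigned long *)entry_data = ftrace_regs_get_argument(fregs, 0);
         return 0;       /* returning !0 would cancel the exit callback */
 }

 /* Exit handler: read the return value via the ftrace_regs API. */
 static void my_exit(struct fprobe *fp, unsigned long entry_ip,
                     unsigned long ret_ip, struct ftrace_regs *fregs,
                     void *entry_data)
 {
         pr_info("arg0=%lx ret=%lx\n", *(unsigned long *)entry_data,
                 ftrace_regs_get_return_value(fregs));
 }

 static struct fprobe my_fprobe = {
         .entry_handler   = my_entry,
         .exit_handler    = my_exit,
         .entry_data_size = sizeof(unsigned long), /* within the limit above */
 };

 /* e.g. from a module's init function:
  *      err = register_fprobe(&my_fprobe, "vfs_read", NULL);
  */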

Share the callbacks with kprobes
================================

@@ -165,8 +177,8 @@ This counter counts up when;
- fprobe fails to take ftrace_recursion lock. This usually means that a function
which is traced by other ftrace users is called from the entry_handler.

- fprobe fails to setup the function exit because of the shortage of rethook
(the shadow stack for hooking the function return.)
- fprobe fails to setup the function exit because it failed to allocate the
  data buffer from the per-task shadow stack.

The `fprobe::nmissed` field counts up in both cases. Therefore, the former
skips both of entry and exit callback and the latter skips the exit
2 changes: 2 additions & 0 deletions arch/arm64/Kconfig
@@ -217,9 +217,11 @@ config ARM64
select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_GUP_FAST
select HAVE_FTRACE_GRAPH_FUNC
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_TRACER
select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_FREGS
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_GRAPH_RETVAL
select HAVE_GCC_PLUGINS
1 change: 1 addition & 0 deletions arch/arm64/include/asm/Kbuild
@@ -8,6 +8,7 @@ syscall-y += unistd_32.h
syscall-y += unistd_compat_32.h

generic-y += early_ioremap.h
generic-y += fprobe.h
generic-y += mcs_spinlock.h
generic-y += mmzone.h
generic-y += qrwlock.h
51 changes: 34 additions & 17 deletions arch/arm64/include/asm/ftrace.h
@@ -52,6 +52,8 @@ extern unsigned long ftrace_graph_call;
extern void return_to_handler(void);

unsigned long ftrace_call_adjust(unsigned long addr);
unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip);
#define ftrace_get_symaddr(fentry_ip) arch_ftrace_get_symaddr(fentry_ip)

#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS
#define HAVE_ARCH_FTRACE_REGS
@@ -129,6 +131,38 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
arch_ftrace_regs(fregs)->pc = arch_ftrace_regs(fregs)->lr;
}

static __always_inline unsigned long
ftrace_regs_get_frame_pointer(const struct ftrace_regs *fregs)
{
return arch_ftrace_regs(fregs)->fp;
}

static __always_inline unsigned long
ftrace_regs_get_return_address(const struct ftrace_regs *fregs)
{
return arch_ftrace_regs(fregs)->lr;
}

static __always_inline struct pt_regs *
ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
{
struct __arch_ftrace_regs *afregs = arch_ftrace_regs(fregs);

memcpy(regs->regs, afregs->regs, sizeof(afregs->regs));
regs->sp = afregs->sp;
regs->pc = afregs->pc;
regs->regs[29] = afregs->fp;
regs->regs[30] = afregs->lr;
return regs;
}

#define arch_ftrace_fill_perf_regs(fregs, _regs) do { \
(_regs)->pc = arch_ftrace_regs(fregs)->pc; \
(_regs)->regs[29] = arch_ftrace_regs(fregs)->fp; \
(_regs)->sp = arch_ftrace_regs(fregs)->sp; \
(_regs)->pstate = PSR_MODE_EL1h; \
} while (0)

int ftrace_regs_query_register_offset(const char *name);

int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
@@ -186,23 +220,6 @@ static inline bool arch_syscall_match_sym_name(const char *sym,

#ifndef __ASSEMBLY__
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
struct fgraph_ret_regs {
/* x0 - x7 */
unsigned long regs[8];

unsigned long fp;
unsigned long __unused;
};

static inline unsigned long fgraph_ret_regs_return_value(struct fgraph_ret_regs *ret_regs)
{
return ret_regs->regs[0];
}

static inline unsigned long fgraph_ret_regs_frame_pointer(struct fgraph_ret_regs *ret_regs)
{
return ret_regs->fp;
}

void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
unsigned long frame_pointer);
12 changes: 0 additions & 12 deletions arch/arm64/kernel/asm-offsets.c
@@ -179,18 +179,6 @@ int main(void)
DEFINE(FTRACE_OPS_FUNC, offsetof(struct ftrace_ops, func));
#endif
BLANK();
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
DEFINE(FGRET_REGS_X0, offsetof(struct fgraph_ret_regs, regs[0]));
DEFINE(FGRET_REGS_X1, offsetof(struct fgraph_ret_regs, regs[1]));
DEFINE(FGRET_REGS_X2, offsetof(struct fgraph_ret_regs, regs[2]));
DEFINE(FGRET_REGS_X3, offsetof(struct fgraph_ret_regs, regs[3]));
DEFINE(FGRET_REGS_X4, offsetof(struct fgraph_ret_regs, regs[4]));
DEFINE(FGRET_REGS_X5, offsetof(struct fgraph_ret_regs, regs[5]));
DEFINE(FGRET_REGS_X6, offsetof(struct fgraph_ret_regs, regs[6]));
DEFINE(FGRET_REGS_X7, offsetof(struct fgraph_ret_regs, regs[7]));
DEFINE(FGRET_REGS_FP, offsetof(struct fgraph_ret_regs, fp));
DEFINE(FGRET_REGS_SIZE, sizeof(struct fgraph_ret_regs));
#endif
#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
DEFINE(FTRACE_OPS_DIRECT_CALL, offsetof(struct ftrace_ops, direct_call));
#endif
32 changes: 18 additions & 14 deletions arch/arm64/kernel/entry-ftrace.S
@@ -329,24 +329,28 @@ SYM_FUNC_END(ftrace_stub_graph)
* @fp is checked against the value passed by ftrace_graph_caller().
*/
SYM_CODE_START(return_to_handler)
/* save return value regs */
sub sp, sp, #FGRET_REGS_SIZE
stp x0, x1, [sp, #FGRET_REGS_X0]
stp x2, x3, [sp, #FGRET_REGS_X2]
stp x4, x5, [sp, #FGRET_REGS_X4]
stp x6, x7, [sp, #FGRET_REGS_X6]
str x29, [sp, #FGRET_REGS_FP] // parent's fp
/* Make room for ftrace_regs */
sub sp, sp, #FREGS_SIZE

/* Save return value regs */
stp x0, x1, [sp, #FREGS_X0]
stp x2, x3, [sp, #FREGS_X2]
stp x4, x5, [sp, #FREGS_X4]
stp x6, x7, [sp, #FREGS_X6]

/* Save the callsite's FP */
str x29, [sp, #FREGS_FP]

mov x0, sp
bl ftrace_return_to_handler // addr = ftrace_return_to_handler(regs);
bl ftrace_return_to_handler // addr = ftrace_return_to_handler(fregs);
mov x30, x0 // restore the original return address

/* restore return value regs */
ldp x0, x1, [sp, #FGRET_REGS_X0]
ldp x2, x3, [sp, #FGRET_REGS_X2]
ldp x4, x5, [sp, #FGRET_REGS_X4]
ldp x6, x7, [sp, #FGRET_REGS_X6]
add sp, sp, #FGRET_REGS_SIZE
/* Restore return value regs */
ldp x0, x1, [sp, #FREGS_X0]
ldp x2, x3, [sp, #FREGS_X2]
ldp x4, x5, [sp, #FREGS_X4]
ldp x6, x7, [sp, #FREGS_X6]
add sp, sp, #FREGS_SIZE

ret
SYM_CODE_END(return_to_handler)
78 changes: 77 additions & 1 deletion arch/arm64/kernel/ftrace.c
@@ -143,6 +143,69 @@ unsigned long ftrace_call_adjust(unsigned long addr)
return addr;
}

/* Convert fentry_ip to the symbol address without kallsyms */
unsigned long arch_ftrace_get_symaddr(unsigned long fentry_ip)
{
u32 insn;

/*
* When using patchable-function-entry without pre-function NOPS, ftrace
* entry is the address of the first NOP after the function entry point.
*
* The compiler has either generated:
*
* func+00: func: NOP // To be patched to MOV X9, LR
* func+04: NOP // To be patched to BL <caller>
*
* Or:
*
* func-04: BTI C
* func+00: func: NOP // To be patched to MOV X9, LR
* func+04: NOP // To be patched to BL <caller>
*
* The fentry_ip is the address of `BL <caller>` which is at `func + 4`
* bytes in either case.
*/
if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS))
return fentry_ip - AARCH64_INSN_SIZE;

/*
* When using patchable-function-entry with pre-function NOPs, BTI is
* a bit different.
*
* func+00: func: NOP // To be patched to MOV X9, LR
* func+04: NOP // To be patched to BL <caller>
*
* Or:
*
* func+00: func: BTI C
* func+04: NOP // To be patched to MOV X9, LR
* func+08: NOP // To be patched to BL <caller>
*
* The fentry_ip is the address of `BL <caller>` which is at either
* `func + 4` or `func + 8` depends on whether there is a BTI.
*/

/* If there is no BTI, the func address should be one instruction before. */
if (!IS_ENABLED(CONFIG_ARM64_BTI_KERNEL))
return fentry_ip - AARCH64_INSN_SIZE;

/* We want to be extra safe in case entry ip is on the page edge,
* but otherwise we need to avoid get_kernel_nofault()'s overhead.
*/
if ((fentry_ip & ~PAGE_MASK) < AARCH64_INSN_SIZE * 2) {
if (get_kernel_nofault(insn, (u32 *)(fentry_ip - AARCH64_INSN_SIZE * 2)))
return 0;
} else {
insn = *(u32 *)(fentry_ip - AARCH64_INSN_SIZE * 2);
}

if (aarch64_insn_is_bti(le32_to_cpu((__le32)insn)))
return fentry_ip - AARCH64_INSN_SIZE * 2;

return fentry_ip - AARCH64_INSN_SIZE;
}

/*
* Replace a single instruction, which may be a branch or NOP.
* If @validate == true, a replaced instruction is checked against 'old'.
@@ -481,7 +544,20 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
struct ftrace_ops *op, struct ftrace_regs *fregs)
{
prepare_ftrace_return(ip, &arch_ftrace_regs(fregs)->lr, arch_ftrace_regs(fregs)->fp);
unsigned long return_hooker = (unsigned long)&return_to_handler;
unsigned long frame_pointer = arch_ftrace_regs(fregs)->fp;
unsigned long *parent = &arch_ftrace_regs(fregs)->lr;
unsigned long old;

if (unlikely(atomic_read(&current->tracing_graph_pause)))
return;

old = *parent;

if (!function_graph_enter_regs(old, ip, frame_pointer,
(void *)frame_pointer, fregs)) {
*parent = return_hooker;
}
}
#else
/*
4 changes: 3 additions & 1 deletion arch/loongarch/Kconfig
@@ -129,16 +129,18 @@ config LOONGARCH
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_ARGS
select HAVE_FTRACE_REGS_HAVING_PT_REGS
select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
select HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_EBPF_JIT
select HAVE_EFFICIENT_UNALIGNED_ACCESS if !ARCH_STRICT_ALIGN
select HAVE_EXIT_THREAD
select HAVE_GUP_FAST
select HAVE_FTRACE_GRAPH_FUNC
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_ARG_ACCESS_API
select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_RETVAL if HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_GRAPH_FREGS
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_FUNCTION_TRACER
select HAVE_GCC_PLUGINS
12 changes: 12 additions & 0 deletions arch/loongarch/include/asm/fprobe.h
@@ -0,0 +1,12 @@
/* SPDX-License-Identifier: GPL-2.0 */
#ifndef _ASM_LOONGARCH_FPROBE_H
#define _ASM_LOONGARCH_FPROBE_H

/*
* Explicitly undef ARCH_DEFINE_ENCODE_FPROBE_HEADER, because loongarch does not
* have enough number of fixed MSBs of the address of kernel objects for
* encoding the size of data in fprobe_header. Use 2-entries encoding instead.
*/
#undef ARCH_DEFINE_ENCODE_FPROBE_HEADER

#endif /* _ASM_LOONGARCH_FPROBE_H */