Merge tag 'trace-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull tracing updates from Steven Rostedt:
 "Main user visible change:

   - User events can now have "multi formats"

     The current user events have a single format. If another event is
     created with a different format, it will fail to be created. That
     is, once an event name is used, it cannot be used again with a
     different format. This can cause issues if a library is using an
     event and updates its format. An application using the older format
     will prevent an application using the new library from registering
     its event.

     A task that knows the event names could also DoS another
     application by creating events with different formats.

     The multi-format event is in a different name space from the single
     format. Both the event name and its format are the unique
     identifier. This will allow two different applications to use the
     same user event name but with different payloads.

   - Added support to have ftrace_dump_on_oops dump out instances and
     not just the main top level tracing buffer.

  Other changes:

   - Add eventfs_root_inode

     Only the root inode has a dentry that is static (never goes away)
     and stores it upon creation. There's no reason that the thousands
     of other eventfs inodes should have a pointer that never gets set
     in their descriptors. Create an eventfs_root_inode descriptor that
     has an eventfs_inode descriptor and a dentry pointer, and only the
     root inode will use this.

   - Added WARN_ON()s in eventfs

     There are some conditionals remaining in eventfs that should never
     be hit, but instead of removing them, add WARN_ON() around them to
     make sure that they are never hit.

   - Have saved_cmdlines allocation also include the map_cmdline_to_pid
     array

     The saved_cmdlines structure allocates a large amount of data to
     hold its mappings. Within it, it has three arrays. Two are already
     a part of it: map_pid_to_cmdline[] and saved_cmdlines[]. More memory
     can be saved by also including the map_cmdline_to_pid[] array.

   - Restructure __string() and __assign_str() macros used in
     TRACE_EVENT()

     Dynamic strings in TRACE_EVENT() are declared with:

         __string(name, source)

     And assigned with:

          __assign_str(name, source)

     In the tracepoint callback of the event, the __string() is used to
     get the size needed to allocate on the ring buffer and
     __assign_str() is used to copy the string into the ring buffer.
     There's a helper structure that is created in the TRACE_EVENT()
     macro logic that will hold the string length and its position in
     the ring buffer which is created by __string().

     There are several trace events that have a function to create the
     string to save. This function is executed twice: once for
     __string() and again for __assign_str(). There's no reason for
     this. The helper structure could also save the string it used in
     __string() and simply copy that in __assign_str() (it also
     already has its length).

     By using the structure to store the source string for the
     assignment, it means that the second argument to __assign_str() is
     no longer needed.

     It will be removed in the next merge window, but for now add a
     warning if the source string given to __string() is different than
     the source string given to __assign_str(), as the source to
     __assign_str() isn't even used and will be going away.

   - Added checks to make sure that the source of __string() is also the
     source of __assign_str() so that it can be safely removed in the
     next merge window.

     Included fixes that the above check found.

   - Other minor clean ups and fixes"
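
The __string()/__assign_str() rework described above can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the kernel's TRACE_EVENT() macro code: struct str_item, str_size() and str_assign() are hypothetical names invented for illustration. The sizing step evaluates the source once and records both the pointer and the length, so the copy step needs no source argument. The "(null)" fallback is assumed from the EVENT_NULL_STR commit in the shortlog below; the real macros may differ in detail.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the per-field helper structure. */
struct str_item {
	const char *src;	/* source string recorded at sizing time */
	size_t len;		/* bytes to reserve, including the trailing NUL */
};

/* Sizing step (the role __string() plays): evaluate the source once. */
static size_t str_size(struct str_item *item, const char *src)
{
	item->src = src ? src : "(null)";	/* assumed NULL fallback */
	item->len = strlen(item->src) + 1;
	return item->len;
}

/* Copy step (the role __assign_str() plays): no source argument needed. */
static void str_assign(char *dst, const struct str_item *item)
{
	memcpy(dst, item->src, item->len);
}
```

A caller would reserve str_size() bytes, then pass the same struct str_item to str_assign(). A function that builds the source string therefore runs once instead of twice.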

* tag 'trace-v6.9-2' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (34 commits)
  tracing: Add __string_src() helper to help compilers not to get confused
  tracing: Use strcmp() in __assign_str() WARN_ON() check
  tracepoints: Use WARN() and not WARN_ON() for warnings
  tracing: Use div64_u64() instead of do_div()
  tracing: Support to dump instance traces by ftrace_dump_on_oops
  tracing: Remove second parameter to __assign_rel_str()
  tracing: Add warning if string in __assign_str() does not match __string()
  tracing: Add __string_len() example
  tracing: Remove __assign_str_len()
  ftrace: Fix most kernel-doc warnings
  tracing: Decrement the snapshot if the snapshot trigger fails to register
  tracing: Fix snapshot counter going between two tracers that use it
  tracing: Use EVENT_NULL_STR macro instead of open coding "(null)"
  tracing: Use ? : shortcut in trace macros
  tracing: Do not calculate strlen() twice for __string() fields
  tracing: Rework __assign_str() and __string() to not duplicate getting the string
  cxl/trace: Properly initialize cxl_poison region name
  net: hns3: tracing: fix hclgevf trace event strings
  drm/i915: Add missing ; to __assign_str() macros in tracepoint code
  NFSD: Fix nfsd_clid_class use of __string_len() macro
  ...
Linus Torvalds committed Mar 18, 2024
2 parents 2cb5c86 + 7604256 commit ad584d7
Showing 31 changed files with 1,363 additions and 799 deletions.
26 changes: 21 additions & 5 deletions Documentation/admin-guide/kernel-parameters.txt
@@ -1572,12 +1572,28 @@
The above will cause the "foo" tracing instance to trigger
a snapshot at the end of boot up.

-ftrace_dump_on_oops[=orig_cpu]
+ftrace_dump_on_oops[=2(orig_cpu) | =<instance>][,<instance> |
+,<instance>=2(orig_cpu)]
[FTRACE] will dump the trace buffers on oops.
-If no parameter is passed, ftrace will dump
-buffers of all CPUs, but if you pass orig_cpu, it will
-dump only the buffer of the CPU that triggered the
-oops.
+If no parameter is passed, ftrace will dump global
+buffers of all CPUs, if you pass 2 or orig_cpu, it
+will dump only the buffer of the CPU that triggered
+the oops, or the specific instance will be dumped if
+its name is passed. Multiple instance dump is also
+supported, and instances are separated by commas. Each
+instance supports only dump on CPU that triggered the
+oops by passing 2 or orig_cpu to it.
+
+ftrace_dump_on_oops=foo=orig_cpu
+
+The above will dump only the buffer of "foo" instance
+on CPU that triggered the oops.
+
+ftrace_dump_on_oops,foo,bar=orig_cpu
+
+The above will dump global buffer on all CPUs, the
+buffer of "foo" instance on all CPUs and the buffer
+of "bar" instance on CPU that triggered the oops.

ftrace_filter=[function-list]
[FTRACE] Limit the functions traced by the function
30 changes: 24 additions & 6 deletions Documentation/admin-guide/sysctl/kernel.rst
@@ -296,12 +296,30 @@ kernel panic). This will output the contents of the ftrace buffers to
the console. This is very useful for capturing traces that lead to
crashes and outputting them to a serial console.

-= ===================================================
-0 Disabled (default).
-1 Dump buffers of all CPUs.
-2 Dump the buffer of the CPU that triggered the oops.
-= ===================================================
-
+======================= ===========================================
+0                       Disabled (default).
+1                       Dump buffers of all CPUs.
+2(orig_cpu)             Dump the buffer of the CPU that triggered the
+                        oops.
+<instance>              Dump the specific instance buffer on all CPUs.
+<instance>=2(orig_cpu)  Dump the specific instance buffer on the CPU
+                        that triggered the oops.
+======================= ===========================================
+
+Multiple instance dump is also supported, and instances are separated
+by commas. If global buffer also needs to be dumped, please specify
+the dump mode (1/2/orig_cpu) first for global buffer.
+
+So for example to dump "foo" and "bar" instance buffer on all CPUs,
+user can::
+
+  echo "foo,bar" > /proc/sys/kernel/ftrace_dump_on_oops
+
+To dump global buffer and "foo" instance buffer on all
+CPUs along with the "bar" instance buffer on CPU that triggered the
+oops, user can::
+
+  echo "1,foo,bar=2" > /proc/sys/kernel/ftrace_dump_on_oops

ftrace_enabled, stack_tracer_enabled
====================================
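
The value syntax documented in the kernel.rst hunk above (an optional leading mode for the global buffer, then comma-separated instances with an optional =2 or =orig_cpu suffix) can be sketched as a small userspace parser. This is an illustration only and not the kernel's own parsing code: enum dump_mode, struct dump_target and parse_dump_on_oops() are hypothetical names, and treating a leading "0" as "no global dump" is an assumption.

```c
#include <stddef.h>
#include <string.h>

enum dump_mode { DUMP_ALL_CPUS = 1, DUMP_ORIG_CPU = 2 };

struct dump_target {
	char name[64];		/* empty string stands for the global buffer */
	enum dump_mode mode;
};

/* "2" and "orig_cpu" are documented as equivalent. */
static int is_orig_cpu(const char *s)
{
	return strcmp(s, "2") == 0 || strcmp(s, "orig_cpu") == 0;
}

/*
 * Split a value such as "1,foo,bar=2" into dump targets.
 * Returns the number of targets stored in out[], or -1 on a malformed token.
 */
static int parse_dump_on_oops(const char *val, struct dump_target *out, int max)
{
	int n = 0;
	int first = 1;

	while (*val && n < max) {
		size_t len = strcspn(val, ",");	/* length of the current token */
		struct dump_target *t = &out[n];
		char tok[64];
		char *eq;

		if (len == 0 || len >= sizeof(tok))
			return -1;
		memcpy(tok, val, len);
		tok[len] = '\0';
		val += len;
		if (*val == ',')
			val++;

		if (first && strcmp(tok, "0") == 0) {
			/* Assumption: a leading 0 means no global dump. */
		} else if (first && (strcmp(tok, "1") == 0 || is_orig_cpu(tok))) {
			/* A leading mode token applies to the global buffer. */
			t->name[0] = '\0';
			t->mode = is_orig_cpu(tok) ? DUMP_ORIG_CPU : DUMP_ALL_CPUS;
			n++;
		} else {
			/* <instance>, <instance>=2 or <instance>=orig_cpu */
			eq = strchr(tok, '=');
			if (eq) {
				*eq = '\0';
				if (!is_orig_cpu(eq + 1))
					return -1;
			}
			if (tok[0] == '\0')
				return -1;
			strcpy(t->name, tok);	/* fits: tok is shorter than name[] */
			t->mode = eq ? DUMP_ORIG_CPU : DUMP_ALL_CPUS;
			n++;
		}
		first = 0;
	}
	return n;
}
```

With the documented example "1,foo,bar=2", this sketch yields three targets: the global buffer on all CPUs, "foo" on all CPUs, and "bar" on the CPU that triggered the oops.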
27 changes: 26 additions & 1 deletion Documentation/trace/user_events.rst
@@ -92,6 +92,24 @@ The following flags are currently supported.
process closes or unregisters the event. Requires CAP_PERFMON otherwise
-EPERM is returned.

++ USER_EVENT_REG_MULTI_FORMAT: The event can contain multiple formats. This
+allows programs to prevent themselves from being blocked when their event
+format changes and they wish to use the same name. When this flag is used the
+tracepoint name will be in the new format of "name.unique_id" vs the older
+format of "name". A tracepoint will be created for each unique pair of name
+and format. This means if several processes use the same name and format,
+they will use the same tracepoint. If yet another process uses the same name,
+but a different format than the other processes, it will use a different
+tracepoint with a new unique id. Recording programs need to scan tracefs for
+the various different formats of the event name they are interested in
+recording. The system name of the tracepoint will also use "user_events_multi"
+instead of "user_events". This prevents single-format event names conflicting
+with any multi-format event names within tracefs. The unique_id is output as
+a hex string. Recording programs should ensure the tracepoint name starts with
+the event name they registered and has a suffix that starts with . and only
+has hex characters. For example to find all versions of the event "test" you
+can use the regex "^test\.[0-9a-fA-F]+$".
+
Upon successful registration the following is set.

+ write_index: The index to use for this file descriptor that represents this
@@ -106,6 +124,9 @@ or perf record -e user_events:[name] when attaching/recording.
**NOTE:** The event subsystem name by default is "user_events". Callers should
not assume it will always be "user_events". Operators reserve the right in the
future to change the subsystem name per-process to accommodate event isolation.
+In addition if the USER_EVENT_REG_MULTI_FORMAT flag is used the tracepoint name
+will have a unique id appended to it and the system name will be
+"user_events_multi" as described above.

Command Format
^^^^^^^^^^^^^^
@@ -156,7 +177,11 @@ to request deletes than the one used for registration due to this.
to the event. If programs do not want auto-delete, they must use the
USER_EVENT_REG_PERSIST flag when registering the event. Once that flag is used
the event exists until DIAG_IOCSDEL is invoked. Both register and delete of an
-event that persists requires CAP_PERFMON, otherwise -EPERM is returned.
+event that persists requires CAP_PERFMON, otherwise -EPERM is returned. When
+there are multiple formats of the same event name, all events with the same
+name will be attempted to be deleted. If only a specific version is wanted to
+be deleted then the /sys/kernel/tracing/dynamic_events file should be used for
+that specific format of the event.

Unregistering
-------------
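
The USER_EVENT_REG_MULTI_FORMAT text above asks recording programs to accept only tracepoint names of the form "name.<hex id>". A minimal sketch of that check with POSIX regcomp()/regexec() follows. The helper is_multi_format_name() is a hypothetical name, the event name is assumed to contain no regex metacharacters, and scanning tracefs for candidate names is left out.

```c
#include <regex.h>
#include <stdio.h>

/*
 * Return 1 if tp_name is "<event>.<hex id>", 0 if it is not, -1 on error.
 * Assumes event contains no regex metacharacters.
 */
static int is_multi_format_name(const char *event, const char *tp_name)
{
	char pattern[256];
	regex_t re;
	int ret;

	/* Same shape as the documented regex "^test\.[0-9a-fA-F]+$". */
	ret = snprintf(pattern, sizeof(pattern), "^%s\\.[0-9a-fA-F]+$", event);
	if (ret < 0 || (size_t)ret >= sizeof(pattern))
		return -1;
	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
		return -1;
	ret = regexec(&re, tp_name, 0, NULL, 0);
	regfree(&re);
	return ret == 0;
}
```

A recorder could apply this to each entry it finds under the "user_events_multi" system in tracefs and keep only the names that match.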
14 changes: 7 additions & 7 deletions drivers/cxl/core/trace.h
@@ -646,18 +646,18 @@ u64 cxl_trace_hpa(struct cxl_region *cxlr, struct cxl_memdev *memdev, u64 dpa);

TRACE_EVENT(cxl_poison,

-TP_PROTO(struct cxl_memdev *cxlmd, struct cxl_region *region,
+TP_PROTO(struct cxl_memdev *cxlmd, struct cxl_region *cxlr,
const struct cxl_poison_record *record, u8 flags,
__le64 overflow_ts, enum cxl_poison_trace_type trace_type),

-TP_ARGS(cxlmd, region, record, flags, overflow_ts, trace_type),
+TP_ARGS(cxlmd, cxlr, record, flags, overflow_ts, trace_type),

TP_STRUCT__entry(
__string(memdev, dev_name(&cxlmd->dev))
__string(host, dev_name(cxlmd->dev.parent))
__field(u64, serial)
__field(u8, trace_type)
-__string(region, region)
+__string(region, cxlr ? dev_name(&cxlr->dev) : "")
__field(u64, overflow_ts)
__field(u64, hpa)
__field(u64, dpa)
@@ -677,10 +677,10 @@ TRACE_EVENT(cxl_poison,
__entry->source = cxl_poison_record_source(record);
__entry->trace_type = trace_type;
__entry->flags = flags;
-if (region) {
-__assign_str(region, dev_name(&region->dev));
-memcpy(__entry->uuid, &region->params.uuid, 16);
-__entry->hpa = cxl_trace_hpa(region, cxlmd,
+if (cxlr) {
+__assign_str(region, dev_name(&cxlr->dev));
+memcpy(__entry->uuid, &cxlr->params.uuid, 16);
+__entry->hpa = cxl_trace_hpa(cxlr, cxlmd,
__entry->dpa);
} else {
__assign_str(region, "");
6 changes: 3 additions & 3 deletions drivers/gpu/drm/i915/display/intel_display_trace.h
@@ -411,7 +411,7 @@ TRACE_EVENT(intel_fbc_activate,
struct intel_crtc *crtc = intel_crtc_for_pipe(to_i915(plane->base.dev),
plane->pipe);
__assign_str(dev, __dev_name_kms(plane));
-__assign_str(name, plane->base.name)
+__assign_str(name, plane->base.name);
__entry->pipe = crtc->pipe;
__entry->frame = intel_crtc_get_vblank_counter(crtc);
__entry->scanline = intel_get_crtc_scanline(crtc);
@@ -438,7 +438,7 @@ TRACE_EVENT(intel_fbc_deactivate,
struct intel_crtc *crtc = intel_crtc_for_pipe(to_i915(plane->base.dev),
plane->pipe);
__assign_str(dev, __dev_name_kms(plane));
-__assign_str(name, plane->base.name)
+__assign_str(name, plane->base.name);
__entry->pipe = crtc->pipe;
__entry->frame = intel_crtc_get_vblank_counter(crtc);
__entry->scanline = intel_get_crtc_scanline(crtc);
@@ -465,7 +465,7 @@ TRACE_EVENT(intel_fbc_nuke,
struct intel_crtc *crtc = intel_crtc_for_pipe(to_i915(plane->base.dev),
plane->pipe);
__assign_str(dev, __dev_name_kms(plane));
-__assign_str(name, plane->base.name)
+__assign_str(name, plane->base.name);
__entry->pipe = crtc->pipe;
__entry->frame = intel_crtc_get_vblank_counter(crtc);
__entry->scanline = intel_get_crtc_scanline(crtc);
8 changes: 4 additions & 4 deletions drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_trace.h
@@ -24,7 +24,7 @@ TRACE_EVENT(hclge_pf_mbx_get,
__field(u8, code)
__field(u8, subcode)
__string(pciname, pci_name(hdev->pdev))
-__string(devname, &hdev->vport[0].nic.kinfo.netdev->name)
+__string(devname, hdev->vport[0].nic.kinfo.netdev->name)
__array(u32, mbx_data, PF_GET_MBX_LEN)
),

@@ -33,7 +33,7 @@ TRACE_EVENT(hclge_pf_mbx_get,
__entry->code = req->msg.code;
__entry->subcode = req->msg.subcode;
__assign_str(pciname, pci_name(hdev->pdev));
-__assign_str(devname, &hdev->vport[0].nic.kinfo.netdev->name);
+__assign_str(devname, hdev->vport[0].nic.kinfo.netdev->name);
memcpy(__entry->mbx_data, req,
sizeof(struct hclge_mbx_vf_to_pf_cmd));
),
@@ -56,15 +56,15 @@ TRACE_EVENT(hclge_pf_mbx_send,
__field(u8, vfid)
__field(u16, code)
__string(pciname, pci_name(hdev->pdev))
-__string(devname, &hdev->vport[0].nic.kinfo.netdev->name)
+__string(devname, hdev->vport[0].nic.kinfo.netdev->name)
__array(u32, mbx_data, PF_SEND_MBX_LEN)
),

TP_fast_assign(
__entry->vfid = req->dest_vfid;
__entry->code = le16_to_cpu(req->msg.code);
__assign_str(pciname, pci_name(hdev->pdev));
-__assign_str(devname, &hdev->vport[0].nic.kinfo.netdev->name);
+__assign_str(devname, hdev->vport[0].nic.kinfo.netdev->name);
memcpy(__entry->mbx_data, req,
sizeof(struct hclge_mbx_pf_to_vf_cmd));
),
8 changes: 4 additions & 4 deletions drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h
@@ -23,15 +23,15 @@ TRACE_EVENT(hclge_vf_mbx_get,
__field(u8, vfid)
__field(u16, code)
__string(pciname, pci_name(hdev->pdev))
-__string(devname, &hdev->nic.kinfo.netdev->name)
+__string(devname, hdev->nic.kinfo.netdev->name)
__array(u32, mbx_data, VF_GET_MBX_LEN)
),

TP_fast_assign(
__entry->vfid = req->dest_vfid;
__entry->code = le16_to_cpu(req->msg.code);
__assign_str(pciname, pci_name(hdev->pdev));
-__assign_str(devname, &hdev->nic.kinfo.netdev->name);
+__assign_str(devname, hdev->nic.kinfo.netdev->name);
memcpy(__entry->mbx_data, req,
sizeof(struct hclge_mbx_pf_to_vf_cmd));
),
@@ -55,7 +55,7 @@ TRACE_EVENT(hclge_vf_mbx_send,
__field(u8, code)
__field(u8, subcode)
__string(pciname, pci_name(hdev->pdev))
-__string(devname, &hdev->nic.kinfo.netdev->name)
+__string(devname, hdev->nic.kinfo.netdev->name)
__array(u32, mbx_data, VF_SEND_MBX_LEN)
),

@@ -64,7 +64,7 @@ TRACE_EVENT(hclge_vf_mbx_send,
__entry->code = req->msg.code;
__entry->subcode = req->msg.subcode;
__assign_str(pciname, pci_name(hdev->pdev));
-__assign_str(devname, &hdev->nic.kinfo.netdev->name);
+__assign_str(devname, hdev->nic.kinfo.netdev->name);
memcpy(__entry->mbx_data, req,
sizeof(struct hclge_mbx_vf_to_pf_cmd));
),
10 changes: 5 additions & 5 deletions fs/nfsd/trace.h
@@ -104,7 +104,7 @@ TRACE_EVENT(nfsd_compound,
TP_fast_assign(
__entry->xid = be32_to_cpu(rqst->rq_xid);
__entry->opcnt = opcnt;
-__assign_str_len(tag, tag, taglen);
+__assign_str(tag, tag);
),
TP_printk("xid=0x%08x opcnt=%u tag=%s",
__entry->xid, __entry->opcnt, __get_str(tag)
@@ -485,7 +485,7 @@ TRACE_EVENT(nfsd_dirent,
TP_fast_assign(
__entry->fh_hash = fhp ? knfsd_fh_hash(&fhp->fh_handle) : 0;
__entry->ino = ino;
-__assign_str_len(name, name, namlen)
+__assign_str(name, name);
),
TP_printk("fh_hash=0x%08x ino=%llu name=%s",
__entry->fh_hash, __entry->ino, __get_str(name)
@@ -896,7 +896,7 @@ DECLARE_EVENT_CLASS(nfsd_clid_class,
__array(unsigned char, addr, sizeof(struct sockaddr_in6))
__field(unsigned long, flavor)
__array(unsigned char, verifier, NFS4_VERIFIER_SIZE)
-__string_len(name, name, clp->cl_name.len)
+__string_len(name, clp->cl_name.data, clp->cl_name.len)
),
TP_fast_assign(
__entry->cl_boot = clp->cl_clientid.cl_boot;
@@ -906,7 +906,7 @@ DECLARE_EVENT_CLASS(nfsd_clid_class,
__entry->flavor = clp->cl_cred.cr_flavor;
memcpy(__entry->verifier, (void *)&clp->cl_verifier,
NFS4_VERIFIER_SIZE);
-__assign_str_len(name, clp->cl_name.data, clp->cl_name.len);
+__assign_str(name, clp->cl_name.data);
),
TP_printk("addr=%pISpc name='%s' verifier=0x%s flavor=%s client=%08x:%08x",
__entry->addr, __get_str(name),
@@ -1976,7 +1976,7 @@ TRACE_EVENT(nfsd_ctl_time,
TP_fast_assign(
__entry->netns_ino = net->ns.inum;
__entry->time = time;
-__assign_str_len(name, name, namelen);
+__assign_str(name, name);
),
TP_printk("file=%s time=%d\n",
__get_str(name), __entry->time