From b168098912926236bbeebaf7795eb7aab76d2b45 Mon Sep 17 00:00:00 2001 From: Peter Zijlstra Date: Mon, 3 Apr 2023 11:08:58 +0200 Subject: [PATCH 1/2] perf: Optimize perf_pmu_migrate_context() Thomas reported that offlining CPUs spends a lot of time in synchronize_rcu() as called from perf_pmu_migrate_context() even though he's not actually using uncore events. Turns out, the thing is unconditionally waiting for RCU, even if there's no actual events to migrate. Fixes: 0cda4c023132 ("perf: Introduce perf_pmu_migrate_context()") Reported-by: Thomas Gleixner Signed-off-by: Peter Zijlstra (Intel) Tested-by: Thomas Gleixner Reviewed-by: Thomas Gleixner Reviewed-by: Paul E. McKenney Link: https://lkml.kernel.org/r/20230403090858.GT4253@hirez.programming.kicks-ass.net --- kernel/events/core.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index fb3e436bcd4ac..115320faf1db4 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -12893,12 +12893,14 @@ void perf_pmu_migrate_context(struct pmu *pmu, int src_cpu, int dst_cpu) __perf_pmu_remove(src_ctx, src_cpu, pmu, &src_ctx->pinned_groups, &events); __perf_pmu_remove(src_ctx, src_cpu, pmu, &src_ctx->flexible_groups, &events); - /* - * Wait for the events to quiesce before re-instating them. - */ - synchronize_rcu(); + if (!list_empty(&events)) { + /* + * Wait for the events to quiesce before re-instating them. + */ + synchronize_rcu(); - __perf_pmu_install(dst_ctx, dst_cpu, pmu, &events); + __perf_pmu_install(dst_ctx, dst_cpu, pmu, &events); + } mutex_unlock(&dst_ctx->mutex); mutex_unlock(&src_ctx->mutex); From 24d3ae2f37d8bc3c14b31d353c5d27baf582b6a6 Mon Sep 17 00:00:00 2001 From: Kan Liang Date: Wed, 22 Mar 2023 13:24:49 -0700 Subject: [PATCH 2/2] perf/core: Fix the same task check in perf_event_set_output MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The same task check in perf_event_set_output has some potential issues for some usages. For the current perf code, there is a problem if using of perf_event_open() to have multiple samples getting into the same mmap’d memory when they are both attached to the same process. https://lore.kernel.org/all/92645262-D319-4068-9C44-2409EF44888E@gmail.com/ Because the event->ctx is not ready when the perf_event_set_output() is invoked in the perf_event_open(). Besides the above issue, before the commit bd2756811766 ("perf: Rewrite core context handling"), perf record can errors out when sampling with a hardware event and a software event as below. $ perf record -e cycles,dummy --per-thread ls failed to mmap with 22 (Invalid argument) That's because that prior to the commit a hardware event and a software event are from different task context. The problem should be a long time issue since commit c3f00c70276d ("perk: Separate find_get_context() from event initialization"). The task struct is stored in the event->hw.target for each per-thread event. It is a more reliable way to determine whether two events are attached to the same task. The event->hw.target was also introduced several years ago by the commit 50f16a8bf9d7 ("perf: Remove type specific target pointers"). It can not only be used to fix the issue with the current code, but also back port to fix the issues with an older kernel. Note: The event->hw.target was introduced later than commit c3f00c70276d. The patch may cannot be applied between the commit c3f00c70276d and commit 50f16a8bf9d7. Anybody that wants to back-port this at that period may have to find other solutions. Fixes: c3f00c70276d ("perf: Separate find_get_context() from event initialization") Signed-off-by: Kan Liang Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Zhengjun Xing Link: https://lkml.kernel.org/r/20230322202449.512091-1-kan.liang@linux.intel.com --- kernel/events/core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/events/core.c b/kernel/events/core.c index 115320faf1db4..435815d3be3f8 100644 --- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -12173,7 +12173,7 @@ perf_event_set_output(struct perf_event *event, struct perf_event *output_event) /* * If its not a per-cpu rb, it must be the same task. */ - if (output_event->cpu == -1 && output_event->ctx != event->ctx) + if (output_event->cpu == -1 && output_event->hw.target != event->hw.target) goto out; /*