Merge branch 'page_pool-stats'
Joe Damato says:

====================
page_pool: Add stats counters

Greetings:

Welcome to v9.

This revision adds a commit which updates the page_pool documentation to
describe the stats API, structures, and fields.

Additionally, this revision contains a minor cosmetic change suggested by
Saeed in page_pool_recycle_in_ring in commit 2: "page_pool: Add recycle
stats", which removes an unnecessary #ifdef.

There are no functional changes in this revision.

Benchmark output from the v7 cover [1] is pasted below, as it is still
relevant since no functional changes have been made in this revision:

Benchmarks have been re-run. As always, results between runs are highly
variable; you'll find results showing that stats disabled are both faster
and slower than stats enabled in back to back benchmark runs.

Raw benchmark output with stats off [2] and stats on [3] is available for
examination.

Test system:
	- 2x Intel(R) Xeon(R) Gold 6140 CPU @ 2.30GHz
	- 2 NUMA zones, with 18 cores per zone and 2 threads per core

bench_page_pool_simple results, loops=200000000
test name			stats enabled		stats disabled
				cycles	nanosec		cycles	nanosec

for_loop			0	0.335		0	0.336
atomic_inc 			14	6.106		13	6.022
lock				30	13.365		32	13.968

no-softirq-page_pool01		75	32.884		74	32.308
no-softirq-page_pool02		79	34.696		74	32.302
no-softirq-page_pool03		110	48.005		105	46.073

tasklet_page_pool01_fast_path	14	6.156		14	6.211
tasklet_page_pool02_ptr_ring	41	18.028		39	17.391
tasklet_page_pool03_slow	107	46.646		105	46.123

bench_page_pool_cross_cpu results, loops=20000000 returning_cpus=4:
test name			stats enabled		stats disabled
				cycles	nanosec		cycles	nanosec

page_pool_cross_cpu CPU(0)	3973	1731.596	4015	1750.015
page_pool_cross_cpu CPU(1)	3976	1733.217	4022	1752.864
page_pool_cross_cpu CPU(2)	3973	1731.615	4016	1750.433
page_pool_cross_cpu CPU(3)	3976	1733.218	4021	1752.806
page_pool_cross_cpu CPU(4)	994	433.305		1005	438.217

page_pool_cross_cpu average	3378	-		3415	-

bench_page_pool_cross_cpu results, loops=20000000 returning_cpus=8:
test name			stats enabled		stats disabled
				cycles	nanosec		cycles	nanosec

page_pool_cross_cpu CPU(0)	6969	3037.488	6909	3011.463
page_pool_cross_cpu CPU(1)	6974	3039.469	6913	3012.961
page_pool_cross_cpu CPU(2)	6969	3037.575	6910	3011.585
page_pool_cross_cpu CPU(3)	6974	3039.415	6913	3012.961
page_pool_cross_cpu CPU(4)	6969	3037.288	6909	3011.368
page_pool_cross_cpu CPU(5)	6972	3038.732	6913	3012.920
page_pool_cross_cpu CPU(6)	6969	3037.350	6909	3011.386
page_pool_cross_cpu CPU(7)	6973	3039.356	6913	3012.921
page_pool_cross_cpu CPU(8)	871	379.934		864	376.620

page_pool_cross_cpu average	6293	-		6239	-

Thanks.

[1]: https://lore.kernel.org/all/1645810914-35485-1-git-send-email-jdamato@fastly.com/
[2]: https://gist.githubusercontent.com/jdamato-fsly/d7c34b9fa7be1ce132a266b0f2b92aea/raw/327dcd71d11ece10238fbf19e0472afbcbf22fd4/v7_stats_disabled
[3]: https://gist.githubusercontent.com/jdamato-fsly/d7c34b9fa7be1ce132a266b0f2b92aea/raw/327dcd71d11ece10238fbf19e0472afbcbf22fd4/v7_stats_enabled

v8 -> v9:
	- Add documentation about the page_pool_get_stats API, stats
	  structures, and fields to Documentation/networking/page_pool.rst.
	- Remove unnecessary #ifdef in page_pool_recycle_in_ring.

v7 -> v8:
	- Rename mlx5 ethtool stats so that users have a better idea of
	  their meaning.

v6 -> v7:
	- stats split out into two structs: a single per-page-pool struct
	  for allocation path stats and a per-cpu pointer for recycle
	  path stats.
	- page_pool_get_stats updated to use a wrapper struct to gather
	  stats for allocation and recycle stats with a single argument.
	- placement of structs adjusted
	- mlx5 driver modified to use page_pool_get_stats API

v5 -> v6:
	- Per cpu page_pool_stats struct pointer is now marked as
	  ____cacheline_aligned_in_smp. Placement of the field in the
	  struct is unchanged; it is the last field.

v4 -> v5:
	- Fixed the description of the kernel option in Kconfig.
	- Squashed commits 1-10 from v4 into a single commit for easier
	  review.
	- Changed the comment style of the comment for
	  the this_cpu_inc_alloc_stat macro.
	- Changed the return type of page_pool_get_stats from struct
	  page_pool_stat * to bool.

v3 -> v4:
	- Restructured stats to be per-cpu per-pool.
	- Global stats and proc file were removed.
	- Exposed an API (page_pool_get_stats) for batching the pool stats.

v2 -> v3:
	- patch 8/10 ("Add stat tracking cache refill") fixed placement of
	  counter increment.
	- patch 10/10 ("net-procfs: Show page pool stats in proc") updated:
		- fix unused label warning from kernel test robot,
		- fixed page_pool_seq_show to only display the refill stat
		  once,
		- added a remove_proc_entry for page_pool_stat to
		  dev_proc_net_exit.

v1 -> v2:
	- A new kernel config option has been added, which defaults to N,
	   preventing this code from being compiled in by default
	- The stats structure has been converted to a per-cpu structure
	- The stats are now exported via proc (/proc/net/page_pool_stat)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
David S. Miller committed Mar 3, 2022
2 parents 42f0c19 + cc10e84 commit a8ff736
Showing 6 changed files with 294 additions and 7 deletions.
56 changes: 56 additions & 0 deletions Documentation/networking/page_pool.rst
@@ -105,6 +105,47 @@ a page will cause no race conditions is enough.
Please note the caller must not use data area after running
page_pool_put_page_bulk(), as this function overwrites it.

* page_pool_get_stats(): Retrieve statistics about the page_pool. This API
is only available if the kernel has been configured with
``CONFIG_PAGE_POOL_STATS=y``. A pointer to a caller allocated ``struct
page_pool_stats`` structure is passed to this API which is filled in. The
caller can then report those stats to the user (perhaps via ethtool,
debugfs, etc.). See below for an example usage of this API.

Stats API and structures
------------------------
If the kernel is configured with ``CONFIG_PAGE_POOL_STATS=y``, the API
``page_pool_get_stats()`` and structures described below are available. It
takes a pointer to a ``struct page_pool`` and a pointer to a ``struct
page_pool_stats`` allocated by the caller.

The API will fill in the provided ``struct page_pool_stats`` with
statistics about the page_pool.

The stats structure has the following fields::

    struct page_pool_stats {
        struct page_pool_alloc_stats alloc_stats;
        struct page_pool_recycle_stats recycle_stats;
    };


The ``struct page_pool_alloc_stats`` has the following fields:
* ``fast``: successful fast path allocations
* ``slow``: slow path order-0 allocations
* ``slow_high_order``: slow path high order allocations
* ``empty``: ptr ring is empty, so a slow path allocation was forced.
* ``refill``: an allocation which triggered a refill of the cache
* ``waive``: pages obtained from the ptr ring that cannot be added to
the cache due to a NUMA mismatch.

The ``struct page_pool_recycle_stats`` has the following fields:
* ``cached``: recycling placed page in the page pool cache
* ``cache_full``: page pool cache was full
* ``ring``: page placed into the ptr ring
* ``ring_full``: page released from page pool because the ptr ring was full
* ``released_refcnt``: page released (and not recycled) because refcnt > 1

Coding examples
===============

@@ -157,6 +198,21 @@ NAPI poller
}
}
Stats
-----

.. code-block:: c

    #ifdef CONFIG_PAGE_POOL_STATS
    /* retrieve stats */
    struct page_pool_stats stats = { 0 };
    if (page_pool_get_stats(page_pool, &stats)) {
        /* perhaps the driver reports statistics with ethtool */
        ethtool_print_allocation_stats(&stats.alloc_stats);
        ethtool_print_recycle_stats(&stats.recycle_stats);
    }
    #endif

Driver unload
-------------

75 changes: 75 additions & 0 deletions drivers/net/ethernet/mellanox/mlx5/core/en_stats.c
@@ -37,6 +37,10 @@
#include "en/ptp.h"
#include "en/port.h"

#ifdef CONFIG_PAGE_POOL_STATS
#include <net/page_pool.h>
#endif

static unsigned int stats_grps_num(struct mlx5e_priv *priv)
{
return !priv->profile->stats_grps_num ? 0 :
@@ -183,6 +187,19 @@ static const struct counter_desc sw_stats_desc[] = {
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_congst_umr) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_arfs_err) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_recover) },
#ifdef CONFIG_PAGE_POOL_STATS
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_fast) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_slow) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_slow_high_order) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_empty) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_refill) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_alloc_waive) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_recycle_cached) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_recycle_cache_full) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_recycle_ring) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_recycle_ring_full) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_pp_recycle_released_ref) },
#endif
#ifdef CONFIG_MLX5_EN_TLS
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_tls_decrypted_packets) },
{ MLX5E_DECLARE_STAT(struct mlx5e_sw_stats, rx_tls_decrypted_bytes) },
@@ -349,6 +366,19 @@ static void mlx5e_stats_grp_sw_update_stats_rq_stats(struct mlx5e_sw_stats *s,
s->rx_congst_umr += rq_stats->congst_umr;
s->rx_arfs_err += rq_stats->arfs_err;
s->rx_recover += rq_stats->recover;
#ifdef CONFIG_PAGE_POOL_STATS
s->rx_pp_alloc_fast += rq_stats->pp_alloc_fast;
s->rx_pp_alloc_slow += rq_stats->pp_alloc_slow;
s->rx_pp_alloc_empty += rq_stats->pp_alloc_empty;
s->rx_pp_alloc_refill += rq_stats->pp_alloc_refill;
s->rx_pp_alloc_waive += rq_stats->pp_alloc_waive;
s->rx_pp_alloc_slow_high_order += rq_stats->pp_alloc_slow_high_order;
s->rx_pp_recycle_cached += rq_stats->pp_recycle_cached;
s->rx_pp_recycle_cache_full += rq_stats->pp_recycle_cache_full;
s->rx_pp_recycle_ring += rq_stats->pp_recycle_ring;
s->rx_pp_recycle_ring_full += rq_stats->pp_recycle_ring_full;
s->rx_pp_recycle_released_ref += rq_stats->pp_recycle_released_ref;
#endif
#ifdef CONFIG_MLX5_EN_TLS
s->rx_tls_decrypted_packets += rq_stats->tls_decrypted_packets;
s->rx_tls_decrypted_bytes += rq_stats->tls_decrypted_bytes;
@@ -455,6 +485,35 @@ static void mlx5e_stats_grp_sw_update_stats_qos(struct mlx5e_priv *priv,
}
}

#ifdef CONFIG_PAGE_POOL_STATS
static void mlx5e_stats_update_stats_rq_page_pool(struct mlx5e_channel *c)
{
	struct mlx5e_rq_stats *rq_stats = c->rq.stats;
	struct page_pool *pool = c->rq.page_pool;
	struct page_pool_stats stats = { 0 };

	if (!page_pool_get_stats(pool, &stats))
		return;

	rq_stats->pp_alloc_fast = stats.alloc_stats.fast;
	rq_stats->pp_alloc_slow = stats.alloc_stats.slow;
	rq_stats->pp_alloc_slow_high_order = stats.alloc_stats.slow_high_order;
	rq_stats->pp_alloc_empty = stats.alloc_stats.empty;
	rq_stats->pp_alloc_waive = stats.alloc_stats.waive;
	rq_stats->pp_alloc_refill = stats.alloc_stats.refill;

	rq_stats->pp_recycle_cached = stats.recycle_stats.cached;
	rq_stats->pp_recycle_cache_full = stats.recycle_stats.cache_full;
	rq_stats->pp_recycle_ring = stats.recycle_stats.ring;
	rq_stats->pp_recycle_ring_full = stats.recycle_stats.ring_full;
	rq_stats->pp_recycle_released_ref = stats.recycle_stats.released_refcnt;
}
#else
static void mlx5e_stats_update_stats_rq_page_pool(struct mlx5e_channel *c)
{
}
#endif

static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw)
{
struct mlx5e_sw_stats *s = &priv->stats.sw;
@@ -465,8 +524,11 @@ static MLX5E_DECLARE_STATS_GRP_OP_UPDATE_STATS(sw)
for (i = 0; i < priv->stats_nch; i++) {
struct mlx5e_channel_stats *channel_stats =
priv->channel_stats[i];

int j;

mlx5e_stats_update_stats_rq_page_pool(priv->channels.c[i]);

mlx5e_stats_grp_sw_update_stats_rq_stats(s, &channel_stats->rq);
mlx5e_stats_grp_sw_update_stats_xdpsq(s, &channel_stats->rq_xdpsq);
mlx5e_stats_grp_sw_update_stats_ch_stats(s, &channel_stats->ch);
@@ -1887,6 +1949,19 @@ static const struct counter_desc rq_stats_desc[] = {
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, congst_umr) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, arfs_err) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, recover) },
#ifdef CONFIG_PAGE_POOL_STATS
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_fast) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_slow) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_slow_high_order) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_empty) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_refill) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_alloc_waive) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_recycle_cached) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_recycle_cache_full) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_recycle_ring) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_recycle_ring_full) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, pp_recycle_released_ref) },
#endif
#ifdef CONFIG_MLX5_EN_TLS
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, tls_decrypted_packets) },
{ MLX5E_DECLARE_RX_STAT(struct mlx5e_rq_stats, tls_decrypted_bytes) },
27 changes: 26 additions & 1 deletion drivers/net/ethernet/mellanox/mlx5/core/en_stats.h
@@ -205,7 +205,19 @@ struct mlx5e_sw_stats {
u64 ch_aff_change;
u64 ch_force_irq;
u64 ch_eq_rearm;

#ifdef CONFIG_PAGE_POOL_STATS
u64 rx_pp_alloc_fast;
u64 rx_pp_alloc_slow;
u64 rx_pp_alloc_slow_high_order;
u64 rx_pp_alloc_empty;
u64 rx_pp_alloc_refill;
u64 rx_pp_alloc_waive;
u64 rx_pp_recycle_cached;
u64 rx_pp_recycle_cache_full;
u64 rx_pp_recycle_ring;
u64 rx_pp_recycle_ring_full;
u64 rx_pp_recycle_released_ref;
#endif
#ifdef CONFIG_MLX5_EN_TLS
u64 tx_tls_encrypted_packets;
u64 tx_tls_encrypted_bytes;
@@ -352,6 +364,19 @@ struct mlx5e_rq_stats {
u64 congst_umr;
u64 arfs_err;
u64 recover;
#ifdef CONFIG_PAGE_POOL_STATS
u64 pp_alloc_fast;
u64 pp_alloc_slow;
u64 pp_alloc_slow_high_order;
u64 pp_alloc_empty;
u64 pp_alloc_refill;
u64 pp_alloc_waive;
u64 pp_recycle_cached;
u64 pp_recycle_cache_full;
u64 pp_recycle_ring;
u64 pp_recycle_ring_full;
u64 pp_recycle_released_ref;
#endif
#ifdef CONFIG_MLX5_EN_TLS
u64 tls_decrypted_packets;
u64 tls_decrypted_bytes;
51 changes: 51 additions & 0 deletions include/net/page_pool.h
@@ -84,6 +84,48 @@ struct page_pool_params {
void *init_arg;
};

#ifdef CONFIG_PAGE_POOL_STATS
struct page_pool_alloc_stats {
	u64 fast; /* fast path allocations */
	u64 slow; /* slow-path order 0 allocations */
	u64 slow_high_order; /* slow-path high order allocations */
	u64 empty; /* failed refills due to empty ptr ring, forcing
		    * slow path allocation
		    */
	u64 refill; /* allocations via successful refill */
	u64 waive;  /* failed refills due to numa zone mismatch */
};

struct page_pool_recycle_stats {
	u64 cached;	/* recycling placed page in the cache. */
	u64 cache_full; /* cache was full */
	u64 ring;	/* recycling placed page back into ptr ring */
	u64 ring_full;	/* page was released from page-pool because
			 * PTR ring was full.
			 */
	u64 released_refcnt; /* page released because of elevated
			      * refcnt
			      */
};

/* This struct wraps the above stats structs so users of the
 * page_pool_get_stats API can pass a single argument when requesting the
 * stats for the page pool.
 */
struct page_pool_stats {
	struct page_pool_alloc_stats alloc_stats;
	struct page_pool_recycle_stats recycle_stats;
};

/*
 * Drivers that wish to harvest page pool stats and report them to users
 * (perhaps via ethtool, debugfs, or another mechanism) can allocate a
 * struct page_pool_stats and call page_pool_get_stats to get stats for
 * the specified pool.
 */
bool page_pool_get_stats(struct page_pool *pool,
			 struct page_pool_stats *stats);
#endif

struct page_pool {
struct page_pool_params p;

@@ -96,6 +138,11 @@ struct page_pool {
unsigned int frag_offset;
struct page *frag_page;
long frag_users;

#ifdef CONFIG_PAGE_POOL_STATS
/* these stats are incremented while in softirq context */
struct page_pool_alloc_stats alloc_stats;
#endif
u32 xdp_mem_id;

/*
@@ -126,6 +173,10 @@ struct page_pool {
*/
struct ptr_ring ring;

#ifdef CONFIG_PAGE_POOL_STATS
/* recycle stats are per-cpu to avoid locking */
struct page_pool_recycle_stats __percpu *recycle_stats;
#endif
atomic_t pages_state_release_cnt;

/* A page_pool is strictly tied to a single RX-queue being
13 changes: 13 additions & 0 deletions net/Kconfig
@@ -434,6 +434,19 @@ config NET_DEVLINK
config PAGE_POOL
bool

config PAGE_POOL_STATS
	default n
	bool "Page pool stats"
	depends on PAGE_POOL
	help
	  Enable page pool statistics to track page allocation and recycling
	  in page pools. This option incurs additional CPU cost in allocation
	  and recycle paths and additional memory cost to store the statistics.
	  These statistics are only available if this option is enabled and if
	  the driver using the page pool supports exporting this data.

	  If unsure, say N.

config FAILOVER
tristate "Generic failover module"
help