crypto: qat - add support for device telemetry
Expose device telemetry data for QAT GEN4 devices through debugfs. This makes it possible to gather metrics about the performance and the utilization of a device, in particular statistics on (1) the utilization of the PCIe channel, (2) address translation, when SVA is enabled, and (3) the internal engines for crypto and data compression.

If telemetry is supported by the firmware, the driver allocates a DMA region and a circular buffer. When telemetry is enabled through the `control` attribute in debugfs, the driver sends the `TL_START` command to the firmware via the admin interface. This triggers the device to periodically gather telemetry data from hardware registers and write it into the DMA memory region. The device writes into the shared region every second.

Every 500ms, the driver snapshots the DMA shared region into the circular buffer. The snapshots are then used to compute basic metrics (min/max/average) on each counter every time the `device_data` attribute is queried.

Telemetry counters are exposed through debugfs in the folder /sys/kernel/debug/qat_<device>_<BDF>/telemetry.

For details, refer to debugfs-driver-qat_telemetry in Documentation/ABI.

This patch is based on earlier work done by Wojciech Ziemba.

Signed-off-by: Lucas Segarra Fernandez <lucas.segarra.fernandez@intel.com>
Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
Reviewed-by: Damian Muszynski <damian.muszynski@intel.com>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
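The min/max/average computation over a small circular buffer described in the commit message can be illustrated with a user-space sketch. This is not the driver's code: the names `tl_ring`, `tl_ring_init`, `tl_ring_push` and `tl_ring_stats` are hypothetical, each snapshot is reduced to a single counter value, and the maximum depth of 4 is an assumption taken from the range of values accepted by `control`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: at most 4 samples, mirroring the 2..4 range of `control`. */
#define TL_RING_MAX 4

/* Hypothetical ring of recent samples for one counter. */
struct tl_ring {
	uint64_t samples[TL_RING_MAX];
	size_t cap;	/* configured depth, 1..TL_RING_MAX */
	size_t count;	/* number of valid samples, never above cap */
	size_t head;	/* slot that the next push overwrites */
};

struct tl_stats {
	uint64_t min;
	uint64_t max;
	uint64_t avg;
};

static void tl_ring_init(struct tl_ring *r, size_t cap)
{
	assert(cap >= 1 && cap <= TL_RING_MAX);
	r->cap = cap;
	r->count = 0;
	r->head = 0;
}

/* Store a sample, overwriting the oldest one once the ring is full. */
static void tl_ring_push(struct tl_ring *r, uint64_t value)
{
	r->samples[r->head] = value;
	r->head = (r->head + 1) % r->cap;
	if (r->count < r->cap)
		r->count++;
}

/* Compute min, max and integer average over the valid samples. */
static struct tl_stats tl_ring_stats(const struct tl_ring *r)
{
	struct tl_stats s;
	uint64_t sum = 0;
	size_t i;

	assert(r->count > 0);
	s.min = r->samples[0];
	s.max = r->samples[0];
	for (i = 0; i < r->count; i++) {
		uint64_t v = r->samples[i];

		if (v < s.min)
			s.min = v;
		if (v > s.max)
			s.max = v;
		sum += v;
	}
	s.avg = sum / r->count;
	return s;
}
```

With a depth of 3, pushing a fourth sample discards the first one, so the statistics always cover only the most recent samples.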
Lucas Segarra Fernandez authored and Herbert Xu committed Dec 29, 2023
1 parent 7f06679, commit 69e7649
Showing 13 changed files with 1,339 additions and 0 deletions.
@@ -0,0 +1,103 @@
What:		/sys/kernel/debug/qat_<device>_<BDF>/telemetry/control
Date:		March 2024
KernelVersion:	6.8
Contact:	qat-linux@intel.com
Description:	(RW) Enables/disables the reporting of telemetry metrics.

		Allowed values to write:
		========================
		* 0: disable telemetry
		* 1: enable telemetry
		* 2, 3, 4: enable telemetry and calculate minimum, maximum
		  and average for each counter over 2, 3 or 4 samples

		Returned values:
		================
		* 1-4: telemetry is enabled and running
		* 0: telemetry is disabled

		Example.

		Writing '3' to this file starts the collection of
		telemetry metrics. Samples are collected every second and
		stored in a circular buffer of size 3. These values are then
		used to calculate the minimum, maximum and average for each
		counter. After enabling, counters can be retrieved through
		the ``device_data`` file::

		  echo 3 > /sys/kernel/debug/qat_4xxx_0000:6b:00.0/telemetry/control

		Writing '0' to this file stops the collection of telemetry
		metrics::

		  echo 0 > /sys/kernel/debug/qat_4xxx_0000:6b:00.0/telemetry/control

		This attribute is only available for qat_4xxx devices.

What:		/sys/kernel/debug/qat_<device>_<BDF>/telemetry/device_data
Date:		March 2024
KernelVersion:	6.8
Contact:	qat-linux@intel.com
Description:	(RO) Reports device telemetry counters.
		Reads report metrics about performance and utilization of
		a QAT device:

		======================= ========================================
		Field                   Description
		======================= ========================================
		sample_cnt              number of acquisitions of telemetry data
		                        from the device. Reads are performed
		                        every 1000 ms.
		pci_trans_cnt           number of PCIe partial transactions
		max_rd_lat              maximum logged read latency [ns] (could
		                        be any read operation)
		rd_lat_acc_avg          average read latency [ns]
		max_gp_lat              max get to put latency [ns] (only takes
		                        samples for AE0)
		gp_lat_acc_avg          average get to put latency [ns]
		bw_in                   PCIe, write bandwidth [Mbps]
		bw_out                  PCIe, read bandwidth [Mbps]
		at_page_req_lat_avg     Address Translator(AT), average page
		                        request latency [ns]
		at_trans_lat_avg        AT, average page translation latency [ns]
		at_max_tlb_used         AT, maximum uTLB used
		util_cpr<N>             utilization of Compression slice N [%]
		exec_cpr<N>             execution count of Compression slice N
		util_xlt<N>             utilization of Translator slice N [%]
		exec_xlt<N>             execution count of Translator slice N
		util_dcpr<N>            utilization of Decompression slice N [%]
		exec_dcpr<N>            execution count of Decompression slice N
		util_pke<N>             utilization of PKE N [%]
		exec_pke<N>             execution count of PKE N
		util_ucs<N>             utilization of UCS slice N [%]
		exec_ucs<N>             execution count of UCS slice N
		util_wat<N>             utilization of Wireless Authentication
		                        slice N [%]
		exec_wat<N>             execution count of Wireless Authentication
		                        slice N
		util_wcp<N>             utilization of Wireless Cipher slice N [%]
		exec_wcp<N>             execution count of Wireless Cipher slice N
		util_cph<N>             utilization of Cipher slice N [%]
		exec_cph<N>             execution count of Cipher slice N
		util_ath<N>             utilization of Authentication slice N [%]
		exec_ath<N>             execution count of Authentication slice N
		======================= ========================================

		The telemetry report file can be read with the following command::

		  cat /sys/kernel/debug/qat_4xxx_0000:6b:00.0/telemetry/device_data

		If ``control`` is set to 1, only the current values of the
		counters are displayed::

		  <counter_name> <current>

		If ``control`` is 2, 3 or 4, counters are displayed in the
		following format::

		  <counter_name> <current> <min> <max> <avg>

		If a device lacks a specific accelerator, the corresponding
		attribute is not reported.

		This attribute is only available for qat_4xxx devices.
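A tool that consumes `device_data` has to cope with both output formats documented above. The following user-space sketch shows one possible way to parse a single line. The names `tl_line` and `tl_parse_line` are hypothetical, and the sketch assumes that all values are printed as unsigned decimal integers separated by whitespace, which the ABI text does not state explicitly.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical holder for one parsed line of device_data. */
struct tl_line {
	char name[32];
	unsigned long long cur;
	unsigned long long min;
	unsigned long long max;
	unsigned long long avg;
	int has_stats;	/* 1 when min/max/avg were present on the line */
};

/*
 * Parse "<counter_name> <current>" or
 * "<counter_name> <current> <min> <max> <avg>".
 * Returns 0 on success and -1 if the line matches neither format.
 */
static int tl_parse_line(const char *line, struct tl_line *out)
{
	int n;

	memset(out, 0, sizeof(*out));
	n = sscanf(line, "%31s %llu %llu %llu %llu",
		   out->name, &out->cur, &out->min, &out->max, &out->avg);
	if (n == 2) {
		out->has_stats = 0;
		return 0;
	}
	if (n == 5) {
		out->has_stats = 1;
		return 0;
	}
	return -1;
}
```

Checking the number of converted fields, rather than the current value of `control`, lets the same parser handle either format without extra state.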
@@ -0,0 +1,118 @@
// SPDX-License-Identifier: GPL-2.0-only
/* Copyright (c) 2023 Intel Corporation. */
#include <linux/export.h>
#include <linux/kernel.h>

#include "adf_gen4_tl.h"
#include "adf_telemetry.h"
#include "adf_tl_debugfs.h"

#define ADF_GEN4_TL_DEV_REG_OFF(reg) ADF_TL_DEV_REG_OFF(reg, gen4)

#define ADF_GEN4_TL_SL_UTIL_COUNTER(_name)	\
	ADF_TL_COUNTER("util_" #_name,		\
			ADF_TL_SIMPLE_COUNT,	\
			ADF_TL_SLICE_REG_OFF(_name, reg_tm_slice_util, gen4))

#define ADF_GEN4_TL_SL_EXEC_COUNTER(_name)	\
	ADF_TL_COUNTER("exec_" #_name,		\
			ADF_TL_SIMPLE_COUNT,	\
			ADF_TL_SLICE_REG_OFF(_name, reg_tm_slice_exec_cnt, gen4))

/* Device level counters. */
static const struct adf_tl_dbg_counter dev_counters[] = {
	/* PCIe partial transactions. */
	ADF_TL_COUNTER(PCI_TRANS_CNT_NAME, ADF_TL_SIMPLE_COUNT,
		       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_pci_trans_cnt)),
	/* Max read latency[ns]. */
	ADF_TL_COUNTER(MAX_RD_LAT_NAME, ADF_TL_COUNTER_NS,
		       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_rd_lat_max)),
	/* Read latency average[ns]. */
	ADF_TL_COUNTER_LATENCY(RD_LAT_ACC_NAME, ADF_TL_COUNTER_NS_AVG,
			       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_rd_lat_acc),
			       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_rd_cmpl_cnt)),
	/* Max get to put latency[ns]. */
	ADF_TL_COUNTER(MAX_LAT_NAME, ADF_TL_COUNTER_NS,
		       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_gp_lat_max)),
	/* Get to put latency average[ns]. */
	ADF_TL_COUNTER_LATENCY(LAT_ACC_NAME, ADF_TL_COUNTER_NS_AVG,
			       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_gp_lat_acc),
			       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_ae_put_cnt)),
	/* PCIe write bandwidth[Mbps]. */
	ADF_TL_COUNTER(BW_IN_NAME, ADF_TL_COUNTER_MBPS,
		       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_bw_in)),
	/* PCIe read bandwidth[Mbps]. */
	ADF_TL_COUNTER(BW_OUT_NAME, ADF_TL_COUNTER_MBPS,
		       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_bw_out)),
	/* Page request latency average[ns]. */
	ADF_TL_COUNTER_LATENCY(PAGE_REQ_LAT_NAME, ADF_TL_COUNTER_NS_AVG,
			       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_at_page_req_lat_acc),
			       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_at_page_req_cnt)),
	/* Page translation latency average[ns]. */
	ADF_TL_COUNTER_LATENCY(AT_TRANS_LAT_NAME, ADF_TL_COUNTER_NS_AVG,
			       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_at_trans_lat_acc),
			       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_at_trans_lat_cnt)),
	/* Maximum uTLB used. */
	ADF_TL_COUNTER(AT_MAX_UTLB_USED_NAME, ADF_TL_SIMPLE_COUNT,
		       ADF_GEN4_TL_DEV_REG_OFF(reg_tl_at_max_tlb_used)),
};

/* Slice utilization counters. */
static const struct adf_tl_dbg_counter sl_util_counters[ADF_TL_SL_CNT_COUNT] = {
	/* Compression slice utilization. */
	ADF_GEN4_TL_SL_UTIL_COUNTER(cpr),
	/* Translator slice utilization. */
	ADF_GEN4_TL_SL_UTIL_COUNTER(xlt),
	/* Decompression slice utilization. */
	ADF_GEN4_TL_SL_UTIL_COUNTER(dcpr),
	/* PKE utilization. */
	ADF_GEN4_TL_SL_UTIL_COUNTER(pke),
	/* Wireless Authentication slice utilization. */
	ADF_GEN4_TL_SL_UTIL_COUNTER(wat),
	/* Wireless Cipher slice utilization. */
	ADF_GEN4_TL_SL_UTIL_COUNTER(wcp),
	/* UCS slice utilization. */
	ADF_GEN4_TL_SL_UTIL_COUNTER(ucs),
	/* Cipher slice utilization. */
	ADF_GEN4_TL_SL_UTIL_COUNTER(cph),
	/* Authentication slice utilization. */
	ADF_GEN4_TL_SL_UTIL_COUNTER(ath),
};

/* Slice execution counters. */
static const struct adf_tl_dbg_counter sl_exec_counters[ADF_TL_SL_CNT_COUNT] = {
	/* Compression slice execution count. */
	ADF_GEN4_TL_SL_EXEC_COUNTER(cpr),
	/* Translator slice execution count. */
	ADF_GEN4_TL_SL_EXEC_COUNTER(xlt),
	/* Decompression slice execution count. */
	ADF_GEN4_TL_SL_EXEC_COUNTER(dcpr),
	/* PKE execution count. */
	ADF_GEN4_TL_SL_EXEC_COUNTER(pke),
	/* Wireless Authentication slice execution count. */
	ADF_GEN4_TL_SL_EXEC_COUNTER(wat),
	/* Wireless Cipher slice execution count. */
	ADF_GEN4_TL_SL_EXEC_COUNTER(wcp),
	/* UCS slice execution count. */
	ADF_GEN4_TL_SL_EXEC_COUNTER(ucs),
	/* Cipher slice execution count. */
	ADF_GEN4_TL_SL_EXEC_COUNTER(cph),
	/* Authentication slice execution count. */
	ADF_GEN4_TL_SL_EXEC_COUNTER(ath),
};

void adf_gen4_init_tl_data(struct adf_tl_hw_data *tl_data)
{
	tl_data->layout_sz = ADF_GEN4_TL_LAYOUT_SZ;
	tl_data->slice_reg_sz = ADF_GEN4_TL_SLICE_REG_SZ;
	tl_data->num_hbuff = ADF_GEN4_TL_NUM_HIST_BUFFS;
	tl_data->msg_cnt_off = ADF_GEN4_TL_MSG_CNT_OFF;
	tl_data->cpp_ns_per_cycle = ADF_GEN4_CPP_NS_PER_CYCLE;
	tl_data->bw_units_to_bytes = ADF_GEN4_TL_BW_HW_UNITS_TO_BYTES;

	tl_data->dev_counters = dev_counters;
	tl_data->num_dev_counters = ARRAY_SIZE(dev_counters);
	tl_data->sl_util_counters = sl_util_counters;
	tl_data->sl_exec_counters = sl_exec_counters;
}
EXPORT_SYMBOL_GPL(adf_gen4_init_tl_data);
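The counter tables above pair each name with a type and one or two register offsets. The definitions of `ADF_TL_COUNTER`, `ADF_TL_COUNTER_LATENCY` and `ADF_TL_DEV_REG_OFF` are not part of this file, so the following stand-alone sketch only illustrates the general table-driven pattern under stated assumptions: that offsets are obtained with `offsetof()` on a layout structure, and that an averaged latency is an accumulator divided by an event count and scaled by a nanoseconds-per-cycle factor. Every `demo_` name and the layout itself are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical layout of a telemetry snapshot; not the real GEN4 layout. */
struct demo_layout {
	uint32_t reg_rd_lat_acc;
	uint32_t reg_rd_cmpl_cnt;
	uint32_t reg_bw_in;
};

enum demo_type {
	DEMO_SIMPLE_COUNT,	/* report the register value as is */
	DEMO_NS_AVG,		/* accumulator / count, scaled to ns */
};

struct demo_counter {
	const char *name;
	enum demo_type type;
	size_t offset1;		/* primary register */
	size_t offset2;		/* divisor register, DEMO_NS_AVG only */
};

#define DEMO_REG_OFF(reg) offsetof(struct demo_layout, reg)

static const struct demo_counter demo_counters[] = {
	{ "rd_lat_acc_avg", DEMO_NS_AVG,
	  DEMO_REG_OFF(reg_rd_lat_acc), DEMO_REG_OFF(reg_rd_cmpl_cnt) },
	{ "bw_in", DEMO_SIMPLE_COUNT, DEMO_REG_OFF(reg_bw_in), 0 },
};

/* Read a 32-bit register at a byte offset inside the snapshot. */
static uint32_t demo_read_reg(const void *snapshot, size_t off)
{
	uint32_t v;

	memcpy(&v, (const unsigned char *)snapshot + off, sizeof(v));
	return v;
}

/* Evaluate one counter; an average over zero events is reported as 0. */
static uint64_t demo_eval(const struct demo_counter *c, const void *snapshot,
			  uint64_t ns_per_cycle)
{
	uint64_t v = demo_read_reg(snapshot, c->offset1);

	if (c->type == DEMO_NS_AVG) {
		uint64_t cnt = demo_read_reg(snapshot, c->offset2);

		return cnt ? v * ns_per_cycle / cnt : 0;
	}
	return v;
}
```

Keeping names, types and offsets in a constant table means that supporting another hardware layout only requires a new table, while the evaluation code stays unchanged.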