Merge branch 'thermal-hfi'
Merge Intel Hardware Feedback Interface (HFI) thermal driver for
5.18-rc1 and update the intel-speed-select utility to support that
driver.

* thermal-hfi:
  tools/power/x86/intel-speed-select: v1.12 release
  tools/power/x86/intel-speed-select: HFI support
  tools/power/x86/intel-speed-select: OOB daemon mode
  thermal: intel: hfi: INTEL_HFI_THERMAL depends on NET
  thermal: netlink: Fix parameter type of thermal_genl_cpu_capability_event() stub
  thermal: intel: hfi: Notify user space for HFI events
  thermal: netlink: Add a new event to notify CPU capabilities change
  thermal: intel: hfi: Enable notification interrupt
  thermal: intel: hfi: Handle CPU hotplug events
  thermal: intel: hfi: Minimally initialize the Hardware Feedback Interface
  x86/cpu: Add definitions for the Intel Hardware Feedback Interface
  x86/Documentation: Describe the Intel Hardware Feedback Interface
Rafael J. Wysocki committed Mar 18, 2022
2 parents 2d6fc14 + 2045d38 commit 31035f3
Showing 18 changed files with 1,392 additions and 16 deletions.
1 change: 1 addition & 0 deletions Documentation/x86/index.rst
@@ -21,6 +21,7 @@ x86-specific Documentation
tlb
mtrr
pat
intel-hfi
intel-iommu
intel_txt
amd-memory-encryption
72 changes: 72 additions & 0 deletions Documentation/x86/intel-hfi.rst
@@ -0,0 +1,72 @@
.. SPDX-License-Identifier: GPL-2.0
============================================================
Hardware-Feedback Interface for scheduling on Intel Hardware
============================================================

Overview
--------

Intel has described the Hardware Feedback Interface (HFI) in the Intel 64 and
IA-32 Architectures Software Developer's Manual (Intel SDM) Volume 3 Section
14.6 [1]_.

The HFI gives the operating system performance and energy efficiency
capability data for each CPU in the system. Linux can use the information from
the HFI to influence task placement decisions.

The Hardware Feedback Interface
-------------------------------

The Hardware Feedback Interface provides the operating system with information
about the performance and energy efficiency of each CPU in the system. Each
capability is given as a unit-less quantity in the range [0-255]. Higher values
indicate higher capability. Energy efficiency and performance are reported in
separate capabilities. Even though on some systems these two metrics may be
related, they are specified as independent capabilities in the Intel SDM.
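
As an illustration of how a consumer might hold and compare these values, the
sketch below stores each capability in one byte, which matches the 0-255 range
described above. The struct name, the field names, and the helper are
hypothetical and are not taken from the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container: one byte per capability covers the 0-255 range. */
struct cpu_caps_sketch {
	uint8_t perf;       /* performance capability, higher is better */
	uint8_t energy_eff; /* energy efficiency capability, higher is better */
};

/*
 * Return the index of the CPU with the highest performance capability.
 * Ties resolve to the lowest index. Returns -1 if there are no CPUs.
 */
static int most_performant_cpu(const struct cpu_caps_sketch *caps, size_t ncpus)
{
	if (caps == NULL || ncpus == 0)
		return -1;

	size_t best = 0;
	for (size_t i = 1; i < ncpus; i++) {
		if (caps[i].perf > caps[best].perf)
			best = i;
	}
	return (int)best;
}
```

Because the two capabilities are independent, a similar helper over
``energy_eff`` could return a different CPU than this one.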

These capabilities may change at runtime as a result of changes in the
operating conditions of the system or the action of external factors. The rate
at which these capabilities are updated is specific to each processor model. On
some models, capabilities are set at boot time and never change. On others,
capabilities may change every tens of milliseconds. For instance, a remote
mechanism may be used to lower Thermal Design Power. Such a change can be
reflected in the HFI. Likewise, if the system needs to be throttled due to
excessive heat, the HFI may reflect reduced performance on specific CPUs.

The kernel or a userspace policy daemon can use these capabilities to modify
task placement decisions. For instance, if either the performance or the energy
efficiency capability of a given logical processor becomes zero, the hardware
is recommending that the operating system not schedule any tasks on that
processor for performance or energy efficiency reasons, respectively.
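
The zero-capability rule can be sketched as a small predicate. This is only an
illustration of the rule as stated above: the type and function names are
hypothetical, and a real policy would weigh other inputs as well.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
enum placement_goal_sketch {
	GOAL_PERFORMANCE,
	GOAL_ENERGY_EFFICIENCY,
};

struct cpu_caps_sketch {
	uint8_t perf;       /* performance capability, 0-255 */
	uint8_t energy_eff; /* energy efficiency capability, 0-255 */
};

/*
 * A capability of zero means the hardware recommends not scheduling tasks
 * on this CPU for the corresponding reason.
 */
static bool cpu_recommended(struct cpu_caps_sketch caps,
			    enum placement_goal_sketch goal)
{
	if (goal == GOAL_PERFORMANCE)
		return caps.perf != 0;
	return caps.energy_eff != 0;
}
```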

Implementation details for Linux
--------------------------------

The infrastructure to handle thermal event interrupts has two parts. In the
Local Vector Table of a CPU's local APIC, there exists a register for the
thermal monitor. This register controls how interrupts are delivered to a CPU
when the thermal monitor generates an interrupt. Further details can be found
in the Intel SDM Vol. 3 Section 10.5 [1]_.

The thermal monitor may generate interrupts per CPU or per package. The HFI
generates package-level interrupts. This monitor is configured and initialized
via a set of machine-specific registers. Specifically, the HFI interrupt and
status are controlled via designated bits in the IA32_PACKAGE_THERM_INTERRUPT
and IA32_PACKAGE_THERM_STATUS registers, respectively. There exists one HFI
table per package. Further details can be found in the Intel SDM Vol. 3
Section 14.9 [1]_.
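
The designated bits can be illustrated with plain bit operations on register
values. The two macros are the ones this merge adds to ``msr-index.h``. The
helper functions are hypothetical and work on values that are assumed to have
been read already; the sketch performs no MSR access of its own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit definitions as added to arch/x86/include/asm/msr-index.h. */
#define PACKAGE_THERM_STATUS_HFI_UPDATED	(1 << 26)
#define PACKAGE_THERM_INT_HFI_ENABLE		(1 << 25)

/* Hypothetical helper: does this status value report an updated HFI table? */
static bool hfi_update_pending(uint64_t therm_status)
{
	return (therm_status & PACKAGE_THERM_STATUS_HFI_UPDATED) != 0;
}

/*
 * Hypothetical helper: return the interrupt-control value with the HFI
 * interrupt enabled and all other bits preserved.
 */
static uint64_t hfi_enable_interrupt(uint64_t therm_int)
{
	return therm_int | PACKAGE_THERM_INT_HFI_ENABLE;
}
```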

The hardware issues an HFI interrupt after it has updated the HFI table and the
table is ready for the operating system to consume. CPUs receive this interrupt
via the thermal entry in the Local APIC's Local Vector Table.

When servicing this interrupt, the HFI driver parses the updated table and
relays the update to userspace using the thermal notification framework. Given
that there may be many HFI updates every second, the updates relayed to
userspace are throttled to at most one every CONFIG_HZ jiffies (that is, one
per second).
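
One way to picture this throttling is a timestamp check that lets an update
through only when a full interval has passed since the last relayed one. This
is a sketch, not the driver's mechanism: the struct, the function, and the
``SKETCH_HZ`` value are hypothetical stand-ins, and the tick counter is passed
in by the caller.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for CONFIG_HZ: ticks per second. */
#define SKETCH_HZ 250u

struct relay_throttle_sketch {
	uint64_t last_relay; /* tick at which the last update was relayed */
	bool relayed_once;   /* false until the first update goes through */
};

/*
 * Return true if an update arriving at tick `now` may be relayed, and
 * record it. Updates that arrive less than SKETCH_HZ ticks after the last
 * relayed update are suppressed.
 */
static bool should_relay(struct relay_throttle_sketch *t, uint64_t now)
{
	if (t->relayed_once && now - t->last_relay < SKETCH_HZ)
		return false;

	t->last_relay = now;
	t->relayed_once = true;
	return true;
}
```

A real implementation would more likely defer and coalesce updates than drop
them, so that the most recent table contents still reach userspace.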

References
----------

.. [1] https://www.intel.com/sdm
1 change: 1 addition & 0 deletions arch/x86/include/asm/cpufeatures.h
@@ -330,6 +330,7 @@
#define X86_FEATURE_HWP_ACT_WINDOW (14*32+ 9) /* HWP Activity Window */
#define X86_FEATURE_HWP_EPP (14*32+10) /* HWP Energy Perf. Preference */
#define X86_FEATURE_HWP_PKG_REQ (14*32+11) /* HWP Package Level Request */
#define X86_FEATURE_HFI (14*32+19) /* Hardware Feedback Interface */

/* AMD SVM Feature Identification, CPUID level 0x8000000a (EDX), word 15 */
#define X86_FEATURE_NPT (15*32+ 0) /* Nested Page Table support */
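
The ``(14*32+19)`` encoding packs a capability word index and a bit position
into one number. In ``cpufeatures.h``, word 14 holds CPUID leaf 6 (EAX), so
this definition refers to bit 19 of that register. The sketch below decodes the
encoding; the helper names are hypothetical, and the register value is assumed
to be supplied by the caller rather than read from the CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* As added to arch/x86/include/asm/cpufeatures.h by this merge. */
#define X86_FEATURE_HFI		(14*32+19)

/* Hypothetical helpers that split a feature number into word and bit. */
static unsigned int feature_word(unsigned int feature)
{
	return feature / 32;
}

static unsigned int feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Test the feature's bit in a 32-bit capability word supplied by the caller. */
static bool feature_present(uint32_t word_value, unsigned int feature)
{
	return ((word_value >> feature_bit(feature)) & 1u) != 0;
}
```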
6 changes: 6 additions & 0 deletions arch/x86/include/asm/msr-index.h
@@ -705,12 +705,14 @@

#define PACKAGE_THERM_STATUS_PROCHOT (1 << 0)
#define PACKAGE_THERM_STATUS_POWER_LIMIT (1 << 10)
#define PACKAGE_THERM_STATUS_HFI_UPDATED (1 << 26)

#define MSR_IA32_PACKAGE_THERM_INTERRUPT 0x000001b2

#define PACKAGE_THERM_INT_HIGH_ENABLE (1 << 0)
#define PACKAGE_THERM_INT_LOW_ENABLE (1 << 1)
#define PACKAGE_THERM_INT_PLN_ENABLE (1 << 24)
#define PACKAGE_THERM_INT_HFI_ENABLE (1 << 25)

/* Thermal Thresholds Support */
#define THERM_INT_THRESHOLD0_ENABLE (1 << 15)
@@ -959,4 +961,8 @@
#define MSR_VM_IGNNE 0xc0010115
#define MSR_VM_HSAVE_PA 0xc0010117

/* Hardware Feedback Interface */
#define MSR_IA32_HW_FEEDBACK_PTR 0x17d0
#define MSR_IA32_HW_FEEDBACK_CONFIG 0x17d1

#endif /* _ASM_X86_MSR_INDEX_H */
14 changes: 14 additions & 0 deletions drivers/thermal/intel/Kconfig
@@ -99,3 +99,17 @@ config INTEL_MENLOW
Intel Menlow platform.

If unsure, say N.

config INTEL_HFI_THERMAL
bool "Intel Hardware Feedback Interface"
depends on NET
depends on CPU_SUP_INTEL
depends on X86_THERMAL_VECTOR
select THERMAL_NETLINK
help
Select this option to enable the Hardware Feedback Interface. If
selected, hardware provides guidance to the operating system on
the performance and energy efficiency capabilities of each CPU.
These capabilities may change as a result of changes in the operating
conditions of the system such as power and thermal limits. If selected,
the kernel relays updates in CPUs' capabilities to userspace.
1 change: 1 addition & 0 deletions drivers/thermal/intel/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_INTEL_PCH_THERMAL) += intel_pch_thermal.o
obj-$(CONFIG_INTEL_TCC_COOLING) += intel_tcc_cooling.o
obj-$(CONFIG_X86_THERMAL_VECTOR) += therm_throt.o
obj-$(CONFIG_INTEL_MENLOW) += intel_menlow.o
obj-$(CONFIG_INTEL_HFI_THERMAL) += intel_hfi.o