Skip to content

Commit

Permalink
Merge tag 'pm-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/…
Browse files Browse the repository at this point in the history
…git/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "The most signigicant change here is the addition of a new cpufreq
  'P-state' driver for AMD processors as a better replacement for the
  venerable acpi-cpufreq driver.

  There are also other cpufreq updates (in the core, intel_pstate, ARM
  drivers), PM core updates (mostly related to adding new macros for
  declaring PM operations which should make the lives of driver
  developers somewhat easier), and a bunch of assorted fixes and
  cleanups.

  Summary:

   - Add new P-state driver for AMD processors (Huang Rui).

   - Fix initialization of min and max frequency QoS requests in the
     cpufreq core (Rafael Wysocki).

   - Fix EPP handling on Alder Lake in intel_pstate (Srinivas
     Pandruvada).

   - Make intel_pstate update cpuinfo.max_freq when notified of HWP
     capabilities changes and drop a redundant function call from that
     driver (Rafael Wysocki).

   - Improve IRQ support in the Qcom cpufreq driver (Ard Biesheuvel,
     Stephen Boyd, Vladimir Zapolskiy).

   - Fix double devm_remap() in the Mediatek cpufreq driver (Hector
     Yuan).

   - Introduce thermal pressure helpers for cpufreq CPU cooling (Lukasz
     Luba).

   - Make cpufreq use default_groups in kobj_type (Greg Kroah-Hartman).

   - Make cpuidle use default_groups in kobj_type (Greg Kroah-Hartman).

   - Fix two comments in cpuidle code (Jason Wang, Yang Li).

   - Allow model-specific normal EPB value to be used in the intel_epb
     sysfs attribute handling code (Srinivas Pandruvada).

   - Simplify locking in pm_runtime_put_suppliers() (Rafael Wysocki).

   - Add safety net to supplier device release in the runtime PM core
     code (Rafael Wysocki).

   - Capture device status before disabling runtime PM for it (Rafael
     Wysocki).

   - Add new macros for declaring PM operations to allow drivers to
     avoid guarding them with CONFIG_PM #ifdefs or __maybe_unused and
     update some drivers to use these macros (Paul Cercueil).

   - Allow ACPI hardware signature to be honoured during restore from
     hibernation (David Woodhouse).

   - Update outdated operating performance points (OPP) documentation
     (Tang Yizhou).

   - Reduce log severity for informative message regarding frequency
     transition failures in devfreq (Tzung-Bi Shih).

   - Add DRAM frequency controller devfreq driver for Allwinner sunXi
     SoCs (Samuel Holland).

   - Add missing COMMON_CLK dependency to sun8i devfreq driver (Arnd
     Bergmann).

   - Add support for new layout of Psys PowerLimit Register on SPR to
     the Intel RAPL power capping driver (Zhang Rui).

   - Fix typo in a comment in idle_inject.c (Jason Wang).

   - Remove unused function definition from the DTPM (Dynamit Thermal
     Power Management) power capping framework (Daniel Lezcano).

   - Reduce DTPM trace verbosity (Daniel Lezcano)"

* tag 'pm-5.17-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits)
  x86, sched: Fix undefined reference to init_freq_invariance_cppc() build error
  cpufreq: amd-pstate: Fix Kconfig dependencies for AMD P-State
  cpufreq: amd-pstate: Fix struct amd_cpudata kernel-doc comment
  cpuidle: use default_groups in kobj_type
  x86: intel_epb: Allow model specific normal EPB value
  MAINTAINERS: Add AMD P-State driver maintainer entry
  Documentation: amd-pstate: Add AMD P-State driver introduction
  cpufreq: amd-pstate: Add AMD P-State performance attributes
  cpufreq: amd-pstate: Add AMD P-State frequencies attributes
  cpufreq: amd-pstate: Add boost mode support for AMD P-State
  cpufreq: amd-pstate: Add trace for AMD P-State module
  cpufreq: amd-pstate: Introduce the support for the processors with shared memory solution
  cpufreq: amd-pstate: Add fast switch function for AMD P-State
  cpufreq: amd-pstate: Introduce a new AMD P-State driver to support future processors
  ACPI: CPPC: Add CPPC enable register function
  ACPI: CPPC: Check present CPUs for determining _CPC is valid
  ACPI: CPPC: Implement support for SystemIO registers
  x86/msr: Add AMD CPPC MSR definitions
  x86/cpufeatures: Add AMD Collaborative Processor Performance Control feature flag
  cpufreq: use default_groups in kobj_type
  ...
  • Loading branch information
Linus Torvalds committed Jan 11, 2022
2 parents bca2175 + 78e6e4d commit b35b6d4
Show file tree
Hide file tree
Showing 55 changed files with 2,274 additions and 218 deletions.
2 changes: 2 additions & 0 deletions Documentation/admin-guide/acpi/cppc_sysfs.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,8 @@
Collaborative Processor Performance Control (CPPC)
==================================================

.. _cppc_sysfs:

CPPC
====

Expand Down
15 changes: 12 additions & 3 deletions Documentation/admin-guide/kernel-parameters.txt
Original file line number Diff line number Diff line change
Expand Up @@ -225,14 +225,23 @@
For broken nForce2 BIOS resulting in XT-PIC timer.

acpi_sleep= [HW,ACPI] Sleep options
Format: { s3_bios, s3_mode, s3_beep, s4_nohwsig,
old_ordering, nonvs, sci_force_enable, nobl }
Format: { s3_bios, s3_mode, s3_beep, s4_hwsig,
s4_nohwsig, old_ordering, nonvs,
sci_force_enable, nobl }
See Documentation/power/video.rst for information on
s3_bios and s3_mode.
s3_beep is for debugging; it makes the PC's speaker beep
as soon as the kernel's real-mode entry point is called.
s4_hwsig causes the kernel to check the ACPI hardware
signature during resume from hibernation, and gracefully
refuse to resume if it has changed. This complies with
the ACPI specification but not with reality, since
Windows does not do this and many laptops do change it
on docking. So the default behaviour is to allow resume
and simply warn when the signature changes, unless the
s4_hwsig option is enabled.
s4_nohwsig prevents ACPI hardware signature from being
used during resume from hibernation.
used (or even warned about) during resume.
old_ordering causes the ACPI 1.0 ordering of the _PTS
control method, with respect to putting devices into
low power states, to be enforced (the ACPI 2.0 ordering
Expand Down
382 changes: 382 additions & 0 deletions Documentation/admin-guide/pm/amd-pstate.rst

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions Documentation/admin-guide/pm/working-state.rst
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,7 @@ Working-State Power Management
intel_idle
cpufreq
intel_pstate
amd-pstate
cpufreq_drivers
intel_epb
intel-speed-select
14 changes: 7 additions & 7 deletions Documentation/power/opp.rst
Original file line number Diff line number Diff line change
Expand Up @@ -48,9 +48,9 @@ We can represent these as three OPPs as the following {Hz, uV} tuples:
OPP library provides a set of helper functions to organize and query the OPP
information. The library is located in drivers/opp/ directory and the header
is located in include/linux/pm_opp.h. OPP library can be enabled by enabling
CONFIG_PM_OPP from power management menuconfig menu. OPP library depends on
CONFIG_PM as certain SoCs such as Texas Instrument's OMAP framework allows to
optionally boot at a certain OPP without needing cpufreq.
CONFIG_PM_OPP from power management menuconfig menu. Certain SoCs such as Texas
Instrument's OMAP framework allows to optionally boot at a certain OPP without
needing cpufreq.

Typical usage of the OPP library is as follows::

Expand All @@ -75,8 +75,8 @@ operations until that OPP could be re-enabled if possible.

OPP library facilitates this concept in its implementation. The following
operational functions operate only on available opps:
opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq,
dev_pm_opp_get_opp_count
dev_pm_opp_find_freq_{ceil, floor}, dev_pm_opp_get_voltage, dev_pm_opp_get_freq,
dev_pm_opp_get_opp_count.

dev_pm_opp_find_freq_exact is meant to be used to find the opp pointer
which can then be used for dev_pm_opp_enable/disable functions to make an
Expand All @@ -103,7 +103,7 @@ dev_pm_opp_add
The OPP is defined using the frequency and voltage. Once added, the OPP
is assumed to be available and control of its availability can be done
with the dev_pm_opp_enable/disable functions. OPP library
internally stores and manages this information in the opp struct.
internally stores and manages this information in the dev_pm_opp struct.
This function may be used by SoC framework to define a optimal list
as per the demands of SoC usage environment.

Expand Down Expand Up @@ -247,7 +247,7 @@ dev_pm_opp_disable
5. OPP Data Retrieval Functions
===============================
Since OPP library abstracts away the OPP information, a set of functions to pull
information from the OPP structure is necessary. Once an OPP pointer is
information from the dev_pm_opp structure is necessary. Once an OPP pointer is
retrieved using the search functions, the following functions can be used by SoC
framework to retrieve the information represented inside the OPP layer.

Expand Down
14 changes: 10 additions & 4 deletions Documentation/power/runtime_pm.rst
Original file line number Diff line number Diff line change
Expand Up @@ -265,6 +265,10 @@ defined in include/linux/pm.h:
RPM_SUSPENDED, which means that each device is initially regarded by the
PM core as 'suspended', regardless of its real hardware status

`enum rpm_status last_status;`
- the last runtime PM status of the device captured before disabling runtime
PM for it (invalid initially and when disable_depth is 0)

`unsigned int runtime_auto;`
- if set, indicates that the user space has allowed the device driver to
power manage the device at run time via the /sys/devices/.../power/control
Expand Down Expand Up @@ -333,10 +337,12 @@ drivers/base/power/runtime.c and include/linux/pm_runtime.h:

`int pm_runtime_resume(struct device *dev);`
- execute the subsystem-level resume callback for the device; returns 0 on
success, 1 if the device's runtime PM status was already 'active' or
error code on failure, where -EAGAIN means it may be safe to attempt to
resume the device again in future, but 'power.runtime_error' should be
checked additionally, and -EACCES means that 'power.disable_depth' is
success, 1 if the device's runtime PM status is already 'active' (also if
'power.disable_depth' is nonzero, but the status was 'active' when it was
changing from 0 to 1) or error code on failure, where -EAGAIN means it may
be safe to attempt to resume the device again in future, but
'power.runtime_error' should be checked additionally, and -EACCES means
that the callback could not be run, because 'power.disable_depth' was
different from 0

`int pm_runtime_resume_and_get(struct device *dev);`
Expand Down
7 changes: 7 additions & 0 deletions MAINTAINERS
Original file line number Diff line number Diff line change
Expand Up @@ -994,6 +994,13 @@ S: Supported
T: git https://gitlab.freedesktop.org/agd5f/linux.git
F: drivers/gpu/drm/amd/pm/

AMD PSTATE DRIVER
M: Huang Rui <ray.huang@amd.com>
L: linux-pm@vger.kernel.org
S: Supported
F: Documentation/admin-guide/pm/amd-pstate.rst
F: drivers/cpufreq/amd-pstate*

AMD PTDMA DRIVER
M: Sanjay R Mehta <sanju.mehta@amd.com>
L: dmaengine@vger.kernel.org
Expand Down
2 changes: 1 addition & 1 deletion arch/arm/include/asm/topology.h
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@

/* Replace task scheduler's default thermal pressure API */
#define arch_scale_thermal_pressure topology_get_thermal_pressure
#define arch_set_thermal_pressure topology_set_thermal_pressure
#define arch_update_thermal_pressure topology_update_thermal_pressure

#else

Expand Down
2 changes: 1 addition & 1 deletion arch/arm64/include/asm/topology.h
Original file line number Diff line number Diff line change
Expand Up @@ -32,7 +32,7 @@ void update_freq_counters_refs(void);

/* Replace task scheduler's default thermal pressure API */
#define arch_scale_thermal_pressure topology_get_thermal_pressure
#define arch_set_thermal_pressure topology_set_thermal_pressure
#define arch_update_thermal_pressure topology_update_thermal_pressure

#include <asm-generic/topology.h>

Expand Down
1 change: 1 addition & 0 deletions arch/x86/include/asm/cpufeatures.h
Original file line number Diff line number Diff line change
Expand Up @@ -315,6 +315,7 @@
#define X86_FEATURE_AMD_SSBD (13*32+24) /* "" Speculative Store Bypass Disable */
#define X86_FEATURE_VIRT_SSBD (13*32+25) /* Virtualized Speculative Store Bypass Disable */
#define X86_FEATURE_AMD_SSB_NO (13*32+26) /* "" Speculative Store Bypass is fixed in hardware. */
#define X86_FEATURE_CPPC (13*32+27) /* Collaborative Processor Performance Control */

/* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
#define X86_FEATURE_DTHERM (14*32+ 0) /* Digital Thermal Sensor */
Expand Down
17 changes: 17 additions & 0 deletions arch/x86/include/asm/msr-index.h
Original file line number Diff line number Diff line change
Expand Up @@ -486,6 +486,23 @@

#define MSR_AMD64_VIRT_SPEC_CTRL 0xc001011f

/* AMD Collaborative Processor Performance Control MSRs */
#define MSR_AMD_CPPC_CAP1 0xc00102b0
#define MSR_AMD_CPPC_ENABLE 0xc00102b1
#define MSR_AMD_CPPC_CAP2 0xc00102b2
#define MSR_AMD_CPPC_REQ 0xc00102b3
#define MSR_AMD_CPPC_STATUS 0xc00102b4

#define AMD_CPPC_LOWEST_PERF(x) (((x) >> 0) & 0xff)
#define AMD_CPPC_LOWNONLIN_PERF(x) (((x) >> 8) & 0xff)
#define AMD_CPPC_NOMINAL_PERF(x) (((x) >> 16) & 0xff)
#define AMD_CPPC_HIGHEST_PERF(x) (((x) >> 24) & 0xff)

#define AMD_CPPC_MAX_PERF(x) (((x) & 0xff) << 0)
#define AMD_CPPC_MIN_PERF(x) (((x) & 0xff) << 8)
#define AMD_CPPC_DES_PERF(x) (((x) & 0xff) << 16)
#define AMD_CPPC_ENERGY_PERF_PREF(x) (((x) & 0xff) << 24)

/* Fam 17h MSRs */
#define MSR_F17H_IRPERF 0xc00000e9

Expand Down
2 changes: 1 addition & 1 deletion arch/x86/include/asm/topology.h
Original file line number Diff line number Diff line change
Expand Up @@ -221,7 +221,7 @@ static inline void arch_set_max_freq_ratio(bool turbo_disabled)
}
#endif

#ifdef CONFIG_ACPI_CPPC_LIB
#if defined(CONFIG_ACPI_CPPC_LIB) && defined(CONFIG_SMP)
void init_freq_invariance_cppc(void);
#define init_freq_invariance_cppc init_freq_invariance_cppc
#endif
Expand Down
4 changes: 3 additions & 1 deletion arch/x86/kernel/acpi/sleep.c
Original file line number Diff line number Diff line change
Expand Up @@ -139,8 +139,10 @@ static int __init acpi_sleep_setup(char *str)
if (strncmp(str, "s3_beep", 7) == 0)
acpi_realmode_flags |= 4;
#ifdef CONFIG_HIBERNATION
if (strncmp(str, "s4_hwsig", 8) == 0)
acpi_check_s4_hw_signature(1);
if (strncmp(str, "s4_nohwsig", 10) == 0)
acpi_no_s4_hw_signature();
acpi_check_s4_hw_signature(0);
#endif
if (strncmp(str, "nonvs", 5) == 0)
acpi_nvs_nosave();
Expand Down
45 changes: 32 additions & 13 deletions arch/x86/kernel/cpu/intel_epb.c
Original file line number Diff line number Diff line change
Expand Up @@ -16,6 +16,7 @@
#include <linux/syscore_ops.h>
#include <linux/pm.h>

#include <asm/cpu_device_id.h>
#include <asm/cpufeature.h>
#include <asm/msr.h>

Expand Down Expand Up @@ -58,6 +59,22 @@ static DEFINE_PER_CPU(u8, saved_epb);
#define EPB_SAVED 0x10ULL
#define MAX_EPB EPB_MASK

enum energy_perf_value_index {
EPB_INDEX_PERFORMANCE,
EPB_INDEX_BALANCE_PERFORMANCE,
EPB_INDEX_NORMAL,
EPB_INDEX_BALANCE_POWERSAVE,
EPB_INDEX_POWERSAVE,
};

static u8 energ_perf_values[] = {
[EPB_INDEX_PERFORMANCE] = ENERGY_PERF_BIAS_PERFORMANCE,
[EPB_INDEX_BALANCE_PERFORMANCE] = ENERGY_PERF_BIAS_BALANCE_PERFORMANCE,
[EPB_INDEX_NORMAL] = ENERGY_PERF_BIAS_NORMAL,
[EPB_INDEX_BALANCE_POWERSAVE] = ENERGY_PERF_BIAS_BALANCE_POWERSAVE,
[EPB_INDEX_POWERSAVE] = ENERGY_PERF_BIAS_POWERSAVE,
};

static int intel_epb_save(void)
{
u64 epb;
Expand Down Expand Up @@ -90,7 +107,7 @@ static void intel_epb_restore(void)
*/
val = epb & EPB_MASK;
if (val == ENERGY_PERF_BIAS_PERFORMANCE) {
val = ENERGY_PERF_BIAS_NORMAL;
val = energ_perf_values[EPB_INDEX_NORMAL];
pr_warn_once("ENERGY_PERF_BIAS: Set to 'normal', was 'performance'\n");
}
}
Expand All @@ -103,18 +120,11 @@ static struct syscore_ops intel_epb_syscore_ops = {
};

static const char * const energy_perf_strings[] = {
"performance",
"balance-performance",
"normal",
"balance-power",
"power"
};
static const u8 energ_perf_values[] = {
ENERGY_PERF_BIAS_PERFORMANCE,
ENERGY_PERF_BIAS_BALANCE_PERFORMANCE,
ENERGY_PERF_BIAS_NORMAL,
ENERGY_PERF_BIAS_BALANCE_POWERSAVE,
ENERGY_PERF_BIAS_POWERSAVE
[EPB_INDEX_PERFORMANCE] = "performance",
[EPB_INDEX_BALANCE_PERFORMANCE] = "balance-performance",
[EPB_INDEX_NORMAL] = "normal",
[EPB_INDEX_BALANCE_POWERSAVE] = "balance-power",
[EPB_INDEX_POWERSAVE] = "power",
};

static ssize_t energy_perf_bias_show(struct device *dev,
Expand Down Expand Up @@ -193,13 +203,22 @@ static int intel_epb_offline(unsigned int cpu)
return 0;
}

static const struct x86_cpu_id intel_epb_normal[] = {
X86_MATCH_INTEL_FAM6_MODEL(ALDERLAKE_L, 7),
{}
};

static __init int intel_epb_init(void)
{
const struct x86_cpu_id *id = x86_match_cpu(intel_epb_normal);
int ret;

if (!boot_cpu_has(X86_FEATURE_EPB))
return -ENODEV;

if (id)
energ_perf_values[EPB_INDEX_NORMAL] = id->driver_data;

ret = cpuhp_setup_state(CPUHP_AP_X86_INTEL_EPB_ONLINE,
"x86/intel/epb:online", intel_epb_online,
intel_epb_offline);
Expand Down
Loading

0 comments on commit b35b6d4

Please sign in to comment.