Skip to content

Commit

Permalink
Merge tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/g…
Browse files Browse the repository at this point in the history
…it/rafael/linux-pm

Pull power management updates from Rafael Wysocki:
 "These add EPP support to the AMD P-state cpufreq driver, add support
  for new platforms to the Intel RAPL power capping driver, intel_idle
  and the Qualcomm cpufreq driver, enable thermal cooling for Tegra194,
  drop the custom cpufreq driver for loongson1 that is not necessary any
  more (and the corresponding cpufreq platform device), fix assorted
  issues and clean up code.

  Specifics:

   - Add EPP support to the AMD P-state cpufreq driver (Perry Yuan, Wyes
     Karny, Arnd Bergmann, Bagas Sanjaya)

   - Drop the custom cpufreq driver for loongson1 that is not necessary
     any more and the corresponding cpufreq platform device (Keguang
     Zhang)

   - Remove "select SRCU" from system sleep, cpufreq and OPP Kconfig
     entries (Paul E. McKenney)

   - Enable thermal cooling for Tegra194 (Yi-Wei Wang)

   - Register module device table and add missing compatibles for
     cpufreq-qcom-hw (Nícolas F. R. A. Prado, Abel Vesa and Luca Weiss)

   - Various dt binding updates for qcom-cpufreq-nvmem and
     opp-v2-kryo-cpu (Christian Marangi)

   - Make kobj_type structure in the cpufreq core constant (Thomas
     Weißschuh)

   - Make cpufreq_unregister_driver() return void (Uwe Kleine-König)

   - Make the TEO cpuidle governor check CPU utilization in order to
     refine idle state selection (Kajetan Puchalski)

   - Make Kconfig select the haltpoll cpuidle governor when the haltpoll
     cpuidle driver is selected and replace a default_idle() call in
     that driver with arch_cpu_idle() to allow MWAIT to be used (Li
     RongQing)

   - Add Emerald Rapids Xeon support to the intel_idle driver (Artem
     Bityutskiy)

   - Add ARCH_SUSPEND_POSSIBLE dependencies for ARMv4 cpuidle drivers to
     avoid randconfig build failures (Arnd Bergmann)

   - Make kobj_type structures used in the cpuidle sysfs interface
     constant (Thomas Weißschuh)

   - Make the cpuidle driver registration code update microsecond values
     of idle state parameters in accordance with their nanosecond values
     if they are provided (Rafael Wysocki)

   - Make the PSCI cpuidle driver prevent topology CPUs from being
     suspended on PREEMPT_RT (Krzysztof Kozlowski)

   - Document that pm_runtime_force_suspend() cannot be used with
     DPM_FLAG_SMART_SUSPEND (Richard Fitzgerald)

   - Add EXPORT macros for exporting PM functions from drivers (Richard
     Fitzgerald)

   - Remove /** from non-kernel-doc comments in hibernation code (Randy
     Dunlap)

   - Fix possible name leak in powercap_register_zone() (Yang Yingliang)

   - Add Meteor Lake and Emerald Rapids support to the intel_rapl power
     capping driver (Zhang Rui)

   - Modify the idle_inject power capping facility to support 100% idle
     injection (Srinivas Pandruvada)

   - Fix large time windows handling in the intel_rapl power capping
     driver (Zhang Rui)

   - Fix memory leaks with using debugfs_lookup() in the generic PM
     domains and Energy Model code (Greg Kroah-Hartman)

   - Add missing 'cache-unified' property in the example for kryo OPP
     bindings (Rob Herring)

   - Fix error checking in opp_migrate_dentry() (Qi Zheng)

   - Let qcom,opp-fuse-level be a 2-long array for qcom SoCs (Konrad
     Dybcio)

   - Modify some power management utilities to use the canonical ftrace
     path (Ross Zwisler)

   - Correct spelling problems for Documentation/power/ as reported by
     codespell (Randy Dunlap)"

* tag 'pm-6.3-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (53 commits)
  Documentation: amd-pstate: disambiguate user space sections
  cpufreq: amd-pstate: Fix invalid write to MSR_AMD_CPPC_REQ
  dt-bindings: opp: opp-v2-kryo-cpu: enlarge opp-supported-hw maximum
  dt-bindings: cpufreq: qcom-cpufreq-nvmem: make cpr bindings optional
  dt-bindings: cpufreq: qcom-cpufreq-nvmem: specify supported opp tables
  PM: Add EXPORT macros for exporting PM functions
  cpuidle: psci: Do not suspend topology CPUs on PREEMPT_RT
  MIPS: loongson32: Drop obsolete cpufreq platform device
  powercap: intel_rapl: Fix handling for large time window
  cpuidle: driver: Update microsecond values of state parameters as needed
  cpuidle: sysfs: make kobj_type structures constant
  cpuidle: add ARCH_SUSPEND_POSSIBLE dependencies
  PM: EM: fix memory leak with using debugfs_lookup()
  PM: domains: fix memory leak with using debugfs_lookup()
  cpufreq: Make kobj_type structure constant
  cpufreq: davinci: Fix clk use after free
  cpufreq: amd-pstate: avoid uninitialized variable use
  cpufreq: Make cpufreq_unregister_driver() return void
  OPP: fix error checking in opp_migrate_dentry()
  dt-bindings: cpufreq: cpufreq-qcom-hw: Add SM8550 compatible
  ...
  • Loading branch information
Linus Torvalds committed Feb 21, 2023
2 parents 4a7d37e + dd855f0 commit 2504ba8
Show file tree
Hide file tree
Showing 50 changed files with 1,160 additions and 384 deletions.
7 changes: 7 additions & 0 deletions Documentation/admin-guide/kernel-parameters.txt
Original file line number Diff line number Diff line change
Expand Up @@ -7041,3 +7041,10 @@
management firmware translates the requests into actual
hardware states (core frequency, data fabric and memory
clocks etc.)
active
Use amd_pstate_epp driver instance as the scaling driver,
driver provides a hint to the hardware if software wants
to bias toward performance (0x0) or energy efficiency (0xff)
to the CPPC firmware. then CPPC power algorithm will
calculate the runtime workload and adjust the realtime cores
frequency.
78 changes: 74 additions & 4 deletions Documentation/admin-guide/pm/amd-pstate.rst
Original file line number Diff line number Diff line change
Expand Up @@ -230,8 +230,8 @@ with :c:macro:`MSR_AMD_CPPC_ENABLE` or ``cppc_set_enable``, it will respond
to the request from AMD P-States.


User Space Interface in ``sysfs``
==================================
User Space Interface in ``sysfs`` - Per-policy control
======================================================

``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
control its functionality at the system level. They are located in the
Expand Down Expand Up @@ -262,6 +262,25 @@ lowest non-linear performance in `AMD CPPC Performance Capability
<perf_cap_>`_.)
This attribute is read-only.

``energy_performance_available_preferences``

A list of all the supported EPP preferences that could be used for
``energy_performance_preference`` on this system.
These profiles represent different hints that are provided
to the low-level firmware about the user's desired energy vs efficiency
tradeoff. ``default`` represents the epp value is set by platform
firmware. This attribute is read-only.

``energy_performance_preference``

The current energy performance preference can be read from this attribute.
and user can change current preference according to energy or performance needs
Please get all support profiles list from
``energy_performance_available_preferences`` attribute, all the profiles are
integer values defined between 0 to 255 when EPP feature is enabled by platform
firmware, if EPP feature is disabled, driver will ignore the written value
This attribute is read-write.

Other performance and frequency values can be read back from
``/sys/devices/system/cpu/cpuX/acpi_cppc/``, see :ref:`cppc_sysfs`.

Expand All @@ -280,8 +299,30 @@ module which supports the new AMD P-States mechanism on most of the future AMD
platforms. The AMD P-States mechanism is the more performance and energy
efficiency frequency management method on AMD processors.

Kernel Module Options for ``amd-pstate``
=========================================

AMD Pstate Driver Operation Modes
=================================

``amd_pstate`` CPPC has two operation modes: CPPC Autonomous(active) mode and
CPPC non-autonomous(passive) mode.
active mode and passive mode can be chosen by different kernel parameters.
When in Autonomous mode, CPPC ignores requests done in the Desired Performance
Target register and takes into account only the values set to the Minimum requested
performance, Maximum requested performance, and Energy Performance Preference
registers. When Autonomous is disabled, it only considers the Desired Performance Target.

Active Mode
------------

``amd_pstate=active``

This is the low-level firmware control mode which is implemented by ``amd_pstate_epp``
driver with ``amd_pstate=active`` passed to the kernel in the command line.
In this mode, ``amd_pstate_epp`` driver provides a hint to the hardware if software
wants to bias toward performance (0x0) or energy efficiency (0xff) to the CPPC firmware.
then CPPC power algorithm will calculate the runtime workload and adjust the realtime
cores frequency according to the power supply and thermal, core voltage and some other
hardware conditions.

Passive Mode
------------
Expand All @@ -298,6 +339,35 @@ processor must provide at least nominal performance requested and go higher if c
operating conditions allow.


User Space Interface in ``sysfs`` - General
===========================================

Global Attributes
-----------------

``amd-pstate`` exposes several global attributes (files) in ``sysfs`` to
control its functionality at the system level. They are located in the
``/sys/devices/system/cpu/amd-pstate/`` directory and affect all CPUs.

``status``
Operation mode of the driver: "active", "passive" or "disable".

"active"
The driver is functional and in the ``active mode``

"passive"
The driver is functional and in the ``passive mode``

"disable"
The driver is unregistered and not functional now.

This attribute can be written to in order to change the driver's
operation mode or to unregister it. The string written to it must be
one of the possible values of it and, if successful, writing one of
these values to the sysfs file will cause the driver to switch over
to the operation mode represented by that string - or to be
unregistered in the "disable" case.

``cpupower`` tool support for ``amd-pstate``
===============================================

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -26,8 +26,13 @@ properties:
items:
- enum:
- qcom,qdu1000-cpufreq-epss
- qcom,sc7280-cpufreq-epss
- qcom,sc8280xp-cpufreq-epss
- qcom,sm6375-cpufreq-epss
- qcom,sm8250-cpufreq-epss
- qcom,sm8350-cpufreq-epss
- qcom,sm8450-cpufreq-epss
- qcom,sm8550-cpufreq-epss
- const: qcom,cpufreq-epss

reg:
Expand Down
81 changes: 56 additions & 25 deletions Documentation/devicetree/bindings/cpufreq/qcom-cpufreq-nvmem.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,9 @@ description: |
on the CPU OPP in use. The CPUFreq driver sets the CPR power domain level
according to the required OPPs defined in the CPU OPP tables.
For old implementation efuses are parsed to select the correct opp table and
voltage and CPR is not supported/used.
select:
properties:
compatible:
Expand All @@ -33,37 +36,65 @@ select:
required:
- compatible

properties:
cpus:
type: object

patternProperties:
'^cpu@[0-9a-f]+$':
type: object

properties:
power-domains:
maxItems: 1

power-domain-names:
items:
- const: cpr

required:
- power-domains
- power-domain-names

patternProperties:
'^opp-table(-[a-z0-9]+)?$':
if:
allOf:
- if:
properties:
compatible:
const: operating-points-v2-kryo-cpu
then:
$ref: /schemas/opp/opp-v2-kryo-cpu.yaml#

- if:
properties:
compatible:
const: operating-points-v2-qcom-level
then:
$ref: /schemas/opp/opp-v2-qcom-level.yaml#

unevaluatedProperties: false

allOf:
- if:
properties:
compatible:
const: operating-points-v2-kryo-cpu
contains:
enum:
- qcom,qcs404

then:
properties:
cpus:
type: object

patternProperties:
'^cpu@[0-9a-f]+$':
type: object

properties:
power-domains:
maxItems: 1

power-domain-names:
items:
- const: cpr

required:
- power-domains
- power-domain-names

patternProperties:
'^opp-?[0-9]+$':
required:
- required-opps
'^opp-table(-[a-z0-9]+)?$':
if:
properties:
compatible:
const: operating-points-v2-kryo-cpu
then:
patternProperties:
'^opp-?[0-9]+$':
required:
- required-opps

additionalProperties: true

Expand Down
18 changes: 15 additions & 3 deletions Documentation/devicetree/bindings/opp/opp-v2-kryo-cpu.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -50,12 +50,22 @@ patternProperties:
opp-supported-hw:
description: |
A single 32 bit bitmap value, representing compatible HW.
Bitmap:
Bitmap for MSM8996 format:
0: MSM8996, speedbin 0
1: MSM8996, speedbin 1
2: MSM8996, speedbin 2
3-31: unused
maximum: 0x7
3: MSM8996, speedbin 3
4-31: unused
Bitmap for MSM8996SG format (speedbin shifted of 4 left):
0-3: unused
4: MSM8996SG, speedbin 0
5: MSM8996SG, speedbin 1
6: MSM8996SG, speedbin 2
7-31: unused
enum: [0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7,
0x9, 0xd, 0xe, 0xf,
0x10, 0x20, 0x30, 0x70]

clock-latency-ns: true

Expand Down Expand Up @@ -106,6 +116,7 @@ examples:
L2_0: l2-cache {
compatible = "cache";
cache-level = <2>;
cache-unified;
};
};
Expand Down Expand Up @@ -140,6 +151,7 @@ examples:
L2_1: l2-cache {
compatible = "cache";
cache-level = <2>;
cache-unified;
};
};
Expand Down
4 changes: 3 additions & 1 deletion Documentation/devicetree/bindings/opp/opp-v2-qcom-level.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -30,7 +30,9 @@ patternProperties:
this OPP node. Sometimes several corners/levels shares a certain fuse
corner/level. A fuse corner/level contains e.g. ref uV, min uV,
and max uV.
$ref: /schemas/types.yaml#/definitions/uint32
$ref: /schemas/types.yaml#/definitions/uint32-array
minItems: 1
maxItems: 2

required:
- opp-level
Expand Down
2 changes: 1 addition & 1 deletion Documentation/power/suspend-and-interrupts.rst
Original file line number Diff line number Diff line change
Expand Up @@ -67,7 +67,7 @@ That may involve turning on a special signal handling logic within the platform
during system sleep so as to trigger a system wakeup when needed. For example,
the platform may include a dedicated interrupt controller used specifically for
handling system wakeup events. Then, if a given interrupt line is supposed to
wake up the system from sleep sates, the corresponding input of that interrupt
wake up the system from sleep states, the corresponding input of that interrupt
controller needs to be enabled to receive signals from the line in question.
After wakeup, it generally is better to disable that input to prevent the
dedicated controller from triggering interrupts unnecessarily.
Expand Down
18 changes: 0 additions & 18 deletions arch/mips/include/asm/mach-loongson32/cpufreq.h

This file was deleted.

1 change: 0 additions & 1 deletion arch/mips/include/asm/mach-loongson32/platform.h
Original file line number Diff line number Diff line change
Expand Up @@ -12,7 +12,6 @@
#include <nand.h>

extern struct platform_device ls1x_uart_pdev;
extern struct platform_device ls1x_cpufreq_pdev;
extern struct platform_device ls1x_eth0_pdev;
extern struct platform_device ls1x_eth1_pdev;
extern struct platform_device ls1x_ehci_pdev;
Expand Down
16 changes: 0 additions & 16 deletions arch/mips/loongson32/common/platform.c
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,6 @@

#include <platform.h>
#include <loongson1.h>
#include <cpufreq.h>
#include <dma.h>
#include <nand.h>

Expand Down Expand Up @@ -62,21 +61,6 @@ void __init ls1x_serial_set_uartclk(struct platform_device *pdev)
p->uartclk = clk_get_rate(clk);
}

/* CPUFreq */
static struct plat_ls1x_cpufreq ls1x_cpufreq_pdata = {
.clk_name = "cpu_clk",
.osc_clk_name = "osc_clk",
.max_freq = 266 * 1000,
.min_freq = 33 * 1000,
};

struct platform_device ls1x_cpufreq_pdev = {
.name = "ls1x-cpufreq",
.dev = {
.platform_data = &ls1x_cpufreq_pdata,
},
};

/* Synopsys Ethernet GMAC */
static struct stmmac_mdio_bus_data ls1x_mdio_bus_data = {
.phy_mask = 0,
Expand Down
1 change: 0 additions & 1 deletion arch/mips/loongson32/ls1b/board.c
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,6 @@ static const struct gpio_led_platform_data ls1x_led_pdata __initconst = {

static struct platform_device *ls1b_platform_devices[] __initdata = {
&ls1x_uart_pdev,
&ls1x_cpufreq_pdev,
&ls1x_eth0_pdev,
&ls1x_eth1_pdev,
&ls1x_ehci_pdev,
Expand Down
1 change: 1 addition & 0 deletions arch/x86/kernel/process.c
Original file line number Diff line number Diff line change
Expand Up @@ -739,6 +739,7 @@ void __cpuidle arch_cpu_idle(void)
{
static_call(x86_idle)();
}
EXPORT_SYMBOL_GPL(arch_cpu_idle);

#ifdef CONFIG_XEN
bool xen_set_default_idle(void)
Expand Down
Loading

0 comments on commit 2504ba8

Please sign in to comment.