Skip to content

Commit

Permalink
Merge branch 'pm-cpufreq'
Browse files Browse the repository at this point in the history
* pm-cpufreq:
  cpufreq: OMAP: remove loops_per_jiffy recalculate for smp
  sections: fix section conflicts in drivers/cpufreq
  cpufreq: conservative: update frequency when limits are relaxed
  cpufreq / ondemand: update frequency when limits are relaxed
  cpufreq: Add a generic cpufreq-cpu0 driver
  PM / OPP: Initialize OPP table from device tree
  ARM: add cpufreq transiton notifier to adjust loops_per_jiffy for smp
  cpufreq: Remove support for hardware P-state chips from powernow-k8
  acpi-cpufreq: Add compatibility for legacy AMD cpb sysfs knob
  acpi-cpufreq: Add support for disabling dynamic overclocking
  ACPI: Add fixups for AMD P-state figures
  powernow-k8: delay info messages until initialization has succeeded
  cpufreq: Add warning message to powernow-k8
  acpi-cpufreq: Add quirk to disable _PSD usage on all AMD CPUs
  acpi-cpufreq: Add support for modern AMD CPUs
  cpufreq / powernow-k8: Fixup missing _PSS objects message
  PM / cpufreq: Initialise the cpu field during conservative governor start
  • Loading branch information
Rafael J. Wysocki committed Sep 17, 2012
2 parents 87a2337 + cd664cc commit fa373ab
Show file tree
Hide file tree
Showing 20 changed files with 947 additions and 455 deletions.
11 changes: 11 additions & 0 deletions Documentation/ABI/testing/sysfs-devices-system-cpu
Original file line number Diff line number Diff line change
Expand Up @@ -176,3 +176,14 @@ Description: Disable L3 cache indices
All AMD processors with L3 caches provide this functionality.
For details, see BKDGs at
http://developer.amd.com/documentation/guides/Pages/default.aspx


What: /sys/devices/system/cpu/cpufreq/boost
Date: August 2012
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Processor frequency boosting control

This switch controls the boost setting for the whole system.
Boosting allows the CPU and the firmware to run at a frequency
beyound it's nominal limit.
More details can be found in Documentation/cpu-freq/boost.txt
93 changes: 93 additions & 0 deletions Documentation/cpu-freq/boost.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
Processor boosting control

- information for users -

Quick guide for the impatient:
--------------------
/sys/devices/system/cpu/cpufreq/boost
controls the boost setting for the whole system. You can read and write
that file with either "0" (boosting disabled) or "1" (boosting allowed).
Reading or writing 1 does not mean that the system is boosting at this
very moment, but only that the CPU _may_ raise the frequency at it's
discretion.
--------------------

Introduction
-------------
Some CPUs support a functionality to raise the operating frequency of
some cores in a multi-core package if certain conditions apply, mostly
if the whole chip is not fully utilized and below it's intended thermal
budget. This is done without operating system control by a combination
of hardware and firmware.
On Intel CPUs this is called "Turbo Boost", AMD calls it "Turbo-Core",
in technical documentation "Core performance boost". In Linux we use
the term "boost" for convenience.

Rationale for disable switch
----------------------------

Though the idea is to just give better performance without any user
intervention, sometimes the need arises to disable this functionality.
Most systems offer a switch in the (BIOS) firmware to disable the
functionality at all, but a more fine-grained and dynamic control would
be desirable:
1. While running benchmarks, reproducible results are important. Since
the boosting functionality depends on the load of the whole package,
single thread performance can vary. By explicitly disabling the boost
functionality at least for the benchmark's run-time the system will run
at a fixed frequency and results are reproducible again.
2. To examine the impact of the boosting functionality it is helpful
to do tests with and without boosting.
3. Boosting means overclocking the processor, though under controlled
conditions. By raising the frequency and the voltage the processor
will consume more power than without the boosting, which may be
undesirable for instance for mobile users. Disabling boosting may
save power here, though this depends on the workload.


User controlled switch
----------------------

To allow the user to toggle the boosting functionality, the acpi-cpufreq
driver exports a sysfs knob to disable it. There is a file:
/sys/devices/system/cpu/cpufreq/boost
which can either read "0" (boosting disabled) or "1" (boosting enabled).
Reading the file is always supported, even if the processor does not
support boosting. In this case the file will be read-only and always
reads as "0". Explicitly changing the permissions and writing to that
file anyway will return EINVAL.

On supported CPUs one can write either a "0" or a "1" into this file.
This will either disable the boost functionality on all cores in the
whole system (0) or will allow the hardware to boost at will (1).

Writing a "1" does not explicitly boost the system, but just allows the
CPU (and the firmware) to boost at their discretion. Some implementations
take external factors like the chip's temperature into account, so
boosting once does not necessarily mean that it will occur every time
even using the exact same software setup.


AMD legacy cpb switch
---------------------
The AMD powernow-k8 driver used to support a very similar switch to
disable or enable the "Core Performance Boost" feature of some AMD CPUs.
This switch was instantiated in each CPU's cpufreq directory
(/sys/devices/system/cpu[0-9]*/cpufreq) and was called "cpb".
Though the per CPU existence hints at a more fine grained control, the
actual implementation only supported a system-global switch semantics,
which was simply reflected into each CPU's file. Writing a 0 or 1 into it
would pull the other CPUs to the same state.
For compatibility reasons this file and its behavior is still supported
on AMD CPUs, though it is now protected by a config switch
(X86_ACPI_CPUFREQ_CPB). On Intel CPUs this file will never be created,
even with the config option set.
This functionality is considered legacy and will be removed in some future
kernel version.

More fine grained boosting control
----------------------------------

Technically it is possible to switch the boosting functionality at least
on a per package basis, for some CPUs even per core. Currently the driver
does not support it, but this may be implemented in the future.
55 changes: 55 additions & 0 deletions Documentation/devicetree/bindings/cpufreq/cpufreq-cpu0.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,55 @@
Generic CPU0 cpufreq driver

It is a generic cpufreq driver for CPU0 frequency management. It
supports both uniprocessor (UP) and symmetric multiprocessor (SMP)
systems which share clock and voltage across all CPUs.

Both required and optional properties listed below must be defined
under node /cpus/cpu@0.

Required properties:
- operating-points: Refer to Documentation/devicetree/bindings/power/opp.txt
for details

Optional properties:
- clock-latency: Specify the possible maximum transition latency for clock,
in unit of nanoseconds.
- voltage-tolerance: Specify the CPU voltage tolerance in percentage.

Examples:

cpus {
#address-cells = <1>;
#size-cells = <0>;

cpu@0 {
compatible = "arm,cortex-a9";
reg = <0>;
next-level-cache = <&L2>;
operating-points = <
/* kHz uV */
792000 1100000
396000 950000
198000 850000
>;
transition-latency = <61036>; /* two CLK32 periods */
};

cpu@1 {
compatible = "arm,cortex-a9";
reg = <1>;
next-level-cache = <&L2>;
};

cpu@2 {
compatible = "arm,cortex-a9";
reg = <2>;
next-level-cache = <&L2>;
};

cpu@3 {
compatible = "arm,cortex-a9";
reg = <3>;
next-level-cache = <&L2>;
};
};
25 changes: 25 additions & 0 deletions Documentation/devicetree/bindings/power/opp.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,25 @@
* Generic OPP Interface

SoCs have a standard set of tuples consisting of frequency and
voltage pairs that the device will support per voltage domain. These
are called Operating Performance Points or OPPs.

Properties:
- operating-points: An array of 2-tuples items, and each item consists
of frequency and voltage like <freq-kHz vol-uV>.
freq: clock frequency in kHz
vol: voltage in microvolt

Examples:

cpu@0 {
compatible = "arm,cortex-a9";
reg = <0>;
next-level-cache = <&L2>;
operating-points = <
/* kHz uV */
792000 1100000
396000 950000
198000 850000
>;
};
54 changes: 54 additions & 0 deletions arch/arm/kernel/smp.c
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,7 @@
#include <linux/percpu.h>
#include <linux/clockchips.h>
#include <linux/completion.h>
#include <linux/cpufreq.h>

#include <linux/atomic.h>
#include <asm/cacheflush.h>
Expand Down Expand Up @@ -584,3 +585,56 @@ int setup_profiling_timer(unsigned int multiplier)
{
return -EINVAL;
}

#ifdef CONFIG_CPU_FREQ

static DEFINE_PER_CPU(unsigned long, l_p_j_ref);
static DEFINE_PER_CPU(unsigned long, l_p_j_ref_freq);
static unsigned long global_l_p_j_ref;
static unsigned long global_l_p_j_ref_freq;

static int cpufreq_callback(struct notifier_block *nb,
unsigned long val, void *data)
{
struct cpufreq_freqs *freq = data;
int cpu = freq->cpu;

if (freq->flags & CPUFREQ_CONST_LOOPS)
return NOTIFY_OK;

if (!per_cpu(l_p_j_ref, cpu)) {
per_cpu(l_p_j_ref, cpu) =
per_cpu(cpu_data, cpu).loops_per_jiffy;
per_cpu(l_p_j_ref_freq, cpu) = freq->old;
if (!global_l_p_j_ref) {
global_l_p_j_ref = loops_per_jiffy;
global_l_p_j_ref_freq = freq->old;
}
}

if ((val == CPUFREQ_PRECHANGE && freq->old < freq->new) ||
(val == CPUFREQ_POSTCHANGE && freq->old > freq->new) ||
(val == CPUFREQ_RESUMECHANGE || val == CPUFREQ_SUSPENDCHANGE)) {
loops_per_jiffy = cpufreq_scale(global_l_p_j_ref,
global_l_p_j_ref_freq,
freq->new);
per_cpu(cpu_data, cpu).loops_per_jiffy =
cpufreq_scale(per_cpu(l_p_j_ref, cpu),
per_cpu(l_p_j_ref_freq, cpu),
freq->new);
}
return NOTIFY_OK;
}

static struct notifier_block cpufreq_notifier = {
.notifier_call = cpufreq_callback,
};

static int __init register_cpufreq_notifier(void)
{
return cpufreq_register_notifier(&cpufreq_notifier,
CPUFREQ_TRANSITION_NOTIFIER);
}
core_initcall(register_cpufreq_notifier);

#endif
3 changes: 3 additions & 0 deletions arch/x86/include/asm/msr-index.h
Original file line number Diff line number Diff line change
Expand Up @@ -248,6 +248,9 @@

#define MSR_IA32_PERF_STATUS 0x00000198
#define MSR_IA32_PERF_CTL 0x00000199
#define MSR_AMD_PSTATE_DEF_BASE 0xc0010064
#define MSR_AMD_PERF_STATUS 0xc0010063
#define MSR_AMD_PERF_CTL 0xc0010062

#define MSR_IA32_MPERF 0x000000e7
#define MSR_IA32_APERF 0x000000e8
Expand Down
30 changes: 30 additions & 0 deletions drivers/acpi/processor_perflib.c
Original file line number Diff line number Diff line change
Expand Up @@ -324,6 +324,34 @@ static int acpi_processor_get_performance_control(struct acpi_processor *pr)
return result;
}

#ifdef CONFIG_X86
/*
* Some AMDs have 50MHz frequency multiples, but only provide 100MHz rounding
* in their ACPI data. Calculate the real values and fix up the _PSS data.
*/
static void amd_fixup_frequency(struct acpi_processor_px *px, int i)
{
u32 hi, lo, fid, did;
int index = px->control & 0x00000007;

if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
return;

if ((boot_cpu_data.x86 == 0x10 && boot_cpu_data.x86_model < 10)
|| boot_cpu_data.x86 == 0x11) {
rdmsr(MSR_AMD_PSTATE_DEF_BASE + index, lo, hi);
fid = lo & 0x3f;
did = (lo >> 6) & 7;
if (boot_cpu_data.x86 == 0x10)
px->core_frequency = (100 * (fid + 0x10)) >> did;
else
px->core_frequency = (100 * (fid + 8)) >> did;
}
}
#else
static void amd_fixup_frequency(struct acpi_processor_px *px, int i) {};
#endif

static int acpi_processor_get_performance_states(struct acpi_processor *pr)
{
int result = 0;
Expand Down Expand Up @@ -379,6 +407,8 @@ static int acpi_processor_get_performance_states(struct acpi_processor *pr)
goto end;
}

amd_fixup_frequency(px, i);

ACPI_DEBUG_PRINT((ACPI_DB_INFO,
"State [%d]: core_frequency[%d] power[%d] transition_latency[%d] bus_master_latency[%d] control[0x%x] status[0x%x]\n",
i,
Expand Down
47 changes: 47 additions & 0 deletions drivers/base/power/opp.c
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,7 @@
#include <linux/rculist.h>
#include <linux/rcupdate.h>
#include <linux/opp.h>
#include <linux/of.h>

/*
* Internal data structure organization with the OPP layer library is as
Expand Down Expand Up @@ -674,3 +675,49 @@ struct srcu_notifier_head *opp_get_notifier(struct device *dev)

return &dev_opp->head;
}

#ifdef CONFIG_OF
/**
* of_init_opp_table() - Initialize opp table from device tree
* @dev: device pointer used to lookup device OPPs.
*
* Register the initial OPP table with the OPP library for given device.
*/
int of_init_opp_table(struct device *dev)
{
const struct property *prop;
const __be32 *val;
int nr;

prop = of_find_property(dev->of_node, "operating-points", NULL);
if (!prop)
return -ENODEV;
if (!prop->value)
return -ENODATA;

/*
* Each OPP is a set of tuples consisting of frequency and
* voltage like <freq-kHz vol-uV>.
*/
nr = prop->length / sizeof(u32);
if (nr % 2) {
dev_err(dev, "%s: Invalid OPP list\n", __func__);
return -EINVAL;
}

val = prop->value;
while (nr) {
unsigned long freq = be32_to_cpup(val++) * 1000;
unsigned long volt = be32_to_cpup(val++);

if (opp_add(dev, freq, volt)) {
dev_warn(dev, "%s: Failed to add OPP %ld\n",
__func__, freq);
continue;
}
nr -= 2;
}

return 0;
}
#endif
11 changes: 11 additions & 0 deletions drivers/cpufreq/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -179,6 +179,17 @@ config CPU_FREQ_GOV_CONSERVATIVE

If in doubt, say N.

config GENERIC_CPUFREQ_CPU0
bool "Generic CPU0 cpufreq driver"
depends on HAVE_CLK && REGULATOR && PM_OPP && OF
select CPU_FREQ_TABLE
help
This adds a generic cpufreq driver for CPU0 frequency management.
It supports both uniprocessor (UP) and symmetric multiprocessor (SMP)
systems which share clock and voltage across all CPUs.

If in doubt, say N.

menu "x86 CPU frequency scaling drivers"
depends on X86
source "drivers/cpufreq/Kconfig.x86"
Expand Down
Loading

0 comments on commit fa373ab

Please sign in to comment.