Skip to content

Commit

Permalink
Merge branch 'x86/mm' into x86/platform, to pick up TLB flush dependency
Browse files Browse the repository at this point in the history
Signed-off-by: Ingo Molnar <mingo@kernel.org>
  • Loading branch information
Ingo Molnar committed Aug 31, 2017
2 parents 3308376 + 9e52fc2 commit 3e83dfd
Show file tree
Hide file tree
Showing 596 changed files with 7,066 additions and 2,300 deletions.
13 changes: 13 additions & 0 deletions Documentation/admin-guide/kernel-parameters.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2233,6 +2233,17 @@
memory contents and reserves bad memory
regions that are detected.

mem_encrypt= [X86-64] AMD Secure Memory Encryption (SME) control
Valid arguments: on, off
Default (depends on kernel configuration option):
on (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y)
off (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n)
mem_encrypt=on: Activate SME
mem_encrypt=off: Do not activate SME

Refer to Documentation/x86/amd-memory-encryption.txt
for details on when memory encryption can be activated.

mem_sleep_default= [SUSPEND] Default system suspend mode:
s2idle - Suspend-To-Idle
shallow - Power-On Suspend or equivalent (if supported)
Expand Down Expand Up @@ -2696,6 +2707,8 @@
nopat [X86] Disable PAT (page attribute table extension of
pagetables) support.

nopcid [X86-64] Disable the PCID cpu feature.

norandmaps Don't use address space randomization. Equivalent to
echo 0 > /proc/sys/kernel/randomize_va_space

Expand Down
6 changes: 6 additions & 0 deletions Documentation/fb/efifb.txt
Original file line number Diff line number Diff line change
Expand Up @@ -27,5 +27,11 @@ You have to add the following kernel parameters in your elilo.conf:
Macbook Pro 17", iMac 20" :
video=efifb:i20

Accepted options:

nowc Don't map the framebuffer write combined. This can be used
to workaround side-effects and slowdowns on other CPU cores
when large amounts of console data are written.

--
Edgar Hucek <gimli@dark-green.com>
4 changes: 2 additions & 2 deletions Documentation/networking/switchdev.txt
Original file line number Diff line number Diff line change
Expand Up @@ -228,7 +228,7 @@ Learning on the device port should be enabled, as well as learning_sync:
bridge link set dev DEV learning on self
bridge link set dev DEV learning_sync on self

Learning_sync attribute enables syncing of the learned/forgotton FDB entry to
Learning_sync attribute enables syncing of the learned/forgotten FDB entry to
the bridge's FDB. It's possible, but not optimal, to enable learning on the
device port and on the bridge port, and disable learning_sync.

Expand All @@ -245,7 +245,7 @@ the responsibility of the port driver/device to age out these entries. If the
port device supports ageing, when the FDB entry expires, it will notify the
driver which in turn will notify the bridge with SWITCHDEV_FDB_DEL. If the
device does not support ageing, the driver can simulate ageing using a
garbage collection timer to monitor FBD entries. Expired entries will be
garbage collection timer to monitor FDB entries. Expired entries will be
notified to the bridge using SWITCHDEV_FDB_DEL. See rocker driver for
example of driver running ageing timer.

Expand Down
19 changes: 11 additions & 8 deletions Documentation/printk-formats.txt
Original file line number Diff line number Diff line change
Expand Up @@ -58,20 +58,23 @@ Symbols/Function Pointers
%ps versatile_init
%pB prev_fn_of_versatile_init+0x88/0x88

For printing symbols and function pointers. The ``S`` and ``s`` specifiers
result in the symbol name with (``S``) or without (``s``) offsets. Where
this is used on a kernel without KALLSYMS - the symbol address is
printed instead.
The ``F`` and ``f`` specifiers are for printing function pointers,
for example, f->func, &gettimeofday. They have the same result as
``S`` and ``s`` specifiers. But they do an extra conversion on
ia64, ppc64 and parisc64 architectures where the function pointers
are actually function descriptors.

The ``S`` and ``s`` specifiers can be used for printing symbols
from direct addresses, for example, __builtin_return_address(0),
(void *)regs->ip. They result in the symbol name with (``S``) or
without (``s``) offsets. If KALLSYMS are disabled then the symbol
address is printed instead.

The ``B`` specifier results in the symbol name with offsets and should be
used when printing stack backtraces. The specifier takes into
consideration the effect of compiler optimisations which may occur
when tail-call``s are used and marked with the noreturn GCC attribute.

On ia64, ppc64 and parisc64 architectures function pointers are
actually function descriptors which must first be resolved. The ``F`` and
``f`` specifiers perform this resolution and then provide the same
functionality as the ``S`` and ``s`` specifiers.

Kernel Pointers
===============
Expand Down
47 changes: 36 additions & 11 deletions Documentation/sysctl/net.txt
Original file line number Diff line number Diff line change
Expand Up @@ -35,9 +35,34 @@ Table : Subdirectories in /proc/sys/net
bpf_jit_enable
--------------

This enables Berkeley Packet Filter Just in Time compiler.
Currently supported on x86_64 architecture, bpf_jit provides a framework
to speed packet filtering, the one used by tcpdump/libpcap for example.
This enables the BPF Just in Time (JIT) compiler. BPF is a flexible
and efficient infrastructure allowing to execute bytecode at various
hook points. It is used in a number of Linux kernel subsystems such
as networking (e.g. XDP, tc), tracing (e.g. kprobes, uprobes, tracepoints)
and security (e.g. seccomp). LLVM has a BPF back end that can compile
restricted C into a sequence of BPF instructions. After program load
through bpf(2) and passing a verifier in the kernel, a JIT will then
translate these BPF proglets into native CPU instructions. There are
two flavors of JITs, the newer eBPF JIT currently supported on:
- x86_64
- arm64
- ppc64
- sparc64
- mips64
- s390x

And the older cBPF JIT supported on the following archs:
- arm
- mips
- ppc
- sparc

eBPF JITs are a superset of cBPF JITs, meaning the kernel will
migrate cBPF instructions into eBPF instructions and then JIT
compile them transparently. Older cBPF JITs can only translate
tcpdump filters, seccomp rules, etc, but not mentioned eBPF
programs loaded through bpf(2).

Values :
0 - disable the JIT (default value)
1 - enable the JIT
Expand All @@ -46,9 +71,9 @@ Values :
bpf_jit_harden
--------------

This enables hardening for the Berkeley Packet Filter Just in Time compiler.
Supported are eBPF JIT backends. Enabling hardening trades off performance,
but can mitigate JIT spraying.
This enables hardening for the BPF JIT compiler. Supported are eBPF
JIT backends. Enabling hardening trades off performance, but can
mitigate JIT spraying.
Values :
0 - disable JIT hardening (default value)
1 - enable JIT hardening for unprivileged users only
Expand All @@ -57,11 +82,11 @@ Values :
bpf_jit_kallsyms
----------------

When Berkeley Packet Filter Just in Time compiler is enabled, then compiled
images are unknown addresses to the kernel, meaning they neither show up in
traces nor in /proc/kallsyms. This enables export of these addresses, which
can be used for debugging/tracing. If bpf_jit_harden is enabled, this feature
is disabled.
When BPF JIT compiler is enabled, then compiled images are unknown
addresses to the kernel, meaning they neither show up in traces nor
in /proc/kallsyms. This enables export of these addresses, which can
be used for debugging/tracing. If bpf_jit_harden is enabled, this
feature is disabled.
Values :
0 - disable JIT kallsyms export (default value)
1 - enable JIT kallsyms export for privileged users only
Expand Down
68 changes: 68 additions & 0 deletions Documentation/x86/amd-memory-encryption.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,68 @@
Secure Memory Encryption (SME) is a feature found on AMD processors.

SME provides the ability to mark individual pages of memory as encrypted using
the standard x86 page tables. A page that is marked encrypted will be
automatically decrypted when read from DRAM and encrypted when written to
DRAM. SME can therefore be used to protect the contents of DRAM from physical
attacks on the system.

A page is encrypted when a page table entry has the encryption bit set (see
below on how to determine its position). The encryption bit can also be
specified in the cr3 register, allowing the PGD table to be encrypted. Each
successive level of page tables can also be encrypted by setting the encryption
bit in the page table entry that points to the next table. This allows the full
page table hierarchy to be encrypted. Note, this means that just because the
encryption bit is set in cr3, doesn't imply the full hierarchy is encyrpted.
Each page table entry in the hierarchy needs to have the encryption bit set to
achieve that. So, theoretically, you could have the encryption bit set in cr3
so that the PGD is encrypted, but not set the encryption bit in the PGD entry
for a PUD which results in the PUD pointed to by that entry to not be
encrypted.

Support for SME can be determined through the CPUID instruction. The CPUID
function 0x8000001f reports information related to SME:

0x8000001f[eax]:
Bit[0] indicates support for SME
0x8000001f[ebx]:
Bits[5:0] pagetable bit number used to activate memory
encryption
Bits[11:6] reduction in physical address space, in bits, when
memory encryption is enabled (this only affects
system physical addresses, not guest physical
addresses)

If support for SME is present, MSR 0xc00100010 (MSR_K8_SYSCFG) can be used to
determine if SME is enabled and/or to enable memory encryption:

0xc0010010:
Bit[23] 0 = memory encryption features are disabled
1 = memory encryption features are enabled

Linux relies on BIOS to set this bit if BIOS has determined that the reduction
in the physical address space as a result of enabling memory encryption (see
CPUID information above) will not conflict with the address space resource
requirements for the system. If this bit is not set upon Linux startup then
Linux itself will not set it and memory encryption will not be possible.

The state of SME in the Linux kernel can be documented as follows:
- Supported:
The CPU supports SME (determined through CPUID instruction).

- Enabled:
Supported and bit 23 of MSR_K8_SYSCFG is set.

- Active:
Supported, Enabled and the Linux kernel is actively applying
the encryption bit to page table entries (the SME mask in the
kernel is non-zero).

SME can also be enabled and activated in the BIOS. If SME is enabled and
activated in the BIOS, then all memory accesses will be encrypted and it will
not be necessary to activate the Linux memory encryption support. If the BIOS
merely enables SME (sets bit 23 of the MSR_K8_SYSCFG), then Linux can activate
memory encryption by default (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y) or
by supplying mem_encrypt=on on the kernel command line. However, if BIOS does
not enable SME, then Linux will not be able to activate memory encryption, even
if configured to do so by default or the mem_encrypt=on command line parameter
is specified.
6 changes: 3 additions & 3 deletions Documentation/x86/protection-keys.txt
Original file line number Diff line number Diff line change
Expand Up @@ -34,17 +34,17 @@ with a key. In this example WRPKRU is wrapped by a C function
called pkey_set().

int real_prot = PROT_READ|PROT_WRITE;
pkey = pkey_alloc(0, PKEY_DENY_WRITE);
pkey = pkey_alloc(0, PKEY_DISABLE_WRITE);
ptr = mmap(NULL, PAGE_SIZE, PROT_NONE, MAP_ANONYMOUS|MAP_PRIVATE, -1, 0);
ret = pkey_mprotect(ptr, PAGE_SIZE, real_prot, pkey);
... application runs here

Now, if the application needs to update the data at 'ptr', it can
gain access, do the update, then remove its write access:

pkey_set(pkey, 0); // clear PKEY_DENY_WRITE
pkey_set(pkey, 0); // clear PKEY_DISABLE_WRITE
*ptr = foo; // assign something
pkey_set(pkey, PKEY_DENY_WRITE); // set PKEY_DENY_WRITE again
pkey_set(pkey, PKEY_DISABLE_WRITE); // set PKEY_DISABLE_WRITE again

Now when it frees the memory, it will also free the pkey since it
is no longer in use:
Expand Down
64 changes: 64 additions & 0 deletions Documentation/x86/x86_64/5level-paging.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
== Overview ==

Original x86-64 was limited by 4-level paing to 256 TiB of virtual address
space and 64 TiB of physical address space. We are already bumping into
this limit: some vendors offers servers with 64 TiB of memory today.

To overcome the limitation upcoming hardware will introduce support for
5-level paging. It is a straight-forward extension of the current page
table structure adding one more layer of translation.

It bumps the limits to 128 PiB of virtual address space and 4 PiB of
physical address space. This "ought to be enough for anybody" ©.

QEMU 2.9 and later support 5-level paging.

Virtual memory layout for 5-level paging is described in
Documentation/x86/x86_64/mm.txt

== Enabling 5-level paging ==

CONFIG_X86_5LEVEL=y enables the feature.

So far, a kernel compiled with the option enabled will be able to boot
only on machines that supports the feature -- see for 'la57' flag in
/proc/cpuinfo.

The plan is to implement boot-time switching between 4- and 5-level paging
in the future.

== User-space and large virtual address space ==

On x86, 5-level paging enables 56-bit userspace virtual address space.
Not all user space is ready to handle wide addresses. It's known that
at least some JIT compilers use higher bits in pointers to encode their
information. It collides with valid pointers with 5-level paging and
leads to crashes.

To mitigate this, we are not going to allocate virtual address space
above 47-bit by default.

But userspace can ask for allocation from full address space by
specifying hint address (with or without MAP_FIXED) above 47-bits.

If hint address set above 47-bit, but MAP_FIXED is not specified, we try
to look for unmapped area by specified address. If it's already
occupied, we look for unmapped area in *full* address space, rather than
from 47-bit window.

A high hint address would only affect the allocation in question, but not
any future mmap()s.

Specifying high hint address on older kernel or on machine without 5-level
paging support is safe. The hint will be ignored and kernel will fall back
to allocation from 47-bit address space.

This approach helps to easily make application's memory allocator aware
about large address space without manually tracking allocated virtual
address space.

One important case we need to handle here is interaction with MPX.
MPX (without MAWA extension) cannot handle addresses above 47-bit, so we
need to make sure that MPX cannot be enabled we already have VMA above
the boundary and forbid creating such VMAs once MPX is enabled.

2 changes: 1 addition & 1 deletion MAINTAINERS
Original file line number Diff line number Diff line change
Expand Up @@ -7111,7 +7111,6 @@ M: Marc Zyngier <marc.zyngier@arm.com>
L: linux-kernel@vger.kernel.org
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git irq/core
T: git git://git.infradead.org/users/jcooper/linux.git irqchip/core
F: Documentation/devicetree/bindings/interrupt-controller/
F: drivers/irqchip/

Expand Down Expand Up @@ -14005,6 +14004,7 @@ F: drivers/block/virtio_blk.c
F: include/linux/virtio*.h
F: include/uapi/linux/virtio_*.h
F: drivers/crypto/virtio/
F: mm/balloon_compaction.c

VIRTIO CRYPTO DRIVER
M: Gonglei <arei.gonglei@huawei.com>
Expand Down
15 changes: 8 additions & 7 deletions Makefile
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
VERSION = 4
PATCHLEVEL = 13
SUBLEVEL = 0
EXTRAVERSION = -rc4
EXTRAVERSION = -rc6
NAME = Fearless Coyote

# *DOCUMENTATION*
Expand Down Expand Up @@ -396,7 +396,7 @@ LINUXINCLUDE := \
KBUILD_CPPFLAGS := -D__KERNEL__

KBUILD_CFLAGS := -Wall -Wundef -Wstrict-prototypes -Wno-trigraphs \
-fno-strict-aliasing -fno-common \
-fno-strict-aliasing -fno-common -fshort-wchar \
-Werror-implicit-function-declaration \
-Wno-format-security \
-std=gnu89 $(call cc-option,-fno-PIE)
Expand Down Expand Up @@ -442,7 +442,7 @@ export RCS_TAR_IGNORE := --exclude SCCS --exclude BitKeeper --exclude .svn \
# ===========================================================================
# Rules shared between *config targets and build targets

# Basic helpers built in scripts/
# Basic helpers built in scripts/basic/
PHONY += scripts_basic
scripts_basic:
$(Q)$(MAKE) $(build)=scripts/basic
Expand Down Expand Up @@ -505,7 +505,7 @@ ifeq ($(KBUILD_EXTMOD),)
endif
endif
endif
# install and module_install need also be processed one by one
# install and modules_install need also be processed one by one
ifneq ($(filter install,$(MAKECMDGOALS)),)
ifneq ($(filter modules_install,$(MAKECMDGOALS)),)
mixed-targets := 1
Expand Down Expand Up @@ -964,7 +964,7 @@ export KBUILD_VMLINUX_MAIN := $(core-y) $(libs-y2) $(drivers-y) $(net-y) $(virt-
export KBUILD_VMLINUX_LIBS := $(libs-y1)
export KBUILD_LDS := arch/$(SRCARCH)/kernel/vmlinux.lds
export LDFLAGS_vmlinux
# used by scripts/pacmage/Makefile
# used by scripts/package/Makefile
export KBUILD_ALLDIRS := $(sort $(filter-out arch/%,$(vmlinux-alldirs)) arch Documentation include samples scripts tools)

vmlinux-deps := $(KBUILD_LDS) $(KBUILD_VMLINUX_INIT) $(KBUILD_VMLINUX_MAIN) $(KBUILD_VMLINUX_LIBS)
Expand Down Expand Up @@ -992,8 +992,8 @@ include/generated/autoksyms.h: FORCE
ARCH_POSTLINK := $(wildcard $(srctree)/arch/$(SRCARCH)/Makefile.postlink)

# Final link of vmlinux with optional arch pass after final link
cmd_link-vmlinux = \
$(CONFIG_SHELL) $< $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux) ; \
cmd_link-vmlinux = \
$(CONFIG_SHELL) $< $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux) ; \
$(if $(ARCH_POSTLINK), $(MAKE) -f $(ARCH_POSTLINK) $@, true)

vmlinux: scripts/link-vmlinux.sh vmlinux_prereq $(vmlinux-deps) FORCE
Expand Down Expand Up @@ -1184,6 +1184,7 @@ PHONY += kselftest
kselftest:
$(Q)$(MAKE) -C tools/testing/selftests run_tests

PHONY += kselftest-clean
kselftest-clean:
$(Q)$(MAKE) -C tools/testing/selftests clean

Expand Down
1 change: 0 additions & 1 deletion arch/arc/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,6 @@ menu "ARC Architecture Configuration"

menu "ARC Platform/SoC/Board"

source "arch/arc/plat-sim/Kconfig"
source "arch/arc/plat-tb10x/Kconfig"
source "arch/arc/plat-axs10x/Kconfig"
#New platform adds here
Expand Down
Loading

0 comments on commit 3e83dfd

Please sign in to comment.