Skip to content

Commit

Permalink
Merge branch 'x86/grand-schemozzle' of git://git.kernel.org/pub/scm/l…
Browse files Browse the repository at this point in the history
…inux/kernel/git/tip/tip

Pull pti updates from Thomas Gleixner:
 "The performance deterioration departement is not proud at all to
  present yet another set of speculation fences to mitigate the next
  chapter in the 'what could possibly go wrong' story.

  The new vulnerability belongs to the Spectre class and affects GS
  based data accesses and has therefore been dubbed 'Grand Schemozzle'
  for secret communication purposes. It's officially listed as
  CVE-2019-1125.

  Conditional branches in the entry paths which contain a SWAPGS
  instruction (interrupts and exceptions) can be mis-speculated which
  results in speculative accesses with a wrong GS base.

  This can happen on entry from user mode through a mis-speculated
  branch which takes the entry from kernel mode path and therefore does
  not execute the SWAPGS instruction. The following speculative accesses
  are done with user GS base.

  On entry from kernel mode the mis-speculated branch executes the
  SWAPGS instruction in the entry from user mode path which has the same
  effect that the following GS based accesses are done with user GS
  base.

  If there is a disclosure gadget available in these code paths the
  mis-speculated data access can be leaked through the usual side
  channels.

  The entry from user mode issue affects all CPUs which have speculative
  execution. The entry from kernel mode issue affects only Intel CPUs
  which can speculate through SWAPGS. On CPUs from other vendors SWAPGS
  has semantics which prevent that.

  SMAP migitates both problems but only when the CPU is not affected by
  the Meltdown vulnerability.

  The mitigation is to issue LFENCE instructions in the entry from
  kernel mode path for all affected CPUs and on the affected Intel CPUs
  also in the entry from user mode path unless PTI is enabled because
  the CR3 write is serializing.

  The fences are as usual enabled conditionally and can be completely
  disabled on the kernel command line. The Spectre V1 documentation is
  updated accordingly.

  A big "Thank You!" goes to Josh for doing the heavy lifting for this
  round of hardware misfeature 'repair'. Of course also "Thank You!" to
  everybody else who contributed in one way or the other"

* 'x86/grand-schemozzle' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  Documentation: Add swapgs description to the Spectre v1 documentation
  x86/speculation/swapgs: Exclude ATOMs from speculation through SWAPGS
  x86/entry/64: Use JMP instead of JMPQ
  x86/speculation: Enable Spectre v1 swapgs mitigations
  x86/speculation: Prepare entry code for Spectre v1 swapgs mitigations
  • Loading branch information
Linus Torvalds committed Aug 6, 2019
2 parents 0eb0ce0 + 4c92057 commit 4368c4b
Show file tree
Hide file tree
Showing 7 changed files with 246 additions and 40 deletions.
88 changes: 80 additions & 8 deletions Documentation/admin-guide/hw-vuln/spectre.rst
Original file line number Diff line number Diff line change
Expand Up @@ -41,10 +41,11 @@ Related CVEs

The following CVE entries describe Spectre variants:

============= ======================= =================
============= ======================= ==========================
CVE-2017-5753 Bounds check bypass Spectre variant 1
CVE-2017-5715 Branch target injection Spectre variant 2
============= ======================= =================
CVE-2019-1125 Spectre v1 swapgs Spectre variant 1 (swapgs)
============= ======================= ==========================

Problem
-------
Expand Down Expand Up @@ -78,6 +79,13 @@ There are some extensions of Spectre variant 1 attacks for reading data
over the network, see :ref:`[12] <spec_ref12>`. However such attacks
are difficult, low bandwidth, fragile, and are considered low risk.

Note that, despite "Bounds Check Bypass" name, Spectre variant 1 is not
only about user-controlled array bounds checks. It can affect any
conditional checks. The kernel entry code interrupt, exception, and NMI
handlers all have conditional swapgs checks. Those may be problematic
in the context of Spectre v1, as kernel code can speculatively run with
a user GS.

Spectre variant 2 (Branch Target Injection)
-------------------------------------------

Expand Down Expand Up @@ -132,6 +140,9 @@ not cover all possible attack vectors.
1. A user process attacking the kernel
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Spectre variant 1
~~~~~~~~~~~~~~~~~

The attacker passes a parameter to the kernel via a register or
via a known address in memory during a syscall. Such parameter may
be used later by the kernel as an index to an array or to derive
Expand All @@ -144,7 +155,40 @@ not cover all possible attack vectors.
potentially be influenced for Spectre attacks, new "nospec" accessor
macros are used to prevent speculative loading of data.

Spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
Spectre variant 1 (swapgs)
~~~~~~~~~~~~~~~~~~~~~~~~~~

An attacker can train the branch predictor to speculatively skip the
swapgs path for an interrupt or exception. If they initialize
the GS register to a user-space value, if the swapgs is speculatively
skipped, subsequent GS-related percpu accesses in the speculation
window will be done with the attacker-controlled GS value. This
could cause privileged memory to be accessed and leaked.

For example:

::

if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg
mov (%reg), %reg1

When coming from user space, the CPU can speculatively skip the
swapgs, and then do a speculative percpu load using the user GS
value. So the user can speculatively force a read of any kernel
value. If a gadget exists which uses the percpu value as an address
in another load/store, then the contents of the kernel value may
become visible via an L1 side channel attack.

A similar attack exists when coming from kernel space. The CPU can
speculatively do the swapgs, causing the user GS to get used for the
rest of the speculative window.

Spectre variant 2
~~~~~~~~~~~~~~~~~

A spectre variant 2 attacker can :ref:`poison <poison_btb>` the branch
target buffer (BTB) before issuing syscall to launch an attack.
After entering the kernel, the kernel could use the poisoned branch
target buffer on indirect jump and jump to gadget code in speculative
Expand Down Expand Up @@ -280,11 +324,18 @@ The sysfs file showing Spectre variant 1 mitigation status is:

The possible values in this file are:

======================================= =================================
'Mitigation: __user pointer sanitation' Protection in kernel on a case by
case base with explicit pointer
sanitation.
======================================= =================================
.. list-table::

* - 'Not affected'
- The processor is not vulnerable.
* - 'Vulnerable: __user pointer sanitization and usercopy barriers only; no swapgs barriers'
- The swapgs protections are disabled; otherwise it has
protection in the kernel on a case by case base with explicit
pointer sanitation and usercopy LFENCE barriers.
* - 'Mitigation: usercopy/swapgs barriers and __user pointer sanitization'
- Protection in the kernel on a case by case base with explicit
pointer sanitation, usercopy LFENCE barriers, and swapgs LFENCE
barriers.

However, the protections are put in place on a case by case basis,
and there is no guarantee that all possible attack vectors for Spectre
Expand Down Expand Up @@ -366,12 +417,27 @@ Turning on mitigation for Spectre variant 1 and Spectre variant 2
1. Kernel mitigation
^^^^^^^^^^^^^^^^^^^^

Spectre variant 1
~~~~~~~~~~~~~~~~~

For the Spectre variant 1, vulnerable kernel code (as determined
by code audit or scanning tools) is annotated on a case by case
basis to use nospec accessor macros for bounds clipping :ref:`[2]
<spec_ref2>` to avoid any usable disclosure gadgets. However, it may
not cover all attack vectors for Spectre variant 1.

Copy-from-user code has an LFENCE barrier to prevent the access_ok()
check from being mis-speculated. The barrier is done by the
barrier_nospec() macro.

For the swapgs variant of Spectre variant 1, LFENCE barriers are
added to interrupt, exception and NMI entry where needed. These
barriers are done by the FENCE_SWAPGS_KERNEL_ENTRY and
FENCE_SWAPGS_USER_ENTRY macros.

Spectre variant 2
~~~~~~~~~~~~~~~~~

For Spectre variant 2 mitigation, the compiler turns indirect calls or
jumps in the kernel into equivalent return trampolines (retpolines)
:ref:`[3] <spec_ref3>` :ref:`[9] <spec_ref9>` to go to the target
Expand Down Expand Up @@ -473,6 +539,12 @@ Mitigation control on the kernel command line
Spectre variant 2 mitigation can be disabled or force enabled at the
kernel command line.

nospectre_v1

[X86,PPC] Disable mitigations for Spectre Variant 1
(bounds check bypass). With this option data leaks are
possible in the system.

nospectre_v2

[X86] Disable all mitigations for the Spectre variant 2
Expand Down
8 changes: 4 additions & 4 deletions Documentation/admin-guide/kernel-parameters.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2604,7 +2604,7 @@
expose users to several CPU vulnerabilities.
Equivalent to: nopti [X86,PPC]
kpti=0 [ARM64]
nospectre_v1 [PPC]
nospectre_v1 [X86,PPC]
nobp=0 [S390]
nospectre_v2 [X86,PPC,S390,ARM64]
spectre_v2_user=off [X86]
Expand Down Expand Up @@ -2965,9 +2965,9 @@
nosmt=force: Force disable SMT, cannot be undone
via the sysfs control file.

nospectre_v1 [PPC] Disable mitigations for Spectre Variant 1 (bounds
check bypass). With this option data leaks are possible
in the system.
nospectre_v1 [X86,PPC] Disable mitigations for Spectre Variant 1
(bounds check bypass). With this option data leaks are
possible in the system.

nospectre_v2 [X86,PPC_FSL_BOOK3E,ARM64] Disable all mitigations for
the Spectre variant 2 (indirect branch prediction)
Expand Down
17 changes: 17 additions & 0 deletions arch/x86/entry/calling.h
Original file line number Diff line number Diff line change
Expand Up @@ -314,6 +314,23 @@ For 32-bit we have the following conventions - kernel is built with

#endif

/*
* Mitigate Spectre v1 for conditional swapgs code paths.
*
* FENCE_SWAPGS_USER_ENTRY is used in the user entry swapgs code path, to
* prevent a speculative swapgs when coming from kernel space.
*
* FENCE_SWAPGS_KERNEL_ENTRY is used in the kernel entry non-swapgs code path,
* to prevent the swapgs from getting speculatively skipped when coming from
* user space.
*/
.macro FENCE_SWAPGS_USER_ENTRY
ALTERNATIVE "", "lfence", X86_FEATURE_FENCE_SWAPGS_USER
.endm
.macro FENCE_SWAPGS_KERNEL_ENTRY
ALTERNATIVE "", "lfence", X86_FEATURE_FENCE_SWAPGS_KERNEL
.endm

.macro STACKLEAK_ERASE_NOCLOBBER
#ifdef CONFIG_GCC_PLUGIN_STACKLEAK
PUSH_AND_CLEAR_REGS
Expand Down
21 changes: 18 additions & 3 deletions arch/x86/entry/entry_64.S
Original file line number Diff line number Diff line change
Expand Up @@ -519,7 +519,7 @@ ENTRY(interrupt_entry)
testb $3, CS-ORIG_RAX+8(%rsp)
jz 1f
SWAPGS

FENCE_SWAPGS_USER_ENTRY
/*
* Switch to the thread stack. The IRET frame and orig_ax are
* on the stack, as well as the return address. RDI..R12 are
Expand Down Expand Up @@ -549,8 +549,10 @@ ENTRY(interrupt_entry)
UNWIND_HINT_FUNC

movq (%rdi), %rdi
jmp 2f
1:

FENCE_SWAPGS_KERNEL_ENTRY
2:
PUSH_AND_CLEAR_REGS save_ret=1
ENCODE_FRAME_POINTER 8

Expand Down Expand Up @@ -1238,6 +1240,13 @@ ENTRY(paranoid_entry)
*/
SAVE_AND_SWITCH_TO_KERNEL_CR3 scratch_reg=%rax save_reg=%r14

/*
* The above SAVE_AND_SWITCH_TO_KERNEL_CR3 macro doesn't do an
* unconditional CR3 write, even in the PTI case. So do an lfence
* to prevent GS speculation, regardless of whether PTI is enabled.
*/
FENCE_SWAPGS_KERNEL_ENTRY

ret
END(paranoid_entry)

Expand Down Expand Up @@ -1288,6 +1297,7 @@ ENTRY(error_entry)
* from user mode due to an IRET fault.
*/
SWAPGS
FENCE_SWAPGS_USER_ENTRY
/* We have user CR3. Change to kernel CR3. */
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax

Expand All @@ -1301,6 +1311,8 @@ ENTRY(error_entry)
pushq %r12
ret

.Lerror_entry_done_lfence:
FENCE_SWAPGS_KERNEL_ENTRY
.Lerror_entry_done:
ret

Expand All @@ -1318,14 +1330,15 @@ ENTRY(error_entry)
cmpq %rax, RIP+8(%rsp)
je .Lbstep_iret
cmpq $.Lgs_change, RIP+8(%rsp)
jne .Lerror_entry_done
jne .Lerror_entry_done_lfence

/*
* hack: .Lgs_change can fail with user gsbase. If this happens, fix up
* gsbase and proceed. We'll fix up the exception and land in
* .Lgs_change's error handler with kernel gsbase.
*/
SWAPGS
FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax
jmp .Lerror_entry_done

Expand All @@ -1340,6 +1353,7 @@ ENTRY(error_entry)
* gsbase and CR3. Switch to kernel gsbase and CR3:
*/
SWAPGS
FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rax

/*
Expand Down Expand Up @@ -1431,6 +1445,7 @@ ENTRY(nmi)

swapgs
cld
FENCE_SWAPGS_USER_ENTRY
SWITCH_TO_KERNEL_CR3 scratch_reg=%rdx
movq %rsp, %rdx
movq PER_CPU_VAR(cpu_current_top_of_stack), %rsp
Expand Down
3 changes: 3 additions & 0 deletions arch/x86/include/asm/cpufeatures.h
Original file line number Diff line number Diff line change
Expand Up @@ -281,6 +281,8 @@
#define X86_FEATURE_CQM_OCCUP_LLC (11*32+ 1) /* LLC occupancy monitoring */
#define X86_FEATURE_CQM_MBM_TOTAL (11*32+ 2) /* LLC Total MBM monitoring */
#define X86_FEATURE_CQM_MBM_LOCAL (11*32+ 3) /* LLC Local MBM monitoring */
#define X86_FEATURE_FENCE_SWAPGS_USER (11*32+ 4) /* "" LFENCE in user entry SWAPGS path */
#define X86_FEATURE_FENCE_SWAPGS_KERNEL (11*32+ 5) /* "" LFENCE in kernel entry SWAPGS path */

/* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
#define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */
Expand Down Expand Up @@ -394,5 +396,6 @@
#define X86_BUG_L1TF X86_BUG(18) /* CPU is affected by L1 Terminal Fault */
#define X86_BUG_MDS X86_BUG(19) /* CPU is affected by Microarchitectural data sampling */
#define X86_BUG_MSBDS_ONLY X86_BUG(20) /* CPU is only affected by the MSDBS variant of BUG_MDS */
#define X86_BUG_SWAPGS X86_BUG(21) /* CPU is affected by speculation through SWAPGS */

#endif /* _ASM_X86_CPUFEATURES_H */
Loading

0 comments on commit 4368c4b

Please sign in to comment.