Skip to content

Commit

Permalink
Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/…
Browse files Browse the repository at this point in the history
…pub/scm/linux/kernel/git/akpm/mm

Pull more non-mm updates from Andrew Morton:

 - A series ("kbuild: enable more warnings by default") from Arnd
   Bergmann which enables a number of additional build-time warnings. We
   fixed all the fallout which we could find, there may still be a few
   stragglers.

 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API". This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer
   AMD GPUs on RISC-V.

 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".

 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.

* tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
  nilfs2: make block erasure safe in nilfs_finish_roll_forward()
  selftests/harness: use 1024 in place of LINE_MAX
  Revert "selftests/harness: remove use of LINE_MAX"
  selftests/fpu: allow building on other architectures
  selftests/fpu: move FP code to a separate translation unit
  drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT
  drm/amd/display: only use hard-float, not altivec on powerpc
  riscv: add support for kernel-mode FPU
  x86: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  LoongArch: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  ARM: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  ARM: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
  x86/fpu: fix asm/fpu/types.h include guard
  kbuild: enable -Wcast-function-type-strict unconditionally
  kbuild: enable -Wformat-truncation on clang
  ...
  • Loading branch information
Linus Torvalds committed May 23, 2024
2 parents 5c6f4d6 + db3e24a commit c760b37
Show file tree
Hide file tree
Showing 41 changed files with 365 additions and 220 deletions.
78 changes: 78 additions & 0 deletions Documentation/core-api/floating-point.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,78 @@
.. SPDX-License-Identifier: GPL-2.0+
Floating-point API
==================

Kernel code is normally prohibited from using floating-point (FP) registers or
instructions, including the C float and double data types. This rule reduces
system call overhead, because the kernel does not need to save and restore the
userspace floating-point register state.

However, occasionally drivers or library functions may need to include FP code.
This is supported by isolating the functions containing FP code to a separate
translation unit (a separate source file), and saving/restoring the FP register
state around calls to those functions. This creates "critical sections" of
floating-point usage.

The reason for this isolation is to prevent the compiler from generating code
touching the FP registers outside these critical sections. Compilers sometimes
use FP registers to optimize inlined ``memcpy`` or variable assignment, as
floating-point registers may be wider than general-purpose registers.

Usability of floating-point code within the kernel is architecture-specific.
Additionally, because a single kernel may be configured to support platforms
both with and without a floating-point unit, FPU availability must be checked
both at build time and at run time.

Several architectures implement the generic kernel floating-point API from
``linux/fpu.h``, as described below. Some other architectures implement their
own unique APIs, which are documented separately.

Build-time API
--------------

Floating-point code may be built if the option ``ARCH_HAS_KERNEL_FPU_SUPPORT``
is enabled. For C code, such code must be placed in a separate file, and that
file must have its compilation flags adjusted using the following pattern::

CFLAGS_foo.o += $(CC_FLAGS_FPU)
CFLAGS_REMOVE_foo.o += $(CC_FLAGS_NO_FPU)

Architectures are expected to define one or both of these variables in their
top-level Makefile as needed. For example::

CC_FLAGS_FPU := -mhard-float

or::

CC_FLAGS_NO_FPU := -msoft-float

Normal kernel code is assumed to use the equivalent of ``CC_FLAGS_NO_FPU``.

Runtime API
-----------

The runtime API is provided in ``linux/fpu.h``. This header cannot be included
from files implementing FP code (those with their compilation flags adjusted as
above). Instead, it must be included when defining the FP critical sections.

.. c:function:: bool kernel_fpu_available( void )
This function reports if floating-point code can be used on this CPU or
platform. The value returned by this function is not expected to change
at runtime, so it only needs to be called once, not before every
critical section.
.. c:function:: void kernel_fpu_begin( void )
void kernel_fpu_end( void )
These functions create a floating-point critical section. It is only
valid to call ``kernel_fpu_begin()`` after a previous call to
``kernel_fpu_available()`` returned ``true``. These functions are only
guaranteed to be callable from (preemptible or non-preemptible) process
context.
Preemption may be disabled inside critical sections, so their size
should be minimized. They are *not* required to be reentrant. If the
caller expects to nest critical sections, it must implement its own
reference counting.
1 change: 1 addition & 0 deletions Documentation/core-api/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ Library functionality that is used throughout the kernel.
errseq
wrappers/atomic_t
wrappers/atomic_bitops
floating-point

Low level entry and exit
========================
Expand Down
5 changes: 5 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -970,6 +970,11 @@ KBUILD_CFLAGS += $(CC_FLAGS_CFI)
export CC_FLAGS_CFI
endif

# Architectures can define flags to add/remove for floating-point support
CC_FLAGS_FPU += -D_LINUX_FPU_COMPILATION_UNIT
export CC_FLAGS_FPU
export CC_FLAGS_NO_FPU

ifneq ($(CONFIG_FUNCTION_ALIGNMENT),0)
# Set the minimal function alignment. Use the newer GCC option
# -fmin-function-alignment if it is available, or fall back to -falign-funtions.
Expand Down
6 changes: 6 additions & 0 deletions arch/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -1594,6 +1594,12 @@ config ARCH_HAS_NONLEAF_PMD_YOUNG
address translations. Page table walkers that clear the accessed bit
may use this capability to reduce their search space.

config ARCH_HAS_KERNEL_FPU_SUPPORT
bool
help
Architectures that select this option can run floating-point code in
the kernel, as described in Documentation/core-api/floating-point.rst.

source "kernel/gcov/Kconfig"

source "scripts/gcc-plugins/Kconfig"
Expand Down
7 changes: 7 additions & 0 deletions arch/arm/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -130,6 +130,13 @@ endif
# Accept old syntax despite ".syntax unified"
AFLAGS_NOWARN :=$(call as-option,-Wa$(comma)-mno-warn-deprecated,-Wa$(comma)-W)

# The GCC option -ffreestanding is required in order to compile code containing
# ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
CC_FLAGS_FPU := -ffreestanding
# Enable <arm_neon.h>
CC_FLAGS_FPU += -isystem $(shell $(CC) -print-file-name=include)
CC_FLAGS_FPU += -march=armv7-a -mfloat-abi=softfp -mfpu=neon

ifeq ($(CONFIG_THUMB2_KERNEL),y)
CFLAGS_ISA :=-Wa,-mimplicit-it=always $(AFLAGS_NOWARN)
AFLAGS_ISA :=$(CFLAGS_ISA) -Wa$(comma)-mthumb
Expand Down
15 changes: 15 additions & 0 deletions arch/arm/include/asm/fpu.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 SiFive
*/

#ifndef __ASM_FPU_H
#define __ASM_FPU_H

#include <asm/neon.h>

#define kernel_fpu_available() cpu_has_neon()
#define kernel_fpu_begin() kernel_neon_begin()
#define kernel_fpu_end() kernel_neon_end()

#endif /* ! __ASM_FPU_H */
3 changes: 1 addition & 2 deletions arch/arm/lib/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -40,8 +40,7 @@ $(obj)/csumpartialcopy.o: $(obj)/csumpartialcopygeneric.S
$(obj)/csumpartialcopyuser.o: $(obj)/csumpartialcopygeneric.S

ifeq ($(CONFIG_KERNEL_MODE_NEON),y)
NEON_FLAGS := -march=armv7-a -mfloat-abi=softfp -mfpu=neon
CFLAGS_xor-neon.o += $(NEON_FLAGS)
CFLAGS_xor-neon.o += $(CC_FLAGS_FPU)
obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
endif

Expand Down
1 change: 1 addition & 0 deletions arch/arm64/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ config ARM64
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_KERNEL_FPU_SUPPORT if KERNEL_MODE_NEON
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
Expand Down
9 changes: 8 additions & 1 deletion arch/arm64/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,14 @@ ifeq ($(CONFIG_BROKEN_GAS_INST),y)
$(warning Detected assembler with broken .inst; disassembly will be unreliable)
endif

KBUILD_CFLAGS += -mgeneral-regs-only \
# The GCC option -ffreestanding is required in order to compile code containing
# ARM/NEON intrinsics in a non C99-compliant environment (such as the kernel)
CC_FLAGS_FPU := -ffreestanding
# Enable <arm_neon.h>
CC_FLAGS_FPU += -isystem $(shell $(CC) -print-file-name=include)
CC_FLAGS_NO_FPU := -mgeneral-regs-only

KBUILD_CFLAGS += $(CC_FLAGS_NO_FPU) \
$(compat_vdso) $(cc_has_k_constraint)
KBUILD_CFLAGS += $(call cc-disable-warning, psabi)
KBUILD_AFLAGS += $(compat_vdso)
Expand Down
15 changes: 15 additions & 0 deletions arch/arm64/include/asm/fpu.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 SiFive
*/

#ifndef __ASM_FPU_H
#define __ASM_FPU_H

#include <asm/neon.h>

#define kernel_fpu_available() cpu_has_neon()
#define kernel_fpu_begin() kernel_neon_begin()
#define kernel_fpu_end() kernel_neon_end()

#endif /* ! __ASM_FPU_H */
6 changes: 2 additions & 4 deletions arch/arm64/lib/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -7,10 +7,8 @@ lib-y := clear_user.o delay.o copy_from_user.o \

ifeq ($(CONFIG_KERNEL_MODE_NEON), y)
obj-$(CONFIG_XOR_BLOCKS) += xor-neon.o
CFLAGS_REMOVE_xor-neon.o += -mgeneral-regs-only
CFLAGS_xor-neon.o += -ffreestanding
# Enable <arm_neon.h>
CFLAGS_xor-neon.o += -isystem $(shell $(CC) -print-file-name=include)
CFLAGS_xor-neon.o += $(CC_FLAGS_FPU)
CFLAGS_REMOVE_xor-neon.o += $(CC_FLAGS_NO_FPU)
endif

lib-$(CONFIG_ARCH_HAS_UACCESS_FLUSHCACHE) += uaccess_flushcache.o
Expand Down
1 change: 1 addition & 0 deletions arch/loongarch/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ config LOONGARCH
select ARCH_HAS_FAST_MULTIPLIER
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_KCOV
select ARCH_HAS_KERNEL_FPU_SUPPORT if CPU_HAS_FPU
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_SPECIAL
Expand Down
5 changes: 4 additions & 1 deletion arch/loongarch/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -26,6 +26,9 @@ endif
32bit-emul = elf32loongarch
64bit-emul = elf64loongarch

CC_FLAGS_FPU := -mfpu=64
CC_FLAGS_NO_FPU := -msoft-float

ifdef CONFIG_UNWINDER_ORC
orc_hash_h := arch/$(SRCARCH)/include/generated/asm/orc_hash.h
orc_hash_sh := $(srctree)/scripts/orc_hash.sh
Expand Down Expand Up @@ -59,7 +62,7 @@ ld-emul = $(64bit-emul)
cflags-y += -mabi=lp64s
endif

cflags-y += -pipe -msoft-float
cflags-y += -pipe $(CC_FLAGS_NO_FPU)
LDFLAGS_vmlinux += -static -n -nostdlib

# When the assembler supports explicit relocation hint, we must use it.
Expand Down
1 change: 1 addition & 0 deletions arch/loongarch/include/asm/fpu.h
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@

struct sigcontext;

#define kernel_fpu_available() cpu_has_fpu
extern void kernel_fpu_begin(void);
extern void kernel_fpu_end(void);

Expand Down
1 change: 1 addition & 0 deletions arch/powerpc/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -137,6 +137,7 @@ config PPC
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_HUGEPD if HUGETLB_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_KERNEL_FPU_SUPPORT if PPC_FPU
select ARCH_HAS_MEMBARRIER_CALLBACKS
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_MEMREMAP_COMPAT_ALIGN if PPC_64S_HASH_MMU
Expand Down
5 changes: 4 additions & 1 deletion arch/powerpc/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -149,6 +149,9 @@ CFLAGS-$(CONFIG_PPC32) += $(call cc-option, $(MULTIPLEWORD))

CFLAGS-$(CONFIG_PPC32) += $(call cc-option,-mno-readonly-in-sdata)

CC_FLAGS_FPU := $(call cc-option,-mhard-float)
CC_FLAGS_NO_FPU := $(call cc-option,-msoft-float)

ifdef CONFIG_FUNCTION_TRACER
ifdef CONFIG_ARCH_USING_PATCHABLE_FUNCTION_ENTRY
KBUILD_CPPFLAGS += -DCC_USING_PATCHABLE_FUNCTION_ENTRY
Expand All @@ -170,7 +173,7 @@ asinstr := $(call as-instr,lis 9$(comma)foo@high,-DHAVE_AS_ATHIGH=1)

KBUILD_CPPFLAGS += -I $(srctree)/arch/powerpc $(asinstr)
KBUILD_AFLAGS += $(AFLAGS-y)
KBUILD_CFLAGS += $(call cc-option,-msoft-float)
KBUILD_CFLAGS += $(CC_FLAGS_NO_FPU)
KBUILD_CFLAGS += $(CFLAGS-y)
CPP = $(CC) -E $(KBUILD_CFLAGS)

Expand Down
28 changes: 28 additions & 0 deletions arch/powerpc/include/asm/fpu.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 SiFive
*/

#ifndef _ASM_POWERPC_FPU_H
#define _ASM_POWERPC_FPU_H

#include <linux/preempt.h>

#include <asm/cpu_has_feature.h>
#include <asm/switch_to.h>

#define kernel_fpu_available() (!cpu_has_feature(CPU_FTR_FPU_UNAVAILABLE))

static inline void kernel_fpu_begin(void)
{
preempt_disable();
enable_kernel_fp();
}

static inline void kernel_fpu_end(void)
{
disable_kernel_fp();
preempt_enable();
}

#endif /* ! _ASM_POWERPC_FPU_H */
1 change: 1 addition & 0 deletions arch/riscv/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,7 @@ config RISCV
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_KERNEL_FPU_SUPPORT if 64BIT && FPU
select ARCH_HAS_MEMBARRIER_CALLBACKS
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_MMIOWB
Expand Down
3 changes: 3 additions & 0 deletions arch/riscv/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -91,6 +91,9 @@ KBUILD_CFLAGS += -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64i

KBUILD_AFLAGS += -march=$(riscv-march-y)

# For C code built with floating-point support, exclude V but keep F and D.
CC_FLAGS_FPU := -march=$(shell echo $(riscv-march-y) | sed -E 's/(rv32ima|rv64ima)([^v_]*)v?/\1\2/')

KBUILD_CFLAGS += -mno-save-restore
KBUILD_CFLAGS += -DCONFIG_PAGE_OFFSET=$(CONFIG_PAGE_OFFSET)

Expand Down
16 changes: 16 additions & 0 deletions arch/riscv/include/asm/fpu.h
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
/* SPDX-License-Identifier: GPL-2.0-only */
/*
* Copyright (C) 2023 SiFive
*/

#ifndef _ASM_RISCV_FPU_H
#define _ASM_RISCV_FPU_H

#include <asm/switch_to.h>

#define kernel_fpu_available() has_fpu()

void kernel_fpu_begin(void);
void kernel_fpu_end(void);

#endif /* ! _ASM_RISCV_FPU_H */
1 change: 1 addition & 0 deletions arch/riscv/kernel/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -67,6 +67,7 @@ obj-$(CONFIG_RISCV_MISALIGNED) += unaligned_access_speed.o
obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o

obj-$(CONFIG_FPU) += fpu.o
obj-$(CONFIG_FPU) += kernel_mode_fpu.o
obj-$(CONFIG_RISCV_ISA_V) += vector.o
obj-$(CONFIG_RISCV_ISA_V) += kernel_mode_vector.o
obj-$(CONFIG_SMP) += smpboot.o
Expand Down
28 changes: 28 additions & 0 deletions arch/riscv/kernel/kernel_mode_fpu.c
Original file line number Diff line number Diff line change
@@ -0,0 +1,28 @@
// SPDX-License-Identifier: GPL-2.0-only
/*
* Copyright (C) 2023 SiFive
*/

#include <linux/export.h>
#include <linux/preempt.h>

#include <asm/csr.h>
#include <asm/fpu.h>
#include <asm/processor.h>
#include <asm/switch_to.h>

void kernel_fpu_begin(void)
{
preempt_disable();
fstate_save(current, task_pt_regs(current));
csr_set(CSR_SSTATUS, SR_FS);
}
EXPORT_SYMBOL_GPL(kernel_fpu_begin);

void kernel_fpu_end(void)
{
csr_clear(CSR_SSTATUS, SR_FS);
fstate_restore(current, task_pt_regs(current));
preempt_enable();
}
EXPORT_SYMBOL_GPL(kernel_fpu_end);
1 change: 1 addition & 0 deletions arch/x86/Kconfig
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,7 @@ config X86
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_KCOV if X86_64
select ARCH_HAS_KERNEL_FPU_SUPPORT
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
Expand Down
Loading

0 comments on commit c760b37

Please sign in to comment.