Merge patch series "RISC-V: Detect and report speed of unaligned vector accesses"

Charlie Jenkins <charlie@rivosinc.com> says:

Adds support for detecting and reporting the speed of unaligned vector
accesses on RISC-V CPUs. Adds a vec_misaligned_speed key to hwprobe,
adds Zicclsm to cpufeature, and fixes the check for scalar unaligned
access emulation on all CPUs. The vec_misaligned_speed key keeps the
same format as the scalar unaligned access speed key.

This set does not emulate unaligned vector accesses on CPUs that do not
support them. It only reports whether userspace can run them and, if
supported, the speed of unaligned vector accesses.

* b4-shazam-merge:
  RISC-V: hwprobe: Document unaligned vector perf key
  RISC-V: Report vector unaligned access speed hwprobe
  RISC-V: Detect unaligned vector accesses supported
  RISC-V: Replace RISCV_MISALIGNED with RISCV_SCALAR_MISALIGNED
  RISC-V: Scalar unaligned access emulated on hotplug CPUs
  RISC-V: Check scalar unaligned access on all CPUs

Link: https://lore.kernel.org/r/20241017-jesse_unaligned_vector-v10-0-5b33500160f8@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Palmer Dabbelt committed Oct 18, 2024
2 parents 7727020 + 40e09eb commit 18efe86
Showing 15 changed files with 474 additions and 38 deletions.
16 changes: 16 additions & 0 deletions Documentation/arch/riscv/hwprobe.rst
@@ -274,3 +274,19 @@ The following keys are defined:
represent the highest userspace virtual address usable.

* :c:macro:`RISCV_HWPROBE_KEY_TIME_CSR_FREQ`: Frequency (in Hz) of `time CSR`.

* :c:macro:`RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF`: An enum value describing the
performance of misaligned vector accesses on the selected set of processors.

* :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN`: The performance of misaligned
vector accesses is unknown.

* :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW`: 32-bit misaligned accesses using vector
registers are slower than the equivalent quantity of byte accesses via vector registers.
Misaligned accesses may be supported directly in hardware, or trapped and emulated by software.

* :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_FAST`: 32-bit misaligned accesses using vector
registers are faster than the equivalent quantity of byte accesses via vector registers.

* :c:macro:`RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED`: Misaligned vector accesses are
not supported at all and will generate a misaligned address fault.
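
The new key is queried like the existing scalar key, through the riscv_hwprobe(2) syscall. Below is a minimal userspace sketch, not part of this commit, assuming a kernel with this series; the fallback #defines are only for building against older uapi headers and mirror the values added in this patch.

/* Hypothetical example: query RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF. */
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/hwprobe.h>

#ifndef RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF
#define RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF	10
#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN		0
#define RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW		2
#define RISCV_HWPROBE_MISALIGNED_VECTOR_FAST		3
#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED	4
#endif

int main(void)
{
	struct riscv_hwprobe pair = {
		.key = RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF,
	};

	/* cpusetsize == 0 and cpus == NULL ask about all online CPUs. */
	if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0))
		return 1;

	/* An older kernel that does not know the key reports value 0 (UNKNOWN). */
	switch (pair.value) {
	case RISCV_HWPROBE_MISALIGNED_VECTOR_FAST:
		puts("misaligned vector accesses: fast");
		break;
	case RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW:
		puts("misaligned vector accesses: slow");
		break;
	case RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED:
		puts("misaligned vector accesses: unsupported");
		break;
	default:
		puts("misaligned vector accesses: unknown");
	}
	return 0;
}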
58 changes: 56 additions & 2 deletions arch/riscv/Kconfig
@@ -786,10 +786,24 @@ config THREAD_SIZE_ORDER

config RISCV_MISALIGNED
bool
help
Embed support for detecting and emulating misaligned
scalar or vector loads and stores.

config RISCV_SCALAR_MISALIGNED
bool
select RISCV_MISALIGNED
select SYSCTL_ARCH_UNALIGN_ALLOW
help
Embed support for emulating misaligned loads and stores.

config RISCV_VECTOR_MISALIGNED
bool
select RISCV_MISALIGNED
depends on RISCV_ISA_V
help
Enable detecting support for vector misaligned loads and stores.

choice
prompt "Unaligned Accesses Support"
default RISCV_PROBE_UNALIGNED_ACCESS
@@ -801,7 +815,7 @@ choice

config RISCV_PROBE_UNALIGNED_ACCESS
bool "Probe for hardware unaligned access support"
select RISCV_MISALIGNED
select RISCV_SCALAR_MISALIGNED
help
During boot, the kernel will run a series of tests to determine the
speed of unaligned accesses. This probing will dynamically determine
@@ -812,7 +826,7 @@ config RISCV_PROBE_UNALIGNED_ACCESS

config RISCV_EMULATED_UNALIGNED_ACCESS
bool "Emulate unaligned access where system support is missing"
select RISCV_MISALIGNED
select RISCV_SCALAR_MISALIGNED
help
If unaligned memory accesses trap into the kernel as they are not
supported by the system, the kernel will emulate the unaligned
@@ -841,6 +855,46 @@ config RISCV_EFFICIENT_UNALIGNED_ACCESS

endchoice

choice
prompt "Vector unaligned Accesses Support"
depends on RISCV_ISA_V
default RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
help
This determines the level of support for vector unaligned accesses. This
information is used by the kernel to perform optimizations. It is also
exposed to user space via the hwprobe syscall. The hardware will be
probed at boot by default.

config RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
bool "Probe speed of vector unaligned accesses"
select RISCV_VECTOR_MISALIGNED
depends on RISCV_ISA_V
help
During boot, the kernel will run a series of tests to determine the
speed of vector unaligned accesses if they are supported. This probing
will dynamically determine the speed of vector unaligned accesses on
the underlying system if they are supported.

config RISCV_SLOW_VECTOR_UNALIGNED_ACCESS
bool "Assume the system supports slow vector unaligned memory accesses"
depends on NONPORTABLE
help
Assume that the system supports slow vector unaligned memory accesses. The
kernel and userspace programs may not be able to run at all on systems
that do not support unaligned memory accesses.

config RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS
bool "Assume the system supports fast vector unaligned memory accesses"
depends on NONPORTABLE
help
Assume that the system supports fast vector unaligned memory accesses. When
enabled, this option improves the performance of the kernel on such
systems. However, the kernel and userspace programs will run much more
slowly, or will not be able to run at all, on systems that do not
support efficient unaligned memory accesses.

endchoice

source "arch/riscv/Kconfig.vendor"

endmenu # "Platform type"
10 changes: 9 additions & 1 deletion arch/riscv/include/asm/cpufeature.h
@@ -8,6 +8,7 @@

#include <linux/bitmap.h>
#include <linux/jump_label.h>
#include <linux/workqueue.h>
#include <asm/hwcap.h>
#include <asm/alternative-macros.h>
#include <asm/errno.h>
@@ -58,8 +59,9 @@ void __init riscv_user_isa_enable(void);
#define __RISCV_ISA_EXT_SUPERSET_VALIDATE(_name, _id, _sub_exts, _validate) \
_RISCV_ISA_EXT_DATA(_name, _id, _sub_exts, ARRAY_SIZE(_sub_exts), _validate)

#if defined(CONFIG_RISCV_MISALIGNED)
bool check_unaligned_access_emulated_all_cpus(void);
#if defined(CONFIG_RISCV_SCALAR_MISALIGNED)
void check_unaligned_access_emulated(struct work_struct *work __always_unused);
void unaligned_emulation_finish(void);
bool unaligned_ctl_available(void);
DECLARE_PER_CPU(long, misaligned_access_speed);
@@ -70,6 +72,12 @@ static inline bool unaligned_ctl_available(void)
}
#endif

bool check_vector_unaligned_access_emulated_all_cpus(void);
#if defined(CONFIG_RISCV_VECTOR_MISALIGNED)
void check_vector_unaligned_access_emulated(struct work_struct *work __always_unused);
DECLARE_PER_CPU(long, vector_misaligned_access);
#endif

#if defined(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS)
DECLARE_STATIC_KEY_FALSE(fast_unaligned_access_speed_key);

11 changes: 0 additions & 11 deletions arch/riscv/include/asm/entry-common.h
@@ -25,18 +25,7 @@ static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
void handle_page_fault(struct pt_regs *regs);
void handle_break(struct pt_regs *regs);

#ifdef CONFIG_RISCV_MISALIGNED
int handle_misaligned_load(struct pt_regs *regs);
int handle_misaligned_store(struct pt_regs *regs);
#else
static inline int handle_misaligned_load(struct pt_regs *regs)
{
return -1;
}
static inline int handle_misaligned_store(struct pt_regs *regs)
{
return -1;
}
#endif

#endif /* _ASM_RISCV_ENTRY_COMMON_H */
2 changes: 1 addition & 1 deletion arch/riscv/include/asm/hwprobe.h
@@ -8,7 +8,7 @@

#include <uapi/asm/hwprobe.h>

#define RISCV_HWPROBE_MAX_KEY 9
#define RISCV_HWPROBE_MAX_KEY 10

static inline bool riscv_hwprobe_key_is_valid(__s64 key)
{
2 changes: 2 additions & 0 deletions arch/riscv/include/asm/vector.h
@@ -21,6 +21,7 @@

extern unsigned long riscv_v_vsize;
int riscv_v_setup_vsize(void);
bool insn_is_vector(u32 insn_buf);
bool riscv_v_first_use_handler(struct pt_regs *regs);
void kernel_vector_begin(void);
void kernel_vector_end(void);
@@ -268,6 +269,7 @@ struct pt_regs;

static inline int riscv_v_setup_vsize(void) { return -EOPNOTSUPP; }
static __always_inline bool has_vector(void) { return false; }
static __always_inline bool insn_is_vector(u32 insn_buf) { return false; }
static inline bool riscv_v_first_use_handler(struct pt_regs *regs) { return false; }
static inline bool riscv_v_vstate_query(struct pt_regs *regs) { return false; }
static inline bool riscv_v_vstate_ctrl_user_allowed(void) { return false; }
5 changes: 5 additions & 0 deletions arch/riscv/include/uapi/asm/hwprobe.h
@@ -88,6 +88,11 @@ struct riscv_hwprobe {
#define RISCV_HWPROBE_MISALIGNED_SCALAR_SLOW 2
#define RISCV_HWPROBE_MISALIGNED_SCALAR_FAST 3
#define RISCV_HWPROBE_MISALIGNED_SCALAR_UNSUPPORTED 4
#define RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF 10
#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN 0
#define RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW 2
#define RISCV_HWPROBE_MISALIGNED_VECTOR_FAST 3
#define RISCV_HWPROBE_MISALIGNED_VECTOR_UNSUPPORTED 4
/* Increase RISCV_HWPROBE_MAX_KEY when adding items. */

/* Flags */
3 changes: 2 additions & 1 deletion arch/riscv/kernel/Makefile
@@ -70,7 +70,8 @@ obj-$(CONFIG_MMU) += vdso.o vdso/

obj-$(CONFIG_RISCV_MISALIGNED) += traps_misaligned.o
obj-$(CONFIG_RISCV_MISALIGNED) += unaligned_access_speed.o
obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o
obj-$(CONFIG_RISCV_PROBE_UNALIGNED_ACCESS) += copy-unaligned.o
obj-$(CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS) += vec-copy-unaligned.o

obj-$(CONFIG_FPU) += fpu.o
obj-$(CONFIG_FPU) += kernel_mode_fpu.o
5 changes: 5 additions & 0 deletions arch/riscv/kernel/copy-unaligned.h
@@ -10,4 +10,9 @@
void __riscv_copy_words_unaligned(void *dst, const void *src, size_t size);
void __riscv_copy_bytes_unaligned(void *dst, const void *src, size_t size);

#ifdef CONFIG_RISCV_PROBE_VECTOR_UNALIGNED_ACCESS
void __riscv_copy_vec_words_unaligned(void *dst, const void *src, size_t size);
void __riscv_copy_vec_bytes_unaligned(void *dst, const void *src, size_t size);
#endif

#endif /* __RISCV_KERNEL_COPY_UNALIGNED_H */
4 changes: 2 additions & 2 deletions arch/riscv/kernel/fpu.S
@@ -170,7 +170,7 @@ SYM_FUNC_END(__fstate_restore)
__access_func(f31)


#ifdef CONFIG_RISCV_MISALIGNED
#ifdef CONFIG_RISCV_SCALAR_MISALIGNED

/*
* Disable compressed instructions set to keep a constant offset between FP
@@ -224,4 +224,4 @@ SYM_FUNC_START(get_f64_reg)
fp_access_epilogue
SYM_FUNC_END(get_f64_reg)

#endif /* CONFIG_RISCV_MISALIGNED */
#endif /* CONFIG_RISCV_SCALAR_MISALIGNED */
41 changes: 41 additions & 0 deletions arch/riscv/kernel/sys_hwprobe.c
@@ -201,6 +201,43 @@ static u64 hwprobe_misaligned(const struct cpumask *cpus)
}
#endif

#ifdef CONFIG_RISCV_VECTOR_MISALIGNED
static u64 hwprobe_vec_misaligned(const struct cpumask *cpus)
{
int cpu;
u64 perf = -1ULL;

/* Return if supported or not even if speed wasn't probed */
for_each_cpu(cpu, cpus) {
int this_perf = per_cpu(vector_misaligned_access, cpu);

if (perf == -1ULL)
perf = this_perf;

if (perf != this_perf) {
perf = RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
break;
}
}

if (perf == -1ULL)
return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;

return perf;
}
#else
static u64 hwprobe_vec_misaligned(const struct cpumask *cpus)
{
if (IS_ENABLED(CONFIG_RISCV_EFFICIENT_VECTOR_UNALIGNED_ACCESS))
return RISCV_HWPROBE_MISALIGNED_VECTOR_FAST;

if (IS_ENABLED(CONFIG_RISCV_SLOW_VECTOR_UNALIGNED_ACCESS))
return RISCV_HWPROBE_MISALIGNED_VECTOR_SLOW;

return RISCV_HWPROBE_MISALIGNED_VECTOR_UNKNOWN;
}
#endif

static void hwprobe_one_pair(struct riscv_hwprobe *pair,
const struct cpumask *cpus)
{
@@ -229,6 +266,10 @@ static void hwprobe_one_pair(struct riscv_hwprobe *pair,
pair->value = hwprobe_misaligned(cpus);
break;

case RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF:
pair->value = hwprobe_vec_misaligned(cpus);
break;

case RISCV_HWPROBE_KEY_ZICBOZ_BLOCK_SIZE:
pair->value = 0;
if (hwprobe_ext0_has(cpus, RISCV_HWPROBE_EXT_ZICBOZ))
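
hwprobe_vec_misaligned() above folds the per-CPU vector_misaligned_access values for the requested cpumask into a single answer, and falls back to UNKNOWN as soon as two CPUs disagree or when nothing was probed. The following userspace sketch, not part of this commit, illustrates the consequence by restricting the query to a single CPU via the cpusetsize/cpus arguments; CPU 0 and the cpu_set_t usage are illustrative assumptions based on the hwprobe documentation.

/* Hypothetical example: limit the hwprobe query to CPU 0 so the kernel's
 * aggregation over the cpumask reduces to that CPU's probed value.
 */
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <asm/hwprobe.h>

int main(void)
{
	struct riscv_hwprobe pair = {
		.key = RISCV_HWPROBE_KEY_MISALIGNED_VECTOR_PERF,
	};
	cpu_set_t cpus;

	CPU_ZERO(&cpus);
	CPU_SET(0, &cpus);	/* ask about CPU 0 only (illustrative) */

	if (syscall(__NR_riscv_hwprobe, &pair, 1, sizeof(cpus), &cpus, 0))
		return 1;

	/* With one CPU in the set this is that CPU's own value; a mixed set
	 * of CPUs with differing results would report ..._VECTOR_UNKNOWN. */
	printf("cpu0 vector misaligned perf value: %llu\n",
	       (unsigned long long)pair.value);
	return 0;
}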