…l/git/kvmarm/kvmarm into HEAD

KVM/arm64 updates for 6.15

 - Nested virtualization support for VGICv3, giving the nested
   hypervisor control of the VGIC hardware when running an L2 VM

 - Removal of 'late' nested virtualization feature register masking,
   making the supported feature set directly visible to userspace

 - Support for emulating FEAT_PMUv3 on Apple silicon, taking advantage
   of an IMPLEMENTATION DEFINED trap that covers all PMUv3 registers

 - Paravirtual interface for discovering the set of CPU implementations
   where a VM may run, addressing a longstanding issue of guest CPU
   errata awareness in big-little systems and cross-implementation VM
   migration

 - Userspace control of the registers responsible for identifying a
   particular CPU implementation (MIDR_EL1, REVIDR_EL1, AIDR_EL1),
   allowing VMs to be migrated cross-implementation

 - pKVM updates, including support for tracking stage-2 page table
   allocations in the protected hypervisor in the 'SecPageTable' stat

 - Fixes to vPMU, ensuring that userspace updates to the vPMU after
   KVM_RUN are reflected into the backing perf events
Paolo Bonzini committed Mar 20, 2025
2 parents c0f99fb + 369c012 commit 0afd104
Showing 72 changed files with 2,173 additions and 752 deletions.
18 changes: 18 additions & 0 deletions Documentation/virt/kvm/api.rst
@@ -8262,6 +8262,24 @@ KVM exits with the register state of either the L1 or L2 guest
depending on which executed at the time of an exit. Userspace must
take care to differentiate between these cases.

7.37 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
-------------------------------------

:Architectures: arm64
:Target: VM
:Parameters: None
:Returns: 0 on success, -EINVAL if vCPUs have been created before enabling this
capability.

This capability changes the behavior of the registers that identify a PE
implementation of the Arm architecture: MIDR_EL1, REVIDR_EL1, and AIDR_EL1.
By default, these registers are visible to userspace but treated as invariant.

When this capability is enabled, KVM allows userspace to change the
aforementioned registers before the first KVM_RUN. These registers are VM
scoped, meaning that the same set of values is presented on all vCPUs in a
given VM.

8. Other capabilities.
======================

15 changes: 14 additions & 1 deletion Documentation/virt/kvm/arm/fw-pseudo-registers.rst
@@ -116,7 +116,7 @@ The pseudo-firmware bitmap registers are as follows:
ARM DEN0057A.

* KVM_REG_ARM_VENDOR_HYP_BMAP:
Controls the bitmap of the Vendor specific Hypervisor Service Calls [0-63].

The following bits are accepted:

@@ -127,6 +127,19 @@ The pseudo-firmware bitmap registers are as follows:
Bit-1: KVM_REG_ARM_VENDOR_HYP_BIT_PTP:
The bit represents the Precision Time Protocol KVM service.

* KVM_REG_ARM_VENDOR_HYP_BMAP_2:
Controls the bitmap of the Vendor specific Hypervisor Service Calls [64-127].

The following bits are accepted:

Bit-0: KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_VER
This represents the ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_VER_FUNC_ID
function-id. This is reset to 0.

Bit-1: KVM_REG_ARM_VENDOR_HYP_BIT_DISCOVER_IMPL_CPUS
This represents the ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_CPUS_FUNC_ID
function-id. This is reset to 0.

Errors:

======= =============================================================
59 changes: 59 additions & 0 deletions Documentation/virt/kvm/arm/hypercalls.rst
@@ -142,3 +142,62 @@ region is equal to the memory protection granule advertised by
| | | +---------------------------------------------+
| | | | ``INVALID_PARAMETER (-3)`` |
+---------------------+----------+----+---------------------------------------------+

``ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_VER_FUNC_ID``
-------------------------------------------------------
Request the target CPU implementation version information and the number of target
implementations for the Guest VM.

+---------------------+-------------------------------------------------------------+
| Presence: | Optional; KVM/ARM64 Guests only |
+---------------------+-------------------------------------------------------------+
| Calling convention: | HVC64 |
+---------------------+----------+--------------------------------------------------+
| Function ID: | (uint32) | 0xC6000040 |
+---------------------+----------+--------------------------------------------------+
| Arguments: | None |
+---------------------+----------+----+---------------------------------------------+
| Return Values: | (int64) | R0 | ``SUCCESS (0)`` |
| | | +---------------------------------------------+
| | | | ``NOT_SUPPORTED (-1)`` |
| +----------+----+---------------------------------------------+
| | (uint64) | R1 | Bits [63:32] Reserved/Must be zero |
| | | +---------------------------------------------+
| | | | Bits [31:16] Major version |
| | | +---------------------------------------------+
| | | | Bits [15:0] Minor version |
| +----------+----+---------------------------------------------+
| | (uint64) | R2 | Number of target implementations |
| +----------+----+---------------------------------------------+
| | (uint64) | R3 | Reserved / Must be zero |
+---------------------+----------+----+---------------------------------------------+

``ARM_SMCCC_VENDOR_HYP_KVM_DISCOVER_IMPL_CPUS_FUNC_ID``
-------------------------------------------------------

Request the target CPU implementation information for the Guest VM. The Guest kernel
will use this information to enable the associated errata.

+---------------------+-------------------------------------------------------------+
| Presence: | Optional; KVM/ARM64 Guests only |
+---------------------+-------------------------------------------------------------+
| Calling convention: | HVC64 |
+---------------------+----------+--------------------------------------------------+
| Function ID: | (uint32) | 0xC6000041 |
+---------------------+----------+----+---------------------------------------------+
| Arguments: | (uint64) | R1 | selected implementation index |
| +----------+----+---------------------------------------------+
| | (uint64) | R2 | Reserved / Must be zero |
| +----------+----+---------------------------------------------+
| | (uint64) | R3 | Reserved / Must be zero |
+---------------------+----------+----+---------------------------------------------+
| Return Values: | (int64) | R0 | ``SUCCESS (0)`` |
| | | +---------------------------------------------+
| | | | ``INVALID_PARAMETER (-3)`` |
| +----------+----+---------------------------------------------+
| | (uint64) | R1 | MIDR_EL1 of the selected implementation |
| +----------+----+---------------------------------------------+
| | (uint64) | R2 | REVIDR_EL1 of the selected implementation |
| +----------+----+---------------------------------------------+
| | (uint64) | R3 | AIDR_EL1 of the selected implementation |
+---------------------+----------+----+---------------------------------------------+
5 changes: 4 additions & 1 deletion Documentation/virt/kvm/devices/arm-vgic-its.rst
@@ -126,7 +126,8 @@ KVM_DEV_ARM_VGIC_GRP_ITS_REGS
ITS Restore Sequence:
---------------------

The following ordering must be followed when restoring the GIC, ITS, and
KVM_IRQFD assignments:

a) restore all guest memory and create vcpus
b) restore all redistributors
@@ -139,6 +140,8 @@ d) restore the ITS in the following order:
3. Load the ITS table data (KVM_DEV_ARM_ITS_RESTORE_TABLES)
4. Restore GITS_CTLR

e) restore KVM_IRQFD assignments for MSIs

Then vcpus can be started.

ITS Table ABI REV0:
12 changes: 11 additions & 1 deletion Documentation/virt/kvm/devices/arm-vgic-v3.rst
@@ -291,8 +291,18 @@ Groups:
| Aff3 | Aff2 | Aff1 | Aff0 |

Errors:

======= =============================================
-EINVAL vINTID is not multiple of 32 or info field is
not VGIC_LEVEL_INFO_LINE_LEVEL
======= =============================================

KVM_DEV_ARM_VGIC_GRP_MAINT_IRQ
Attributes:

The attr field of kvm_device_attr encodes the following values:

bits: | 31 .... 5 | 4 .... 0 |
values: | RES0 | vINTID |

The vINTID specifies which interrupt is generated when the vGIC
must generate a maintenance interrupt. This must be a PPI.
1 change: 1 addition & 0 deletions arch/arm64/include/asm/apple_m1_pmu.h
@@ -37,6 +37,7 @@
#define PMCR0_PMI_ENABLE_8_9 GENMASK(45, 44)

#define SYS_IMP_APL_PMCR1_EL1 sys_reg(3, 1, 15, 1, 0)
#define SYS_IMP_APL_PMCR1_EL12 sys_reg(3, 1, 15, 7, 2)
#define PMCR1_COUNT_A64_EL0_0_7 GENMASK(15, 8)
#define PMCR1_COUNT_A64_EL1_0_7 GENMASK(23, 16)
#define PMCR1_COUNT_A64_EL0_8_9 GENMASK(41, 40)
2 changes: 2 additions & 0 deletions arch/arm64/include/asm/cpucaps.h
@@ -71,6 +71,8 @@ cpucap_is_possible(const unsigned int cap)
* KVM MPAM support doesn't rely on the host kernel supporting MPAM.
*/
return true;
case ARM64_HAS_PMUV3:
return IS_ENABLED(CONFIG_HW_PERF_EVENTS);
}

return true;
28 changes: 5 additions & 23 deletions arch/arm64/include/asm/cpufeature.h
@@ -525,29 +525,6 @@ cpuid_feature_extract_unsigned_field(u64 features, int field)
return cpuid_feature_extract_unsigned_field_width(features, field, 4);
}

/*
* Fields that identify the version of the Performance Monitors Extension do
* not follow the standard ID scheme. See ARM DDI 0487E.a page D13-2825,
* "Alternative ID scheme used for the Performance Monitors Extension version".
*/
static inline u64 __attribute_const__
cpuid_feature_cap_perfmon_field(u64 features, int field, u64 cap)
{
u64 val = cpuid_feature_extract_unsigned_field(features, field);
u64 mask = GENMASK_ULL(field + 3, field);

/* Treat IMPLEMENTATION DEFINED functionality as unimplemented */
if (val == ID_AA64DFR0_EL1_PMUVer_IMP_DEF)
val = 0;

if (val > cap) {
features &= ~mask;
features |= (cap << field) & mask;
}

return features;
}

static inline u64 arm64_ftr_mask(const struct arm64_ftr_bits *ftrp)
{
return (u64)GENMASK(ftrp->shift + ftrp->width - 1, ftrp->shift);
@@ -866,6 +843,11 @@ static __always_inline bool system_supports_mpam_hcr(void)
return alternative_has_cap_unlikely(ARM64_MPAM_HCR);
}

static inline bool system_supports_pmuv3(void)
{
return cpus_have_final_cap(ARM64_HAS_PMUV3);
}

int do_emulate_mrs(struct pt_regs *regs, u32 sys_reg, u32 rt);
bool try_emulate_mrs(struct pt_regs *regs, u32 isn);

40 changes: 17 additions & 23 deletions arch/arm64/include/asm/cputype.h
@@ -231,6 +231,16 @@

#define read_cpuid(reg) read_sysreg_s(SYS_ ## reg)

/*
* The CPU ID never changes at run time, so we might as well tell the
* compiler that it's constant. Use this function to read the CPU ID
* rather than directly reading processor_id or read_cpuid() directly.
*/
static inline u32 __attribute_const__ read_cpuid_id(void)
{
return read_cpuid(MIDR_EL1);
}

/*
* Represent a range of MIDR values for a given CPU model and a
* range of variant/revision values.
@@ -266,30 +276,14 @@ static inline bool midr_is_cpu_model_range(u32 midr, u32 model, u32 rv_min,
return _model == model && rv >= rv_min && rv <= rv_max;
}

static inline bool is_midr_in_range(u32 midr, struct midr_range const *range)
{
return midr_is_cpu_model_range(midr, range->model,
range->rv_min, range->rv_max);
}

static inline bool
is_midr_in_range_list(u32 midr, struct midr_range const *ranges)
{
while (ranges->model)
if (is_midr_in_range(midr, ranges++))
return true;
return false;
}
struct target_impl_cpu {
u64 midr;
u64 revidr;
u64 aidr;
};

/*
* The CPU ID never changes at run time, so we might as well tell the
* compiler that it's constant. Use this function to read the CPU ID
* rather than directly reading processor_id or read_cpuid() directly.
*/
static inline u32 __attribute_const__ read_cpuid_id(void)
{
return read_cpuid(MIDR_EL1);
}
bool cpu_errata_set_target_impl(u64 num, void *impl_cpus);
bool is_midr_in_range_list(struct midr_range const *ranges);

static inline u64 __attribute_const__ read_cpuid_mpidr(void)
{
1 change: 1 addition & 0 deletions arch/arm64/include/asm/hypervisor.h
@@ -6,6 +6,7 @@

void kvm_init_hyp_services(void);
bool kvm_arm_hyp_service_available(u32 func_id);
void kvm_arm_target_impl_cpu_init(void);

#ifdef CONFIG_ARM_PKVM_GUEST
void pkvm_init_hyp_services(void);
4 changes: 2 additions & 2 deletions arch/arm64/include/asm/kvm_arm.h
@@ -92,12 +92,12 @@
* SWIO: Turn set/way invalidates into set/way clean+invalidate
* PTW: Take a stage2 fault if a stage1 walk steps in device memory
* TID3: Trap EL1 reads of group 3 ID registers
* TID2: Trap CTR_EL0, CCSIDR2_EL1, CLIDR_EL1, and CSSELR_EL1
* TID1: Trap REVIDR_EL1, AIDR_EL1, and SMIDR_EL1
*/
#define HCR_GUEST_FLAGS (HCR_TSC | HCR_TSW | HCR_TWE | HCR_TWI | HCR_VM | \
HCR_BSU_IS | HCR_FB | HCR_TACR | \
HCR_AMO | HCR_SWIO | HCR_TIDCP | HCR_RW | HCR_TLOR | \
HCR_FMO | HCR_IMO | HCR_PTW | HCR_TID3 | HCR_TID1)
#define HCR_HOST_NVHE_FLAGS (HCR_RW | HCR_API | HCR_APK | HCR_ATA)
#define HCR_HOST_NVHE_PROTECTED_FLAGS (HCR_HOST_NVHE_FLAGS | HCR_TSC)
#define HCR_HOST_VHE_FLAGS (HCR_RW | HCR_TGE | HCR_E2H)
37 changes: 37 additions & 0 deletions arch/arm64/include/asm/kvm_emulate.h
@@ -275,6 +275,19 @@ static __always_inline u64 kvm_vcpu_get_esr(const struct kvm_vcpu *vcpu)
return vcpu->arch.fault.esr_el2;
}

static inline bool guest_hyp_wfx_traps_enabled(const struct kvm_vcpu *vcpu)
{
u64 esr = kvm_vcpu_get_esr(vcpu);
bool is_wfe = !!(esr & ESR_ELx_WFx_ISS_WFE);
u64 hcr_el2 = __vcpu_sys_reg(vcpu, HCR_EL2);

if (!vcpu_has_nv(vcpu) || vcpu_is_el2(vcpu))
return false;

return ((is_wfe && (hcr_el2 & HCR_TWE)) ||
(!is_wfe && (hcr_el2 & HCR_TWI)));
}

static __always_inline int kvm_vcpu_get_condition(const struct kvm_vcpu *vcpu)
{
u64 esr = kvm_vcpu_get_esr(vcpu);
@@ -649,4 +662,28 @@ static inline bool guest_hyp_sve_traps_enabled(const struct kvm_vcpu *vcpu)
{
return __guest_hyp_cptr_xen_trap_enabled(vcpu, ZEN);
}

static inline void vcpu_set_hcrx(struct kvm_vcpu *vcpu)
{
struct kvm *kvm = vcpu->kvm;

if (cpus_have_final_cap(ARM64_HAS_HCX)) {
/*
* In general, all HCRX_EL2 bits are gated by a feature.
* The only reason we can set SMPME without checking any
* feature is that its effects are not directly observable
* from the guest.
*/
vcpu->arch.hcrx_el2 = HCRX_EL2_SMPME;

if (kvm_has_feat(kvm, ID_AA64ISAR2_EL1, MOPS, IMP))
vcpu->arch.hcrx_el2 |= (HCRX_EL2_MSCEn | HCRX_EL2_MCE2);

if (kvm_has_tcr2(kvm))
vcpu->arch.hcrx_el2 |= HCRX_EL2_TCR2En;

if (kvm_has_fpmr(kvm))
vcpu->arch.hcrx_el2 |= HCRX_EL2_EnFPM;
}
}
#endif /* __ARM64_KVM_EMULATE_H__ */