Skip to content

Commit

Permalink
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Browse files Browse the repository at this point in the history
Pull first set of KVM updates from Paolo Bonzini:
 "PPC:
   - minor code cleanups

  x86:
   - PCID emulation and CR3 caching for shadow page tables
   - nested VMX live migration
   - nested VMCS shadowing
   - optimized IPI hypercall
   - some optimizations

  ARM will come next week"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (85 commits)
  kvm: x86: Set highest physical address bits in non-present/reserved SPTEs
  KVM/x86: Use CC_SET()/CC_OUT in arch/x86/kvm/vmx.c
  KVM: X86: Implement PV IPIs in linux guest
  KVM: X86: Add kvm hypervisor init time platform setup callback
  KVM: X86: Implement "send IPI" hypercall
  KVM/x86: Move X86_CR4_OSXSAVE check into kvm_valid_sregs()
  KVM: x86: Skip pae_root shadow allocation if tdp enabled
  KVM/MMU: Combine flushing remote tlb in mmu_set_spte()
  KVM: vmx: skip VMWRITE of HOST_{FS,GS}_BASE when possible
  KVM: vmx: skip VMWRITE of HOST_{FS,GS}_SEL when possible
  KVM: vmx: always initialize HOST_{FS,GS}_BASE to zero during setup
  KVM: vmx: move struct host_state usage to struct loaded_vmcs
  KVM: vmx: compute need to reload FS/GS/LDT on demand
  KVM: nVMX: remove a misleading comment regarding vmcs02 fields
  KVM: vmx: rename __vmx_load_host_state() and vmx_save_host_state()
  KVM: vmx: add dedicated utility to access guest's kernel_gs_base
  KVM: vmx: track host_state.loaded using a loaded_vmcs pointer
  KVM: vmx: refactor segmentation code in vmx_save_host_state()
  kvm: nVMX: Fix fault priority for VMX operations
  kvm: nVMX: Fix fault vector for VMX operation at CPL > 0
  ...
  • Loading branch information
Linus Torvalds committed Aug 19, 2018
2 parents 1009aa1 + 28a1f3a commit e61cf2e
Show file tree
Hide file tree
Showing 53 changed files with 3,110 additions and 726 deletions.
56 changes: 56 additions & 0 deletions Documentation/virtual/kvm/api.txt
Original file line number Diff line number Diff line change
Expand Up @@ -3561,6 +3561,62 @@ Returns: 0 on success,
-ENOENT on deassign if the conn_id isn't registered
-EEXIST on assign if the conn_id is already registered

4.114 KVM_GET_NESTED_STATE

Capability: KVM_CAP_NESTED_STATE
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_nested_state (in/out)
Returns: 0 on success, -1 on error
Errors:
E2BIG: the total state size (including the fixed-size part of struct
kvm_nested_state) exceeds the value of 'size' specified by
the user; the size required will be written into size.

struct kvm_nested_state {
__u16 flags;
__u16 format;
__u32 size;
union {
struct kvm_vmx_nested_state vmx;
struct kvm_svm_nested_state svm;
__u8 pad[120];
};
__u8 data[0];
};

#define KVM_STATE_NESTED_GUEST_MODE 0x00000001
#define KVM_STATE_NESTED_RUN_PENDING 0x00000002

#define KVM_STATE_NESTED_SMM_GUEST_MODE 0x00000001
#define KVM_STATE_NESTED_SMM_VMXON 0x00000002

struct kvm_vmx_nested_state {
__u64 vmxon_pa;
__u64 vmcs_pa;

struct {
__u16 flags;
} smm;
};

This ioctl copies the vcpu's nested virtualization state from the kernel to
userspace.

The maximum size of the state, including the fixed-size part of struct
kvm_nested_state, can be retrieved by passing KVM_CAP_NESTED_STATE to
the KVM_CHECK_EXTENSION ioctl().

4.115 KVM_SET_NESTED_STATE

Capability: KVM_CAP_NESTED_STATE
Architectures: x86
Type: vcpu ioctl
Parameters: struct kvm_nested_state (in)
Returns: 0 on success, -1 on error

This copies the vcpu's kvm_nested_state struct from userspace to the kernel. For
the definition of struct kvm_nested_state, see KVM_GET_NESTED_STATE.

5. The kvm_run structure
------------------------
Expand Down
4 changes: 4 additions & 0 deletions Documentation/virtual/kvm/cpuid.txt
Original file line number Diff line number Diff line change
Expand Up @@ -62,6 +62,10 @@ KVM_FEATURE_ASYNC_PF_VMEXIT || 10 || paravirtualized async PF VM exit
|| || can be enabled by setting bit 2
|| || when writing to msr 0x4b564d02
------------------------------------------------------------------------------
KVM_FEATURE_PV_SEND_IPI || 11 || guest checks this feature bit
|| || before using paravirtualized
|| || send IPIs.
------------------------------------------------------------------------------
KVM_FEATURE_CLOCKSOURCE_STABLE_BIT || 24 || host will warn if no guest-side
|| || per-cpu warps are expected in
|| || kvmclock.
Expand Down
20 changes: 20 additions & 0 deletions Documentation/virtual/kvm/hypercalls.txt
Original file line number Diff line number Diff line change
Expand Up @@ -121,3 +121,23 @@ compute the CLOCK_REALTIME for its clock, at the same instant.

Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.

6. KVM_HC_SEND_IPI
------------------------
Architecture: x86
Status: active
Purpose: Send IPIs to multiple vCPUs.

a0: lower part of the bitmap of destination APIC IDs
a1: higher part of the bitmap of destination APIC IDs
a2: the lowest APIC ID in bitmap
a3: APIC ICR

The hypercall lets a guest send multicast IPIs, with at most 128
128 destinations per hypercall in 64-bit mode and 64 vCPUs per
hypercall in 32-bit mode. The destinations are represented by a
bitmap contained in the first two arguments (a0 and a1). Bit 0 of
a0 corresponds to the APIC ID in the third argument (a2), bit 1
corresponds to the APIC ID a2+1, and so on.

Returns the number of CPUs to which the IPIs were delivered successfully.
47 changes: 47 additions & 0 deletions arch/powerpc/include/asm/kvm_book3s.h
Original file line number Diff line number Diff line change
Expand Up @@ -390,4 +390,51 @@ extern int kvmppc_h_logical_ci_store(struct kvm_vcpu *vcpu);
#define SPLIT_HACK_MASK 0xff000000
#define SPLIT_HACK_OFFS 0xfb000000

/*
* This packs a VCPU ID from the [0..KVM_MAX_VCPU_ID) space down to the
* [0..KVM_MAX_VCPUS) space, using knowledge of the guest's core stride
* (but not its actual threading mode, which is not available) to avoid
* collisions.
*
* The implementation leaves VCPU IDs from the range [0..KVM_MAX_VCPUS) (block
* 0) unchanged: if the guest is filling each VCORE completely then it will be
* using consecutive IDs and it will fill the space without any packing.
*
* For higher VCPU IDs, the packed ID is based on the VCPU ID modulo
* KVM_MAX_VCPUS (effectively masking off the top bits) and then an offset is
* added to avoid collisions.
*
* VCPU IDs in the range [KVM_MAX_VCPUS..(KVM_MAX_VCPUS*2)) (block 1) are only
* possible if the guest is leaving at least 1/2 of each VCORE empty, so IDs
* can be safely packed into the second half of each VCORE by adding an offset
* of (stride / 2).
*
* Similarly, if VCPU IDs in the range [(KVM_MAX_VCPUS*2)..(KVM_MAX_VCPUS*4))
* (blocks 2 and 3) are seen, the guest must be leaving at least 3/4 of each
* VCORE empty so packed IDs can be offset by (stride / 4) and (stride * 3 / 4).
*
* Finally, VCPU IDs from blocks 5..7 will only be seen if the guest is using a
* stride of 8 and 1 thread per core so the remaining offsets of 1, 5, 3 and 7
* must be free to use.
*
* (The offsets for each block are stored in block_offsets[], indexed by the
* block number if the stride is 8. For cases where the guest's stride is less
* than 8, we can re-use the block_offsets array by multiplying the block
* number by (MAX_SMT_THREADS / stride) to reach the correct entry.)
*/
static inline u32 kvmppc_pack_vcpu_id(struct kvm *kvm, u32 id)
{
const int block_offsets[MAX_SMT_THREADS] = {0, 4, 2, 6, 1, 5, 3, 7};
int stride = kvm->arch.emul_smt_mode;
int block = (id / KVM_MAX_VCPUS) * (MAX_SMT_THREADS / stride);
u32 packed_id;

if (WARN_ONCE(block >= MAX_SMT_THREADS, "VCPU ID too large to pack"))
return 0;
packed_id = (id % KVM_MAX_VCPUS) + block_offsets[block];
if (WARN_ONCE(packed_id >= KVM_MAX_VCPUS, "VCPU ID packing failed"))
return 0;
return packed_id;
}

#endif /* __ASM_KVM_BOOK3S_H__ */
26 changes: 16 additions & 10 deletions arch/powerpc/include/asm/kvm_host.h
Original file line number Diff line number Diff line change
Expand Up @@ -42,7 +42,14 @@
#define KVM_USER_MEM_SLOTS 512

#include <asm/cputhreads.h>
#define KVM_MAX_VCPU_ID (threads_per_subcore * KVM_MAX_VCORES)

#ifdef CONFIG_KVM_BOOK3S_HV_POSSIBLE
#include <asm/kvm_book3s_asm.h> /* for MAX_SMT_THREADS */
#define KVM_MAX_VCPU_ID (MAX_SMT_THREADS * KVM_MAX_VCORES)

#else
#define KVM_MAX_VCPU_ID KVM_MAX_VCPUS
#endif /* CONFIG_KVM_BOOK3S_HV_POSSIBLE */

#define __KVM_HAVE_ARCH_INTC_INITIALIZED

Expand Down Expand Up @@ -672,7 +679,7 @@ struct kvm_vcpu_arch {
gva_t vaddr_accessed;
pgd_t *pgdir;

u8 io_gpr; /* GPR used as IO source/target */
u16 io_gpr; /* GPR used as IO source/target */
u8 mmio_host_swabbed;
u8 mmio_sign_extend;
/* conversion between single and double precision */
Expand All @@ -688,7 +695,6 @@ struct kvm_vcpu_arch {
*/
u8 mmio_vsx_copy_nums;
u8 mmio_vsx_offset;
u8 mmio_vsx_tx_sx_enabled;
u8 mmio_vmx_copy_nums;
u8 mmio_vmx_offset;
u8 mmio_copy_type;
Expand Down Expand Up @@ -801,14 +807,14 @@ struct kvm_vcpu_arch {
#define KVMPPC_VCPU_BUSY_IN_HOST 2

/* Values for vcpu->arch.io_gpr */
#define KVM_MMIO_REG_MASK 0x001f
#define KVM_MMIO_REG_EXT_MASK 0xffe0
#define KVM_MMIO_REG_MASK 0x003f
#define KVM_MMIO_REG_EXT_MASK 0xffc0
#define KVM_MMIO_REG_GPR 0x0000
#define KVM_MMIO_REG_FPR 0x0020
#define KVM_MMIO_REG_QPR 0x0040
#define KVM_MMIO_REG_FQPR 0x0060
#define KVM_MMIO_REG_VSX 0x0080
#define KVM_MMIO_REG_VMX 0x00c0
#define KVM_MMIO_REG_FPR 0x0040
#define KVM_MMIO_REG_QPR 0x0080
#define KVM_MMIO_REG_FQPR 0x00c0
#define KVM_MMIO_REG_VSX 0x0100
#define KVM_MMIO_REG_VMX 0x0180

#define __KVM_HAVE_ARCH_WQP
#define __KVM_HAVE_CREATE_DEVICE
Expand Down
2 changes: 1 addition & 1 deletion arch/powerpc/include/asm/reg.h
Original file line number Diff line number Diff line change
Expand Up @@ -163,7 +163,7 @@
#define PSSCR_ESL 0x00200000 /* Enable State Loss */
#define PSSCR_SD 0x00400000 /* Status Disable */
#define PSSCR_PLS 0xf000000000000000 /* Power-saving Level Status */
#define PSSCR_GUEST_VIS 0xf0000000000003ff /* Guest-visible PSSCR fields */
#define PSSCR_GUEST_VIS 0xf0000000000003ffUL /* Guest-visible PSSCR fields */
#define PSSCR_FAKE_SUSPEND 0x00000400 /* Fake-suspend bit (P9 DD2.2) */
#define PSSCR_FAKE_SUSPEND_LG 10 /* Fake-suspend bit position */

Expand Down
5 changes: 2 additions & 3 deletions arch/powerpc/kvm/book3s_64_vio.c
Original file line number Diff line number Diff line change
Expand Up @@ -179,7 +179,7 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
if ((tbltmp->it_page_shift <= stt->page_shift) &&
(tbltmp->it_offset << tbltmp->it_page_shift ==
stt->offset << stt->page_shift) &&
(tbltmp->it_size << tbltmp->it_page_shift ==
(tbltmp->it_size << tbltmp->it_page_shift >=
stt->size << stt->page_shift)) {
/*
* Reference the table to avoid races with
Expand Down Expand Up @@ -295,15 +295,14 @@ long kvm_vm_ioctl_create_spapr_tce(struct kvm *kvm,
{
struct kvmppc_spapr_tce_table *stt = NULL;
struct kvmppc_spapr_tce_table *siter;
unsigned long npages, size;
unsigned long npages, size = args->size;
int ret = -ENOMEM;
int i;

if (!args->size || args->page_shift < 12 || args->page_shift > 34 ||
(args->offset + args->size > (ULLONG_MAX >> args->page_shift)))
return -EINVAL;

size = _ALIGN_UP(args->size, PAGE_SIZE >> 3);
npages = kvmppc_tce_pages(size);
ret = kvmppc_account_memlimit(kvmppc_stt_pages(npages), true);
if (ret)
Expand Down
42 changes: 29 additions & 13 deletions arch/powerpc/kvm/book3s_hv.c
Original file line number Diff line number Diff line change
Expand Up @@ -127,14 +127,14 @@ static int kvmppc_hv_setup_htab_rma(struct kvm_vcpu *vcpu);
* and SPURR count and should be set according to the number of
* online threads in the vcore being run.
*/
#define RWMR_RPA_P8_1THREAD 0x164520C62609AECA
#define RWMR_RPA_P8_2THREAD 0x7FFF2908450D8DA9
#define RWMR_RPA_P8_3THREAD 0x164520C62609AECA
#define RWMR_RPA_P8_4THREAD 0x199A421245058DA9
#define RWMR_RPA_P8_5THREAD 0x164520C62609AECA
#define RWMR_RPA_P8_6THREAD 0x164520C62609AECA
#define RWMR_RPA_P8_7THREAD 0x164520C62609AECA
#define RWMR_RPA_P8_8THREAD 0x164520C62609AECA
#define RWMR_RPA_P8_1THREAD 0x164520C62609AECAUL
#define RWMR_RPA_P8_2THREAD 0x7FFF2908450D8DA9UL
#define RWMR_RPA_P8_3THREAD 0x164520C62609AECAUL
#define RWMR_RPA_P8_4THREAD 0x199A421245058DA9UL
#define RWMR_RPA_P8_5THREAD 0x164520C62609AECAUL
#define RWMR_RPA_P8_6THREAD 0x164520C62609AECAUL
#define RWMR_RPA_P8_7THREAD 0x164520C62609AECAUL
#define RWMR_RPA_P8_8THREAD 0x164520C62609AECAUL

static unsigned long p8_rwmr_values[MAX_SMT_THREADS + 1] = {
RWMR_RPA_P8_1THREAD,
Expand Down Expand Up @@ -1807,7 +1807,7 @@ static int threads_per_vcore(struct kvm *kvm)
return threads_per_subcore;
}

static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int core)
static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int id)
{
struct kvmppc_vcore *vcore;

Expand All @@ -1821,7 +1821,7 @@ static struct kvmppc_vcore *kvmppc_vcore_create(struct kvm *kvm, int core)
init_swait_queue_head(&vcore->wq);
vcore->preempt_tb = TB_NIL;
vcore->lpcr = kvm->arch.lpcr;
vcore->first_vcpuid = core * kvm->arch.smt_mode;
vcore->first_vcpuid = id;
vcore->kvm = kvm;
INIT_LIST_HEAD(&vcore->preempt_list);

Expand Down Expand Up @@ -2037,12 +2037,26 @@ static struct kvm_vcpu *kvmppc_core_vcpu_create_hv(struct kvm *kvm,
mutex_lock(&kvm->lock);
vcore = NULL;
err = -EINVAL;
core = id / kvm->arch.smt_mode;
if (cpu_has_feature(CPU_FTR_ARCH_300)) {
if (id >= (KVM_MAX_VCPUS * kvm->arch.emul_smt_mode)) {
pr_devel("KVM: VCPU ID too high\n");
core = KVM_MAX_VCORES;
} else {
BUG_ON(kvm->arch.smt_mode != 1);
core = kvmppc_pack_vcpu_id(kvm, id);
}
} else {
core = id / kvm->arch.smt_mode;
}
if (core < KVM_MAX_VCORES) {
vcore = kvm->arch.vcores[core];
if (!vcore) {
if (vcore && cpu_has_feature(CPU_FTR_ARCH_300)) {
pr_devel("KVM: collision on id %u", id);
vcore = NULL;
} else if (!vcore) {
err = -ENOMEM;
vcore = kvmppc_vcore_create(kvm, core);
vcore = kvmppc_vcore_create(kvm,
id & ~(kvm->arch.smt_mode - 1));
kvm->arch.vcores[core] = vcore;
kvm->arch.online_vcores++;
}
Expand Down Expand Up @@ -4550,6 +4564,8 @@ static int kvmppc_book3s_init_hv(void)
pr_err("KVM-HV: Cannot determine method for accessing XICS\n");
return -ENODEV;
}
/* presence of intc confirmed - node can be dropped again */
of_node_put(np);
}
#endif

Expand Down
19 changes: 12 additions & 7 deletions arch/powerpc/kvm/book3s_xive.c
Original file line number Diff line number Diff line change
Expand Up @@ -317,6 +317,11 @@ static int xive_select_target(struct kvm *kvm, u32 *server, u8 prio)
return -EBUSY;
}

static u32 xive_vp(struct kvmppc_xive *xive, u32 server)
{
return xive->vp_base + kvmppc_pack_vcpu_id(xive->kvm, server);
}

static u8 xive_lock_and_mask(struct kvmppc_xive *xive,
struct kvmppc_xive_src_block *sb,
struct kvmppc_xive_irq_state *state)
Expand Down Expand Up @@ -362,7 +367,7 @@ static u8 xive_lock_and_mask(struct kvmppc_xive *xive,
*/
if (xd->flags & OPAL_XIVE_IRQ_MASK_VIA_FW) {
xive_native_configure_irq(hw_num,
xive->vp_base + state->act_server,
xive_vp(xive, state->act_server),
MASKED, state->number);
/* set old_p so we can track if an H_EOI was done */
state->old_p = true;
Expand Down Expand Up @@ -418,7 +423,7 @@ static void xive_finish_unmask(struct kvmppc_xive *xive,
*/
if (xd->flags & OPAL_XIVE_IRQ_MASK_VIA_FW) {
xive_native_configure_irq(hw_num,
xive->vp_base + state->act_server,
xive_vp(xive, state->act_server),
state->act_priority, state->number);
/* If an EOI is needed, do it here */
if (!state->old_p)
Expand Down Expand Up @@ -495,7 +500,7 @@ static int xive_target_interrupt(struct kvm *kvm,
kvmppc_xive_select_irq(state, &hw_num, NULL);

return xive_native_configure_irq(hw_num,
xive->vp_base + server,
xive_vp(xive, server),
prio, state->number);
}

Expand Down Expand Up @@ -883,7 +888,7 @@ int kvmppc_xive_set_mapped(struct kvm *kvm, unsigned long guest_irq,
* which is fine for a never started interrupt.
*/
xive_native_configure_irq(hw_irq,
xive->vp_base + state->act_server,
xive_vp(xive, state->act_server),
state->act_priority, state->number);

/*
Expand Down Expand Up @@ -959,7 +964,7 @@ int kvmppc_xive_clr_mapped(struct kvm *kvm, unsigned long guest_irq,

/* Reconfigure the IPI */
xive_native_configure_irq(state->ipi_number,
xive->vp_base + state->act_server,
xive_vp(xive, state->act_server),
state->act_priority, state->number);

/*
Expand Down Expand Up @@ -1084,7 +1089,7 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
pr_devel("Duplicate !\n");
return -EEXIST;
}
if (cpu >= KVM_MAX_VCPUS) {
if (cpu >= (KVM_MAX_VCPUS * vcpu->kvm->arch.emul_smt_mode)) {
pr_devel("Out of bounds !\n");
return -EINVAL;
}
Expand All @@ -1098,7 +1103,7 @@ int kvmppc_xive_connect_vcpu(struct kvm_device *dev,
xc->xive = xive;
xc->vcpu = vcpu;
xc->server_num = cpu;
xc->vp_id = xive->vp_base + cpu;
xc->vp_id = xive_vp(xive, cpu);
xc->mfrr = 0xff;
xc->valid = true;

Expand Down
Loading

0 comments on commit e61cf2e

Please sign in to comment.