KVM: Document KVM_PRE_FAULT_MEMORY ioctl
Add documentation for the KVM_PRE_FAULT_MEMORY ioctl. [1]

It populates guest memory.  It does not perform any technology-specific
initialization of the underlying memory [2].  For example, CoCo-related
operations won't be performed.  Concretely for TDX, this API won't invoke
TDH.MEM.PAGE.ADD() or TDH.MR.EXTEND().  Vendor-specific APIs are required
for such operations.

The key point is to adopt a vcpu ioctl instead of a VM ioctl.  First,
populating guest memory requires a vcpu.  If it were a VM ioctl, we would
need to pick one vcpu somehow.  Secondly, a vcpu ioctl allows each vcpu to
invoke this ioctl in parallel.  This helps to scale with guest memory size,
e.g., hundreds of GB.

[1] https://lore.kernel.org/kvm/Zbrj5WKVgMsUFDtb@google.com/
[2] https://lore.kernel.org/kvm/Ze-TJh0BBOWm9spT@google.com/

Suggested-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Message-ID: <9a060293c9ad9a78f1d8994cfe1311e818e99257.1712785629.git.isaku.yamahata@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Isaku Yamahata authored and Paolo Bonzini committed Jul 12, 2024
1 parent 02b0d3b commit 9aed7a6
Showing 1 changed file with 55 additions and 0 deletions.
55 changes: 55 additions & 0 deletions Documentation/virt/kvm/api.rst
@@ -6352,6 +6352,61 @@ a single guest_memfd file, but the bound ranges must not overlap).

See KVM_SET_USER_MEMORY_REGION2 for additional details.

4.143 KVM_PRE_FAULT_MEMORY
------------------------

:Capability: KVM_CAP_PRE_FAULT_MEMORY
:Architectures: none
:Type: vcpu ioctl
:Parameters: struct kvm_pre_fault_memory (in/out)
:Returns: 0 if at least one page is processed, < 0 on error

Errors:

  ========== ===============================================================
  EINVAL     The specified `gpa` and `size` were invalid (e.g. not
             page aligned, causes an overflow, or size is zero).
  ENOENT     The specified `gpa` is outside defined memslots.
  EINTR      An unmasked signal is pending and no page was processed.
  EFAULT     The parameter address was invalid.
  EOPNOTSUPP Mapping memory for a GPA is unsupported by the
             hypervisor, and/or for the current vCPU state/mode.
  EIO        unexpected error conditions (also causes a WARN)
  ========== ===============================================================
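
The EINVAL conditions in the table can be checked in userspace before the ioctl is issued. The following is a minimal sketch, not KVM's own check: the helper name and the 4 KiB page size are assumptions made for illustration, and the kernel remains the authority on what it accepts.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. Real code should query the host page size. */
#define PAGE_SIZE_ASSUMED 4096ULL

/*
 * Hypothetical helper mirroring the documented EINVAL conditions:
 * size is zero, gpa or size is not page aligned, or gpa + size overflows.
 */
bool pre_fault_range_looks_valid(uint64_t gpa, uint64_t size)
{
	if (size == 0)
		return false;
	if ((gpa | size) & (PAGE_SIZE_ASSUMED - 1))
		return false;            /* gpa or size not page aligned */
	if (gpa + size < gpa)
		return false;            /* 64-bit wraparound */
	return true;
}
```

A range that passes this check can still fail with ENOENT or EOPNOTSUPP, since those depend on memslots and vCPU state that only the kernel knows.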

::

  struct kvm_pre_fault_memory {
	/* in/out */
	__u64 gpa;
	__u64 size;
	/* in */
	__u64 flags;
	__u64 padding[5];
  };

KVM_PRE_FAULT_MEMORY populates KVM's stage-2 page tables used to map memory
for the current vCPU state. KVM maps memory as if the vCPU generated a
stage-2 read page fault, e.g. faults in memory as needed, but doesn't break
CoW. However, KVM does not mark any newly created stage-2 PTE as Accessed.

In some cases, multiple vCPUs might share the page tables. In this
case, the ioctl can be called in parallel.
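
One way to exploit parallel calls is to give each vCPU thread its own page-aligned slice of the range. The sketch below only computes the slices. The helper name, the struct, and the 4 KiB page size are assumptions for illustration, and it expects `gpa` and `size` to be page aligned and `nr` to be nonzero.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. Real code should query the host page size. */
#define PAGE_SIZE_ASSUMED 4096ULL

/* Hypothetical description of the sub-range handed to one vCPU thread. */
struct gpa_chunk {
	uint64_t gpa;
	uint64_t size;
};

/*
 * Return the slice of [gpa, gpa + size) that worker `idx` out of `nr`
 * should pre-fault. Trailing workers may get size == 0 and should then
 * skip the ioctl, since a zero size is rejected with EINVAL.
 */
struct gpa_chunk chunk_for_vcpu(uint64_t gpa, uint64_t size,
				unsigned int idx, unsigned int nr)
{
	uint64_t pages = size / PAGE_SIZE_ASSUMED;
	uint64_t per = (pages + nr - 1) / nr;   /* pages per worker, rounded up */
	uint64_t first = (uint64_t)idx * per;
	struct gpa_chunk c = { 0, 0 };
	uint64_t count;

	if (first >= pages)
		return c;                       /* nothing left for this worker */

	count = pages - first < per ? pages - first : per;
	c.gpa = gpa + first * PAGE_SIZE_ASSUMED;
	c.size = count * PAGE_SIZE_ASSUMED;
	return c;
}
```

Each thread would then pass its slice to KVM_PRE_FAULT_MEMORY on its own vCPU file descriptor.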

When the ioctl returns, the input values are updated to point to the
remaining range. If `size` > 0 on return, the caller can just issue
the ioctl again with the same `struct kvm_pre_fault_memory` argument.
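
The re-issue pattern can be sketched as a small loop. This is an illustration under stated assumptions, not reference code: the struct is a local mirror of the layout shown above so the sketch builds without recent kernel headers, `pre_fault_fn` is a hypothetical hook standing in for `ioctl(vcpu_fd, KVM_PRE_FAULT_MEMORY, &range)`, and `fake_pre_fault` is a stand-in backend that exists only to exercise the loop without /dev/kvm.

```c
#include <errno.h>
#include <stdint.h>

/* Local mirror of the documented layout; real code would include
 * <linux/kvm.h> and use struct kvm_pre_fault_memory directly. */
struct pre_fault_range {
	uint64_t gpa;
	uint64_t size;
	uint64_t flags;
	uint64_t padding[5];
};

/* Hypothetical hook standing in for the ioctl: returns 0 when at least
 * one page was processed, -1 with errno set on failure. */
typedef int (*pre_fault_fn)(void *ctx, struct pre_fault_range *range);

/* Re-issue the call until nothing is left to map. */
int pre_fault_all(pre_fault_fn fn, void *ctx, uint64_t gpa, uint64_t size)
{
	struct pre_fault_range range = { .gpa = gpa, .size = size };

	while (range.size > 0) {
		if (fn(ctx, &range) == 0)
			continue;        /* gpa/size now describe the remainder */
		if (errno == EINTR)
			continue;        /* no page processed; retry same range */
		return -errno;
	}
	return 0;
}

/* Stand-in backend: "maps" one 4 KiB page per call and counts the calls. */
int fake_pre_fault(void *ctx, struct pre_fault_range *range)
{
	unsigned int *calls = ctx;

	(*calls)++;
	range->gpa += 4096;
	range->size -= 4096;
	return 0;
}
```

Retrying immediately on EINTR is a simplification. A real caller may prefer to return to its main loop and handle the pending signal before resuming with the same argument.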

Shadow page tables cannot support this ioctl because they
are indexed by virtual address or nested guest physical address.
Calling this ioctl when the guest is using shadow page tables (for
example because it is running a nested guest with nested page tables)
will fail with `EOPNOTSUPP` even if `KVM_CHECK_EXTENSION` reports
the capability to be present.

`flags` must currently be zero.


5. The kvm_run structure
========================
