KVM: VMX: Don't fudge CR0 and CR4 for restricted L2 guest
authorSean Christopherson <seanjc@google.com>
Tue, 13 Jun 2023 20:30:36 +0000 (13:30 -0700)
committerPaolo Bonzini <pbonzini@redhat.com>
Sat, 29 Jul 2023 15:05:32 +0000 (11:05 -0400)
Stuff CR0 and/or CR4 to be compliant with a restricted guest if and only
if KVM itself is not configured to utilize unrestricted guests, i.e. don't
stuff CR0/CR4 for a restricted L2 that is running as the guest of an
unrestricted L1.  Any attempt to VM-Enter a restricted guest with invalid
CR0/CR4 values should fail, i.e. in a nested scenario, KVM (as L0) should
never observe a restricted L2 with incompatible CR0/CR4, since nested
VM-Enter from L1 should have failed.

And if KVM does observe an active, restricted L2 with incompatible state,
e.g. due to a KVM bug, fudging CR0/CR4 instead of letting VM-Enter fail
does more harm than good, as KVM will often neglect to undo the side
effects, e.g. won't clear rmode.vm86_active on nested VM-Exit, and thus
the damage can easily spill over to L1.  On the other hand, letting
VM-Enter fail due to bad guest state is more likely to contain the damage
to L2 as KVM relies on hardware to perform most guest state consistency
checks, i.e. KVM needs to be able to reflect a failed nested VM-Enter into
L1 irrespective of (un)restricted guest behavior.

Cc: Jim Mattson <jmattson@google.com>
Cc: stable@vger.kernel.org
Fixes: bddd82d19e2e ("KVM: nVMX: KVM needs to unset "unrestricted guest" VM-execution control in vmcs02 if vmcs12 doesn't set it")
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20230613203037.1968489-3-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
arch/x86/kvm/vmx/vmx.c

index 3d011d62d96969a8c1c8d2354e78fc86b2e21ce8..df461f387e20d40dd6d1bee8f37553133574b8e2 100644 (file)
@@ -1513,6 +1513,11 @@ void vmx_set_rflags(struct kvm_vcpu *vcpu, unsigned long rflags)
        struct vcpu_vmx *vmx = to_vmx(vcpu);
        unsigned long old_rflags;
 
+       /*
+        * Unlike CR0 and CR4, RFLAGS handling requires checking if the vCPU
+        * is an unrestricted guest in order to mark L2 as needing emulation
+        * if L1 runs L2 as a restricted guest.
+        */
        if (is_unrestricted_guest(vcpu)) {
                kvm_register_mark_available(vcpu, VCPU_EXREG_RFLAGS);
                vmx->rflags = rflags;
@@ -3265,7 +3270,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
        old_cr0_pg = kvm_read_cr0_bits(vcpu, X86_CR0_PG);
 
        hw_cr0 = (cr0 & ~KVM_VM_CR0_ALWAYS_OFF);
-       if (is_unrestricted_guest(vcpu))
+       if (enable_unrestricted_guest)
                hw_cr0 |= KVM_VM_CR0_ALWAYS_ON_UNRESTRICTED_GUEST;
        else {
                hw_cr0 |= KVM_VM_CR0_ALWAYS_ON;
@@ -3293,7 +3298,7 @@ void vmx_set_cr0(struct kvm_vcpu *vcpu, unsigned long cr0)
        }
 #endif
 
-       if (enable_ept && !is_unrestricted_guest(vcpu)) {
+       if (enable_ept && !enable_unrestricted_guest) {
                /*
                 * Ensure KVM has an up-to-date snapshot of the guest's CR3.  If
                 * the below code _enables_ CR3 exiting, vmx_cache_reg() will
@@ -3424,7 +3429,7 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
         * this bit, even if host CR4.MCE == 0.
         */
        hw_cr4 = (cr4_read_shadow() & X86_CR4_MCE) | (cr4 & ~X86_CR4_MCE);
-       if (is_unrestricted_guest(vcpu))
+       if (enable_unrestricted_guest)
                hw_cr4 |= KVM_VM_CR4_ALWAYS_ON_UNRESTRICTED_GUEST;
        else if (vmx->rmode.vm86_active)
                hw_cr4 |= KVM_RMODE_VM_CR4_ALWAYS_ON;
@@ -3444,7 +3449,7 @@ void vmx_set_cr4(struct kvm_vcpu *vcpu, unsigned long cr4)
        vcpu->arch.cr4 = cr4;
        kvm_register_mark_available(vcpu, VCPU_EXREG_CR4);
 
-       if (!is_unrestricted_guest(vcpu)) {
+       if (!enable_unrestricted_guest) {
                if (enable_ept) {
                        if (!is_paging(vcpu)) {
                                hw_cr4 &= ~X86_CR4_PAE;