| Message ID | 20240709175145.9986-4-manali.shukla@amd.com |
|---|---|
| State | New |
| Series | Add support for the Bus Lock Threshold |
On Tue, Jul 09, 2024, Manali Shukla wrote:
> Expose the Bus Lock Threshold in the guest CPUID and support its
> functionality in nested guest.

Why?  This is a rather messy feature to support in a nested setup, and I'd much
prefer to not open that can of worms unless there's a very good reason to do so.

> Ensure proper restoration and saving of the bus_lock_counter at VM
> Entry and VM Exit respectively in nested guest scenarios.
>
> Case 1:
> L0 supports buslock exit and L1 does not: use buslock counter from L0
> and exits happen to L0 VMM.
>
> Case 2:
> Both L0 and L1 supports buslock exit: use L1 buslock counter value and
> exits happen to L1 VMM.

Yeah, no.  L1 wants to attack the host, so it runs L2 with buslock detection
enabled, but the highest possible threshold.  Game over.

If we take the min between the two, then we have to track the delta and shove
_that_ into the VMCB.  E.g. L1 wants every 4, L0 wants every 5.  After 4 locks,
KVM synthesizes a nested VM-Exit.  Then on nested VMRUN, KVM needs to remember
it should run L2 with a threshold of 1.

If we really want to support virtualizing bus lock detection for L1, the easiest
approach would be to do so if and only if it's NOT in use by L0.  But IMO that's
not worth doing.
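[Editor's note: a rough sketch of the "take the min, remember the delta" bookkeeping described above. The helper and parameter names (vmcb02_bus_lock_counter, l0_budget, l1_counter) are hypothetical and not part of this series.]

	/*
	 * Illustrative only: run L2 with whichever budget runs out first.
	 * E.g. with l1_counter = 4 and l0_budget = 5, the nested VM-Exit
	 * synthesized after 4 locks leaves l0_budget = 1, so the next
	 * nested VMRUN must program the VMCB with min(l1_counter, 1).
	 */
	static u16 vmcb02_bus_lock_counter(u16 l0_budget, u16 l1_counter)
	{
		return l0_budget < l1_counter ? l0_budget : l1_counter;
	}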
On Tue, Jul 09, 2024, Manali Shukla wrote:
> @@ -758,6 +759,16 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
> 		}
> 	}
>
> +	/*
> +	 * If guest intercepts BUSLOCK, use guest's bus_lock_counter value,
> +	 * otherwise use host bus_lock_counter value.
> +	 */
> +	if (guest_can_use(vcpu, X86_FEATURE_BUS_LOCK_THRESHOLD) &&
> +	    vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_BUSLOCK))
> +		vmcb02->control.bus_lock_counter = svm->nested.ctl.bus_lock_counter;
> +	else
> +		vmcb02->control.bus_lock_counter = vmcb01->control.bus_lock_counter;

Copying vmcb01's count to/from vmcb02 belongs in the core enabling patch.  From
KVM's perspective, the counter is associated with a vCPU, not a VMCB, and so the
count should keep running across nested transitions.

As written, taking only the core enabling patch will mean that L2 runs with the
wrong count.  Amusingly, because '0' means "always exit", L2 would run in a *more*
restrictive environment due to the VMCB being zero-allocated.
On 8/17/2024 1:35 AM, Sean Christopherson wrote:
> On Tue, Jul 09, 2024, Manali Shukla wrote:
>> Expose the Bus Lock Threshold in the guest CPUID and support its
>> functionality in nested guest.
>
> Why?  This is a rather messy feature to support in a nested setup, and I'd much
> prefer to not open that can of worms unless there's a very good reason to do so.

Agreed.

>> Ensure proper restoration and saving of the bus_lock_counter at VM
>> Entry and VM Exit respectively in nested guest scenarios.
>>
>> Case 1:
>> L0 supports buslock exit and L1 does not: use buslock counter from L0
>> and exits happen to L0 VMM.
>>
>> Case 2:
>> Both L0 and L1 supports buslock exit: use L1 buslock counter value and
>> exits happen to L1 VMM.
>
> Yeah, no.  L1 wants to attack the host, so it runs L2 with buslock detection
> enabled, but the highest possible threshold.  Game over.
>
> If we take the min between the two, then we have to track the delta and shove
> _that_ into the VMCB.  E.g. L1 wants every 4, L0 wants every 5.  After 4 locks,
> KVM synthesizes a nested VM-Exit.  Then on nested VMRUN, KVM needs to remember
> it should run L2 with a threshold of 1.
>
> If we really want to support virtualizing bus lock detection for L1, the easiest
> approach would be to do so if and only if it's NOT in use by L0.  But IMO that's
> not worth doing.

I will drop the nested implementation in V2.

-Manali
On 8/17/2024 1:44 AM, Sean Christopherson wrote:
> On Tue, Jul 09, 2024, Manali Shukla wrote:
>> @@ -758,6 +759,16 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
>> 		}
>> 	}
>>
>> +	/*
>> +	 * If guest intercepts BUSLOCK, use guest's bus_lock_counter value,
>> +	 * otherwise use host bus_lock_counter value.
>> +	 */
>> +	if (guest_can_use(vcpu, X86_FEATURE_BUS_LOCK_THRESHOLD) &&
>> +	    vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_BUSLOCK))
>> +		vmcb02->control.bus_lock_counter = svm->nested.ctl.bus_lock_counter;
>> +	else
>> +		vmcb02->control.bus_lock_counter = vmcb01->control.bus_lock_counter;
>
> Copying vmcb01's count to/from vmcb02 belongs in the core enabling patch.  From
> KVM's perspective, the counter is associated with a vCPU, not a VMCB, and so the
> count should keep running across nested transitions.
>
> As written, taking only the core enabling patch will mean that L2 runs with the
> wrong count.  Amusingly, because '0' means "always exit", L2 would run in a *more*
> restrictive environment due to the VMCB being zero-allocated.

Yeah.  From my testing, with core enabling patch + copying vmcb01's count to/from
vmcb02, L2 runs with correct value of bus lock counter and counter continues to
run across nested transitions.  The bus lock exit happens to L0 hypervisor when
buslock is generated from L2 guest.

- Manali
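[Editor's note: a minimal sketch of the vCPU-scoped handling discussed above, assuming only the vmcb->control.bus_lock_counter field added by the core enabling patch of this series; not the posted code.]

	/* In nested_vmcb02_prepare_control(), on nested VMRUN: carry the running
	 * count into vmcb02 so running L2 doesn't reset L0's budget. */
	vmcb02->control.bus_lock_counter = vmcb01->control.bus_lock_counter;

	/* In nested_svm_vmexit(): propagate whatever budget is left back to vmcb01. */
	vmcb01->control.bus_lock_counter = vmcb02->control.bus_lock_counter;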
diff --git a/arch/x86/kvm/governed_features.h b/arch/x86/kvm/governed_features.h
index ad463b1ed4e4..0982eb107f0b 100644
--- a/arch/x86/kvm/governed_features.h
+++ b/arch/x86/kvm/governed_features.h
@@ -17,6 +17,7 @@ KVM_GOVERNED_X86_FEATURE(PFTHRESHOLD)
 KVM_GOVERNED_X86_FEATURE(VGIF)
 KVM_GOVERNED_X86_FEATURE(VNMI)
 KVM_GOVERNED_X86_FEATURE(LAM)
+KVM_GOVERNED_X86_FEATURE(BUS_LOCK_THRESHOLD)
 
 #undef KVM_GOVERNED_X86_FEATURE
 #undef KVM_GOVERNED_FEATURE
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 6f704c1037e5..d09434225e4d 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -363,6 +363,7 @@ void __nested_copy_vmcb_control_to_cache(struct kvm_vcpu *vcpu,
 	to->virt_ext = from->virt_ext;
 	to->pause_filter_count = from->pause_filter_count;
 	to->pause_filter_thresh = from->pause_filter_thresh;
+	to->bus_lock_counter = from->bus_lock_counter;
 
 	/* Copy asid here because nested_vmcb_check_controls will check it. */
 	to->asid = from->asid;
@@ -758,6 +759,16 @@ static void nested_vmcb02_prepare_control(struct vcpu_svm *svm,
 		}
 	}
 
+	/*
+	 * If guest intercepts BUSLOCK, use guest's bus_lock_counter value,
+	 * otherwise use host bus_lock_counter value.
+	 */
+	if (guest_can_use(vcpu, X86_FEATURE_BUS_LOCK_THRESHOLD) &&
+	    vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_BUSLOCK))
+		vmcb02->control.bus_lock_counter = svm->nested.ctl.bus_lock_counter;
+	else
+		vmcb02->control.bus_lock_counter = vmcb01->control.bus_lock_counter;
+
 	nested_svm_transition_tlb_flush(vcpu);
 
 	/* Enter Guest-Mode */
@@ -1035,6 +1046,12 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
 	}
 
+	if (guest_can_use(vcpu, X86_FEATURE_BUS_LOCK_THRESHOLD) &&
+	    vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_BUSLOCK))
+		vmcb12->control.bus_lock_counter = vmcb02->control.bus_lock_counter;
+	else
+		vmcb01->control.bus_lock_counter = vmcb02->control.bus_lock_counter;
+
 	nested_svm_copy_common_state(svm->nested.vmcb02.ptr, svm->vmcb01.ptr);
 
 	svm_switch_vmcb(svm, &svm->vmcb01);
@@ -1333,6 +1350,13 @@ static int nested_svm_intercept(struct vcpu_svm *svm)
 		vmexit = NESTED_EXIT_DONE;
 		break;
 	}
+	case SVM_EXIT_BUS_LOCK: {
+		if (!(vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_BUSLOCK)))
+			vmexit = NESTED_EXIT_HOST;
+		else
+			vmexit = NESTED_EXIT_DONE;
+		break;
+	}
 	default: {
 		if (vmcb12_is_intercept(&svm->nested.ctl, exit_code))
 			vmexit = NESTED_EXIT_DONE;
@@ -1572,6 +1596,7 @@ static void nested_copy_vmcb_cache_to_control(struct vmcb_control_area *dst,
 	dst->virt_ext = from->virt_ext;
 	dst->pause_filter_count = from->pause_filter_count;
 	dst->pause_filter_thresh = from->pause_filter_thresh;
+	dst->bus_lock_counter = from->bus_lock_counter;
 
 	/* 'clean' and 'hv_enlightenments' are not changed by KVM */
 }
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9f1d51384eac..bb2437a7694c 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4373,6 +4373,8 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	set_msr_interception(vcpu, svm->msrpm, MSR_IA32_FLUSH_CMD, 0,
 			     !!guest_cpuid_has(vcpu, X86_FEATURE_FLUSH_L1D));
 
+	kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_BUS_LOCK_THRESHOLD);
+
 	if (cpu_feature_enabled(X86_FEATURE_BUS_LOCK_THRESHOLD) &&
 	    vcpu->kvm->arch.bus_lock_detection_enabled) {
 		svm_set_intercept(svm, INTERCEPT_BUSLOCK);
@@ -5183,6 +5185,9 @@ static __init void svm_set_cpu_caps(void)
 		if (vnmi)
 			kvm_cpu_cap_set(X86_FEATURE_VNMI);
 
+		if (cpu_feature_enabled(X86_FEATURE_BUS_LOCK_THRESHOLD))
+			kvm_cpu_cap_set(X86_FEATURE_BUS_LOCK_THRESHOLD);
+
 		/* Nested VM can receive #VMEXIT instead of triggering #GP */
 		kvm_cpu_cap_set(X86_FEATURE_SVME_ADDR_CHK);
 	}
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 8983eabf8f84..f49ea38187ba 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -150,6 +150,7 @@ struct vmcb_ctrl_area_cached {
 	u64 nested_cr3;
 	u64 virt_ext;
 	u32 clean;
+	u16 bus_lock_counter;
 	union {
 #if IS_ENABLED(CONFIG_HYPERV) || IS_ENABLED(CONFIG_KVM_HYPERV)
 		struct hv_vmcb_enlightenments hv_enlightenments;
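[Editor's note: for context on the vcpu->kvm->arch.bus_lock_detection_enabled check above, a hedged sketch of how a VMM opts in from userspace. The capability and exit names are the existing KVM API for bus lock detection; treat the snippet as illustrative, with error handling omitted.]

	#include <linux/kvm.h>
	#include <sys/ioctl.h>

	static int enable_bus_lock_exits(int vm_fd)
	{
		struct kvm_enable_cap cap = {
			.cap = KVM_CAP_X86_BUS_LOCK_EXIT,
			.args[0] = KVM_BUS_LOCK_DETECTION_EXIT,
		};

		/* On success, guest bus locks surface as KVM_EXIT_X86_BUS_LOCK. */
		return ioctl(vm_fd, KVM_ENABLE_CAP, &cap);
	}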
Expose the Bus Lock Threshold in the guest CPUID and support its
functionality in nested guest.

Ensure proper restoration and saving of the bus_lock_counter at VM
Entry and VM Exit respectively in nested guest scenarios.

Case 1:
L0 supports buslock exit and L1 does not: use buslock counter from L0
and exits happen to L0 VMM.

Case 2:
Both L0 and L1 supports buslock exit: use L1 buslock counter value and
exits happen to L1 VMM.

Signed-off-by: Manali Shukla <manali.shukla@amd.com>
---
 arch/x86/kvm/governed_features.h |  1 +
 arch/x86/kvm/svm/nested.c        | 25 +++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c           |  5 +++++
 arch/x86/kvm/svm/svm.h           |  1 +
 4 files changed, 32 insertions(+)