diff mbox series

[RFC,22/22] KVM: x86: Enable guest usage of X86_FEATURE_APERFMPERF

Message ID 20241121185315.3416855-23-mizhang@google.com
State New
Headers show
Series KVM: x86: Virtualize IA32_APERF and IA32_MPERF MSRs | expand

Commit Message

Mingwei Zhang Nov. 21, 2024, 6:53 p.m. UTC
Enable support for IA32_APERF/IA32_MPERF performance monitoring in KVM
guests. These MSRs allow guests to measure their effective CPU
frequency by comparing actual CPU cycles (APERF) against reference
cycles (MPERF).

Only expose X86_FEATURE_APERFMPERF to guests when the host has both
CONSTANT_TSC and NONSTOP_TSC. These features ensure the TSC frequency
remains stable across C-states and P-states, which is necessary for
"background" MPERF accounting.

Guest TSC scaling via KVM_SET_TSC_KHZ is not supported:
- On Intel, IA32_MPERF ticks at host rate regardless of guest TSC
  scaling, making passthrough impossible without intercepting reads
- On AMD, guest TSC scaling does affect IA32_MPERF reads, but handling
  it would significantly complicate cycle accounting

Record host support in kvm_cpu_caps[], advertise the feature to
userspace via CPUID.06H:ECX, and enable the governed feature when
supported by both host and guest CPUID.

Signed-off-by: Mingwei Zhang <mizhang@google.com>
Co-developed-by: Jim Mattson <jmattson@google.com>
Signed-off-by: Jim Mattson <jmattson@google.com>
---
 arch/x86/kvm/cpuid.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)
diff mbox series

Patch

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 41786b834b163..309fa7fef6b7b 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -399,6 +399,10 @@  static void kvm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
 	kvm_hv_set_cpuid(vcpu, kvm_cpuid_has_hyperv(vcpu->arch.cpuid_entries,
 						    vcpu->arch.cpuid_nent));
 
+	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
+	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
+		kvm_governed_feature_check_and_set(vcpu, X86_FEATURE_APERFMPERF);
+
 	/* Invoke the vendor callback only after the above state is updated. */
 	kvm_x86_call(vcpu_after_set_cpuid)(vcpu);
 
@@ -697,6 +701,12 @@  void kvm_set_cpu_caps(void)
 	if (boot_cpu_has(X86_FEATURE_AMD_SSBD))
 		kvm_cpu_cap_set(X86_FEATURE_SPEC_CTRL_SSBD);
 
+	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
+	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
+		kvm_cpu_cap_init_kvm_defined(CPUID_6_ECX, F(APERFMPERF));
+	else
+		kvm_cpu_cap_init_kvm_defined(CPUID_6_ECX, 0);
+
 	kvm_cpu_cap_mask(CPUID_7_1_EAX,
 		F(AVX_VNNI) | F(AVX512_BF16) | F(CMPCCXADD) |
 		F(FZRM) | F(FSRS) | F(FSRC) |
@@ -993,7 +1003,7 @@  static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
 	case 6: /* Thermal management */
 		entry->eax = 0x4; /* allow ARAT */
 		entry->ebx = 0;
-		entry->ecx = 0;
+		cpuid_entry_override(entry, CPUID_6_ECX);
 		entry->edx = 0;
 		break;
 	/* function 7 has additional index. */