Message ID | 20240329225835.400662-19-michael.roth@amd.com |
---|---|
State | New |
Headers | show |
Series | Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support | expand |
On Tue, Apr 16, 2024 at 4:25 PM Tom Lendacky <thomas.lendacky@amd.com> wrote: > > On 4/16/24 06:53, Paolo Bonzini wrote: > > On Sat, Mar 30, 2024 at 10:01 PM Paolo Bonzini <pbonzini@redhat.com> wrote: > >> > >> On 3/29/24 23:58, Michael Roth wrote: > >>> From: Tom Lendacky<thomas.lendacky@amd.com> > >>> > >>> In preparation to support SEV-SNP AP Creation, use a variable that holds > >>> the VMSA physical address rather than converting the virtual address. > >>> This will allow SEV-SNP AP Creation to set the new physical address that > >>> will be used should the vCPU reset path be taken. > >>> > >>> Signed-off-by: Tom Lendacky<thomas.lendacky@amd.com> > >>> Signed-off-by: Ashish Kalra<ashish.kalra@amd.com> > >>> Signed-off-by: Michael Roth<michael.roth@amd.com> > >>> --- > >> > >> I'll get back to this one after Easter, but it looks like Sean had some > >> objections at https://lore.kernel.org/lkml/ZeCqnq7dLcJI41O9@google.com/. > > > > Note that AP create is called multiple times per vCPU under OVMF with > and added call by the kernel when booting the APs. Oooh, I somehow thought that + target_svm->sev_es.snp_vmsa_gpa = INVALID_PAGE; + target_svm->sev_es.snp_ap_create = true; was in svm_create_vcpu(). So there should be separate "snp_ap_waiting_for_reset" and "snp_has_guest_vmsa" flags. The latter is set once in __sev_snp_update_protected_guest_state and is what governs whether the VMSA page was allocated or just refcounted. > But I believe that Sean wants a separate KVM object per VMPL level, so > that would disappear anyway (Joerg and I want to get on the PUCK > schedule to talk about multi-VMPL level support soon.) Yes, agreed on both counts. > > /* > > * gmem pages aren't currently migratable, but if this ever > > * changes then care should be taken to ensure > > * svm->sev_es.vmsa_pa is pinned through some other means. > > */ > > kvm_release_pfn_clean(pfn); > > Removing this here will cause any previous guest VMSA page(s) to remain > pinned, that's the reason for unpinning here. OVMF re-uses the VMSA, but > that isn't a requirement for a firmware, and the kernel will create a > new VMSA page. Yes, and once you understand that I was thinking of a set-once flag "snp_has_guest_vmsa" it should all make a lot more sense. Paolo
On Tue, Apr 16, 2024 at 01:53:24PM +0200, Paolo Bonzini wrote: > On Sat, Mar 30, 2024 at 10:01 PM Paolo Bonzini <pbonzini@redhat.com> wrote: > > > > On 3/29/24 23:58, Michael Roth wrote: > > > From: Tom Lendacky<thomas.lendacky@amd.com> > > > > > > In preparation to support SEV-SNP AP Creation, use a variable that holds > > > the VMSA physical address rather than converting the virtual address. > > > This will allow SEV-SNP AP Creation to set the new physical address that > > > will be used should the vCPU reset path be taken. > > > > > > Signed-off-by: Tom Lendacky<thomas.lendacky@amd.com> > > > Signed-off-by: Ashish Kalra<ashish.kalra@amd.com> > > > Signed-off-by: Michael Roth<michael.roth@amd.com> > > > --- > > > > I'll get back to this one after Easter, but it looks like Sean had some > > objections at https://lore.kernel.org/lkml/ZeCqnq7dLcJI41O9@google.com/. > > So IIUC the gist of the solution here would be to replace > > /* Use the new VMSA */ > svm->sev_es.vmsa_pa = pfn_to_hpa(pfn); > svm->vmcb->control.vmsa_pa = svm->sev_es.vmsa_pa; > > with something like > > /* Use the new VMSA */ > __free_page(virt_to_page(svm->sev_es.vmsa)); One downside to free'ing VMSA at this point is there are a number of additional cleanup routines like wbinvd_on_all_cpus() and in sev_free_vcpu() which will need to be called before we are able to safely free the page back to the system. It would be simple to wrap all that up in an sev_free_vmsa() helper and also call it here rather than defer it, but from a performance perspective it would be nice to defer it to shutdown path. > svm->sev_es.vmsa = pfn_to_kaddr(pfn); > svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa); It turns out sev_es_init_vmcb() always ends up setting control.vmsa_pa again using the new vmsa stored in sev_es.vmsa before the AP re-enters the guest: svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa); If we modify that code to instead do: if (!svm->sev_es.snp_has_guest_vmsa) svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa); Then it will instead continue to use the control.vmsa_pa set here in __sev_snp_update_protected_guest_state(), in which case svm->sev_es.vmsa will only ever be used to store the initial VMSA that was allocated by KVM. Given that... > > and wrap the __free_page() in sev_free_vcpu() with "if > (!svm->sev_es.snp_ap_create)". If we take the deferred approach above, then no checks are needed here and the KVM-allocated VMSA is cleaned up the same way it is handled for SEV-ES. SNP never needs to piggy-back off of sev_es.vmsa to pass around VMSAs that reside in guest memory. I can still rework things to free KVM-allocated VMSA immediately here if you prefer but for now I have things implemented as above to keep SEV-ES/SNP handling similar and avoid performance penalty during guest boot. I've pushed the revised AP creation patch here for reference: https://github.com/mdroth/linux/commit/5a7e76231a7629ba62f8b0bba8039d93d3595ecb Thanks for the suggestions, this all looks a good bit cleaner either way. -Mike > > This should remove the need for svm->sev_es.vmsa_pa. It is always > equal to svm->vmcb->control.vmsa_pa anyway. > > Also, it's possible to remove > > /* > * gmem pages aren't currently migratable, but if this ever > * changes then care should be taken to ensure > * svm->sev_es.vmsa_pa is pinned through some other means. > */ > kvm_release_pfn_clean(pfn); > > if sev_free_vcpu() does > > if (svm->sev_es.snp_ap_create) { > __free_page(virt_to_page(svm->sev_es.vmsa)); > } else { > put_page(virt_to_page(svm->sev_es.vmsa)); > } > > and while at it, please reverse the polarity of snp_ap_create and > rename it to snp_ap_created. > > Paolo >
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c index a0a88471f9ab..ce1c727bad23 100644 --- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -3780,8 +3780,7 @@ static void sev_es_init_vmcb(struct vcpu_svm *svm) * the VMSA will be NULL if this vCPU is the destination for intrahost * migration, and will be copied later. */ - if (svm->sev_es.vmsa) - svm->vmcb->control.vmsa_pa = __pa(svm->sev_es.vmsa); + svm->vmcb->control.vmsa_pa = svm->sev_es.vmsa_pa; /* Can't intercept CR register access, HV can't modify CR registers */ svm_clr_intercept(svm, INTERCEPT_CR0_READ); diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c index 648a05ca53fc..e036a8927717 100644 --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1451,9 +1451,16 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu) svm->vmcb01.pa = __sme_set(page_to_pfn(vmcb01_page) << PAGE_SHIFT); svm_switch_vmcb(svm, &svm->vmcb01); - if (vmsa_page) + if (vmsa_page) { svm->sev_es.vmsa = page_address(vmsa_page); + /* + * Do not include the encryption mask on the VMSA physical + * address since hardware will access it using the guest key. + */ + svm->sev_es.vmsa_pa = __pa(svm->sev_es.vmsa); + } + svm->guest_state_loaded = false; return 0; diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h index c0675ff2d8a2..8cce3315b46c 100644 --- a/arch/x86/kvm/svm/svm.h +++ b/arch/x86/kvm/svm/svm.h @@ -199,6 +199,7 @@ struct vcpu_sev_es_state { struct ghcb *ghcb; u8 valid_bitmap[16]; struct kvm_host_map ghcb_map; + hpa_t vmsa_pa; bool received_first_sipi; unsigned int ap_reset_hold_type;