Message ID | 20250410132850.3708703-2-ardb+git@google.com |
---|---|
State | New |
Headers | show |
Series | [v3] x86/boot/sev: Avoid shared GHCB page for early memory acceptance | expand |
On Thu, 17 Apr 2025 at 18:08, Tom Lendacky <thomas.lendacky@amd.com> wrote: > > On 4/11/25 14:00, Ard Biesheuvel wrote: > > On Fri, 11 Apr 2025 at 20:40, Borislav Petkov <bp@alien8.de> wrote: > >> > >> On Thu, Apr 10, 2025 at 03:28:51PM +0200, Ard Biesheuvel wrote: > >>> From: Ard Biesheuvel <ardb@kernel.org> > >>> > >>> Communicating with the hypervisor using the shared GHCB page requires > >>> clearing the C bit in the mapping of that page. When executing in the > >>> context of the EFI boot services, the page tables are owned by the > >>> firmware, and this manipulation is not possible. > >>> > >>> So switch to a different API for accepting memory in SEV-SNP guests, one > >> > >> That being the GHCB MSR protocol, it seems. > >> > > > > Yes. > > > >> And since Tom co-developed, I guess we wanna do that. > >> > >> But then how much slower do we become? > >> > > > > Non-EFI stub boot will become slower if the memory that is used to > > decompress the kernel has not been accepted yet. But given how heavily > > SEV-SNP depends on EFI boot, this typically only happens on kexec, as > > that is the only boot path that goes through the traditional > > decompressor. > > Some quick testing showed no significant differences in kexec booting > and testing shows everything seems to be good. > Thanks. > But, in testing with non-2M sized memory (e.g. a guest with 4097M of > memory) and without the change to how SNP is detected before > sev_enable() is called, we hit the error path in arch_accept_memory() in > arch/x86/boot/compressed/mem.c and the boot crashes. > Right. So this is because sev_snp_enabled() is based on sev_status, which has not been set yet at this point, right? And for the record, could you please indicate whether you are ok with the co-developed-by/signed-off-by credits on this patch (and subsequent revisions)?
On Thu, 17 Apr 2025 at 18:21, Tom Lendacky <thomas.lendacky@amd.com> wrote: > > On 4/17/25 11:14, Ard Biesheuvel wrote: > > On Thu, 17 Apr 2025 at 18:08, Tom Lendacky <thomas.lendacky@amd.com> wrote: > >> > >> On 4/11/25 14:00, Ard Biesheuvel wrote: > >>> On Fri, 11 Apr 2025 at 20:40, Borislav Petkov <bp@alien8.de> wrote: > >>>> > >>>> On Thu, Apr 10, 2025 at 03:28:51PM +0200, Ard Biesheuvel wrote: > >>>>> From: Ard Biesheuvel <ardb@kernel.org> > >>>>> > >>>>> Communicating with the hypervisor using the shared GHCB page requires > >>>>> clearing the C bit in the mapping of that page. When executing in the > >>>>> context of the EFI boot services, the page tables are owned by the > >>>>> firmware, and this manipulation is not possible. > >>>>> > >>>>> So switch to a different API for accepting memory in SEV-SNP guests, one > >>>> > >>>> That being the GHCB MSR protocol, it seems. > >>>> > >>> > >>> Yes. > >>> > >>>> And since Tom co-developed, I guess we wanna do that. > >>>> > >>>> But then how much slower do we become? > >>>> > >>> > >>> Non-EFI stub boot will become slower if the memory that is used to > >>> decompress the kernel has not been accepted yet. But given how heavily > >>> SEV-SNP depends on EFI boot, this typically only happens on kexec, as > >>> that is the only boot path that goes through the traditional > >>> decompressor. > >> > >> Some quick testing showed no significant differences in kexec booting > >> and testing shows everything seems to be good. > >> > > > > Thanks. > > > >> But, in testing with non-2M sized memory (e.g. a guest with 4097M of > >> memory) and without the change to how SNP is detected before > >> sev_enable() is called, we hit the error path in arch_accept_memory() in > >> arch/x86/boot/compressed/mem.c and the boot crashes. > >> > > > > Right. So this is because sev_snp_enabled() is based on sev_status, > > which has not been set yet at this point, right? > > Correct. > OK. Would this do the trick? (with asm/sev.h added to the #includes) --- a/arch/x86/boot/compressed/mem.c +++ b/arch/x86/boot/compressed/mem.c @@ -34,11 +34,14 @@ static bool early_is_tdx_guest(void) void arch_accept_memory(phys_addr_t start, phys_addr_t end) { + static bool sevsnp; + /* Platform-specific memory-acceptance call goes here */ if (early_is_tdx_guest()) { if (!tdx_accept_memory(start, end)) panic("TDX: Failed to accept memory\n"); - } else if (sev_snp_enabled()) { + } else if (sevsnp || (sev_get_status() & MSR_AMD64_SEV_SNP_ENABLED)) { + sevsnp = true; snp_accept_memory(start, end); } else { error("Cannot accept memory: unknown platform\n"); > > > > And for the record, could you please indicate whether you are ok with > > the co-developed-by/signed-off-by credits on this patch (and > > subsequent revisions)? > > Yep, I'm fine with that. > Thanks.
On 4/17/25 12:26, Tom Lendacky wrote: > On 4/17/25 11:38, Ard Biesheuvel wrote: >> On Thu, 17 Apr 2025 at 18:21, Tom Lendacky <thomas.lendacky@amd.com> wrote: >>> >>> On 4/17/25 11:14, Ard Biesheuvel wrote: >>>> On Thu, 17 Apr 2025 at 18:08, Tom Lendacky <thomas.lendacky@amd.com> wrote: >>>>> >>>>> On 4/11/25 14:00, Ard Biesheuvel wrote: >>>>>> On Fri, 11 Apr 2025 at 20:40, Borislav Petkov <bp@alien8.de> wrote: >>>>>>> >>>>>>> On Thu, Apr 10, 2025 at 03:28:51PM +0200, Ard Biesheuvel wrote: >>>>>>>> From: Ard Biesheuvel <ardb@kernel.org> >>>>>>>> >>>>>>>> Communicating with the hypervisor using the shared GHCB page requires >>>>>>>> clearing the C bit in the mapping of that page. When executing in the >>>>>>>> context of the EFI boot services, the page tables are owned by the >>>>>>>> firmware, and this manipulation is not possible. >>>>>>>> >>>>>>>> So switch to a different API for accepting memory in SEV-SNP guests, one >>>>>>> >>>>>>> That being the GHCB MSR protocol, it seems. >>>>>>> >>>>>> >>>>>> Yes. >>>>>> >>>>>>> And since Tom co-developed, I guess we wanna do that. >>>>>>> >>>>>>> But then how much slower do we become? >>>>>>> >>>>>> >>>>>> Non-EFI stub boot will become slower if the memory that is used to >>>>>> decompress the kernel has not been accepted yet. But given how heavily >>>>>> SEV-SNP depends on EFI boot, this typically only happens on kexec, as >>>>>> that is the only boot path that goes through the traditional >>>>>> decompressor. >>>>> >>>>> Some quick testing showed no significant differences in kexec booting >>>>> and testing shows everything seems to be good. >>>>> >>>> >>>> Thanks. >>>> >>>>> But, in testing with non-2M sized memory (e.g. a guest with 4097M of >>>>> memory) and without the change to how SNP is detected before >>>>> sev_enable() is called, we hit the error path in arch_accept_memory() in >>>>> arch/x86/boot/compressed/mem.c and the boot crashes. >>>>> >>>> >>>> Right. So this is because sev_snp_enabled() is based on sev_status, >>>> which has not been set yet at this point, right? >>> >>> Correct. >>> >> >> OK. Would this do the trick? (with asm/sev.h added to the #includes) > > Yes, that works for booting. Let me do some kexec testing and get back > to you. Sorry, that might not be until tomorrow, though. Ok, found some time... looks good with kexec, too. Thanks, Tom > > Thanks, > Tom > >> >> --- a/arch/x86/boot/compressed/mem.c >> +++ b/arch/x86/boot/compressed/mem.c >> @@ -34,11 +34,14 @@ static bool early_is_tdx_guest(void) >> >> void arch_accept_memory(phys_addr_t start, phys_addr_t end) >> { >> + static bool sevsnp; >> + >> /* Platform-specific memory-acceptance call goes here */ >> if (early_is_tdx_guest()) { >> if (!tdx_accept_memory(start, end)) >> panic("TDX: Failed to accept memory\n"); >> - } else if (sev_snp_enabled()) { >> + } else if (sevsnp || (sev_get_status() & MSR_AMD64_SEV_SNP_ENABLED)) { >> + sevsnp = true; >> snp_accept_memory(start, end); >> } else { >> error("Cannot accept memory: unknown platform\n"); >> >>>> >>>> And for the record, could you please indicate whether you are ok with >>>> the co-developed-by/signed-off-by credits on this patch (and >>>> subsequent revisions)? >>> >>> Yep, I'm fine with that. >>> >> >> Thanks.
diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c index bb55934c1cee..89ba168f4f0f 100644 --- a/arch/x86/boot/compressed/sev.c +++ b/arch/x86/boot/compressed/sev.c @@ -164,10 +164,7 @@ bool sev_snp_enabled(void) static void __page_state_change(unsigned long paddr, enum psc_op op) { - u64 val; - - if (!sev_snp_enabled()) - return; + u64 val, msr; /* * If private -> shared then invalidate the page before requesting the @@ -176,6 +173,9 @@ static void __page_state_change(unsigned long paddr, enum psc_op op) if (op == SNP_PAGE_STATE_SHARED) pvalidate_4k_page(paddr, paddr, false); + /* Save the current GHCB MSR value */ + msr = sev_es_rd_ghcb_msr(); + /* Issue VMGEXIT to change the page state in RMP table. */ sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op)); VMGEXIT(); @@ -185,6 +185,9 @@ static void __page_state_change(unsigned long paddr, enum psc_op op) if ((GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) || GHCB_MSR_PSC_RESP_VAL(val)) sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC); + /* Restore the GHCB MSR value */ + sev_es_wr_ghcb_msr(msr); + /* * Now that page state is changed in the RMP table, validate it so that it is * consistent with the RMP entry. @@ -195,11 +198,17 @@ static void __page_state_change(unsigned long paddr, enum psc_op op) void snp_set_page_private(unsigned long paddr) { + if (!sev_snp_enabled()) + return; + __page_state_change(paddr, SNP_PAGE_STATE_PRIVATE); } void snp_set_page_shared(unsigned long paddr) { + if (!sev_snp_enabled()) + return; + __page_state_change(paddr, SNP_PAGE_STATE_SHARED); } @@ -223,56 +232,10 @@ static bool early_setup_ghcb(void) return true; } -static phys_addr_t __snp_accept_memory(struct snp_psc_desc *desc, - phys_addr_t pa, phys_addr_t pa_end) -{ - struct psc_hdr *hdr; - struct psc_entry *e; - unsigned int i; - - hdr = &desc->hdr; - memset(hdr, 0, sizeof(*hdr)); - - e = desc->entries; - - i = 0; - while (pa < pa_end && i < VMGEXIT_PSC_MAX_ENTRY) { - hdr->end_entry = i; - - e->gfn = pa >> PAGE_SHIFT; - e->operation = SNP_PAGE_STATE_PRIVATE; - if (IS_ALIGNED(pa, PMD_SIZE) && (pa_end - pa) >= PMD_SIZE) { - e->pagesize = RMP_PG_SIZE_2M; - pa += PMD_SIZE; - } else { - e->pagesize = RMP_PG_SIZE_4K; - pa += PAGE_SIZE; - } - - e++; - i++; - } - - if (vmgexit_psc(boot_ghcb, desc)) - sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC); - - pvalidate_pages(desc); - - return pa; -} - void snp_accept_memory(phys_addr_t start, phys_addr_t end) { - struct snp_psc_desc desc = {}; - unsigned int i; - phys_addr_t pa; - - if (!boot_ghcb && !early_setup_ghcb()) - sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC); - - pa = start; - while (pa < end) - pa = __snp_accept_memory(&desc, pa, end); + for (phys_addr_t pa = start; pa < end; pa += PAGE_SIZE) + __page_state_change(pa, SNP_PAGE_STATE_PRIVATE); } void sev_es_shutdown_ghcb(void)