From patchwork Thu Apr 10 13:41:19 2025
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 880625
Date: Thu, 10 Apr 2025 15:41:19 +0200
Message-ID: <20250410134117.3713574-14-ardb+git@google.com>
In-Reply-To: <20250410134117.3713574-13-ardb+git@google.com>
Subject: [PATCH v4 01/11] x86/asm: Make rip_rel_ptr() usable from fPIC code
From: Ard Biesheuvel
To: linux-efi@vger.kernel.org
Cc: x86@kernel.org, mingo@kernel.org, linux-kernel@vger.kernel.org, Ard Biesheuvel, Tom Lendacky, Dionna Amalie Glaze, Kevin Loughlin

RIP_REL_REF() is used in non-PIC C code that is called very early, before
the kernel virtual mapping is up, which is the mapping that the linker
expects. It is currently used in two different ways:

- to refer to the value of a global variable, including as an lvalue in
  assignments;

- to take the address of a global variable via the mapping that the code
  currently executes at.

The former case is only needed in non-PIC code, as PIC code will never use
absolute symbol references when the address of the symbol is not being
used. But taking the address of a variable in PIC code may still require
extra care, as a stack allocated struct assignment may be emitted as a
memcpy() from a statically allocated copy in .rodata.

For instance, this

  void startup_64_setup_gdt_idt(void)
  {
	struct desc_ptr startup_gdt_descr = {
		.address = (__force unsigned long)gdt_page.gdt,
		.size    = GDT_SIZE - 1,
	};

may result in an absolute symbol reference in PIC code, even though the
struct is allocated on the stack and populated at runtime.

To address this case, make rip_rel_ptr() accessible in PIC code, and
update any existing uses where the address of a global variable is taken
using RIP_REL_REF.

Once all code of this nature has been moved into arch/x86/boot/startup and
built with -fPIC, RIP_REL_REF() can be retired, and only rip_rel_ptr()
will remain.
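For reference, a condensed sketch (not a standalone compilation unit) of the
helper this patch exposes to PIC code, and of how the example above ends up
using it; both pieces are taken from the asm.h and head64.c hunks below:

  /* arch/x86/include/asm/asm.h, after this patch: available to PIC and non-PIC code */
  static __always_inline __pure void *rip_rel_ptr(void *p)
  {
	/* emits "leaq sym(%rip), %reg", so the address is computed relative
	 * to wherever the code currently executes, not the link-time address */
	asm("leaq %c1(%%rip), %0" : "=r"(p) : "i"(p));
	return p;
  }

  /* the initializer above, rewritten so no absolute reference can be emitted */
  struct gdt_page *gp = rip_rel_ptr((void *)(__force unsigned long)&gdt_page);
  struct desc_ptr startup_gdt_descr = {
	.address = (unsigned long)gp->gdt,
	.size    = GDT_SIZE - 1,
  };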
Signed-off-by: Ard Biesheuvel --- arch/x86/coco/sev/core.c | 2 +- arch/x86/coco/sev/shared.c | 4 ++-- arch/x86/include/asm/asm.h | 2 +- arch/x86/kernel/head64.c | 24 ++++++++++---------- arch/x86/mm/mem_encrypt_identity.c | 6 ++--- 5 files changed, 19 insertions(+), 19 deletions(-) diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c index b0c1a7a57497..832f7a7b10b2 100644 --- a/arch/x86/coco/sev/core.c +++ b/arch/x86/coco/sev/core.c @@ -2400,7 +2400,7 @@ static __head void svsm_setup(struct cc_blob_sev_info *cc_info) * kernel was loaded (physbase), so the get the CA address using * RIP-relative addressing. */ - pa = (u64)&RIP_REL_REF(boot_svsm_ca_page); + pa = (u64)rip_rel_ptr(&boot_svsm_ca_page); /* * Switch over to the boot SVSM CA while the current CA is still diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/coco/sev/shared.c index 2e4122f8aa6b..04982d356803 100644 --- a/arch/x86/coco/sev/shared.c +++ b/arch/x86/coco/sev/shared.c @@ -475,7 +475,7 @@ static int sev_cpuid_hv(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid */ static const struct snp_cpuid_table *snp_cpuid_get_table(void) { - return &RIP_REL_REF(cpuid_table_copy); + return rip_rel_ptr(&cpuid_table_copy); } /* @@ -1681,7 +1681,7 @@ static bool __head svsm_setup_ca(const struct cc_blob_sev_info *cc_info) * routine is running identity mapped when called, both by the decompressor * code and the early kernel code. */ - if (!rmpadjust((unsigned long)&RIP_REL_REF(boot_ghcb_page), RMP_PG_SIZE_4K, 1)) + if (!rmpadjust((unsigned long)rip_rel_ptr(&boot_ghcb_page), RMP_PG_SIZE_4K, 1)) return false; /* diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h index cc2881576c2c..a9f07799e337 100644 --- a/arch/x86/include/asm/asm.h +++ b/arch/x86/include/asm/asm.h @@ -114,13 +114,13 @@ #endif #ifndef __ASSEMBLER__ -#ifndef __pic__ static __always_inline __pure void *rip_rel_ptr(void *p) { asm("leaq %c1(%%rip), %0" : "=r"(p) : "i"(p)); return p; } +#ifndef __pic__ #define RIP_REL_REF(var) (*(typeof(&(var)))rip_rel_ptr(&(var))) #else #define RIP_REL_REF(var) (var) diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index fa9b6339975f..954d093f187b 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -106,8 +106,8 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, * attribute. 
*/ if (sme_get_me_mask()) { - paddr = (unsigned long)&RIP_REL_REF(__start_bss_decrypted); - paddr_end = (unsigned long)&RIP_REL_REF(__end_bss_decrypted); + paddr = (unsigned long)rip_rel_ptr(__start_bss_decrypted); + paddr_end = (unsigned long)rip_rel_ptr(__end_bss_decrypted); for (; paddr < paddr_end; paddr += PMD_SIZE) { /* @@ -144,8 +144,8 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, unsigned long __head __startup_64(unsigned long p2v_offset, struct boot_params *bp) { - pmd_t (*early_pgts)[PTRS_PER_PMD] = RIP_REL_REF(early_dynamic_pgts); - unsigned long physaddr = (unsigned long)&RIP_REL_REF(_text); + pmd_t (*early_pgts)[PTRS_PER_PMD] = rip_rel_ptr(early_dynamic_pgts); + unsigned long physaddr = (unsigned long)rip_rel_ptr(_text); unsigned long va_text, va_end; unsigned long pgtable_flags; unsigned long load_delta; @@ -174,18 +174,18 @@ unsigned long __head __startup_64(unsigned long p2v_offset, for (;;); va_text = physaddr - p2v_offset; - va_end = (unsigned long)&RIP_REL_REF(_end) - p2v_offset; + va_end = (unsigned long)rip_rel_ptr(_end) - p2v_offset; /* Include the SME encryption mask in the fixup value */ load_delta += sme_get_me_mask(); /* Fixup the physical addresses in the page table */ - pgd = &RIP_REL_REF(early_top_pgt)->pgd; + pgd = rip_rel_ptr(early_top_pgt); pgd[pgd_index(__START_KERNEL_map)] += load_delta; if (IS_ENABLED(CONFIG_X86_5LEVEL) && la57) { - p4d = (p4dval_t *)&RIP_REL_REF(level4_kernel_pgt); + p4d = (p4dval_t *)rip_rel_ptr(level4_kernel_pgt); p4d[MAX_PTRS_PER_P4D - 1] += load_delta; pgd[pgd_index(__START_KERNEL_map)] = (pgdval_t)p4d | _PAGE_TABLE; @@ -258,7 +258,7 @@ unsigned long __head __startup_64(unsigned long p2v_offset, * error, causing the BIOS to halt the system. */ - pmd = &RIP_REL_REF(level2_kernel_pgt)->pmd; + pmd = rip_rel_ptr(level2_kernel_pgt); /* invalidate pages before the kernel image */ for (i = 0; i < pmd_index(va_text); i++) @@ -531,7 +531,7 @@ static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data; static void __head startup_64_load_idt(void *vc_handler) { struct desc_ptr desc = { - .address = (unsigned long)&RIP_REL_REF(bringup_idt_table), + .address = (unsigned long)rip_rel_ptr(bringup_idt_table), .size = sizeof(bringup_idt_table) - 1, }; struct idt_data data; @@ -565,11 +565,11 @@ void early_setup_idt(void) */ void __head startup_64_setup_gdt_idt(void) { - struct desc_struct *gdt = (void *)(__force unsigned long)gdt_page.gdt; + struct gdt_page *gp = rip_rel_ptr((void *)(__force unsigned long)&gdt_page); void *handler = NULL; struct desc_ptr startup_gdt_descr = { - .address = (unsigned long)&RIP_REL_REF(*gdt), + .address = (unsigned long)gp->gdt, .size = GDT_SIZE - 1, }; @@ -582,7 +582,7 @@ void __head startup_64_setup_gdt_idt(void) "movl %%eax, %%es\n" : : "a"(__KERNEL_DS) : "memory"); if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) - handler = &RIP_REL_REF(vc_no_ghcb); + handler = rip_rel_ptr(vc_no_ghcb); startup_64_load_idt(handler); } diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/mm/mem_encrypt_identity.c index 5eecdd92da10..e7fb3779b35f 100644 --- a/arch/x86/mm/mem_encrypt_identity.c +++ b/arch/x86/mm/mem_encrypt_identity.c @@ -318,8 +318,8 @@ void __head sme_encrypt_kernel(struct boot_params *bp) * memory from being cached. 
*/ - kernel_start = (unsigned long)RIP_REL_REF(_text); - kernel_end = ALIGN((unsigned long)RIP_REL_REF(_end), PMD_SIZE); + kernel_start = (unsigned long)rip_rel_ptr(_text); + kernel_end = ALIGN((unsigned long)rip_rel_ptr(_end), PMD_SIZE); kernel_len = kernel_end - kernel_start; initrd_start = 0; @@ -345,7 +345,7 @@ void __head sme_encrypt_kernel(struct boot_params *bp) * pagetable structures for the encryption of the kernel * pagetable structures for workarea (in case not currently mapped) */ - execute_start = workarea_start = (unsigned long)RIP_REL_REF(sme_workarea); + execute_start = workarea_start = (unsigned long)rip_rel_ptr(sme_workarea); execute_end = execute_start + (PAGE_SIZE * 2) + PMD_SIZE; execute_len = execute_end - execute_start;
From patchwork Thu Apr 10 13:41:20 2025
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 879831
Date: Thu, 10 Apr 2025 15:41:20 +0200
Message-ID: <20250410134117.3713574-15-ardb+git@google.com>
In-Reply-To: <20250410134117.3713574-13-ardb+git@google.com>
Subject: [PATCH v4 02/11] x86/boot: Move the early GDT/IDT setup code into startup/
From: Ard Biesheuvel
To: linux-efi@vger.kernel.org
Cc: x86@kernel.org, mingo@kernel.org, linux-kernel@vger.kernel.org, Ard Biesheuvel, Tom Lendacky, Dionna Amalie Glaze, Kevin Loughlin

Move the early GDT/IDT setup code that runs long before the kernel
virtual mapping is up into arch/x86/boot/startup/, and build it in a way
that ensures that the code tolerates being called from the 1:1 mapping of
memory. The code itself is left unchanged by this patch.

Also tweak the sed symbol matching pattern in the decompressor to match
on lower case 't' or 'b', as these will be emitted by Clang for symbols
with hidden linkage.
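For context, a rough sketch of what the -include $(srctree)/include/linux/hidden.h
line added to the Makefile below implies, and why the sed pattern needs the extra
letters. This assumes that include/linux/hidden.h still consists essentially of a
visibility pragma:

  /* assumed contents of include/linux/hidden.h (sketch) */
  #pragma GCC visibility push(hidden)

  /*
   * With hidden visibility, -fPIC code can address globals RIP-relatively
   * instead of going through the GOT, and the linker demotes such symbols
   * to local binding in vmlinux, so nm prints them with lower-case type
   * letters ('t', 'b') -- the case the updated sed-voffset rule accepts.
   */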
Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/Makefile | 2 +- arch/x86/boot/startup/Makefile | 15 ++++ arch/x86/boot/startup/gdt_idt.c | 84 ++++++++++++++++++++ arch/x86/kernel/head64.c | 74 ----------------- 4 files changed, 100 insertions(+), 75 deletions(-) diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index 37b85ce9b2a3..0fcad7b7e007 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -73,7 +73,7 @@ LDFLAGS_vmlinux += -T hostprogs := mkpiggy HOST_EXTRACFLAGS += -I$(srctree)/tools/include -sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABCDGRSTVW] \(_text\|__start_rodata\|__bss_start\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p' +sed-voffset := -e 's/^\([0-9a-fA-F]*\) [ABbCDGRSTtVW] \(_text\|__start_rodata\|__bss_start\|_end\)$$/\#define VO_\2 _AC(0x\1,UL)/p' quiet_cmd_voffset = VOFFSET $@ cmd_voffset = $(NM) $< | sed -n $(sed-voffset) > $@ diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile index 8919a1cbcb5a..1beb5de30735 100644 --- a/arch/x86/boot/startup/Makefile +++ b/arch/x86/boot/startup/Makefile @@ -1,6 +1,21 @@ # SPDX-License-Identifier: GPL-2.0 KBUILD_AFLAGS += -D__DISABLE_EXPORTS +KBUILD_CFLAGS += -D__DISABLE_EXPORTS -mcmodel=small -fPIC \ + -Os -DDISABLE_BRANCH_PROFILING \ + $(DISABLE_STACKLEAK_PLUGIN) \ + -fno-stack-protector -D__NO_FORTIFY \ + -include $(srctree)/include/linux/hidden.h + +# disable ftrace hooks +KBUILD_CFLAGS := $(subst $(CC_FLAGS_FTRACE),,$(KBUILD_CFLAGS)) +KASAN_SANITIZE := n +KCSAN_SANITIZE := n +KMSAN_SANITIZE := n +UBSAN_SANITIZE := n +KCOV_INSTRUMENT := n + +obj-$(CONFIG_X86_64) += gdt_idt.o lib-$(CONFIG_X86_64) += la57toggle.o lib-$(CONFIG_EFI_MIXED) += efi-mixed.o diff --git a/arch/x86/boot/startup/gdt_idt.c b/arch/x86/boot/startup/gdt_idt.c new file mode 100644 index 000000000000..7e34d0b426b1 --- /dev/null +++ b/arch/x86/boot/startup/gdt_idt.c @@ -0,0 +1,84 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include + +#include +#include +#include +#include +#include + +/* + * Data structures and code used for IDT setup in head_64.S. The bringup-IDT is + * used until the idt_table takes over. On the boot CPU this happens in + * x86_64_start_kernel(), on secondary CPUs in start_secondary(). In both cases + * this happens in the functions called from head_64.S. + * + * The idt_table can't be used that early because all the code modifying it is + * in idt.c and can be instrumented by tracing or KASAN, which both don't work + * during early CPU bringup. Also the idt_table has the runtime vectors + * configured which require certain CPU state to be setup already (like TSS), + * which also hasn't happened yet in early CPU bringup. 
+ */ +static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data; + +/* This may run while still in the direct mapping */ +static void __head startup_64_load_idt(void *vc_handler) +{ + struct desc_ptr desc = { + .address = (unsigned long)rip_rel_ptr(bringup_idt_table), + .size = sizeof(bringup_idt_table) - 1, + }; + struct idt_data data; + gate_desc idt_desc; + + /* @vc_handler is set only for a VMM Communication Exception */ + if (vc_handler) { + init_idt_data(&data, X86_TRAP_VC, vc_handler); + idt_init_desc(&idt_desc, &data); + native_write_idt_entry((gate_desc *)desc.address, X86_TRAP_VC, &idt_desc); + } + + native_load_idt(&desc); +} + +/* This is used when running on kernel addresses */ +void early_setup_idt(void) +{ + void *handler = NULL; + + if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) { + setup_ghcb(); + handler = vc_boot_ghcb; + } + + startup_64_load_idt(handler); +} + +/* + * Setup boot CPU state needed before kernel switches to virtual addresses. + */ +void __head startup_64_setup_gdt_idt(void) +{ + struct gdt_page *gp = rip_rel_ptr((void *)(__force unsigned long)&gdt_page); + void *handler = NULL; + + struct desc_ptr startup_gdt_descr = { + .address = (unsigned long)gp->gdt, + .size = GDT_SIZE - 1, + }; + + /* Load GDT */ + native_load_gdt(&startup_gdt_descr); + + /* New GDT is live - reload data segment registers */ + asm volatile("movl %%eax, %%ds\n" + "movl %%eax, %%ss\n" + "movl %%eax, %%es\n" : : "a"(__KERNEL_DS) : "memory"); + + if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) + handler = rip_rel_ptr(vc_no_ghcb); + + startup_64_load_idt(handler); +} diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c index 954d093f187b..9b2ffec4bbad 100644 --- a/arch/x86/kernel/head64.c +++ b/arch/x86/kernel/head64.c @@ -512,77 +512,3 @@ void __init __noreturn x86_64_start_reservations(char *real_mode_data) start_kernel(); } - -/* - * Data structures and code used for IDT setup in head_64.S. The bringup-IDT is - * used until the idt_table takes over. On the boot CPU this happens in - * x86_64_start_kernel(), on secondary CPUs in start_secondary(). In both cases - * this happens in the functions called from head_64.S. - * - * The idt_table can't be used that early because all the code modifying it is - * in idt.c and can be instrumented by tracing or KASAN, which both don't work - * during early CPU bringup. Also the idt_table has the runtime vectors - * configured which require certain CPU state to be setup already (like TSS), - * which also hasn't happened yet in early CPU bringup. - */ -static gate_desc bringup_idt_table[NUM_EXCEPTION_VECTORS] __page_aligned_data; - -/* This may run while still in the direct mapping */ -static void __head startup_64_load_idt(void *vc_handler) -{ - struct desc_ptr desc = { - .address = (unsigned long)rip_rel_ptr(bringup_idt_table), - .size = sizeof(bringup_idt_table) - 1, - }; - struct idt_data data; - gate_desc idt_desc; - - /* @vc_handler is set only for a VMM Communication Exception */ - if (vc_handler) { - init_idt_data(&data, X86_TRAP_VC, vc_handler); - idt_init_desc(&idt_desc, &data); - native_write_idt_entry((gate_desc *)desc.address, X86_TRAP_VC, &idt_desc); - } - - native_load_idt(&desc); -} - -/* This is used when running on kernel addresses */ -void early_setup_idt(void) -{ - void *handler = NULL; - - if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) { - setup_ghcb(); - handler = vc_boot_ghcb; - } - - startup_64_load_idt(handler); -} - -/* - * Setup boot CPU state needed before kernel switches to virtual addresses. 
- */ -void __head startup_64_setup_gdt_idt(void) -{ - struct gdt_page *gp = rip_rel_ptr((void *)(__force unsigned long)&gdt_page); - void *handler = NULL; - - struct desc_ptr startup_gdt_descr = { - .address = (unsigned long)gp->gdt, - .size = GDT_SIZE - 1, - }; - - /* Load GDT */ - native_load_gdt(&startup_gdt_descr); - - /* New GDT is live - reload data segment registers */ - asm volatile("movl %%eax, %%ds\n" - "movl %%eax, %%ss\n" - "movl %%eax, %%es\n" : : "a"(__KERNEL_DS) : "memory"); - - if (IS_ENABLED(CONFIG_AMD_MEM_ENCRYPT)) - handler = rip_rel_ptr(vc_no_ghcb); - - startup_64_load_idt(handler); -}
From patchwork Thu Apr 10 13:41:22 2025
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 879830
Date: Thu, 10 Apr 2025 15:41:22 +0200
Message-ID: <20250410134117.3713574-17-ardb+git@google.com>
In-Reply-To: <20250410134117.3713574-13-ardb+git@google.com>
Subject: [PATCH v4 04/11] x86/boot: Drop RIP_REL_REF() uses from early mapping code
From: Ard Biesheuvel
To: linux-efi@vger.kernel.org
Cc: x86@kernel.org, mingo@kernel.org, linux-kernel@vger.kernel.org, Ard Biesheuvel, Tom Lendacky, Dionna Amalie Glaze, Kevin Loughlin

Now that __startup_64() is built using -fPIC, RIP_REL_REF() has become a
NOP and can be removed. Only some occurrences of rip_rel_ptr() will
remain, to explicitly take the address of certain global structures in
the 1:1 mapping of memory.

While at it, update the code comment to describe why this is needed.
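For reference, the reason the macro is now a NOP here: after the asm.h change
earlier in this series, the PIC flavour of RIP_REL_REF() expands to the bare
variable (condensed sketch of that hunk):

  /* arch/x86/include/asm/asm.h, condensed */
  #ifndef __pic__
  #define RIP_REL_REF(var)	(*(typeof(&(var)))rip_rel_ptr(&(var)))
  #else				/* units built with -fPIC, e.g. arch/x86/boot/startup/ */
  #define RIP_REL_REF(var)	(var)
  #endif

  /* hence, in this -fPIC unit the following two statements are identical */
  RIP_REL_REF(next_early_pgt) = 2;
  next_early_pgt = 2;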
Signed-off-by: Ard Biesheuvel --- arch/x86/boot/startup/map_kernel.c | 41 ++++++++++---------- 1 file changed, 21 insertions(+), 20 deletions(-) diff --git a/arch/x86/boot/startup/map_kernel.c b/arch/x86/boot/startup/map_kernel.c index 5f1b7e0ba26e..0eac3f17dbd3 100644 --- a/arch/x86/boot/startup/map_kernel.c +++ b/arch/x86/boot/startup/map_kernel.c @@ -26,12 +26,12 @@ static inline bool check_la57_support(void) if (!(native_read_cr4() & X86_CR4_LA57)) return false; - RIP_REL_REF(__pgtable_l5_enabled) = 1; - RIP_REL_REF(pgdir_shift) = 48; - RIP_REL_REF(ptrs_per_p4d) = 512; - RIP_REL_REF(page_offset_base) = __PAGE_OFFSET_BASE_L5; - RIP_REL_REF(vmalloc_base) = __VMALLOC_BASE_L5; - RIP_REL_REF(vmemmap_base) = __VMEMMAP_BASE_L5; + __pgtable_l5_enabled = 1; + pgdir_shift = 48; + ptrs_per_p4d = 512; + page_offset_base = __PAGE_OFFSET_BASE_L5; + vmalloc_base = __VMALLOC_BASE_L5; + vmemmap_base = __VMEMMAP_BASE_L5; return true; } @@ -81,12 +81,14 @@ static unsigned long __head sme_postprocess_startup(struct boot_params *bp, return sme_get_me_mask(); } -/* Code in __startup_64() can be relocated during execution, but the compiler - * doesn't have to generate PC-relative relocations when accessing globals from - * that function. Clang actually does not generate them, which leads to - * boot-time crashes. To work around this problem, every global pointer must - * be accessed using RIP_REL_REF(). Kernel virtual addresses can be determined - * by subtracting p2v_offset from the RIP-relative address. +/* + * This code is compiled using PIC codegen because it will execute from the + * early 1:1 mapping of memory, which deviates from the mapping expected by the + * linker. Due to this deviation, taking the address of a global variable will + * produce an ambiguous result when using the plain & operator. Instead, + * rip_rel_ptr() must be used, which will return the RIP-relative address in + * the 1:1 mapping of memory. Kernel virtual addresses can be determined by + * subtracting p2v_offset from the RIP-relative address. */ unsigned long __head __startup_64(unsigned long p2v_offset, struct boot_params *bp) @@ -113,8 +115,7 @@ unsigned long __head __startup_64(unsigned long p2v_offset, * Compute the delta between the address I am compiled to run at * and the address I am actually running at. */ - load_delta = __START_KERNEL_map + p2v_offset; - RIP_REL_REF(phys_base) = load_delta; + phys_base = load_delta = __START_KERNEL_map + p2v_offset; /* Is the address not 2M aligned? */ if (load_delta & ~PMD_MASK) @@ -138,11 +139,11 @@ unsigned long __head __startup_64(unsigned long p2v_offset, pgd[pgd_index(__START_KERNEL_map)] = (pgdval_t)p4d | _PAGE_TABLE; } - RIP_REL_REF(level3_kernel_pgt)[PTRS_PER_PUD - 2].pud += load_delta; - RIP_REL_REF(level3_kernel_pgt)[PTRS_PER_PUD - 1].pud += load_delta; + level3_kernel_pgt[PTRS_PER_PUD - 2].pud += load_delta; + level3_kernel_pgt[PTRS_PER_PUD - 1].pud += load_delta; for (i = FIXMAP_PMD_TOP; i > FIXMAP_PMD_TOP - FIXMAP_PMD_NUM; i--) - RIP_REL_REF(level2_fixmap_pgt)[i].pmd += load_delta; + level2_fixmap_pgt[i].pmd += load_delta; /* * Set up the identity mapping for the switchover. 
These @@ -153,12 +154,12 @@ unsigned long __head __startup_64(unsigned long p2v_offset, pud = &early_pgts[0]->pmd; pmd = &early_pgts[1]->pmd; - RIP_REL_REF(next_early_pgt) = 2; + next_early_pgt = 2; pgtable_flags = _KERNPG_TABLE_NOENC + sme_get_me_mask(); if (la57) { - p4d = &early_pgts[RIP_REL_REF(next_early_pgt)++]->pmd; + p4d = &early_pgts[next_early_pgt++]->pmd; i = (physaddr >> PGDIR_SHIFT) % PTRS_PER_PGD; pgd[i + 0] = (pgdval_t)p4d + pgtable_flags; @@ -179,7 +180,7 @@ unsigned long __head __startup_64(unsigned long p2v_offset, pmd_entry = __PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL; /* Filter out unsupported __PAGE_KERNEL_* bits: */ - pmd_entry &= RIP_REL_REF(__supported_pte_mask); + pmd_entry &= __supported_pte_mask; pmd_entry += sme_get_me_mask(); pmd_entry += physaddr;
From patchwork Thu Apr 10 13:41:23 2025
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 880623
Date: Thu, 10 Apr 2025 15:41:23 +0200
Message-ID: <20250410134117.3713574-18-ardb+git@google.com>
In-Reply-To: <20250410134117.3713574-13-ardb+git@google.com>
Subject: [PATCH v4 05/11] x86/boot: Move early SME init code into startup/
From: Ard Biesheuvel
To: linux-efi@vger.kernel.org
Cc: x86@kernel.org, mingo@kernel.org, linux-kernel@vger.kernel.org, Ard Biesheuvel, Tom Lendacky, Dionna Amalie Glaze, Kevin Loughlin

Move the SME initialization code, which runs from the 1:1 mapping of
memory as it operates on the kernel virtual mapping, into the new
sub-directory arch/x86/boot/startup/ where all startup code will reside
that needs to tolerate executing from the 1:1 mapping.
Signed-off-by: Ard Biesheuvel --- arch/x86/boot/startup/Makefile | 1 + arch/x86/{mm/mem_encrypt_identity.c => boot/startup/sme.c} | 2 -- arch/x86/mm/Makefile | 6 ------ 3 files changed, 1 insertion(+), 8 deletions(-) diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile index 10319aee666b..ccdfc42a4d59 100644 --- a/arch/x86/boot/startup/Makefile +++ b/arch/x86/boot/startup/Makefile @@ -16,6 +16,7 @@ UBSAN_SANITIZE := n KCOV_INSTRUMENT := n obj-$(CONFIG_X86_64) += gdt_idt.o map_kernel.o +obj-$(CONFIG_AMD_MEM_ENCRYPT) += sme.o lib-$(CONFIG_X86_64) += la57toggle.o lib-$(CONFIG_EFI_MIXED) += efi-mixed.o diff --git a/arch/x86/mm/mem_encrypt_identity.c b/arch/x86/boot/startup/sme.c similarity index 99% rename from arch/x86/mm/mem_encrypt_identity.c rename to arch/x86/boot/startup/sme.c index e7fb3779b35f..23d10cda5b58 100644 --- a/arch/x86/mm/mem_encrypt_identity.c +++ b/arch/x86/boot/startup/sme.c @@ -45,8 +45,6 @@ #include #include -#include "mm_internal.h" - #define PGD_FLAGS _KERNPG_TABLE_NOENC #define P4D_FLAGS _KERNPG_TABLE_NOENC #define PUD_FLAGS _KERNPG_TABLE_NOENC diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 32035d5be5a0..3faa60f13a61 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -3,12 +3,10 @@ KCOV_INSTRUMENT_tlb.o := n KCOV_INSTRUMENT_mem_encrypt.o := n KCOV_INSTRUMENT_mem_encrypt_amd.o := n -KCOV_INSTRUMENT_mem_encrypt_identity.o := n KCOV_INSTRUMENT_pgprot.o := n KASAN_SANITIZE_mem_encrypt.o := n KASAN_SANITIZE_mem_encrypt_amd.o := n -KASAN_SANITIZE_mem_encrypt_identity.o := n KASAN_SANITIZE_pgprot.o := n # Disable KCSAN entirely, because otherwise we get warnings that some functions @@ -16,12 +14,10 @@ KASAN_SANITIZE_pgprot.o := n KCSAN_SANITIZE := n # Avoid recursion by not calling KMSAN hooks for CEA code. 
KMSAN_SANITIZE_cpu_entry_area.o := n -KMSAN_SANITIZE_mem_encrypt_identity.o := n ifdef CONFIG_FUNCTION_TRACER CFLAGS_REMOVE_mem_encrypt.o = -pg CFLAGS_REMOVE_mem_encrypt_amd.o = -pg -CFLAGS_REMOVE_mem_encrypt_identity.o = -pg CFLAGS_REMOVE_pgprot.o = -pg endif @@ -32,7 +28,6 @@ obj-y += pat/ # Make sure __phys_addr has no stackprotector CFLAGS_physaddr.o := -fno-stack-protector -CFLAGS_mem_encrypt_identity.o := -fno-stack-protector CFLAGS_fault.o := -I $(src)/../include/asm/trace @@ -63,5 +58,4 @@ obj-$(CONFIG_MITIGATION_PAGE_TABLE_ISOLATION) += pti.o obj-$(CONFIG_X86_MEM_ENCRYPT) += mem_encrypt.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_amd.o -obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_identity.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += mem_encrypt_boot.o
From patchwork Thu Apr 10 13:41:24 2025
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 879829
Date: Thu, 10 Apr 2025 15:41:24 +0200
Message-ID: <20250410134117.3713574-19-ardb+git@google.com>
In-Reply-To: <20250410134117.3713574-13-ardb+git@google.com>
Subject: [PATCH v4 06/11] x86/boot: Drop RIP_REL_REF() uses from SME startup code
From: Ard Biesheuvel
To: linux-efi@vger.kernel.org
Cc: x86@kernel.org, mingo@kernel.org, linux-kernel@vger.kernel.org, Ard Biesheuvel, Tom Lendacky, Dionna Amalie Glaze, Kevin Loughlin

RIP_REL_REF() has no effect on code residing in arch/x86/boot/startup, as
it is built with -fPIC. So remove any occurrences from the SME startup
code.

Note the SME is the only caller of cc_set_mask() that requires this, so
drop it from there as well.

Signed-off-by: Ard Biesheuvel --- arch/x86/boot/startup/sme.c | 11 +++++------ arch/x86/include/asm/coco.h | 2 +- arch/x86/include/asm/mem_encrypt.h | 2 +- 3 files changed, 7 insertions(+), 8 deletions(-) diff --git a/arch/x86/boot/startup/sme.c b/arch/x86/boot/startup/sme.c index 23d10cda5b58..5738b31c8e60 100644 --- a/arch/x86/boot/startup/sme.c +++ b/arch/x86/boot/startup/sme.c @@ -297,8 +297,7 @@ void __head sme_encrypt_kernel(struct boot_params *bp) * instrumentation or checking boot_cpu_data in the cc_platform_has() * function.
*/ - if (!sme_get_me_mask() || - RIP_REL_REF(sev_status) & MSR_AMD64_SEV_ENABLED) + if (!sme_get_me_mask() || sev_status & MSR_AMD64_SEV_ENABLED) return; /* @@ -524,7 +523,7 @@ void __head sme_enable(struct boot_params *bp) me_mask = 1UL << (ebx & 0x3f); /* Check the SEV MSR whether SEV or SME is enabled */ - RIP_REL_REF(sev_status) = msr = __rdmsr(MSR_AMD64_SEV); + sev_status = msr = __rdmsr(MSR_AMD64_SEV); feature_mask = (msr & MSR_AMD64_SEV_ENABLED) ? AMD_SEV_BIT : AMD_SME_BIT; /* @@ -560,8 +559,8 @@ void __head sme_enable(struct boot_params *bp) return; } - RIP_REL_REF(sme_me_mask) = me_mask; - RIP_REL_REF(physical_mask) &= ~me_mask; - RIP_REL_REF(cc_vendor) = CC_VENDOR_AMD; + sme_me_mask = me_mask; + physical_mask &= ~me_mask; + cc_vendor = CC_VENDOR_AMD; cc_set_mask(me_mask); } diff --git a/arch/x86/include/asm/coco.h b/arch/x86/include/asm/coco.h index e7225452963f..e1dbf8df1b69 100644 --- a/arch/x86/include/asm/coco.h +++ b/arch/x86/include/asm/coco.h @@ -22,7 +22,7 @@ static inline u64 cc_get_mask(void) static inline void cc_set_mask(u64 mask) { - RIP_REL_REF(cc_mask) = mask; + cc_mask = mask; } u64 cc_mkenc(u64 val); diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h index 1530ee301dfe..ea6494628cb0 100644 --- a/arch/x86/include/asm/mem_encrypt.h +++ b/arch/x86/include/asm/mem_encrypt.h @@ -61,7 +61,7 @@ void __init sev_es_init_vc_handling(void); static inline u64 sme_get_me_mask(void) { - return RIP_REL_REF(sme_me_mask); + return sme_me_mask; } #define __bss_decrypted __section(".bss..decrypted")
From patchwork Thu Apr 10 13:41:26 2025
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 880621
Date: Thu, 10 Apr 2025 15:41:26 +0200
Message-ID: <20250410134117.3713574-21-ardb+git@google.com>
In-Reply-To: <20250410134117.3713574-13-ardb+git@google.com>
Subject: [PATCH v4 08/11] x86/sev: Split off startup code from core code
From: Ard Biesheuvel
To: linux-efi@vger.kernel.org
Cc: x86@kernel.org, mingo@kernel.org, linux-kernel@vger.kernel.org, Ard Biesheuvel, Tom Lendacky, Dionna Amalie Glaze, Kevin Loughlin

Disentangle the SEV core code and the SEV code that is called during
early boot. The latter piece will be moved into startup/ in a subsequent
patch.
Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/sev.c | 2 + arch/x86/coco/sev/Makefile | 12 +- arch/x86/coco/sev/core.c | 1574 ++++---------------- arch/x86/coco/sev/shared.c | 281 ---- arch/x86/coco/sev/startup.c | 1395 +++++++++++++++++ 5 files changed, 1658 insertions(+), 1606 deletions(-) diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c index 478eca4f7180..714e30c66eae 100644 --- a/arch/x86/boot/compressed/sev.c +++ b/arch/x86/boot/compressed/sev.c @@ -141,6 +141,8 @@ u64 svsm_get_caa_pa(void) int svsm_perform_call_protocol(struct svsm_call *call); +u8 snp_vmpl; + /* Include code for early handlers */ #include "../../coco/sev/shared.c" diff --git a/arch/x86/coco/sev/Makefile b/arch/x86/coco/sev/Makefile index dcb06dc8b5ae..7d7d2aee62f0 100644 --- a/arch/x86/coco/sev/Makefile +++ b/arch/x86/coco/sev/Makefile @@ -1,18 +1,18 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y += core.o +obj-y += core.o startup.o # jump tables are emitted using absolute references in non-PIC code # so they cannot be used in the early SEV startup code -CFLAGS_core.o += -fno-jump-tables +CFLAGS_startup.o += -fno-jump-tables ifdef CONFIG_FUNCTION_TRACER -CFLAGS_REMOVE_core.o = -pg +CFLAGS_REMOVE_startup.o = -pg endif -KASAN_SANITIZE_core.o := n -KMSAN_SANITIZE_core.o := n -KCOV_INSTRUMENT_core.o := n +KASAN_SANITIZE_startup.o := n +KMSAN_SANITIZE_startup.o := n +KCOV_INSTRUMENT_startup.o := n # With some compiler versions the generated code results in boot hangs, caused # by several compilation units. To be safe, disable all instrumentation. diff --git a/arch/x86/coco/sev/core.c b/arch/x86/coco/sev/core.c index aeb7731862c0..26e3cf28c4c1 100644 --- a/arch/x86/coco/sev/core.c +++ b/arch/x86/coco/sev/core.c @@ -80,18 +80,6 @@ static const char * const sev_status_feat_names[] = { [MSR_AMD64_SNP_SMT_PROT_BIT] = "SMTProt", }; -/* For early boot hypervisor communication in SEV-ES enabled guests */ -struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE); - -/* - * Needs to be in the .data section because we need it NULL before bss is - * cleared - */ -struct ghcb *boot_ghcb __section(".data"); - -/* Bitmap of SEV features supported by the hypervisor */ -u64 sev_hv_features __ro_after_init; - /* Secrets page physical address from the CC blob */ static u64 secrets_pa __ro_after_init; @@ -104,14 +92,16 @@ static u64 snp_tsc_scale __ro_after_init; static u64 snp_tsc_offset __ro_after_init; static u64 snp_tsc_freq_khz __ro_after_init; - -/* For early boot SVSM communication */ -struct svsm_ca boot_svsm_ca_page __aligned(PAGE_SIZE); - DEFINE_PER_CPU(struct sev_es_runtime_data*, runtime_data); DEFINE_PER_CPU(struct sev_es_save_area *, sev_vmsa); -DEFINE_PER_CPU(struct svsm_ca *, svsm_caa); -DEFINE_PER_CPU(u64, svsm_caa_pa); + +/* + * SVSM related information: + * When running under an SVSM, the VMPL that Linux is executing at must be + * non-zero. The VMPL is therefore used to indicate the presence of an SVSM. 
+ */ +u8 snp_vmpl __ro_after_init; +EXPORT_SYMBOL_GPL(snp_vmpl); static __always_inline bool on_vc_stack(struct pt_regs *regs) { @@ -128,6 +118,7 @@ static __always_inline bool on_vc_stack(struct pt_regs *regs) return ((sp >= __this_cpu_ist_bottom_va(VC)) && (sp < __this_cpu_ist_top_va(VC))); } + /* * This function handles the case when an NMI is raised in the #VC * exception handler entry code, before the #VC handler has switched off @@ -184,397 +175,203 @@ void noinstr __sev_es_ist_exit(void) this_cpu_write(cpu_tss_rw.x86_tss.ist[IST_INDEX_VC], *(unsigned long *)ist); } -/* - * Nothing shall interrupt this code path while holding the per-CPU - * GHCB. The backup GHCB is only for NMIs interrupting this path. - * - * Callers must disable local interrupts around it. - */ -noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state) +static u64 __init get_snp_jump_table_addr(void) { - struct sev_es_runtime_data *data; - struct ghcb *ghcb; - - WARN_ON(!irqs_disabled()); - - data = this_cpu_read(runtime_data); - ghcb = &data->ghcb_page; - - if (unlikely(data->ghcb_active)) { - /* GHCB is already in use - save its contents */ - - if (unlikely(data->backup_ghcb_active)) { - /* - * Backup-GHCB is also already in use. There is no way - * to continue here so just kill the machine. To make - * panic() work, mark GHCBs inactive so that messages - * can be printed out. - */ - data->ghcb_active = false; - data->backup_ghcb_active = false; - - instrumentation_begin(); - panic("Unable to handle #VC exception! GHCB and Backup GHCB are already in use"); - instrumentation_end(); - } - - /* Mark backup_ghcb active before writing to it */ - data->backup_ghcb_active = true; - - state->ghcb = &data->backup_ghcb; + struct snp_secrets_page *secrets; + void __iomem *mem; + u64 addr; - /* Backup GHCB content */ - *state->ghcb = *ghcb; - } else { - state->ghcb = NULL; - data->ghcb_active = true; + mem = ioremap_encrypted(secrets_pa, PAGE_SIZE); + if (!mem) { + pr_err("Unable to locate AP jump table address: failed to map the SNP secrets page.\n"); + return 0; } - return ghcb; -} + secrets = (__force struct snp_secrets_page *)mem; -static int vc_fetch_insn_kernel(struct es_em_ctxt *ctxt, - unsigned char *buffer) -{ - return copy_from_kernel_nofault(buffer, (unsigned char *)ctxt->regs->ip, MAX_INSN_SIZE); + addr = secrets->os_area.ap_jump_table_pa; + iounmap(mem); + + return addr; } -static enum es_result __vc_decode_user_insn(struct es_em_ctxt *ctxt) +void noinstr __sev_es_nmi_complete(void) { - char buffer[MAX_INSN_SIZE]; - int insn_bytes; - - insn_bytes = insn_fetch_from_user_inatomic(ctxt->regs, buffer); - if (insn_bytes == 0) { - /* Nothing could be copied */ - ctxt->fi.vector = X86_TRAP_PF; - ctxt->fi.error_code = X86_PF_INSTR | X86_PF_USER; - ctxt->fi.cr2 = ctxt->regs->ip; - return ES_EXCEPTION; - } else if (insn_bytes == -EINVAL) { - /* Effective RIP could not be calculated */ - ctxt->fi.vector = X86_TRAP_GP; - ctxt->fi.error_code = 0; - ctxt->fi.cr2 = 0; - return ES_EXCEPTION; - } - - if (!insn_decode_from_regs(&ctxt->insn, ctxt->regs, buffer, insn_bytes)) - return ES_DECODE_FAILED; + struct ghcb_state state; + struct ghcb *ghcb; - if (ctxt->insn.immediate.got) - return ES_OK; - else - return ES_DECODE_FAILED; -} + ghcb = __sev_get_ghcb(&state); -static enum es_result __vc_decode_kern_insn(struct es_em_ctxt *ctxt) -{ - char buffer[MAX_INSN_SIZE]; - int res, ret; + vc_ghcb_invalidate(ghcb); + ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_NMI_COMPLETE); + ghcb_set_sw_exit_info_1(ghcb, 0); + 
ghcb_set_sw_exit_info_2(ghcb, 0); - res = vc_fetch_insn_kernel(ctxt, buffer); - if (res) { - ctxt->fi.vector = X86_TRAP_PF; - ctxt->fi.error_code = X86_PF_INSTR; - ctxt->fi.cr2 = ctxt->regs->ip; - return ES_EXCEPTION; - } + sev_es_wr_ghcb_msr(__pa_nodebug(ghcb)); + VMGEXIT(); - ret = insn_decode(&ctxt->insn, buffer, MAX_INSN_SIZE, INSN_MODE_64); - if (ret < 0) - return ES_DECODE_FAILED; - else - return ES_OK; + __sev_put_ghcb(&state); } -static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt) +static u64 __init get_jump_table_addr(void) { - if (user_mode(ctxt->regs)) - return __vc_decode_user_insn(ctxt); - else - return __vc_decode_kern_insn(ctxt); -} + struct ghcb_state state; + unsigned long flags; + struct ghcb *ghcb; + u64 ret = 0; -static enum es_result vc_write_mem(struct es_em_ctxt *ctxt, - char *dst, char *buf, size_t size) -{ - unsigned long error_code = X86_PF_PROT | X86_PF_WRITE; + if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) + return get_snp_jump_table_addr(); - /* - * This function uses __put_user() independent of whether kernel or user - * memory is accessed. This works fine because __put_user() does no - * sanity checks of the pointer being accessed. All that it does is - * to report when the access failed. - * - * Also, this function runs in atomic context, so __put_user() is not - * allowed to sleep. The page-fault handler detects that it is running - * in atomic context and will not try to take mmap_sem and handle the - * fault, so additional pagefault_enable()/disable() calls are not - * needed. - * - * The access can't be done via copy_to_user() here because - * vc_write_mem() must not use string instructions to access unsafe - * memory. The reason is that MOVS is emulated by the #VC handler by - * splitting the move up into a read and a write and taking a nested #VC - * exception on whatever of them is the MMIO access. Using string - * instructions here would cause infinite nesting. 
- */ - switch (size) { - case 1: { - u8 d1; - u8 __user *target = (u8 __user *)dst; - - memcpy(&d1, buf, 1); - if (__put_user(d1, target)) - goto fault; - break; - } - case 2: { - u16 d2; - u16 __user *target = (u16 __user *)dst; + local_irq_save(flags); - memcpy(&d2, buf, 2); - if (__put_user(d2, target)) - goto fault; - break; - } - case 4: { - u32 d4; - u32 __user *target = (u32 __user *)dst; + ghcb = __sev_get_ghcb(&state); - memcpy(&d4, buf, 4); - if (__put_user(d4, target)) - goto fault; - break; - } - case 8: { - u64 d8; - u64 __user *target = (u64 __user *)dst; + vc_ghcb_invalidate(ghcb); + ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_JUMP_TABLE); + ghcb_set_sw_exit_info_1(ghcb, SVM_VMGEXIT_GET_AP_JUMP_TABLE); + ghcb_set_sw_exit_info_2(ghcb, 0); - memcpy(&d8, buf, 8); - if (__put_user(d8, target)) - goto fault; - break; - } - default: - WARN_ONCE(1, "%s: Invalid size: %zu\n", __func__, size); - return ES_UNSUPPORTED; - } + sev_es_wr_ghcb_msr(__pa(ghcb)); + VMGEXIT(); - return ES_OK; + if (ghcb_sw_exit_info_1_is_valid(ghcb) && + ghcb_sw_exit_info_2_is_valid(ghcb)) + ret = ghcb->save.sw_exit_info_2; -fault: - if (user_mode(ctxt->regs)) - error_code |= X86_PF_USER; + __sev_put_ghcb(&state); - ctxt->fi.vector = X86_TRAP_PF; - ctxt->fi.error_code = error_code; - ctxt->fi.cr2 = (unsigned long)dst; + local_irq_restore(flags); - return ES_EXCEPTION; + return ret; } -static enum es_result vc_read_mem(struct es_em_ctxt *ctxt, - char *src, char *buf, size_t size) +static inline void __pval_terminate(u64 pfn, bool action, unsigned int page_size, + int ret, u64 svsm_ret) { - unsigned long error_code = X86_PF_PROT; - - /* - * This function uses __get_user() independent of whether kernel or user - * memory is accessed. This works fine because __get_user() does no - * sanity checks of the pointer being accessed. All that it does is - * to report when the access failed. - * - * Also, this function runs in atomic context, so __get_user() is not - * allowed to sleep. The page-fault handler detects that it is running - * in atomic context and will not try to take mmap_sem and handle the - * fault, so additional pagefault_enable()/disable() calls are not - * needed. - * - * The access can't be done via copy_from_user() here because - * vc_read_mem() must not use string instructions to access unsafe - * memory. The reason is that MOVS is emulated by the #VC handler by - * splitting the move up into a read and a write and taking a nested #VC - * exception on whatever of them is the MMIO access. Using string - * instructions here would cause infinite nesting. 
- */ - switch (size) { - case 1: { - u8 d1; - u8 __user *s = (u8 __user *)src; - - if (__get_user(d1, s)) - goto fault; - memcpy(buf, &d1, 1); - break; - } - case 2: { - u16 d2; - u16 __user *s = (u16 __user *)src; - - if (__get_user(d2, s)) - goto fault; - memcpy(buf, &d2, 2); - break; - } - case 4: { - u32 d4; - u32 __user *s = (u32 __user *)src; - - if (__get_user(d4, s)) - goto fault; - memcpy(buf, &d4, 4); - break; - } - case 8: { - u64 d8; - u64 __user *s = (u64 __user *)src; - if (__get_user(d8, s)) - goto fault; - memcpy(buf, &d8, 8); - break; - } - default: - WARN_ONCE(1, "%s: Invalid size: %zu\n", __func__, size); - return ES_UNSUPPORTED; - } + WARN(1, "PVALIDATE failure: pfn: 0x%llx, action: %u, size: %u, ret: %d, svsm_ret: 0x%llx\n", + pfn, action, page_size, ret, svsm_ret); - return ES_OK; + sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE); +} -fault: - if (user_mode(ctxt->regs)) - error_code |= X86_PF_USER; +static void svsm_pval_terminate(struct svsm_pvalidate_call *pc, int ret, u64 svsm_ret) +{ + unsigned int page_size; + bool action; + u64 pfn; - ctxt->fi.vector = X86_TRAP_PF; - ctxt->fi.error_code = error_code; - ctxt->fi.cr2 = (unsigned long)src; + pfn = pc->entry[pc->cur_index].pfn; + action = pc->entry[pc->cur_index].action; + page_size = pc->entry[pc->cur_index].page_size; - return ES_EXCEPTION; + __pval_terminate(pfn, action, page_size, ret, svsm_ret); } -static enum es_result vc_slow_virt_to_phys(struct ghcb *ghcb, struct es_em_ctxt *ctxt, - unsigned long vaddr, phys_addr_t *paddr) +static void pval_pages(struct snp_psc_desc *desc) { - unsigned long va = (unsigned long)vaddr; - unsigned int level; - phys_addr_t pa; - pgd_t *pgd; - pte_t *pte; - - pgd = __va(read_cr3_pa()); - pgd = &pgd[pgd_index(va)]; - pte = lookup_address_in_pgd(pgd, va, &level); - if (!pte) { - ctxt->fi.vector = X86_TRAP_PF; - ctxt->fi.cr2 = vaddr; - ctxt->fi.error_code = 0; - - if (user_mode(ctxt->regs)) - ctxt->fi.error_code |= X86_PF_USER; + struct psc_entry *e; + unsigned long vaddr; + unsigned int size; + unsigned int i; + bool validate; + u64 pfn; + int rc; - return ES_EXCEPTION; - } + for (i = 0; i <= desc->hdr.end_entry; i++) { + e = &desc->entries[i]; - if (WARN_ON_ONCE(pte_val(*pte) & _PAGE_ENC)) - /* Emulated MMIO to/from encrypted memory not supported */ - return ES_UNSUPPORTED; + pfn = e->gfn; + vaddr = (unsigned long)pfn_to_kaddr(pfn); + size = e->pagesize ? 
RMP_PG_SIZE_2M : RMP_PG_SIZE_4K; + validate = e->operation == SNP_PAGE_STATE_PRIVATE; - pa = (phys_addr_t)pte_pfn(*pte) << PAGE_SHIFT; - pa |= va & ~page_level_mask(level); + rc = pvalidate(vaddr, size, validate); + if (!rc) + continue; - *paddr = pa; + if (rc == PVALIDATE_FAIL_SIZEMISMATCH && size == RMP_PG_SIZE_2M) { + unsigned long vaddr_end = vaddr + PMD_SIZE; - return ES_OK; + for (; vaddr < vaddr_end; vaddr += PAGE_SIZE, pfn++) { + rc = pvalidate(vaddr, RMP_PG_SIZE_4K, validate); + if (rc) + __pval_terminate(pfn, validate, RMP_PG_SIZE_4K, rc, 0); + } + } else { + __pval_terminate(pfn, validate, size, rc, 0); + } + } } -static enum es_result vc_ioio_check(struct es_em_ctxt *ctxt, u16 port, size_t size) +static u64 svsm_build_ca_from_pfn_range(u64 pfn, u64 pfn_end, bool action, + struct svsm_pvalidate_call *pc) { - BUG_ON(size > 4); + struct svsm_pvalidate_entry *pe; - if (user_mode(ctxt->regs)) { - struct thread_struct *t = ¤t->thread; - struct io_bitmap *iobm = t->io_bitmap; - size_t idx; + /* Nothing in the CA yet */ + pc->num_entries = 0; + pc->cur_index = 0; - if (!iobm) - goto fault; + pe = &pc->entry[0]; - for (idx = port; idx < port + size; ++idx) { - if (test_bit(idx, iobm->bitmap)) - goto fault; - } - } + while (pfn < pfn_end) { + pe->page_size = RMP_PG_SIZE_4K; + pe->action = action; + pe->ignore_cf = 0; + pe->pfn = pfn; - return ES_OK; + pe++; + pfn++; -fault: - ctxt->fi.vector = X86_TRAP_GP; - ctxt->fi.error_code = 0; + pc->num_entries++; + if (pc->num_entries == SVSM_PVALIDATE_MAX_COUNT) + break; + } - return ES_EXCEPTION; + return pfn; } -static __always_inline void vc_forward_exception(struct es_em_ctxt *ctxt) +static int svsm_build_ca_from_psc_desc(struct snp_psc_desc *desc, unsigned int desc_entry, + struct svsm_pvalidate_call *pc) { - long error_code = ctxt->fi.error_code; - int trapnr = ctxt->fi.vector; - - ctxt->regs->orig_ax = ctxt->fi.error_code; - - switch (trapnr) { - case X86_TRAP_GP: - exc_general_protection(ctxt->regs, error_code); - break; - case X86_TRAP_UD: - exc_invalid_op(ctxt->regs); - break; - case X86_TRAP_PF: - write_cr2(ctxt->fi.cr2); - exc_page_fault(ctxt->regs, error_code); - break; - case X86_TRAP_AC: - exc_alignment_check(ctxt->regs, error_code); - break; - default: - pr_emerg("Unsupported exception in #VC instruction emulation - can't continue\n"); - BUG(); - } -} + struct svsm_pvalidate_entry *pe; + struct psc_entry *e; -/* Include code shared with pre-decompression boot stage */ -#include "shared.c" + /* Nothing in the CA yet */ + pc->num_entries = 0; + pc->cur_index = 0; -noinstr void __sev_put_ghcb(struct ghcb_state *state) -{ - struct sev_es_runtime_data *data; - struct ghcb *ghcb; + pe = &pc->entry[0]; + e = &desc->entries[desc_entry]; - WARN_ON(!irqs_disabled()); + while (desc_entry <= desc->hdr.end_entry) { + pe->page_size = e->pagesize ? RMP_PG_SIZE_2M : RMP_PG_SIZE_4K; + pe->action = e->operation == SNP_PAGE_STATE_PRIVATE; + pe->ignore_cf = 0; + pe->pfn = e->gfn; - data = this_cpu_read(runtime_data); - ghcb = &data->ghcb_page; + pe++; + e++; - if (state->ghcb) { - /* Restore GHCB from Backup */ - *ghcb = *state->ghcb; - data->backup_ghcb_active = false; - state->ghcb = NULL; - } else { - /* - * Invalidate the GHCB so a VMGEXIT instruction issued - * from userspace won't appear to be valid. 
- */ - vc_ghcb_invalidate(ghcb); - data->ghcb_active = false; + desc_entry++; + pc->num_entries++; + if (pc->num_entries == SVSM_PVALIDATE_MAX_COUNT) + break; } + + return desc_entry; } -int svsm_perform_call_protocol(struct svsm_call *call) +static void svsm_pval_pages(struct snp_psc_desc *desc) { - struct ghcb_state state; + struct svsm_pvalidate_entry pv_4k[VMGEXIT_PSC_MAX_ENTRY]; + unsigned int i, pv_4k_count = 0; + struct svsm_pvalidate_call *pc; + struct svsm_call call = {}; unsigned long flags; - struct ghcb *ghcb; + bool action; + u64 pc_pa; int ret; /* @@ -584,184 +381,149 @@ int svsm_perform_call_protocol(struct svsm_call *call) flags = native_local_irq_save(); /* - * Use rip-relative references when called early in the boot. If - * ghcbs_initialized is set, then it is late in the boot and no need - * to worry about rip-relative references in called functions. + * The SVSM calling area (CA) can support processing 510 entries at a + * time. Loop through the Page State Change descriptor until the CA is + * full or the last entry in the descriptor is reached, at which time + * the SVSM is invoked. This repeats until all entries in the descriptor + * are processed. */ - if (RIP_REL_REF(sev_cfg).ghcbs_initialized) - ghcb = __sev_get_ghcb(&state); - else if (RIP_REL_REF(boot_ghcb)) - ghcb = RIP_REL_REF(boot_ghcb); - else - ghcb = NULL; + call.caa = svsm_get_caa(); - do { - ret = ghcb ? svsm_perform_ghcb_protocol(ghcb, call) - : svsm_perform_msr_protocol(call); - } while (ret == -EAGAIN); + pc = (struct svsm_pvalidate_call *)call.caa->svsm_buffer; + pc_pa = svsm_get_caa_pa() + offsetof(struct svsm_ca, svsm_buffer); - if (RIP_REL_REF(sev_cfg).ghcbs_initialized) - __sev_put_ghcb(&state); + /* Protocol 0, Call ID 1 */ + call.rax = SVSM_CORE_CALL(SVSM_CORE_PVALIDATE); + call.rcx = pc_pa; - native_local_irq_restore(flags); + for (i = 0; i <= desc->hdr.end_entry;) { + i = svsm_build_ca_from_psc_desc(desc, i, pc); - return ret; -} + do { + ret = svsm_perform_call_protocol(&call); + if (!ret) + continue; -void noinstr __sev_es_nmi_complete(void) -{ - struct ghcb_state state; - struct ghcb *ghcb; + /* + * Check if the entry failed because of an RMP mismatch (a + * PVALIDATE at 2M was requested, but the page is mapped in + * the RMP as 4K). 
+ */ - ghcb = __sev_get_ghcb(&state); + if (call.rax_out == SVSM_PVALIDATE_FAIL_SIZEMISMATCH && + pc->entry[pc->cur_index].page_size == RMP_PG_SIZE_2M) { + /* Save this entry for post-processing at 4K */ + pv_4k[pv_4k_count++] = pc->entry[pc->cur_index]; + + /* Skip to the next one unless at the end of the list */ + pc->cur_index++; + if (pc->cur_index < pc->num_entries) + ret = -EAGAIN; + else + ret = 0; + } + } while (ret == -EAGAIN); - vc_ghcb_invalidate(ghcb); - ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_NMI_COMPLETE); - ghcb_set_sw_exit_info_1(ghcb, 0); - ghcb_set_sw_exit_info_2(ghcb, 0); + if (ret) + svsm_pval_terminate(pc, ret, call.rax_out); + } - sev_es_wr_ghcb_msr(__pa_nodebug(ghcb)); - VMGEXIT(); + /* Process any entries that failed to be validated at 2M and validate them at 4K */ + for (i = 0; i < pv_4k_count; i++) { + u64 pfn, pfn_end; - __sev_put_ghcb(&state); -} + action = pv_4k[i].action; + pfn = pv_4k[i].pfn; + pfn_end = pfn + 512; -static u64 __init get_snp_jump_table_addr(void) -{ - struct snp_secrets_page *secrets; - void __iomem *mem; - u64 addr; + while (pfn < pfn_end) { + pfn = svsm_build_ca_from_pfn_range(pfn, pfn_end, action, pc); - mem = ioremap_encrypted(secrets_pa, PAGE_SIZE); - if (!mem) { - pr_err("Unable to locate AP jump table address: failed to map the SNP secrets page.\n"); - return 0; + ret = svsm_perform_call_protocol(&call); + if (ret) + svsm_pval_terminate(pc, ret, call.rax_out); + } } - secrets = (__force struct snp_secrets_page *)mem; - - addr = secrets->os_area.ap_jump_table_pa; - iounmap(mem); - - return addr; + native_local_irq_restore(flags); } -static u64 __init get_jump_table_addr(void) +static void pvalidate_pages(struct snp_psc_desc *desc) { - struct ghcb_state state; - unsigned long flags; - struct ghcb *ghcb; - u64 ret = 0; - - if (cc_platform_has(CC_ATTR_GUEST_SEV_SNP)) - return get_snp_jump_table_addr(); - - local_irq_save(flags); + if (snp_vmpl) + svsm_pval_pages(desc); + else + pval_pages(desc); +} - ghcb = __sev_get_ghcb(&state); +static int vmgexit_psc(struct ghcb *ghcb, struct snp_psc_desc *desc) +{ + int cur_entry, end_entry, ret = 0; + struct snp_psc_desc *data; + struct es_em_ctxt ctxt; vc_ghcb_invalidate(ghcb); - ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_AP_JUMP_TABLE); - ghcb_set_sw_exit_info_1(ghcb, SVM_VMGEXIT_GET_AP_JUMP_TABLE); - ghcb_set_sw_exit_info_2(ghcb, 0); - - sev_es_wr_ghcb_msr(__pa(ghcb)); - VMGEXIT(); - - if (ghcb_sw_exit_info_1_is_valid(ghcb) && - ghcb_sw_exit_info_2_is_valid(ghcb)) - ret = ghcb->save.sw_exit_info_2; - - __sev_put_ghcb(&state); - local_irq_restore(flags); - - return ret; -} + /* Copy the input desc into GHCB shared buffer */ + data = (struct snp_psc_desc *)ghcb->shared_buffer; + memcpy(ghcb->shared_buffer, desc, min_t(int, GHCB_SHARED_BUF_SIZE, sizeof(*desc))); -void __head -early_set_pages_state(unsigned long vaddr, unsigned long paddr, - unsigned long npages, enum psc_op op) -{ - unsigned long paddr_end; - u64 val; - - vaddr = vaddr & PAGE_MASK; + /* + * As per the GHCB specification, the hypervisor can resume the guest + * before processing all the entries. Check whether all the entries + * are processed. If not, then keep retrying. Note, the hypervisor + * will update the data memory directly to indicate the status, so + * reference the data->hdr everywhere. + * + * The strategy here is to wait for the hypervisor to change the page + * state in the RMP table before guest accesses the memory pages. If the + * page state change was not successful, then later memory access will + * result in a crash. 
+ */ + cur_entry = data->hdr.cur_entry; + end_entry = data->hdr.end_entry; - paddr = paddr & PAGE_MASK; - paddr_end = paddr + (npages << PAGE_SHIFT); + while (data->hdr.cur_entry <= data->hdr.end_entry) { + ghcb_set_sw_scratch(ghcb, (u64)__pa(data)); - while (paddr < paddr_end) { - /* Page validation must be rescinded before changing to shared */ - if (op == SNP_PAGE_STATE_SHARED) - pvalidate_4k_page(vaddr, paddr, false); + /* This will advance the shared buffer data points to. */ + ret = sev_es_ghcb_hv_call(ghcb, &ctxt, SVM_VMGEXIT_PSC, 0, 0); /* - * Use the MSR protocol because this function can be called before - * the GHCB is established. + * Page State Change VMGEXIT can pass error code through + * exit_info_2. */ - sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op)); - VMGEXIT(); - - val = sev_es_rd_ghcb_msr(); - - if (GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) - goto e_term; - - if (GHCB_MSR_PSC_RESP_VAL(val)) - goto e_term; + if (WARN(ret || ghcb->save.sw_exit_info_2, + "SNP: PSC failed ret=%d exit_info_2=%llx\n", + ret, ghcb->save.sw_exit_info_2)) { + ret = 1; + goto out; + } - /* Page validation must be performed after changing to private */ - if (op == SNP_PAGE_STATE_PRIVATE) - pvalidate_4k_page(vaddr, paddr, true); + /* Verify that reserved bit is not set */ + if (WARN(data->hdr.reserved, "Reserved bit is set in the PSC header\n")) { + ret = 1; + goto out; + } - vaddr += PAGE_SIZE; - paddr += PAGE_SIZE; + /* + * Sanity check that entry processing is not going backwards. + * This will happen only if hypervisor is tricking us. + */ + if (WARN(data->hdr.end_entry > end_entry || cur_entry > data->hdr.cur_entry, +"SNP: PSC processing going backward, end_entry %d (got %d) cur_entry %d (got %d)\n", + end_entry, data->hdr.end_entry, cur_entry, data->hdr.cur_entry)) { + ret = 1; + goto out; + } } - return; - -e_term: - sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC); +out: + return ret; } -void __head early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, - unsigned long npages) -{ - /* - * This can be invoked in early boot while running identity mapped, so - * use an open coded check for SNP instead of using cc_platform_has(). - * This eliminates worries about jump tables or checking boot_cpu_data - * in the cc_platform_has() function. - */ - if (!(RIP_REL_REF(sev_status) & MSR_AMD64_SEV_SNP_ENABLED)) - return; - - /* - * Ask the hypervisor to mark the memory pages as private in the RMP - * table. - */ - early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_PRIVATE); -} - -void __head early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, - unsigned long npages) -{ - /* - * This can be invoked in early boot while running identity mapped, so - * use an open coded check for SNP instead of using cc_platform_has(). - * This eliminates worries about jump tables or checking boot_cpu_data - * in the cc_platform_has() function. - */ - if (!(RIP_REL_REF(sev_status) & MSR_AMD64_SEV_SNP_ENABLED)) - return; - - /* Ask hypervisor to mark the memory pages shared in the RMP table. 
*/ - early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED); -} - -static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr, - unsigned long vaddr_end, int op) +static unsigned long __set_pages_state(struct snp_psc_desc *data, unsigned long vaddr, + unsigned long vaddr_end, int op) { struct ghcb_state state; bool use_large_entry; @@ -1335,90 +1097,6 @@ int __init sev_es_efi_map_ghcbs(pgd_t *pgd) return 0; } -/* Writes to the SVSM CAA MSR are ignored */ -static enum es_result __vc_handle_msr_caa(struct pt_regs *regs, bool write) -{ - if (write) - return ES_OK; - - regs->ax = lower_32_bits(this_cpu_read(svsm_caa_pa)); - regs->dx = upper_32_bits(this_cpu_read(svsm_caa_pa)); - - return ES_OK; -} - -/* - * TSC related accesses should not exit to the hypervisor when a guest is - * executing with Secure TSC enabled, so special handling is required for - * accesses of MSR_IA32_TSC and MSR_AMD64_GUEST_TSC_FREQ. - */ -static enum es_result __vc_handle_secure_tsc_msrs(struct pt_regs *regs, bool write) -{ - u64 tsc; - - /* - * GUEST_TSC_FREQ should not be intercepted when Secure TSC is enabled. - * Terminate the SNP guest when the interception is enabled. - */ - if (regs->cx == MSR_AMD64_GUEST_TSC_FREQ) - return ES_VMM_ERROR; - - /* - * Writes: Writing to MSR_IA32_TSC can cause subsequent reads of the TSC - * to return undefined values, so ignore all writes. - * - * Reads: Reads of MSR_IA32_TSC should return the current TSC value, use - * the value returned by rdtsc_ordered(). - */ - if (write) { - WARN_ONCE(1, "TSC MSR writes are verboten!\n"); - return ES_OK; - } - - tsc = rdtsc_ordered(); - regs->ax = lower_32_bits(tsc); - regs->dx = upper_32_bits(tsc); - - return ES_OK; -} - -static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt) -{ - struct pt_regs *regs = ctxt->regs; - enum es_result ret; - bool write; - - /* Is it a WRMSR? 
*/ - write = ctxt->insn.opcode.bytes[1] == 0x30; - - switch (regs->cx) { - case MSR_SVSM_CAA: - return __vc_handle_msr_caa(regs, write); - case MSR_IA32_TSC: - case MSR_AMD64_GUEST_TSC_FREQ: - if (sev_status & MSR_AMD64_SNP_SECURE_TSC) - return __vc_handle_secure_tsc_msrs(regs, write); - break; - default: - break; - } - - ghcb_set_rcx(ghcb, regs->cx); - if (write) { - ghcb_set_rax(ghcb, regs->ax); - ghcb_set_rdx(ghcb, regs->dx); - } - - ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_MSR, write, 0); - - if ((ret == ES_OK) && !write) { - regs->ax = ghcb->save.rax; - regs->dx = ghcb->save.rdx; - } - - return ret; -} - static void snp_register_per_cpu_ghcb(void) { struct sev_es_runtime_data *data; @@ -1631,748 +1309,6 @@ void __init sev_es_init_vc_handling(void) initial_vc_handler = (unsigned long)kernel_exc_vmm_communication; } -static void __init vc_early_forward_exception(struct es_em_ctxt *ctxt) -{ - int trapnr = ctxt->fi.vector; - - if (trapnr == X86_TRAP_PF) - native_write_cr2(ctxt->fi.cr2); - - ctxt->regs->orig_ax = ctxt->fi.error_code; - do_early_exception(ctxt->regs, trapnr); -} - -static long *vc_insn_get_rm(struct es_em_ctxt *ctxt) -{ - long *reg_array; - int offset; - - reg_array = (long *)ctxt->regs; - offset = insn_get_modrm_rm_off(&ctxt->insn, ctxt->regs); - - if (offset < 0) - return NULL; - - offset /= sizeof(long); - - return reg_array + offset; -} -static enum es_result vc_do_mmio(struct ghcb *ghcb, struct es_em_ctxt *ctxt, - unsigned int bytes, bool read) -{ - u64 exit_code, exit_info_1, exit_info_2; - unsigned long ghcb_pa = __pa(ghcb); - enum es_result res; - phys_addr_t paddr; - void __user *ref; - - ref = insn_get_addr_ref(&ctxt->insn, ctxt->regs); - if (ref == (void __user *)-1L) - return ES_UNSUPPORTED; - - exit_code = read ? SVM_VMGEXIT_MMIO_READ : SVM_VMGEXIT_MMIO_WRITE; - - res = vc_slow_virt_to_phys(ghcb, ctxt, (unsigned long)ref, &paddr); - if (res != ES_OK) { - if (res == ES_EXCEPTION && !read) - ctxt->fi.error_code |= X86_PF_WRITE; - - return res; - } - - exit_info_1 = paddr; - /* Can never be greater than 8 */ - exit_info_2 = bytes; - - ghcb_set_sw_scratch(ghcb, ghcb_pa + offsetof(struct ghcb, shared_buffer)); - - return sev_es_ghcb_hv_call(ghcb, ctxt, exit_code, exit_info_1, exit_info_2); -} - -/* - * The MOVS instruction has two memory operands, which raises the - * problem that it is not known whether the access to the source or the - * destination caused the #VC exception (and hence whether an MMIO read - * or write operation needs to be emulated). - * - * Instead of playing games with walking page-tables and trying to guess - * whether the source or destination is an MMIO range, split the move - * into two operations, a read and a write with only one memory operand. - * This will cause a nested #VC exception on the MMIO address which can - * then be handled. - * - * This implementation has the benefit that it also supports MOVS where - * source _and_ destination are MMIO regions. - * - * It will slow MOVS on MMIO down a lot, but in SEV-ES guests it is a - * rare operation. If it turns out to be a performance problem the split - * operations can be moved to memcpy_fromio() and memcpy_toio(). 
- */ -static enum es_result vc_handle_mmio_movs(struct es_em_ctxt *ctxt, - unsigned int bytes) -{ - unsigned long ds_base, es_base; - unsigned char *src, *dst; - unsigned char buffer[8]; - enum es_result ret; - bool rep; - int off; - - ds_base = insn_get_seg_base(ctxt->regs, INAT_SEG_REG_DS); - es_base = insn_get_seg_base(ctxt->regs, INAT_SEG_REG_ES); - - if (ds_base == -1L || es_base == -1L) { - ctxt->fi.vector = X86_TRAP_GP; - ctxt->fi.error_code = 0; - return ES_EXCEPTION; - } - - src = ds_base + (unsigned char *)ctxt->regs->si; - dst = es_base + (unsigned char *)ctxt->regs->di; - - ret = vc_read_mem(ctxt, src, buffer, bytes); - if (ret != ES_OK) - return ret; - - ret = vc_write_mem(ctxt, dst, buffer, bytes); - if (ret != ES_OK) - return ret; - - if (ctxt->regs->flags & X86_EFLAGS_DF) - off = -bytes; - else - off = bytes; - - ctxt->regs->si += off; - ctxt->regs->di += off; - - rep = insn_has_rep_prefix(&ctxt->insn); - if (rep) - ctxt->regs->cx -= 1; - - if (!rep || ctxt->regs->cx == 0) - return ES_OK; - else - return ES_RETRY; -} - -static enum es_result vc_handle_mmio(struct ghcb *ghcb, struct es_em_ctxt *ctxt) -{ - struct insn *insn = &ctxt->insn; - enum insn_mmio_type mmio; - unsigned int bytes = 0; - enum es_result ret; - u8 sign_byte; - long *reg_data; - - mmio = insn_decode_mmio(insn, &bytes); - if (mmio == INSN_MMIO_DECODE_FAILED) - return ES_DECODE_FAILED; - - if (mmio != INSN_MMIO_WRITE_IMM && mmio != INSN_MMIO_MOVS) { - reg_data = insn_get_modrm_reg_ptr(insn, ctxt->regs); - if (!reg_data) - return ES_DECODE_FAILED; - } - - if (user_mode(ctxt->regs)) - return ES_UNSUPPORTED; - - switch (mmio) { - case INSN_MMIO_WRITE: - memcpy(ghcb->shared_buffer, reg_data, bytes); - ret = vc_do_mmio(ghcb, ctxt, bytes, false); - break; - case INSN_MMIO_WRITE_IMM: - memcpy(ghcb->shared_buffer, insn->immediate1.bytes, bytes); - ret = vc_do_mmio(ghcb, ctxt, bytes, false); - break; - case INSN_MMIO_READ: - ret = vc_do_mmio(ghcb, ctxt, bytes, true); - if (ret) - break; - - /* Zero-extend for 32-bit operation */ - if (bytes == 4) - *reg_data = 0; - - memcpy(reg_data, ghcb->shared_buffer, bytes); - break; - case INSN_MMIO_READ_ZERO_EXTEND: - ret = vc_do_mmio(ghcb, ctxt, bytes, true); - if (ret) - break; - - /* Zero extend based on operand size */ - memset(reg_data, 0, insn->opnd_bytes); - memcpy(reg_data, ghcb->shared_buffer, bytes); - break; - case INSN_MMIO_READ_SIGN_EXTEND: - ret = vc_do_mmio(ghcb, ctxt, bytes, true); - if (ret) - break; - - if (bytes == 1) { - u8 *val = (u8 *)ghcb->shared_buffer; - - sign_byte = (*val & 0x80) ? 0xff : 0x00; - } else { - u16 *val = (u16 *)ghcb->shared_buffer; - - sign_byte = (*val & 0x8000) ? 
0xff : 0x00; - } - - /* Sign extend based on operand size */ - memset(reg_data, sign_byte, insn->opnd_bytes); - memcpy(reg_data, ghcb->shared_buffer, bytes); - break; - case INSN_MMIO_MOVS: - ret = vc_handle_mmio_movs(ctxt, bytes); - break; - default: - ret = ES_UNSUPPORTED; - break; - } - - return ret; -} - -static enum es_result vc_handle_dr7_write(struct ghcb *ghcb, - struct es_em_ctxt *ctxt) -{ - struct sev_es_runtime_data *data = this_cpu_read(runtime_data); - long val, *reg = vc_insn_get_rm(ctxt); - enum es_result ret; - - if (sev_status & MSR_AMD64_SNP_DEBUG_SWAP) - return ES_VMM_ERROR; - - if (!reg) - return ES_DECODE_FAILED; - - val = *reg; - - /* Upper 32 bits must be written as zeroes */ - if (val >> 32) { - ctxt->fi.vector = X86_TRAP_GP; - ctxt->fi.error_code = 0; - return ES_EXCEPTION; - } - - /* Clear out other reserved bits and set bit 10 */ - val = (val & 0xffff23ffL) | BIT(10); - - /* Early non-zero writes to DR7 are not supported */ - if (!data && (val & ~DR7_RESET_VALUE)) - return ES_UNSUPPORTED; - - /* Using a value of 0 for ExitInfo1 means RAX holds the value */ - ghcb_set_rax(ghcb, val); - ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_WRITE_DR7, 0, 0); - if (ret != ES_OK) - return ret; - - if (data) - data->dr7 = val; - - return ES_OK; -} - -static enum es_result vc_handle_dr7_read(struct ghcb *ghcb, - struct es_em_ctxt *ctxt) -{ - struct sev_es_runtime_data *data = this_cpu_read(runtime_data); - long *reg = vc_insn_get_rm(ctxt); - - if (sev_status & MSR_AMD64_SNP_DEBUG_SWAP) - return ES_VMM_ERROR; - - if (!reg) - return ES_DECODE_FAILED; - - if (data) - *reg = data->dr7; - else - *reg = DR7_RESET_VALUE; - - return ES_OK; -} - -static enum es_result vc_handle_wbinvd(struct ghcb *ghcb, - struct es_em_ctxt *ctxt) -{ - return sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_WBINVD, 0, 0); -} - -static enum es_result vc_handle_rdpmc(struct ghcb *ghcb, struct es_em_ctxt *ctxt) -{ - enum es_result ret; - - ghcb_set_rcx(ghcb, ctxt->regs->cx); - - ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_RDPMC, 0, 0); - if (ret != ES_OK) - return ret; - - if (!(ghcb_rax_is_valid(ghcb) && ghcb_rdx_is_valid(ghcb))) - return ES_VMM_ERROR; - - ctxt->regs->ax = ghcb->save.rax; - ctxt->regs->dx = ghcb->save.rdx; - - return ES_OK; -} - -static enum es_result vc_handle_monitor(struct ghcb *ghcb, - struct es_em_ctxt *ctxt) -{ - /* - * Treat it as a NOP and do not leak a physical address to the - * hypervisor. - */ - return ES_OK; -} - -static enum es_result vc_handle_mwait(struct ghcb *ghcb, - struct es_em_ctxt *ctxt) -{ - /* Treat the same as MONITOR/MONITORX */ - return ES_OK; -} - -static enum es_result vc_handle_vmmcall(struct ghcb *ghcb, - struct es_em_ctxt *ctxt) -{ - enum es_result ret; - - ghcb_set_rax(ghcb, ctxt->regs->ax); - ghcb_set_cpl(ghcb, user_mode(ctxt->regs) ? 3 : 0); - - if (x86_platform.hyper.sev_es_hcall_prepare) - x86_platform.hyper.sev_es_hcall_prepare(ghcb, ctxt->regs); - - ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_VMMCALL, 0, 0); - if (ret != ES_OK) - return ret; - - if (!ghcb_rax_is_valid(ghcb)) - return ES_VMM_ERROR; - - ctxt->regs->ax = ghcb->save.rax; - - /* - * Call sev_es_hcall_finish() after regs->ax is already set. - * This allows the hypervisor handler to overwrite it again if - * necessary. 
- */ - if (x86_platform.hyper.sev_es_hcall_finish && - !x86_platform.hyper.sev_es_hcall_finish(ghcb, ctxt->regs)) - return ES_VMM_ERROR; - - return ES_OK; -} - -static enum es_result vc_handle_trap_ac(struct ghcb *ghcb, - struct es_em_ctxt *ctxt) -{ - /* - * Calling ecx_alignment_check() directly does not work, because it - * enables IRQs and the GHCB is active. Forward the exception and call - * it later from vc_forward_exception(). - */ - ctxt->fi.vector = X86_TRAP_AC; - ctxt->fi.error_code = 0; - return ES_EXCEPTION; -} - -static enum es_result vc_handle_exitcode(struct es_em_ctxt *ctxt, - struct ghcb *ghcb, - unsigned long exit_code) -{ - enum es_result result = vc_check_opcode_bytes(ctxt, exit_code); - - if (result != ES_OK) - return result; - - switch (exit_code) { - case SVM_EXIT_READ_DR7: - result = vc_handle_dr7_read(ghcb, ctxt); - break; - case SVM_EXIT_WRITE_DR7: - result = vc_handle_dr7_write(ghcb, ctxt); - break; - case SVM_EXIT_EXCP_BASE + X86_TRAP_AC: - result = vc_handle_trap_ac(ghcb, ctxt); - break; - case SVM_EXIT_RDTSC: - case SVM_EXIT_RDTSCP: - result = vc_handle_rdtsc(ghcb, ctxt, exit_code); - break; - case SVM_EXIT_RDPMC: - result = vc_handle_rdpmc(ghcb, ctxt); - break; - case SVM_EXIT_INVD: - pr_err_ratelimited("#VC exception for INVD??? Seriously???\n"); - result = ES_UNSUPPORTED; - break; - case SVM_EXIT_CPUID: - result = vc_handle_cpuid(ghcb, ctxt); - break; - case SVM_EXIT_IOIO: - result = vc_handle_ioio(ghcb, ctxt); - break; - case SVM_EXIT_MSR: - result = vc_handle_msr(ghcb, ctxt); - break; - case SVM_EXIT_VMMCALL: - result = vc_handle_vmmcall(ghcb, ctxt); - break; - case SVM_EXIT_WBINVD: - result = vc_handle_wbinvd(ghcb, ctxt); - break; - case SVM_EXIT_MONITOR: - result = vc_handle_monitor(ghcb, ctxt); - break; - case SVM_EXIT_MWAIT: - result = vc_handle_mwait(ghcb, ctxt); - break; - case SVM_EXIT_NPF: - result = vc_handle_mmio(ghcb, ctxt); - break; - default: - /* - * Unexpected #VC exception - */ - result = ES_UNSUPPORTED; - } - - return result; -} - -static __always_inline bool is_vc2_stack(unsigned long sp) -{ - return (sp >= __this_cpu_ist_bottom_va(VC2) && sp < __this_cpu_ist_top_va(VC2)); -} - -static __always_inline bool vc_from_invalid_context(struct pt_regs *regs) -{ - unsigned long sp, prev_sp; - - sp = (unsigned long)regs; - prev_sp = regs->sp; - - /* - * If the code was already executing on the VC2 stack when the #VC - * happened, let it proceed to the normal handling routine. This way the - * code executing on the VC2 stack can cause #VC exceptions to get handled. 
- */ - return is_vc2_stack(sp) && !is_vc2_stack(prev_sp); -} - -static bool vc_raw_handle_exception(struct pt_regs *regs, unsigned long error_code) -{ - struct ghcb_state state; - struct es_em_ctxt ctxt; - enum es_result result; - struct ghcb *ghcb; - bool ret = true; - - ghcb = __sev_get_ghcb(&state); - - vc_ghcb_invalidate(ghcb); - result = vc_init_em_ctxt(&ctxt, regs, error_code); - - if (result == ES_OK) - result = vc_handle_exitcode(&ctxt, ghcb, error_code); - - __sev_put_ghcb(&state); - - /* Done - now check the result */ - switch (result) { - case ES_OK: - vc_finish_insn(&ctxt); - break; - case ES_UNSUPPORTED: - pr_err_ratelimited("Unsupported exit-code 0x%02lx in #VC exception (IP: 0x%lx)\n", - error_code, regs->ip); - ret = false; - break; - case ES_VMM_ERROR: - pr_err_ratelimited("Failure in communication with VMM (exit-code 0x%02lx IP: 0x%lx)\n", - error_code, regs->ip); - ret = false; - break; - case ES_DECODE_FAILED: - pr_err_ratelimited("Failed to decode instruction (exit-code 0x%02lx IP: 0x%lx)\n", - error_code, regs->ip); - ret = false; - break; - case ES_EXCEPTION: - vc_forward_exception(&ctxt); - break; - case ES_RETRY: - /* Nothing to do */ - break; - default: - pr_emerg("Unknown result in %s():%d\n", __func__, result); - /* - * Emulating the instruction which caused the #VC exception - * failed - can't continue so print debug information - */ - BUG(); - } - - return ret; -} - -static __always_inline bool vc_is_db(unsigned long error_code) -{ - return error_code == SVM_EXIT_EXCP_BASE + X86_TRAP_DB; -} - -/* - * Runtime #VC exception handler when raised from kernel mode. Runs in NMI mode - * and will panic when an error happens. - */ -DEFINE_IDTENTRY_VC_KERNEL(exc_vmm_communication) -{ - irqentry_state_t irq_state; - - /* - * With the current implementation it is always possible to switch to a - * safe stack because #VC exceptions only happen at known places, like - * intercepted instructions or accesses to MMIO areas/IO ports. They can - * also happen with code instrumentation when the hypervisor intercepts - * #DB, but the critical paths are forbidden to be instrumented, so #DB - * exceptions currently also only happen in safe places. - * - * But keep this here in case the noinstr annotations are violated due - * to bug elsewhere. - */ - if (unlikely(vc_from_invalid_context(regs))) { - instrumentation_begin(); - panic("Can't handle #VC exception from unsupported context\n"); - instrumentation_end(); - } - - /* - * Handle #DB before calling into !noinstr code to avoid recursive #DB. - */ - if (vc_is_db(error_code)) { - exc_debug(regs); - return; - } - - irq_state = irqentry_nmi_enter(regs); - - instrumentation_begin(); - - if (!vc_raw_handle_exception(regs, error_code)) { - /* Show some debug info */ - show_regs(regs); - - /* Ask hypervisor to sev_es_terminate */ - sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ); - - /* If that fails and we get here - just panic */ - panic("Returned from Terminate-Request to Hypervisor\n"); - } - - instrumentation_end(); - irqentry_nmi_exit(regs, irq_state); -} - -/* - * Runtime #VC exception handler when raised from user mode. Runs in IRQ mode - * and will kill the current task with SIGBUS when an error happens. - */ -DEFINE_IDTENTRY_VC_USER(exc_vmm_communication) -{ - /* - * Handle #DB before calling into !noinstr code to avoid recursive #DB. 
- */ - if (vc_is_db(error_code)) { - noist_exc_debug(regs); - return; - } - - irqentry_enter_from_user_mode(regs); - instrumentation_begin(); - - if (!vc_raw_handle_exception(regs, error_code)) { - /* - * Do not kill the machine if user-space triggered the - * exception. Send SIGBUS instead and let user-space deal with - * it. - */ - force_sig_fault(SIGBUS, BUS_OBJERR, (void __user *)0); - } - - instrumentation_end(); - irqentry_exit_to_user_mode(regs); -} - -bool __init handle_vc_boot_ghcb(struct pt_regs *regs) -{ - unsigned long exit_code = regs->orig_ax; - struct es_em_ctxt ctxt; - enum es_result result; - - vc_ghcb_invalidate(boot_ghcb); - - result = vc_init_em_ctxt(&ctxt, regs, exit_code); - if (result == ES_OK) - result = vc_handle_exitcode(&ctxt, boot_ghcb, exit_code); - - /* Done - now check the result */ - switch (result) { - case ES_OK: - vc_finish_insn(&ctxt); - break; - case ES_UNSUPPORTED: - early_printk("PANIC: Unsupported exit-code 0x%02lx in early #VC exception (IP: 0x%lx)\n", - exit_code, regs->ip); - goto fail; - case ES_VMM_ERROR: - early_printk("PANIC: Failure in communication with VMM (exit-code 0x%02lx IP: 0x%lx)\n", - exit_code, regs->ip); - goto fail; - case ES_DECODE_FAILED: - early_printk("PANIC: Failed to decode instruction (exit-code 0x%02lx IP: 0x%lx)\n", - exit_code, regs->ip); - goto fail; - case ES_EXCEPTION: - vc_early_forward_exception(&ctxt); - break; - case ES_RETRY: - /* Nothing to do */ - break; - default: - BUG(); - } - - return true; - -fail: - show_regs(regs); - - sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ); -} - -/* - * Initial set up of SNP relies on information provided by the - * Confidential Computing blob, which can be passed to the kernel - * in the following ways, depending on how it is booted: - * - * - when booted via the boot/decompress kernel: - * - via boot_params - * - * - when booted directly by firmware/bootloader (e.g. CONFIG_PVH): - * - via a setup_data entry, as defined by the Linux Boot Protocol - * - * Scan for the blob in that order. - */ -static __head struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp) -{ - struct cc_blob_sev_info *cc_info; - - /* Boot kernel would have passed the CC blob via boot_params. */ - if (bp->cc_blob_address) { - cc_info = (struct cc_blob_sev_info *)(unsigned long)bp->cc_blob_address; - goto found_cc_info; - } - - /* - * If kernel was booted directly, without the use of the - * boot/decompression kernel, the CC blob may have been passed via - * setup_data instead. - */ - cc_info = find_cc_blob_setup_data(bp); - if (!cc_info) - return NULL; - -found_cc_info: - if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC) - snp_abort(); - - return cc_info; -} - -static __head void svsm_setup(struct cc_blob_sev_info *cc_info) -{ - struct svsm_call call = {}; - int ret; - u64 pa; - - /* - * Record the SVSM Calling Area address (CAA) if the guest is not - * running at VMPL0. The CA will be used to communicate with the - * SVSM to perform the SVSM services. - */ - if (!svsm_setup_ca(cc_info)) - return; - - /* - * It is very early in the boot and the kernel is running identity - * mapped but without having adjusted the pagetables to where the - * kernel was loaded (physbase), so the get the CA address using - * RIP-relative addressing. - */ - pa = (u64)rip_rel_ptr(&boot_svsm_ca_page); - - /* - * Switch over to the boot SVSM CA while the current CA is still - * addressable. There is no GHCB at this point so use the MSR protocol. 
- * - * SVSM_CORE_REMAP_CA call: - * RAX = 0 (Protocol=0, CallID=0) - * RCX = New CA GPA - */ - call.caa = svsm_get_caa(); - call.rax = SVSM_CORE_CALL(SVSM_CORE_REMAP_CA); - call.rcx = pa; - ret = svsm_perform_call_protocol(&call); - if (ret) - sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SVSM_CA_REMAP_FAIL); - - RIP_REL_REF(boot_svsm_caa) = (struct svsm_ca *)pa; - RIP_REL_REF(boot_svsm_caa_pa) = pa; -} - -bool __head snp_init(struct boot_params *bp) -{ - struct cc_blob_sev_info *cc_info; - - if (!bp) - return false; - - cc_info = find_cc_blob(bp); - if (!cc_info) - return false; - - if (cc_info->secrets_phys && cc_info->secrets_len == PAGE_SIZE) - secrets_pa = cc_info->secrets_phys; - else - return false; - - setup_cpuid_table(cc_info); - - svsm_setup(cc_info); - - /* - * The CC blob will be used later to access the secrets page. Cache - * it here like the boot kernel does. - */ - bp->cc_blob_address = (u32)(unsigned long)cc_info; - - return true; -} - -void __head __noreturn snp_abort(void) -{ - sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED); -} - /* * SEV-SNP guests should only execute dmi_setup() if EFI_CONFIG_TABLES are * enabled, as the alternative (fallback) logic for DMI probing in the legacy diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/coco/sev/shared.c index a7c94020e384..815542295f16 100644 --- a/arch/x86/coco/sev/shared.c +++ b/arch/x86/coco/sev/shared.c @@ -27,17 +27,12 @@ /* * SVSM related information: - * When running under an SVSM, the VMPL that Linux is executing at must be - * non-zero. The VMPL is therefore used to indicate the presence of an SVSM. - * * During boot, the page tables are set up as identity mapped and later * changed to use kernel virtual addresses. Maintain separate virtual and * physical addresses for the CAA to allow SVSM functions to be used during * early boot, both with identity mapped virtual addresses and proper kernel * virtual addresses. */ -u8 snp_vmpl __ro_after_init; -EXPORT_SYMBOL_GPL(snp_vmpl); struct svsm_ca *boot_svsm_caa __ro_after_init; u64 boot_svsm_caa_pa __ro_after_init; @@ -1192,28 +1187,6 @@ static void __head setup_cpuid_table(const struct cc_blob_sev_info *cc_info) } } -static inline void __pval_terminate(u64 pfn, bool action, unsigned int page_size, - int ret, u64 svsm_ret) -{ - WARN(1, "PVALIDATE failure: pfn: 0x%llx, action: %u, size: %u, ret: %d, svsm_ret: 0x%llx\n", - pfn, action, page_size, ret, svsm_ret); - - sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PVALIDATE); -} - -static void svsm_pval_terminate(struct svsm_pvalidate_call *pc, int ret, u64 svsm_ret) -{ - unsigned int page_size; - bool action; - u64 pfn; - - pfn = pc->entry[pc->cur_index].pfn; - action = pc->entry[pc->cur_index].action; - page_size = pc->entry[pc->cur_index].page_size; - - __pval_terminate(pfn, action, page_size, ret, svsm_ret); -} - static void __head svsm_pval_4k_page(unsigned long paddr, bool validate) { struct svsm_pvalidate_call *pc; @@ -1269,260 +1242,6 @@ static void __head pvalidate_4k_page(unsigned long vaddr, unsigned long paddr, } } -static void pval_pages(struct snp_psc_desc *desc) -{ - struct psc_entry *e; - unsigned long vaddr; - unsigned int size; - unsigned int i; - bool validate; - u64 pfn; - int rc; - - for (i = 0; i <= desc->hdr.end_entry; i++) { - e = &desc->entries[i]; - - pfn = e->gfn; - vaddr = (unsigned long)pfn_to_kaddr(pfn); - size = e->pagesize ? 
RMP_PG_SIZE_2M : RMP_PG_SIZE_4K; - validate = e->operation == SNP_PAGE_STATE_PRIVATE; - - rc = pvalidate(vaddr, size, validate); - if (!rc) - continue; - - if (rc == PVALIDATE_FAIL_SIZEMISMATCH && size == RMP_PG_SIZE_2M) { - unsigned long vaddr_end = vaddr + PMD_SIZE; - - for (; vaddr < vaddr_end; vaddr += PAGE_SIZE, pfn++) { - rc = pvalidate(vaddr, RMP_PG_SIZE_4K, validate); - if (rc) - __pval_terminate(pfn, validate, RMP_PG_SIZE_4K, rc, 0); - } - } else { - __pval_terminate(pfn, validate, size, rc, 0); - } - } -} - -static u64 svsm_build_ca_from_pfn_range(u64 pfn, u64 pfn_end, bool action, - struct svsm_pvalidate_call *pc) -{ - struct svsm_pvalidate_entry *pe; - - /* Nothing in the CA yet */ - pc->num_entries = 0; - pc->cur_index = 0; - - pe = &pc->entry[0]; - - while (pfn < pfn_end) { - pe->page_size = RMP_PG_SIZE_4K; - pe->action = action; - pe->ignore_cf = 0; - pe->pfn = pfn; - - pe++; - pfn++; - - pc->num_entries++; - if (pc->num_entries == SVSM_PVALIDATE_MAX_COUNT) - break; - } - - return pfn; -} - -static int svsm_build_ca_from_psc_desc(struct snp_psc_desc *desc, unsigned int desc_entry, - struct svsm_pvalidate_call *pc) -{ - struct svsm_pvalidate_entry *pe; - struct psc_entry *e; - - /* Nothing in the CA yet */ - pc->num_entries = 0; - pc->cur_index = 0; - - pe = &pc->entry[0]; - e = &desc->entries[desc_entry]; - - while (desc_entry <= desc->hdr.end_entry) { - pe->page_size = e->pagesize ? RMP_PG_SIZE_2M : RMP_PG_SIZE_4K; - pe->action = e->operation == SNP_PAGE_STATE_PRIVATE; - pe->ignore_cf = 0; - pe->pfn = e->gfn; - - pe++; - e++; - - desc_entry++; - pc->num_entries++; - if (pc->num_entries == SVSM_PVALIDATE_MAX_COUNT) - break; - } - - return desc_entry; -} - -static void svsm_pval_pages(struct snp_psc_desc *desc) -{ - struct svsm_pvalidate_entry pv_4k[VMGEXIT_PSC_MAX_ENTRY]; - unsigned int i, pv_4k_count = 0; - struct svsm_pvalidate_call *pc; - struct svsm_call call = {}; - unsigned long flags; - bool action; - u64 pc_pa; - int ret; - - /* - * This can be called very early in the boot, use native functions in - * order to avoid paravirt issues. - */ - flags = native_local_irq_save(); - - /* - * The SVSM calling area (CA) can support processing 510 entries at a - * time. Loop through the Page State Change descriptor until the CA is - * full or the last entry in the descriptor is reached, at which time - * the SVSM is invoked. This repeats until all entries in the descriptor - * are processed. - */ - call.caa = svsm_get_caa(); - - pc = (struct svsm_pvalidate_call *)call.caa->svsm_buffer; - pc_pa = svsm_get_caa_pa() + offsetof(struct svsm_ca, svsm_buffer); - - /* Protocol 0, Call ID 1 */ - call.rax = SVSM_CORE_CALL(SVSM_CORE_PVALIDATE); - call.rcx = pc_pa; - - for (i = 0; i <= desc->hdr.end_entry;) { - i = svsm_build_ca_from_psc_desc(desc, i, pc); - - do { - ret = svsm_perform_call_protocol(&call); - if (!ret) - continue; - - /* - * Check if the entry failed because of an RMP mismatch (a - * PVALIDATE at 2M was requested, but the page is mapped in - * the RMP as 4K). 
- */ - - if (call.rax_out == SVSM_PVALIDATE_FAIL_SIZEMISMATCH && - pc->entry[pc->cur_index].page_size == RMP_PG_SIZE_2M) { - /* Save this entry for post-processing at 4K */ - pv_4k[pv_4k_count++] = pc->entry[pc->cur_index]; - - /* Skip to the next one unless at the end of the list */ - pc->cur_index++; - if (pc->cur_index < pc->num_entries) - ret = -EAGAIN; - else - ret = 0; - } - } while (ret == -EAGAIN); - - if (ret) - svsm_pval_terminate(pc, ret, call.rax_out); - } - - /* Process any entries that failed to be validated at 2M and validate them at 4K */ - for (i = 0; i < pv_4k_count; i++) { - u64 pfn, pfn_end; - - action = pv_4k[i].action; - pfn = pv_4k[i].pfn; - pfn_end = pfn + 512; - - while (pfn < pfn_end) { - pfn = svsm_build_ca_from_pfn_range(pfn, pfn_end, action, pc); - - ret = svsm_perform_call_protocol(&call); - if (ret) - svsm_pval_terminate(pc, ret, call.rax_out); - } - } - - native_local_irq_restore(flags); -} - -static void pvalidate_pages(struct snp_psc_desc *desc) -{ - if (snp_vmpl) - svsm_pval_pages(desc); - else - pval_pages(desc); -} - -static int vmgexit_psc(struct ghcb *ghcb, struct snp_psc_desc *desc) -{ - int cur_entry, end_entry, ret = 0; - struct snp_psc_desc *data; - struct es_em_ctxt ctxt; - - vc_ghcb_invalidate(ghcb); - - /* Copy the input desc into GHCB shared buffer */ - data = (struct snp_psc_desc *)ghcb->shared_buffer; - memcpy(ghcb->shared_buffer, desc, min_t(int, GHCB_SHARED_BUF_SIZE, sizeof(*desc))); - - /* - * As per the GHCB specification, the hypervisor can resume the guest - * before processing all the entries. Check whether all the entries - * are processed. If not, then keep retrying. Note, the hypervisor - * will update the data memory directly to indicate the status, so - * reference the data->hdr everywhere. - * - * The strategy here is to wait for the hypervisor to change the page - * state in the RMP table before guest accesses the memory pages. If the - * page state change was not successful, then later memory access will - * result in a crash. - */ - cur_entry = data->hdr.cur_entry; - end_entry = data->hdr.end_entry; - - while (data->hdr.cur_entry <= data->hdr.end_entry) { - ghcb_set_sw_scratch(ghcb, (u64)__pa(data)); - - /* This will advance the shared buffer data points to. */ - ret = sev_es_ghcb_hv_call(ghcb, &ctxt, SVM_VMGEXIT_PSC, 0, 0); - - /* - * Page State Change VMGEXIT can pass error code through - * exit_info_2. - */ - if (WARN(ret || ghcb->save.sw_exit_info_2, - "SNP: PSC failed ret=%d exit_info_2=%llx\n", - ret, ghcb->save.sw_exit_info_2)) { - ret = 1; - goto out; - } - - /* Verify that reserved bit is not set */ - if (WARN(data->hdr.reserved, "Reserved bit is set in the PSC header\n")) { - ret = 1; - goto out; - } - - /* - * Sanity check that entry processing is not going backwards. - * This will happen only if hypervisor is tricking us. 
- */ - if (WARN(data->hdr.end_entry > end_entry || cur_entry > data->hdr.cur_entry, -"SNP: PSC processing going backward, end_entry %d (got %d) cur_entry %d (got %d)\n", - end_entry, data->hdr.end_entry, cur_entry, data->hdr.cur_entry)) { - ret = 1; - goto out; - } - } - -out: - return ret; -} - static enum es_result vc_check_opcode_bytes(struct es_em_ctxt *ctxt, unsigned long exit_code) { diff --git a/arch/x86/coco/sev/startup.c b/arch/x86/coco/sev/startup.c new file mode 100644 index 000000000000..9f5dc70cfb44 --- /dev/null +++ b/arch/x86/coco/sev/startup.c @@ -0,0 +1,1395 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * AMD Memory Encryption Support + * + * Copyright (C) 2019 SUSE + * + * Author: Joerg Roedel + */ + +#define pr_fmt(fmt) "SEV: " fmt + +#include /* For show_regs() */ +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* For early boot hypervisor communication in SEV-ES enabled guests */ +struct ghcb boot_ghcb_page __bss_decrypted __aligned(PAGE_SIZE); + +/* + * Needs to be in the .data section because we need it NULL before bss is + * cleared + */ +struct ghcb *boot_ghcb __section(".data"); + +/* Bitmap of SEV features supported by the hypervisor */ +u64 sev_hv_features __ro_after_init; + +/* Secrets page physical address from the CC blob */ +static u64 secrets_pa __ro_after_init; + +/* For early boot SVSM communication */ +struct svsm_ca boot_svsm_ca_page __aligned(PAGE_SIZE); + +DEFINE_PER_CPU(struct svsm_ca *, svsm_caa); +DEFINE_PER_CPU(u64, svsm_caa_pa); + +/* + * Nothing shall interrupt this code path while holding the per-CPU + * GHCB. The backup GHCB is only for NMIs interrupting this path. + * + * Callers must disable local interrupts around it. + */ +noinstr struct ghcb *__sev_get_ghcb(struct ghcb_state *state) +{ + struct sev_es_runtime_data *data; + struct ghcb *ghcb; + + WARN_ON(!irqs_disabled()); + + data = this_cpu_read(runtime_data); + ghcb = &data->ghcb_page; + + if (unlikely(data->ghcb_active)) { + /* GHCB is already in use - save its contents */ + + if (unlikely(data->backup_ghcb_active)) { + /* + * Backup-GHCB is also already in use. There is no way + * to continue here so just kill the machine. To make + * panic() work, mark GHCBs inactive so that messages + * can be printed out. + */ + data->ghcb_active = false; + data->backup_ghcb_active = false; + + instrumentation_begin(); + panic("Unable to handle #VC exception! 
GHCB and Backup GHCB are already in use"); + instrumentation_end(); + } + + /* Mark backup_ghcb active before writing to it */ + data->backup_ghcb_active = true; + + state->ghcb = &data->backup_ghcb; + + /* Backup GHCB content */ + *state->ghcb = *ghcb; + } else { + state->ghcb = NULL; + data->ghcb_active = true; + } + + return ghcb; +} + +static int vc_fetch_insn_kernel(struct es_em_ctxt *ctxt, + unsigned char *buffer) +{ + return copy_from_kernel_nofault(buffer, (unsigned char *)ctxt->regs->ip, MAX_INSN_SIZE); +} + +static enum es_result __vc_decode_user_insn(struct es_em_ctxt *ctxt) +{ + char buffer[MAX_INSN_SIZE]; + int insn_bytes; + + insn_bytes = insn_fetch_from_user_inatomic(ctxt->regs, buffer); + if (insn_bytes == 0) { + /* Nothing could be copied */ + ctxt->fi.vector = X86_TRAP_PF; + ctxt->fi.error_code = X86_PF_INSTR | X86_PF_USER; + ctxt->fi.cr2 = ctxt->regs->ip; + return ES_EXCEPTION; + } else if (insn_bytes == -EINVAL) { + /* Effective RIP could not be calculated */ + ctxt->fi.vector = X86_TRAP_GP; + ctxt->fi.error_code = 0; + ctxt->fi.cr2 = 0; + return ES_EXCEPTION; + } + + if (!insn_decode_from_regs(&ctxt->insn, ctxt->regs, buffer, insn_bytes)) + return ES_DECODE_FAILED; + + if (ctxt->insn.immediate.got) + return ES_OK; + else + return ES_DECODE_FAILED; +} + +static enum es_result __vc_decode_kern_insn(struct es_em_ctxt *ctxt) +{ + char buffer[MAX_INSN_SIZE]; + int res, ret; + + res = vc_fetch_insn_kernel(ctxt, buffer); + if (res) { + ctxt->fi.vector = X86_TRAP_PF; + ctxt->fi.error_code = X86_PF_INSTR; + ctxt->fi.cr2 = ctxt->regs->ip; + return ES_EXCEPTION; + } + + ret = insn_decode(&ctxt->insn, buffer, MAX_INSN_SIZE, INSN_MODE_64); + if (ret < 0) + return ES_DECODE_FAILED; + else + return ES_OK; +} + +static enum es_result vc_decode_insn(struct es_em_ctxt *ctxt) +{ + if (user_mode(ctxt->regs)) + return __vc_decode_user_insn(ctxt); + else + return __vc_decode_kern_insn(ctxt); +} + +static enum es_result vc_write_mem(struct es_em_ctxt *ctxt, + char *dst, char *buf, size_t size) +{ + unsigned long error_code = X86_PF_PROT | X86_PF_WRITE; + + /* + * This function uses __put_user() independent of whether kernel or user + * memory is accessed. This works fine because __put_user() does no + * sanity checks of the pointer being accessed. All that it does is + * to report when the access failed. + * + * Also, this function runs in atomic context, so __put_user() is not + * allowed to sleep. The page-fault handler detects that it is running + * in atomic context and will not try to take mmap_sem and handle the + * fault, so additional pagefault_enable()/disable() calls are not + * needed. + * + * The access can't be done via copy_to_user() here because + * vc_write_mem() must not use string instructions to access unsafe + * memory. The reason is that MOVS is emulated by the #VC handler by + * splitting the move up into a read and a write and taking a nested #VC + * exception on whatever of them is the MMIO access. Using string + * instructions here would cause infinite nesting. 
+ */ + switch (size) { + case 1: { + u8 d1; + u8 __user *target = (u8 __user *)dst; + + memcpy(&d1, buf, 1); + if (__put_user(d1, target)) + goto fault; + break; + } + case 2: { + u16 d2; + u16 __user *target = (u16 __user *)dst; + + memcpy(&d2, buf, 2); + if (__put_user(d2, target)) + goto fault; + break; + } + case 4: { + u32 d4; + u32 __user *target = (u32 __user *)dst; + + memcpy(&d4, buf, 4); + if (__put_user(d4, target)) + goto fault; + break; + } + case 8: { + u64 d8; + u64 __user *target = (u64 __user *)dst; + + memcpy(&d8, buf, 8); + if (__put_user(d8, target)) + goto fault; + break; + } + default: + WARN_ONCE(1, "%s: Invalid size: %zu\n", __func__, size); + return ES_UNSUPPORTED; + } + + return ES_OK; + +fault: + if (user_mode(ctxt->regs)) + error_code |= X86_PF_USER; + + ctxt->fi.vector = X86_TRAP_PF; + ctxt->fi.error_code = error_code; + ctxt->fi.cr2 = (unsigned long)dst; + + return ES_EXCEPTION; +} + +static enum es_result vc_read_mem(struct es_em_ctxt *ctxt, + char *src, char *buf, size_t size) +{ + unsigned long error_code = X86_PF_PROT; + + /* + * This function uses __get_user() independent of whether kernel or user + * memory is accessed. This works fine because __get_user() does no + * sanity checks of the pointer being accessed. All that it does is + * to report when the access failed. + * + * Also, this function runs in atomic context, so __get_user() is not + * allowed to sleep. The page-fault handler detects that it is running + * in atomic context and will not try to take mmap_sem and handle the + * fault, so additional pagefault_enable()/disable() calls are not + * needed. + * + * The access can't be done via copy_from_user() here because + * vc_read_mem() must not use string instructions to access unsafe + * memory. The reason is that MOVS is emulated by the #VC handler by + * splitting the move up into a read and a write and taking a nested #VC + * exception on whatever of them is the MMIO access. Using string + * instructions here would cause infinite nesting. 
+	 */
+	switch (size) {
+	case 1: {
+		u8 d1;
+		u8 __user *s = (u8 __user *)src;
+
+		if (__get_user(d1, s))
+			goto fault;
+		memcpy(buf, &d1, 1);
+		break;
+	}
+	case 2: {
+		u16 d2;
+		u16 __user *s = (u16 __user *)src;
+
+		if (__get_user(d2, s))
+			goto fault;
+		memcpy(buf, &d2, 2);
+		break;
+	}
+	case 4: {
+		u32 d4;
+		u32 __user *s = (u32 __user *)src;
+
+		if (__get_user(d4, s))
+			goto fault;
+		memcpy(buf, &d4, 4);
+		break;
+	}
+	case 8: {
+		u64 d8;
+		u64 __user *s = (u64 __user *)src;
+
+		if (__get_user(d8, s))
+			goto fault;
+		memcpy(buf, &d8, 8);
+		break;
+	}
+	default:
+		WARN_ONCE(1, "%s: Invalid size: %zu\n", __func__, size);
+		return ES_UNSUPPORTED;
+	}
+
+	return ES_OK;
+
+fault:
+	if (user_mode(ctxt->regs))
+		error_code |= X86_PF_USER;
+
+	ctxt->fi.vector = X86_TRAP_PF;
+	ctxt->fi.error_code = error_code;
+	ctxt->fi.cr2 = (unsigned long)src;
+
+	return ES_EXCEPTION;
+}
+
+static enum es_result vc_slow_virt_to_phys(struct ghcb *ghcb, struct es_em_ctxt *ctxt,
+					   unsigned long vaddr, phys_addr_t *paddr)
+{
+	unsigned long va = (unsigned long)vaddr;
+	unsigned int level;
+	phys_addr_t pa;
+	pgd_t *pgd;
+	pte_t *pte;
+
+	pgd = __va(read_cr3_pa());
+	pgd = &pgd[pgd_index(va)];
+	pte = lookup_address_in_pgd(pgd, va, &level);
+	if (!pte) {
+		ctxt->fi.vector = X86_TRAP_PF;
+		ctxt->fi.cr2 = vaddr;
+		ctxt->fi.error_code = 0;
+
+		if (user_mode(ctxt->regs))
+			ctxt->fi.error_code |= X86_PF_USER;
+
+		return ES_EXCEPTION;
+	}
+
+	if (WARN_ON_ONCE(pte_val(*pte) & _PAGE_ENC))
+		/* Emulated MMIO to/from encrypted memory not supported */
+		return ES_UNSUPPORTED;
+
+	pa = (phys_addr_t)pte_pfn(*pte) << PAGE_SHIFT;
+	pa |= va & ~page_level_mask(level);
+
+	*paddr = pa;
+
+	return ES_OK;
+}
+
+static enum es_result vc_ioio_check(struct es_em_ctxt *ctxt, u16 port, size_t size)
+{
+	BUG_ON(size > 4);
+
+	if (user_mode(ctxt->regs)) {
+		struct thread_struct *t = &current->thread;
+		struct io_bitmap *iobm = t->io_bitmap;
+		size_t idx;
+
+		if (!iobm)
+			goto fault;
+
+		for (idx = port; idx < port + size; ++idx) {
+			if (test_bit(idx, iobm->bitmap))
+				goto fault;
+		}
+	}
+
+	return ES_OK;
+
+fault:
+	ctxt->fi.vector = X86_TRAP_GP;
+	ctxt->fi.error_code = 0;
+
+	return ES_EXCEPTION;
+}
+
+static __always_inline void vc_forward_exception(struct es_em_ctxt *ctxt)
+{
+	long error_code = ctxt->fi.error_code;
+	int trapnr = ctxt->fi.vector;
+
+	ctxt->regs->orig_ax = ctxt->fi.error_code;
+
+	switch (trapnr) {
+	case X86_TRAP_GP:
+		exc_general_protection(ctxt->regs, error_code);
+		break;
+	case X86_TRAP_UD:
+		exc_invalid_op(ctxt->regs);
+		break;
+	case X86_TRAP_PF:
+		write_cr2(ctxt->fi.cr2);
+		exc_page_fault(ctxt->regs, error_code);
+		break;
+	case X86_TRAP_AC:
+		exc_alignment_check(ctxt->regs, error_code);
+		break;
+	default:
+		pr_emerg("Unsupported exception in #VC instruction emulation - can't continue\n");
+		BUG();
+	}
+}
+
+/* Include code shared with pre-decompression boot stage */
+#include "shared.c"
+
+noinstr void __sev_put_ghcb(struct ghcb_state *state)
+{
+	struct sev_es_runtime_data *data;
+	struct ghcb *ghcb;
+
+	WARN_ON(!irqs_disabled());
+
+	data = this_cpu_read(runtime_data);
+	ghcb = &data->ghcb_page;
+
+	if (state->ghcb) {
+		/* Restore GHCB from Backup */
+		*ghcb = *state->ghcb;
+		data->backup_ghcb_active = false;
+		state->ghcb = NULL;
+	} else {
+		/*
+		 * Invalidate the GHCB so a VMGEXIT instruction issued
+		 * from userspace won't appear to be valid.
+ */ + vc_ghcb_invalidate(ghcb); + data->ghcb_active = false; + } +} + +int svsm_perform_call_protocol(struct svsm_call *call) +{ + struct ghcb_state state; + unsigned long flags; + struct ghcb *ghcb; + int ret; + + /* + * This can be called very early in the boot, use native functions in + * order to avoid paravirt issues. + */ + flags = native_local_irq_save(); + + /* + * Use rip-relative references when called early in the boot. If + * ghcbs_initialized is set, then it is late in the boot and no need + * to worry about rip-relative references in called functions. + */ + if (RIP_REL_REF(sev_cfg).ghcbs_initialized) + ghcb = __sev_get_ghcb(&state); + else if (RIP_REL_REF(boot_ghcb)) + ghcb = RIP_REL_REF(boot_ghcb); + else + ghcb = NULL; + + do { + ret = ghcb ? svsm_perform_ghcb_protocol(ghcb, call) + : svsm_perform_msr_protocol(call); + } while (ret == -EAGAIN); + + if (RIP_REL_REF(sev_cfg).ghcbs_initialized) + __sev_put_ghcb(&state); + + native_local_irq_restore(flags); + + return ret; +} + +void __head +early_set_pages_state(unsigned long vaddr, unsigned long paddr, + unsigned long npages, enum psc_op op) +{ + unsigned long paddr_end; + u64 val; + + vaddr = vaddr & PAGE_MASK; + + paddr = paddr & PAGE_MASK; + paddr_end = paddr + (npages << PAGE_SHIFT); + + while (paddr < paddr_end) { + /* Page validation must be rescinded before changing to shared */ + if (op == SNP_PAGE_STATE_SHARED) + pvalidate_4k_page(vaddr, paddr, false); + + /* + * Use the MSR protocol because this function can be called before + * the GHCB is established. + */ + sev_es_wr_ghcb_msr(GHCB_MSR_PSC_REQ_GFN(paddr >> PAGE_SHIFT, op)); + VMGEXIT(); + + val = sev_es_rd_ghcb_msr(); + + if (GHCB_RESP_CODE(val) != GHCB_MSR_PSC_RESP) + goto e_term; + + if (GHCB_MSR_PSC_RESP_VAL(val)) + goto e_term; + + /* Page validation must be performed after changing to private */ + if (op == SNP_PAGE_STATE_PRIVATE) + pvalidate_4k_page(vaddr, paddr, true); + + vaddr += PAGE_SIZE; + paddr += PAGE_SIZE; + } + + return; + +e_term: + sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_PSC); +} + +void __head early_snp_set_memory_private(unsigned long vaddr, unsigned long paddr, + unsigned long npages) +{ + /* + * This can be invoked in early boot while running identity mapped, so + * use an open coded check for SNP instead of using cc_platform_has(). + * This eliminates worries about jump tables or checking boot_cpu_data + * in the cc_platform_has() function. + */ + if (!(RIP_REL_REF(sev_status) & MSR_AMD64_SEV_SNP_ENABLED)) + return; + + /* + * Ask the hypervisor to mark the memory pages as private in the RMP + * table. + */ + early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_PRIVATE); +} + +void __head early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr, + unsigned long npages) +{ + /* + * This can be invoked in early boot while running identity mapped, so + * use an open coded check for SNP instead of using cc_platform_has(). + * This eliminates worries about jump tables or checking boot_cpu_data + * in the cc_platform_has() function. + */ + if (!(RIP_REL_REF(sev_status) & MSR_AMD64_SEV_SNP_ENABLED)) + return; + + /* Ask hypervisor to mark the memory pages shared in the RMP table. 
*/ + early_set_pages_state(vaddr, paddr, npages, SNP_PAGE_STATE_SHARED); +} + +/* Writes to the SVSM CAA MSR are ignored */ +static enum es_result __vc_handle_msr_caa(struct pt_regs *regs, bool write) +{ + if (write) + return ES_OK; + + regs->ax = lower_32_bits(this_cpu_read(svsm_caa_pa)); + regs->dx = upper_32_bits(this_cpu_read(svsm_caa_pa)); + + return ES_OK; +} + +/* + * TSC related accesses should not exit to the hypervisor when a guest is + * executing with Secure TSC enabled, so special handling is required for + * accesses of MSR_IA32_TSC and MSR_AMD64_GUEST_TSC_FREQ. + */ +static enum es_result __vc_handle_secure_tsc_msrs(struct pt_regs *regs, bool write) +{ + u64 tsc; + + /* + * GUEST_TSC_FREQ should not be intercepted when Secure TSC is enabled. + * Terminate the SNP guest when the interception is enabled. + */ + if (regs->cx == MSR_AMD64_GUEST_TSC_FREQ) + return ES_VMM_ERROR; + + /* + * Writes: Writing to MSR_IA32_TSC can cause subsequent reads of the TSC + * to return undefined values, so ignore all writes. + * + * Reads: Reads of MSR_IA32_TSC should return the current TSC value, use + * the value returned by rdtsc_ordered(). + */ + if (write) { + WARN_ONCE(1, "TSC MSR writes are verboten!\n"); + return ES_OK; + } + + tsc = rdtsc_ordered(); + regs->ax = lower_32_bits(tsc); + regs->dx = upper_32_bits(tsc); + + return ES_OK; +} + +static enum es_result vc_handle_msr(struct ghcb *ghcb, struct es_em_ctxt *ctxt) +{ + struct pt_regs *regs = ctxt->regs; + enum es_result ret; + bool write; + + /* Is it a WRMSR? */ + write = ctxt->insn.opcode.bytes[1] == 0x30; + + switch (regs->cx) { + case MSR_SVSM_CAA: + return __vc_handle_msr_caa(regs, write); + case MSR_IA32_TSC: + case MSR_AMD64_GUEST_TSC_FREQ: + if (sev_status & MSR_AMD64_SNP_SECURE_TSC) + return __vc_handle_secure_tsc_msrs(regs, write); + break; + default: + break; + } + + ghcb_set_rcx(ghcb, regs->cx); + if (write) { + ghcb_set_rax(ghcb, regs->ax); + ghcb_set_rdx(ghcb, regs->dx); + } + + ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_MSR, write, 0); + + if ((ret == ES_OK) && !write) { + regs->ax = ghcb->save.rax; + regs->dx = ghcb->save.rdx; + } + + return ret; +} + +static void __init vc_early_forward_exception(struct es_em_ctxt *ctxt) +{ + int trapnr = ctxt->fi.vector; + + if (trapnr == X86_TRAP_PF) + native_write_cr2(ctxt->fi.cr2); + + ctxt->regs->orig_ax = ctxt->fi.error_code; + do_early_exception(ctxt->regs, trapnr); +} + +static long *vc_insn_get_rm(struct es_em_ctxt *ctxt) +{ + long *reg_array; + int offset; + + reg_array = (long *)ctxt->regs; + offset = insn_get_modrm_rm_off(&ctxt->insn, ctxt->regs); + + if (offset < 0) + return NULL; + + offset /= sizeof(long); + + return reg_array + offset; +} +static enum es_result vc_do_mmio(struct ghcb *ghcb, struct es_em_ctxt *ctxt, + unsigned int bytes, bool read) +{ + u64 exit_code, exit_info_1, exit_info_2; + unsigned long ghcb_pa = __pa(ghcb); + enum es_result res; + phys_addr_t paddr; + void __user *ref; + + ref = insn_get_addr_ref(&ctxt->insn, ctxt->regs); + if (ref == (void __user *)-1L) + return ES_UNSUPPORTED; + + exit_code = read ? 
SVM_VMGEXIT_MMIO_READ : SVM_VMGEXIT_MMIO_WRITE; + + res = vc_slow_virt_to_phys(ghcb, ctxt, (unsigned long)ref, &paddr); + if (res != ES_OK) { + if (res == ES_EXCEPTION && !read) + ctxt->fi.error_code |= X86_PF_WRITE; + + return res; + } + + exit_info_1 = paddr; + /* Can never be greater than 8 */ + exit_info_2 = bytes; + + ghcb_set_sw_scratch(ghcb, ghcb_pa + offsetof(struct ghcb, shared_buffer)); + + return sev_es_ghcb_hv_call(ghcb, ctxt, exit_code, exit_info_1, exit_info_2); +} + +/* + * The MOVS instruction has two memory operands, which raises the + * problem that it is not known whether the access to the source or the + * destination caused the #VC exception (and hence whether an MMIO read + * or write operation needs to be emulated). + * + * Instead of playing games with walking page-tables and trying to guess + * whether the source or destination is an MMIO range, split the move + * into two operations, a read and a write with only one memory operand. + * This will cause a nested #VC exception on the MMIO address which can + * then be handled. + * + * This implementation has the benefit that it also supports MOVS where + * source _and_ destination are MMIO regions. + * + * It will slow MOVS on MMIO down a lot, but in SEV-ES guests it is a + * rare operation. If it turns out to be a performance problem the split + * operations can be moved to memcpy_fromio() and memcpy_toio(). + */ +static enum es_result vc_handle_mmio_movs(struct es_em_ctxt *ctxt, + unsigned int bytes) +{ + unsigned long ds_base, es_base; + unsigned char *src, *dst; + unsigned char buffer[8]; + enum es_result ret; + bool rep; + int off; + + ds_base = insn_get_seg_base(ctxt->regs, INAT_SEG_REG_DS); + es_base = insn_get_seg_base(ctxt->regs, INAT_SEG_REG_ES); + + if (ds_base == -1L || es_base == -1L) { + ctxt->fi.vector = X86_TRAP_GP; + ctxt->fi.error_code = 0; + return ES_EXCEPTION; + } + + src = ds_base + (unsigned char *)ctxt->regs->si; + dst = es_base + (unsigned char *)ctxt->regs->di; + + ret = vc_read_mem(ctxt, src, buffer, bytes); + if (ret != ES_OK) + return ret; + + ret = vc_write_mem(ctxt, dst, buffer, bytes); + if (ret != ES_OK) + return ret; + + if (ctxt->regs->flags & X86_EFLAGS_DF) + off = -bytes; + else + off = bytes; + + ctxt->regs->si += off; + ctxt->regs->di += off; + + rep = insn_has_rep_prefix(&ctxt->insn); + if (rep) + ctxt->regs->cx -= 1; + + if (!rep || ctxt->regs->cx == 0) + return ES_OK; + else + return ES_RETRY; +} + +static enum es_result vc_handle_mmio(struct ghcb *ghcb, struct es_em_ctxt *ctxt) +{ + struct insn *insn = &ctxt->insn; + enum insn_mmio_type mmio; + unsigned int bytes = 0; + enum es_result ret; + u8 sign_byte; + long *reg_data; + + mmio = insn_decode_mmio(insn, &bytes); + if (mmio == INSN_MMIO_DECODE_FAILED) + return ES_DECODE_FAILED; + + if (mmio != INSN_MMIO_WRITE_IMM && mmio != INSN_MMIO_MOVS) { + reg_data = insn_get_modrm_reg_ptr(insn, ctxt->regs); + if (!reg_data) + return ES_DECODE_FAILED; + } + + if (user_mode(ctxt->regs)) + return ES_UNSUPPORTED; + + switch (mmio) { + case INSN_MMIO_WRITE: + memcpy(ghcb->shared_buffer, reg_data, bytes); + ret = vc_do_mmio(ghcb, ctxt, bytes, false); + break; + case INSN_MMIO_WRITE_IMM: + memcpy(ghcb->shared_buffer, insn->immediate1.bytes, bytes); + ret = vc_do_mmio(ghcb, ctxt, bytes, false); + break; + case INSN_MMIO_READ: + ret = vc_do_mmio(ghcb, ctxt, bytes, true); + if (ret) + break; + + /* Zero-extend for 32-bit operation */ + if (bytes == 4) + *reg_data = 0; + + memcpy(reg_data, ghcb->shared_buffer, bytes); + break; + case 
INSN_MMIO_READ_ZERO_EXTEND: + ret = vc_do_mmio(ghcb, ctxt, bytes, true); + if (ret) + break; + + /* Zero extend based on operand size */ + memset(reg_data, 0, insn->opnd_bytes); + memcpy(reg_data, ghcb->shared_buffer, bytes); + break; + case INSN_MMIO_READ_SIGN_EXTEND: + ret = vc_do_mmio(ghcb, ctxt, bytes, true); + if (ret) + break; + + if (bytes == 1) { + u8 *val = (u8 *)ghcb->shared_buffer; + + sign_byte = (*val & 0x80) ? 0xff : 0x00; + } else { + u16 *val = (u16 *)ghcb->shared_buffer; + + sign_byte = (*val & 0x8000) ? 0xff : 0x00; + } + + /* Sign extend based on operand size */ + memset(reg_data, sign_byte, insn->opnd_bytes); + memcpy(reg_data, ghcb->shared_buffer, bytes); + break; + case INSN_MMIO_MOVS: + ret = vc_handle_mmio_movs(ctxt, bytes); + break; + default: + ret = ES_UNSUPPORTED; + break; + } + + return ret; +} + +static enum es_result vc_handle_dr7_write(struct ghcb *ghcb, + struct es_em_ctxt *ctxt) +{ + struct sev_es_runtime_data *data = this_cpu_read(runtime_data); + long val, *reg = vc_insn_get_rm(ctxt); + enum es_result ret; + + if (sev_status & MSR_AMD64_SNP_DEBUG_SWAP) + return ES_VMM_ERROR; + + if (!reg) + return ES_DECODE_FAILED; + + val = *reg; + + /* Upper 32 bits must be written as zeroes */ + if (val >> 32) { + ctxt->fi.vector = X86_TRAP_GP; + ctxt->fi.error_code = 0; + return ES_EXCEPTION; + } + + /* Clear out other reserved bits and set bit 10 */ + val = (val & 0xffff23ffL) | BIT(10); + + /* Early non-zero writes to DR7 are not supported */ + if (!data && (val & ~DR7_RESET_VALUE)) + return ES_UNSUPPORTED; + + /* Using a value of 0 for ExitInfo1 means RAX holds the value */ + ghcb_set_rax(ghcb, val); + ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_WRITE_DR7, 0, 0); + if (ret != ES_OK) + return ret; + + if (data) + data->dr7 = val; + + return ES_OK; +} + +static enum es_result vc_handle_dr7_read(struct ghcb *ghcb, + struct es_em_ctxt *ctxt) +{ + struct sev_es_runtime_data *data = this_cpu_read(runtime_data); + long *reg = vc_insn_get_rm(ctxt); + + if (sev_status & MSR_AMD64_SNP_DEBUG_SWAP) + return ES_VMM_ERROR; + + if (!reg) + return ES_DECODE_FAILED; + + if (data) + *reg = data->dr7; + else + *reg = DR7_RESET_VALUE; + + return ES_OK; +} + +static enum es_result vc_handle_wbinvd(struct ghcb *ghcb, + struct es_em_ctxt *ctxt) +{ + return sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_WBINVD, 0, 0); +} + +static enum es_result vc_handle_rdpmc(struct ghcb *ghcb, struct es_em_ctxt *ctxt) +{ + enum es_result ret; + + ghcb_set_rcx(ghcb, ctxt->regs->cx); + + ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_RDPMC, 0, 0); + if (ret != ES_OK) + return ret; + + if (!(ghcb_rax_is_valid(ghcb) && ghcb_rdx_is_valid(ghcb))) + return ES_VMM_ERROR; + + ctxt->regs->ax = ghcb->save.rax; + ctxt->regs->dx = ghcb->save.rdx; + + return ES_OK; +} + +static enum es_result vc_handle_monitor(struct ghcb *ghcb, + struct es_em_ctxt *ctxt) +{ + /* + * Treat it as a NOP and do not leak a physical address to the + * hypervisor. + */ + return ES_OK; +} + +static enum es_result vc_handle_mwait(struct ghcb *ghcb, + struct es_em_ctxt *ctxt) +{ + /* Treat the same as MONITOR/MONITORX */ + return ES_OK; +} + +static enum es_result vc_handle_vmmcall(struct ghcb *ghcb, + struct es_em_ctxt *ctxt) +{ + enum es_result ret; + + ghcb_set_rax(ghcb, ctxt->regs->ax); + ghcb_set_cpl(ghcb, user_mode(ctxt->regs) ? 
3 : 0); + + if (x86_platform.hyper.sev_es_hcall_prepare) + x86_platform.hyper.sev_es_hcall_prepare(ghcb, ctxt->regs); + + ret = sev_es_ghcb_hv_call(ghcb, ctxt, SVM_EXIT_VMMCALL, 0, 0); + if (ret != ES_OK) + return ret; + + if (!ghcb_rax_is_valid(ghcb)) + return ES_VMM_ERROR; + + ctxt->regs->ax = ghcb->save.rax; + + /* + * Call sev_es_hcall_finish() after regs->ax is already set. + * This allows the hypervisor handler to overwrite it again if + * necessary. + */ + if (x86_platform.hyper.sev_es_hcall_finish && + !x86_platform.hyper.sev_es_hcall_finish(ghcb, ctxt->regs)) + return ES_VMM_ERROR; + + return ES_OK; +} + +static enum es_result vc_handle_trap_ac(struct ghcb *ghcb, + struct es_em_ctxt *ctxt) +{ + /* + * Calling ecx_alignment_check() directly does not work, because it + * enables IRQs and the GHCB is active. Forward the exception and call + * it later from vc_forward_exception(). + */ + ctxt->fi.vector = X86_TRAP_AC; + ctxt->fi.error_code = 0; + return ES_EXCEPTION; +} + +static enum es_result vc_handle_exitcode(struct es_em_ctxt *ctxt, + struct ghcb *ghcb, + unsigned long exit_code) +{ + enum es_result result = vc_check_opcode_bytes(ctxt, exit_code); + + if (result != ES_OK) + return result; + + switch (exit_code) { + case SVM_EXIT_READ_DR7: + result = vc_handle_dr7_read(ghcb, ctxt); + break; + case SVM_EXIT_WRITE_DR7: + result = vc_handle_dr7_write(ghcb, ctxt); + break; + case SVM_EXIT_EXCP_BASE + X86_TRAP_AC: + result = vc_handle_trap_ac(ghcb, ctxt); + break; + case SVM_EXIT_RDTSC: + case SVM_EXIT_RDTSCP: + result = vc_handle_rdtsc(ghcb, ctxt, exit_code); + break; + case SVM_EXIT_RDPMC: + result = vc_handle_rdpmc(ghcb, ctxt); + break; + case SVM_EXIT_INVD: + pr_err_ratelimited("#VC exception for INVD??? Seriously???\n"); + result = ES_UNSUPPORTED; + break; + case SVM_EXIT_CPUID: + result = vc_handle_cpuid(ghcb, ctxt); + break; + case SVM_EXIT_IOIO: + result = vc_handle_ioio(ghcb, ctxt); + break; + case SVM_EXIT_MSR: + result = vc_handle_msr(ghcb, ctxt); + break; + case SVM_EXIT_VMMCALL: + result = vc_handle_vmmcall(ghcb, ctxt); + break; + case SVM_EXIT_WBINVD: + result = vc_handle_wbinvd(ghcb, ctxt); + break; + case SVM_EXIT_MONITOR: + result = vc_handle_monitor(ghcb, ctxt); + break; + case SVM_EXIT_MWAIT: + result = vc_handle_mwait(ghcb, ctxt); + break; + case SVM_EXIT_NPF: + result = vc_handle_mmio(ghcb, ctxt); + break; + default: + /* + * Unexpected #VC exception + */ + result = ES_UNSUPPORTED; + } + + return result; +} + +static __always_inline bool is_vc2_stack(unsigned long sp) +{ + return (sp >= __this_cpu_ist_bottom_va(VC2) && sp < __this_cpu_ist_top_va(VC2)); +} + +static __always_inline bool vc_from_invalid_context(struct pt_regs *regs) +{ + unsigned long sp, prev_sp; + + sp = (unsigned long)regs; + prev_sp = regs->sp; + + /* + * If the code was already executing on the VC2 stack when the #VC + * happened, let it proceed to the normal handling routine. This way the + * code executing on the VC2 stack can cause #VC exceptions to get handled. 
+ */ + return is_vc2_stack(sp) && !is_vc2_stack(prev_sp); +} + +static bool vc_raw_handle_exception(struct pt_regs *regs, unsigned long error_code) +{ + struct ghcb_state state; + struct es_em_ctxt ctxt; + enum es_result result; + struct ghcb *ghcb; + bool ret = true; + + ghcb = __sev_get_ghcb(&state); + + vc_ghcb_invalidate(ghcb); + result = vc_init_em_ctxt(&ctxt, regs, error_code); + + if (result == ES_OK) + result = vc_handle_exitcode(&ctxt, ghcb, error_code); + + __sev_put_ghcb(&state); + + /* Done - now check the result */ + switch (result) { + case ES_OK: + vc_finish_insn(&ctxt); + break; + case ES_UNSUPPORTED: + pr_err_ratelimited("Unsupported exit-code 0x%02lx in #VC exception (IP: 0x%lx)\n", + error_code, regs->ip); + ret = false; + break; + case ES_VMM_ERROR: + pr_err_ratelimited("Failure in communication with VMM (exit-code 0x%02lx IP: 0x%lx)\n", + error_code, regs->ip); + ret = false; + break; + case ES_DECODE_FAILED: + pr_err_ratelimited("Failed to decode instruction (exit-code 0x%02lx IP: 0x%lx)\n", + error_code, regs->ip); + ret = false; + break; + case ES_EXCEPTION: + vc_forward_exception(&ctxt); + break; + case ES_RETRY: + /* Nothing to do */ + break; + default: + pr_emerg("Unknown result in %s():%d\n", __func__, result); + /* + * Emulating the instruction which caused the #VC exception + * failed - can't continue so print debug information + */ + BUG(); + } + + return ret; +} + +static __always_inline bool vc_is_db(unsigned long error_code) +{ + return error_code == SVM_EXIT_EXCP_BASE + X86_TRAP_DB; +} + +/* + * Runtime #VC exception handler when raised from kernel mode. Runs in NMI mode + * and will panic when an error happens. + */ +DEFINE_IDTENTRY_VC_KERNEL(exc_vmm_communication) +{ + irqentry_state_t irq_state; + + /* + * With the current implementation it is always possible to switch to a + * safe stack because #VC exceptions only happen at known places, like + * intercepted instructions or accesses to MMIO areas/IO ports. They can + * also happen with code instrumentation when the hypervisor intercepts + * #DB, but the critical paths are forbidden to be instrumented, so #DB + * exceptions currently also only happen in safe places. + * + * But keep this here in case the noinstr annotations are violated due + * to bug elsewhere. + */ + if (unlikely(vc_from_invalid_context(regs))) { + instrumentation_begin(); + panic("Can't handle #VC exception from unsupported context\n"); + instrumentation_end(); + } + + /* + * Handle #DB before calling into !noinstr code to avoid recursive #DB. + */ + if (vc_is_db(error_code)) { + exc_debug(regs); + return; + } + + irq_state = irqentry_nmi_enter(regs); + + instrumentation_begin(); + + if (!vc_raw_handle_exception(regs, error_code)) { + /* Show some debug info */ + show_regs(regs); + + /* Ask hypervisor to sev_es_terminate */ + sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ); + + /* If that fails and we get here - just panic */ + panic("Returned from Terminate-Request to Hypervisor\n"); + } + + instrumentation_end(); + irqentry_nmi_exit(regs, irq_state); +} + +/* + * Runtime #VC exception handler when raised from user mode. Runs in IRQ mode + * and will kill the current task with SIGBUS when an error happens. + */ +DEFINE_IDTENTRY_VC_USER(exc_vmm_communication) +{ + /* + * Handle #DB before calling into !noinstr code to avoid recursive #DB. 
+ */ + if (vc_is_db(error_code)) { + noist_exc_debug(regs); + return; + } + + irqentry_enter_from_user_mode(regs); + instrumentation_begin(); + + if (!vc_raw_handle_exception(regs, error_code)) { + /* + * Do not kill the machine if user-space triggered the + * exception. Send SIGBUS instead and let user-space deal with + * it. + */ + force_sig_fault(SIGBUS, BUS_OBJERR, (void __user *)0); + } + + instrumentation_end(); + irqentry_exit_to_user_mode(regs); +} + +bool __init handle_vc_boot_ghcb(struct pt_regs *regs) +{ + unsigned long exit_code = regs->orig_ax; + struct es_em_ctxt ctxt; + enum es_result result; + + vc_ghcb_invalidate(boot_ghcb); + + result = vc_init_em_ctxt(&ctxt, regs, exit_code); + if (result == ES_OK) + result = vc_handle_exitcode(&ctxt, boot_ghcb, exit_code); + + /* Done - now check the result */ + switch (result) { + case ES_OK: + vc_finish_insn(&ctxt); + break; + case ES_UNSUPPORTED: + early_printk("PANIC: Unsupported exit-code 0x%02lx in early #VC exception (IP: 0x%lx)\n", + exit_code, regs->ip); + goto fail; + case ES_VMM_ERROR: + early_printk("PANIC: Failure in communication with VMM (exit-code 0x%02lx IP: 0x%lx)\n", + exit_code, regs->ip); + goto fail; + case ES_DECODE_FAILED: + early_printk("PANIC: Failed to decode instruction (exit-code 0x%02lx IP: 0x%lx)\n", + exit_code, regs->ip); + goto fail; + case ES_EXCEPTION: + vc_early_forward_exception(&ctxt); + break; + case ES_RETRY: + /* Nothing to do */ + break; + default: + BUG(); + } + + return true; + +fail: + show_regs(regs); + + sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SEV_ES_GEN_REQ); +} + +/* + * Initial set up of SNP relies on information provided by the + * Confidential Computing blob, which can be passed to the kernel + * in the following ways, depending on how it is booted: + * + * - when booted via the boot/decompress kernel: + * - via boot_params + * + * - when booted directly by firmware/bootloader (e.g. CONFIG_PVH): + * - via a setup_data entry, as defined by the Linux Boot Protocol + * + * Scan for the blob in that order. + */ +static __head struct cc_blob_sev_info *find_cc_blob(struct boot_params *bp) +{ + struct cc_blob_sev_info *cc_info; + + /* Boot kernel would have passed the CC blob via boot_params. */ + if (bp->cc_blob_address) { + cc_info = (struct cc_blob_sev_info *)(unsigned long)bp->cc_blob_address; + goto found_cc_info; + } + + /* + * If kernel was booted directly, without the use of the + * boot/decompression kernel, the CC blob may have been passed via + * setup_data instead. + */ + cc_info = find_cc_blob_setup_data(bp); + if (!cc_info) + return NULL; + +found_cc_info: + if (cc_info->magic != CC_BLOB_SEV_HDR_MAGIC) + snp_abort(); + + return cc_info; +} + +static __head void svsm_setup(struct cc_blob_sev_info *cc_info) +{ + struct svsm_call call = {}; + int ret; + u64 pa; + + /* + * Record the SVSM Calling Area address (CAA) if the guest is not + * running at VMPL0. The CA will be used to communicate with the + * SVSM to perform the SVSM services. + */ + if (!svsm_setup_ca(cc_info)) + return; + + /* + * It is very early in the boot and the kernel is running identity + * mapped but without having adjusted the pagetables to where the + * kernel was loaded (physbase), so the get the CA address using + * RIP-relative addressing. + */ + pa = (u64)rip_rel_ptr(&boot_svsm_ca_page); + + /* + * Switch over to the boot SVSM CA while the current CA is still + * addressable. There is no GHCB at this point so use the MSR protocol. 
+	 *
+	 * SVSM_CORE_REMAP_CA call:
+	 *   RAX = 0 (Protocol=0, CallID=0)
+	 *   RCX = New CA GPA
+	 */
+	call.caa = svsm_get_caa();
+	call.rax = SVSM_CORE_CALL(SVSM_CORE_REMAP_CA);
+	call.rcx = pa;
+	ret = svsm_perform_call_protocol(&call);
+	if (ret)
+		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SVSM_CA_REMAP_FAIL);
+
+	RIP_REL_REF(boot_svsm_caa) = (struct svsm_ca *)pa;
+	RIP_REL_REF(boot_svsm_caa_pa) = pa;
+}
+
+bool __head snp_init(struct boot_params *bp)
+{
+	struct cc_blob_sev_info *cc_info;
+
+	if (!bp)
+		return false;
+
+	cc_info = find_cc_blob(bp);
+	if (!cc_info)
+		return false;
+
+	if (cc_info->secrets_phys && cc_info->secrets_len == PAGE_SIZE)
+		secrets_pa = cc_info->secrets_phys;
+	else
+		return false;
+
+	setup_cpuid_table(cc_info);
+
+	svsm_setup(cc_info);
+
+	/*
+	 * The CC blob will be used later to access the secrets page. Cache
+	 * it here like the boot kernel does.
+	 */
+	bp->cc_blob_address = (u32)(unsigned long)cc_info;
+
+	return true;
+}
+
+void __head __noreturn snp_abort(void)
+{
+	sev_es_terminate(SEV_TERM_SET_GEN, GHCB_SNP_UNSUPPORTED);
+}

From patchwork Thu Apr 10 13:41:27 2025
X-Patchwork-Id: 879828
Date: Thu, 10 Apr 2025 15:41:27 +0200
In-Reply-To: <20250410134117.3713574-13-ardb+git@google.com>
References: <20250410134117.3713574-13-ardb+git@google.com>
Message-ID: <20250410134117.3713574-22-ardb+git@google.com>
Subject: [PATCH v4 09/11] x86/boot: Move SEV startup code into startup/
From: Ard Biesheuvel
To: linux-efi@vger.kernel.org
Cc: x86@kernel.org, mingo@kernel.org, linux-kernel@vger.kernel.org, Ard Biesheuvel , Tom Lendacky , Dionna Amalie Glaze , Kevin Loughlin

From: Ard Biesheuvel

Move the SEV startup code into arch/x86/boot/startup/, where it will
reside along with other code that executes extremely early, and
therefore needs to be built in a special manner.
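(Reading aid, not part of the patch: a minimal sketch of the constraint that makes this special treatment necessary. It reuses rip_rel_ptr() and boot_svsm_ca_page from the hunks earlier in this series; the wrapper function name is made up for illustration. Code under startup/ runs while the kernel is still identity mapped, so addresses of globals must be taken RIP-relatively rather than via absolute references.)

	/*
	 * Sketch only: taking the address of a global while running at the
	 * identity-mapped load address. rip_rel_ptr() forces a RIP-relative
	 * reference, so the resulting physical address is usable before the
	 * kernel virtual mapping exists, mirroring the svsm_setup() code above.
	 */
	static void __head example_remap_ca(struct svsm_call *call)
	{
		u64 pa = (u64)rip_rel_ptr(&boot_svsm_ca_page);

		call->rcx = pa;	/* New CA GPA, as in SVSM_CORE_REMAP_CA */
	}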
Signed-off-by: Ard Biesheuvel --- arch/x86/boot/compressed/sev.c | 2 +- arch/x86/boot/startup/Makefile | 2 +- arch/x86/{coco/sev/shared.c => boot/startup/sev-shared.c} | 0 arch/x86/{coco/sev/startup.c => boot/startup/sev-startup.c} | 2 +- arch/x86/coco/sev/Makefile | 21 +------------------- 5 files changed, 4 insertions(+), 23 deletions(-) diff --git a/arch/x86/boot/compressed/sev.c b/arch/x86/boot/compressed/sev.c index 714e30c66eae..478c65149cf0 100644 --- a/arch/x86/boot/compressed/sev.c +++ b/arch/x86/boot/compressed/sev.c @@ -144,7 +144,7 @@ int svsm_perform_call_protocol(struct svsm_call *call); u8 snp_vmpl; /* Include code for early handlers */ -#include "../../coco/sev/shared.c" +#include "../../boot/startup/sev-shared.c" int svsm_perform_call_protocol(struct svsm_call *call) { diff --git a/arch/x86/boot/startup/Makefile b/arch/x86/boot/startup/Makefile index ccdfc42a4d59..b56facb9091a 100644 --- a/arch/x86/boot/startup/Makefile +++ b/arch/x86/boot/startup/Makefile @@ -16,7 +16,7 @@ UBSAN_SANITIZE := n KCOV_INSTRUMENT := n obj-$(CONFIG_X86_64) += gdt_idt.o map_kernel.o -obj-$(CONFIG_AMD_MEM_ENCRYPT) += sme.o +obj-$(CONFIG_AMD_MEM_ENCRYPT) += sme.o sev-startup.o lib-$(CONFIG_X86_64) += la57toggle.o lib-$(CONFIG_EFI_MIXED) += efi-mixed.o diff --git a/arch/x86/coco/sev/shared.c b/arch/x86/boot/startup/sev-shared.c similarity index 100% rename from arch/x86/coco/sev/shared.c rename to arch/x86/boot/startup/sev-shared.c diff --git a/arch/x86/coco/sev/startup.c b/arch/x86/boot/startup/sev-startup.c similarity index 99% rename from arch/x86/coco/sev/startup.c rename to arch/x86/boot/startup/sev-startup.c index 9f5dc70cfb44..10b636009d1c 100644 --- a/arch/x86/coco/sev/startup.c +++ b/arch/x86/boot/startup/sev-startup.c @@ -422,7 +422,7 @@ static __always_inline void vc_forward_exception(struct es_em_ctxt *ctxt) } /* Include code shared with pre-decompression boot stage */ -#include "shared.c" +#include "sev-shared.c" noinstr void __sev_put_ghcb(struct ghcb_state *state) { diff --git a/arch/x86/coco/sev/Makefile b/arch/x86/coco/sev/Makefile index 7d7d2aee62f0..b89ba3fba343 100644 --- a/arch/x86/coco/sev/Makefile +++ b/arch/x86/coco/sev/Makefile @@ -1,22 +1,3 @@ # SPDX-License-Identifier: GPL-2.0 -obj-y += core.o startup.o - -# jump tables are emitted using absolute references in non-PIC code -# so they cannot be used in the early SEV startup code -CFLAGS_startup.o += -fno-jump-tables - -ifdef CONFIG_FUNCTION_TRACER -CFLAGS_REMOVE_startup.o = -pg -endif - -KASAN_SANITIZE_startup.o := n -KMSAN_SANITIZE_startup.o := n -KCOV_INSTRUMENT_startup.o := n - -# With some compiler versions the generated code results in boot hangs, caused -# by several compilation units. To be safe, disable all instrumentation. 
-KCSAN_SANITIZE := n
-
-# Clang 14 and older may fail to respect __no_sanitize_undefined when inlining
-UBSAN_SANITIZE := n
+obj-y += core.o

From patchwork Thu Apr 10 13:41:28 2025
X-Patchwork-Id: 879827
Date: Thu, 10 Apr 2025 15:41:28 +0200
In-Reply-To: <20250410134117.3713574-13-ardb+git@google.com>
References: <20250410134117.3713574-13-ardb+git@google.com>
Message-ID: <20250410134117.3713574-23-ardb+git@google.com>
Subject: [PATCH v4 10/11] x86/boot: Drop RIP_REL_REF() uses from early SEV code
From: Ard Biesheuvel
To: linux-efi@vger.kernel.org
Cc: x86@kernel.org, mingo@kernel.org, linux-kernel@vger.kernel.org, Ard Biesheuvel , Tom Lendacky , Dionna Amalie Glaze , Kevin Loughlin

From: Ard Biesheuvel

Now that the early SEV code is built with -fPIC, RIP_REL_REF() has no
effect and can be dropped.

Signed-off-by: Ard Biesheuvel
---
 arch/x86/boot/startup/sev-shared.c  | 26 +++++++++-----------
 arch/x86/boot/startup/sev-startup.c | 16 ++++++------
 arch/x86/include/asm/sev-internal.h | 18 +++-----------
 3 files changed, 23 insertions(+), 37 deletions(-)

diff --git a/arch/x86/boot/startup/sev-shared.c b/arch/x86/boot/startup/sev-shared.c
index 815542295f16..173f3d1f777a 100644
--- a/arch/x86/boot/startup/sev-shared.c
+++ b/arch/x86/boot/startup/sev-shared.c
@@ -299,7 +299,7 @@ static int svsm_perform_ghcb_protocol(struct ghcb *ghcb, struct svsm_call *call)
 	 * Fill in protocol and format specifiers. This can be called very early
 	 * in the boot, so use rip-relative references as needed.
 	 */
-	ghcb->protocol_version = RIP_REL_REF(ghcb_version);
+	ghcb->protocol_version = ghcb_version;
 	ghcb->ghcb_usage = GHCB_DEFAULT_USAGE;
 
 	ghcb_set_sw_exit_code(ghcb, SVM_VMGEXIT_SNP_RUN_VMPL);
@@ -656,9 +656,9 @@ snp_cpuid(struct ghcb *ghcb, struct es_em_ctxt *ctxt, struct cpuid_leaf *leaf)
 		leaf->eax = leaf->ebx = leaf->ecx = leaf->edx = 0;
 
 		/* Skip post-processing for out-of-range zero leafs.
*/ - if (!(leaf->fn <= RIP_REL_REF(cpuid_std_range_max) || - (leaf->fn >= 0x40000000 && leaf->fn <= RIP_REL_REF(cpuid_hyp_range_max)) || - (leaf->fn >= 0x80000000 && leaf->fn <= RIP_REL_REF(cpuid_ext_range_max)))) + if (!(leaf->fn <= cpuid_std_range_max || + (leaf->fn >= 0x40000000 && leaf->fn <= cpuid_hyp_range_max) || + (leaf->fn >= 0x80000000 && leaf->fn <= cpuid_ext_range_max))) return 0; } @@ -1179,11 +1179,11 @@ static void __head setup_cpuid_table(const struct cc_blob_sev_info *cc_info) const struct snp_cpuid_fn *fn = &cpuid_table->fn[i]; if (fn->eax_in == 0x0) - RIP_REL_REF(cpuid_std_range_max) = fn->eax; + cpuid_std_range_max = fn->eax; else if (fn->eax_in == 0x40000000) - RIP_REL_REF(cpuid_hyp_range_max) = fn->eax; + cpuid_hyp_range_max = fn->eax; else if (fn->eax_in == 0x80000000) - RIP_REL_REF(cpuid_ext_range_max) = fn->eax; + cpuid_ext_range_max = fn->eax; } } @@ -1229,11 +1229,7 @@ static void __head pvalidate_4k_page(unsigned long vaddr, unsigned long paddr, { int ret; - /* - * This can be called very early during boot, so use rIP-relative - * references as needed. - */ - if (RIP_REL_REF(snp_vmpl)) { + if (snp_vmpl) { svsm_pval_4k_page(paddr, validate); } else { ret = pvalidate(vaddr, RMP_PG_SIZE_4K, validate); @@ -1377,7 +1373,7 @@ static bool __head svsm_setup_ca(const struct cc_blob_sev_info *cc_info) if (!secrets_page->svsm_guest_vmpl) sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SVSM_VMPL0); - RIP_REL_REF(snp_vmpl) = secrets_page->svsm_guest_vmpl; + snp_vmpl = secrets_page->svsm_guest_vmpl; caa = secrets_page->svsm_caa; @@ -1392,8 +1388,8 @@ static bool __head svsm_setup_ca(const struct cc_blob_sev_info *cc_info) * The CA is identity mapped when this routine is called, both by the * decompressor code and the early kernel code. */ - RIP_REL_REF(boot_svsm_caa) = (struct svsm_ca *)caa; - RIP_REL_REF(boot_svsm_caa_pa) = caa; + boot_svsm_caa = (struct svsm_ca *)caa; + boot_svsm_caa_pa = caa; /* Advertise the SVSM presence via CPUID. */ cpuid_table = (struct snp_cpuid_table *)snp_cpuid_get_table(); diff --git a/arch/x86/boot/startup/sev-startup.c b/arch/x86/boot/startup/sev-startup.c index 10b636009d1c..e376a340b629 100644 --- a/arch/x86/boot/startup/sev-startup.c +++ b/arch/x86/boot/startup/sev-startup.c @@ -467,10 +467,10 @@ int svsm_perform_call_protocol(struct svsm_call *call) * ghcbs_initialized is set, then it is late in the boot and no need * to worry about rip-relative references in called functions. */ - if (RIP_REL_REF(sev_cfg).ghcbs_initialized) + if (sev_cfg.ghcbs_initialized) ghcb = __sev_get_ghcb(&state); - else if (RIP_REL_REF(boot_ghcb)) - ghcb = RIP_REL_REF(boot_ghcb); + else if (boot_ghcb) + ghcb = boot_ghcb; else ghcb = NULL; @@ -479,7 +479,7 @@ int svsm_perform_call_protocol(struct svsm_call *call) : svsm_perform_msr_protocol(call); } while (ret == -EAGAIN); - if (RIP_REL_REF(sev_cfg).ghcbs_initialized) + if (sev_cfg.ghcbs_initialized) __sev_put_ghcb(&state); native_local_irq_restore(flags); @@ -542,7 +542,7 @@ void __head early_snp_set_memory_private(unsigned long vaddr, unsigned long padd * This eliminates worries about jump tables or checking boot_cpu_data * in the cc_platform_has() function. */ - if (!(RIP_REL_REF(sev_status) & MSR_AMD64_SEV_SNP_ENABLED)) + if (!(sev_status & MSR_AMD64_SEV_SNP_ENABLED)) return; /* @@ -561,7 +561,7 @@ void __head early_snp_set_memory_shared(unsigned long vaddr, unsigned long paddr * This eliminates worries about jump tables or checking boot_cpu_data * in the cc_platform_has() function. 
 	 */
-	if (!(RIP_REL_REF(sev_status) & MSR_AMD64_SEV_SNP_ENABLED))
+	if (!(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
 		return;
 
 	/* Ask hypervisor to mark the memory pages shared in the RMP table. */
@@ -1356,8 +1356,8 @@ static __head void svsm_setup(struct cc_blob_sev_info *cc_info)
 	if (ret)
 		sev_es_terminate(SEV_TERM_SET_LINUX, GHCB_TERM_SVSM_CA_REMAP_FAIL);
 
-	RIP_REL_REF(boot_svsm_caa) = (struct svsm_ca *)pa;
-	RIP_REL_REF(boot_svsm_caa_pa) = pa;
+	boot_svsm_caa = (struct svsm_ca *)pa;
+	boot_svsm_caa_pa = pa;
 }
 
 bool __head snp_init(struct boot_params *bp)
diff --git a/arch/x86/include/asm/sev-internal.h b/arch/x86/include/asm/sev-internal.h
index 73cb774c3639..e54847a69107 100644
--- a/arch/x86/include/asm/sev-internal.h
+++ b/arch/x86/include/asm/sev-internal.h
@@ -68,28 +68,18 @@ extern u64 boot_svsm_caa_pa;
 
 static __always_inline struct svsm_ca *svsm_get_caa(void)
 {
-	/*
-	 * Use rIP-relative references when called early in the boot. If
-	 * ->use_cas is set, then it is late in the boot and no need
-	 * to worry about rIP-relative references.
-	 */
-	if (RIP_REL_REF(sev_cfg).use_cas)
+	if (sev_cfg.use_cas)
 		return this_cpu_read(svsm_caa);
 	else
-		return RIP_REL_REF(boot_svsm_caa);
+		return boot_svsm_caa;
 }
 
 static __always_inline u64 svsm_get_caa_pa(void)
 {
-	/*
-	 * Use rIP-relative references when called early in the boot. If
-	 * ->use_cas is set, then it is late in the boot and no need
-	 * to worry about rIP-relative references.
-	 */
-	if (RIP_REL_REF(sev_cfg).use_cas)
+	if (sev_cfg.use_cas)
 		return this_cpu_read(svsm_caa_pa);
 	else
-		return RIP_REL_REF(boot_svsm_caa_pa);
+		return boot_svsm_caa_pa;
 }
 
 int svsm_perform_call_protocol(struct svsm_call *call);

From patchwork Thu Apr 10 13:41:29 2025
X-Patchwork-Id: 880620
Date: Thu, 10 Apr 2025 15:41:29 +0200
In-Reply-To: <20250410134117.3713574-13-ardb+git@google.com>
References: <20250410134117.3713574-13-ardb+git@google.com>
Message-ID: <20250410134117.3713574-24-ardb+git@google.com>
Subject: [PATCH v4 11/11] x86/asm: Retire RIP_REL_REF()
From: Ard Biesheuvel
To: linux-efi@vger.kernel.org
Cc: x86@kernel.org, mingo@kernel.org, linux-kernel@vger.kernel.org, Ard Biesheuvel , Tom Lendacky , Dionna Amalie Glaze , Kevin Loughlin

From: Ard Biesheuvel

Now that all users have been moved into startup/ where PIC codegen is
used, RIP_REL_REF() is no longer needed. Remove it.
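(Reading aid, not part of the patch: a minimal before/after sketch of the conversion applied across this series, taken from the hunks in the previous patch. Under -fPIC the plain access already compiles to a RIP-relative load, so the wrapper adds nothing.)

	/* Before: explicit wrapper needed when built as position-dependent code */
	if (!(RIP_REL_REF(sev_status) & MSR_AMD64_SEV_SNP_ENABLED))
		return;

	/* After: with -fPIC codegen the plain access is already RIP-relative */
	if (!(sev_status & MSR_AMD64_SEV_SNP_ENABLED))
		return;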
Signed-off-by: Ard Biesheuvel --- arch/x86/include/asm/asm.h | 5 ----- 1 file changed, 5 deletions(-) diff --git a/arch/x86/include/asm/asm.h b/arch/x86/include/asm/asm.h index a9f07799e337..eef0771512de 100644 --- a/arch/x86/include/asm/asm.h +++ b/arch/x86/include/asm/asm.h @@ -120,11 +120,6 @@ static __always_inline __pure void *rip_rel_ptr(void *p) return p; } -#ifndef __pic__ -#define RIP_REL_REF(var) (*(typeof(&(var)))rip_rel_ptr(&(var))) -#else -#define RIP_REL_REF(var) (var) -#endif #endif /*