From patchwork Fri Dec 13 16:47:57 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 850845
Date: Fri, 13 Dec 2024 16:47:57 +0000
In-Reply-To: <20241213164811.2006197-1-tabba@google.com>
References: <20241213164811.2006197-1-tabba@google.com>
Message-ID: <20241213164811.2006197-2-tabba@google.com>
Subject: [RFC PATCH v4 01/14] mm: Consolidate freeing of typed folios on final folio_put()
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Some folio types, such as hugetlb, handle freeing their own folios. Moreover,
guest_memfd will require being notified once a folio's reference count reaches
0 to facilitate shared-to-private folio conversion, without the folio actually
being freed at that point.

As a first step towards that, this patch consolidates the freeing of folios
that have a type. The first user is hugetlb folios. Later in this series,
guest_memfd becomes the second user.
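For illustration, the consolidated final-put path boils down to a single
dispatch point keyed on the folio's type (a simplified sketch of the mm/swap.c
change below, not the verbatim hunk; guest_memfd hooks in as another case later
in this series):

	void __folio_put(struct folio *folio)
	{
		if (unlikely(folio_has_type(folio))) {
			/* e.g. PGTY_hugetlb today, PGTY_guestmem later on */
			free_typed_folio(folio);
			return;
		}
		/* ... untyped folios are freed exactly as before ... */
	}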
Suggested-by: David Hildenbrand
Signed-off-by: Fuad Tabba
---
 include/linux/page-flags.h | 15 +++++++++++++++
 mm/swap.c                  | 24 +++++++++++++++++++-----
 2 files changed, 34 insertions(+), 5 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index cf46ac720802..aca57802d7c7 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -970,6 +970,21 @@ static inline bool page_has_type(const struct page *page)
 	return page_mapcount_is_type(data_race(page->page_type));
 }
 
+static inline int page_get_type(const struct page *page)
+{
+	return page->page_type >> 24;
+}
+
+static inline bool folio_has_type(const struct folio *folio)
+{
+	return page_has_type(&folio->page);
+}
+
+static inline int folio_get_type(const struct folio *folio)
+{
+	return page_get_type(&folio->page);
+}
+
 #define FOLIO_TYPE_OPS(lname, fname)					\
 static __always_inline bool folio_test_##fname(const struct folio *folio) \
 {									\
diff --git a/mm/swap.c b/mm/swap.c
index 10decd9dffa1..6f01b56bce13 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -94,6 +94,20 @@ static void page_cache_release(struct folio *folio)
 		unlock_page_lruvec_irqrestore(lruvec, flags);
 }
 
+static void free_typed_folio(struct folio *folio)
+{
+	switch (folio_get_type(folio)) {
+	case PGTY_hugetlb:
+		free_huge_folio(folio);
+		return;
+	case PGTY_offline:
+		/* Nothing to do, it's offline. */
+		return;
+	default:
+		WARN_ON_ONCE(1);
+	}
+}
+
 void __folio_put(struct folio *folio)
 {
 	if (unlikely(folio_is_zone_device(folio))) {
@@ -101,8 +115,8 @@ void __folio_put(struct folio *folio)
 		return;
 	}
 
-	if (folio_test_hugetlb(folio)) {
-		free_huge_folio(folio);
+	if (unlikely(folio_has_type(folio))) {
+		free_typed_folio(folio);
 		return;
 	}
 
@@ -934,13 +948,13 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
 		if (!folio_ref_sub_and_test(folio, nr_refs))
 			continue;
 
-		/* hugetlb has its own memcg */
-		if (folio_test_hugetlb(folio)) {
+		if (unlikely(folio_has_type(folio))) {
+			/* typed folios have their own memcg, if any */
 			if (lruvec) {
 				unlock_page_lruvec_irqrestore(lruvec, flags);
 				lruvec = NULL;
 			}
-			free_huge_folio(folio);
+			free_typed_folio(folio);
 			continue;
 		}
 		folio_unqueue_deferred_split(folio);

From patchwork Fri Dec 13 16:47:58 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 850364
Date: Fri, 13 Dec 2024 16:47:58 +0000
In-Reply-To: <20241213164811.2006197-1-tabba@google.com>
References: <20241213164811.2006197-1-tabba@google.com>
Message-ID: <20241213164811.2006197-3-tabba@google.com>
Subject: [RFC PATCH v4 02/14] KVM: guest_memfd: Make guest mem use guest mem inodes instead of anonymous inodes
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
From: Ackerley Tng

Using guest mem inodes allows us to store metadata for the backing memory on
the inode. Metadata will be added in a later patch to support HugeTLB pages.

Metadata about backing memory should not be stored on the file, since the file
represents a guest_memfd's binding with a struct kvm, and metadata about
backing memory is not unique to a specific binding and struct kvm.

Signed-off-by: Ackerley Tng
Signed-off-by: Fuad Tabba
---
 include/uapi/linux/magic.h |   1 +
 virt/kvm/guest_memfd.c     | 119 ++++++++++++++++++++++++++++++-------
 2 files changed, 100 insertions(+), 20 deletions(-)

diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index bb575f3ab45e..169dba2a6920 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -103,5 +103,6 @@
 #define DEVMEM_MAGIC		0x454d444d	/* "DMEM" */
 #define SECRETMEM_MAGIC		0x5345434d	/* "SECM" */
 #define PID_FS_MAGIC		0x50494446	/* "PIDF" */
+#define GUEST_MEMORY_MAGIC	0x474d454d	/* "GMEM" */
 
 #endif /* __LINUX_MAGIC_H__ */
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 47a9f68f7b24..198554b1f0b5 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1,12 +1,17 @@
 // SPDX-License-Identifier: GPL-2.0
+#include
+#include
 #include
 #include
 #include
+#include
 #include
 #include
 
 #include "kvm_mm.h"
 
+static struct vfsmount *kvm_gmem_mnt;
+
 struct kvm_gmem {
 	struct kvm *kvm;
 	struct xarray bindings;
@@ -307,6 +312,38 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
 	return gfn - slot->base_gfn + slot->gmem.pgoff;
 }
 
+static const struct super_operations kvm_gmem_super_operations = {
+	.statfs		= simple_statfs,
+};
+
+static int kvm_gmem_init_fs_context(struct fs_context *fc)
+{
+	struct pseudo_fs_context *ctx;
+
+	if (!init_pseudo(fc, GUEST_MEMORY_MAGIC))
+		return -ENOMEM;
+
+	ctx = fc->fs_private;
+	ctx->ops = &kvm_gmem_super_operations;
+
+	return 0;
+}
+
+static struct file_system_type kvm_gmem_fs = {
+	.name		 = "kvm_guest_memory",
+	.init_fs_context = kvm_gmem_init_fs_context,
+	.kill_sb	 = kill_anon_super,
+};
+
+static void kvm_gmem_init_mount(void)
+{
+	kvm_gmem_mnt = kern_mount(&kvm_gmem_fs);
+	BUG_ON(IS_ERR(kvm_gmem_mnt));
+
+	/* For giggles. Userspace can never map this anyways. */
+	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
+}
+
 static struct file_operations kvm_gmem_fops = {
 	.open		= generic_file_open,
 	.release	= kvm_gmem_release,
@@ -316,6 +353,8 @@ static struct file_operations kvm_gmem_fops = {
 void kvm_gmem_init(struct module *module)
 {
 	kvm_gmem_fops.owner = module;
+
+	kvm_gmem_init_mount();
 }
 
 static int kvm_gmem_migrate_folio(struct address_space *mapping,
@@ -397,11 +436,67 @@ static const struct inode_operations kvm_gmem_iops = {
 	.setattr	= kvm_gmem_setattr,
 };
 
+static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
+						      loff_t size, u64 flags)
+{
+	const struct qstr qname = QSTR_INIT(name, strlen(name));
+	struct inode *inode;
+	int err;
+
+	inode = alloc_anon_inode(kvm_gmem_mnt->mnt_sb);
+	if (IS_ERR(inode))
+		return inode;
+
+	err = security_inode_init_security_anon(inode, &qname, NULL);
+	if (err) {
+		iput(inode);
+		return ERR_PTR(err);
+	}
+
+	inode->i_private = (void *)(unsigned long)flags;
+	inode->i_op = &kvm_gmem_iops;
+	inode->i_mapping->a_ops = &kvm_gmem_aops;
+	inode->i_mode |= S_IFREG;
+	inode->i_size = size;
+	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+	mapping_set_inaccessible(inode->i_mapping);
+	/* Unmovable mappings are supposed to be marked unevictable as well. */
+	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
+
+	return inode;
+}
+
+static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
+						  u64 flags)
+{
+	static const char *name = "[kvm-gmem]";
+	struct inode *inode;
+	struct file *file;
+
+	if (kvm_gmem_fops.owner && !try_module_get(kvm_gmem_fops.owner))
+		return ERR_PTR(-ENOENT);
+
+	inode = kvm_gmem_inode_make_secure_inode(name, size, flags);
+	if (IS_ERR(inode))
+		return ERR_CAST(inode);
+
+	file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR,
+				 &kvm_gmem_fops);
+	if (IS_ERR(file)) {
+		iput(inode);
+		return file;
+	}
+
+	file->f_mapping = inode->i_mapping;
+	file->f_flags |= O_LARGEFILE;
+	file->private_data = priv;
+
+	return file;
+}
+
 static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 {
-	const char *anon_name = "[kvm-gmem]";
 	struct kvm_gmem *gmem;
-	struct inode *inode;
 	struct file *file;
 	int fd, err;
 
@@ -415,32 +510,16 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_fd;
 	}
 
-	file = anon_inode_create_getfile(anon_name, &kvm_gmem_fops, gmem,
-					 O_RDWR, NULL);
+	file = kvm_gmem_inode_create_getfile(gmem, size, flags);
 	if (IS_ERR(file)) {
 		err = PTR_ERR(file);
 		goto err_gmem;
 	}
 
-	file->f_flags |= O_LARGEFILE;
-
-	inode = file->f_inode;
-	WARN_ON(file->f_mapping != inode->i_mapping);
-
-	inode->i_private = (void *)(unsigned long)flags;
-	inode->i_op = &kvm_gmem_iops;
-	inode->i_mapping->a_ops = &kvm_gmem_aops;
-	inode->i_mode |= S_IFREG;
-	inode->i_size = size;
-	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
-	mapping_set_inaccessible(inode->i_mapping);
-	/* Unmovable mappings are supposed to be marked unevictable as well. */
-	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
-
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
-	list_add(&gmem->entry, &inode->i_mapping->i_private_list);
+	list_add(&gmem->entry, &file_inode(file)->i_mapping->i_private_list);
 
 	fd_install(fd, file);
 	return fd;
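
For clarity, the ownership split that results from this patch can be summarized
as follows (a sketch added to this archive, not text from the original posting;
it restates what the hunks above do):

	/*
	 * struct file (one per guest_memfd binding):
	 *   ->private_data               = struct kvm_gmem (binding to one struct kvm)
	 *
	 * struct inode (one per backing store, shareable between files):
	 *   ->i_private                  = guest_memfd creation flags
	 *   ->i_mapping                  = the backing pages themselves
	 *   ->i_mapping->i_private_list  = list of bindings (struct kvm_gmem)
	 *
	 * Later patches hang further per-inode metadata, e.g. mappability
	 * state, off the inode as well.
	 */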

From patchwork Fri Dec 13 16:47:59 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 850844
Date: Fri, 13 Dec 2024 16:47:59 +0000
In-Reply-To: <20241213164811.2006197-1-tabba@google.com>
References: <20241213164811.2006197-1-tabba@google.com>
Message-ID: <20241213164811.2006197-4-tabba@google.com>
Subject: [RFC PATCH v4 03/14] KVM: guest_memfd: Introduce kvm_gmem_get_pfn_locked(), which retains the folio lock
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Create a new variant of kvm_gmem_get_pfn(), which retains the folio lock if it
returns successfully. This is needed in subsequent patches in order to protect
against races when checking whether a folio can be mapped by the host.
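For illustration, a caller of the new variant would look roughly like this
(sketch only, using the signatures added below; the real user arrives later in
the series):

	struct page *page;
	kvm_pfn_t pfn;
	int max_order;
	int r;

	r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, &pfn, &page, &max_order);
	if (r)
		return r;

	/*
	 * The folio backing @page is still locked here, so its state (e.g.
	 * whether the host may map it) can be checked without racing against
	 * concurrent faults.
	 */

	unlock_page(page);	/* kvm_gmem_get_pfn() does exactly this */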
Signed-off-by: Fuad Tabba
---
 include/linux/kvm_host.h | 11 +++++++++++
 virt/kvm/guest_memfd.c   | 27 ++++++++++++++++++++-------
 2 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 401439bb21e3..cda3ed4c3c27 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2500,6 +2500,9 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
 		     int *max_order);
+int kvm_gmem_get_pfn_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+			    gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+			    int *max_order);
 #else
 static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 				   struct kvm_memory_slot *slot, gfn_t gfn,
@@ -2509,6 +2512,14 @@ static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 	KVM_BUG_ON(1, kvm);
 	return -EIO;
 }
+static inline int kvm_gmem_get_pfn_locked(struct kvm *kvm,
+					  struct kvm_memory_slot *slot,
+					  gfn_t gfn, kvm_pfn_t *pfn,
+					  struct page **page, int *max_order)
+{
+	KVM_BUG_ON(1, kvm);
+	return -EIO;
+}
 #endif /* CONFIG_KVM_PRIVATE_MEM */
 
 #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_PREPARE
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 198554b1f0b5..6453658d2650 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -672,9 +672,9 @@ static struct folio *__kvm_gmem_get_pfn(struct file *file,
 	return folio;
 }
 
-int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
-		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
-		     int *max_order)
+int kvm_gmem_get_pfn_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+			    gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+			    int *max_order)
 {
 	pgoff_t index = kvm_gmem_get_index(slot, gfn);
 	struct file *file = kvm_gmem_get_file(slot);
@@ -694,17 +694,30 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 	if (!is_prepared)
 		r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);
 
-	folio_unlock(folio);
-
-	if (!r)
+	if (!r) {
 		*page = folio_file_page(folio, index);
-	else
+	} else {
+		folio_unlock(folio);
 		folio_put(folio);
+	}
 
 out:
 	fput(file);
 	return r;
 }
+EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn_locked);
+
+int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+		     int *max_order)
+{
+	int r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, pfn, page, max_order);
+
+	if (!r)
+		unlock_page(*page);
+
+	return r;
+}
 EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn);
 
 #ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM

From patchwork Fri Dec 13 16:48:00 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 850363
Date: Fri, 13 Dec 2024 16:48:00 +0000
In-Reply-To: <20241213164811.2006197-1-tabba@google.com>
References: <20241213164811.2006197-1-tabba@google.com>
Message-ID: <20241213164811.2006197-5-tabba@google.com>
Subject: [RFC PATCH v4 04/14] KVM: guest_memfd: Track mappability within a struct kvm_gmem_private
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
From: Ackerley Tng

Track whether guest_memfd memory can be mapped within the inode, since it is a
property of the guest_memfd's memory contents.

The guest_memfd PRIVATE memory attribute is not used for two reasons. First,
because it reflects the userspace expectation for that memory location and can
therefore be toggled by userspace. Second, although each guest_memfd file has
a 1:1 binding with a KVM instance, the plan is to allow multiple files per
inode, e.g. to allow intra-host migration to a new KVM instance without
destroying the guest_memfd.

Signed-off-by: Ackerley Tng
Co-developed-by: Vishal Annapurve
Signed-off-by: Vishal Annapurve
Co-developed-by: Fuad Tabba
Signed-off-by: Fuad Tabba
---
 virt/kvm/guest_memfd.c | 56 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 51 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 6453658d2650..0a7b6cf8bd8f 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -18,6 +18,17 @@ struct kvm_gmem {
 	struct list_head entry;
 };
 
+struct kvm_gmem_inode_private {
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+	struct xarray mappable_offsets;
+#endif
+};
+
+static struct kvm_gmem_inode_private *kvm_gmem_private(struct inode *inode)
+{
+	return inode->i_mapping->i_private_data;
+}
+
 /**
  * folio_file_pfn - like folio_file_page, but return a pfn.
  * @folio: The folio which contains this index.
@@ -312,8 +323,28 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
 	return gfn - slot->base_gfn + slot->gmem.pgoff;
 }
 
+static void kvm_gmem_evict_inode(struct inode *inode)
+{
+	struct kvm_gmem_inode_private *private = kvm_gmem_private(inode);
+
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+	/*
+	 * .evict_inode can be called before private data is set up if there are
+	 * issues during inode creation.
+	 */
+	if (private)
+		xa_destroy(&private->mappable_offsets);
+#endif
+
+	truncate_inode_pages_final(inode->i_mapping);
+
+	kfree(private);
+	clear_inode(inode);
+}
+
 static const struct super_operations kvm_gmem_super_operations = {
-	.statfs		= simple_statfs,
+	.statfs		= simple_statfs,
+	.evict_inode	= kvm_gmem_evict_inode,
 };
 
 static int kvm_gmem_init_fs_context(struct fs_context *fc)
@@ -440,6 +471,7 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
 						      loff_t size, u64 flags)
 {
 	const struct qstr qname = QSTR_INIT(name, strlen(name));
+	struct kvm_gmem_inode_private *private;
 	struct inode *inode;
 	int err;
 
@@ -448,10 +480,19 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
 		return inode;
 
 	err = security_inode_init_security_anon(inode, &qname, NULL);
-	if (err) {
-		iput(inode);
-		return ERR_PTR(err);
-	}
+	if (err)
+		goto out;
+
+	err = -ENOMEM;
+	private = kzalloc(sizeof(*private), GFP_KERNEL);
+	if (!private)
+		goto out;
+
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+	xa_init(&private->mappable_offsets);
+#endif
+
+	inode->i_mapping->i_private_data = private;
 
 	inode->i_private = (void *)(unsigned long)flags;
 	inode->i_op = &kvm_gmem_iops;
@@ -464,6 +505,11 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
 	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
 
 	return inode;
+
+out:
+	iput(inode);
+
+	return ERR_PTR(err);
 }
 
 static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
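
For illustration, once this patch is in, any code that can reach the
guest_memfd file can get at the shared, per-inode state like so (a sketch; the
xarray is only populated by the following patches):

	struct inode *inode = file_inode(slot->gmem.file);
	struct kvm_gmem_inode_private *private = kvm_gmem_private(inode);

	/* One xarray slot per page offset, shared by every binding of the inode. */
	void *entry = xa_load(&private->mappable_offsets, pgoff);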

From patchwork Fri Dec 13 16:48:01 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 850843
Date: Fri, 13 Dec 2024 16:48:01 +0000
In-Reply-To: <20241213164811.2006197-1-tabba@google.com>
References: <20241213164811.2006197-1-tabba@google.com>
Message-ID: <20241213164811.2006197-6-tabba@google.com>
Subject: [RFC PATCH v4 05/14] KVM: guest_memfd: Folio mappability states and functions that manage their transition
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
To allow restricted mapping of guest_memfd folios by the host, guest_memfd
needs to track whether they can be mapped and by whom, since the mapping will
only be allowed under conditions where it is safe to access these folios.
These conditions depend on the folios being explicitly shared with the host,
or not yet exposed to the guest (e.g., at initialization).

This patch introduces states that determine whether the host and the guest can
fault in the folios, as well as the functions that manage transitioning
between those states.

Signed-off-by: Fuad Tabba
---
 include/linux/kvm_host.h |  53 ++++++++++++++
 virt/kvm/guest_memfd.c   | 153 +++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c      |  92 +++++++++++++++++++++++
 3 files changed, 298 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index cda3ed4c3c27..84aa7908a5dd 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2564,4 +2564,57 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 				    struct kvm_pre_fault_memory *range);
 #endif
 
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+bool kvm_gmem_is_mappable(struct kvm *kvm, gfn_t gfn, gfn_t end);
+int kvm_gmem_set_mappable(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_slot_gmem_set_mappable(struct kvm_memory_slot *slot, gfn_t start,
+			       gfn_t end);
+int kvm_slot_gmem_clear_mappable(struct kvm_memory_slot *slot, gfn_t start,
+				 gfn_t end);
+bool kvm_slot_gmem_is_mappable(struct kvm_memory_slot *slot, gfn_t gfn);
+bool kvm_slot_gmem_is_guest_mappable(struct kvm_memory_slot *slot, gfn_t gfn);
+#else
+static inline bool kvm_gmem_is_mappable(struct kvm *kvm, gfn_t gfn, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return false;
+}
+static inline int kvm_gmem_set_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start,
+					  gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline int kvm_slot_gmem_set_mappable(struct kvm_memory_slot *slot,
+					     gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline int kvm_slot_gmem_clear_mappable(struct kvm_memory_slot *slot,
+					       gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline bool kvm_slot_gmem_is_mappable(struct kvm_memory_slot *slot,
+					     gfn_t gfn)
+{
+	WARN_ON_ONCE(1);
+	return false;
+}
+static inline bool kvm_slot_gmem_is_guest_mappable(struct kvm_memory_slot *slot,
+						   gfn_t gfn)
+{
+	WARN_ON_ONCE(1);
+	return false;
+}
+#endif /* CONFIG_KVM_GMEM_MAPPABLE */
+
 #endif
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 0a7b6cf8bd8f..d1c192927cf7 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -375,6 +375,159 @@ static void kvm_gmem_init_mount(void)
 	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
 }
 
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+/*
+ * An enum of the valid states that describe who can map a folio.
+ * Bit 0: if set guest cannot map the page
+ * Bit 1: if set host cannot map the page
+ */
+enum folio_mappability {
+	KVM_GMEM_ALL_MAPPABLE	= 0b00,	/* Mappable by host and guest. */
+	KVM_GMEM_GUEST_MAPPABLE	= 0b10,	/* Mappable only by guest. */
+	KVM_GMEM_NONE_MAPPABLE	= 0b11,	/* Not mappable, transient state. */
+};
+
+/*
+ * Marks the range [start, end) as mappable by both the host and the guest.
+ * Usually called when guest shares memory with the host.
+ */
+static int gmem_set_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
+{
+	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
+	void *xval = xa_mk_value(KVM_GMEM_ALL_MAPPABLE);
+	pgoff_t i;
+	int r = 0;
+
+	filemap_invalidate_lock(inode->i_mapping);
+	for (i = start; i < end; i++) {
+		r = xa_err(xa_store(mappable_offsets, i, xval, GFP_KERNEL));
+		if (r)
+			break;
+	}
+	filemap_invalidate_unlock(inode->i_mapping);
+
+	return r;
+}
+
+/*
+ * Marks the range [start, end) as not mappable by the host. If the host doesn't
+ * have any references to a particular folio, then that folio is marked as
+ * mappable by the guest.
+ *
+ * However, if the host still has references to the folio, then the folio is
+ * marked and not mappable by anyone. Marking it is not mappable allows it to
+ * drain all references from the host, and to ensure that the hypervisor does
+ * not transition the folio to private, since the host still might access it.
+ *
+ * Usually called when guest unshares memory with the host.
+ */
+static int gmem_clear_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
+{
+	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
+	void *xval_guest = xa_mk_value(KVM_GMEM_GUEST_MAPPABLE);
+	void *xval_none = xa_mk_value(KVM_GMEM_NONE_MAPPABLE);
+	pgoff_t i;
+	int r = 0;
+
+	filemap_invalidate_lock(inode->i_mapping);
+	for (i = start; i < end; i++) {
+		struct folio *folio;
+		int refcount = 0;
+
+		folio = filemap_lock_folio(inode->i_mapping, i);
+		if (!IS_ERR(folio)) {
+			refcount = folio_ref_count(folio);
+		} else {
+			r = PTR_ERR(folio);
+			if (WARN_ON_ONCE(r != -ENOENT))
+				break;
+
+			folio = NULL;
+		}
+
+		/* +1 references are expected because of filemap_lock_folio(). */
+		if (folio && refcount > folio_nr_pages(folio) + 1) {
+			/*
+			 * Outstanding references, the folio cannot be faulted
+			 * in by anyone until they're dropped.
+			 */
+			r = xa_err(xa_store(mappable_offsets, i, xval_none, GFP_KERNEL));
+		} else {
+			/*
+			 * No outstanding references. Transition the folio to
+			 * guest mappable immediately.
+			 */
+			r = xa_err(xa_store(mappable_offsets, i, xval_guest, GFP_KERNEL));
+		}
+
+		if (folio) {
+			folio_unlock(folio);
+			folio_put(folio);
+		}
+
+		if (WARN_ON_ONCE(r))
+			break;
+	}
+	filemap_invalidate_unlock(inode->i_mapping);
+
+	return r;
+}
+
+static bool gmem_is_mappable(struct inode *inode, pgoff_t pgoff)
+{
+	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
+	unsigned long r;
+
+	r = xa_to_value(xa_load(mappable_offsets, pgoff));
+
+	return (r == KVM_GMEM_ALL_MAPPABLE);
+}
+
+static bool gmem_is_guest_mappable(struct inode *inode, pgoff_t pgoff)
+{
+	struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
+	unsigned long r;
+
+	r = xa_to_value(xa_load(mappable_offsets, pgoff));
+
+	return (r == KVM_GMEM_ALL_MAPPABLE || r == KVM_GMEM_GUEST_MAPPABLE);
+}
+
+int kvm_slot_gmem_set_mappable(struct kvm_memory_slot *slot, gfn_t start, gfn_t end)
+{
+	struct inode *inode = file_inode(slot->gmem.file);
+	pgoff_t start_off = slot->gmem.pgoff + start - slot->base_gfn;
+	pgoff_t end_off = start_off + end - start;
+
+	return gmem_set_mappable(inode, start_off, end_off);
+}
+
+int kvm_slot_gmem_clear_mappable(struct kvm_memory_slot *slot, gfn_t start, gfn_t end)
+{
+	struct inode *inode = file_inode(slot->gmem.file);
+	pgoff_t start_off = slot->gmem.pgoff + start - slot->base_gfn;
+	pgoff_t end_off = start_off + end - start;
+
+	return gmem_clear_mappable(inode, start_off, end_off);
+}
+
+bool kvm_slot_gmem_is_mappable(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+	struct inode *inode = file_inode(slot->gmem.file);
+	unsigned long pgoff = slot->gmem.pgoff + gfn - slot->base_gfn;
+
+	return gmem_is_mappable(inode, pgoff);
+}
+
+bool kvm_slot_gmem_is_guest_mappable(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+	struct inode *inode = file_inode(slot->gmem.file);
+	unsigned long pgoff = slot->gmem.pgoff + gfn - slot->base_gfn;
+
+	return gmem_is_guest_mappable(inode, pgoff);
+}
+#endif /* CONFIG_KVM_GMEM_MAPPABLE */
+
 static struct file_operations kvm_gmem_fops = {
 	.open		= generic_file_open,
 	.release	= kvm_gmem_release,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index de2c11dae231..fffff01cebe7 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3094,6 +3094,98 @@ static int next_segment(unsigned long len, int offset)
 		return len;
 }
 
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+bool kvm_gmem_is_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	struct kvm_memslot_iter iter;
+	bool r = true;
+
+	mutex_lock(&kvm->slots_lock);
+
+	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+		struct kvm_memory_slot *memslot = iter.slot;
+		gfn_t gfn_start, gfn_end, i;
+
+		if (!kvm_slot_can_be_private(memslot))
+			continue;
+
+		gfn_start = max(start, memslot->base_gfn);
+		gfn_end = min(end, memslot->base_gfn + memslot->npages);
+		if (WARN_ON_ONCE(gfn_start >= gfn_end))
+			continue;
+
+		for (i = gfn_start; i < gfn_end; i++) {
+			r = kvm_slot_gmem_is_mappable(memslot, i);
+			if (r)
+				goto out;
+		}
+	}
+out:
+	mutex_unlock(&kvm->slots_lock);
+
+	return r;
+}
+
+int kvm_gmem_set_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	struct kvm_memslot_iter iter;
+	int r = 0;
+
+	mutex_lock(&kvm->slots_lock);
+
+	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+		struct kvm_memory_slot *memslot = iter.slot;
+		gfn_t gfn_start, gfn_end;
+
+		if (!kvm_slot_can_be_private(memslot))
+			continue;
+
+		gfn_start = max(start, memslot->base_gfn);
+		gfn_end = min(end, memslot->base_gfn + memslot->npages);
+		if (WARN_ON_ONCE(start >= end))
+			continue;
+
+		r = kvm_slot_gmem_set_mappable(memslot, gfn_start, gfn_end);
+		if (WARN_ON_ONCE(r))
+			break;
+	}
+
+	mutex_unlock(&kvm->slots_lock);
+
+	return r;
+}
+
+int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	struct kvm_memslot_iter iter;
+	int r = 0;
+
+	mutex_lock(&kvm->slots_lock);
+
+	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+		struct kvm_memory_slot *memslot = iter.slot;
+		gfn_t gfn_start, gfn_end;
+
+		if (!kvm_slot_can_be_private(memslot))
+			continue;
+
+		gfn_start = max(start, memslot->base_gfn);
+		gfn_end = min(end, memslot->base_gfn + memslot->npages);
+		if (WARN_ON_ONCE(start >= end))
+			continue;
+
+		r = kvm_slot_gmem_clear_mappable(memslot, gfn_start, gfn_end);
+		if (WARN_ON_ONCE(r))
+			break;
+	}
+
+	mutex_unlock(&kvm->slots_lock);
+
+	return r;
+}
+
+#endif /* CONFIG_KVM_GMEM_MAPPABLE */
+
 /* Copy @len bytes from guest memory at '(@gfn * PAGE_SIZE) + @offset' to @data */
 static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
 				 void *data, int offset, int len)
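
For reference, the transitions implemented above can be summarized as follows
(an editorial summary of gmem_set_mappable()/gmem_clear_mappable(), not text
from the original posting; the final step is completed by the next patch):

	/*
	 * share (gmem_set_mappable):     any state        -> ALL_MAPPABLE
	 * unshare (gmem_clear_mappable): no host refs     -> GUEST_MAPPABLE
	 *                                host refs remain -> NONE_MAPPABLE
	 * last host reference dropped:   NONE_MAPPABLE    -> GUEST_MAPPABLE
	 *                                (wired up via the folio_put callback
	 *                                 added in the next patch)
	 */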

From patchwork Fri Dec 13 16:48:02 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 850362
Date: Fri, 13 Dec 2024 16:48:02 +0000
In-Reply-To: <20241213164811.2006197-1-tabba@google.com>
References: <20241213164811.2006197-1-tabba@google.com>
Message-ID: <20241213164811.2006197-7-tabba@google.com>
Subject: [RFC PATCH v4 06/14] KVM: guest_memfd: Handle final folio_put() of guestmem pages
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Before transitioning a guest_memfd folio to unshared, thereby disallowing
access by the host and allowing the hypervisor to transition its view of the
guest page to private, we need to be sure that the host doesn't have any
references to the folio.

This patch introduces a new type for guest_memfd folios, and uses that to
register a callback that informs the guest_memfd subsystem when the last
reference is dropped, so that it knows the host no longer has any references.
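For illustration, the intended unshare flow then becomes (a sketch based on the
helpers added below; the caller that actually drives this is not part of this
series yet):

	/* With the filemap invalidate_lock held and the folio locked: */
	r = __gmem_register_callback(folio, inode, index);
	if (r == -EAGAIN) {
		/*
		 * The host still holds references. The folio is now typed as
		 * guestmem, so the final folio_put() will invoke
		 * kvm_gmem_handle_folio_put(), letting guest_memfd finish the
		 * transition to guest-only mappability.
		 */
	}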
the host and allowing the hypervisor to transition its view of the guest page as private, we need to be sure that the host doesn't have any references to the folio. This patch introduces a new type for guest_memfd folios, and uses that to register a callback that informs the guest_memfd subsystem when the last reference is dropped, therefore knowing that the host doesn't have any remaining references. Signed-off-by: Fuad Tabba --- The function kvm_slot_gmem_register_callback() isn't used in this series. It will be used later in code that performs unsharing of memory. I have tested it with pKVM, based on downstream code [*]. It's included in this RFC since it demonstrates the plan to handle unsharing of private folios. [*] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/guestmem-6.13-v4-pkvm --- include/linux/kvm_host.h | 11 +++ include/linux/page-flags.h | 7 ++ mm/debug.c | 1 + mm/swap.c | 4 + virt/kvm/guest_memfd.c | 145 +++++++++++++++++++++++++++++++++++++ 5 files changed, 168 insertions(+) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index 84aa7908a5dd..7ada5f78ded4 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -2574,6 +2574,8 @@ int kvm_slot_gmem_clear_mappable(struct kvm_memory_slot *slot, gfn_t start, gfn_t end); bool kvm_slot_gmem_is_mappable(struct kvm_memory_slot *slot, gfn_t gfn); bool kvm_slot_gmem_is_guest_mappable(struct kvm_memory_slot *slot, gfn_t gfn); +int kvm_slot_gmem_register_callback(struct kvm_memory_slot *slot, gfn_t gfn); +void kvm_gmem_handle_folio_put(struct folio *folio); #else static inline bool kvm_gmem_is_mappable(struct kvm *kvm, gfn_t gfn, gfn_t end) { @@ -2615,6 +2617,15 @@ static inline bool kvm_slot_gmem_is_guest_mappable(struct kvm_memory_slot *slot, WARN_ON_ONCE(1); return false; } +int kvm_slot_gmem_register_callback(struct kvm_memory_slot *slot, gfn_t gfn) +{ + WARN_ON_ONCE(1); + return -EINVAL; +} +static inline void kvm_gmem_handle_folio_put(struct folio *folio) +{ + WARN_ON_ONCE(1); +} #endif /* CONFIG_KVM_GMEM_MAPPABLE */ #endif diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index aca57802d7c7..b0e8e43de77c 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -950,6 +950,7 @@ enum pagetype { PGTY_slab = 0xf5, PGTY_zsmalloc = 0xf6, PGTY_unaccepted = 0xf7, + PGTY_guestmem = 0xf8, PGTY_mapcount_underflow = 0xff }; @@ -1099,6 +1100,12 @@ FOLIO_TYPE_OPS(hugetlb, hugetlb) FOLIO_TEST_FLAG_FALSE(hugetlb) #endif +#ifdef CONFIG_KVM_GMEM_MAPPABLE +FOLIO_TYPE_OPS(guestmem, guestmem) +#else +FOLIO_TEST_FLAG_FALSE(guestmem) +#endif + PAGE_TYPE_OPS(Zsmalloc, zsmalloc, zsmalloc) /* diff --git a/mm/debug.c b/mm/debug.c index 95b6ab809c0e..db93be385ed9 100644 --- a/mm/debug.c +++ b/mm/debug.c @@ -56,6 +56,7 @@ static const char *page_type_names[] = { DEF_PAGETYPE_NAME(table), DEF_PAGETYPE_NAME(buddy), DEF_PAGETYPE_NAME(unaccepted), + DEF_PAGETYPE_NAME(guestmem), }; static const char *page_type_name(unsigned int page_type) diff --git a/mm/swap.c b/mm/swap.c index 6f01b56bce13..15220eaabc86 100644 --- a/mm/swap.c +++ b/mm/swap.c @@ -37,6 +37,7 @@ #include #include #include +#include #include "internal.h" @@ -103,6 +104,9 @@ static void free_typed_folio(struct folio *folio) case PGTY_offline: /* Nothing to do, it's offline. 
*/ return; + case PGTY_guestmem: + kvm_gmem_handle_folio_put(folio); + return; default: WARN_ON_ONCE(1); }
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index d1c192927cf7..5ecaa5dfcd00 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -387,6 +387,28 @@ enum folio_mappability { KVM_GMEM_NONE_MAPPABLE = 0b11, /* Not mappable, transient state. */ }; +/* + * Unregisters the __folio_put() callback from the folio. + * + * Restores a folio's refcount after all pending references have been released, + * and removes the folio type, thereby removing the callback. Now the folio can + * be freed normally once all actual references have been dropped. + * + * Must be called with the filemap (inode->i_mapping) invalidate_lock held. + * Must also have exclusive access to the folio: folio must be either locked, or + * gmem holds the only reference. + */ +static void __kvm_gmem_restore_pending_folio(struct folio *folio) +{ + if (WARN_ON_ONCE(folio_mapped(folio) || !folio_test_guestmem(folio))) + return; + + WARN_ON_ONCE(!folio_test_locked(folio) || folio_ref_count(folio) > 1); + + __folio_clear_guestmem(folio); + folio_ref_add(folio, folio_nr_pages(folio)); +} + /* * Marks the range [start, end) as mappable by both the host and the guest. * Usually called when guest shares memory with the host. @@ -400,7 +422,31 @@ static int gmem_set_mappable(struct inode *inode, pgoff_t start, pgoff_t end) filemap_invalidate_lock(inode->i_mapping); for (i = start; i < end; i++) { + struct folio *folio = NULL; + + /* + * If the folio is NONE_MAPPABLE, it indicates that it is + * transitioning to private (GUEST_MAPPABLE). Transition it to + * shared (ALL_MAPPABLE) immediately, and remove the callback. + */ + if (xa_to_value(xa_load(mappable_offsets, i)) == KVM_GMEM_NONE_MAPPABLE) { + folio = filemap_lock_folio(inode->i_mapping, i); + if (WARN_ON_ONCE(IS_ERR(folio))) { + r = PTR_ERR(folio); + break; + } + + if (folio_test_guestmem(folio)) + __kvm_gmem_restore_pending_folio(folio); + } + r = xa_err(xa_store(mappable_offsets, i, xval, GFP_KERNEL)); + + if (folio) { + folio_unlock(folio); + folio_put(folio); + } + if (r) break; } @@ -473,6 +519,105 @@ static int gmem_clear_mappable(struct inode *inode, pgoff_t start, pgoff_t end) return r; } +/* + * Registers a callback to __folio_put(), so that gmem knows that the host does + * not have any references to the folio. It does that by setting the folio type + * to guestmem. + * + * Returns 0 if the host doesn't have any references, or -EAGAIN if the host + * has references, and the callback has been registered. + * + * Must be called with the following locks held: + * - filemap (inode->i_mapping) invalidate_lock + * - folio lock + */ +static int __gmem_register_callback(struct folio *folio, struct inode *inode, pgoff_t idx) +{ + struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets; + void *xval_guest = xa_mk_value(KVM_GMEM_GUEST_MAPPABLE); + int refcount; + + rwsem_assert_held_write_nolockdep(&inode->i_mapping->invalidate_lock); + WARN_ON_ONCE(!folio_test_locked(folio)); + + if (folio_mapped(folio) || folio_test_guestmem(folio)) + return -EAGAIN; + + /* Register a callback first. */ + __folio_set_guestmem(folio); + + /* + * Check for references after setting the type to guestmem, to guard + * against potential races with the refcount being decremented later. + * + * At least one reference is expected because the folio is locked.
+ */ + + refcount = folio_ref_sub_return(folio, folio_nr_pages(folio)); + if (refcount == 1) { + int r; + + /* refcount isn't elevated, it's now faultable by the guest. */ + r = WARN_ON_ONCE(xa_err(xa_store(mappable_offsets, idx, xval_guest, GFP_KERNEL))); + if (!r) + __kvm_gmem_restore_pending_folio(folio); + + return r; + } + + return -EAGAIN; +} + +int kvm_slot_gmem_register_callback(struct kvm_memory_slot *slot, gfn_t gfn) +{ + unsigned long pgoff = slot->gmem.pgoff + gfn - slot->base_gfn; + struct inode *inode = file_inode(slot->gmem.file); + struct folio *folio; + int r; + + filemap_invalidate_lock(inode->i_mapping); + + folio = filemap_lock_folio(inode->i_mapping, pgoff); + if (WARN_ON_ONCE(IS_ERR(folio))) { + r = PTR_ERR(folio); + goto out; + } + + r = __gmem_register_callback(folio, inode, pgoff); + + folio_unlock(folio); + folio_put(folio); +out: + filemap_invalidate_unlock(inode->i_mapping); + + return r; +} + +/* + * Callback function for __folio_put(), i.e., called when all references by the + * host to the folio have been dropped. This allows gmem to transition the state + * of the folio to mappable by the guest, and allows the hypervisor to continue + * transitioning its state to private, since the host cannot attempt to access + * it anymore. + */ +void kvm_gmem_handle_folio_put(struct folio *folio) +{ + struct xarray *mappable_offsets; + struct inode *inode; + pgoff_t index; + void *xval; + + inode = folio->mapping->host; + index = folio->index; + mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets; + xval = xa_mk_value(KVM_GMEM_GUEST_MAPPABLE); + + filemap_invalidate_lock(inode->i_mapping); + __kvm_gmem_restore_pending_folio(folio); + WARN_ON_ONCE(xa_err(xa_store(mappable_offsets, index, xval, GFP_KERNEL))); + filemap_invalidate_unlock(inode->i_mapping); +} + static bool gmem_is_mappable(struct inode *inode, pgoff_t pgoff) { struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets; From patchwork Fri Dec 13 16:48:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 850842 Received: from mail-wm1-f73.google.com (mail-wm1-f73.google.com [209.85.128.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9EBED1F238D for ; Fri, 13 Dec 2024 16:48:29 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734108511; cv=none; b=N8k4TlUaihOEZxW9Yxzf7dR7SLuvOfp0kNmJSn13X7mBN9qMu62cU4WGfRZcOzRLwKnWWY/2lZzsrT896h2XCscei7YoCJiPtS1l/owZHI67F6UYYam4rUvhiDZKwbx2EtBDwvGOKqdRrRxN/3mpTzX9pgbsgk+zA519u/3hvEo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734108511; c=relaxed/simple; bh=HJ9evCiKrTr6pVEFTt9Uc9jd6JCIdDtaZlf4NLwhmDs=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=gQvpsakd+BHmMQgA2kpaAYAtNd8nAHhcYj/W2kQQh/y6p8aOE+yztLpjwUK2lp8hLYYD0AWmVRqCvjQvPB0IRjo0HvI4II9cL/BaM+WwqAk+cWrE3kTP+j9iGysPVaf1jorT7ckHkhFkHjZ48VUlFob9kWKtWt9NEG5LwwGbIZE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--tabba.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=KCLhHEp5; arc=none smtp.client-ip=209.85.128.73 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com
Date: Fri, 13 Dec 2024 16:48:03 +0000
Message-ID: <20241213164811.2006197-8-tabba@google.com>
Subject: [RFC PATCH v4 07/14] KVM: guest_memfd: Allow host to mmap guest_memfd() pages when shared
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Add support for mmap() and fault() for guest_memfd in the host. The ability to fault in a guest page is contingent on that page being shared with the host.

The guest_memfd PRIVATE memory attribute is not used for two reasons. First, because it reflects the userspace expectation for that memory location, and therefore can be toggled by userspace. Second, although each guest_memfd file has a 1:1 binding with a KVM instance, the plan is to allow multiple files per inode, e.g. to allow intra-host migration to a new KVM instance, without destroying guest_memfd.

The mapping is restricted to only memory explicitly shared with the host. KVM checks that the host doesn't have any mappings for private memory via the folio's refcount. To avoid races between paths that check mappability and paths that check whether the host has any mappings (via the refcount), the folio lock is held while either check is performed.

This new feature is gated with a new configuration option, CONFIG_KVM_GMEM_MAPPABLE.

Co-developed-by: Ackerley Tng Signed-off-by: Ackerley Tng Co-developed-by: Elliot Berman Signed-off-by: Elliot Berman Signed-off-by: Fuad Tabba
---
The functions kvm_gmem_is_mappable(), kvm_gmem_set_mappable(), and kvm_gmem_clear_mappable() are not used in this patch series. They are intended to be used in future patches [*], which check and toggle mappability when the guest shares/unshares pages with the host.
[*] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/guestmem-6.13-v4-pkvm --- virt/kvm/Kconfig | 4 ++ virt/kvm/guest_memfd.c | 87 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 91 insertions(+) diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig index 54e959e7d68f..59400fd8f539 100644 --- a/virt/kvm/Kconfig +++ b/virt/kvm/Kconfig @@ -124,3 +124,7 @@ config HAVE_KVM_ARCH_GMEM_PREPARE config HAVE_KVM_ARCH_GMEM_INVALIDATE bool depends on KVM_PRIVATE_MEM + +config KVM_GMEM_MAPPABLE + select KVM_PRIVATE_MEM + bool diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 5ecaa5dfcd00..3d3645924db9 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -671,9 +671,88 @@ bool kvm_slot_gmem_is_guest_mappable(struct kvm_memory_slot *slot, gfn_t gfn) return gmem_is_guest_mappable(inode, pgoff); } + +static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf) +{ + struct inode *inode = file_inode(vmf->vma->vm_file); + struct folio *folio; + vm_fault_t ret = VM_FAULT_LOCKED; + + filemap_invalidate_lock_shared(inode->i_mapping); + + folio = kvm_gmem_get_folio(inode, vmf->pgoff); + if (IS_ERR(folio)) { + ret = VM_FAULT_SIGBUS; + goto out_filemap; + } + + if (folio_test_hwpoison(folio)) { + ret = VM_FAULT_HWPOISON; + goto out_folio; + } + + if (!gmem_is_mappable(inode, vmf->pgoff)) { + ret = VM_FAULT_SIGBUS; + goto out_folio; + } + + if (WARN_ON_ONCE(folio_test_guestmem(folio))) { + ret = VM_FAULT_SIGBUS; + goto out_folio; + } + + if (!folio_test_uptodate(folio)) { + unsigned long nr_pages = folio_nr_pages(folio); + unsigned long i; + + for (i = 0; i < nr_pages; i++) + clear_highpage(folio_page(folio, i)); + + folio_mark_uptodate(folio); + } + + vmf->page = folio_file_page(folio, vmf->pgoff); + +out_folio: + if (ret != VM_FAULT_LOCKED) { + folio_unlock(folio); + folio_put(folio); + } + +out_filemap: + filemap_invalidate_unlock_shared(inode->i_mapping); + + return ret; +} + +static const struct vm_operations_struct kvm_gmem_vm_ops = { + .fault = kvm_gmem_fault, +}; + +static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma) +{ + if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) != + (VM_SHARED | VM_MAYSHARE)) { + return -EINVAL; + } + + file_accessed(file); + vm_flags_set(vma, VM_DONTDUMP); + vma->vm_ops = &kvm_gmem_vm_ops; + + return 0; +} +#else +static int gmem_set_mappable(struct inode *inode, pgoff_t start, pgoff_t end) +{ + WARN_ON_ONCE(1); + return -EINVAL; +} +#define kvm_gmem_mmap NULL #endif /* CONFIG_KVM_GMEM_MAPPABLE */ static struct file_operations kvm_gmem_fops = { + .mmap = kvm_gmem_mmap, .open = generic_file_open, .release = kvm_gmem_release, .fallocate = kvm_gmem_fallocate, @@ -860,6 +939,14 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags) goto err_gmem; } + if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE)) { + err = gmem_set_mappable(file_inode(file), 0, size >> PAGE_SHIFT); + if (err) { + fput(file); + goto err_gmem; + } + } + kvm_get_kvm(kvm); gmem->kvm = kvm; xa_init(&gmem->bindings); From patchwork Fri Dec 13 16:48:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 850361 Received: from mail-wr1-f74.google.com (mail-wr1-f74.google.com [209.85.221.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AC5B51F2C46 for ; Fri, 13 Dec 2024 16:48:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; 
arc=none
Date: Fri, 13 Dec 2024 16:48:04 +0000
Message-ID: <20241213164811.2006197-9-tabba@google.com>
Subject: [RFC PATCH v4 08/14] KVM: guest_memfd: Add guest_memfd support to kvm_(read|write)_guest_page()
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Make kvm_(read|write)_guest_page() capable of accessing guest memory for slots that don't have a userspace address, but only if the memory is mappable, which also indicates that it is accessible by the host.

Signed-off-by: Fuad Tabba
---
virt/kvm/kvm_main.c | 133 +++++++++++++++++++++++++++++++++++++------- 1 file changed, 114 insertions(+), 19 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index fffff01cebe7..53692feb6213 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -3184,23 +3184,110 @@ int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start, gfn_t end) return r; } +static int __kvm_read_guest_memfd_page(struct kvm *kvm, + struct kvm_memory_slot *slot, + gfn_t gfn, void *data, int offset, + int len) +{ + struct page *page; + u64 pfn; + int r; + + /* + * Holds the folio lock until after checking whether it can be faulted + * in, to avoid races with paths that change a folio's mappability. + */ + r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, &pfn, &page, NULL); + if (r) + return r; + + if (!kvm_gmem_is_mappable(kvm, gfn, gfn + 1)) { + r = -EPERM; + goto unlock; + } + memcpy(data, page_address(page) + offset, len); +unlock: + unlock_page(page); + if (r) + put_page(page); + else + kvm_release_page_clean(page); + + return r; +} + +static int __kvm_write_guest_memfd_page(struct kvm *kvm, + struct kvm_memory_slot *slot, + gfn_t gfn, const void *data, + int offset, int len) +{ + struct page *page; + u64 pfn; + int r; + + /* + * Holds the folio lock until after checking whether it can be faulted + * in, to avoid races with paths that change a folio's mappability.
+ */ + r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, &pfn, &page, NULL); + if (r) + return r; + + if (!kvm_gmem_is_mappable(kvm, gfn, gfn + 1)) { + r = -EPERM; + goto unlock; + } + memcpy(page_address(page) + offset, data, len); +unlock: + unlock_page(page); + if (r) + put_page(page); + else + kvm_release_page_dirty(page); + + return r; +} +#else +static int __kvm_read_guest_memfd_page(struct kvm *kvm, + struct kvm_memory_slot *slot, + gfn_t gfn, void *data, int offset, + int len) +{ + WARN_ON_ONCE(1); + return -EIO; +} + +static int __kvm_write_guest_memfd_page(struct kvm *kvm, + struct kvm_memory_slot *slot, + gfn_t gfn, const void *data, + int offset, int len) +{ + WARN_ON_ONCE(1); + return -EIO; +} #endif /* CONFIG_KVM_GMEM_MAPPABLE */ /* Copy @len bytes from guest memory at '(@gfn * PAGE_SIZE) + @offset' to @data */ -static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn, - void *data, int offset, int len) + +static int __kvm_read_guest_page(struct kvm *kvm, struct kvm_memory_slot *slot, + gfn_t gfn, void *data, int offset, int len) { - int r; unsigned long addr; if (WARN_ON_ONCE(offset + len > PAGE_SIZE)) return -EFAULT; + if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE) && + kvm_slot_can_be_private(slot) && + !slot->userspace_addr) { + return __kvm_read_guest_memfd_page(kvm, slot, gfn, data, + offset, len); + } + addr = gfn_to_hva_memslot_prot(slot, gfn, NULL); if (kvm_is_error_hva(addr)) return -EFAULT; - r = __copy_from_user(data, (void __user *)addr + offset, len); - if (r) + if (__copy_from_user(data, (void __user *)addr + offset, len)) return -EFAULT; return 0; } @@ -3210,7 +3297,7 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset, { struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn); - return __kvm_read_guest_page(slot, gfn, data, offset, len); + return __kvm_read_guest_page(kvm, slot, gfn, data, offset, len); } EXPORT_SYMBOL_GPL(kvm_read_guest_page); @@ -3219,7 +3306,7 @@ int kvm_vcpu_read_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, void *data, { struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); - return __kvm_read_guest_page(slot, gfn, data, offset, len); + return __kvm_read_guest_page(vcpu->kvm, slot, gfn, data, offset, len); } EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_page); @@ -3296,22 +3383,30 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic); /* Copy @len bytes from @data into guest memory at '(@gfn * PAGE_SIZE) + @offset' */ static int __kvm_write_guest_page(struct kvm *kvm, - struct kvm_memory_slot *memslot, gfn_t gfn, - const void *data, int offset, int len) + struct kvm_memory_slot *slot, gfn_t gfn, + const void *data, int offset, int len) { - int r; - unsigned long addr; - if (WARN_ON_ONCE(offset + len > PAGE_SIZE)) return -EFAULT; - addr = gfn_to_hva_memslot(memslot, gfn); - if (kvm_is_error_hva(addr)) - return -EFAULT; - r = __copy_to_user((void __user *)addr + offset, data, len); - if (r) - return -EFAULT; - mark_page_dirty_in_slot(kvm, memslot, gfn); + if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE) && + kvm_slot_can_be_private(slot) && + !slot->userspace_addr) { + int r = __kvm_write_guest_memfd_page(kvm, slot, gfn, data, + offset, len); + + if (r) + return r; + } else { + unsigned long addr = gfn_to_hva_memslot(slot, gfn); + + if (kvm_is_error_hva(addr)) + return -EFAULT; + if (__copy_to_user((void __user *)addr + offset, data, len)) + return -EFAULT; + } + + mark_page_dirty_in_slot(kvm, slot, gfn); return 0; } From patchwork Fri Dec 13 16:48:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 
Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 850841
Date: Fri, 13 Dec 2024 16:48:05 +0000
Message-ID: <20241213164811.2006197-10-tabba@google.com>
Subject: [RFC PATCH v4 09/14] KVM: guest_memfd: Add KVM capability to check if guest_memfd is host mappable
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Add the KVM capability KVM_CAP_GUEST_MEMFD_MAPPABLE, which is true if mapping guest memory is supported by the host.
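As a quick illustration of how a VMM might consume this capability, here is a minimal sketch that probes it with the standard KVM_CHECK_EXTENSION ioctl. It assumes a vm_fd obtained earlier from KVM_CREATE_VM and uapi headers that carry the KVM_CAP_GUEST_MEMFD_MAPPABLE definition added by this series; it is not part of the patch itself.

#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Sketch: returns non-zero if this VM reports KVM_CAP_GUEST_MEMFD_MAPPABLE.
 * 'vm_fd' is assumed to be a VM file descriptor from KVM_CREATE_VM.
 */
static int gmem_mappable_supported(int vm_fd)
{
	/* KVM_CHECK_EXTENSION returns > 0 when the capability is available. */
	return ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD_MAPPABLE) > 0;
}

Probing on the VM fd rather than on /dev/kvm matters here: with the handler below, the system-wide check only says the kernel was built with the support, while the per-VM check also requires kvm_arch_has_private_mem() for that VM.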
Signed-off-by: Fuad Tabba
---
include/uapi/linux/kvm.h | 1 + virt/kvm/kvm_main.c | 4 ++++ 2 files changed, 5 insertions(+)
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 502ea63b5d2e..021f8ef9979b 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -933,6 +933,7 @@ struct kvm_enable_cap { #define KVM_CAP_PRE_FAULT_MEMORY 236 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237 #define KVM_CAP_X86_GUEST_MODE 238 +#define KVM_CAP_GUEST_MEMFD_MAPPABLE 239 struct kvm_irq_routing_irqchip { __u32 irqchip;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 53692feb6213..0d1c2e95e771 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -4979,6 +4979,10 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg) #ifdef CONFIG_KVM_PRIVATE_MEM case KVM_CAP_GUEST_MEMFD: return !kvm || kvm_arch_has_private_mem(kvm); +#endif +#ifdef CONFIG_KVM_GMEM_MAPPABLE + case KVM_CAP_GUEST_MEMFD_MAPPABLE: + return !kvm || kvm_arch_has_private_mem(kvm); #endif default: break;
From patchwork Fri Dec 13 16:48:06 2024
X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 850360
Date: Fri, 13 Dec 2024 16:48:06 +0000
Message-ID: <20241213164811.2006197-11-tabba@google.com>
Subject: [RFC PATCH v4 10/14] KVM: guest_memfd: Add a guest_memfd() flag to initialize it as mappable
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Not all use cases require guest_memfd() to be mappable by the host when first created.
Add a new flag, GUEST_MEMFD_FLAG_INIT_MAPPABLE, which when set on KVM_CREATE_GUEST_MEMFD initializes the memory as mappable by the host. Otherwise, memory is private until shared by the guest with the host. Signed-off-by: Fuad Tabba --- Documentation/virt/kvm/api.rst | 4 ++++ include/uapi/linux/kvm.h | 1 + tools/testing/selftests/kvm/guest_memfd_test.c | 7 +++++-- virt/kvm/guest_memfd.c | 6 +++++- 4 files changed, 15 insertions(+), 3 deletions(-) diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst index 454c2aaa155e..60b65d9b8077 100644 --- a/Documentation/virt/kvm/api.rst +++ b/Documentation/virt/kvm/api.rst @@ -6380,6 +6380,10 @@ most one mapping per page, i.e. binding multiple memory regions to a single guest_memfd range is not allowed (any number of memory regions can be bound to a single guest_memfd file, but the bound ranges must not overlap). +If the capability KVM_CAP_GUEST_MEMFD_MAPPABLE is supported, then the flags +field supports GUEST_MEMFD_FLAG_INIT_MAPPABLE, which initializes the memory +as mappable by the host. + See KVM_SET_USER_MEMORY_REGION2 for additional details. 4.143 KVM_PRE_FAULT_MEMORY diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h index 021f8ef9979b..b34aed04ffa5 100644 --- a/include/uapi/linux/kvm.h +++ b/include/uapi/linux/kvm.h @@ -1566,6 +1566,7 @@ struct kvm_memory_attributes { #define KVM_MEMORY_ATTRIBUTE_PRIVATE (1ULL << 3) #define KVM_CREATE_GUEST_MEMFD _IOWR(KVMIO, 0xd4, struct kvm_create_guest_memfd) +#define GUEST_MEMFD_FLAG_INIT_MAPPABLE (1UL << 0) struct kvm_create_guest_memfd { __u64 size; diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c index ce687f8d248f..04b4111b7190 100644 --- a/tools/testing/selftests/kvm/guest_memfd_test.c +++ b/tools/testing/selftests/kvm/guest_memfd_test.c @@ -123,7 +123,7 @@ static void test_invalid_punch_hole(int fd, size_t page_size, size_t total_size) static void test_create_guest_memfd_invalid(struct kvm_vm *vm) { size_t page_size = getpagesize(); - uint64_t flag; + uint64_t flag = BIT(0); size_t size; int fd; @@ -134,7 +134,10 @@ static void test_create_guest_memfd_invalid(struct kvm_vm *vm) size); } - for (flag = BIT(0); flag; flag <<= 1) { + if (kvm_has_cap(KVM_CAP_GUEST_MEMFD_MAPPABLE)) + flag = GUEST_MEMFD_FLAG_INIT_MAPPABLE << 1; + + for (; flag; flag <<= 1) { fd = __vm_create_guest_memfd(vm, page_size, flag); TEST_ASSERT(fd == -1 && errno == EINVAL, "guest_memfd() with flag '0x%lx' should fail with EINVAL", diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c index 3d3645924db9..f33a577295b3 100644 --- a/virt/kvm/guest_memfd.c +++ b/virt/kvm/guest_memfd.c @@ -939,7 +939,8 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags) goto err_gmem; } - if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE)) { + if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE) && + (flags & GUEST_MEMFD_FLAG_INIT_MAPPABLE)) { err = gmem_set_mappable(file_inode(file), 0, size >> PAGE_SHIFT); if (err) { fput(file); @@ -968,6 +969,9 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args) u64 flags = args->flags; u64 valid_flags = 0; + if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE)) + valid_flags |= GUEST_MEMFD_FLAG_INIT_MAPPABLE; + if (flags & ~valid_flags) return -EINVAL; From patchwork Fri Dec 13 16:48:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 850840 Received: from mail-wm1-f74.google.com (mail-wm1-f74.google.com 
[209.85.128.74])
Date: Fri, 13 Dec 2024 16:48:07 +0000
Message-ID: <20241213164811.2006197-12-tabba@google.com>
Subject: [RFC PATCH v4 11/14] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Expand the guest_memfd selftests to test mmap() of guest memory when the capability is supported, and to check that memory still cannot be mapped when it isn't. Also, build the guest_memfd selftest for aarch64.
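For readers outside the selftest harness, the following rough sketch shows the shape of what the expanded test exercises: create a guest_memfd that starts out mappable and touch it through a shared mapping. It assumes a vm_fd from KVM_CREATE_VM, uapi headers carrying KVM_CREATE_GUEST_MEMFD and the GUEST_MEMFD_FLAG_INIT_MAPPABLE flag from this series, and trims error handling; it is an illustration, not part of the patch.

#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/kvm.h>

/*
 * Sketch: create a 4-page guest_memfd that is mappable from the start and
 * verify the host can mmap() and write it. Without the flag (or on kernels
 * without the capability), the expectation is that mmap() fails instead.
 */
static void gmem_mmap_smoke_test(int vm_fd)
{
	size_t size = 4 * getpagesize();
	struct kvm_create_guest_memfd gmem = {
		.size = size,
		.flags = GUEST_MEMFD_FLAG_INIT_MAPPABLE,
	};
	int fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);
	char *mem;

	if (fd < 0)
		return; /* flag rejected or guest_memfd not supported */

	mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (mem != MAP_FAILED) {
		memset(mem, 0xaa, size); /* faults every page in as shared */
		munmap(mem, size);
	}
	close(fd);
}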
Signed-off-by: Fuad Tabba --- tools/testing/selftests/kvm/Makefile | 1 + .../testing/selftests/kvm/guest_memfd_test.c | 57 +++++++++++++++++-- 2 files changed, 53 insertions(+), 5 deletions(-) diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile index 41593d2e7de9..c998eb3c3b77 100644 --- a/tools/testing/selftests/kvm/Makefile +++ b/tools/testing/selftests/kvm/Makefile @@ -174,6 +174,7 @@ TEST_GEN_PROGS_aarch64 += coalesced_io_test TEST_GEN_PROGS_aarch64 += demand_paging_test TEST_GEN_PROGS_aarch64 += dirty_log_test TEST_GEN_PROGS_aarch64 += dirty_log_perf_test +TEST_GEN_PROGS_aarch64 += guest_memfd_test TEST_GEN_PROGS_aarch64 += guest_print_test TEST_GEN_PROGS_aarch64 += get-reg-list TEST_GEN_PROGS_aarch64 += kvm_create_max_vcpus diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c index 04b4111b7190..12b5777c2eb5 100644 --- a/tools/testing/selftests/kvm/guest_memfd_test.c +++ b/tools/testing/selftests/kvm/guest_memfd_test.c @@ -34,12 +34,55 @@ static void test_file_read_write(int fd) "pwrite on a guest_mem fd should fail"); } -static void test_mmap(int fd, size_t page_size) +static void test_mmap_allowed(int fd, size_t total_size) { + size_t page_size = getpagesize(); + char *mem; + int ret; + int i; + + mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + TEST_ASSERT(mem != MAP_FAILED, "mmaping() guest memory should pass."); + + memset(mem, 0xaa, total_size); + for (i = 0; i < total_size; i++) + TEST_ASSERT_EQ(mem[i], 0xaa); + + ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0, + page_size); + TEST_ASSERT(!ret, "fallocate the first page should succeed"); + + for (i = 0; i < page_size; i++) + TEST_ASSERT_EQ(mem[i], 0x00); + for (; i < total_size; i++) + TEST_ASSERT_EQ(mem[i], 0xaa); + + memset(mem, 0xaa, total_size); + for (i = 0; i < total_size; i++) + TEST_ASSERT_EQ(mem[i], 0xaa); + + ret = munmap(mem, total_size); + TEST_ASSERT(!ret, "munmap should succeed"); +} + +static void test_mmap_denied(int fd, size_t total_size) +{ + size_t page_size = getpagesize(); char *mem; mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); TEST_ASSERT_EQ(mem, MAP_FAILED); + + mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); + TEST_ASSERT_EQ(mem, MAP_FAILED); +} + +static void test_mmap(int fd, size_t total_size) +{ + if (kvm_has_cap(KVM_CAP_GUEST_MEMFD_MAPPABLE)) + test_mmap_allowed(fd, total_size); + else + test_mmap_denied(fd, total_size); } static void test_file_size(int fd, size_t page_size, size_t total_size) @@ -175,13 +218,17 @@ static void test_create_guest_memfd_multiple(struct kvm_vm *vm) int main(int argc, char *argv[]) { - size_t page_size; + uint64_t flags = 0; + struct kvm_vm *vm; size_t total_size; + size_t page_size; int fd; - struct kvm_vm *vm; TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD)); + if (kvm_has_cap(KVM_CAP_GUEST_MEMFD_MAPPABLE)) + flags |= GUEST_MEMFD_FLAG_INIT_MAPPABLE; + page_size = getpagesize(); total_size = page_size * 4; @@ -190,10 +237,10 @@ int main(int argc, char *argv[]) test_create_guest_memfd_invalid(vm); test_create_guest_memfd_multiple(vm); - fd = vm_create_guest_memfd(vm, total_size, 0); + fd = vm_create_guest_memfd(vm, total_size, flags); test_file_read_write(fd); - test_mmap(fd, page_size); + test_mmap(fd, total_size); test_file_size(fd, page_size, total_size); test_fallocate(fd, page_size, total_size); test_invalid_punch_hole(fd, page_size, total_size); From patchwork Fri Dec 13 16:48:08 
2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fuad Tabba X-Patchwork-Id: 850359 Received: from mail-wm1-f74.google.com (mail-wm1-f74.google.com [209.85.128.74]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 98FEA1F3D4A for ; Fri, 13 Dec 2024 16:48:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.74 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734108522; cv=none; b=QzolkNDQRa6mzcrTPD5iVQ3HoOm3DeYpH+mZ3/CJCHjsS3mGVRPbtOvU3VYC9CXsq7OfzIClS/rtrwInTz+FffzbWZzd1bUczpg/tvtfo6/OF92m99m+7+oYagK9QrOMGaszfGsjmxc8PbFbgXOzbuEWM5i3H5e0t/OLHxKG5+w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734108522; c=relaxed/simple; bh=esAEXl56ZCqm0IdQI+nLRQfWrCMPvhXiPaUVVWHNr00=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=XX2hPbUK15gL+ZUwgO40rsHyzhjAM1OGYIg2DofEW/mg0HQBQ5hATXHQcgcIzDeKlh3odlPnGNpgWKOhEbuajh0rZALDfOOvZffP+Mx7Z9+Vh1tKLKaZqmFd+yRLmqood/22DB9keuX4UGuxF8e/aAgyLLgRAWAeHJVrv158Iq0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--tabba.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=oxQX6XTL; arc=none smtp.client-ip=209.85.128.74 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--tabba.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="oxQX6XTL" Received: by mail-wm1-f74.google.com with SMTP id 5b1f17b1804b1-4361eb83f46so17728885e9.3 for ; Fri, 13 Dec 2024 08:48:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1734108519; x=1734713319; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=g2VJIq0JFQ/ZqzbxK3We3Qf3gNhX7/2qVuXpD8D4O2A=; b=oxQX6XTLWaumP6mCSMj3e5weBWC0t1DivJub/2HUvt95swYiTYkYDV9dQmeS6oXWYt aUEXJgHc/20a7dxHjVi3j0A0eFVCE2pVojVEf2FBFln/PJBFWnm5rDmZkBoYN6SVWaVl 9+usi40x2TxWVKRVOaodi6bl0FgRLIIEB3eeuKE1f45Ii2y23sfWCziMwIHe5jqYWc2A 2FP3wQYfx8Pit89AFRvP+whZyfgSSQ6qGy6iXpZXHDKJLa3HcfXie41SfQcZhf0o9LYt SvNUKzaYLJaPCgEei5rNvjAUYKDUSZBFIO78v1psc/dZvniiqg4Be996Wh3fKESjgiId NRoQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1734108519; x=1734713319; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=g2VJIq0JFQ/ZqzbxK3We3Qf3gNhX7/2qVuXpD8D4O2A=; b=UyPP/UfZOT1W7BkhxQQ6A52N1w2rVGiMzw2wrr13UiD5gTo0NHXEENEHXimmFe9Ddd 6XEH1UfLmBj7lHc7mIEDovjF7tPnMjs6MQce2pZVHqnoRxl9qepMzLj0yMQnykr8diHj gubw//0/WG5RPtft6fpgbvlawNyVLEgqWDSBmzCPOrew613j1zcREyPExAb2gFD54aeN Xc11wScWnZOckaeTU3Frc3aO/2P92A+hAmSSIeVsynXS89vjAWcKqa1HAwtDDHTtuntQ pMaWEh67KecgYXVmSzVrZzpdP+uu982O8UCrZOK5ugJQuIlEHKcPEiGb1zXUQeDXULFQ PmuA== X-Forwarded-Encrypted: i=1; AJvYcCXKm/36Zt+/cdEbww42pZNRWBkCPHox49EcTceSkcYzNVPlQdZe8Is9ch7SgIKuopVz8IO54Y4LVfb/t/bO@vger.kernel.org X-Gm-Message-State: AOJu0Yyy94T8CvefTS1QiofAPcU0GBTeh3J5Q/AgT9dZ5ZwXUN3SX+xV 
Date: Fri, 13 Dec 2024 16:48:08 +0000
In-Reply-To: <20241213164811.2006197-1-tabba@google.com>
Message-ID: <20241213164811.2006197-13-tabba@google.com>
Subject: [RFC PATCH v4 12/14] KVM: arm64: Skip VMA checks for slots without userspace address
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Memory slots backed by guest memory (guest_memfd) might be created with no intention of ever being mapped by the host. Such slots are recognized by the absence of a userspace address in the memory slot. VMA checks are neither possible nor necessary for this kind of slot, so skip them.
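For context, a slot of this kind would be registered from userspace roughly as follows. This is a hedged sketch assuming the upstream KVM_SET_USER_MEMORY_REGION2 and KVM_MEM_GUEST_MEMFD uapi and a previously created guest_memfd fd; the helper name is illustrative. Leaving userspace_addr at zero is what makes the VMA checks below unnecessary.

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Register a guest_memfd-backed slot that the host never intends to mmap(). */
static int add_private_only_slot(int vm_fd, int gmem_fd, uint32_t slot,
				 uint64_t gpa, uint64_t size)
{
	struct kvm_userspace_memory_region2 region = {
		.slot = slot,
		.flags = KVM_MEM_GUEST_MEMFD,
		.guest_phys_addr = gpa,
		.memory_size = size,
		.userspace_addr = 0,	/* no host mapping: VMA checks are skipped */
		.guest_memfd = gmem_fd,
		.guest_memfd_offset = 0,
	};

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION2, &region);
}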
Signed-off-by: Fuad Tabba
---
 arch/arm64/kvm/mmu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index c9d46ad57e52..342a9bd3848f 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -988,6 +988,10 @@ static void stage2_unmap_memslot(struct kvm *kvm,
 	phys_addr_t size = PAGE_SIZE * memslot->npages;
 	hva_t reg_end = hva + size;
 
+	/* Host will not map this private memory without a userspace address. */
+	if (kvm_slot_can_be_private(memslot) && !hva)
+		return;
+
 	/*
 	 * A memory region could potentially cover multiple VMAs, and any holes
 	 * between them, so iterate over all of them to find out if we should
@@ -2133,6 +2137,10 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 	hva = new->userspace_addr;
 	reg_end = hva + (new->npages << PAGE_SHIFT);
 
+	/* Host will not map this private memory without a userspace address. */
+	if (kvm_slot_can_be_private(new) && !hva)
+		return 0;
+
 	mmap_read_lock(current->mm);
 	/*
 	 * A memory region could potentially cover multiple VMAs, and any holes

From patchwork Fri Dec 13 16:48:09 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 850839
Date: Fri, 13 Dec 2024 16:48:09 +0000
In-Reply-To: <20241213164811.2006197-1-tabba@google.com>
Message-ID: <20241213164811.2006197-14-tabba@google.com>
Subject: [RFC PATCH v4 13/14] KVM: arm64: Handle guest_memfd()-backed guest page faults
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Add arm64 support for resolving guest page faults on guest_memfd()-backed memslots. This support is not contingent on pKVM or any other confidential-computing mechanism, and it works in both VHE and nVHE modes.

Without confidential computing, this support is useful for testing and debugging. In the future, it might also prove useful should a user want to back all of a guest's memory with guest_memfd(), whether the guest is protected or not.

For now, the fault granule is restricted to PAGE_SIZE.
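One practical consequence of the fault path added below is that each page faulted in through it is charged to the VMM's locked memory via account_locked_vm(). A VMM running without CAP_IPC_LOCK may therefore need to raise RLIMIT_MEMLOCK to cover guest memory; a hedged userspace sketch follows, with an illustrative helper name.

#include <stdint.h>
#include <sys/resource.h>

/* Make sure RLIMIT_MEMLOCK can cover the guest memory that will be mapped. */
static int reserve_memlock(uint64_t guest_mem_bytes)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_MEMLOCK, &rl))
		return -1;

	if (rl.rlim_cur < guest_mem_bytes) {
		rl.rlim_cur = guest_mem_bytes;
		if (rl.rlim_max < guest_mem_bytes)
			rl.rlim_max = guest_mem_bytes;	/* raising the hard limit needs privilege */
		if (setrlimit(RLIMIT_MEMLOCK, &rl))
			return -1;
	}

	return 0;
}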
Signed-off-by: Fuad Tabba
---
 arch/arm64/kvm/mmu.c | 111 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 109 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 342a9bd3848f..1c4b3871967c 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1434,6 +1434,107 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_MTE_ALLOWED;
 }
 
+static int guest_memfd_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
+			     struct kvm_memory_slot *memslot, bool fault_is_perm)
+{
+	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
+	bool exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
+	bool logging_active = memslot_is_logging(memslot);
+	struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
+	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
+	bool write_fault = kvm_is_write_fault(vcpu);
+	struct mm_struct *mm = current->mm;
+	gfn_t gfn = gpa_to_gfn(fault_ipa);
+	struct kvm *kvm = vcpu->kvm;
+	struct page *page;
+	kvm_pfn_t pfn;
+	int ret;
+
+	/* For now, guest_memfd() only supports PAGE_SIZE granules. */
+	if (WARN_ON_ONCE(fault_is_perm &&
+			 kvm_vcpu_trap_get_perm_fault_granule(vcpu) != PAGE_SIZE)) {
+		return -EFAULT;
+	}
+
+	VM_BUG_ON(write_fault && exec_fault);
+
+	if (fault_is_perm && !write_fault && !exec_fault) {
+		kvm_err("Unexpected L2 read permission error\n");
+		return -EFAULT;
+	}
+
+	/*
+	 * Permission faults just need to update the existing leaf entry,
+	 * and so normally don't require allocations from the memcache. The
+	 * only exception to this is when dirty logging is enabled at runtime
+	 * and a write fault needs to collapse a block entry into a table.
+	 */
+	if (!fault_is_perm || (logging_active && write_fault)) {
+		ret = kvm_mmu_topup_memory_cache(memcache,
+						 kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu));
+		if (ret)
+			return ret;
+	}
+
+	/*
+	 * Holds the folio lock until mapped in the guest and its refcount is
+	 * stable, to avoid races with paths that check if the folio is mapped
+	 * by the host.
+	 */
+	ret = kvm_gmem_get_pfn_locked(kvm, memslot, gfn, &pfn, &page, NULL);
+	if (ret)
+		return ret;
+
+	if (!kvm_slot_gmem_is_guest_mappable(memslot, gfn)) {
+		ret = -EAGAIN;
+		goto unlock_page;
+	}
+
+	/*
+	 * Once it's faulted in, a guest_memfd() page will stay in memory.
+	 * Therefore, count it as locked.
+	 */
+	if (!fault_is_perm) {
+		ret = account_locked_vm(mm, 1, true);
+		if (ret)
+			goto unlock_page;
+	}
+
+	read_lock(&kvm->mmu_lock);
+	if (write_fault)
+		prot |= KVM_PGTABLE_PROT_W;
+
+	if (exec_fault)
+		prot |= KVM_PGTABLE_PROT_X;
+
+	if (cpus_have_final_cap(ARM64_HAS_CACHE_DIC))
+		prot |= KVM_PGTABLE_PROT_X;
+
+	/*
+	 * Under the premise of getting a FSC_PERM fault, we just need to relax
+	 * permissions.
+	 */
+	if (fault_is_perm)
+		ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
+	else
+		ret = kvm_pgtable_stage2_map(pgt, fault_ipa, PAGE_SIZE,
+					     __pfn_to_phys(pfn), prot,
+					     memcache,
+					     KVM_PGTABLE_WALK_HANDLE_FAULT |
+					     KVM_PGTABLE_WALK_SHARED);
+
+	kvm_release_faultin_page(kvm, page, !!ret, write_fault);
+	read_unlock(&kvm->mmu_lock);
+
+	if (ret && !fault_is_perm)
+		account_locked_vm(mm, 1, false);
+unlock_page:
+	unlock_page(page);
+	put_page(page);
+
+	return ret != -EAGAIN ? ret : 0;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_s2_trans *nested,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1900,8 +2001,14 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 		goto out_unlock;
 	}
 
-	ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
-			     esr_fsc_is_permission_fault(esr));
+	if (kvm_slot_can_be_private(memslot)) {
+		ret = guest_memfd_abort(vcpu, fault_ipa, memslot,
+					esr_fsc_is_permission_fault(esr));
+	} else {
+		ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
+				     esr_fsc_is_permission_fault(esr));
+	}
+
 	if (ret == 0)
 		ret = 1;
 out:

From patchwork Fri Dec 13 16:48:10 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 850358
Date: Fri, 13 Dec 2024 16:48:10 +0000
In-Reply-To: <20241213164811.2006197-1-tabba@google.com>
Message-ID: <20241213164811.2006197-15-tabba@google.com>
Subject: [RFC PATCH v4 14/14] KVM: arm64: Enable guest_memfd private memory when pKVM is enabled
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Implement kvm_arch_has_private_mem() for arm64 so that it returns true when pKVM is enabled, and make it dependent on the CONFIG_KVM_PRIVATE_MEM configuration option. Also, now that the infrastructure for arm64 to support guest private memory is in place, enable it in the arm64 kernel configuration.
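From userspace, the effect of this change can be observed by querying KVM_CAP_GUEST_MEMFD on a VM fd. This is a hedged sketch that assumes the generic KVM behaviour in which the VM-scoped KVM_CHECK_EXTENSION query reflects kvm_arch_has_private_mem(); the VM type needed to create a protected (pKVM) guest is not covered here, so the default type 0 is used.

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm_fd = open("/dev/kvm", O_RDWR);
	int vm_fd, has_gmem;

	if (kvm_fd < 0)
		return 1;

	/* Type 0 is the default VM type; a pKVM-protected VM may use another. */
	vm_fd = ioctl(kvm_fd, KVM_CREATE_VM, 0);
	if (vm_fd < 0)
		return 1;

	has_gmem = ioctl(vm_fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD);
	printf("guest_memfd private memory %savailable for this VM\n",
	       has_gmem > 0 ? "" : "not ");
	return 0;
}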
Signed-off-by: Fuad Tabba
---
 arch/arm64/include/asm/kvm_host.h | 3 +++
 arch/arm64/kvm/Kconfig            | 1 +
 2 files changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e18e9244d17a..8dfae9183651 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1529,4 +1529,7 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val);
 #define kvm_has_s1poe(k)				\
 	(kvm_has_feat((k), ID_AA64MMFR3_EL1, S1POE, IMP))
 
+#define kvm_arch_has_private_mem(kvm)			\
+	(IS_ENABLED(CONFIG_KVM_PRIVATE_MEM) && is_protected_kvm_enabled())
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index ead632ad01b4..fe3451f244b5 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -38,6 +38,7 @@ menuconfig KVM
 	select HAVE_KVM_VCPU_RUN_PID_CHANGE
 	select SCHED_INFO
 	select GUEST_PERF_EVENTS if PERF_EVENTS
+	select KVM_GMEM_MAPPABLE
 	help
 	  Support hosting virtualized guest machines.