From patchwork Mon Mar 3 17:10:05 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 869874
Date: Mon, 3 Mar 2025 17:10:05 +0000
Message-ID: <20250303171013.3548775-2-tabba@google.com>
In-Reply-To: <20250303171013.3548775-1-tabba@google.com>
Subject: [PATCH v5 1/9] mm: Consolidate freeing of typed folios on final folio_put()
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
    anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
    aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
    brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
    xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
    jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
    isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
    vannapurve@google.com, ackerleytng@google.com, mail@maciej.szmigiero.name,
    david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com,
    liam.merwick@oracle.com, isaku.yamahata@gmail.com,
    kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
    steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
    quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
    quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com,
    quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com,
    yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
    will@kernel.org, qperret@google.com, keirf@google.com,
    roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
    rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
    hughd@google.com, jthoughton@google.com, peterx@redhat.com,
    tabba@google.com

Some folio types, such as hugetlb, handle freeing their own folios.
Moreover, guest_memfd will require being notified once a folio's
reference count reaches 0 to facilitate shared-to-private folio
conversion, without the folio actually being freed at that point.

As a first step towards that, this patch consolidates freeing folios
that have a type. The first user is hugetlb folios. Later in this patch
series, guest_memfd will become the second user of this.
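The type a folio carries lives in the top byte of page->page_type, which
is what the new page_get_type()/folio_get_type() helpers below extract;
the remaining bits are left for per-type use. A standalone illustration
of that encoding (plain userspace C; the PGTY value used here is made up
for the example, not taken from the kernel headers):

#include <assert.h>
#include <stdint.h>

/*
 * Sketch of the page_get_type() helper added below: the folio type is
 * the top byte of the 32-bit page_type word. PGTY_HUGETLB_EXAMPLE is an
 * illustrative value, not the kernel's PGTY_hugetlb.
 */
#define PGTY_HUGETLB_EXAMPLE 0x84

static inline int page_get_type_sketch(uint32_t page_type)
{
	return page_type >> 24;
}

int main(void)
{
	/* Top byte: type; lower 24 bits: type-private payload. */
	uint32_t page_type = ((uint32_t)PGTY_HUGETLB_EXAMPLE << 24) | 0x00ffffff;

	assert(page_get_type_sketch(page_type) == PGTY_HUGETLB_EXAMPLE);
	return 0;
}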
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/page-flags.h | 15 +++++++++++++++
 mm/swap.c                  | 23 ++++++++++++++++++-----
 2 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 36d283552f80..6dc2494bd002 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -953,6 +953,21 @@ static inline bool page_has_type(const struct page *page)
 	return page_mapcount_is_type(data_race(page->page_type));
 }
 
+static inline int page_get_type(const struct page *page)
+{
+	return page->page_type >> 24;
+}
+
+static inline bool folio_has_type(const struct folio *folio)
+{
+	return page_has_type(&folio->page);
+}
+
+static inline int folio_get_type(const struct folio *folio)
+{
+	return page_get_type(&folio->page);
+}
+
 #define FOLIO_TYPE_OPS(lname, fname)					\
 static __always_inline bool folio_test_##fname(const struct folio *folio)\
 {									\
diff --git a/mm/swap.c b/mm/swap.c
index fc8281ef4241..47bc1bb919cc 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -94,6 +94,19 @@ static void page_cache_release(struct folio *folio)
 		unlock_page_lruvec_irqrestore(lruvec, flags);
 }
 
+static void free_typed_folio(struct folio *folio)
+{
+	switch (folio_get_type(folio)) {
+#ifdef CONFIG_HUGETLBFS
+	case PGTY_hugetlb:
+		free_huge_folio(folio);
+		return;
+#endif
+	default:
+		WARN_ON_ONCE(1);
+	}
+}
+
 void __folio_put(struct folio *folio)
 {
 	if (unlikely(folio_is_zone_device(folio))) {
@@ -101,8 +114,8 @@ void __folio_put(struct folio *folio)
 		return;
 	}
 
-	if (folio_test_hugetlb(folio)) {
-		free_huge_folio(folio);
+	if (unlikely(folio_has_type(folio))) {
+		free_typed_folio(folio);
 		return;
 	}
 
@@ -966,13 +979,13 @@ void folios_put_refs(struct folio_batch *folios, unsigned int *refs)
 		if (!folio_ref_sub_and_test(folio, nr_refs))
 			continue;
 
-		/* hugetlb has its own memcg */
-		if (folio_test_hugetlb(folio)) {
+		if (unlikely(folio_has_type(folio))) {
+			/* typed folios have their own memcg, if any */
 			if (lruvec) {
 				unlock_page_lruvec_irqrestore(lruvec, flags);
 				lruvec = NULL;
 			}
-			free_huge_folio(folio);
+			free_typed_folio(folio);
 			continue;
 		}
 		folio_unqueue_deferred_split(folio);

From patchwork Mon Mar 3 17:10:07 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 869873
Date: Mon, 3 Mar 2025 17:10:07 +0000
Message-ID: <20250303171013.3548775-4-tabba@google.com>
In-Reply-To: <20250303171013.3548775-1-tabba@google.com>
Subject: [PATCH v5 3/9] KVM: guest_memfd: Allow host to map guest_memfd() pages
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Add support for mmap() and fault() for guest_memfd-backed memory in the
host for VMs that support in-place conversion between shared and
private. To that end, this patch adds the ability to check whether the
VM type supports in-place conversion, and only allows mapping its memory
if that's the case.

Also add the KVM capability KVM_CAP_GMEM_SHARED_MEM, which indicates
that the VM supports shared memory in guest_memfd, or that the host can
create VMs that support shared memory. Supporting shared memory implies
that memory can be mapped when shared with the host. This is controlled
by the KVM_GMEM_SHARED_MEM configuration option.
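To make the intent concrete, here is a rough userspace sketch of the
flow this enables (error handling mostly elided; assumes a kernel with
this series applied, an x86 host, and the KVM_X86_SW_PROTECTED_VM type
that patch 5/9 marks as supporting shared guest_memfd):

#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int vm = ioctl(kvm, KVM_CREATE_VM, KVM_X86_SW_PROTECTED_VM);
	struct kvm_create_guest_memfd gmem = {
		.size = 0x400000,	/* 4 MiB */
	};
	int fd = ioctl(vm, KVM_CREATE_GUEST_MEMFD, &gmem);
	char *mem;

	/* Only VMs whose type supports shared gmem may mmap() it. */
	if (ioctl(vm, KVM_CHECK_EXTENSION, KVM_CAP_GMEM_SHARED_MEM) <= 0)
		return 1;

	/* VM_SHARED | VM_MAYSHARE are required, hence MAP_SHARED. */
	mem = mmap(NULL, gmem.size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (mem == MAP_FAILED)
		return 1;

	memset(mem, 0, gmem.size);	/* faults in via kvm_gmem_fault() */
	munmap(mem, gmem.size);
	return 0;
}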
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/kvm_host.h |  11 ++++
 include/uapi/linux/kvm.h |   1 +
 virt/kvm/guest_memfd.c   | 105 +++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c      |   4 ++
 4 files changed, 121 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 7788e3625f6d..2d025b8ee20e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
 }
 #endif
 
+/*
+ * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
+ * private memory is enabled and it supports in-place shared/private conversion.
+ */
+#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
+static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
+{
+	return false;
+}
+#endif
+
 #ifndef kvm_arch_has_readonly_mem
 static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
 {
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 45e6d8fca9b9..117937a895da 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -929,6 +929,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_PRE_FAULT_MEMORY 236
 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
 #define KVM_CAP_X86_GUEST_MODE 238
+#define KVM_CAP_GMEM_SHARED_MEM 239
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index b2aa6bf24d3a..4291956b51ae 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -312,7 +312,112 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
 	return gfn - slot->base_gfn + slot->gmem.pgoff;
 }
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
+{
+	struct kvm_gmem *gmem = file->private_data;
+
+	/* For now, VMs that support shared memory share all their memory. */
+	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
+}
+
+static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct folio *folio;
+	vm_fault_t ret = VM_FAULT_LOCKED;
+
+	filemap_invalidate_lock_shared(inode->i_mapping);
+
+	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
+	if (IS_ERR(folio)) {
+		switch (PTR_ERR(folio)) {
+		case -EAGAIN:
+			ret = VM_FAULT_RETRY;
+			break;
+		case -ENOMEM:
+			ret = VM_FAULT_OOM;
+			break;
+		default:
+			ret = VM_FAULT_SIGBUS;
+			break;
+		}
+		goto out_filemap;
+	}
+
+	if (folio_test_hwpoison(folio)) {
+		ret = VM_FAULT_HWPOISON;
+		goto out_folio;
+	}
+
+	/* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
+	if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	/*
+	 * Only private folios are marked as "guestmem" so far, and we never
+	 * expect private folios at this point.
+	 */
+	if (WARN_ON_ONCE(folio_test_guestmem(folio))) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	/* No support for huge pages. */
+	if (WARN_ON_ONCE(folio_test_large(folio))) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	if (!folio_test_uptodate(folio)) {
+		clear_highpage(folio_page(folio, 0));
+		kvm_gmem_mark_prepared(folio);
+	}
+
+	vmf->page = folio_file_page(folio, vmf->pgoff);
+
+out_folio:
+	if (ret != VM_FAULT_LOCKED) {
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
+out_filemap:
+	filemap_invalidate_unlock_shared(inode->i_mapping);
+
+	return ret;
+}
+
+static const struct vm_operations_struct kvm_gmem_vm_ops = {
+	.fault = kvm_gmem_fault,
+};
+
+static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct kvm_gmem *gmem = file->private_data;
+
+	if (!kvm_arch_gmem_supports_shared_mem(gmem->kvm))
+		return -ENODEV;
+
+	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) !=
+	    (VM_SHARED | VM_MAYSHARE)) {
+		return -EINVAL;
+	}
+
+	file_accessed(file);
+	vm_flags_set(vma, VM_DONTDUMP);
+	vma->vm_ops = &kvm_gmem_vm_ops;
+
+	return 0;
+}
+#else
+#define kvm_gmem_mmap NULL
+#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
+
 static struct file_operations kvm_gmem_fops = {
+	.mmap		= kvm_gmem_mmap,
 	.open		= generic_file_open,
 	.release	= kvm_gmem_release,
 	.fallocate	= kvm_gmem_fallocate,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index ba0327e2d0d3..38f0f402ea46 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -4830,6 +4830,10 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #ifdef CONFIG_KVM_PRIVATE_MEM
 	case KVM_CAP_GUEST_MEMFD:
 		return !kvm || kvm_arch_has_private_mem(kvm);
+#endif
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+	case KVM_CAP_GMEM_SHARED_MEM:
+		return !kvm || kvm_arch_gmem_supports_shared_mem(kvm);
 #endif
 	default:
 		break;

From patchwork Mon Mar 3 17:10:09 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 869872
Date: Mon, 3 Mar 2025 17:10:09 +0000
Message-ID: <20250303171013.3548775-6-tabba@google.com>
In-Reply-To: <20250303171013.3548775-1-tabba@google.com>
Subject: [PATCH v5 5/9] KVM: x86: Mark KVM_X86_SW_PROTECTED_VM as supporting
 guest_memfd shared memory
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

The KVM_X86_SW_PROTECTED_VM type is meant for experimentation and does
not have any underlying support for protected guests. This makes it a
good candidate for testing mapping shared memory.

Therefore, when the kconfig option is enabled, mark
KVM_X86_SW_PROTECTED_VM as supporting shared memory. This means that
this memory is considered by guest_memfd to be shared with the host,
with the possibility of in-place conversion between shared and private.
This allows the host to map and fault in guest_memfd memory belonging
to this VM type.
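A minimal userspace probe for the resulting behaviour might look as
follows (sketch only; KVM_CHECK_EXTENSION on the system fd reports
whether the host can create such VMs, while the VM fd gives the per-VM
answer, per the kvm_main.c hunk in patch 3/9):

#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int vm = ioctl(kvm, KVM_CREATE_VM, KVM_X86_SW_PROTECTED_VM);

	/* Global: is CONFIG_KVM_GMEM_SHARED_MEM available at all? */
	if (ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_GMEM_SHARED_MEM) <= 0)
		return 1;
	/* Per-VM: does this VM type support shared guest_memfd? */
	if (ioctl(vm, KVM_CHECK_EXTENSION, KVM_CAP_GMEM_SHARED_MEM) <= 0)
		return 1;
	return 0;
}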
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/x86/include/asm/kvm_host.h | 5 +++++
 arch/x86/kvm/Kconfig            | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0b7af5902ff7..c6e4925bdc8a 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2245,8 +2245,13 @@ void kvm_configure_mmu(bool enable_tdp, int tdp_forced_root_level,
 
 #ifdef CONFIG_KVM_PRIVATE_MEM
 #define kvm_arch_has_private_mem(kvm) ((kvm)->arch.has_private_mem)
+
+#define kvm_arch_gmem_supports_shared_mem(kvm)			\
+	(IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM) &&		\
+	 ((kvm)->arch.vm_type == KVM_X86_SW_PROTECTED_VM))
 #else
 #define kvm_arch_has_private_mem(kvm) false
+#define kvm_arch_gmem_supports_shared_mem(kvm) false
 #endif
 
 #define kvm_arch_has_readonly_mem(kvm) (!(kvm)->arch.has_protected_state)
diff --git a/arch/x86/kvm/Kconfig b/arch/x86/kvm/Kconfig
index ea2c4f21c1ca..22d1bcdaad58 100644
--- a/arch/x86/kvm/Kconfig
+++ b/arch/x86/kvm/Kconfig
@@ -45,7 +45,8 @@ config KVM_X86
 	select HAVE_KVM_PM_NOTIFIER if PM
 	select KVM_GENERIC_HARDWARE_ENABLING
 	select KVM_GENERIC_PRE_FAULT_MEMORY
-	select KVM_GENERIC_PRIVATE_MEM if KVM_SW_PROTECTED_VM
+	select KVM_PRIVATE_MEM if KVM_SW_PROTECTED_VM
+	select KVM_GMEM_SHARED_MEM if KVM_SW_PROTECTED_VM
 	select KVM_WERROR if WERROR
 
 config KVM

From patchwork Mon Mar 3 17:10:11 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 869871
Date: Mon, 3 Mar 2025 17:10:11 +0000
Message-ID: <20250303171013.3548775-8-tabba@google.com>
In-Reply-To: <20250303171013.3548775-1-tabba@google.com>
Subject: [PATCH v5 7/9] KVM: arm64: Handle guest_memfd()-backed guest page faults
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Add arm64 support for handling guest page faults on guest_memfd-backed
memslots. For now, the fault granule is restricted to PAGE_SIZE.
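The core of the change is the new faultin_pfn() helper in the diff
below, which routes gmem-backed faults through kvm_gmem_get_pfn()
instead of the user-mapping path. A standalone sketch of that decision,
with all kernel types and callees replaced by stand-ins, just to make
the control flow explicit:

#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef uint64_t pfn_t;
#define PFN_ERR_HWPOISON ((pfn_t)-1)
#define PFN_ERR_NOSLOT   ((pfn_t)-2)

/* Stand-ins for __kvm_faultin_pfn() and kvm_gmem_get_pfn(). */
static pfn_t user_path_get_pfn(bool write, bool *writable)
{
	*writable = write;
	return 42;
}

static int gmem_get_pfn(pfn_t *pfn)
{
	*pfn = 43;
	return 0;	/* or -EHWPOISON on poisoned memory */
}

static pfn_t faultin_pfn_sketch(bool is_gmem, bool slot_readonly,
				bool write_fault, bool *writable)
{
	pfn_t pfn;
	int ret;

	if (!is_gmem)
		return user_path_get_pfn(write_fault, writable);

	/* gmem pages are writable unless the memslot is read-only. */
	*writable = false;
	ret = gmem_get_pfn(&pfn);
	if (!ret) {
		*writable = !slot_readonly;
		return pfn;
	}
	return ret == -EHWPOISON ? PFN_ERR_HWPOISON : PFN_ERR_NOSLOT;
}

int main(void)
{
	bool writable;
	pfn_t pfn = faultin_pfn_sketch(true, false, true, &writable);

	printf("pfn=%llu writable=%d\n", (unsigned long long)pfn, writable);
	return 0;
}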
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c     | 65 +++++++++++++++++++++++++++-------------
 include/linux/kvm_host.h |  5 ++++
 virt/kvm/kvm_main.c      |  5 ----
 3 files changed, 50 insertions(+), 25 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 887ffa1f5b14..adb0681fc1c6 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1454,6 +1454,30 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_MTE_ALLOWED;
 }
 
+static kvm_pfn_t faultin_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+			     gfn_t gfn, bool write_fault, bool *writable,
+			     struct page **page, bool is_private)
+{
+	kvm_pfn_t pfn;
+	int ret;
+
+	if (!is_private)
+		return __kvm_faultin_pfn(slot, gfn, write_fault ? FOLL_WRITE : 0, writable, page);
+
+	*writable = false;
+
+	ret = kvm_gmem_get_pfn(kvm, slot, gfn, &pfn, page, NULL);
+	if (!ret) {
+		*writable = !memslot_is_readonly(slot);
+		return pfn;
+	}
+
+	if (ret == -EHWPOISON)
+		return KVM_PFN_ERR_HWPOISON;
+
+	return KVM_PFN_ERR_NOSLOT_MASK;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_s2_trans *nested,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1461,19 +1485,20 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 {
 	int ret = 0;
 	bool write_fault, writable;
-	bool exec_fault, mte_allowed;
+	bool exec_fault, mte_allowed = false;
 	bool device = false, vfio_allow_any_uc = false;
 	unsigned long mmu_seq;
 	phys_addr_t ipa = fault_ipa;
 	struct kvm *kvm = vcpu->kvm;
-	struct vm_area_struct *vma;
+	struct vm_area_struct *vma = NULL;
 	short vma_shift;
 	void *memcache;
-	gfn_t gfn;
+	gfn_t gfn = ipa >> PAGE_SHIFT;
 	kvm_pfn_t pfn;
 	bool logging_active = memslot_is_logging(memslot);
-	bool force_pte = logging_active || is_protected_kvm_enabled();
-	long vma_pagesize, fault_granule;
+	bool is_gmem = kvm_mem_is_private(kvm, gfn);
+	bool force_pte = logging_active || is_gmem || is_protected_kvm_enabled();
+	long vma_pagesize, fault_granule = PAGE_SIZE;
 	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
 	struct kvm_pgtable *pgt;
 	struct page *page;
@@ -1510,16 +1535,22 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return ret;
 	}
 
+	mmap_read_lock(current->mm);
+
 	/*
 	 * Let's check if we will get back a huge page backed by hugetlbfs, or
 	 * get block mapping for device MMIO region.
 	 */
-	mmap_read_lock(current->mm);
-	vma = vma_lookup(current->mm, hva);
-	if (unlikely(!vma)) {
-		kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
-		mmap_read_unlock(current->mm);
-		return -EFAULT;
+	if (!is_gmem) {
+		vma = vma_lookup(current->mm, hva);
+		if (unlikely(!vma)) {
+			kvm_err("Failed to find VMA for hva 0x%lx\n", hva);
+			mmap_read_unlock(current->mm);
+			return -EFAULT;
+		}
+
+		vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
+		mte_allowed = kvm_vma_mte_allowed(vma);
 	}
 
 	if (force_pte)
@@ -1590,18 +1621,13 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		ipa &= ~(vma_pagesize - 1);
 	}
 
-	gfn = ipa >> PAGE_SHIFT;
-	mte_allowed = kvm_vma_mte_allowed(vma);
-
-	vfio_allow_any_uc = vma->vm_flags & VM_ALLOW_ANY_UNCACHED;
-
 	/* Don't use the VMA after the unlock -- it may have vanished */
 	vma = NULL;
 
 	/*
 	 * Read mmu_invalidate_seq so that KVM can detect if the results of
-	 * vma_lookup() or __kvm_faultin_pfn() become stale prior to
-	 * acquiring kvm->mmu_lock.
+	 * vma_lookup() or faultin_pfn() become stale prior to acquiring
+	 * kvm->mmu_lock.
 	 *
 	 * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs
 	 * with the smp_wmb() in kvm_mmu_invalidate_end().
@@ -1609,8 +1635,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 	mmu_seq = vcpu->kvm->mmu_invalidate_seq;
 	mmap_read_unlock(current->mm);
 
-	pfn = __kvm_faultin_pfn(memslot, gfn, write_fault ? FOLL_WRITE : 0,
-				&writable, &page);
+	pfn = faultin_pfn(kvm, memslot, gfn, write_fault, &writable, &page, is_gmem);
 	if (pfn == KVM_PFN_ERR_HWPOISON) {
 		kvm_send_hwpoison_signal(hva, vma_shift);
 		return 0;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 296f1d284d55..88efbbf04db1 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1882,6 +1882,11 @@ static inline int memslot_id(struct kvm *kvm, gfn_t gfn)
 	return gfn_to_memslot(kvm, gfn)->id;
 }
 
+static inline bool memslot_is_readonly(const struct kvm_memory_slot *slot)
+{
+	return slot->flags & KVM_MEM_READONLY;
+}
+
 static inline gfn_t hva_to_gfn_memslot(unsigned long hva,
 				       struct kvm_memory_slot *slot)
 {
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 38f0f402ea46..3e40acb9f5c0 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2624,11 +2624,6 @@ unsigned long kvm_host_page_size(struct kvm_vcpu *vcpu, gfn_t gfn)
 	return size;
 }
 
-static bool memslot_is_readonly(const struct kvm_memory_slot *slot)
-{
-	return slot->flags & KVM_MEM_READONLY;
-}
-
 static unsigned long __gfn_to_hva_many(const struct kvm_memory_slot *slot, gfn_t gfn,
 				       gfn_t *nr_pages, bool write)
 {

From patchwork Mon Mar 3 17:10:13 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 869870
Date: Mon, 3 Mar 2025 17:10:13 +0000
Message-ID: <20250303171013.3548775-10-tabba@google.com>
In-Reply-To: <20250303171013.3548775-1-tabba@google.com>
Subject: [PATCH v5 9/9] KVM: guest_memfd: selftests: guest_memfd mmap() test
 when mapping is allowed
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Expand the guest_memfd selftests to include testing mapping guest
memory for VM types that support it.

Also, build the guest_memfd selftest for aarch64.
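With the Makefile.kvm hunk below, the test also builds for arm64.
Assuming a tree with this series applied, the usual kselftest flow
should work, e.g. building with make -C tools/testing/selftests/kvm and
then running ./tools/testing/selftests/kvm/guest_memfd_test (exact
invocation may vary by tree and kselftest setup).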
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 tools/testing/selftests/kvm/Makefile.kvm       |  1 +
 tools/testing/selftests/kvm/guest_memfd_test.c | 75 +++++++++++++++++--
 2 files changed, 70 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index 4277b983cace..c9a3f30e28dd 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -160,6 +160,7 @@ TEST_GEN_PROGS_arm64 += coalesced_io_test
 TEST_GEN_PROGS_arm64 += demand_paging_test
 TEST_GEN_PROGS_arm64 += dirty_log_test
 TEST_GEN_PROGS_arm64 += dirty_log_perf_test
+TEST_GEN_PROGS_arm64 += guest_memfd_test
 TEST_GEN_PROGS_arm64 += guest_print_test
 TEST_GEN_PROGS_arm64 += get-reg-list
 TEST_GEN_PROGS_arm64 += kvm_create_max_vcpus
diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index ce687f8d248f..38c501e49e0e 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -34,12 +34,48 @@ static void test_file_read_write(int fd)
 		    "pwrite on a guest_mem fd should fail");
 }
 
-static void test_mmap(int fd, size_t page_size)
+static void test_mmap_allowed(int fd, size_t total_size)
 {
+	size_t page_size = getpagesize();
+	const char val = 0xaa;
+	char *mem;
+	int ret;
+	int i;
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "mmaping() guest memory should pass.");
+
+	memset(mem, val, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0,
+			page_size);
+	TEST_ASSERT(!ret, "fallocate the first page should succeed");
+
+	for (i = 0; i < page_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0x00);
+	for (; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	memset(mem, val, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], val);
+
+	ret = munmap(mem, total_size);
+	TEST_ASSERT(!ret, "munmap should succeed");
+}
+
+static void test_mmap_denied(int fd, size_t total_size)
+{
+	size_t page_size = getpagesize();
 	char *mem;
 
 	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
 	TEST_ASSERT_EQ(mem, MAP_FAILED);
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT_EQ(mem, MAP_FAILED);
 }
 
 static void test_file_size(int fd, size_t page_size, size_t total_size)
@@ -170,19 +206,27 @@ static void test_create_guest_memfd_multiple(struct kvm_vm *vm)
 	close(fd1);
 }
 
-int main(int argc, char *argv[])
+unsigned long get_shared_type(void)
 {
-	size_t page_size;
+#ifdef __x86_64__
+	return KVM_X86_SW_PROTECTED_VM;
+#endif
+	return 0;
+}
+
+void test_vm_type(unsigned long type, bool is_shared)
+{
+	struct kvm_vm *vm;
 	size_t total_size;
+	size_t page_size;
 	int fd;
-	struct kvm_vm *vm;
 
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
 
 	page_size = getpagesize();
 	total_size = page_size * 4;
 
-	vm = vm_create_barebones();
+	vm = vm_create_barebones_type(type);
 
 	test_create_guest_memfd_invalid(vm);
 	test_create_guest_memfd_multiple(vm);
@@ -190,10 +234,29 @@ int main(int argc, char *argv[])
 	fd = vm_create_guest_memfd(vm, total_size, 0);
 
 	test_file_read_write(fd);
-	test_mmap(fd, page_size);
+
+	if (is_shared)
+		test_mmap_allowed(fd, total_size);
+	else
+		test_mmap_denied(fd, total_size);
+
 	test_file_size(fd, page_size, total_size);
 	test_fallocate(fd, page_size, total_size);
 	test_invalid_punch_hole(fd, page_size, total_size);
 
 	close(fd);
+	kvm_vm_release(vm);
+}
+
+int main(int argc, char *argv[])
+{
+#ifndef __aarch64__
+	/* For now, arm64 only supports shared guest memory. */
+	test_vm_type(VM_TYPE_DEFAULT, false);
+#endif
+
+	if (kvm_has_cap(KVM_CAP_GMEM_SHARED_MEM))
+		test_vm_type(get_shared_type(), true);
+
+	return 0;
+}