From patchwork Tue Feb 11 15:50:27 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 864719
From: Zi Yan
To: linux-mm@kvack.org, Andrew Morton, "Kirill A . Shutemov",
 "Matthew Wilcox (Oracle)"
Cc: Ryan Roberts, Hugh Dickins, David Hildenbrand, Yang Shi, Miaohe Lin,
 Kefeng Wang, Yu Zhao, John Hubbard, Baolin Wang,
 linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, Zi Yan
Subject: [PATCH v7 2/8] mm/huge_memory: add two new (not yet used) functions
 for folio_split()
Date: Tue, 11 Feb 2025 10:50:27 -0500
Message-ID: <20250211155034.268962-3-ziy@nvidia.com>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250211155034.268962-1-ziy@nvidia.com>
References: <20250211155034.268962-1-ziy@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org

This is a preparation patch; both added functions are not used yet.

The added __split_unmapped_folio() is able to split a folio with its
mapping removed in two manners: 1) uniform split (the existing way), and
2) buddy allocator like split.

The added __split_folio_to_order() can split a folio into any lower order.
For uniform split, __split_unmapped_folio() calls it once to split the
given folio to the new order. For buddy allocator split,
__split_unmapped_folio() calls it (folio_order - new_order) times and each
time splits the folio containing the given page to one lower order.

Signed-off-by: Zi Yan
---
 mm/huge_memory.c | 349 ++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 348 insertions(+), 1 deletion(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index a0277f4154c2..12d3f515c408 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3262,7 +3262,6 @@ static void remap_page(struct folio *folio, unsigned long nr, int flags)
 static void lru_add_page_tail(struct folio *folio, struct page *tail,
 		struct lruvec *lruvec, struct list_head *list)
 {
-	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
 	VM_BUG_ON_FOLIO(PageLRU(tail), folio);
 	lockdep_assert_held(&lruvec->lru_lock);
 
@@ -3506,6 +3505,354 @@ bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins)
 					caller_pins;
 }
 
+/*
+ * It splits @folio into @new_order folios and copies the @folio metadata to
+ * all the resulting folios.
+ */
+static int __split_folio_to_order(struct folio *folio, int new_order)
+{
+	int curr_order = folio_order(folio);
+	long nr_pages = folio_nr_pages(folio);
+	long new_nr_pages = 1 << new_order;
+	long index;
+
+	if (curr_order <= new_order)
+		return -EINVAL;
+
+	/*
+	 * Skip the first new_nr_pages, since the new folio from them have all
+	 * the flags from the original folio.
+	 */
+	for (index = new_nr_pages; index < nr_pages; index += new_nr_pages) {
+		struct page *head = &folio->page;
+		struct page *new_head = head + index;
+
+		/*
+		 * Careful: new_folio is not a "real" folio before we cleared PageTail.
+		 * Don't pass it around before clear_compound_head().
+		 */
+		struct folio *new_folio = (struct folio *)new_head;
+
+		VM_BUG_ON_PAGE(atomic_read(&new_head->_mapcount) != -1, new_head);
+
+		/*
+		 * Clone page flags before unfreezing refcount.
+		 *
+		 * After successful get_page_unless_zero() might follow flags change,
+		 * for example lock_page() which set PG_waiters.
+		 *
+		 * Note that for mapped sub-pages of an anonymous THP,
+		 * PG_anon_exclusive has been cleared in unmap_folio() and is stored in
+		 * the migration entry instead from where remap_page() will restore it.
+		 * We can still have PG_anon_exclusive set on effectively unmapped and
+		 * unreferenced sub-pages of an anonymous THP: we can simply drop
+		 * PG_anon_exclusive (-> PG_mappedtodisk) for these here.
+		 */
+		new_head->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
+		new_head->flags |= (head->flags &
+				((1L << PG_referenced) |
+				 (1L << PG_swapbacked) |
+				 (1L << PG_swapcache) |
+				 (1L << PG_mlocked) |
+				 (1L << PG_uptodate) |
+				 (1L << PG_active) |
+				 (1L << PG_workingset) |
+				 (1L << PG_locked) |
+				 (1L << PG_unevictable) |
+#ifdef CONFIG_ARCH_USES_PG_ARCH_2
+				 (1L << PG_arch_2) |
+#endif
+#ifdef CONFIG_ARCH_USES_PG_ARCH_3
+				 (1L << PG_arch_3) |
+#endif
+				 (1L << PG_dirty) |
+				 LRU_GEN_MASK | LRU_REFS_MASK));
+
+		/* ->mapping in first and second tail page is replaced by other uses */
+		VM_BUG_ON_PAGE(new_nr_pages > 2 && new_head->mapping != TAIL_MAPPING,
+			       new_head);
+		new_head->mapping = head->mapping;
+		new_head->index = head->index + index;
+
+		/*
+		 * page->private should not be set in tail pages. Fix up and warn once
+		 * if private is unexpectedly set.
+		 */
+		if (unlikely(new_head->private)) {
+			VM_WARN_ON_ONCE_PAGE(true, new_head);
+			new_head->private = 0;
+		}
+
+		if (folio_test_swapcache(folio))
+			new_folio->swap.val = folio->swap.val + index;
+
+		/* Page flags must be visible before we make the page non-compound. */
+		smp_wmb();
+
+		/*
+		 * Clear PageTail before unfreezing page refcount.
+		 *
+		 * After successful get_page_unless_zero() might follow put_page()
+		 * which needs correct compound_head().
+		 */
+		clear_compound_head(new_head);
+		if (new_order) {
+			prep_compound_page(new_head, new_order);
+			folio_set_large_rmappable(new_folio);
+
+			folio_set_order(folio, new_order);
+		}
+
+		if (folio_test_young(folio))
+			folio_set_young(new_folio);
+		if (folio_test_idle(folio))
+			folio_set_idle(new_folio);
+
+		folio_xchg_last_cpupid(new_folio, folio_last_cpupid(folio));
+	}
+
+	if (!new_order)
+		ClearPageCompound(&folio->page);
+
+	return 0;
+}
+
+/*
+ * It splits an unmapped @folio to lower order smaller folios in two ways.
+ * @folio: the to-be-split folio
+ * @new_order: the smallest order of the after split folios (since buddy
+ *             allocator like split generates folios with orders from @folio's
+ *             order - 1 to new_order).
+ * @page: in buddy allocator like split, the folio containing @page will be
+ *        split until its order becomes @new_order.
+ * @list: the after split folios will be added to @list if it is not NULL,
+ *        otherwise to LRU lists.
+ * @end: the end of the file @folio maps to. -1 if @folio is anonymous memory.
+ * @xas: xa_state pointing to folio->mapping->i_pages and locked by caller
+ * @mapping: @folio->mapping
+ * @uniform_split: if the split is uniform or not (buddy allocator like split)
+ *
+ *
+ * 1. uniform split: the given @folio into multiple @new_order small folios,
+ *    where all small folios have the same order. This is done when
+ *    uniform_split is true.
+ * 2. buddy allocator like (non-uniform) split: the given @folio is split into
+ *    half and one of the half (containing the given page) is split into half
+ *    until the given @page's order becomes @new_order. This is done when
+ *    uniform_split is false.
+ *
+ * The high level flow for these two methods are:
+ * 1. uniform split: a single __split_folio_to_order() is called to split the
+ *    @folio into @new_order, then we traverse all the resulting folios one by
+ *    one in PFN ascending order and perform stats, unfreeze, adding to list,
+ *    and file mapping index operations.
+ * 2. non-uniform split: in general, folio_order - @new_order calls to
+ *    __split_folio_to_order() are made in a for loop to split the @folio
+ *    to one lower order at a time. The resulting small folios are processed
+ *    like what is done during the traversal in 1, except the one containing
+ *    @page, which is split in next for loop.
+ *
+ * After splitting, the caller's folio reference will be transferred to the
+ * folio containing @page. The other folios may be freed if they are not mapped.
+ *
+ * In terms of locking, after splitting,
+ * 1. uniform split leaves @page (or the folio contains it) locked;
+ * 2. buddy allocator like (non-uniform) split leaves @folio locked.
+ *
+ *
+ * For !uniform_split, when -ENOMEM is returned, the original folio might be
+ * split. The caller needs to check the input folio.
+ */
+static int __split_unmapped_folio(struct folio *folio, int new_order,
+		struct page *page, struct list_head *list, pgoff_t end,
+		struct xa_state *xas, struct address_space *mapping,
+		bool uniform_split)
+{
+	struct lruvec *lruvec;
+	struct address_space *swap_cache = NULL;
+	struct folio *origin_folio = folio;
+	struct folio *next_folio = folio_next(folio);
+	struct folio *new_folio;
+	struct folio *next;
+	int order = folio_order(folio);
+	int split_order;
+	int start_order = uniform_split ? new_order : order - 1;
+	int nr_dropped = 0;
+	int ret = 0;
+	bool stop_split = false;
+
+	if (folio_test_anon(folio) && folio_test_swapcache(folio)) {
+		/* a swapcache folio can only be uniformly split to order-0 */
+		if (!uniform_split || new_order != 0)
+			return -EINVAL;
+
+		swap_cache = swap_address_space(folio->swap);
+		xa_lock(&swap_cache->i_pages);
+	}
+
+	if (folio_test_anon(folio))
+		mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
+
+	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
+	lruvec = folio_lruvec_lock(folio);
+
+	folio_clear_has_hwpoisoned(folio);
+
+	/*
+	 * split to new_order one order at a time. For uniform split,
+	 * folio is split to new_order directly.
+	 */
+	for (split_order = start_order;
+	     split_order >= new_order && !stop_split;
+	     split_order--) {
+		int old_order = folio_order(folio);
+		struct folio *release;
+		struct folio *end_folio = folio_next(folio);
+		int status;
+
+		/* order-1 anonymous folio is not supported */
+		if (folio_test_anon(folio) && split_order == 1)
+			continue;
+		if (uniform_split && split_order != new_order)
+			continue;
+
+		if (mapping) {
+			/*
+			 * uniform split has xas_split_alloc() called before
+			 * irq is disabled to allocate enough memory, whereas
+			 * non-uniform split can handle ENOMEM.
+			 */
+			if (uniform_split)
+				xas_split(xas, folio, old_order);
+			else {
+				xas_set_order(xas, folio->index, split_order);
+				xas_try_split(xas, folio, old_order,
+						GFP_NOWAIT);
+				if (xas_error(xas)) {
+					ret = xas_error(xas);
+					stop_split = true;
+					goto after_split;
+				}
+			}
+		}
+
+		/* complete memcg works before add pages to LRU */
+		split_page_memcg(&folio->page, old_order, split_order);
+		split_page_owner(&folio->page, old_order, split_order);
+		pgalloc_tag_split(folio, old_order, split_order);
+
+		status = __split_folio_to_order(folio, split_order);
+
+		if (status < 0) {
+			stop_split = true;
+			ret = -EINVAL;
+		}
+
+after_split:
+		/*
+		 * Iterate through after-split folios and perform related
+		 * operations. But in buddy allocator like split, the folio
+		 * containing the specified page is skipped until its order
+		 * is new_order, since the folio will be worked on in next
+		 * iteration.
+		 */
+		for (release = folio, next = folio_next(folio);
+		     release != end_folio;
+		     release = next, next = folio_next(next)) {
+			/*
+			 * for buddy allocator like split, the folio containing
+			 * page will be split next and should not be released,
+			 * until the folio's order is new_order or stop_split
+			 * is set to true by the above xas_split() failure.
+			 */
+			if (release == page_folio(page)) {
+				folio = release;
+				if (split_order != new_order && !stop_split)
+					continue;
+			}
+			if (folio_test_anon(release)) {
+				mod_mthp_stat(folio_order(release),
+						MTHP_STAT_NR_ANON, 1);
+			}
+
+			/*
+			 * Unfreeze refcount first. Additional reference from
+			 * page cache.
+			 */
+			folio_ref_unfreeze(release,
+				1 + ((!folio_test_anon(origin_folio) ||
+				     folio_test_swapcache(origin_folio)) ?
+					     folio_nr_pages(release) : 0));
+
+			if (release != origin_folio)
+				lru_add_page_tail(origin_folio, &release->page,
+						lruvec, list);
+
+			/* Some pages can be beyond EOF: drop them from page cache */
+			if (release->index >= end) {
+				if (shmem_mapping(origin_folio->mapping))
+					nr_dropped += folio_nr_pages(release);
+				else if (folio_test_clear_dirty(release))
+					folio_account_cleaned(release,
+						inode_to_wb(origin_folio->mapping->host));
+				__filemap_remove_folio(release, NULL);
+				folio_put(release);
+			} else if (!folio_test_anon(release)) {
+				__xa_store(&origin_folio->mapping->i_pages,
+						release->index, &release->page, 0);
+			} else if (swap_cache) {
+				__xa_store(&swap_cache->i_pages,
+						swap_cache_index(release->swap),
+						&release->page, 0);
+			}
+		}
+	}
+
+	unlock_page_lruvec(lruvec);
+
+	if (folio_test_anon(origin_folio)) {
+		if (folio_test_swapcache(origin_folio))
+			xa_unlock(&swap_cache->i_pages);
+	} else
+		xa_unlock(&mapping->i_pages);
+
+	/* Caller disabled irqs, so they are still disabled here */
+	local_irq_enable();
+
+	if (nr_dropped)
+		shmem_uncharge(mapping->host, nr_dropped);
+
+	remap_page(origin_folio, 1 << order,
+			folio_test_anon(origin_folio) ?
+				RMP_USE_SHARED_ZEROPAGE : 0);
+
+	/*
+	 * At this point, folio should contain the specified page.
+	 * For uniform split, it is left for caller to unlock.
+	 * For buddy allocator like split, the first after-split folio is left
+	 * for caller to unlock.
+	 */
+	for (new_folio = origin_folio, next = folio_next(origin_folio);
+	     new_folio != next_folio;
+	     new_folio = next, next = folio_next(next)) {
+		if (uniform_split && new_folio == folio)
+			continue;
+		if (!uniform_split && new_folio == origin_folio)
+			continue;
+
+		folio_unlock(new_folio);
+		/*
+		 * Subpages may be freed if there wasn't any mapping
+		 * like if add_to_swap() is running on a lru page that
+		 * had its mapping zapped. And freeing these pages
+		 * requires taking the lru_lock so we do the put_page
+		 * of the tail pages after the split is complete.
+		 */
+		free_page_and_swap_cache(&new_folio->page);
+	}
+	return ret;
+}
+
 /*
  * This function splits a large folio into smaller folios of order @new_order.
  * @page can point to any page of the large folio to split. The split operation

From patchwork Tue Feb 11 15:50:29 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 864718
From: Zi Yan
To: linux-mm@kvack.org, Andrew Morton, "Kirill A . Shutemov",
 "Matthew Wilcox (Oracle)"
Cc: Ryan Roberts, Hugh Dickins, David Hildenbrand, Yang Shi, Miaohe Lin,
 Kefeng Wang, Yu Zhao, John Hubbard, Baolin Wang,
 linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, Zi Yan
Subject: [PATCH v7 4/8] mm/huge_memory: add buddy allocator like (non-uniform)
 folio_split()
Date: Tue, 11 Feb 2025 10:50:29 -0500
Message-ID: <20250211155034.268962-5-ziy@nvidia.com>
X-Mailer: git-send-email 2.47.2
In-Reply-To: <20250211155034.268962-1-ziy@nvidia.com>
References: <20250211155034.268962-1-ziy@nvidia.com>
Precedence: bulk
X-Mailing-List: linux-kselftest@vger.kernel.org

folio_split() splits a large folio in the same way as buddy allocator
splits a large free page for allocation. The purpose is to minimize the
number of folios after the split. For example, if the user wants to free the
3rd subpage in an order-9 folio, folio_split() will split the order-9 folio
as:
O-0, O-0, O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8 if it is anon,
since anon folio does not support order-1 yet.
-----------------------------------------------------------------
|   |   |   |   |     |   |       |                             |
|O-0|O-0|O-0|O-0| O-2 |...|  O-7  |             O-8             |
|   |   |   |   |     |   |       |                             |
-----------------------------------------------------------------

O-1,      O-0, O-0, O-2, O-3, O-4, O-5, O-6, O-7, O-8 if it is pagecache
---------------------------------------------------------------
|     |   |   |     |   |       |                             |
| O-1 |O-0|O-0| O-2 |...|  O-7  |             O-8             |
|     |   |   |     |   |       |                             |
---------------------------------------------------------------

It generates fewer folios (i.e., 11 or 10) than the existing page split
approach, which splits the order-9 folio into 512 order-0 folios. It also
reduces the number of new xa_node needed during a pagecache folio split from
8 to 1, potentially decreasing the folio split failure rate due to memory
constraints.

folio_split() and existing split_huge_page_to_list_to_order() share
the folio unmapping and remapping code in __folio_split() and the common
backend split code in __split_unmapped_folio() using
uniform_split variable to distinguish their operations.

uniform_split_supported() and non_uniform_split_supported() are added
to factor out check code and will be used outside __folio_split() in the
following commit.
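
As a rough illustration of the order lists in this description (it is not
part of this patch), the user-space C sketch below models which orders a
buddy allocator like split leaves behind. The names struct piece,
non_uniform_split() and print_split() are hypothetical and exist only in
this example. The model assumes the folio starts at page index 0 and that
the order-1 level is skipped for anon folios, as the description states;
it does no locking, xarray or refcount work.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_PIECES 64

/* Hypothetical model type: one after-split folio. */
struct piece {
	unsigned long start;	/* index of the first page in the piece */
	int order;
};

static int cmp_piece(const void *a, const void *b)
{
	const struct piece *pa = a, *pb = b;

	return (pa->start > pb->start) - (pa->start < pb->start);
}

/*
 * Model of a buddy allocator like split: keep splitting the piece that
 * contains @page_idx until it reaches @new_order. Returns the number of
 * pieces written to @out, sorted by ascending page index.
 */
int non_uniform_split(int order, int new_order, unsigned long page_idx,
		      int is_anon, struct piece *out)
{
	unsigned long start = 0;	/* start of the piece holding page_idx */
	int cur_order = order;
	int split_order;
	int n = 0;

	assert(new_order >= 0 && new_order <= order && order < 32);
	assert(page_idx < (1UL << order));

	for (split_order = order - 1; split_order >= new_order; split_order--) {
		unsigned long step = 1UL << split_order;
		unsigned long end = start + (1UL << cur_order);
		unsigned long keep = page_idx & ~(step - 1);
		unsigned long s;

		/* assumption from the description: no order-1 anon folios */
		if (is_anon && split_order == 1)
			continue;

		for (s = start; s < end; s += step) {
			if (s == keep)
				continue;	/* split again in the next round */
			out[n].start = s;
			out[n].order = split_order;
			n++;
		}
		start = keep;
		cur_order = split_order;
	}

	/* the piece containing page_idx, now at its final order */
	out[n].start = start;
	out[n].order = cur_order;
	n++;

	qsort(out, n, sizeof(*out), cmp_piece);
	return n;
}

/* Print the resulting orders in ascending page index order. */
void print_split(const char *label, int order, int new_order,
		 unsigned long page_idx, int is_anon)
{
	struct piece p[MAX_PIECES];
	int n = non_uniform_split(order, new_order, page_idx, is_anon, p);
	int i;

	printf("%s: %d folios:", label, n);
	for (i = 0; i < n; i++)
		printf(" O-%d", p[i].order);
	printf("\n");
}
```

Calling print_split("anon", 9, 0, 2, 1) and
print_split("pagecache", 9, 0, 2, 0) models the example above (3rd subpage
of an order-9 folio) and yields 11 and 10 pieces respectively, whereas a
uniform split to order-0 gives 1 << 9 = 512 folios.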
Signed-off-by: Zi Yan
---
 mm/huge_memory.c | 137 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 100 insertions(+), 37 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 21ebe2dec5a4..400dfe8a6e60 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3853,12 +3853,68 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	return ret;
 }
 
+static bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
+		bool warns)
+{
+	/* order-1 is not supported for anonymous THP. */
+	if (folio_test_anon(folio) && new_order == 1) {
+		VM_WARN_ONCE(warns, "Cannot split to order-1 folio");
+		return false;
+	}
+
+	/*
+	 * No split if the file system does not support large folio.
+	 * Note that we might still have THPs in such mappings due to
+	 * CONFIG_READ_ONLY_THP_FOR_FS. But in that case, the mapping
+	 * does not actually support large folios properly.
+	 */
+	if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
+	    !mapping_large_folio_support(folio->mapping)) {
+		VM_WARN_ONCE(warns,
+			"Cannot split file folio to non-0 order");
+		return false;
+	}
+
+	/* Only swapping a whole PMD-mapped folio is supported */
+	if (folio_test_swapcache(folio)) {
+		VM_WARN_ONCE(warns,
+			"Cannot split swapcache folio to non-0 order");
+		return false;
+	}
+
+	return true;
+}
+
+/* See comments in non_uniform_split_supported() */
+static bool uniform_split_supported(struct folio *folio, unsigned int new_order,
+		bool warns)
+{
+	if (folio_test_anon(folio) && new_order == 1) {
+		VM_WARN_ONCE(warns, "Cannot split to order-1 folio");
+		return false;
+	}
+
+	if (new_order) {
+		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
+		    !mapping_large_folio_support(folio->mapping)) {
+			VM_WARN_ONCE(warns,
+				"Cannot split file folio to non-0 order");
+			return false;
+		}
+		if (folio_test_swapcache(folio)) {
+			VM_WARN_ONCE(warns,
+				"Cannot split swapcache folio to non-0 order");
+			return false;
+		}
+	}
+	return true;
+}
+
 static int __folio_split(struct folio *folio, unsigned int new_order,
-		struct page *page, struct list_head *list)
+		struct page *page, struct list_head *list, bool uniform_split)
 {
 	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
-	/* reset xarray order to new order after split */
-	XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
+	XA_STATE(xas, &folio->mapping->i_pages, folio->index);
 	bool is_anon = folio_test_anon(folio);
 	struct address_space *mapping = NULL;
 	struct anon_vma *anon_vma = NULL;
@@ -3873,29 +3929,11 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 	if (new_order >= folio_order(folio))
 		return -EINVAL;
 
-	if (is_anon) {
-		/* order-1 is not supported for anonymous THP. */
-		if (new_order == 1) {
-			VM_WARN_ONCE(1, "Cannot split to order-1 folio");
-			return -EINVAL;
-		}
-	} else if (new_order) {
-		/*
-		 * No split if the file system does not support large folio.
-		 * Note that we might still have THPs in such mappings due to
-		 * CONFIG_READ_ONLY_THP_FOR_FS. But in that case, the mapping
-		 * does not actually support large folios properly.
-		 */
-		if (IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) &&
-		    !mapping_large_folio_support(folio->mapping)) {
-			VM_WARN_ONCE(1,
-				"Cannot split file folio to non-0 order");
-			return -EINVAL;
-		}
-	}
+	if (uniform_split && !uniform_split_supported(folio, new_order, true))
+		return -EINVAL;
 
-	/* Only swapping a whole PMD-mapped folio is supported */
-	if (folio_test_swapcache(folio) && new_order)
+	if (!uniform_split &&
+	    !non_uniform_split_supported(folio, new_order, true))
 		return -EINVAL;
 
 	is_hzp = is_huge_zero_folio(folio);
@@ -3952,10 +3990,13 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 			goto out;
 		}
 
-		xas_split_alloc(&xas, folio, folio_order(folio), gfp);
-		if (xas_error(&xas)) {
-			ret = xas_error(&xas);
-			goto out;
+		if (uniform_split) {
+			xas_set_order(&xas, folio->index, new_order);
+			xas_split_alloc(&xas, folio, folio_order(folio), gfp);
+			if (xas_error(&xas)) {
+				ret = xas_error(&xas);
+				goto out;
+			}
 		}
 
 		anon_vma = NULL;
@@ -4020,7 +4061,6 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 		if (mapping) {
 			int nr = folio_nr_pages(folio);
 
-			xas_split(&xas, folio, folio_order(folio));
 			if (folio_test_pmd_mappable(folio) &&
 			    new_order < HPAGE_PMD_ORDER) {
 				if (folio_test_swapbacked(folio)) {
@@ -4034,12 +4074,8 @@ static int __folio_split(struct folio *folio, unsigned int new_order,
 			}
 		}
 
-		if (is_anon) {
-			mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
-			mod_mthp_stat(new_order, MTHP_STAT_NR_ANON, 1 << (order - new_order));
-		}
-		__split_huge_page(page, list, end, new_order);
-		ret = 0;
+		ret = __split_unmapped_folio(page_folio(page), new_order,
+				page, list, end, &xas, mapping, uniform_split);
 	} else {
 		spin_unlock(&ds_queue->split_queue_lock);
 fail:
@@ -4117,7 +4153,34 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 {
 	struct folio *folio = page_folio(page);
 
-	return __folio_split(folio, new_order, page, list);
+	return __folio_split(folio, new_order, page, list, true);
+}
+
+/*
+ * folio_split: split a folio at @page to a @new_order folio
+ * @folio: folio to split
+ * @new_order: the order of the new folio
+ * @page: a page within the new folio
+ *
+ * return: 0: successful, <0 failed (if -ENOMEM is returned, @folio might be
+ * split but not to @new_order, the caller needs to check)
+ *
+ * It has the same prerequisites and returns as
+ * split_huge_page_to_list_to_order().
+ *
+ * Split a folio at offset_in_new_order to a new_order folio, leave the
+ * remaining subpages of the original folio as large as possible. For example,
+ * split an order-9 folio at its third order-3 subpages to an order-3 folio.
+ * There are 2^6=64 order-3 subpages in an order-9 folio and the result will be
+ * a set of folios with different order and the new folio is in bracket:
+ * [order-4, {order-3}, order-3, order-5, order-6, order-7, order-8].
+ *
+ * After split, folio is left locked for caller.
+ */
+int folio_split(struct folio *folio, unsigned int new_order,
+		struct page *page, struct list_head *list)
+{
+	return __folio_split(folio, new_order, page, list, false);
 }
 
 int min_order_for_split(struct folio *folio)

From patchwork Tue Feb 11 15:50:31 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 864717
Received: from NAM04-BN8-obe.outbound.protection.outlook.com (mail-bn8nam04on2076.outbound.protection.outlook.com [40.107.100.76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 61B41253B73; Tue, 11 Feb 2025 15:50:58 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.100.76
ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1739289060; cv=fail;
b=SfM/PgJ5iPMYLgvpuUKyud1aXdjcDwZ7viC0/H5BtSikP15rFOShQogEbKC9QWkhSWIXpML/lhlBbP7X4hF2w33xmQ9ZfBDFravbr7ziKX14hwPk2aM/nUorToeTJSAyzYctNOuEPP4QBIbJYv81AwYeAld31eShJsDCZvUFSu8= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1739289060; c=relaxed/simple; bh=s37hja60uFWjgwL27ebpMB3HnWuiUbSz89CNYWS1vTM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=dp5Xa07LBgqCwj0bZVmTtldtQVEPQxJJCE/ob/s3vA6hICjtz03VLKtvMJlTe9VtYwQ5G7yyPwsv1scdDnHyJrzKxk36C8HEqeY0Wk4+sszvO7A4uwjC5OexUOHBPs47NcvJadcpF+W+SYtfz7S9NP8nnZWyp4hUYbNZzjhXhlg= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=TOGG22zX; arc=fail smtp.client-ip=40.107.100.76 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="TOGG22zX" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=kZB5Xfy+GLQw2k48omtzo/P7r/UDK4hEc6z3x27u7ZFzhGfsR0tuaEaASv/QpRDtv9mzXgtd+sM/gChpPvEs/4Ntfxyeqs8yKYv71Ci+DAEfQ3lBDyG3sHOmKuu0svWT2cmmFQzKUA5ms7Rivubo9Ogm+YsK8uXfX//882uf0Z7oAoIfPYg49o87Xv/4cpOcCf1i0KOJVHxnjsIyyF0TrB19MN9T2ZZUWOE1gCVDajV8h6X4I96prSYgpxIrDSfXWPuOmPp5/Wv1AzCmjyC+LnWOhz/vmAIbkwoGYGvlblcG/n4CN+6zUb1k8pgE1Ssv8apEnXsdJLoMa5nIwdIURg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ClSYVUWbTTC8Cb82lVfmu47Qu9TQA30MyMEQBjM2sHI=; 
b=GPqkLiBu5OlqW0EP579c4+K+uH3J922Xs9CVdmhMRHAS0Sl122nv9l4NFJLPgScnn6JmNOQ1Cz6IWInatUZyJ2vymaE9AoI0btSZZyvQuxu8HP8OwJ3l4z9o0QfFyI7wB8/siLd84WgfbKM3vBFKRB6hpzuhN4T8uP9i8UlG1BUChAj5M5wL+7VgOj4M6Zt+t9JPgIkL6a4Sb0AOqMF5AIi+Je52+jIP4WATPAjZRCwZO+uaR7x+yA+Yq8vwB4RnVP8gKPlPmeY57qvp/9ibB9kPBrThXxuUUfiZ5dI5aYnB00hsQN07zK1HiFM8W8CH+QMGH/EdIocvr8BDTZl/Vg== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ClSYVUWbTTC8Cb82lVfmu47Qu9TQA30MyMEQBjM2sHI=; b=TOGG22zXB5JvGaTIHIXTDbucvpu2Olu/uFHb003ao0sdxLL49KreI8tz0/FgmcVz8AnvMbYVVPND3BmwRFrEMD2OZIj/P8Jv5ZKBUjT+sca5R4FX6ofDsRKPDc3C8MJLqk1ZwOeruQOnsDe3UrVBXLiAk2Xn6nKVycSJCi/gdQpVIU1sgDr2j2UE9Vz6NHVsCFQJtTN1cykWDtzARLSCT//sWnf45ywi7VcNIIyLPDgP+x9pGuASSkESyyi9UevKK0lT9+adcJtwZ1LXdNcYevdlO24/NOs4SSPxD4YRPbuGw8dZI5POKchPfFNqykYKiAbwc0G8kkDfI7rodUGbTg== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS7PR12MB9473.namprd12.prod.outlook.com (2603:10b6:8:252::5) by SA3PR12MB8764.namprd12.prod.outlook.com (2603:10b6:806:317::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8422.19; Tue, 11 Feb 2025 15:50:51 +0000 Received: from DS7PR12MB9473.namprd12.prod.outlook.com ([fe80::5189:ecec:d84a:133a]) by DS7PR12MB9473.namprd12.prod.outlook.com ([fe80::5189:ecec:d84a:133a%5]) with mapi id 15.20.8422.015; Tue, 11 Feb 2025 15:50:51 +0000 From: Zi Yan To: linux-mm@kvack.org, Andrew Morton , "Kirill A . 
Shutemov" , "Matthew Wilcox (Oracle)" Cc: Ryan Roberts , Hugh Dickins , David Hildenbrand , Yang Shi , Miaohe Lin , Kefeng Wang , Yu Zhao , John Hubbard , Baolin Wang , linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, Zi Yan Subject: [PATCH v7 6/8] mm/huge_memory: add folio_split() to debugfs testing interface. Date: Tue, 11 Feb 2025 10:50:31 -0500 Message-ID: <20250211155034.268962-7-ziy@nvidia.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250211155034.268962-1-ziy@nvidia.com> References: <20250211155034.268962-1-ziy@nvidia.com> X-ClientProxiedBy: MN0P220CA0001.NAMP220.PROD.OUTLOOK.COM (2603:10b6:208:52e::29) To DS7PR12MB9473.namprd12.prod.outlook.com (2603:10b6:8:252::5) Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS7PR12MB9473:EE_|SA3PR12MB8764:EE_ X-MS-Office365-Filtering-Correlation-Id: 3725b3b2-7cd0-4a7e-0fa1-08dd4ab3dc1c X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|7416014|1800799024|366016; X-Microsoft-Antispam-Message-Info: 
GRcVFCbGGH6G0aq3jdHrIS+1p174g4xM4DBw0HsqUy3dGKLD+tXnnO6PXIKpLmw3tsj/Sj27wL4fWfFiBGX5E9hcDCuFATApQ1q45+Ld6IUlPF92Mqwy3ZX+xVPwVR4YDVKOz8NKk3mUE5bQsdps3LgtpQDtybqKHD0zl0eg4CXNGM7XxiYd50nA/JLqUmZwOgVHwIltwiLDdf9itBh6VVjyrnVg4rV0JgaWt/eyE1b+wpsJsRhnmXZehK1if+cX4KT1FGeNNOib/EZTpWi+I/p9g6HOKwcPXlySTeVVl7DkFnNqR0K+VyEHoFrMii2DUVqHkKnUu7My39r1QmDw7NnFNiIW54kmnG743yyVcaYsWvj64f9H1Oo7c90HcYKVipIN95AW7wcZRsSGa4qdbpfsOwhUo2Fm8h8uHVyrHXjzQP1ICZGTN+SCEgyMHxhLvaBVsKfQS8dZBuC4Hnfgu2HbrIEtM3Wbb0Uz9H8IQp4CRgO3qtRrOFPTzV5z6wzLbeOtrlL8Gn8/zDr/yupnnXydPgnnew21wVVfxRISqk66c4zPB5iGYd/pUFt5JWTHgXd8vs4RWWhGtSiPrQZpbvPLgB7aDpXDYnQXKGA1ewSH9RVohBYtZuxchMDOFo+UhXW9IMT8keIrgoTj4UxTRDtidj5AjxxxSTQ5fLvkAlzChPMo/abuDQeXvv2eM5xFCTu3pMTwyyxrFglPLYwq+KzB3f5Q2eGXyb9/2B3h8iUzdWa/hx5CZhsmJJ7DCSE5O5VTklLZk1li9Fu3b5w7NbLU3fCCdWCGc7mn2hIPmpUR668amrIpOd6AWu0xVGPL1sFu7uWaHBIOKoW/xr5HuosYUM2HH6R++yCIZoJpz3m7NK6TVmw9j/R5zYoB3zVeiKD5lAfsltx9SY3RX5ploklgw60rb3S/PFLA3A4sVaasL2GPxq1sOyXJSFaRa1+k1qeuFztwByjPuF6Y0UyiuurpFJdYLWJVVq4NFrpf+CAo0KxV/nmvd5UwZZ9r9yxQznsDDAuBiAiwv+ws2OvWnXEXT2famyjU68SYKFjbGotL2W2Lp/Z0qe0F/pBI8S5ovfdF7rvOYUC8lllBnVRgEQNkaz1NL65419aNEByMPbvFZQW4JdpwrkugFHauLsxCGCfJ+JQ6meXNF3ZalYKTjkJn4ZBmpdOymbN9jIqVCWERdiiOHCSlvzHj4aOKctkFlHyyqjLLYRIxbAugkJwMndrhohbwy/j5zgCRJPn8+18Pt9I2Ud0N/X1YGZ0ykdqDARb5ly49XadUUSUc0xJaDRyi1PO7k6AKkFVyVyWdFsDgY/ejbUCdh88rcQWddLPwLcEGQVNuLh6p4w5joFktoxE1iUIDDNwqOeZjFsi7sRqJYVvsi5s2TkcGtH5pqUDx X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DS7PR12MB9473.namprd12.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(7416014)(1800799024)(366016); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
0qXcs6rUwK6CN6bLu3rKlnbgB/8j1BF5shXiGwUpbsdY4ZNY+MxP8dc2iqjK7HzFop5mXee/lSaSIEYumR+QrtRkdlnaNWdTYgmIGDfO57swWEpYEZAdsL8tUw0LMQVHjow4qDyFbviilexVxTp2EjB0l8X/K4spuAtoUCqHA9Uv/5Uz+yzA9EOJNttD+mWkJpoTgqfeGAvkrnuPx5UA98NAzN6kLH9/Huhbtm4od59I7hSpRqTYEs6Ga5i5bQYhY/tgXNjrM9zGdangbMunl5hNsQILUZMBz+3GnsAYt9ROHwDOgbGQQip+8WEbGs2b94dLCuSGmVeL17Lz9INZbdFnCNjPNtM20ONxcrMWyccvKpElFgpzDN1IQ71UbScCrZ3sE0UxFvFVTOredrzdnxv0CyLKZks+9+MVDRBnsYrCbDItxobWXAaYMEGRbfvjvc1/66n0TtHAaONiUY+hQlOUA9KuP+L36KDDDlX6+Q+sSlV+Ly4NOIOU3TaWAV+LQvzq9MZcicHNtWoltvGWK5IMXjyXmgmsiGpPFuv4EC/B15jP8kQQq7zz7Z32qFf+jMNZq3oyE3fe8EHxPekl73WDB632MFX5rdOXk/OJ4mfk7trOZuOH6bIceZVzaLzY4PE6/euRszG9bGeiwBn52yecT03HqFbDFhxqGJJLv8ENp7tW3MRC9WYLNj+uoX9D+zFlAX4bgx+hLsHIOwizHcX3uXIK8yE/nm6sAezFXCFfkVULc8WreBbNhhCQ3L/EcHpFvEIGppAlJcufGi8aBEz8c3DDlNQmU5TGoswtDfBgTC4JFa/mm2tuUWNb/rRkkhD+pHGvaZmhd+Y7M3OJxwOgi0laVaDpMSnLQBBFHx8Ke/SWgWm0GPNOjUHnpuiNjIKdFegLimInBFXc9eTZMb1pXckBjW3Ndlfr/7Iz6ItKfJOeHVKQuTzSrrYC6vMb41yBHDJv59igZAprObfV3mDqNH+IFADM0dTNX4i8Ib4aDZx8DXU75YhxXS0olX2KJFcPH3DhW/Ug9095T5HPRmu2CYBUqWiBBDXKugFe1vq24sjX91rWvC6zknv5x7eGAvoST1AZnCUSH6qjcMjlk+UtrUkwAQAZCR9lrXish9dtWtxSs/hiMcY7xCykZKiOrbbCCdSUtR0h3hxf8n7Bu7LqiV8AYtJTOp3BRC4OJwoskm5RoDvsYcV31uPTZdoEYCkGRVp8zDh182O1RcSHSddP7v5bb4S5EHTPcC0FLWy/iIjuugxZ9vFV84/d/+rZ1CeUpY/ft/A/80ywApe2szJWjHSB/TyPM7dP1tved99DdoH0XHi5DPniCt7PvLDQAvOb4cOYo5v7VBUInIIUxPEbCYyfiHG/0KbtclIaDIhF3NtcRdqYDE/5WY223VjwuFh3g94YZ5SFNJDO0SW6mb2PMwfzV20+C5gx5JEVKIN/tAtyk0u4qiTvgHpmhoGicwTuBDEBH2UK4BFztp2N5jLSvElnidxGH4xC03qyS0ETsXK4C+9mPK30N09ezFlW5KdG7jJRyzLhJ7FC1UZhSPco6XqI2FxfqHRoedl3TaYCWjalKBG+hTIBnRyg3VuA X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 3725b3b2-7cd0-4a7e-0fa1-08dd4ab3dc1c X-MS-Exchange-CrossTenant-AuthSource: DS7PR12MB9473.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 11 Feb 2025 15:50:51.0599 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
43083d15-7273-40c1-b7db-39efd9ccc17a
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: c5q+6AvML3XjQjHy4HY9c4zZWYsbXtB1noVRO3gIL87yYMQxbj5tIqhJ9lK3feu6
X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA3PR12MB8764

This allows testing folio_split() by specifying an additional in-folio
page offset parameter to the split_huge_pages debugfs interface.

Signed-off-by: Zi Yan
---
 mm/huge_memory.c | 47 ++++++++++++++++++++++++++++++++++-------------
 1 file changed, 34 insertions(+), 13 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 437d0cd13663..05c09b791676 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -4295,7 +4295,8 @@ static inline bool vma_not_suitable_for_thp_split(struct vm_area_struct *vma)
 }
 
 static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
-				unsigned long vaddr_end, unsigned int new_order)
+				unsigned long vaddr_end, unsigned int new_order,
+				long in_folio_offset)
 {
 	int ret = 0;
 	struct task_struct *task;
@@ -4379,8 +4380,16 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 		if (!folio_test_anon(folio) && folio->mapping != mapping)
 			goto unlock;
 
-		if (!split_folio_to_order(folio, target_order))
-			split++;
+		if (in_folio_offset < 0 ||
+		    in_folio_offset >= folio_nr_pages(folio)) {
+			if (!split_folio_to_order(folio, target_order))
+				split++;
+		} else {
+			struct page *split_at = folio_page(folio,
+							   in_folio_offset);
+			if (!folio_split(folio, target_order, split_at, NULL))
+				split++;
+		}
 
 unlock:
 
@@ -4403,7 +4412,8 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
 }
 
 static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
-				pgoff_t off_end, unsigned int new_order)
+				pgoff_t off_end, unsigned int new_order,
+				long in_folio_offset)
 {
 	struct filename *file;
 	struct file *candidate;
@@ -4452,8 +4462,15 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
 		if (folio->mapping != mapping)
 			goto unlock;
 
-		if (!split_folio_to_order(folio, target_order))
-			split++;
+		if (in_folio_offset < 0 || in_folio_offset >= nr_pages) {
+			if (!split_folio_to_order(folio, target_order))
+				split++;
+		} else {
+			struct page *split_at = folio_page(folio,
+							   in_folio_offset);
+			if (!folio_split(folio, target_order, split_at, NULL))
+				split++;
+		}
 
 unlock:
 		folio_unlock(folio);
@@ -4486,6 +4503,7 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf,
 	int pid;
 	unsigned long vaddr_start, vaddr_end;
 	unsigned int new_order = 0;
+	long in_folio_offset = -1;
 
 	ret = mutex_lock_interruptible(&split_debug_mutex);
 	if (ret)
@@ -4514,30 +4532,33 @@ static ssize_t split_huge_pages_write(struct file *file, const char __user *buf,
 			goto out;
 		}
 
-		ret = sscanf(tok_buf, "0x%lx,0x%lx,%d", &off_start,
-			    &off_end, &new_order);
-		if (ret != 2 && ret != 3) {
+		ret = sscanf(tok_buf, "0x%lx,0x%lx,%d,%ld", &off_start, &off_end,
+				&new_order, &in_folio_offset);
+		if (ret != 2 && ret != 3 && ret != 4) {
 			ret = -EINVAL;
 			goto out;
 		}
-		ret = split_huge_pages_in_file(file_path, off_start, off_end, new_order);
+		ret = split_huge_pages_in_file(file_path, off_start, off_end,
+				new_order, in_folio_offset);
 		if (!ret)
 			ret = input_len;
 
 		goto out;
 	}
 
-	ret = sscanf(input_buf, "%d,0x%lx,0x%lx,%d", &pid, &vaddr_start, &vaddr_end, &new_order);
+	ret = sscanf(input_buf, "%d,0x%lx,0x%lx,%d,%ld", &pid, &vaddr_start,
+			&vaddr_end, &new_order, &in_folio_offset);
 	if (ret == 1 && pid == 1) {
 		split_huge_pages_all();
 		ret = strlen(input_buf);
 		goto out;
-	} else if (ret != 3 && ret != 4) {
+	} else if (ret != 3 && ret != 4 && ret != 5) {
 		ret = -EINVAL;
 		goto out;
 	}
 
-	ret = split_huge_pages_pid(pid, vaddr_start, vaddr_end, new_order);
+	ret = split_huge_pages_pid(pid, vaddr_start, vaddr_end, new_order,
+			in_folio_offset);
 	if (!ret)
 		ret = strlen(input_buf);
 out:

From patchwork Tue Feb 11 15:50:33 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 864716 Received: from NAM04-BN8-obe.outbound.protection.outlook.com (mail-bn8nam04on2076.outbound.protection.outlook.com [40.107.100.76]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4172B257420; Tue, 11 Feb 2025 15:51:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=fail smtp.client-ip=40.107.100.76 ARC-Seal: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1739289064; cv=fail; b=AJBPgA8FNHmfHS5EakomiD/MmsOvKWeyS78+2IXY/9Hrs46sZTdObkY7uYUfj2anpqewIAPFu7+U/221KOcwC4d64Le3BbZ8qM00668WmYMEuqu+nfKQvugOQtSf9zCgGutjeq7OfsfDxKv+N/H3xE4v49tSj1EC4N9VxAVZ+vY= ARC-Message-Signature: i=2; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1739289064; c=relaxed/simple; bh=siUAkUsrYmN0xTkGTYj67zOhBR+jzX37/9m2FaJA0aI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: Content-Type:MIME-Version; b=GlEvWG5FZqi1rkFX+/bOopqsU+iZNVudWmzSoyvnbWbpUrbkVi+zE7nxlc/0rev8cRdiyLzCG0tcj5sv8xKKw++k4g053UZDpLV+O7f6/YVknacCnY89KrYArXCTbet/fq9XK1O+4kZe70XJiHjEKCMmrROgDTo2TE0mXMgws1I= ARC-Authentication-Results: i=2; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com; spf=fail smtp.mailfrom=nvidia.com; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b=ZDSEGA3H; arc=fail smtp.client-ip=40.107.100.76 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=nvidia.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=nvidia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=Nvidia.com header.i=@Nvidia.com header.b="ZDSEGA3H" ARC-Seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; 
b=rYZ2qIGSecvwJjS3H/rNFuqBJgOHvCgAfwhKz9FYNsUQrhxN3LL4RghYxd0Lh0G1Tw4THHDq+08XSkqlDqhx3EXdcs3e8cT3ctZOadRKSxhO78sv7yd7+37dNApkYVxvQ3B73nuytmmpTPVO79G6BjOcwtakD7Rse9RhemTlIB0A3KnV3OH9mY/9qLj1Y9nrLHv3YWg3zXt38YG1Tuv6DKGjKIb8Klhxl7ycmWaDwpi63GSdC+7F+htXJH/fTm1HVyTvZhm951zxXnj5YU6nrr5PkY/LSLCUpU+hXaWZnTKilJuQrob0Kxp4OQoeM7s0kanbPFJvHdjHzzXHSVPHTw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=QxUrFoVdwqE7hmVfm5+m5yVYkjr3vlGolGIklz7tl8A=; b=VUnEDNELhawETTd1oH5c383kYqDuLEnZ7NqeLWZywhxuVx9hONiqqfNEjP/HP7Fqd9eLfh9P4o7RBwVjqJoSKdzgLLyTrJ18mEexeQXyPMVYQ4HU7MAbu6hr388ioPwZBllhkNTbdlIy45RrR+KgydIsc+P068E4oQsDlDtdOudil0K1Ir2ByDXo0SbfGhJsFqBi2XQydujjAmnBb4nhj486hrUQ5k4qz9s9XioEmoqA2Zm61kI5aOS7TMMdtyEVbWmw5/8UCHzszUIx5Cnb3hCCF9h3b1Xfstdq86tI+9neGPjlDj118ZsonbUUh1HiAj42ebCTeXft/u5M0nYKYA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=QxUrFoVdwqE7hmVfm5+m5yVYkjr3vlGolGIklz7tl8A=; b=ZDSEGA3H9+nCp7UDQkiM2XqFgc3cNttZ8XMtsMTJG7mLHtLB+cTv089Pxv1eUkisyptl5Cm/ooePfKZUaXv0wSQrmZtk4FqyHbZQwo6IwVQd121XC3O+DUldCQQjo4tiOB5vLpFnHjLL4lfbyLrF8ZO3LNS+4veH36tM6ay7Fjf+4rpPoJZQfiAGnrA1oPiNrBS4IswumZ4n3+poWdLVWqzD47X/AL3g7sKtzvoN4N1VRP4Gh/Pdhc5gWcjbG5En17NzNC7gSlBwjxe5rR93tywuwrNYiRVSTihczJCcIIvBrNwftYTSk+Jp/i3wuT+6dICsX7y46WmI1AEJxmfOJA== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS7PR12MB9473.namprd12.prod.outlook.com (2603:10b6:8:252::5) by SA3PR12MB8764.namprd12.prod.outlook.com 
(2603:10b6:806:317::13) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8422.19; Tue, 11 Feb 2025 15:50:54 +0000 Received: from DS7PR12MB9473.namprd12.prod.outlook.com ([fe80::5189:ecec:d84a:133a]) by DS7PR12MB9473.namprd12.prod.outlook.com ([fe80::5189:ecec:d84a:133a%5]) with mapi id 15.20.8422.015; Tue, 11 Feb 2025 15:50:53 +0000 From: Zi Yan To: linux-mm@kvack.org, Andrew Morton , "Kirill A . Shutemov" , "Matthew Wilcox (Oracle)" Cc: Ryan Roberts , Hugh Dickins , David Hildenbrand , Yang Shi , Miaohe Lin , Kefeng Wang , Yu Zhao , John Hubbard , Baolin Wang , linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, Zi Yan Subject: [PATCH v7 8/8] selftests/mm: add tests for folio_split(), buddy allocator like split. Date: Tue, 11 Feb 2025 10:50:33 -0500 Message-ID: <20250211155034.268962-9-ziy@nvidia.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250211155034.268962-1-ziy@nvidia.com> References: <20250211155034.268962-1-ziy@nvidia.com> X-ClientProxiedBy: MN0P220CA0008.NAMP220.PROD.OUTLOOK.COM (2603:10b6:208:52e::7) To DS7PR12MB9473.namprd12.prod.outlook.com (2603:10b6:8:252::5) Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS7PR12MB9473:EE_|SA3PR12MB8764:EE_ X-MS-Office365-Filtering-Correlation-Id: 05b75c12-2b70-4c46-5839-08dd4ab3ddcb X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|376014|7416014|1800799024|366016; X-Microsoft-Antispam-Message-Info: 
pDHmo/hQ6hkzTmw36NNUOXl783mBkhAP/mEk8Zj3sgnyUpJ+jp5MdUejWniLHArkOItqvy5Upq0SAZHosXO30nx9CmYcCreVNKaCCkEFHdmQXYpcFt9HQTWAwEproOGZPpdKHiO/wkV55mnZgi+HIG4lJ1dX64lpevaCAuzpZPrp0nmaoERe7bQttWgl1PUsmcsffHysepbFZqSsKAbmLn/EMoQ8ps2GRXKOBx5xkog0Bw+ATOzP79xQ7YiYS9hbv3MWUozp7f8jwQS+OrkiYuTPFUkeXth4au9vdAOVrmX42Q6MT3WfZHTMtUbUifg+lqM+d2y9o5Xeg3YGBAHQdqV+j/mVcii0f6f/Z+jvaLPoC/Ep5kvJ/61N28X32xLLNpTfK3+CIDQRA/h+sPbasTRdl+cNr6rTGfh1nzcLasArm488BHnZtlaUYmGfu8c28hHaQVuGND2fT0jBfo/WkeUzycwnChg5CQPoN9LCAiMx+V2XhH+vZwG3eXOBy3IePq6EZjk37XqsKU0zPDA+uSg8PY5yMscRfU9Uq4W8v27B3cnRjbffKTXlhOPHxBDrRtjkRf5712HfcS99SeASlN0cQuU2yejybpoPb+lto/ysFxUkGzdwtbwTMsM3IqLMOdZZq5gdEUviyFSsm3KJZKe2JTZsoOSUaskNuz8j7/PJGl0SS1VX7JMJTDnVVxyNYD/oI6LwaROAOIQMjNHqo84cbqoHG+ZkzxMPnEK8PQRn5pmk/MIL8BDtLJlMg2IfoRFLz/ADHXG7wwd40VGLDtpqKE7W9urOezgpSsgY5yTNIX/7UP9IeezDQYtJOWRysgk6+IzePR9Cb37iNN3gEnEzsCxmbP193foUll4/8RrR2v9us+5+RireyXZK7eutz/RUJF9tuK33/jJiKHNQk4pdbGEFHu8D/ETXZ9Hne0rmK4XVk2ZnS1/nWuftzvefqGTeaGipsZYBFji1VQTPUVSIHBJImU/i6x4E8M+iAHezj5RpYKdW9ewah13ihBj8iO1MxaZhQ1qkeL88d51TDEZaKvnYARJWC6B/miKDMYY8VK3Qvl920ioUMH/nZSYno62X0qXWUZobrDTLrTdwJqbtxC7VpF3zeozll7BH1jBleWyKJqIVS9yaTRHBib8GVQCDTkjD7t8A37dQo8p60TUjgo3ZIcfYZbwPZPH6BUET80V+DCCz75HYJTgCkJsM3KeyL+BOF/3qzDx2sqAMc76xwUUpmc0eZX+HQr6LR3zh9+wbxMcsoXGOcQ8fb+L+/9LfkFHQQopaw5hQgCOwYsKN3xI1kYljilq6ED5ILFhVGmh7+vgk+wBX+gUTuDjOnCYJ0atp/Kp/pKLnzSBpdO/0jup7xDiT/a/16RIwKRY2tax8yWzkBEOzkoinbQGU X-Forefront-Antispam-Report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DS7PR12MB9473.namprd12.prod.outlook.com; PTR:; CAT:NONE; SFS:(13230040)(376014)(7416014)(1800799024)(366016); DIR:OUT; SFP:1101; X-MS-Exchange-AntiSpam-MessageData-ChunkCount: 1 X-MS-Exchange-AntiSpam-MessageData-0: 
xZHcuYiK9xLKyeLtJdpVmTpND+67JNZE1DixL8z0zMoXHXBnEadhlV5PV8nR/uw2eRAqguJKfWNGGhKyNCLb2rH9/XSm6GoMg6ZixwKjIwjhfLzoTEfH8uYUh7E8ff86z1KPXDwPW4ls++sTh6SjatNTuN8AgwBQ6KISAWYBB5kFlWtfaxM5sv/CUbbUd3CJhEuPjeVcRWPwy9aXGD97GDJi4jAfTyC/bSnpr48n0sOZBoZiNm5/w4/VwDFLTFpoVwzeIsEelS8kGeIBiOCpkkl278+KGutwbsHrOHhLOAQ0SrtUSPYHS14BRdtbpPSgZ2N8JoG1LMSq357ccFpeomNYsidy+zesInBZoVOxCdLm6Z3cZQhn4vyRLACgAL+rPjalrDGbdKltfjDS3rDF74Glw1HWH62v6VjCSNsYz132cog6H6qHshEVmLr4AR4hJve1ebCDMPtW5LB5eI8Tv2Wi4nNMhrv88TtlIIRiLp94t6KMWK1PCv12XelfEqbJ14vquXL2tcqh8OfvV9kwkVmmkbTDIfo7n8cWT20yAUsDx3WjprsQHTiJnTn9phRLcmIMYvOvLyxKUNTdR9bvOTsJbQaVJbP+OnwHOdbTjxE4gm3xDWmeV7pfk/af8wDStP2XaILXlWkokSLDAZhvX5twsp1YgYOcrWHLIfPYYSfkl+iiMmtc1p12OG6mlJOU9O55VQg6u8AQdM6UxpeiiEUr7Zsp1e9MY+clIxNuMIcQp/EKYoK+aP6CimyJyzIN5niZ0WhSjKPzX1sp4051B1ZRgs2/6VAVjBWew4uZX15N0fAt864+BbnfWDmGZzHGZg8CG9ILsrj5XoMqRvDG1DhWTXtmee7/DhmXwXFDMfHYtfcoUOAP6DF+h1lfTBb9fQTPKJUBFv+ar1eZpEBBRMn6V5LHZfW1P1SZ6LXmmQUElPvvWIvzeWFnVQ+dH7z7JHd6dbrhKpxaSpu5m2Lgv8ni7e/NgbEBY8RMHKWWX8//6bblVns5nbKxm3NzMACXCteT9KpH23fzz+DwE9YhVqLrVMmBfznfGOpwYrvnwasxD9HxYV1OmBKF2u1Lcltf6/o1JePkUdjnNPhZx0FAOGeCzy+HeyFG1mAa7RH9BXKp4jfLsOFf1/RGK/ex3v86uxPFn3xNKzNgfQMGI4onqWOpalUNDz3WHuUairWCw/aZYAnq15ehjxH879a6HS9hvE3uPQ/ctkLxfrGAFhk9S3ZoXlSloTEboeeFSNF9N4QLuEHJ0lLCGXCEdFTvpYNe/sQsMtRPbqs9mVkeAk6i9y9yo6jGdnanRX+KibAY9D1QWPAwgBxPqZ6Jfl65E3P+Rn3G228B8eSoey/aWvTetycHlS5SF9WX6lyigzxK7ENt5dKnt/DBxrv3+dYux19CDnndsoG6SnxjPb1Ky/xC/cGP+k6En3F7l7yD1ekHihxUHyB2DynvMflWr5/3ueZ3cDzGAgCCumm3AoHDUhoCcNpjs5zbxMCaqt5PvhgsM+EWNtam8Nqm72YtX2WAJeBzf3w1v9if/idUkhqhoAgjqFNW09dVwz/8EW/A4rvtFr+Nzi53TstzUadx9gSEMzYW X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-Network-Message-Id: 05b75c12-2b70-4c46-5839-08dd4ab3ddcb X-MS-Exchange-CrossTenant-AuthSource: DS7PR12MB9473.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-OriginalArrivalTime: 11 Feb 2025 15:50:53.8579 (UTC) X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted X-MS-Exchange-CrossTenant-Id: 
43083d15-7273-40c1-b7db-39efd9ccc17a
X-MS-Exchange-CrossTenant-MailboxType: HOSTED
X-MS-Exchange-CrossTenant-UserPrincipalName: /TsT4nkRAkPSleweoi1K/qJi8NptEo3mQTu1ePrBGBZ3c48S49e58XIhmYYTt47h
X-MS-Exchange-Transport-CrossTenantHeadersStamped: SA3PR12MB8764

It splits page cache folios to orders from 0 to 8 at different in-folio
offsets.

Signed-off-by: Zi Yan
---
 .../selftests/mm/split_huge_page_test.c | 34 +++++++++++++++----
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index e0304046b1a0..719c5e2a6624 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -14,6 +14,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -456,7 +457,8 @@ int create_pagecache_thp_and_fd(const char *testfile, size_t fd_size, int *fd,
 	return -1;
 }
 
-void split_thp_in_pagecache_to_order(size_t fd_size, int order, const char *fs_loc)
+void split_thp_in_pagecache_to_order_at(size_t fd_size, const char *fs_loc,
+		int order, int offset)
 {
 	int fd;
 	char *addr;
@@ -474,7 +476,12 @@ void split_thp_in_pagecache_to_order(size_t fd_size, int order, const char *fs_l
 		return;
 	err = 0;
 
-	write_debugfs(PID_FMT, getpid(), (uint64_t)addr, (uint64_t)addr + fd_size, order);
+	if (offset == -1)
+		write_debugfs(PID_FMT, getpid(), (uint64_t)addr,
+			      (uint64_t)addr + fd_size, order);
+	else
+		write_debugfs(PID_FMT, getpid(), (uint64_t)addr,
+			      (uint64_t)addr + fd_size, order, offset);
 
 	for (i = 0; i < fd_size; i++)
 		if (*(addr + i) != (char)i) {
@@ -493,9 +500,15 @@ void split_thp_in_pagecache_to_order(size_t fd_size, int order, const char *fs_l
 	munmap(addr, fd_size);
 	close(fd);
 	unlink(testfile);
-	if (err)
-		ksft_exit_fail_msg("Split PMD-mapped pagecache folio to order %d failed\n", order);
-	ksft_test_result_pass("Split PMD-mapped pagecache folio to order %d passed\n", order);
+	if (offset == -1) {
+		if (err)
+			ksft_exit_fail_msg("Split PMD-mapped pagecache folio to order %d failed\n", order);
+		ksft_test_result_pass("Split PMD-mapped pagecache folio to order %d passed\n", order);
+	} else {
+		if (err)
+			ksft_exit_fail_msg("Split PMD-mapped pagecache folio to order %d at in-folio offset %d failed\n", order, offset);
+		ksft_test_result_pass("Split PMD-mapped pagecache folio to order %d at in-folio offset %d passed\n", order, offset);
+	}
 }
 
 int main(int argc, char **argv)
@@ -506,6 +519,7 @@ int main(int argc, char **argv)
 	char fs_loc_template[] = "/tmp/thp_fs_XXXXXX";
 	const char *fs_loc;
 	bool created_tmp;
+	int offset;
 
 	ksft_print_header();
 
@@ -517,7 +531,7 @@ int main(int argc, char **argv)
 	if (argc > 1)
 		optional_xfs_path = argv[1];
 
-	ksft_set_plan(1+8+1+9+9);
+	ksft_set_plan(1+8+1+9+9+8*4+2);
 
 	pagesize = getpagesize();
 	pageshift = ffs(pagesize) - 1;
@@ -540,7 +554,13 @@ int main(int argc, char **argv)
 	created_tmp = prepare_thp_fs(optional_xfs_path, fs_loc_template,
 			&fs_loc);
 	for (i = 8; i >= 0; i--)
-		split_thp_in_pagecache_to_order(fd_size, i, fs_loc);
+		split_thp_in_pagecache_to_order_at(fd_size, fs_loc, i, -1);
+
+	for (i = 0; i < 9; i++)
+		for (offset = 0;
+		     offset < pmd_pagesize / pagesize;
+		     offset += MAX(pmd_pagesize / pagesize / 4, 1 << i))
+			split_thp_in_pagecache_to_order_at(fd_size, fs_loc, i, offset);
 	cleanup_thp_fs(fs_loc, created_tmp);
 
 	ksft_finished();
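
The `8*4+2` term added to ksft_set_plan() can be checked by replaying the new offset loop on its own. The sketch below is a standalone userspace illustration, not part of the selftest: `count_offset_tests()` is a hypothetical helper, `MAX` is defined locally, and the 2 MiB PMD size with 4 KiB base pages used in the note afterwards is an assumption.

```c
#include <assert.h>
#include <stdio.h>

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/*
 * Hypothetical helper: count the (order, offset) pairs visited by the nested
 * loop in main(), for a given PMD size and base page size in bytes.
 */
int count_offset_tests(unsigned long pmd_pagesize, unsigned long pagesize)
{
	unsigned long nr_pages = pmd_pagesize / pagesize;
	unsigned long offset;
	int i, count = 0;

	for (i = 0; i < 9; i++)
		for (offset = 0;
		     offset < nr_pages;
		     offset += MAX(nr_pages / 4, 1UL << i))
			count++;
	return count;
}
```

With 512 base pages per PMD, orders 0 to 7 step by 128 pages and visit offsets 0, 128, 256 and 384 (four tests each), while order 8 steps by 256 pages and visits offsets 0 and 256 (two tests), so `count_offset_tests(2UL << 20, 4096)` returns 34 = 8*4+2.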