From patchwork Wed Feb 26 21:00:24 2025
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 868674
From: Zi Yan
To: linux-mm@kvack.org, Andrew Morton, "Kirill A. Shutemov",
	"Matthew Wilcox (Oracle)"
Cc: Ryan Roberts, Hugh Dickins, David Hildenbrand, Yang Shi, Miaohe Lin,
	Kefeng Wang, Yu Zhao, John Hubbard, Baolin Wang,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	Zi Yan, Kairui Song
Subject: [PATCH v9 1/8] xarray: add xas_try_split() to split a multi-index entry
Date: Wed, 26 Feb 2025 16:00:24 -0500
Message-ID: <20250226210032.2044041-2-ziy@nvidia.com>
In-Reply-To: <20250226210032.2044041-1-ziy@nvidia.com>
References: <20250226210032.2044041-1-ziy@nvidia.com>
A preparation patch for non-uniform folio split, which always splits a
folio in half iteratively, and for minimal xarray entry split.

Currently, xas_split_alloc() and xas_split() always split all slots from
a multi-index entry. They cost the same number of xa_nodes as the number
of to-be-split slots. For example, to split an order-9 entry, which
takes 2^(9-6)=8 slots, assuming XA_CHUNK_SHIFT is 6 (!CONFIG_BASE_SMALL),
8 xa_nodes are needed. Instead, xas_try_split() is intended to be used
iteratively: split the order-9 entry into 2 order-8 entries, then split
one order-8 entry, based on the given index, into 2 order-7 entries,
..., and finally split one order-1 entry into 2 order-0 entries. When
splitting the order-6 entry and a new xa_node is needed, xas_try_split()
will try to allocate one if possible. As a result, xas_try_split() would
only need one xa_node instead of 8.

When a new xa_node is needed during the split, xas_try_split() can try
to allocate one, but no more. -ENOMEM will be returned if a node cannot
be allocated. -EINVAL will be returned if a sibling-node split or a
cascading split is required, where two or more new nodes are needed;
these are not supported by xas_try_split().

xas_split_alloc() and xas_split() split an order-9 to order-0:

         ---------------------------------
         |   |   |   |   |   |   |   |   |
         | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
         |   |   |   |   |   |   |   |   |
         ---------------------------------
           |   |                   |   |
     -------   ---               ---   -------
     |           |      ...      |           |
     V           V               V           V
----------- -----------     ----------- -----------
| xa_node | | xa_node | ... | xa_node | | xa_node |
----------- -----------     ----------- -----------

xas_try_split() splits an order-9 to order-0:

         ---------------------------------
         |   |   |   |   |   |   |   |   |
         | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 |
         |   |   |   |   |   |   |   |   |
         ---------------------------------
           |
           |
           V
       -----------
       | xa_node |
       -----------

Signed-off-by: Zi Yan
Cc: Baolin Wang
Cc: David Hildenbrand
Cc: Hugh Dickins
Cc: John Hubbard
Cc: Kefeng Wang
Cc: Kirill A. Shutemov
Cc: Miaohe Lin
Cc: Matthew Wilcox
Cc: Ryan Roberts
Cc: Yang Shi
Cc: Yu Zhao
Cc: Zi Yan
Cc: Kairui Song
---
 Documentation/core-api/xarray.rst |  14 +++-
 include/linux/xarray.h            |   6 ++
 lib/test_xarray.c                 |  52 ++++++++++++
 lib/xarray.c                      | 131 +++++++++++++++++++++++++++---
 tools/testing/radix-tree/Makefile |   1 +
 5 files changed, 191 insertions(+), 13 deletions(-)

diff --git a/Documentation/core-api/xarray.rst b/Documentation/core-api/xarray.rst
index f6a3eef4fe7f..c6c91cbd0c3c 100644
--- a/Documentation/core-api/xarray.rst
+++ b/Documentation/core-api/xarray.rst
@@ -489,7 +489,19 @@ Storing ``NULL`` into any index of a multi-index entry will set the entry
 at every index to ``NULL`` and dissolve the tie. A multi-index entry can
 be split into entries occupying smaller ranges by calling
 xas_split_alloc() without the xa_lock held, followed by taking the lock
-and calling xas_split().
+and calling xas_split() or calling xas_try_split() with xa_lock. The
+difference between xas_split_alloc()+xas_split() and xas_try_split() is
+that xas_split_alloc() + xas_split() split the entry from the original
+order to the new order in one shot uniformly, whereas xas_try_split()
+iteratively splits the entry containing the index non-uniformly.
+For example, to split an order-9 entry, which takes 2^(9-6)=8 slots,
+assuming ``XA_CHUNK_SHIFT`` is 6, xas_split_alloc() + xas_split() need
+8 xa_nodes. xas_try_split() splits the order-9 entry into
+2 order-8 entries, then splits one order-8 entry, based on the given
+index, into 2 order-7 entries, ..., and splits one order-1 entry into
+2 order-0 entries. When splitting the order-6 entry and a new xa_node
+is needed, xas_try_split() will try to allocate one if possible. As a
+result, xas_try_split() would only need 1 xa_node instead of 8.
 
 Functions and structures
 ========================
diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index 0b618ec04115..4010195201c9 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -1555,6 +1555,7 @@ int xa_get_order(struct xarray *, unsigned long index);
 int xas_get_order(struct xa_state *xas);
 void xas_split(struct xa_state *, void *entry, unsigned int order);
 void xas_split_alloc(struct xa_state *, void *entry, unsigned int order, gfp_t);
+void xas_try_split(struct xa_state *xas, void *entry, unsigned int order);
 #else
 static inline int xa_get_order(struct xarray *xa, unsigned long index)
 {
@@ -1576,6 +1577,11 @@ static inline void xas_split_alloc(struct xa_state *xas, void *entry,
 		unsigned int order, gfp_t gfp)
 {
 }
+
+static inline void xas_try_split(struct xa_state *xas, void *entry,
+		unsigned int order)
+{
+}
 #endif
 
 /**
diff --git a/lib/test_xarray.c b/lib/test_xarray.c
index 0e865bab4a10..080a39d22e73 100644
--- a/lib/test_xarray.c
+++ b/lib/test_xarray.c
@@ -1858,6 +1858,54 @@ static void check_split_1(struct xarray *xa, unsigned long index,
 	xa_destroy(xa);
 }
 
+static void check_split_2(struct xarray *xa, unsigned long index,
+		unsigned int order, unsigned int new_order)
+{
+	XA_STATE_ORDER(xas, xa, index, new_order);
+	unsigned int i, found;
+	void *entry;
+
+	xa_store_order(xa, index, order, xa, GFP_KERNEL);
+	xa_set_mark(xa, index, XA_MARK_1);
+
+	/* allocate a node for xas_try_split() */
+	xas_set_err(&xas, -ENOMEM);
+	XA_BUG_ON(xa, !xas_nomem(&xas, GFP_KERNEL));
+
+	xas_lock(&xas);
+	xas_try_split(&xas, xa, order);
+	if (((new_order / XA_CHUNK_SHIFT) < (order / XA_CHUNK_SHIFT)) &&
+	    new_order < order - 1) {
+		XA_BUG_ON(xa, !xas_error(&xas) || xas_error(&xas) != -EINVAL);
+		xas_unlock(&xas);
+		goto out;
+	}
+	for (i = 0; i < (1 << order); i += (1 << new_order))
+		__xa_store(xa, index + i, xa_mk_index(index + i), 0);
+	xas_unlock(&xas);
+
+	for (i = 0; i < (1 << order); i++) {
+		unsigned int val = index + (i & ~((1 << new_order) - 1));
+		XA_BUG_ON(xa, xa_load(xa, index + i) != xa_mk_index(val));
+	}
+
+	xa_set_mark(xa, index, XA_MARK_0);
+	XA_BUG_ON(xa, !xa_get_mark(xa, index, XA_MARK_0));
+
+	xas_set_order(&xas, index, 0);
+	found = 0;
+	rcu_read_lock();
+	xas_for_each_marked(&xas, entry, ULONG_MAX, XA_MARK_1) {
+		found++;
+		XA_BUG_ON(xa, xa_is_internal(entry));
+	}
+	rcu_read_unlock();
+	XA_BUG_ON(xa, found != 1 << (order - new_order));
+out:
+	xas_destroy(&xas);
+	xa_destroy(xa);
+}
+
 static noinline void check_split(struct xarray *xa)
 {
 	unsigned int order, new_order;
@@ -1869,6 +1917,10 @@ static noinline void check_split(struct xarray *xa)
 			check_split_1(xa, 0, order, new_order);
 			check_split_1(xa, 1UL << order, order, new_order);
 			check_split_1(xa, 3UL << order, order, new_order);
+
+			check_split_2(xa, 0, order, new_order);
+			check_split_2(xa, 1UL << order, order, new_order);
+			check_split_2(xa, 3UL << order, order, new_order);
 		}
 	}
 }
diff --git a/lib/xarray.c b/lib/xarray.c
index 116e9286c64e..bc197c96d171 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -1007,6 +1007,26 @@ static void node_set_marks(struct xa_node *node, unsigned int offset,
 	}
 }
 
+static void __xas_init_node_for_split(struct xa_state *xas,
+		struct xa_node *node, void *entry)
+{
+	unsigned int i;
+	void *sibling = NULL;
+	unsigned int mask = xas->xa_sibs;
+
+	if (!node)
+		return;
+	node->array = xas->xa;
+	for (i = 0; i < XA_CHUNK_SIZE; i++) {
+		if ((i & mask) == 0) {
+			RCU_INIT_POINTER(node->slots[i], entry);
+			sibling = xa_mk_sibling(i);
+		} else {
+			RCU_INIT_POINTER(node->slots[i], sibling);
+		}
+	}
+}
+
 /**
  * xas_split_alloc() - Allocate memory for splitting an entry.
  * @xas: XArray operation state.
@@ -1025,7 +1045,6 @@ void xas_split_alloc(struct xa_state *xas, void *entry, unsigned int order,
 		gfp_t gfp)
 {
 	unsigned int sibs = (1 << (order % XA_CHUNK_SHIFT)) - 1;
-	unsigned int mask = xas->xa_sibs;
 
 	/* XXX: no support for splitting really large entries yet */
 	if (WARN_ON(xas->xa_shift + 2 * XA_CHUNK_SHIFT <= order))
@@ -1034,22 +1053,13 @@ void xas_split_alloc(struct xa_state *xas, void *entry, unsigned int order,
 		return;
 
 	do {
-		unsigned int i;
-		void *sibling = NULL;
 		struct xa_node *node;
 
 		node = kmem_cache_alloc_lru(radix_tree_node_cachep, xas->xa_lru, gfp);
 		if (!node)
 			goto nomem;
-		node->array = xas->xa;
-		for (i = 0; i < XA_CHUNK_SIZE; i++) {
-			if ((i & mask) == 0) {
-				RCU_INIT_POINTER(node->slots[i], entry);
-				sibling = xa_mk_sibling(i);
-			} else {
-				RCU_INIT_POINTER(node->slots[i], sibling);
-			}
-		}
+
+		__xas_init_node_for_split(xas, node, entry);
 		RCU_INIT_POINTER(node->parent, xas->xa_alloc);
 		xas->xa_alloc = node;
 	} while (sibs-- > 0);
@@ -1122,6 +1132,103 @@ void xas_split(struct xa_state *xas, void *entry, unsigned int order)
 	xas_update(xas, node);
 }
 EXPORT_SYMBOL_GPL(xas_split);
+
+/**
+ * xas_try_split() - Try to split a multi-index entry.
+ * @xas: XArray operation state.
+ * @entry: New entry to store in the array.
+ * @order: Current entry order.
+ *
+ * The size of the new entries is set in @xas. The value in @entry is
+ * copied to all the replacement entries. If and only if one new xa_node is
+ * needed, the function will use GFP_NOWAIT to get one if xas->xa_alloc is
+ * NULL. If more new xa_nodes are needed, the function returns -EINVAL.
+ *
+ * Context: Any context. The caller should hold the xa_lock.
+ */
+void xas_try_split(struct xa_state *xas, void *entry, unsigned int order)
+{
+	unsigned int sibs = (1 << (order % XA_CHUNK_SHIFT)) - 1;
+	unsigned int offset, marks;
+	struct xa_node *node;
+	void *curr = xas_load(xas);
+	int values = 0;
+	gfp_t gfp = GFP_NOWAIT;
+
+	node = xas->xa_node;
+	if (xas_top(node))
+		return;
+
+	if (xas->xa->xa_flags & XA_FLAGS_ACCOUNT)
+		gfp |= __GFP_ACCOUNT;
+
+	marks = node_get_marks(node, xas->xa_offset);
+
+	offset = xas->xa_offset + sibs;
+
+	if (xas->xa_shift < node->shift) {
+		struct xa_node *child = xas->xa_alloc;
+		unsigned int expected_sibs =
+			(1 << ((order - 1) % XA_CHUNK_SHIFT)) - 1;
+
+		/*
+		 * No support for splitting sibling entries
+		 * (horizontally) or cascade split (vertically), which
+		 * requires two or more new xa_nodes, since if one
+		 * xa_node allocation fails, it is hard to free the
+		 * prior allocations.
+		 */
+		if (sibs || xas->xa_sibs != expected_sibs) {
+			xas_destroy(xas);
+			xas_set_err(xas, -EINVAL);
+			return;
+		}
+
+		if (!child) {
+			child = kmem_cache_alloc_lru(radix_tree_node_cachep,
+					xas->xa_lru, gfp);
+			if (!child) {
+				xas_destroy(xas);
+				xas_set_err(xas, -ENOMEM);
+				return;
+			}
+			RCU_INIT_POINTER(child->parent, xas->xa_alloc);
+		}
+		__xas_init_node_for_split(xas, child, entry);
+
+		xas->xa_alloc = rcu_dereference_raw(child->parent);
+		child->shift = node->shift - XA_CHUNK_SHIFT;
+		child->offset = offset;
+		child->count = XA_CHUNK_SIZE;
+		child->nr_values = xa_is_value(entry) ?
+				XA_CHUNK_SIZE : 0;
+		RCU_INIT_POINTER(child->parent, node);
+		node_set_marks(node, offset, child, xas->xa_sibs,
+				marks);
+		rcu_assign_pointer(node->slots[offset],
+				xa_mk_node(child));
+		if (xa_is_value(curr))
+			values--;
+		xas_update(xas, child);
+
+	} else {
+		do {
+			unsigned int canon = offset - xas->xa_sibs;
+
+			node_set_marks(node, canon, NULL, 0, marks);
+			rcu_assign_pointer(node->slots[canon], entry);
+			while (offset > canon)
+				rcu_assign_pointer(node->slots[offset--],
+						xa_mk_sibling(canon));
+			values += (xa_is_value(entry) - xa_is_value(curr)) *
+					(xas->xa_sibs + 1);
+		} while (offset-- > xas->xa_offset);
+	}
+
+	node->nr_values += values;
+	xas_update(xas, node);
+}
+EXPORT_SYMBOL_GPL(xas_try_split);
 #endif
 
 /**
diff --git a/tools/testing/radix-tree/Makefile b/tools/testing/radix-tree/Makefile
index 8b3591a51e1f..b2a6660bbd92 100644
--- a/tools/testing/radix-tree/Makefile
+++ b/tools/testing/radix-tree/Makefile
@@ -14,6 +14,7 @@ include ../shared/shared.mk
 
 main: $(OFILES)
 
+xarray.o: ../../../lib/test_xarray.c
 idr-test.o: ../../../lib/test_ida.c
 idr-test: idr-test.o $(CORE_OFILES)
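As a usage illustration (not part of the patch): a caller wanting the
minimal-allocation behavior described in the commit message would drive
xas_try_split() in a loop, one order per step, preallocating at most one
xa_node per step. The helper below is hypothetical, but it only uses
xas_try_split() as added here plus existing XArray primitives
(xas_set_order(), xas_nomem(), xas_error()):

/*
 * Hypothetical sketch: split the multi-index entry containing @index,
 * currently of order @order and holding @entry, down to @new_order one
 * order at a time, so at most one xa_node is allocated per step.
 */
static int split_entry_to_order(struct xarray *xa, unsigned long index,
		void *entry, unsigned int order, unsigned int new_order)
{
	XA_STATE_ORDER(xas, xa, index, order);

	while (order > new_order) {
		/* the next step produces entries one order smaller */
		xas_set_order(&xas, index, order - 1);

		xas_lock(&xas);
		xas_try_split(&xas, entry, order);
		xas_unlock(&xas);

		/* -ENOMEM: allocate the single xa_node, replay the step */
		if (xas_error(&xas) == -ENOMEM &&
		    xas_nomem(&xas, GFP_KERNEL))
			continue;
		if (xas_error(&xas))
			return xas_error(&xas);

		order--;
	}
	xas_destroy(&xas);	/* free any unused preallocated node */
	return 0;
}

Because each step only halves the entry containing @index, the loop never
triggers the sibling or cascade cases that make xas_try_split() return
-EINVAL.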
From patchwork Wed Feb 26 21:00:26 2025
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 868673
From: Zi Yan
To: linux-mm@kvack.org, Andrew Morton, "Kirill A. Shutemov",
	"Matthew Wilcox (Oracle)"
Cc: Ryan Roberts, Hugh Dickins, David Hildenbrand, Yang Shi, Miaohe Lin,
	Kefeng Wang, Yu Zhao, John Hubbard, Baolin Wang,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	Zi Yan, Kairui Song
Subject: [PATCH v9 3/8] mm/huge_memory: move folio split common code to __folio_split()
Date: Wed, 26 Feb 2025 16:00:26 -0500
Message-ID: <20250226210032.2044041-4-ziy@nvidia.com>
In-Reply-To: <20250226210032.2044041-1-ziy@nvidia.com>
References: <20250226210032.2044041-1-ziy@nvidia.com>
This is a preparation patch for folio_split(). In the upcoming patch,
folio_split() will share the folio unmapping and remapping code with
split_huge_page_to_list_to_order(), so move that code to a common
function __folio_split() first.

Signed-off-by: Zi Yan
Cc: Baolin Wang
Cc: David Hildenbrand
Cc: Hugh Dickins
Cc: John Hubbard
Cc: Kefeng Wang
Cc: Kirill A. Shutemov
Cc: Matthew Wilcox
Cc: Miaohe Lin
Cc: Ryan Roberts
Cc: Yang Shi
Cc: Yu Zhao
Cc: Kairui Song
---
 mm/huge_memory.c | 107 +++++++++++++++++++++++++----------------------
 1 file changed, 57 insertions(+), 50 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index b0105ba6db94..4c79f54566d4 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3847,57 +3847,9 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	return ret;
 }
 
-/*
- * This function splits a large folio into smaller folios of order @new_order.
- * @page can point to any page of the large folio to split. The split operation
- * does not change the position of @page.
- *
- * Prerequisites:
- *
- * 1) The caller must hold a reference on the @page's owning folio, also known
- *    as the large folio.
- *
- * 2) The large folio must be locked.
- *
- * 3) The folio must not be pinned. Any unexpected folio references, including
- *    GUP pins, will result in the folio not getting split; instead, the caller
- *    will receive an -EAGAIN.
- *
- * 4) @new_order > 1, usually. Splitting to order-1 anonymous folios is not
- *    supported for non-file-backed folios, because folio->_deferred_list, which
- *    is used by partially mapped folios, is stored in subpage 2, but an order-1
- *    folio only has subpages 0 and 1. File-backed order-1 folios are supported,
- *    since they do not use _deferred_list.
- *
- * After splitting, the caller's folio reference will be transferred to @page,
- * resulting in a raised refcount of @page after this call. The other pages may
- * be freed if they are not mapped.
- *
- * If @list is null, tail pages will be added to LRU list, otherwise, to @list.
- *
- * Pages in @new_order will inherit the mapping, flags, and so on from the
- * huge page.
- *
- * Returns 0 if the huge page was split successfully.
- *
- * Returns -EAGAIN if the folio has unexpected reference (e.g., GUP) or if
- * the folio was concurrently removed from the page cache.
- *
- * Returns -EBUSY when trying to split the huge zeropage, if the folio is
- * under writeback, if fs-specific folio metadata cannot currently be
- * released, or if some unexpected race happened (e.g., anon VMA disappeared,
- * truncation).
- *
- * Callers should ensure that the order respects the address space mapping
- * min-order if one is set for non-anonymous folios.
- *
- * Returns -EINVAL when trying to split to an order that is incompatible
- * with the folio. Splitting to order 0 is compatible with all folios.
- */
-int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
-		unsigned int new_order)
+static int __folio_split(struct folio *folio, unsigned int new_order,
+		struct page *page, struct list_head *list)
 {
-	struct folio *folio = page_folio(page);
 	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
 	/* reset xarray order to new order after split */
 	XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
@@ -4107,6 +4059,61 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 	return ret;
 }
 
+/*
+ * This function splits a large folio into smaller folios of order @new_order.
+ * @page can point to any page of the large folio to split. The split operation
+ * does not change the position of @page.
+ *
+ * Prerequisites:
+ *
+ * 1) The caller must hold a reference on the @page's owning folio, also known
+ *    as the large folio.
+ *
+ * 2) The large folio must be locked.
+ *
+ * 3) The folio must not be pinned. Any unexpected folio references, including
+ *    GUP pins, will result in the folio not getting split; instead, the caller
+ *    will receive an -EAGAIN.
+ *
+ * 4) @new_order > 1, usually. Splitting to order-1 anonymous folios is not
+ *    supported for non-file-backed folios, because folio->_deferred_list, which
+ *    is used by partially mapped folios, is stored in subpage 2, but an order-1
+ *    folio only has subpages 0 and 1. File-backed order-1 folios are supported,
+ *    since they do not use _deferred_list.
+ *
+ * After splitting, the caller's folio reference will be transferred to @page,
+ * resulting in a raised refcount of @page after this call. The other pages may
+ * be freed if they are not mapped.
+ *
+ * If @list is null, tail pages will be added to LRU list, otherwise, to @list.
+ *
+ * Pages in @new_order will inherit the mapping, flags, and so on from the
+ * huge page.
+ *
+ * Returns 0 if the huge page was split successfully.
+ *
+ * Returns -EAGAIN if the folio has unexpected reference (e.g., GUP) or if
+ * the folio was concurrently removed from the page cache.
+ *
+ * Returns -EBUSY when trying to split the huge zeropage, if the folio is
+ * under writeback, if fs-specific folio metadata cannot currently be
+ * released, or if some unexpected race happened (e.g., anon VMA disappeared,
+ * truncation).
+ *
+ * Callers should ensure that the order respects the address space mapping
+ * min-order if one is set for non-anonymous folios.
+ *
+ * Returns -EINVAL when trying to split to an order that is incompatible
+ * with the folio. Splitting to order 0 is compatible with all folios.
+ */
+int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
+		unsigned int new_order)
+{
+	struct folio *folio = page_folio(page);
+
+	return __folio_split(folio, new_order, page, list);
+}
+
 int min_order_for_split(struct folio *folio)
 {
 	if (folio_test_anon(folio))
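As an illustration (not part of the patch): split_huge_page_to_list_to_order()
keeps its existing contract after this refactor, so a caller that honors the
prerequisites documented in the comment above still looks roughly like the
hypothetical sketch below:

/*
 * Hypothetical caller sketch, based on the documented prerequisites;
 * not code from this series.
 */
static int split_locked_folio_example(struct folio *folio,
		unsigned int new_order)
{
	int ret;

	folio_get(folio);	/* 1) hold a reference on the folio */
	folio_lock(folio);	/* 2) the large folio must be locked */

	/* 3) a pinned folio makes this return -EAGAIN */
	ret = split_huge_page_to_list_to_order(&folio->page, NULL, new_order);

	/*
	 * The lock and the caller's reference stay with the folio that
	 * contains @page; since @page is the head page here, unlocking
	 * and putting @folio remains correct whether or not the split
	 * succeeded.
	 */
	folio_unlock(folio);
	folio_put(folio);
	return ret;
}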
From patchwork Wed Feb 26 21:00:28 2025
X-Patchwork-Submitter: Zi Yan
X-Patchwork-Id: 868672
From: Zi Yan
To: linux-mm@kvack.org, Andrew Morton, "Kirill A. Shutemov",
	"Matthew Wilcox (Oracle)"
Cc: Ryan Roberts, Hugh Dickins, David Hildenbrand, Yang Shi, Miaohe Lin,
	Kefeng Wang, Yu Zhao, John Hubbard, Baolin Wang,
	linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
	Zi Yan, Kairui Song
Subject: [PATCH v9 5/8] mm/huge_memory: remove the old, unused __split_huge_page()
Date: Wed, 26 Feb 2025 16:00:28 -0500
Message-ID: <20250226210032.2044041-6-ziy@nvidia.com>
In-Reply-To: <20250226210032.2044041-1-ziy@nvidia.com>
References: <20250226210032.2044041-1-ziy@nvidia.com>
Now that split_huge_page_to_list_to_order() uses the new backend split
code in __folio_split_without_mapping(), the old __split_huge_page() and
__split_huge_page_tail() can be removed.

Signed-off-by: Zi Yan
Cc: Baolin Wang
Cc: David Hildenbrand
Cc: Hugh Dickins
Cc: John Hubbard
Cc: Kefeng Wang
Cc: Kirill A. Shutemov
Cc: Matthew Wilcox
Cc: Miaohe Lin
Cc: Ryan Roberts
Cc: Yang Shi
Cc: Yu Zhao
Cc: Kairui Song
---
 mm/huge_memory.c | 207 -----------------------------------------------
 1 file changed, 207 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0e45937c0d91..e7e50b2b23f6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3286,213 +3286,6 @@ static void lru_add_page_tail(struct folio *folio, struct page *tail,
 	}
 }
 
-static void __split_huge_page_tail(struct folio *folio, int tail,
-		struct lruvec *lruvec, struct list_head *list,
-		unsigned int new_order)
-{
-	struct page *head = &folio->page;
-	struct page *page_tail = head + tail;
-	/*
-	 * Careful: new_folio is not a "real" folio before we cleared PageTail.
-	 * Don't pass it around before clear_compound_head().
-	 */
-	struct folio *new_folio = (struct folio *)page_tail;
-
-	VM_BUG_ON_PAGE(atomic_read(&page_tail->_mapcount) != -1, page_tail);
-
-	/*
-	 * Clone page flags before unfreezing refcount.
-	 *
-	 * After successful get_page_unless_zero() might follow flags change,
-	 * for example lock_page() which set PG_waiters.
-	 *
-	 * Note that for mapped sub-pages of an anonymous THP,
-	 * PG_anon_exclusive has been cleared in unmap_folio() and is stored in
-	 * the migration entry instead from where remap_page() will restore it.
-	 * We can still have PG_anon_exclusive set on effectively unmapped and
-	 * unreferenced sub-pages of an anonymous THP: we can simply drop
-	 * PG_anon_exclusive (-> PG_mappedtodisk) for these here.
-	 */
-	page_tail->flags &= ~PAGE_FLAGS_CHECK_AT_PREP;
-	page_tail->flags |= (head->flags &
-			((1L << PG_referenced) |
-			 (1L << PG_swapbacked) |
-			 (1L << PG_swapcache) |
-			 (1L << PG_mlocked) |
-			 (1L << PG_uptodate) |
-			 (1L << PG_active) |
-			 (1L << PG_workingset) |
-			 (1L << PG_locked) |
-			 (1L << PG_unevictable) |
-#ifdef CONFIG_ARCH_USES_PG_ARCH_2
-			 (1L << PG_arch_2) |
-#endif
-#ifdef CONFIG_ARCH_USES_PG_ARCH_3
-			 (1L << PG_arch_3) |
-#endif
-			 (1L << PG_dirty) |
-			 LRU_GEN_MASK | LRU_REFS_MASK));
-
-	/* ->mapping in first and second tail page is replaced by other uses */
-	VM_BUG_ON_PAGE(tail > 2 && page_tail->mapping != TAIL_MAPPING,
-			page_tail);
-	new_folio->mapping = folio->mapping;
-	new_folio->index = folio->index + tail;
-
-	/*
-	 * page->private should not be set in tail pages. Fix up and warn once
-	 * if private is unexpectedly set.
-	 */
-	if (unlikely(page_tail->private)) {
-		VM_WARN_ON_ONCE_PAGE(true, page_tail);
-		page_tail->private = 0;
-	}
-	if (folio_test_swapcache(folio))
-		new_folio->swap.val = folio->swap.val + tail;
-
-	/* Page flags must be visible before we make the page non-compound. */
-	smp_wmb();
-
-	/*
-	 * Clear PageTail before unfreezing page refcount.
-	 *
-	 * After successful get_page_unless_zero() might follow put_page()
-	 * which needs correct compound_head().
-	 */
-	clear_compound_head(page_tail);
-	if (new_order) {
-		prep_compound_page(page_tail, new_order);
-		folio_set_large_rmappable(new_folio);
-	}
-
-	/* Finally unfreeze refcount. Additional reference from page cache. */
-	page_ref_unfreeze(page_tail,
-		1 + ((!folio_test_anon(folio) || folio_test_swapcache(folio)) ?
-			     folio_nr_pages(new_folio) : 0));
-
-	if (folio_test_young(folio))
-		folio_set_young(new_folio);
-	if (folio_test_idle(folio))
-		folio_set_idle(new_folio);
-
-	folio_xchg_last_cpupid(new_folio, folio_last_cpupid(folio));
-
-	/*
-	 * always add to the tail because some iterators expect new
-	 * pages to show after the currently processed elements - e.g.
-	 * migrate_pages
-	 */
-	lru_add_page_tail(folio, page_tail, lruvec, list);
-}
-
-static void __split_huge_page(struct page *page, struct list_head *list,
-		pgoff_t end, unsigned int new_order)
-{
-	struct folio *folio = page_folio(page);
-	struct page *head = &folio->page;
-	struct lruvec *lruvec;
-	struct address_space *swap_cache = NULL;
-	unsigned long offset = 0;
-	int i, nr_dropped = 0;
-	unsigned int new_nr = 1 << new_order;
-	int order = folio_order(folio);
-	unsigned int nr = 1 << order;
-
-	/* complete memcg works before add pages to LRU */
-	split_page_memcg(head, order, new_order);
-
-	if (folio_test_anon(folio) && folio_test_swapcache(folio)) {
-		offset = swap_cache_index(folio->swap);
-		swap_cache = swap_address_space(folio->swap);
-		xa_lock(&swap_cache->i_pages);
-	}
-
-	/* lock lru list/PageCompound, ref frozen by page_ref_freeze */
-	lruvec = folio_lruvec_lock(folio);
-
-	folio_clear_has_hwpoisoned(folio);
-
-	for (i = nr - new_nr; i >= new_nr; i -= new_nr) {
-		struct folio *tail;
-		__split_huge_page_tail(folio, i, lruvec, list, new_order);
-		tail = page_folio(head + i);
-		/* Some pages can be beyond EOF: drop them from page cache */
-		if (tail->index >= end) {
-			if (shmem_mapping(folio->mapping))
-				nr_dropped += new_nr;
-			else if (folio_test_clear_dirty(tail))
-				folio_account_cleaned(tail,
-					inode_to_wb(folio->mapping->host));
-			__filemap_remove_folio(tail, NULL);
-			folio_put(tail);
-		} else if (!folio_test_anon(folio)) {
-			__xa_store(&folio->mapping->i_pages, tail->index,
-					tail, 0);
-		} else if (swap_cache) {
-			__xa_store(&swap_cache->i_pages, offset + i,
-					tail, 0);
-		}
-	}
-
-	if (!new_order)
-		ClearPageCompound(head);
-	else {
-		struct folio *new_folio = (struct folio *)head;
-
-		folio_set_order(new_folio, new_order);
-	}
-	unlock_page_lruvec(lruvec);
-	/* Caller disabled irqs, so they are still disabled here */
-
-	split_page_owner(head, order, new_order);
-	pgalloc_tag_split(folio, order, new_order);
-
-	/* See comment in __split_huge_page_tail() */
-	if (folio_test_anon(folio)) {
-		/* Additional pin to swap cache */
-		if (folio_test_swapcache(folio)) {
-			folio_ref_add(folio, 1 + new_nr);
-			xa_unlock(&swap_cache->i_pages);
-		} else {
-			folio_ref_inc(folio);
-		}
-	} else {
-		/* Additional pin to page cache */
-		folio_ref_add(folio, 1 + new_nr);
-		xa_unlock(&folio->mapping->i_pages);
-	}
-	local_irq_enable();
-
-	if (nr_dropped)
-		shmem_uncharge(folio->mapping->host, nr_dropped);
-	remap_page(folio, nr, PageAnon(head) ? RMP_USE_SHARED_ZEROPAGE : 0);
-
-	/*
-	 * set page to its compound_head when split to non order-0 pages, so
-	 * we can skip unlocking it below, since PG_locked is transferred to
-	 * the compound_head of the page and the caller will unlock it.
-	 */
-	if (new_order)
-		page = compound_head(page);
-
-	for (i = 0; i < nr; i += new_nr) {
-		struct page *subpage = head + i;
-		struct folio *new_folio = page_folio(subpage);
-		if (subpage == page)
-			continue;
-		folio_unlock(new_folio);
-
-		/*
-		 * Subpages may be freed if there wasn't any mapping
-		 * like if add_to_swap() is running on a lru page that
-		 * had its mapping zapped. And freeing these pages
-		 * requires taking the lru_lock so we do the put_page
-		 * of the tail pages after the split is complete.
-		 */
-		free_page_and_swap_cache(subpage);
-	}
-}
-
 /* Racy check whether the huge page can be split */
 bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins)
 {
bh=dUKzKxCp18RebI50HRcVVot1Kh1u8K/QFu7z4xJIdFw=; b=BX1XHPOcxJvcPh/BULYh3UNq7Nvyk7plK/xnM7SL7pQ+2sJ5g+IWE2OCBiG84IcdlCyd0kWdBMxhTl9/kNe1G6MEIJ6F8Ogvrt5vb2hmDvHc5pCUnEofoD667oAEiaopfMkcRl5WffqyQsf3RPn4fpF7F/ouyDis+yC3CYpDDCS0z+BV/o3hWkBghuoPdw41R1PViHCC+J6RrsjmUncTLoKKRezBS9dbTQRF/gPPAyEmORwun6Eobt8q9no+yOHE8vK7pdR9+0ITBCZoya97u6jpjqu83giqjpCSAQCbPDvxWna//wMKsXBXyaMbxEUBmuB8SwfUT1TnvltFQKXGkQ== Authentication-Results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; Received: from DS7PR12MB9473.namprd12.prod.outlook.com (2603:10b6:8:252::5) by SA1PR12MB5614.namprd12.prod.outlook.com (2603:10b6:806:228::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.8466.19; Wed, 26 Feb 2025 21:00:50 +0000 Received: from DS7PR12MB9473.namprd12.prod.outlook.com ([fe80::5189:ecec:d84a:133a]) by DS7PR12MB9473.namprd12.prod.outlook.com ([fe80::5189:ecec:d84a:133a%5]) with mapi id 15.20.8466.016; Wed, 26 Feb 2025 21:00:49 +0000 From: Zi Yan To: linux-mm@kvack.org, Andrew Morton , "Kirill A . Shutemov" , "Matthew Wilcox (Oracle)" Cc: Ryan Roberts , Hugh Dickins , David Hildenbrand , Yang Shi , Miaohe Lin , Kefeng Wang , Yu Zhao , John Hubbard , Baolin Wang , linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, Zi Yan , Kairui Song Subject: [PATCH v9 7/8] mm/truncate: use buddy allocator like folio split for truncate operation Date: Wed, 26 Feb 2025 16:00:30 -0500 Message-ID: <20250226210032.2044041-8-ziy@nvidia.com> X-Mailer: git-send-email 2.47.2 In-Reply-To: <20250226210032.2044041-1-ziy@nvidia.com> References: <20250226210032.2044041-1-ziy@nvidia.com> X-ClientProxiedBy: BLAPR05CA0029.namprd05.prod.outlook.com (2603:10b6:208:335::10) To DS7PR12MB9473.namprd12.prod.outlook.com (2603:10b6:8:252::5) Precedence: bulk X-Mailing-List: linux-kselftest@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DS7PR12MB9473:EE_|SA1PR12MB5614:EE_ X-MS-Office365-Filtering-Correlation-Id: d5d80e5b-333a-40a7-67d7-08dd56a8a5ea X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0;ARA:13230040|366016|1800799024|376014|7416014; X-Microsoft-Antispam-Message-Info: 
Instead of splitting a large folio uniformly during truncation, try to
use a buddy-allocator-like split at the start of the truncation range to
minimize the number of resulting folios, if it is supported.
try_folio_split() is introduced to use folio_split() when supported and
to fall back to uniform split otherwise.

For example, to truncate an order-4 folio [0, 1, 2, 3, 4, 5, ..., 15]
between [3, 10] (inclusive), folio_split() splits the folio into [0,1],
[2], [3], [4..7], [8..15]; [3] and [4..7] can be dropped, and [8..15] is
kept with zeros in [8..10]. Then another folio_split() is done at 10, so
[8..10] can be dropped as well.

One possible optimization is to make folio_split() split a folio based
on a given range, like [3..10] above. But that complicates folio_split(),
so it will be investigated when necessary.

Signed-off-by: Zi Yan
Cc: Baolin Wang
Cc: David Hildenbrand
Cc: Hugh Dickins
Cc: John Hubbard
Cc: Kefeng Wang
Cc: Kirill A. Shutemov
Cc: Matthew Wilcox
Cc: Miaohe Lin
Cc: Ryan Roberts
Cc: Yang Shi
Cc: Yu Zhao
Cc: Kairui Song
---
 include/linux/huge_mm.h | 36 ++++++++++++++++++++++++++++++++++++
 mm/huge_memory.c        |  6 +++---
 mm/truncate.c           | 31 ++++++++++++++++++++++++++++++-
 3 files changed, 69 insertions(+), 4 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e57e811cfd3c..e893d546a49f 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -345,6 +345,36 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		unsigned int new_order);
 int min_order_for_split(struct folio *folio);
 int split_folio_to_list(struct folio *folio, struct list_head *list);
+bool uniform_split_supported(struct folio *folio, unsigned int new_order,
+		bool warns);
+bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
+		bool warns);
+int folio_split(struct folio *folio, unsigned int new_order, struct page *page,
+		struct list_head *list);
+/*
+ * try_folio_split - try to split a @folio at @page using non uniform split.
+ * @folio: folio to be split
+ * @page: split to order-0 at the given page
+ * @list: store the after-split folios
+ *
+ * Try to split a @folio at @page using non uniform split to order-0; if
+ * non uniform split is not supported, fall back to uniform split.
+ *
+ * Return: 0 if the split is successful, non-zero otherwise.
+ */
+static inline int try_folio_split(struct folio *folio, struct page *page,
+		struct list_head *list)
+{
+	int ret = min_order_for_split(folio);
+
+	if (ret < 0)
+		return ret;
+
+	if (!non_uniform_split_supported(folio, 0, false))
+		return split_huge_page_to_list_to_order(&folio->page, list,
+				ret);
+	return folio_split(folio, ret, page, list);
+}
 static inline int split_huge_page(struct page *page)
 {
 	struct folio *folio = page_folio(page);
@@ -537,6 +567,12 @@ static inline int split_folio_to_list(struct folio *folio, struct list_head *list)
 	return 0;
 }
 
+static inline int try_folio_split(struct folio *folio, struct page *page,
+		struct list_head *list)
+{
+	return 0;
+}
+
 static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
 #define split_huge_pmd(__vma, __pmd, __address)	\
 	do { } while (0)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 6298be12e843..6ac6d468af0d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3640,7 +3640,7 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
 	return ret;
 }
 
-static bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
+bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
 		bool warns)
 {
 	if (folio_test_anon(folio)) {
@@ -3672,7 +3672,7 @@ static bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
 }
 
 /* See comments in non_uniform_split_supported() */
-static bool uniform_split_supported(struct folio *folio, unsigned int new_order,
+bool uniform_split_supported(struct folio *folio, unsigned int new_order,
 		bool warns)
 {
 	if (folio_test_anon(folio)) {
@@ -3986,7 +3986,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
  *
  * After split, folio is left locked for caller.
  */
-static int folio_split(struct folio *folio, unsigned int new_order,
+int folio_split(struct folio *folio, unsigned int new_order,
 		struct page *split_at, struct list_head *list)
 {
 	return __folio_split(folio, new_order, split_at, &folio->page, list,
diff --git a/mm/truncate.c b/mm/truncate.c
index 0395e578d946..031d0be19f42 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -192,6 +192,7 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
 {
 	loff_t pos = folio_pos(folio);
 	unsigned int offset, length;
+	struct page *split_at, *split_at2;
 
 	if (pos < start)
 		offset = start - pos;
@@ -221,8 +222,36 @@ bool truncate_inode_partial_folio(struct folio *folio, loff_t start, loff_t end)
 	folio_invalidate(folio, offset, length);
 	if (!folio_test_large(folio))
 		return true;
-	if (split_folio(folio) == 0)
+
+	split_at = folio_page(folio, PAGE_ALIGN_DOWN(offset) / PAGE_SIZE);
+	split_at2 = folio_page(folio,
+			PAGE_ALIGN_DOWN(offset + length) / PAGE_SIZE);
+
+	if (!try_folio_split(folio, split_at, NULL)) {
+		/*
+		 * try to split at offset + length to make sure folios within
+		 * the range can be dropped, especially to avoid memory waste
+		 * for shmem truncate
+		 */
+		struct folio *folio2 = page_folio(split_at2);
+
+		if (!folio_try_get(folio2))
+			goto no_split;
+
+		if (!folio_test_large(folio2))
+			goto out;
+
+		if (!folio_trylock(folio2))
+			goto out;
+
+		/* split result does not matter here */
+		try_folio_split(folio2, split_at2, NULL);
+		folio_unlock(folio2);
+out:
+		folio_put(folio2);
+no_split:
 		return true;
+	}
 	if (folio_test_dirty(folio))
 		return false;
 	truncate_inode_folio(folio->mapping, folio);
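
For readers who want to check the arithmetic in the changelog example, a
minimal userspace sketch follows. It is not part of the patch and not
kernel code; buddy_split() is a hypothetical helper, and only the index
arithmetic mirrors the non-uniform split described above:

#include <stdio.h>

/*
 * Userspace sketch (not kernel code): emulate the buddy-allocator-like
 * decomposition that a non-uniform split performs. To isolate page `at`
 * of an order-`order` folio as an order-0 folio, repeatedly halve the
 * block containing `at` and keep the other half as an intact
 * lower-order folio.
 */
static void buddy_split(unsigned int order, unsigned int at)
{
	unsigned int start = 0;	/* first page of the block holding `at` */

	while (order--) {
		unsigned int half = 1u << order;

		if (at < start + half) {
			/* `at` is in the lower half; upper half stays whole */
			printf("keep [%u..%u] (order %u)\n",
			       start + half, start + 2 * half - 1, order);
		} else {
			/* `at` is in the upper half; lower half stays whole */
			printf("keep [%u..%u] (order %u)\n",
			       start, start + half - 1, order);
			start += half;
		}
	}
	printf("isolate page [%u]\n", start);
}

int main(void)
{
	buddy_split(4, 3);	/* the order-4, split-at-3 example above */
	return 0;
}

For buddy_split(4, 3) this prints [8..15], [4..7], [0..1], [2] plus the
isolated [3]: the same five folios as in the changelog, rather than the
sixteen order-0 folios a uniform split would leave behind, which is the
point of preferring the non-uniform split when it is supported.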
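
The two split positions in truncate_inode_partial_folio() are byte
offsets within the folio rounded down to page indexes. A quick sketch of
that mapping follows; again not kernel code, with PAGE_SIZE hard-coded
to 4096 and made-up byte values chosen so they land on pages 3 and 10,
matching the changelog example:

#include <stdio.h>

/* userspace stand-ins for the kernel definitions */
#define PAGE_SIZE		4096u
#define PAGE_ALIGN_DOWN(x)	((x) & ~(PAGE_SIZE - 1))

int main(void)
{
	/* hypothetical partial truncate of bytes [13000, 43000) */
	unsigned int offset = 13000, length = 30000;

	/* page containing the first truncated byte: first split point */
	printf("split_at  = page %u\n", PAGE_ALIGN_DOWN(offset) / PAGE_SIZE);
	/* page containing offset + length: second split point */
	printf("split_at2 = page %u\n",
	       PAGE_ALIGN_DOWN(offset + length) / PAGE_SIZE);
	return 0;
}

Splitting at page 3 first lets the folios that lie fully inside the
range be dropped immediately; the second split at page 10 then lets the
pages that the first pass had to keep inside the surviving [8..15] folio
be dropped as well.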