[v5,0/8] Split a folio to any lower order folios

Message ID 20240226205534.1603748-1-zi.yan@sent.com

Zi Yan Feb. 26, 2024, 8:55 p.m. UTC
From: Zi Yan <ziy@nvidia.com>

Hi all,

File folios support any order and multi-size THP has been upstreamed[1], so
both file and anonymous folios can be >0 order. Currently, split_huge_page()
only splits a huge page to order-0 pages, but splitting to orders higher than
0 might make better use of large folios, if done properly. In addition,
Large Block Sizes support in XFS would benefit from it during truncate[2].
This patchset adds support for splitting a large folio to any lower order
folios. The patchset is on top of mm-everything-2024-02-24-02-40.
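
As a rough illustration of the arithmetic behind a uniform split, the sketch
below computes how many folios result when one folio is split to a single
lower order. The helper names pages_in_order() and folios_after_split() are
hypothetical and exist only for this example; they are not kernel functions.

```c
#include <assert.h>

/* Number of base (order-0) pages covered by one folio of the given order. */
unsigned long pages_in_order(unsigned int order)
{
	return 1UL << order;
}

/*
 * Number of folios produced when one order-`old_order` folio is split
 * uniformly into order-`new_order` folios.
 * Hypothetical helper for illustration only, not a kernel API.
 */
unsigned long folios_after_split(unsigned int old_order, unsigned int new_order)
{
	/* A split can only go to the same or a lower order. */
	assert(new_order <= old_order);
	return 1UL << (old_order - new_order);
}
```

For example, folios_after_split(9, 0) corresponds to the existing behavior
(512 order-0 pages), while folios_after_split(9, 4) gives 32 order-4 folios
covering the same 512 base pages.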

In addition to this implementation of split_huge_page_to_list_to_order(),
a possible optimization could be splitting a large folio to arbitrary
smaller folios instead of a single order. As both Hugh and Ryan pointed
out [3,5], splitting to a single order might not be optimal: an order-9 folio
might be better split into 1 order-8, 1 order-7, ..., 1 order-1, and 2 order-0
folios, depending on subsequent folio operations. This is left as future work.
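
To make the order-9 example above concrete, here is a minimal sketch of the
decomposition: the folio is halved repeatedly, one half is kept intact at each
step and the other half is split further. The function nonuniform_split() and
the MAX_DEMO_ORDER bound are assumptions made up for this example; this
patchset does not implement such a split.

```c
#include <assert.h>

#define MAX_DEMO_ORDER 16	/* arbitrary bound for this sketch */

/*
 * Fill counts[o] with the number of order-o folios obtained by halving an
 * order-`order` folio repeatedly, keeping one half intact at each step.
 * Returns the total number of base pages covered by the resulting folios.
 * Hypothetical helper for illustration only, not a kernel API.
 */
unsigned long nonuniform_split(unsigned int order,
			       unsigned int counts[MAX_DEMO_ORDER + 1])
{
	unsigned long pages = 0;
	unsigned int o;

	assert(order >= 1 && order <= MAX_DEMO_ORDER);

	for (o = 0; o <= MAX_DEMO_ORDER; o++)
		counts[o] = 0;

	/* Each halving step leaves one folio of the next lower order behind. */
	for (o = order; o > 0; o--)
		counts[o - 1] += 1;

	/* The half still being split at the end is itself an order-0 folio. */
	counts[0] += 1;

	/* Sum the base pages to confirm nothing was lost or duplicated. */
	for (o = 0; o <= MAX_DEMO_ORDER; o++)
		pages += (unsigned long)counts[o] << o;

	return pages;
}
```

For order 9 this yields one folio at each of orders 8 down to 1 plus two
order-0 folios, 10 folios in total covering 512 base pages, compared with 512
folios for a uniform split to order 0.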


Changelog
===

Since v4[4]
---
1. Picked up Matthew's order-1 folio support in the page cache patch, so
   that XFS Large Block Sizes patchset can avoid additional code churn in
   split_huge_page_to_list_to_order().
2. Dropped truncate change patch and corresponding testing code.
3. Removed thp_nr_pages() use in __split_huge_page()
   (per David Hildenbrand).
4. Fixed __split_page_owner() (per David Hildenbrand).
5. Changed unmap_folio() to only add TTU_SPLIT_HUGE_PMD if the folio is
   pmd mappable (per Ryan Roberts).
6. Moved the swapcached folio split warning up front and returned -EINVAL
   (per Ryan Roberts).

Since v3
---
1. Excluded shmem folios and pagecache folios without FS support from
   splitting to any order (per Hugh Dickins).
2. Allowed splitting anonymous large folio to any lower order since
   multi-size THP is upstreamed.
3. Adapted selftests code to new framework.

Since v2
---
1. Fixed an issue in __split_page_owner() introduced during my rebase

Since v1
---
1. Changed split_page_memcg() and split_page_owner() parameter to use order
2. Used folio_test_pmd_mappable() in place of the equivalent code

Details
===

* Patch 1 changes unmap_folio() to only add TTU_SPLIT_HUGE_PMD if the
  folio is pmd mappable.
* Patch 2 adds support for order-1 page cache folios.
* Patch 3 changes split_page_memcg() to use order instead of nr_pages.
* Patch 4 changes split_page_owner() to use order instead of nr_pages.
* Patches 5 and 6 add a new_order parameter to split_page_memcg() and
  split_page_owner() to prepare for upcoming changes.
* Patch 7 adds split_huge_page_to_list_to_order() to split a huge page
  to any lower order. The original split_huge_page_to_list() calls
  split_huge_page_to_list_to_order() with new_order = 0.
* Patch 8 adds a test API to debugfs and test cases in
  split_huge_page_test selftests.
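
The relationship between the two entry points described for Patch 7 can be
sketched in userspace as below. Only the two function names and the
new_order = 0 call come from this cover letter; the parameter lists, the
stand-in struct definitions, the mock body and its return values are
assumptions for illustration and do not reproduce the kernel code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in types so the sketch compiles outside the kernel (assumptions). */
struct page { unsigned int order; };
struct list_head { struct list_head *next, *prev; };

/*
 * Mock of the new entry point: it only records the new order.
 * Returns 0 on success, or -EINVAL if new_order is not lower than the
 * current order (an assumption of this sketch).
 */
int split_huge_page_to_list_to_order(struct page *page,
				     struct list_head *list,
				     unsigned int new_order)
{
	(void)list;	/* unused in this mock */

	if (new_order >= page->order)
		return -EINVAL;

	page->order = new_order;
	return 0;
}

/* The original entry point becomes a thin wrapper with new_order = 0. */
int split_huge_page_to_list(struct page *page, struct list_head *list)
{
	return split_huge_page_to_list_to_order(page, list, 0);
}
```

Keeping the old function as a wrapper means existing callers that expect a
split to order-0 pages need no changes, while new callers can pass a non-zero
new_order.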

Comments and/or suggestions are welcome.

[1] https://lore.kernel.org/all/20231207161211.2374093-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/linux-mm/20240226094936.2677493-1-kernel@pankajraghav.com/
[3] https://lore.kernel.org/linux-mm/9dd96da-efa2-5123-20d4-4992136ef3ad@google.com/
[4] https://lore.kernel.org/linux-mm/20240213215520.1048625-1-zi.yan@sent.com/
[5] https://lore.kernel.org/linux-mm/cbb1d6a0-66dd-47d0-8733-f836fe050374@arm.com/


Matthew Wilcox (Oracle) (1):
  mm: Support order-1 folios in the page cache

Zi Yan (7):
  mm/huge_memory: only split PMD mapping when necessary in unmap_folio()
  mm/memcg: use order instead of nr in split_page_memcg()
  mm/page_owner: use order instead of nr in split_page_owner()
  mm: memcg: make memcg huge page split support any order split.
  mm: page_owner: add support for splitting to any order in split
    page_owner.
  mm: thp: split huge page to any lower order pages
  mm: huge_memory: enable debugfs to split huge pages to any order.

 include/linux/huge_mm.h                       |  21 ++-
 include/linux/memcontrol.h                    |   4 +-
 include/linux/page_owner.h                    |  14 +-
 mm/filemap.c                                  |   2 -
 mm/huge_memory.c                              | 173 +++++++++++++-----
 mm/internal.h                                 |   3 +-
 mm/memcontrol.c                               |  10 +-
 mm/page_alloc.c                               |   8 +-
 mm/page_owner.c                               |   6 +-
 mm/readahead.c                                |   3 -
 .../selftests/mm/split_huge_page_test.c       | 115 +++++++++++-
 11 files changed, 276 insertions(+), 83 deletions(-)