[v9,0/5] Generic page pool & deferred freeing for system dmabuf hea

Message ID	20210630013421.735092-1-john.stultz@linaro.org
Headers	show Return-Path: <linux-media-owner@kernel.org> From: John Stultz <john.stultz@linaro.org> To: lkml <linux-kernel@vger.kernel.org> Cc: John Stultz <john.stultz@linaro.org>, Daniel Vetter <daniel@ffwll.ch>, Christian Koenig <christian.koenig@amd.com>, Sumit Semwal <sumit.semwal@linaro.org>, Liam Mark <lmark@codeaurora.org>, Chris Goldsworthy <cgoldswo@codeaurora.org>, Laura Abbott <labbott@kernel.org>, Brian Starkey <Brian.Starkey@arm.com>, Hridya Valsaraju <hridya@google.com>, Suren Baghdasaryan <surenb@google.com>, Sandeep Patil <sspatil@google.com>, Daniel Mentz <danielmentz@google.com>, =?utf-8?q?=C3=98rjan_Eide?= <orjan.eide@arm.com>, Robin Murphy <robin.murphy@arm.com>, Ezequiel Garcia <ezequiel@collabora.com>, Simon Ser <contact@emersion.fr>, James Jones <jajones@nvidia.com>, linux-media@vger.kernel.org, dri-devel@lists.freedesktop.org Subject: [PATCH v9 0/5] Generic page pool & deferred freeing for system dmabuf hea Date: Wed, 30 Jun 2021 01:34:16 +0000 Message-Id: <20210630013421.735092-1-john.stultz@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	Generic page pool & deferred freeing for system dmabuf hea \| expand [v9,0/5] Generic page pool & deferred freeing for system dmabuf hea [v9,2/5] drm: ttm_pool: Rework ttm_pool to use drm_page_pool [v9,5/5] dma-buf: system_heap: Add deferred freeing to the system heap

John Stultz June 30, 2021, 1:34 a.m. UTC

After an unfortunately long pause (covid work-schedule burnout),
I wanted to revive and resubmit this series.

As before, the point of this series is trying to add both a page
pool as well as deferred-freeingto the DMA-BUF system heap to
improve allocation performance (so that it can match or beat the
old ION system heaps performance).

The combination of the page pool along with deferred freeing
allows us to offload page-zeroing out of the allocation hot
path. This was done originally with ION and this patch series
allows the DMA-BUF system heap to match ION's system heap
allocation performance in a simple microbenchmark [1] (ION
re-added to the kernel for comparision, running on an x86 vm
image):

./dmabuf-heap-bench -i 0 1 system
Testing dmabuf system vs ion heaptype 0 (flags: 0x1)
---------------------------------------------
dmabuf heap: alloc 4096 bytes 5000 times in 79314244 ns          15862 ns/call
ion heap:    alloc 4096 bytes 5000 times in 107390769 ns         21478 ns/call
dmabuf heap: alloc 1048576 bytes 5000 times in 259083419 ns      51816 ns/call
ion heap:    alloc 1048576 bytes 5000 times in 340497344 ns      68099 ns/call
dmabuf heap: alloc 8388608 bytes 5000 times in 2603105563 ns     520621 ns/call
ion heap:    alloc 8388608 bytes 5000 times in 3613592860 ns     722718 ns/call
dmabuf heap: alloc 33554432 bytes 5000 times in 12212492979 ns   2442498 ns/call
ion heap:    alloc 33554432 bytes 5000 times in 14584157792 ns   2916831 ns/call


Daniel didn't like earlier attempts to re-use the network
page-pool code to achieve this, and suggested the ttm_pool be
used instead, so this series pulls the page pool functionality
out of the ttm_pool logic and creates a generic page pool
that can be shared.

New in v9:
* Tried to address Christian König's feedback on the page pool
  changes (Kerneldoc, static functions, locking issues, duplicative
  order tracking)
* Fix up Kconfig dependency issue as Reported-by:
  kernel test robot <lkp@intel.com>
* Fix compiler warning Reported-by:
  kernel test robot <lkp@intel.com>

I know Christian had some less specific feedback on the deferred free
work that I'd like to revisit, but I wanted to restart the discussion
with this new series, rather then trying to dregdge up and reply to
a ~4mo old thread.

Input would be greatly appreciated. Testing as well, as I don't
have any development hardware that utilizes the ttm pool.

Thanks
-john

[1] https://android.googlesource.com/platform/system/memory/libdmabufheap/+/refs/heads/master/tests/dmabuf_heap_bench.c

Cc: Daniel Vetter <daniel@ffwll.ch>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Liam Mark <lmark@codeaurora.org>
Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
Cc: Laura Abbott <labbott@kernel.org>
Cc: Brian Starkey <Brian.Starkey@arm.com>
Cc: Hridya Valsaraju <hridya@google.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Sandeep Patil <sspatil@google.com>
Cc: Daniel Mentz <danielmentz@google.com>
Cc: Ørjan Eide <orjan.eide@arm.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Cc: Ezequiel Garcia <ezequiel@collabora.com>
Cc: Simon Ser <contact@emersion.fr>
Cc: James Jones <jajones@nvidia.com>
Cc: linux-media@vger.kernel.org
Cc: dri-devel@lists.freedesktop.org

John Stultz (5):
  drm: Add a sharable drm page-pool implementation
  drm: ttm_pool: Rework ttm_pool to use drm_page_pool
  dma-buf: system_heap: Add drm pagepool support to system heap
  dma-buf: heaps: Add deferred-free-helper library code
  dma-buf: system_heap: Add deferred freeing to the system heap

 drivers/dma-buf/heaps/Kconfig                |   5 +
 drivers/dma-buf/heaps/Makefile               |   1 +
 drivers/dma-buf/heaps/deferred-free-helper.c | 138 +++++++++
 drivers/dma-buf/heaps/deferred-free-helper.h |  55 ++++
 drivers/dma-buf/heaps/system_heap.c          |  46 ++-
 drivers/gpu/drm/Kconfig                      |   4 +
 drivers/gpu/drm/Makefile                     |   2 +
 drivers/gpu/drm/page_pool.c                  | 297 +++++++++++++++++++
 drivers/gpu/drm/ttm/ttm_pool.c               | 167 ++---------
 include/drm/page_pool.h                      |  68 +++++
 include/drm/ttm/ttm_pool.h                   |  14 +-
 11 files changed, 643 insertions(+), 154 deletions(-)
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.c
 create mode 100644 drivers/dma-buf/heaps/deferred-free-helper.h
 create mode 100644 drivers/gpu/drm/page_pool.c
 create mode 100644 include/drm/page_pool.h

Christian König June 30, 2021, 9:10 a.m. UTC | #1

Am 30.06.21 um 03:34 schrieb John Stultz:
> This adds a shrinker controlled page pool, extracted
> out of the ttm_pool logic, and abstracted out a bit
> so it can be used by other non-ttm drivers.
>
> Cc: Daniel Vetter <daniel@ffwll.ch>
> Cc: Christian Koenig <christian.koenig@amd.com>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: Liam Mark <lmark@codeaurora.org>
> Cc: Chris Goldsworthy <cgoldswo@codeaurora.org>
> Cc: Laura Abbott <labbott@kernel.org>
> Cc: Brian Starkey <Brian.Starkey@arm.com>
> Cc: Hridya Valsaraju <hridya@google.com>
> Cc: Suren Baghdasaryan <surenb@google.com>
> Cc: Sandeep Patil <sspatil@google.com>
> Cc: Daniel Mentz <danielmentz@google.com>
> Cc: Ørjan Eide <orjan.eide@arm.com>
> Cc: Robin Murphy <robin.murphy@arm.com>
> Cc: Ezequiel Garcia <ezequiel@collabora.com>
> Cc: Simon Ser <contact@emersion.fr>
> Cc: James Jones <jajones@nvidia.com>
> Cc: linux-media@vger.kernel.org
> Cc: dri-devel@lists.freedesktop.org
> Signed-off-by: John Stultz <john.stultz@linaro.org>
> ---
> v8:
> * Completely rewritten from scratch, using only the
>    ttm_pool logic so it can be dual licensed.
> v9:
> * Add Kerneldoc comments similar to tmm implementation
>    as suggested by ChristianK
> * Mark some functions static as suggested by ChristianK
> * Fix locking issue ChristianK pointed out
> * Add methods to block the shrinker so users can
>    make atomic calculations across multiple pools, as
>    suggested by ChristianK
> * Fix up Kconfig dependency issue as Reported-by:
>    kernel test robot <lkp@intel.com>
> ---
>   drivers/gpu/drm/Kconfig     |   3 +
>   drivers/gpu/drm/Makefile    |   2 +
>   drivers/gpu/drm/page_pool.c | 297 ++++++++++++++++++++++++++++++++++++
>   include/drm/page_pool.h     |  68 +++++++++
>   4 files changed, 370 insertions(+)
>   create mode 100644 drivers/gpu/drm/page_pool.c
>   create mode 100644 include/drm/page_pool.h
>
> diff --git a/drivers/gpu/drm/Kconfig b/drivers/gpu/drm/Kconfig
> index 3c16bd1afd87..52d9ba92b35e 100644
> --- a/drivers/gpu/drm/Kconfig
> +++ b/drivers/gpu/drm/Kconfig
> @@ -177,6 +177,9 @@ config DRM_DP_CEC
>   	  Note: not all adapters support this feature, and even for those
>   	  that do support this they often do not hook up the CEC pin.
>   
> +config DRM_PAGE_POOL
> +	bool
> +
>   config DRM_TTM
>   	tristate
>   	depends on DRM && MMU
> diff --git a/drivers/gpu/drm/Makefile b/drivers/gpu/drm/Makefile
> index 5279db4392df..affa4ca3a08e 100644
> --- a/drivers/gpu/drm/Makefile
> +++ b/drivers/gpu/drm/Makefile
> @@ -39,6 +39,8 @@ obj-$(CONFIG_DRM_VRAM_HELPER) += drm_vram_helper.o
>   drm_ttm_helper-y := drm_gem_ttm_helper.o
>   obj-$(CONFIG_DRM_TTM_HELPER) += drm_ttm_helper.o
>   
> +drm-$(CONFIG_DRM_PAGE_POOL) += page_pool.o
> +
>   drm_kms_helper-y := drm_bridge_connector.o drm_crtc_helper.o drm_dp_helper.o \
>   		drm_dsc.o drm_probe_helper.o \
>   		drm_plane_helper.o drm_dp_mst_topology.o drm_atomic_helper.o \
> diff --git a/drivers/gpu/drm/page_pool.c b/drivers/gpu/drm/page_pool.c
> new file mode 100644
> index 000000000000..c07bbe3afc32
> --- /dev/null
> +++ b/drivers/gpu/drm/page_pool.c
> @@ -0,0 +1,297 @@
> +// SPDX-License-Identifier: GPL-2.0 OR MIT
> +/*
> + * Sharable page pool implementation
> + *
> + * Extracted from drivers/gpu/drm/ttm/ttm_pool.c
> + * Copyright 2020 Advanced Micro Devices, Inc.
> + * Copyright 2021 Linaro Ltd.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + *
> + * Authors: Christian König, John Stultz
> + */
> +
> +#include <linux/module.h>
> +#include <linux/highmem.h>
> +#include <linux/shrinker.h>
> +#include <drm/page_pool.h>
> +
> +static unsigned long page_pool_size; /* max size of the pool */
> +
> +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool");
> +module_param(page_pool_size, ulong, 0644);
> +
> +static atomic_long_t nr_managed_pages;
> +
> +static struct mutex shrinker_lock;
> +static struct list_head shrinker_list;
> +static struct shrinker mm_shrinker;
> +
> +/**
> + * drm_page_pool_set_max - Sets maximum size of all pools
> + *
> + * Sets the maximum number of pages allows in all pools.
> + * This can only be set once, and the first caller wins.
> + */
> +void drm_page_pool_set_max(unsigned long max)
> +{
> +	if (!page_pool_size)
> +		page_pool_size = max;
> +}
> +
> +/**
> + * drm_page_pool_get_max - Maximum size of all pools
> + *
> + * Return the maximum number of pages allows in all pools
> + */
> +unsigned long drm_page_pool_get_max(void)
> +{
> +	return page_pool_size;
> +}

Well in general I don't think it is a good idea to have getters/setters 
for one line functionality, similar applies to locking/unlocking the 
mutex below.

Then in this specific case what those functions do is to aid 
initializing the general pool manager and that in turn should absolutely 
not be exposed.

The TTM pool manager exposes this as function because initializing the 
pool manager is done in one part of the module and calculating the 
default value for the pages in another one. But that is not something I 
would like to see here.

> +
> +/**
> + * drm_page_pool_get_total - Current size of all pools
> + *
> + * Return the number of pages in all managed pools
> + */
> +unsigned long drm_page_pool_get_total(void)
> +{
> +	return atomic_long_read(&nr_managed_pages);
> +}
> +
> +/**
> + * drm_page_pool_get_size - Get the number of pages in the pool
> + *
> + * @pool: Pool to calculate the size of
> + *
> + * Return the number of pages in the specified pool
> + */
> +unsigned long drm_page_pool_get_size(struct drm_page_pool *pool)
> +{
> +	unsigned long size;
> +
> +	spin_lock(&pool->lock);
> +	size = pool->page_count;
> +	spin_unlock(&pool->lock);
> +	return size;
> +}
> +
> +/**
> + * drm_page_pool_add - Add a page to a pool
> + *
> + * @pool: Pool to add page to
> + * @page: Page to be added to the pool
> + *
> + * Gives specified page into a specific pool
> + */
> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *p)
> +{
> +	unsigned int i, num_pages = 1 << pool->order;
> +
> +	/* Make sure we won't grow larger then the max pool size */
> +	if (page_pool_size &&
> +	       ((drm_page_pool_get_total()) + num_pages > page_pool_size)) {
> +		pool->free(pool, p);
> +		return;
> +	}

That is not a good idea. See how ttm_pool_free() does this.

First the page is given back to the pool, then all pools are shrinked if 
they are above the global limit.

Regards,
Christian.

> +
> +	/* Be sure to zero pages before adding them to the pool */
> +	for (i = 0; i < num_pages; ++i) {
> +		if (PageHighMem(p))
> +			clear_highpage(p + i);
> +		else
> +			clear_page(page_address(p + i));
> +	}
> +
> +	spin_lock(&pool->lock);
> +	list_add(&p->lru, &pool->pages);
> +	pool->page_count += 1 << pool->order;
> +	spin_unlock(&pool->lock);
> +	atomic_long_add(1 << pool->order, &nr_managed_pages);
> +
> +}
> +
> +/**
> + * drm_page_pool_remove - Remove page from pool
> + *
> + * @pool: Pool to pull the page from
> + *
> + * Take pages from a specific pool, return NULL when nothing available
> + */
> +struct page *drm_page_pool_remove(struct drm_page_pool *pool)
> +{
> +	struct page *p;
> +
> +	spin_lock(&pool->lock);
> +	p = list_first_entry_or_null(&pool->pages, typeof(*p), lru);
> +	if (p) {
> +		atomic_long_sub(1 << pool->order, &nr_managed_pages);
> +		pool->page_count -= 1 << pool->order;
> +		list_del(&p->lru);
> +	}
> +	spin_unlock(&pool->lock);
> +
> +	return p;
> +}
> +
> +/**
> + * drm_page_pool_init - Initialize a pool
> + *
> + * @pool: the pool to initialize
> + * @order: page allocation order
> + * @free_page: function pointer to free the pool's pages
> + *
> + * Initialize and add a pool type to the global shrinker list
> + */
> +void drm_page_pool_init(struct drm_page_pool *pool, unsigned int order,
> +			void (*free_page)(struct drm_page_pool *pool, struct page *p))
> +{
> +	pool->order = order;
> +	spin_lock_init(&pool->lock);
> +	INIT_LIST_HEAD(&pool->pages);
> +	pool->free = free_page;
> +	pool->page_count = 0;
> +
> +	mutex_lock(&shrinker_lock);
> +	list_add_tail(&pool->shrinker_list, &shrinker_list);
> +	mutex_unlock(&shrinker_lock);
> +}
> +
> +/**
> + * drm_page_pool_fini - Cleanup a pool
> + *
> + * @pool: the pool to clean up
> + *
> + * Remove a pool_type from the global shrinker list and free all pages
> + */
> +void drm_page_pool_fini(struct drm_page_pool *pool)
> +{
> +	struct page *p;
> +
> +	mutex_lock(&shrinker_lock);
> +	list_del(&pool->shrinker_list);
> +	mutex_unlock(&shrinker_lock);
> +
> +	while ((p = drm_page_pool_remove(pool)))
> +		pool->free(pool, p);
> +}
> +
> +/**
> + * drm_page_pool_shrink - Shrink the drm page pool
> + *
> + * Free pages using the global shrinker list. Returns
> + * the number of pages free
> + */
> +unsigned int drm_page_pool_shrink(void)
> +{
> +	struct drm_page_pool *pool;
> +	unsigned int num_freed;
> +	struct page *p;
> +
> +	mutex_lock(&shrinker_lock);
> +	pool = list_first_entry(&shrinker_list, typeof(*pool), shrinker_list);
> +
> +	p = drm_page_pool_remove(pool);
> +	if (p) {
> +		pool->free(pool, p);
> +		num_freed = 1 << pool->order;
> +	} else {
> +		num_freed = 0;
> +	}
> +
> +	list_move_tail(&pool->shrinker_list, &shrinker_list);
> +	mutex_unlock(&shrinker_lock);
> +
> +	return num_freed;
> +}
> +
> +/* As long as pages are available make sure to release at least one */
> +static unsigned long drm_page_pool_shrinker_scan(struct shrinker *shrink,
> +						 struct shrink_control *sc)
> +{
> +	unsigned long num_freed = 0;
> +
> +	do
> +		num_freed += drm_page_pool_shrink();
> +	while (!num_freed && atomic_long_read(&nr_managed_pages));
> +
> +	return num_freed;
> +}
> +
> +/* Return the number of pages available or SHRINK_EMPTY if we have none */
> +static unsigned long drm_page_pool_shrinker_count(struct shrinker *shrink,
> +						  struct shrink_control *sc)
> +{
> +	unsigned long num_pages = atomic_long_read(&nr_managed_pages);
> +
> +	return num_pages ? num_pages : SHRINK_EMPTY;
> +}
> +
> +/**
> + * dma_page_pool_lock_shrinker - Take the shrinker lock
> + *
> + * Takes the shrinker lock, preventing the shrinker from making
> + * changes to the pools
> + */
> +void dma_page_pool_lock_shrinker(void)
> +{
> +	mutex_lock(&shrinker_lock);
> +}
> +
> +/**
> + * dma_page_pool_unlock_shrinker - Release the shrinker lock
> + *
> + * Releases the shrinker lock, allowing the shrinker to free
> + * pages
> + */
> +void dma_page_pool_unlock_shrinker(void)
> +{
> +	mutex_unlock(&shrinker_lock);
> +}
> +
> +/**
> + * drm_page_pool_shrinker_init - Initialize globals
> + *
> + * Initialize the global locks and lists for the shrinker.
> + */
> +static int drm_page_pool_shrinker_init(void)
> +{
> +	mutex_init(&shrinker_lock);
> +	INIT_LIST_HEAD(&shrinker_list);
> +
> +	mm_shrinker.count_objects = drm_page_pool_shrinker_count;
> +	mm_shrinker.scan_objects = drm_page_pool_shrinker_scan;
> +	mm_shrinker.seeks = 1;
> +	return register_shrinker(&mm_shrinker);
> +}
> +
> +/**
> + * drm_page_pool_shrinker_fini - Finalize globals
> + *
> + * Unregister the shrinker.
> + */
> +static void drm_page_pool_shrinker_fini(void)
> +{
> +	unregister_shrinker(&mm_shrinker);
> +	WARN_ON(!list_empty(&shrinker_list));
> +}
> +
> +module_init(drm_page_pool_shrinker_init);
> +module_exit(drm_page_pool_shrinker_fini);
> +MODULE_LICENSE("Dual MIT/GPL");
> diff --git a/include/drm/page_pool.h b/include/drm/page_pool.h
> new file mode 100644
> index 000000000000..860cf6c92aab
> --- /dev/null
> +++ b/include/drm/page_pool.h
> @@ -0,0 +1,68 @@
> +/* SPDX-License-Identifier: GPL-2.0 OR MIT */
> +/*
> + * Extracted from include/drm/ttm/ttm_pool.h
> + * Copyright 2020 Advanced Micro Devices, Inc.
> + * Copyright 2021 Linaro Ltd
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a
> + * copy of this software and associated documentation files (the "Software"),
> + * to deal in the Software without restriction, including without limitation
> + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> + * and/or sell copies of the Software, and to permit persons to whom the
> + * Software is furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
> + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
> + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
> + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
> + * OTHER DEALINGS IN THE SOFTWARE.
> + *
> + * Authors: Christian König, John Stultz
> + */
> +
> +#ifndef _DRM_PAGE_POOL_H_
> +#define _DRM_PAGE_POOL_H_
> +
> +#include <linux/mmzone.h>
> +#include <linux/llist.h>
> +#include <linux/spinlock.h>
> +
> +/**
> + * drm_page_pool - Page Pool for a certain memory type
> + *
> + * @order: the allocation order our pages have
> + * @pages: the list of pages in the pool
> + * @shrinker_list: our place on the global shrinker list
> + * @lock: protection of the page list
> + * @page_count: number of pages currently in the pool
> + * @free: Function pointer to free the page
> + */
> +struct drm_page_pool {
> +	unsigned int order;
> +	struct list_head pages;
> +	struct list_head shrinker_list;
> +	spinlock_t lock;
> +
> +	unsigned long page_count;
> +	void (*free)(struct drm_page_pool *pool, struct page *p);
> +};
> +
> +void drm_page_pool_set_max(unsigned long max);
> +unsigned long drm_page_pool_get_max(void);
> +unsigned long drm_page_pool_get_total(void);
> +unsigned int drm_page_pool_shrink(void);
> +unsigned long drm_page_pool_get_size(struct drm_page_pool *pool);
> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *p);
> +struct page *drm_page_pool_remove(struct drm_page_pool *pool);
> +void dma_page_pool_lock_shrinker(void);
> +void dma_page_pool_unlock_shrinker(void);
> +void drm_page_pool_init(struct drm_page_pool *pool, unsigned int order,
> +			void (*free_page)(struct drm_page_pool *pool, struct page *p));
> +void drm_page_pool_fini(struct drm_page_pool *pool);
> +
> +#endif

John Stultz June 30, 2021, 10:24 p.m. UTC | #2

On Wed, Jun 30, 2021 at 2:10 AM Christian König
<christian.koenig@amd.com> wrote:
> Am 30.06.21 um 03:34 schrieb John Stultz:
> > +static unsigned long page_pool_size; /* max size of the pool */
> > +
> > +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool");
> > +module_param(page_pool_size, ulong, 0644);
> > +
> > +static atomic_long_t nr_managed_pages;
> > +
> > +static struct mutex shrinker_lock;
> > +static struct list_head shrinker_list;
> > +static struct shrinker mm_shrinker;
> > +
> > +/**
> > + * drm_page_pool_set_max - Sets maximum size of all pools
> > + *
> > + * Sets the maximum number of pages allows in all pools.
> > + * This can only be set once, and the first caller wins.
> > + */
> > +void drm_page_pool_set_max(unsigned long max)
> > +{
> > +     if (!page_pool_size)
> > +             page_pool_size = max;
> > +}
> > +
> > +/**
> > + * drm_page_pool_get_max - Maximum size of all pools
> > + *
> > + * Return the maximum number of pages allows in all pools
> > + */
> > +unsigned long drm_page_pool_get_max(void)
> > +{
> > +     return page_pool_size;
> > +}
>
> Well in general I don't think it is a good idea to have getters/setters
> for one line functionality, similar applies to locking/unlocking the
> mutex below.
>
> Then in this specific case what those functions do is to aid
> initializing the general pool manager and that in turn should absolutely
> not be exposed.
>
> The TTM pool manager exposes this as function because initializing the
> pool manager is done in one part of the module and calculating the
> default value for the pages in another one. But that is not something I
> would like to see here.

So, I guess I'm not quite clear on what you'd like to see...

Part of what I'm balancing here is the TTM subsystem normally sets a
global max size, whereas the old ION pool didn't have caps (instead
just relying on the shrinker when needed).
So I'm trying to come up with a solution that can serve both uses. So
I've got this drm_page_pool_set_max() function to optionally set the
maximum value, which is called in the TTM initialization path or set
the boot argument. But for systems that use the dmabuf system heap,
but don't use TTM, no global limit is enforced.

Your earlier suggestion to have it as an argument to the
drm_page_pool_shrinker_init() didn't make much sense to me, as then we
may have multiple subsystems trying to initialize the pool shrinker,
which doesn't seem ideal. So I'm not sure what would be preferred.

I guess we could have subsystems allocate their own pool managers with
their own shrinkers and per-manager-size-limits? But that also feels
like a fair increase in complexity, for I'm not sure how much benefit.

> > +void drm_page_pool_add(struct drm_page_pool *pool, struct page *p)
> > +{
> > +     unsigned int i, num_pages = 1 << pool->order;
> > +
> > +     /* Make sure we won't grow larger then the max pool size */
> > +     if (page_pool_size &&
> > +            ((drm_page_pool_get_total()) + num_pages > page_pool_size)) {
> > +             pool->free(pool, p);
> > +             return;
> > +     }
>
> That is not a good idea. See how ttm_pool_free() does this.
>
> First the page is given back to the pool, then all pools are shrinked if
> they are above the global limit.

Ok, initially my thought was it seemed like wasteful overhead to add
the page to the pool and then shrink the pool, so just freeing the
given page directly if the pool was full seemed more straightforward.
But on consideration here I do realize having most-recently-used hot
pages in the pool would be good for performance, so I'll switch that
back. Thanks for pointing this out!

Thanks again so much for the review and feedback!
-john

Christian König July 1, 2021, 6:52 a.m. UTC | #3

Am 01.07.21 um 00:24 schrieb John Stultz:
> On Wed, Jun 30, 2021 at 2:10 AM Christian König

> <christian.koenig@amd.com> wrote:

>> Am 30.06.21 um 03:34 schrieb John Stultz:

>>> +static unsigned long page_pool_size; /* max size of the pool */

>>> +

>>> +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool");

>>> +module_param(page_pool_size, ulong, 0644);

>>> +

>>> +static atomic_long_t nr_managed_pages;

>>> +

>>> +static struct mutex shrinker_lock;

>>> +static struct list_head shrinker_list;

>>> +static struct shrinker mm_shrinker;

>>> +

>>> +/**

>>> + * drm_page_pool_set_max - Sets maximum size of all pools

>>> + *

>>> + * Sets the maximum number of pages allows in all pools.

>>> + * This can only be set once, and the first caller wins.

>>> + */

>>> +void drm_page_pool_set_max(unsigned long max)

>>> +{

>>> +     if (!page_pool_size)

>>> +             page_pool_size = max;

>>> +}

>>> +

>>> +/**

>>> + * drm_page_pool_get_max - Maximum size of all pools

>>> + *

>>> + * Return the maximum number of pages allows in all pools

>>> + */

>>> +unsigned long drm_page_pool_get_max(void)

>>> +{

>>> +     return page_pool_size;

>>> +}

>> Well in general I don't think it is a good idea to have getters/setters

>> for one line functionality, similar applies to locking/unlocking the

>> mutex below.

>>

>> Then in this specific case what those functions do is to aid

>> initializing the general pool manager and that in turn should absolutely

>> not be exposed.

>>

>> The TTM pool manager exposes this as function because initializing the

>> pool manager is done in one part of the module and calculating the

>> default value for the pages in another one. But that is not something I

>> would like to see here.

> So, I guess I'm not quite clear on what you'd like to see...

>

> Part of what I'm balancing here is the TTM subsystem normally sets a

> global max size, whereas the old ION pool didn't have caps (instead

> just relying on the shrinker when needed).

> So I'm trying to come up with a solution that can serve both uses. So

> I've got this drm_page_pool_set_max() function to optionally set the

> maximum value, which is called in the TTM initialization path or set

> the boot argument. But for systems that use the dmabuf system heap,

> but don't use TTM, no global limit is enforced.


Yeah, exactly that's what I'm trying to prevent.

See if we have the same functionality used by different use cases we 
should not have different behavior depending on what drivers are loaded.

Is it a problem if we restrict the ION pool to 50% of system memory as 
well? If yes than I would rather drop the limit from TTM and only rely 
on the shrinker there as well.

> Your earlier suggestion to have it as an argument to the

> drm_page_pool_shrinker_init() didn't make much sense to me, as then we

> may have multiple subsystems trying to initialize the pool shrinker,

> which doesn't seem ideal. So I'm not sure what would be preferred.

>

> I guess we could have subsystems allocate their own pool managers with

> their own shrinkers and per-manager-size-limits? But that also feels

> like a fair increase in complexity, for I'm not sure how much benefit.

>

>>> +void drm_page_pool_add(struct drm_page_pool *pool, struct page *p)

>>> +{

>>> +     unsigned int i, num_pages = 1 << pool->order;

>>> +

>>> +     /* Make sure we won't grow larger then the max pool size */

>>> +     if (page_pool_size &&

>>> +            ((drm_page_pool_get_total()) + num_pages > page_pool_size)) {

>>> +             pool->free(pool, p);

>>> +             return;

>>> +     }

>> That is not a good idea. See how ttm_pool_free() does this.

>>

>> First the page is given back to the pool, then all pools are shrinked if

>> they are above the global limit.

> Ok, initially my thought was it seemed like wasteful overhead to add

> the page to the pool and then shrink the pool, so just freeing the

> given page directly if the pool was full seemed more straightforward.

> But on consideration here I do realize having most-recently-used hot

> pages in the pool would be good for performance, so I'll switch that

> back. Thanks for pointing this out!


The even bigger problem is that you then always drop pages from the 
active pools.

E.g. a pool which just allocated and then freed 2MiB during driver load 
for some firmware upload will never see pressure if you do it this way.

So those 2MiB would never be recycled while they could be well used in 
one of the active pools.

Regards,
Christian.

>

> Thanks again so much for the review and feedback!

> -john

John Stultz July 6, 2021, 9:03 p.m. UTC | #4

On Wed, Jun 30, 2021 at 11:52 PM Christian König
<christian.koenig@amd.com> wrote:
>
> Am 01.07.21 um 00:24 schrieb John Stultz:
> > On Wed, Jun 30, 2021 at 2:10 AM Christian König
> > <christian.koenig@amd.com> wrote:
> >> Am 30.06.21 um 03:34 schrieb John Stultz:
> >>> +static unsigned long page_pool_size; /* max size of the pool */
> >>> +
> >>> +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool");
> >>> +module_param(page_pool_size, ulong, 0644);
> >>> +
> >>> +static atomic_long_t nr_managed_pages;
> >>> +
> >>> +static struct mutex shrinker_lock;
> >>> +static struct list_head shrinker_list;
> >>> +static struct shrinker mm_shrinker;
> >>> +
> >>> +/**
> >>> + * drm_page_pool_set_max - Sets maximum size of all pools
> >>> + *
> >>> + * Sets the maximum number of pages allows in all pools.
> >>> + * This can only be set once, and the first caller wins.
> >>> + */
> >>> +void drm_page_pool_set_max(unsigned long max)
> >>> +{
> >>> +     if (!page_pool_size)
> >>> +             page_pool_size = max;
> >>> +}
> >>> +
> >>> +/**
> >>> + * drm_page_pool_get_max - Maximum size of all pools
> >>> + *
> >>> + * Return the maximum number of pages allows in all pools
> >>> + */
> >>> +unsigned long drm_page_pool_get_max(void)
> >>> +{
> >>> +     return page_pool_size;
> >>> +}
> >> Well in general I don't think it is a good idea to have getters/setters
> >> for one line functionality, similar applies to locking/unlocking the
> >> mutex below.
> >>
> >> Then in this specific case what those functions do is to aid
> >> initializing the general pool manager and that in turn should absolutely
> >> not be exposed.
> >>
> >> The TTM pool manager exposes this as function because initializing the
> >> pool manager is done in one part of the module and calculating the
> >> default value for the pages in another one. But that is not something I
> >> would like to see here.
> > So, I guess I'm not quite clear on what you'd like to see...
> >
> > Part of what I'm balancing here is the TTM subsystem normally sets a
> > global max size, whereas the old ION pool didn't have caps (instead
> > just relying on the shrinker when needed).
> > So I'm trying to come up with a solution that can serve both uses. So
> > I've got this drm_page_pool_set_max() function to optionally set the
> > maximum value, which is called in the TTM initialization path or set
> > the boot argument. But for systems that use the dmabuf system heap,
> > but don't use TTM, no global limit is enforced.
>
> Yeah, exactly that's what I'm trying to prevent.
>
> See if we have the same functionality used by different use cases we
> should not have different behavior depending on what drivers are loaded.
>
> Is it a problem if we restrict the ION pool to 50% of system memory as
> well? If yes than I would rather drop the limit from TTM and only rely
> on the shrinker there as well.

Would having the default value as a config option (still overridable
via boot argument) be an acceptable solution?

Thanks again for the feedback!

thanks
-john

Daniel Vetter July 6, 2021, 9:15 p.m. UTC | #5

On Tue, Jul 6, 2021 at 11:04 PM John Stultz <john.stultz@linaro.org> wrote:
> On Wed, Jun 30, 2021 at 11:52 PM Christian König
> <christian.koenig@amd.com> wrote:
> >
> > Am 01.07.21 um 00:24 schrieb John Stultz:
> > > On Wed, Jun 30, 2021 at 2:10 AM Christian König
> > > <christian.koenig@amd.com> wrote:
> > >> Am 30.06.21 um 03:34 schrieb John Stultz:
> > >>> +static unsigned long page_pool_size; /* max size of the pool */
> > >>> +
> > >>> +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool");
> > >>> +module_param(page_pool_size, ulong, 0644);
> > >>> +
> > >>> +static atomic_long_t nr_managed_pages;
> > >>> +
> > >>> +static struct mutex shrinker_lock;
> > >>> +static struct list_head shrinker_list;
> > >>> +static struct shrinker mm_shrinker;
> > >>> +
> > >>> +/**
> > >>> + * drm_page_pool_set_max - Sets maximum size of all pools
> > >>> + *
> > >>> + * Sets the maximum number of pages allows in all pools.
> > >>> + * This can only be set once, and the first caller wins.
> > >>> + */
> > >>> +void drm_page_pool_set_max(unsigned long max)
> > >>> +{
> > >>> +     if (!page_pool_size)
> > >>> +             page_pool_size = max;
> > >>> +}
> > >>> +
> > >>> +/**
> > >>> + * drm_page_pool_get_max - Maximum size of all pools
> > >>> + *
> > >>> + * Return the maximum number of pages allows in all pools
> > >>> + */
> > >>> +unsigned long drm_page_pool_get_max(void)
> > >>> +{
> > >>> +     return page_pool_size;
> > >>> +}
> > >> Well in general I don't think it is a good idea to have getters/setters
> > >> for one line functionality, similar applies to locking/unlocking the
> > >> mutex below.
> > >>
> > >> Then in this specific case what those functions do is to aid
> > >> initializing the general pool manager and that in turn should absolutely
> > >> not be exposed.
> > >>
> > >> The TTM pool manager exposes this as function because initializing the
> > >> pool manager is done in one part of the module and calculating the
> > >> default value for the pages in another one. But that is not something I
> > >> would like to see here.
> > > So, I guess I'm not quite clear on what you'd like to see...
> > >
> > > Part of what I'm balancing here is the TTM subsystem normally sets a
> > > global max size, whereas the old ION pool didn't have caps (instead
> > > just relying on the shrinker when needed).
> > > So I'm trying to come up with a solution that can serve both uses. So
> > > I've got this drm_page_pool_set_max() function to optionally set the
> > > maximum value, which is called in the TTM initialization path or set
> > > the boot argument. But for systems that use the dmabuf system heap,
> > > but don't use TTM, no global limit is enforced.
> >
> > Yeah, exactly that's what I'm trying to prevent.
> >
> > See if we have the same functionality used by different use cases we
> > should not have different behavior depending on what drivers are loaded.
> >
> > Is it a problem if we restrict the ION pool to 50% of system memory as
> > well? If yes than I would rather drop the limit from TTM and only rely
> > on the shrinker there as well.
>
> Would having the default value as a config option (still overridable
> via boot argument) be an acceptable solution?

We're also trying to get ttm over to the shrinker model, and a first
cut of that even landed, but didn't really work out yet. So maybe just
aiming for the shrinker? I do agree this should be consistent across
the board, otherwise we're just sharing code but not actually sharing
functionality, which is a recipe for disaster because one side will
end up breaking the other side's use-case.
-Daniel

John Stultz July 6, 2021, 9:19 p.m. UTC | #6

On Tue, Jul 6, 2021 at 2:15 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>

> On Tue, Jul 6, 2021 at 11:04 PM John Stultz <john.stultz@linaro.org> wrote:

> > On Wed, Jun 30, 2021 at 11:52 PM Christian König

> > <christian.koenig@amd.com> wrote:

> > >

> > > Am 01.07.21 um 00:24 schrieb John Stultz:

> > > > On Wed, Jun 30, 2021 at 2:10 AM Christian König

> > > > <christian.koenig@amd.com> wrote:

> > > >> Am 30.06.21 um 03:34 schrieb John Stultz:

> > > >>> +static unsigned long page_pool_size; /* max size of the pool */

> > > >>> +

> > > >>> +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool");

> > > >>> +module_param(page_pool_size, ulong, 0644);

> > > >>> +

> > > >>> +static atomic_long_t nr_managed_pages;

> > > >>> +

> > > >>> +static struct mutex shrinker_lock;

> > > >>> +static struct list_head shrinker_list;

> > > >>> +static struct shrinker mm_shrinker;

> > > >>> +

> > > >>> +/**

> > > >>> + * drm_page_pool_set_max - Sets maximum size of all pools

> > > >>> + *

> > > >>> + * Sets the maximum number of pages allows in all pools.

> > > >>> + * This can only be set once, and the first caller wins.

> > > >>> + */

> > > >>> +void drm_page_pool_set_max(unsigned long max)

> > > >>> +{

> > > >>> +     if (!page_pool_size)

> > > >>> +             page_pool_size = max;

> > > >>> +}

> > > >>> +

> > > >>> +/**

> > > >>> + * drm_page_pool_get_max - Maximum size of all pools

> > > >>> + *

> > > >>> + * Return the maximum number of pages allows in all pools

> > > >>> + */

> > > >>> +unsigned long drm_page_pool_get_max(void)

> > > >>> +{

> > > >>> +     return page_pool_size;

> > > >>> +}

> > > >> Well in general I don't think it is a good idea to have getters/setters

> > > >> for one line functionality, similar applies to locking/unlocking the

> > > >> mutex below.

> > > >>

> > > >> Then in this specific case what those functions do is to aid

> > > >> initializing the general pool manager and that in turn should absolutely

> > > >> not be exposed.

> > > >>

> > > >> The TTM pool manager exposes this as function because initializing the

> > > >> pool manager is done in one part of the module and calculating the

> > > >> default value for the pages in another one. But that is not something I

> > > >> would like to see here.

> > > > So, I guess I'm not quite clear on what you'd like to see...

> > > >

> > > > Part of what I'm balancing here is the TTM subsystem normally sets a

> > > > global max size, whereas the old ION pool didn't have caps (instead

> > > > just relying on the shrinker when needed).

> > > > So I'm trying to come up with a solution that can serve both uses. So

> > > > I've got this drm_page_pool_set_max() function to optionally set the

> > > > maximum value, which is called in the TTM initialization path or set

> > > > the boot argument. But for systems that use the dmabuf system heap,

> > > > but don't use TTM, no global limit is enforced.

> > >

> > > Yeah, exactly that's what I'm trying to prevent.

> > >

> > > See if we have the same functionality used by different use cases we

> > > should not have different behavior depending on what drivers are loaded.

> > >

> > > Is it a problem if we restrict the ION pool to 50% of system memory as

> > > well? If yes than I would rather drop the limit from TTM and only rely

> > > on the shrinker there as well.

> >

> > Would having the default value as a config option (still overridable

> > via boot argument) be an acceptable solution?

>

> We're also trying to get ttm over to the shrinker model, and a first

> cut of that even landed, but didn't really work out yet. So maybe just

> aiming for the shrinker? I do agree this should be consistent across

> the board, otherwise we're just sharing code but not actually sharing

> functionality, which is a recipe for disaster because one side will

> end up breaking the other side's use-case.


Fair enough, maybe it would be best to remove the default limit, but
leave the logic so it can still be set via the boot argument?

thanks
-john

Christoph Hellwig July 7, 2021, 6:38 a.m. UTC | #7

On Wed, Jun 30, 2021 at 01:34:17AM +0000, John Stultz wrote:
> This adds a shrinker controlled page pool, extracted
> out of the ttm_pool logic, and abstracted out a bit
> so it can be used by other non-ttm drivers.

Can you explain in detail why you need a differnt page pool over the one
maintained by the page allocator?  Fragmenting the memory into all kinds
of pools has lots of downsides, so the upsides need to be explained in
detail.

Christian König July 7, 2021, 6:52 a.m. UTC | #8

Am 06.07.21 um 23:19 schrieb John Stultz:
> On Tue, Jul 6, 2021 at 2:15 PM Daniel Vetter <daniel@ffwll.ch> wrote:
>> On Tue, Jul 6, 2021 at 11:04 PM John Stultz <john.stultz@linaro.org> wrote:
>>> On Wed, Jun 30, 2021 at 11:52 PM Christian König
>>> <christian.koenig@amd.com> wrote:
>>>> Am 01.07.21 um 00:24 schrieb John Stultz:
>>>>> On Wed, Jun 30, 2021 at 2:10 AM Christian König
>>>>> <christian.koenig@amd.com> wrote:
>>>>>> Am 30.06.21 um 03:34 schrieb John Stultz:
>>>>>>> +static unsigned long page_pool_size; /* max size of the pool */
>>>>>>> +
>>>>>>> +MODULE_PARM_DESC(page_pool_size, "Number of pages in the drm page pool");
>>>>>>> +module_param(page_pool_size, ulong, 0644);
>>>>>>> +
>>>>>>> +static atomic_long_t nr_managed_pages;
>>>>>>> +
>>>>>>> +static struct mutex shrinker_lock;
>>>>>>> +static struct list_head shrinker_list;
>>>>>>> +static struct shrinker mm_shrinker;
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_page_pool_set_max - Sets maximum size of all pools
>>>>>>> + *
>>>>>>> + * Sets the maximum number of pages allows in all pools.
>>>>>>> + * This can only be set once, and the first caller wins.
>>>>>>> + */
>>>>>>> +void drm_page_pool_set_max(unsigned long max)
>>>>>>> +{
>>>>>>> +     if (!page_pool_size)
>>>>>>> +             page_pool_size = max;
>>>>>>> +}
>>>>>>> +
>>>>>>> +/**
>>>>>>> + * drm_page_pool_get_max - Maximum size of all pools
>>>>>>> + *
>>>>>>> + * Return the maximum number of pages allows in all pools
>>>>>>> + */
>>>>>>> +unsigned long drm_page_pool_get_max(void)
>>>>>>> +{
>>>>>>> +     return page_pool_size;
>>>>>>> +}
>>>>>> Well in general I don't think it is a good idea to have getters/setters
>>>>>> for one line functionality, similar applies to locking/unlocking the
>>>>>> mutex below.
>>>>>>
>>>>>> Then in this specific case what those functions do is to aid
>>>>>> initializing the general pool manager and that in turn should absolutely
>>>>>> not be exposed.
>>>>>>
>>>>>> The TTM pool manager exposes this as function because initializing the
>>>>>> pool manager is done in one part of the module and calculating the
>>>>>> default value for the pages in another one. But that is not something I
>>>>>> would like to see here.
>>>>> So, I guess I'm not quite clear on what you'd like to see...
>>>>>
>>>>> Part of what I'm balancing here is the TTM subsystem normally sets a
>>>>> global max size, whereas the old ION pool didn't have caps (instead
>>>>> just relying on the shrinker when needed).
>>>>> So I'm trying to come up with a solution that can serve both uses. So
>>>>> I've got this drm_page_pool_set_max() function to optionally set the
>>>>> maximum value, which is called in the TTM initialization path or set
>>>>> the boot argument. But for systems that use the dmabuf system heap,
>>>>> but don't use TTM, no global limit is enforced.
>>>> Yeah, exactly that's what I'm trying to prevent.
>>>>
>>>> See if we have the same functionality used by different use cases we
>>>> should not have different behavior depending on what drivers are loaded.
>>>>
>>>> Is it a problem if we restrict the ION pool to 50% of system memory as
>>>> well? If yes than I would rather drop the limit from TTM and only rely
>>>> on the shrinker there as well.
>>> Would having the default value as a config option (still overridable
>>> via boot argument) be an acceptable solution?
>> We're also trying to get ttm over to the shrinker model, and a first
>> cut of that even landed, but didn't really work out yet. So maybe just
>> aiming for the shrinker? I do agree this should be consistent across
>> the board, otherwise we're just sharing code but not actually sharing
>> functionality, which is a recipe for disaster because one side will
>> end up breaking the other side's use-case.
> Fair enough, maybe it would be best to remove the default limit, but
> leave the logic so it can still be set via the boot argument?

Yeah, that would work for me and the shrinker implementation should 
already be good enough.

Regards,
Christian.

>
> thanks
> -john

Christian König July 7, 2021, 7:10 a.m. UTC | #9

Am 07.07.21 um 08:38 schrieb Christoph Hellwig:
> On Wed, Jun 30, 2021 at 01:34:17AM +0000, John Stultz wrote:
>> This adds a shrinker controlled page pool, extracted
>> out of the ttm_pool logic, and abstracted out a bit
>> so it can be used by other non-ttm drivers.
> Can you explain in detail why you need a differnt page pool over the one
> maintained by the page allocator?  Fragmenting the memory into all kinds
> of pools has lots of downsides, so the upsides need to be explained in
> detail.

Well, the original code all this is based on already had the comment 
that this really belong into the page allocator.

The key point is traditionally only GPUs used uncached and 
write-combined memory in such large quantities that having a pool for 
them makes sense.

Because of this we had this separately to reduce the complexity in the 
page allocator to handle another set of complexity of allocation types.

For the upside, for those use cases it means huge performance 
improvements for those drivers. See the numbers John provided in the 
cover letter.

But essentially at least I would be totally fine moving this into the 
page allocator, but moving it outside of TTM already helps with this 
goal. So this patch set is certainly a step into the right direction.

Regards,
Christian.

Christoph Hellwig July 7, 2021, 7:14 a.m. UTC | #10

On Wed, Jul 07, 2021 at 09:10:26AM +0200, Christian K??nig wrote:
> Well, the original code all this is based on already had the comment that

> this really belong into the page allocator.

> 

> The key point is traditionally only GPUs used uncached and write-combined

> memory in such large quantities that having a pool for them makes sense.

> 

> Because of this we had this separately to reduce the complexity in the page

> allocator to handle another set of complexity of allocation types.

> 

> For the upside, for those use cases it means huge performance improvements

> for those drivers. See the numbers John provided in the cover letter.

> 

> But essentially at least I would be totally fine moving this into the page

> allocator, but moving it outside of TTM already helps with this goal. So

> this patch set is certainly a step into the right direction.


Unless I'm badly misreading the patch and this series there is nothing
about cache attributes in this code.  It just allocates pages, zeroes
them, eventually hands them out to a consumer and registers a shrinker
for its freelist.

If OTOH it actually dealt with cachability that should be documented
in the commit log and probably also the naming of the implementation.

Christian König July 7, 2021, 9:32 a.m. UTC | #11

Am 07.07.21 um 09:14 schrieb Christoph Hellwig:
> On Wed, Jul 07, 2021 at 09:10:26AM +0200, Christian K??nig wrote:
>> Well, the original code all this is based on already had the comment that
>> this really belong into the page allocator.
>>
>> The key point is traditionally only GPUs used uncached and write-combined
>> memory in such large quantities that having a pool for them makes sense.
>>
>> Because of this we had this separately to reduce the complexity in the page
>> allocator to handle another set of complexity of allocation types.
>>
>> For the upside, for those use cases it means huge performance improvements
>> for those drivers. See the numbers John provided in the cover letter.
>>
>> But essentially at least I would be totally fine moving this into the page
>> allocator, but moving it outside of TTM already helps with this goal. So
>> this patch set is certainly a step into the right direction.
> Unless I'm badly misreading the patch and this series there is nothing
> about cache attributes in this code.  It just allocates pages, zeroes
> them, eventually hands them out to a consumer and registers a shrinker
> for its freelist.
>
> If OTOH it actually dealt with cachability that should be documented
> in the commit log and probably also the naming of the implementation.

Mhm, good point. In this case all left is the clearing overhead from the 
allocation hot path to the free path and I'm not really sure why the 
core page allocator shouldn't do that as well.

Regards,
Christian.

John Stultz July 7, 2021, 7:35 p.m. UTC | #12

On Tue, Jul 6, 2021 at 11:38 PM Christoph Hellwig <hch@infradead.org> wrote:
> On Wed, Jun 30, 2021 at 01:34:17AM +0000, John Stultz wrote:
> > This adds a shrinker controlled page pool, extracted
> > out of the ttm_pool logic, and abstracted out a bit
> > so it can be used by other non-ttm drivers.
>
> Can you explain in detail why you need a differnt page pool over the one
> maintained by the page allocator?  Fragmenting the memory into all kinds
> of pools has lots of downsides, so the upsides need to be explained in
> detail.

So, as Christian mentioned, on the TTM side it's useful, as they are
trying to avoid TLB flushes when changing caching attributes.

For the dmabuf system heap purposes, the main benefit is moving the
page zeroing to the free path, rather than the allocation path. This
on its own doesn't save much, but allows us to defer frees (and thus
the zeroing) to the background, which can get that work out of the hot
path.

thanks
-john

John Stultz July 7, 2021, 7:42 p.m. UTC | #13

On Wed, Jul 7, 2021 at 12:15 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Wed, Jul 07, 2021 at 09:10:26AM +0200, Christian K??nig wrote:
> > Well, the original code all this is based on already had the comment that
> > this really belong into the page allocator.
> >
> > The key point is traditionally only GPUs used uncached and write-combined
> > memory in such large quantities that having a pool for them makes sense.
> >
> > Because of this we had this separately to reduce the complexity in the page
> > allocator to handle another set of complexity of allocation types.
> >
> > For the upside, for those use cases it means huge performance improvements
> > for those drivers. See the numbers John provided in the cover letter.
> >
> > But essentially at least I would be totally fine moving this into the page
> > allocator, but moving it outside of TTM already helps with this goal. So
> > this patch set is certainly a step into the right direction.
>
> Unless I'm badly misreading the patch and this series there is nothing
> about cache attributes in this code.  It just allocates pages, zeroes
> them, eventually hands them out to a consumer and registers a shrinker
> for its freelist.
>
> If OTOH it actually dealt with cachability that should be documented
> in the commit log and probably also the naming of the implementation.

So the cache attributes are still managed in the tmm_pool code. This
series just pulls the pool/shrinker management out into common code so
we can use it in the dmabuf heap code without duplicating things.

thanks
-john

Christoph Hellwig July 8, 2021, 4:20 a.m. UTC | #14

On Wed, Jul 07, 2021 at 12:35:23PM -0700, John Stultz wrote:
> So, as Christian mentioned, on the TTM side it's useful, as they are

> trying to avoid TLB flushes when changing caching attributes.

> 

> For the dmabuf system heap purposes, the main benefit is moving the

> page zeroing to the free path, rather than the allocation path. This

> on its own doesn't save much, but allows us to defer frees (and thus

> the zeroing) to the background, which can get that work out of the hot

> path.


I really do no think that is worth it to fragment the free pages.

Christian König July 8, 2021, 7:37 a.m. UTC | #15

Am 08.07.21 um 06:20 schrieb Christoph Hellwig:
> On Wed, Jul 07, 2021 at 12:35:23PM -0700, John Stultz wrote:
>> So, as Christian mentioned, on the TTM side it's useful, as they are
>> trying to avoid TLB flushes when changing caching attributes.
>>
>> For the dmabuf system heap purposes, the main benefit is moving the
>> page zeroing to the free path, rather than the allocation path. This
>> on its own doesn't save much, but allows us to defer frees (and thus
>> the zeroing) to the background, which can get that work out of the hot
>> path.
> I really do no think that is worth it to fragment the free pages.

And I think functionality like that should be part of the common page 
allocator.

I mean we already have __GFP_ZERO, why not have a background kernel 
thread which zeros free pages when a CPU core is idle? (I'm pretty sure 
we already have that somehow).

Christian.

[v9,0/5] Generic page pool & deferred freeing for system dmabuf hea

Message

Comments