mbox series

[00/10] convert the majority of file systems to mmap_prepare

Message ID cover.1750099179.git.lorenzo.stoakes@oracle.com
Headers show
Series convert the majority of file systems to mmap_prepare | expand

Message

Lorenzo Stoakes June 16, 2025, 7:33 p.m. UTC
REVIEWER'S NOTES
================

I am basing this on the mm-new branch in Andrew's tree, so let me know if I
should rebase anything here. Given the mm bits touched I did think perhaps
we should take it through the mm tree, however it may be more sensible to
take it through an fs tree - let me know!

Apologies for the noise/churn, but there are some prerequisite steps here
that inform an ordering - "fs: consistently use file_has_valid_mmap_hooks()
helper" being especially critical, and so I put the bulk of the work in the
same series.

Let me know if there's anything I can do to make life easier here.

Thanks!

===============

In commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file
callback"), a new hook for mmap was introduced - f_op->mmap_prepare().

This is preferred to the existing f_op->mmap() hook as it does require a
VMA to be established yet, thus allowing the mmap logic to invoke this hook
far, far earlier, prior to inserting a VMA into the virtual address space,
or performing any other heavy handed operations.

This allows for much simpler unwinding on error, and for there to be a
single attempt at merging a VMA rather than having to possibly reattempt a
merge based on potentially altered VMA state.

Far more importantly, it prevents inappropriate manipulation of
incompletely initialised VMA state, which is something that has been the
cause of bugs and complexity in the past.

The intent is to gradually deprecate f_op->mmap, and in that vein this
series coverts the majority of file systems to using f_op->mmap_prepare.

Prerequisite steps are taken - firstly ensuring all checks for mmap
capabilities use the file_has_valid_mmap_hooks() helper rather than
directly checking for f_op->mmap (which is now not a valid check) and
secondly updating daxdev_mapping_supported() to not require a VMA parameter
to allow ext4 and xfs to be converted.

Commit bb666b7c2707 ("mm: add mmap_prepare() compatibility layer for nested
file systems") handles the nasty edge-case of nested file systems like
overlayfs, which introduces a compatibility shim to allow
f_op->mmap_prepare() to be invoked from an f_op->mmap() callback.

This allows for nested filesystems to continue to function correctly with
all file systems regardless of which callback is used. Once we finally
convert all file systems, this shim can be removed.

As a result, ecryptfs, fuse, and overlayfs remain unaltered so they can
nest all other file systems.

We additionally do not update resctl - as this requires an update to
remap_pfn_range() (or an alternative to it) which we defer to a later
series, equally we do not update cramfs which needs a mixed mapping
insertion with the same issue, nor do we update procfs, hugetlbfs, syfs or
kernfs all of which require VMAs for internal state and hooks. We shall
return to all of these later.

Lorenzo Stoakes (10):
  mm: rename call_mmap/mmap_prepare to vfs_mmap/mmap_prepare
  mm/nommu: use file_has_valid_mmap_hooks() helper
  fs: consistently use file_has_valid_mmap_hooks() helper
  fs/dax: make it possible to check dev dax support without a VMA
  fs/ext4: transition from deprecated .mmap hook to .mmap_prepare
  fs/xfs: transition from deprecated .mmap hook to .mmap_prepare
  mm/filemap: introduce generic_file_*_mmap_prepare() helpers
  fs: convert simple use of generic_file_*_mmap() to .mmap_prepare()
  fs: convert most other generic_file_*mmap() users to .mmap_prepare()
  fs: replace mmap hook with .mmap_prepare for simple mappings

 block/fops.c                               |  9 +++---
 drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c |  2 +-
 fs/9p/vfs_file.c                           | 13 +++++----
 fs/adfs/file.c                             |  2 +-
 fs/affs/file.c                             |  2 +-
 fs/afs/file.c                              | 11 ++++----
 fs/aio.c                                   |  8 +++---
 fs/backing-file.c                          |  4 +--
 fs/bcachefs/fs.c                           |  8 +++---
 fs/bfs/file.c                              |  2 +-
 fs/binfmt_elf.c                            |  4 +--
 fs/binfmt_elf_fdpic.c                      |  2 +-
 fs/btrfs/file.c                            |  7 +++--
 fs/ceph/addr.c                             |  5 ++--
 fs/ceph/file.c                             |  2 +-
 fs/ceph/super.h                            |  2 +-
 fs/coda/file.c                             |  6 ++--
 fs/ecryptfs/file.c                         |  2 +-
 fs/erofs/data.c                            | 16 ++++++-----
 fs/exfat/file.c                            |  7 +++--
 fs/ext2/file.c                             | 12 ++++----
 fs/ext4/file.c                             | 13 +++++----
 fs/f2fs/file.c                             |  7 +++--
 fs/fat/file.c                              |  2 +-
 fs/hfs/inode.c                             |  2 +-
 fs/hfsplus/inode.c                         |  2 +-
 fs/hostfs/hostfs_kern.c                    |  2 +-
 fs/hpfs/file.c                             |  2 +-
 fs/jffs2/file.c                            |  2 +-
 fs/jfs/file.c                              |  2 +-
 fs/minix/file.c                            |  2 +-
 fs/nfs/file.c                              | 13 +++++----
 fs/nfs/internal.h                          |  2 +-
 fs/nfs/nfs4file.c                          |  2 +-
 fs/nilfs2/file.c                           |  8 +++---
 fs/ntfs3/file.c                            | 15 +++++-----
 fs/ocfs2/file.c                            |  4 +--
 fs/ocfs2/mmap.c                            |  5 ++--
 fs/ocfs2/mmap.h                            |  2 +-
 fs/omfs/file.c                             |  2 +-
 fs/orangefs/file.c                         | 10 ++++---
 fs/ramfs/file-mmu.c                        |  2 +-
 fs/ramfs/file-nommu.c                      | 12 ++++----
 fs/read_write.c                            |  2 +-
 fs/romfs/mmap-nommu.c                      |  6 ++--
 fs/smb/client/cifsfs.c                     | 12 ++++----
 fs/smb/client/cifsfs.h                     |  4 +--
 fs/smb/client/file.c                       | 14 ++++++----
 fs/ubifs/file.c                            |  8 +++---
 fs/ufs/file.c                              |  2 +-
 fs/vboxsf/file.c                           |  8 +++---
 fs/xfs/xfs_file.c                          | 15 +++++-----
 fs/zonefs/file.c                           | 10 ++++---
 include/linux/dax.h                        | 16 ++++++-----
 include/linux/fs.h                         | 11 ++++----
 ipc/shm.c                                  |  2 +-
 mm/filemap.c                               | 29 ++++++++++++++++++++
 mm/internal.h                              |  2 +-
 mm/nommu.c                                 |  2 +-
 mm/vma.c                                   |  2 +-
 tools/testing/vma/vma_internal.h           | 32 ++++++++++++++++++----
 61 files changed, 245 insertions(+), 171 deletions(-)

--
2.49.0

Comments

Kees Cook June 16, 2025, 8:01 p.m. UTC | #1
On June 16, 2025 12:33:22 PM PDT, Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
>Since commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file
>callback"), the f_op->mmap() hook has been deprecated in favour of
>f_op->mmap_prepare().
>
>Additionally, commit bb666b7c2707 ("mm: add mmap_prepare() compatibility
>layer for nested file systems") permits the use of the .mmap_prepare() hook
>even in nested filesystems like overlayfs.
>
>There are a number of places where we check only for f_op->mmap - this is
>incorrect now mmap_prepare exists, so update all of these to use the
>general helper file_has_valid_mmap_hooks().
>
>Most notably, this updates the elf logic to allow for the ability to
>execute binaries on filesystems which have the .mmap_prepare hook, but
>additionally we update nested filesystems.
>
>Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
>---
> fs/backing-file.c     | 2 +-
> fs/binfmt_elf.c       | 4 ++--
> fs/binfmt_elf_fdpic.c | 2 +-

Thanks for the refactoring!

Acked-by: Kees Cook<kees@kernel.org>
Al Viro June 16, 2025, 8:41 p.m. UTC | #2
On Mon, Jun 16, 2025 at 08:33:19PM +0100, Lorenzo Stoakes wrote:
> REVIEWER'S NOTES
> ================
> 
> I am basing this on the mm-new branch in Andrew's tree, so let me know if I
> should rebase anything here. Given the mm bits touched I did think perhaps
> we should take it through the mm tree, however it may be more sensible to
> take it through an fs tree - let me know!
> 
> Apologies for the noise/churn, but there are some prerequisite steps here
> that inform an ordering - "fs: consistently use file_has_valid_mmap_hooks()
> helper" being especially critical, and so I put the bulk of the work in the
> same series.
> 
> Let me know if there's anything I can do to make life easier here.

Documentation/filesystems/porting.rst?
Andrew Morton June 16, 2025, 11:11 p.m. UTC | #3
On Mon, 16 Jun 2025 20:33:19 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> I am basing this on the mm-new branch in Andrew's tree, so let me know if I
> should rebase anything here. Given the mm bits touched I did think perhaps
> we should take it through the mm tree, however it may be more sensible to
> take it through an fs tree - let me know!

It's more fs/ than mm/ purely from a footprint point of view.  But is
there any expectation that there will be additional patches which build
on this?

I'll scoop it into mm-new for now, see what happens.

Minus all the cc's.  Sorry ;)
Viacheslav Dubeyko June 16, 2025, 11:26 p.m. UTC | #4
On Mon, 2025-06-16 at 20:33 +0100, Lorenzo Stoakes wrote:
> Since commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file
> callback"), the f_op->mmap() hook has been deprecated in favour of
> f_op->mmap_prepare().
> 
> We have provided generic .mmap_prepare() equivalents, so update all file
> systems that specify these directly in their file_operations structures.
> 
> This updates 9p, adfs, affs, bfs, fat, hfs, hfsplus, hostfs, hpfs, jffs2,
> jfs, minix, omfs, ramfs and ufs file systems directly.
> 
> It updates generic_ro_fops which impacts qnx4, cramfs, befs, squashfs,
> frebxfs, qnx6, efs, romfs, erofs and isofs file systems.
> 
> There are remaining file systems which use generic hooks in a less direct
> way which we address in a subsequent commit.
> 
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> ---
>  fs/9p/vfs_file.c        | 2 +-
>  fs/adfs/file.c          | 2 +-
>  fs/affs/file.c          | 2 +-
>  fs/bfs/file.c           | 2 +-
>  fs/fat/file.c           | 2 +-
>  fs/hfs/inode.c          | 2 +-
>  fs/hfsplus/inode.c      | 2 +-
>  fs/hostfs/hostfs_kern.c | 2 +-
>  fs/hpfs/file.c          | 2 +-
>  fs/jffs2/file.c         | 2 +-
>  fs/jfs/file.c           | 2 +-
>  fs/minix/file.c         | 2 +-
>  fs/omfs/file.c          | 2 +-
>  fs/ramfs/file-mmu.c     | 2 +-
>  fs/read_write.c         | 2 +-
>  fs/ufs/file.c           | 2 +-
>  16 files changed, 16 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/9p/vfs_file.c b/fs/9p/vfs_file.c
> index 348cc90bf9c5..2ff3e0ac7266 100644
> --- a/fs/9p/vfs_file.c
> +++ b/fs/9p/vfs_file.c
> @@ -516,7 +516,7 @@ const struct file_operations v9fs_file_operations = {
>  	.open = v9fs_file_open,
>  	.release = v9fs_dir_release,
>  	.lock = v9fs_file_lock,
> -	.mmap = generic_file_readonly_mmap,
> +	.mmap_prepare = generic_file_readonly_mmap_prepare,
>  	.splice_read = v9fs_file_splice_read,
>  	.splice_write = iter_file_splice_write,
>  	.fsync = v9fs_file_fsync,
> diff --git a/fs/adfs/file.c b/fs/adfs/file.c
> index ee80718aaeec..cd13165fd904 100644
> --- a/fs/adfs/file.c
> +++ b/fs/adfs/file.c
> @@ -25,7 +25,7 @@
>  const struct file_operations adfs_file_operations = {
>  	.llseek		= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.fsync		= generic_file_fsync,
>  	.write_iter	= generic_file_write_iter,
>  	.splice_read	= filemap_splice_read,
> diff --git a/fs/affs/file.c b/fs/affs/file.c
> index 7a71018e3f67..fbac204b7055 100644
> --- a/fs/affs/file.c
> +++ b/fs/affs/file.c
> @@ -999,7 +999,7 @@ const struct file_operations affs_file_operations = {
>  	.llseek		= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.open		= affs_file_open,
>  	.release	= affs_file_release,
>  	.fsync		= affs_file_fsync,
> diff --git a/fs/bfs/file.c b/fs/bfs/file.c
> index fa66a09e496a..6685c3411fe7 100644
> --- a/fs/bfs/file.c
> +++ b/fs/bfs/file.c
> @@ -27,7 +27,7 @@ const struct file_operations bfs_file_operations = {
>  	.llseek 	= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.splice_read	= filemap_splice_read,
>  };
>  
> diff --git a/fs/fat/file.c b/fs/fat/file.c
> index e887e9ab7472..4fc49a614fb8 100644
> --- a/fs/fat/file.c
> +++ b/fs/fat/file.c
> @@ -204,7 +204,7 @@ const struct file_operations fat_file_operations = {
>  	.llseek		= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.release	= fat_file_release,
>  	.unlocked_ioctl	= fat_generic_ioctl,
>  	.compat_ioctl	= compat_ptr_ioctl,
> diff --git a/fs/hfs/inode.c b/fs/hfs/inode.c
> index a81ce7a740b9..d419586d668d 100644
> --- a/fs/hfs/inode.c
> +++ b/fs/hfs/inode.c
> @@ -690,7 +690,7 @@ static const struct file_operations hfs_file_operations = {
>  	.llseek		= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.splice_read	= filemap_splice_read,
>  	.fsync		= hfs_file_fsync,
>  	.open		= hfs_file_open,
> diff --git a/fs/hfsplus/inode.c b/fs/hfsplus/inode.c
> index f331e9574217..0af7e302730c 100644
> --- a/fs/hfsplus/inode.c
> +++ b/fs/hfsplus/inode.c
> @@ -366,7 +366,7 @@ static const struct file_operations hfsplus_file_operations = {
>  	.llseek		= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.splice_read	= filemap_splice_read,
>  	.fsync		= hfsplus_file_fsync,
>  	.open		= hfsplus_file_open,

Looks good for HFS/HFS+.

Reviewed-by: Viacheslav Dubeyko <Slava.Dubeyko@ibm.com>

Thanks,
Slava.

> diff --git a/fs/hostfs/hostfs_kern.c b/fs/hostfs/hostfs_kern.c
> index 702c41317589..bc22b6cc72af 100644
> --- a/fs/hostfs/hostfs_kern.c
> +++ b/fs/hostfs/hostfs_kern.c
> @@ -382,7 +382,7 @@ static const struct file_operations hostfs_file_fops = {
>  	.splice_write	= iter_file_splice_write,
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.open		= hostfs_open,
>  	.release	= hostfs_file_release,
>  	.fsync		= hostfs_fsync,
> diff --git a/fs/hpfs/file.c b/fs/hpfs/file.c
> index 449a3fc1b8d9..a1a44e3edb19 100644
> --- a/fs/hpfs/file.c
> +++ b/fs/hpfs/file.c
> @@ -255,7 +255,7 @@ const struct file_operations hpfs_file_ops =
>  	.llseek		= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.release	= hpfs_file_release,
>  	.fsync		= hpfs_file_fsync,
>  	.splice_read	= filemap_splice_read,
> diff --git a/fs/jffs2/file.c b/fs/jffs2/file.c
> index 13c18ccc13b0..1e05f7fe5dd4 100644
> --- a/fs/jffs2/file.c
> +++ b/fs/jffs2/file.c
> @@ -54,7 +54,7 @@ const struct file_operations jffs2_file_operations =
>   	.read_iter =	generic_file_read_iter,
>   	.write_iter =	generic_file_write_iter,
>  	.unlocked_ioctl=jffs2_ioctl,
> -	.mmap =		generic_file_readonly_mmap,
> +	.mmap_prepare =	generic_file_readonly_mmap_prepare,
>  	.fsync =	jffs2_fsync,
>  	.splice_read =	filemap_splice_read,
>  	.splice_write = iter_file_splice_write,
> diff --git a/fs/jfs/file.c b/fs/jfs/file.c
> index 01b6912e60f8..5e47951db630 100644
> --- a/fs/jfs/file.c
> +++ b/fs/jfs/file.c
> @@ -143,7 +143,7 @@ const struct file_operations jfs_file_operations = {
>  	.llseek		= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.splice_read	= filemap_splice_read,
>  	.splice_write	= iter_file_splice_write,
>  	.fsync		= jfs_fsync,
> diff --git a/fs/minix/file.c b/fs/minix/file.c
> index 906d192ab7f3..dca7ac71f049 100644
> --- a/fs/minix/file.c
> +++ b/fs/minix/file.c
> @@ -17,7 +17,7 @@ const struct file_operations minix_file_operations = {
>  	.llseek		= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.fsync		= generic_file_fsync,
>  	.splice_read	= filemap_splice_read,
>  };
> diff --git a/fs/omfs/file.c b/fs/omfs/file.c
> index 98358d405b6a..319c04e63964 100644
> --- a/fs/omfs/file.c
> +++ b/fs/omfs/file.c
> @@ -332,7 +332,7 @@ const struct file_operations omfs_file_operations = {
>  	.llseek = generic_file_llseek,
>  	.read_iter = generic_file_read_iter,
>  	.write_iter = generic_file_write_iter,
> -	.mmap = generic_file_mmap,
> +	.mmap_prepare = generic_file_mmap_prepare,
>  	.fsync = generic_file_fsync,
>  	.splice_read = filemap_splice_read,
>  };
> diff --git a/fs/ramfs/file-mmu.c b/fs/ramfs/file-mmu.c
> index b45c7edc3225..b11f5b20b78b 100644
> --- a/fs/ramfs/file-mmu.c
> +++ b/fs/ramfs/file-mmu.c
> @@ -41,7 +41,7 @@ static unsigned long ramfs_mmu_get_unmapped_area(struct file *file,
>  const struct file_operations ramfs_file_operations = {
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.fsync		= noop_fsync,
>  	.splice_read	= filemap_splice_read,
>  	.splice_write	= iter_file_splice_write,
> diff --git a/fs/read_write.c b/fs/read_write.c
> index 0ef70e128c4a..80fdab99f9e4 100644
> --- a/fs/read_write.c
> +++ b/fs/read_write.c
> @@ -28,7 +28,7 @@
>  const struct file_operations generic_ro_fops = {
>  	.llseek		= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
> -	.mmap		= generic_file_readonly_mmap,
> +	.mmap_prepare	= generic_file_readonly_mmap_prepare,
>  	.splice_read	= filemap_splice_read,
>  };
>  
> diff --git a/fs/ufs/file.c b/fs/ufs/file.c
> index 487ad1fc2de6..c2a391c17df7 100644
> --- a/fs/ufs/file.c
> +++ b/fs/ufs/file.c
> @@ -38,7 +38,7 @@ const struct file_operations ufs_file_operations = {
>  	.llseek		= generic_file_llseek,
>  	.read_iter	= generic_file_read_iter,
>  	.write_iter	= generic_file_write_iter,
> -	.mmap		= generic_file_mmap,
> +	.mmap_prepare	= generic_file_mmap_prepare,
>  	.open           = generic_file_open,
>  	.fsync		= generic_file_fsync,
>  	.splice_read	= filemap_splice_read,
Christoph Hellwig June 17, 2025, 5:10 a.m. UTC | #5
On Mon, Jun 16, 2025 at 08:33:20PM +0100, Lorenzo Stoakes wrote:
> The call_mmap() function violates the existing convention in
> include/linux/fs.h whereby invocations of virtual file system hooks is
> performed by functions prefixed with vfs_xxx().
> 
> Correct this by renaming call_mmap() to vfs_mmap(). This also avoids
> confusion as to the fact that f_op->mmap_prepare may be invoked here.
> 
> Also rename __call_mmap_prepare() function to vfs_mmap_prepare() and adjust
> to accept a file parameter, this is useful later for nested file systems.
> 
> Finally, fix up the VMA userland tests and ensure the mmap_prepare -> mmap
> shim is implemented there.

Can we please just kill these silly call_* helpers instead?
Lorenzo Stoakes June 17, 2025, 5:25 a.m. UTC | #6
On Mon, Jun 16, 2025 at 10:11:28PM -0700, Christoph Hellwig wrote:
> On Mon, Jun 16, 2025 at 08:33:22PM +0100, Lorenzo Stoakes wrote:
> > Since commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file
> > callback"), the f_op->mmap() hook has been deprecated in favour of
> > f_op->mmap_prepare().
> >
> > Additionally, commit bb666b7c2707 ("mm: add mmap_prepare() compatibility
> > layer for nested file systems") permits the use of the .mmap_prepare() hook
> > even in nested filesystems like overlayfs.
> >
> > There are a number of places where we check only for f_op->mmap - this is
> > incorrect now mmap_prepare exists, so update all of these to use the
> > general helper file_has_valid_mmap_hooks().
> >
> > Most notably, this updates the elf logic to allow for the ability to
> > execute binaries on filesystems which have the .mmap_prepare hook, but
> > additionally we update nested filesystems.
>
> Can you please give the function a better name before spreading it?
> file operations aren't hooks by any classic definition.
>

can_mmap_file()?
Lorenzo Stoakes June 17, 2025, 5:29 a.m. UTC | #7
On Mon, Jun 16, 2025 at 10:10:30PM -0700, Christoph Hellwig wrote:
> On Mon, Jun 16, 2025 at 08:33:20PM +0100, Lorenzo Stoakes wrote:
> > The call_mmap() function violates the existing convention in
> > include/linux/fs.h whereby invocations of virtual file system hooks is
> > performed by functions prefixed with vfs_xxx().
> >
> > Correct this by renaming call_mmap() to vfs_mmap(). This also avoids
> > confusion as to the fact that f_op->mmap_prepare may be invoked here.
> >
> > Also rename __call_mmap_prepare() function to vfs_mmap_prepare() and adjust
> > to accept a file parameter, this is useful later for nested file systems.
> >
> > Finally, fix up the VMA userland tests and ensure the mmap_prepare -> mmap
> > shim is implemented there.
>
> Can we please just kill these silly call_* helpers instead?

The vfs_mmap() function now has some actual meaningful logic added in
commit bb666b7c2707 ("mm: add mmap_prepare() compatibility layer for nested
file systems").

>
>
> ______________________________________________________
> Linux MTD discussion mailing list
> http://lists.infradead.org/mailman/listinfo/linux-mtd/
Jan Kara June 17, 2025, 10:08 a.m. UTC | #8
On Tue 17-06-25 06:25:34, Lorenzo Stoakes wrote:
> On Mon, Jun 16, 2025 at 10:11:28PM -0700, Christoph Hellwig wrote:
> > On Mon, Jun 16, 2025 at 08:33:22PM +0100, Lorenzo Stoakes wrote:
> > > Since commit c84bf6dd2b83 ("mm: introduce new .mmap_prepare() file
> > > callback"), the f_op->mmap() hook has been deprecated in favour of
> > > f_op->mmap_prepare().
> > >
> > > Additionally, commit bb666b7c2707 ("mm: add mmap_prepare() compatibility
> > > layer for nested file systems") permits the use of the .mmap_prepare() hook
> > > even in nested filesystems like overlayfs.
> > >
> > > There are a number of places where we check only for f_op->mmap - this is
> > > incorrect now mmap_prepare exists, so update all of these to use the
> > > general helper file_has_valid_mmap_hooks().
> > >
> > > Most notably, this updates the elf logic to allow for the ability to
> > > execute binaries on filesystems which have the .mmap_prepare hook, but
> > > additionally we update nested filesystems.
> >
> > Can you please give the function a better name before spreading it?
> > file operations aren't hooks by any classic definition.
> >
> 
> can_mmap_file()?

I like this name more as well :). With this patch looks good to me. Again a
note that Fixes tag would be probably appropriate for this patch...

								Honza
Jan Kara June 17, 2025, 10:23 a.m. UTC | #9
On Mon 16-06-25 20:33:28, Lorenzo Stoakes wrote:
> Update nearly all generic_file_mmap() and generic_file_readonly_mmap()
> callers to use generic_file_mmap_prepare() and
> generic_file_readonly_mmap_prepare() respectively.
> 
> We update blkdev, 9p, afs, erofs, ext2, nfs, ntfs3, smb, ubifs and vboxsf
> file systems this way.
> 
> Remaining users we cannot yet update are ecryptfs, fuse and cramfs. The
> former two are nested file systems that must support any underlying file
> ssytem, and cramfs inserts a mixed mapping which currently requires a VMA.
> 
> Once all file systems have been converted to mmap_prepare(), we can then
> update nested file systems.
> 
> Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Overall the patch looks good. Just a couple of notes regarding pointless
local variable being created...

> ---
>  block/fops.c           |  9 +++++----
>  fs/9p/vfs_file.c       | 11 ++++++-----
>  fs/afs/file.c          | 11 ++++++-----
>  fs/erofs/data.c        | 16 +++++++++-------
>  fs/ext2/file.c         | 12 +++++++-----
>  fs/nfs/file.c          | 13 +++++++------
>  fs/nfs/internal.h      |  2 +-
>  fs/nfs/nfs4file.c      |  2 +-
>  fs/ntfs3/file.c        | 15 ++++++++-------
>  fs/smb/client/cifsfs.c | 12 ++++++------
>  fs/smb/client/cifsfs.h |  4 ++--
>  fs/smb/client/file.c   | 14 ++++++++------
>  fs/ubifs/file.c        |  8 ++++----
>  fs/vboxsf/file.c       |  8 ++++----
>  14 files changed, 74 insertions(+), 63 deletions(-)
> 
> diff --git a/block/fops.c b/block/fops.c
> index 1309861d4c2c..5a0ebc81e489 100644
> --- a/block/fops.c
> +++ b/block/fops.c
> @@ -911,14 +911,15 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
>  	return error;
>  }
>  
> -static int blkdev_mmap(struct file *file, struct vm_area_struct *vma)
> +static int blkdev_mmap_prepare(struct vm_area_desc *desc)
>  {
> +	struct file *file = desc->file;
>  	struct inode *bd_inode = bdev_file_inode(file);

I guess no need to create 'file' variable here since it has only one use in
the line above...

>  
>  	if (bdev_read_only(I_BDEV(bd_inode)))
> -		return generic_file_readonly_mmap(file, vma);
> +		return generic_file_readonly_mmap_prepare(desc);
>  
> -	return generic_file_mmap(file, vma);
> +	return generic_file_mmap_prepare(desc);
>  }
>  
>  const struct file_operations def_blk_fops = {

...

> @@ -492,16 +492,17 @@ static void afs_drop_open_mmap(struct afs_vnode *vnode)
>  /*
>   * Handle setting up a memory mapping on an AFS file.
>   */
> -static int afs_file_mmap(struct file *file, struct vm_area_struct *vma)
> +static int afs_file_mmap_prepare(struct vm_area_desc *desc)
>  {
> +	struct file *file = desc->file;
>  	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));

Same comment about pointless local variable here as well.

>  	int ret;
>  
>  	afs_add_open_mmap(vnode);
>  
> -	ret = generic_file_mmap(file, vma);
> +	ret = generic_file_mmap_prepare(desc);
>  	if (ret == 0)
> -		vma->vm_ops = &afs_vm_ops;
> +		desc->vm_ops = &afs_vm_ops;
>  	else
>  		afs_drop_open_mmap(vnode);
>  	return ret;
> diff --git a/fs/erofs/data.c b/fs/erofs/data.c
> index 6a329c329f43..52dfd1a44c43 100644
> --- a/fs/erofs/data.c
> +++ b/fs/erofs/data.c
> @@ -409,20 +409,22 @@ static const struct vm_operations_struct erofs_dax_vm_ops = {
>  	.huge_fault	= erofs_dax_huge_fault,
>  };
>  
> -static int erofs_file_mmap(struct file *file, struct vm_area_struct *vma)
> +static int erofs_file_mmap_prepare(struct vm_area_desc *desc)
>  {
> +	struct file *file = desc->file;
> +
>  	if (!IS_DAX(file_inode(file)))

And here...

> -		return generic_file_readonly_mmap(file, vma);
> +		return generic_file_readonly_mmap_prepare(desc);
>  
> -	if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE))
> +	if ((desc->vm_flags & VM_SHARED) && (desc->vm_flags & VM_MAYWRITE))
>  		return -EINVAL;
>  
> -	vma->vm_ops = &erofs_dax_vm_ops;
> -	vm_flags_set(vma, VM_HUGEPAGE);
> +	desc->vm_ops = &erofs_dax_vm_ops;
> +	desc->vm_flags |= VM_HUGEPAGE;
>  	return 0;
>  }
...
> diff --git a/fs/smb/client/file.c b/fs/smb/client/file.c
> index 9835672267d2..2ed5173cfa73 100644
> --- a/fs/smb/client/file.c
> +++ b/fs/smb/client/file.c
> @@ -2995,8 +2995,9 @@ static const struct vm_operations_struct cifs_file_vm_ops = {
>  	.page_mkwrite = cifs_page_mkwrite,
>  };
>  
> -int cifs_file_strict_mmap(struct file *file, struct vm_area_struct *vma)
> +int cifs_file_strict_mmap_prepare(struct vm_area_desc *desc)
>  {
> +	struct file *file = desc->file;
>  	int xid, rc = 0;
>  	struct inode *inode = file_inode(file);

Again pointless local variable 'file' here.

>  
> @@ -3005,16 +3006,17 @@ int cifs_file_strict_mmap(struct file *file, struct vm_area_struct *vma)
>  	if (!CIFS_CACHE_READ(CIFS_I(inode)))
>  		rc = cifs_zap_mapping(inode);
>  	if (!rc)
> -		rc = generic_file_mmap(file, vma);
> +		rc = generic_file_mmap_prepare(desc);
>  	if (!rc)
> -		vma->vm_ops = &cifs_file_vm_ops;
> +		desc->vm_ops = &cifs_file_vm_ops;
>  
>  	free_xid(xid);
>  	return rc;
>  }
>  
> -int cifs_file_mmap(struct file *file, struct vm_area_struct *vma)
> +int cifs_file_mmap_prepare(struct vm_area_desc *desc)
>  {
> +	struct file *file = desc->file;
>  	int rc, xid;

And here (the only use is in cifs_revalidate_file(file)).

								Honza
Christian Brauner June 17, 2025, 11:31 a.m. UTC | #10
On Mon, Jun 16, 2025 at 04:11:11PM -0700, Andrew Morton wrote:
> On Mon, 16 Jun 2025 20:33:19 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
> 
> > I am basing this on the mm-new branch in Andrew's tree, so let me know if I
> > should rebase anything here. Given the mm bits touched I did think perhaps
> > we should take it through the mm tree, however it may be more sensible to
> > take it through an fs tree - let me know!
> 
> It's more fs/ than mm/ purely from a footprint point of view.  But
> there any expectation that there will be additional patches which build
> on this?
> 
> I'll scoop it into mm-new for now, see what happens.

I'm going to carry this in the vfs-6.17.mmap_prepare branch after fixing
up the various minor issues spotted in the series.
Christian Brauner June 17, 2025, 11:54 a.m. UTC | #11
On Tue, Jun 17, 2025 at 12:23:41PM +0200, Jan Kara wrote:
> On Mon 16-06-25 20:33:28, Lorenzo Stoakes wrote:
> > Update nearly all generic_file_mmap() and generic_file_readonly_mmap()
> > callers to use generic_file_mmap_prepare() and
> > generic_file_readonly_mmap_prepare() respectively.
> > 
> > We update blkdev, 9p, afs, erofs, ext2, nfs, ntfs3, smb, ubifs and vboxsf
> > file systems this way.
> > 
> > Remaining users we cannot yet update are ecryptfs, fuse and cramfs. The
> > former two are nested file systems that must support any underlying file
> > ssytem, and cramfs inserts a mixed mapping which currently requires a VMA.
> > 
> > Once all file systems have been converted to mmap_prepare(), we can then
> > update nested file systems.
> > 
> > Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
> Overall the patch looks good. Just a couple of notes regarding pointless
> local variable being created...
> 
> > ---
> >  block/fops.c           |  9 +++++----
> >  fs/9p/vfs_file.c       | 11 ++++++-----
> >  fs/afs/file.c          | 11 ++++++-----
> >  fs/erofs/data.c        | 16 +++++++++-------
> >  fs/ext2/file.c         | 12 +++++++-----
> >  fs/nfs/file.c          | 13 +++++++------
> >  fs/nfs/internal.h      |  2 +-
> >  fs/nfs/nfs4file.c      |  2 +-
> >  fs/ntfs3/file.c        | 15 ++++++++-------
> >  fs/smb/client/cifsfs.c | 12 ++++++------
> >  fs/smb/client/cifsfs.h |  4 ++--
> >  fs/smb/client/file.c   | 14 ++++++++------
> >  fs/ubifs/file.c        |  8 ++++----
> >  fs/vboxsf/file.c       |  8 ++++----
> >  14 files changed, 74 insertions(+), 63 deletions(-)
> > 
> > diff --git a/block/fops.c b/block/fops.c
> > index 1309861d4c2c..5a0ebc81e489 100644
> > --- a/block/fops.c
> > +++ b/block/fops.c
> > @@ -911,14 +911,15 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
> >  	return error;
> >  }
> >  
> > -static int blkdev_mmap(struct file *file, struct vm_area_struct *vma)
> > +static int blkdev_mmap_prepare(struct vm_area_desc *desc)
> >  {
> > +	struct file *file = desc->file;
> >  	struct inode *bd_inode = bdev_file_inode(file);
> 
> I guess no need to create 'file' variable here since it has only one use in
> the line above...

Agreed, fixed in-tree.

> > -static int afs_file_mmap(struct file *file, struct vm_area_struct *vma)
> > +static int afs_file_mmap_prepare(struct vm_area_desc *desc)
> >  {
> > +	struct file *file = desc->file;
> >  	struct afs_vnode *vnode = AFS_FS_I(file_inode(file));
> 
> Same comment about pointless local variable here as well.

Same.

> > -static int erofs_file_mmap(struct file *file, struct vm_area_struct *vma)
> > +static int erofs_file_mmap_prepare(struct vm_area_desc *desc)
> >  {
> > +	struct file *file = desc->file;
> > +
> >  	if (!IS_DAX(file_inode(file)))
> 
> And here...

Same.

> > -int cifs_file_strict_mmap(struct file *file, struct vm_area_struct *vma)
> > +int cifs_file_strict_mmap_prepare(struct vm_area_desc *desc)
> >  {
> > +	struct file *file = desc->file;
> >  	int xid, rc = 0;
> >  	struct inode *inode = file_inode(file);
> 
> Again pointless local variable 'file' here.

Same.

> > -int cifs_file_mmap(struct file *file, struct vm_area_struct *vma)
> > +int cifs_file_mmap_prepare(struct vm_area_desc *desc)
> >  {
> > +	struct file *file = desc->file;
> >  	int rc, xid;
> 
> And here (the only use is in cifs_revalidate_file(file)).

Same.
Christian Brauner June 17, 2025, 11:58 a.m. UTC | #12
On Mon, 16 Jun 2025 20:33:19 +0100, Lorenzo Stoakes wrote:
> REVIEWER'S NOTES
> ================
> 
> I am basing this on the mm-new branch in Andrew's tree, so let me know if I
> should rebase anything here. Given the mm bits touched I did think perhaps
> we should take it through the mm tree, however it may be more sensible to
> take it through an fs tree - let me know!
> 
> [...]

This looks good. I fixed up the minor review comments.
Looking forward to further cleanups in this area.

---

Applied to the vfs-6.17.mmap_prepare branch of the vfs/vfs.git tree.
Patches in the vfs-6.17.mmap_prepare branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-6.17.mmap_prepare

[01/10] mm: rename call_mmap/mmap_prepare to vfs_mmap/mmap_prepare
        https://git.kernel.org/vfs/vfs/c/20ca475d9860
[02/10] mm/nommu: use file_has_valid_mmap_hooks() helper
        https://git.kernel.org/vfs/vfs/c/c6900f227f89
[03/10] fs: consistently use file_has_valid_mmap_hooks() helper
        https://git.kernel.org/vfs/vfs/c/b013ed403197
[04/10] fs/dax: make it possible to check dev dax support without a VMA
        https://git.kernel.org/vfs/vfs/c/0335f6afd348
[05/10] fs/ext4: transition from deprecated .mmap hook to .mmap_prepare
        https://git.kernel.org/vfs/vfs/c/8c90ae8fe5e3
[06/10] fs/xfs: transition from deprecated .mmap hook to .mmap_prepare
        https://git.kernel.org/vfs/vfs/c/6528d29b46d8
[07/10] mm/filemap: introduce generic_file_*_mmap_prepare() helpers
        https://git.kernel.org/vfs/vfs/c/5b44297bcfa4
[08/10] fs: convert simple use of generic_file_*_mmap() to .mmap_prepare()
        https://git.kernel.org/vfs/vfs/c/951ea2f4844c
[09/10] fs: convert most other generic_file_*mmap() users to .mmap_prepare()
        https://git.kernel.org/vfs/vfs/c/a5ee9a82981d
[10/10] fs: replace mmap hook with .mmap_prepare for simple mappings
        https://git.kernel.org/vfs/vfs/c/a1e5b36c4034
Jeff Layton June 17, 2025, 1:45 p.m. UTC | #13
On Mon, 2025-06-16 at 21:41 +0100, Al Viro wrote:
> On Mon, Jun 16, 2025 at 08:33:19PM +0100, Lorenzo Stoakes wrote:
> > REVIEWER'S NOTES
> > ================
> > 
> > I am basing this on the mm-new branch in Andrew's tree, so let me know if I
> > should rebase anything here. Given the mm bits touched I did think perhaps
> > we should take it through the mm tree, however it may be more sensible to
> > take it through an fs tree - let me know!
> > 
> > Apologies for the noise/churn, but there are some prerequisite steps here
> > that inform an ordering - "fs: consistently use file_has_valid_mmap_hooks()
> > helper" being especially critical, and so I put the bulk of the work in the
> > same series.
> > 
> > Let me know if there's anything I can do to make life easier here.
> 
> Documentation/filesystems/porting.rst?

Also, an entry for ->mmap_prepare in Documentation/filesystems/vfs.rst
would be good.

I went there first to understand what the requirements of mmap_prepare
are, but there is nothing.
David Howells June 17, 2025, 2:05 p.m. UTC | #14
Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> This is preferred to the existing f_op->mmap() hook as it does require a
> VMA to be established yet,

Did you mean ".. doesn't require a VMA to be established yet, ..."

David
Lorenzo Stoakes June 17, 2025, 2:17 p.m. UTC | #15
On Tue, Jun 17, 2025 at 03:05:59PM +0100, David Howells wrote:
> Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:
>
> > This is preferred to the existing f_op->mmap() hook as it does require a
> > VMA to be established yet,
>
> Did you mean ".. doesn't require a VMA to be established yet, ..."
>
> David
>

Yeah apologies, indeed I did :)