mbox series

[v4,0/4] mm: userfaultfd: refactor and add UFFDIO_CONTINUE_MODE_WP

Message ID 20230308221932.1548827-1-axelrasmussen@google.com
Headers show
Series mm: userfaultfd: refactor and add UFFDIO_CONTINUE_MODE_WP | expand

Message

Axel Rasmussen March 8, 2023, 10:19 p.m. UTC
This series, currently based on 6.3-rc1, is divided into two parts:

- Commits 1-3 refactor userfaultfd ioctl code without behavior changes, with the
  main goal of improving consistency and reducing the number of function args.
- Commit 4 adds UFFDIO_CONTINUE_MODE_WP.

The refactors are sorted by increasing controversial-ness, the idea being we
could drop some of the refactors if they are deemed not worth it.

Changelog:

v3->v4:
 - massage the uffd_flags_t implementation to eliminate all sparse warnings
 - add a couple inline helpers to make uffd_flags_t usage easier
 - drop the refactor passing `struct uffdio_range *` around (previously 4/5)
 - define a temporary `struct mm_struct *` in function with >=3 `vma->vm_mm`
 - consistent argument order between `flags` and `pagep`
 - expand on the use case in patch 4/4 message

v2->v3:
 - rebase onto 6.3-rc1
 - typedef a new type for mfill flags in patch 3/5 (suggested by Nadav)

v1->v2:
 - refactor before adding the new flag, to avoid perpetuating messiness

Axel Rasmussen (4):
  mm: userfaultfd: rename functions for clarity + consistency
  mm: userfaultfd: don't pass around both mm and vma
  mm: userfaultfd: combine 'mode' and 'wp_copy' arguments
  mm: userfaultfd: add UFFDIO_CONTINUE_MODE_WP to install WP PTEs

 fs/userfaultfd.c                         |  29 ++--
 include/linux/hugetlb.h                  |  27 ++--
 include/linux/shmem_fs.h                 |   9 +-
 include/linux/userfaultfd_k.h            |  68 +++++----
 include/uapi/linux/userfaultfd.h         |   7 +
 mm/hugetlb.c                             |  28 ++--
 mm/shmem.c                               |  14 +-
 mm/userfaultfd.c                         | 170 +++++++++++------------
 tools/testing/selftests/mm/userfaultfd.c |   4 +
 9 files changed, 187 insertions(+), 169 deletions(-)

--
2.40.0.rc1.284.g88254d51c5-goog

Comments

Mike Rapoport March 9, 2023, 8:40 a.m. UTC | #1
On Wed, Mar 08, 2023 at 02:19:29PM -0800, Axel Rasmussen wrote:
> The basic problem is, over time we've added new userfaultfd ioctls, and
> we've refactored the code so functions which used to handle only one
> case are now re-used to deal with several cases. While this happened, we
> didn't bother to rename the functions.
> 
> Similarly, as we added new functions, we cargo-culted pieces of the
> now-inconsistent naming scheme, so those functions too ended up with
> names that don't make a lot of sense.
> 
> A key point here is, "copy" in most userfaultfd code refers specifically
> to UFFDIO_COPY, where we allocate a new page and copy its contents from
> userspace. There are many functions with "copy" in the name that don't
> actually do this (at least in some cases).
> 
> So, rename things into a consistent scheme. The high level idea is that
> the call stack for userfaultfd ioctls becomes:
> 
> userfaultfd_ioctl
>   -> userfaultfd_(particular ioctl)
>     -> mfill_atomic_(particular kind of fill operation)
>       -> mfill_atomic    /* loops over pages in range */
>         -> mfill_atomic_pte    /* deals with single pages */
>           -> mfill_atomic_pte_(particular kind of fill operation)
>             -> mfill_atomic_install_pte
> 
> There are of course some special cases (shmem, hugetlb), but this is the
> general structure which all function names now adhere to.
> 
> Acked-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>

Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>

> ---
>  fs/userfaultfd.c              | 18 +++----
>  include/linux/hugetlb.h       | 30 +++++------
>  include/linux/userfaultfd_k.h | 18 +++----
>  mm/hugetlb.c                  | 20 +++----
>  mm/userfaultfd.c              | 98 +++++++++++++++++------------------
>  5 files changed, 92 insertions(+), 92 deletions(-)
> 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index 44d1ee429eb0..365bf00dd8dd 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1741,9 +1741,9 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
>  	if (uffdio_copy.mode & ~(UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP))
>  		goto out;
>  	if (mmget_not_zero(ctx->mm)) {
> -		ret = mcopy_atomic(ctx->mm, uffdio_copy.dst, uffdio_copy.src,
> -				   uffdio_copy.len, &ctx->mmap_changing,
> -				   uffdio_copy.mode);
> +		ret = mfill_atomic_copy(ctx->mm, uffdio_copy.dst, uffdio_copy.src,
> +					uffdio_copy.len, &ctx->mmap_changing,
> +					uffdio_copy.mode);
>  		mmput(ctx->mm);
>  	} else {
>  		return -ESRCH;
> @@ -1793,9 +1793,9 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx *ctx,
>  		goto out;
>  
>  	if (mmget_not_zero(ctx->mm)) {
> -		ret = mfill_zeropage(ctx->mm, uffdio_zeropage.range.start,
> -				     uffdio_zeropage.range.len,
> -				     &ctx->mmap_changing);
> +		ret = mfill_atomic_zeropage(ctx->mm, uffdio_zeropage.range.start,
> +					   uffdio_zeropage.range.len,
> +					   &ctx->mmap_changing);
>  		mmput(ctx->mm);
>  	} else {
>  		return -ESRCH;
> @@ -1903,9 +1903,9 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
>  		goto out;
>  
>  	if (mmget_not_zero(ctx->mm)) {
> -		ret = mcopy_continue(ctx->mm, uffdio_continue.range.start,
> -				     uffdio_continue.range.len,
> -				     &ctx->mmap_changing);
> +		ret = mfill_atomic_continue(ctx->mm, uffdio_continue.range.start,
> +					    uffdio_continue.range.len,
> +					    &ctx->mmap_changing);
>  		mmput(ctx->mm);
>  	} else {
>  		return -ESRCH;
> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 7c977d234aba..8f0467bf1cbd 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -158,13 +158,13 @@ unsigned long hugetlb_total_pages(void);
>  vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  			unsigned long address, unsigned int flags);
>  #ifdef CONFIG_USERFAULTFD
> -int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte,
> -				struct vm_area_struct *dst_vma,
> -				unsigned long dst_addr,
> -				unsigned long src_addr,
> -				enum mcopy_atomic_mode mode,
> -				struct page **pagep,
> -				bool wp_copy);
> +int hugetlb_mfill_atomic_pte(struct mm_struct *dst_mm, pte_t *dst_pte,
> +			     struct vm_area_struct *dst_vma,
> +			     unsigned long dst_addr,
> +			     unsigned long src_addr,
> +			     enum mcopy_atomic_mode mode,
> +			     struct page **pagep,
> +			     bool wp_copy);
>  #endif /* CONFIG_USERFAULTFD */
>  bool hugetlb_reserve_pages(struct inode *inode, long from, long to,
>  						struct vm_area_struct *vma,
> @@ -393,14 +393,14 @@ static inline void hugetlb_free_pgd_range(struct mmu_gather *tlb,
>  }
>  
>  #ifdef CONFIG_USERFAULTFD
> -static inline int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> -						pte_t *dst_pte,
> -						struct vm_area_struct *dst_vma,
> -						unsigned long dst_addr,
> -						unsigned long src_addr,
> -						enum mcopy_atomic_mode mode,
> -						struct page **pagep,
> -						bool wp_copy)
> +static inline int hugetlb_mfill_atomic_pte(struct mm_struct *dst_mm,
> +					   pte_t *dst_pte,
> +					   struct vm_area_struct *dst_vma,
> +					   unsigned long dst_addr,
> +					   unsigned long src_addr,
> +					   enum mcopy_atomic_mode mode,
> +					   struct page **pagep,
> +					   bool wp_copy)
>  {
>  	BUG();
>  	return 0;
> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
> index 3767f18114ef..468080125612 100644
> --- a/include/linux/userfaultfd_k.h
> +++ b/include/linux/userfaultfd_k.h
> @@ -61,15 +61,15 @@ extern int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
>  				    unsigned long dst_addr, struct page *page,
>  				    bool newly_allocated, bool wp_copy);
>  
> -extern ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start,
> -			    unsigned long src_start, unsigned long len,
> -			    atomic_t *mmap_changing, __u64 mode);
> -extern ssize_t mfill_zeropage(struct mm_struct *dst_mm,
> -			      unsigned long dst_start,
> -			      unsigned long len,
> -			      atomic_t *mmap_changing);
> -extern ssize_t mcopy_continue(struct mm_struct *dst_mm, unsigned long dst_start,
> -			      unsigned long len, atomic_t *mmap_changing);
> +extern ssize_t mfill_atomic_copy(struct mm_struct *dst_mm, unsigned long dst_start,
> +				 unsigned long src_start, unsigned long len,
> +				 atomic_t *mmap_changing, __u64 mode);
> +extern ssize_t mfill_atomic_zeropage(struct mm_struct *dst_mm,
> +				     unsigned long dst_start,
> +				     unsigned long len,
> +				     atomic_t *mmap_changing);
> +extern ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long dst_start,
> +				     unsigned long len, atomic_t *mmap_changing);
>  extern int mwriteprotect_range(struct mm_struct *dst_mm,
>  			       unsigned long start, unsigned long len,
>  			       bool enable_wp, atomic_t *mmap_changing);
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 07abcb6eb203..4c9276549394 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -6154,17 +6154,17 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
>  
>  #ifdef CONFIG_USERFAULTFD
>  /*
> - * Used by userfaultfd UFFDIO_COPY.  Based on mcopy_atomic_pte with
> - * modifications for huge pages.
> + * Used by userfaultfd UFFDIO_* ioctls. Based on userfaultfd's mfill_atomic_pte
> + * with modifications for hugetlb pages.
>   */
> -int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm,
> -			    pte_t *dst_pte,
> -			    struct vm_area_struct *dst_vma,
> -			    unsigned long dst_addr,
> -			    unsigned long src_addr,
> -			    enum mcopy_atomic_mode mode,
> -			    struct page **pagep,
> -			    bool wp_copy)
> +int hugetlb_mfill_atomic_pte(struct mm_struct *dst_mm,
> +			     pte_t *dst_pte,
> +			     struct vm_area_struct *dst_vma,
> +			     unsigned long dst_addr,
> +			     unsigned long src_addr,
> +			     enum mcopy_atomic_mode mode,
> +			     struct page **pagep,
> +			     bool wp_copy)
>  {
>  	bool is_continue = (mode == MCOPY_ATOMIC_CONTINUE);
>  	struct hstate *h = hstate_vma(dst_vma);
> diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
> index 53c3d916ff66..84db5b2fad3a 100644
> --- a/mm/userfaultfd.c
> +++ b/mm/userfaultfd.c
> @@ -127,13 +127,13 @@ int mfill_atomic_install_pte(struct mm_struct *dst_mm, pmd_t *dst_pmd,
>  	return ret;
>  }
>  
> -static int mcopy_atomic_pte(struct mm_struct *dst_mm,
> -			    pmd_t *dst_pmd,
> -			    struct vm_area_struct *dst_vma,
> -			    unsigned long dst_addr,
> -			    unsigned long src_addr,
> -			    struct page **pagep,
> -			    bool wp_copy)
> +static int mfill_atomic_pte_copy(struct mm_struct *dst_mm,
> +				 pmd_t *dst_pmd,
> +				 struct vm_area_struct *dst_vma,
> +				 unsigned long dst_addr,
> +				 unsigned long src_addr,
> +				 struct page **pagep,
> +				 bool wp_copy)
>  {
>  	void *page_kaddr;
>  	int ret;
> @@ -204,10 +204,10 @@ static int mcopy_atomic_pte(struct mm_struct *dst_mm,
>  	goto out;
>  }
>  
> -static int mfill_zeropage_pte(struct mm_struct *dst_mm,
> -			      pmd_t *dst_pmd,
> -			      struct vm_area_struct *dst_vma,
> -			      unsigned long dst_addr)
> +static int mfill_atomic_pte_zeropage(struct mm_struct *dst_mm,
> +				     pmd_t *dst_pmd,
> +				     struct vm_area_struct *dst_vma,
> +				     unsigned long dst_addr)
>  {
>  	pte_t _dst_pte, *dst_pte;
>  	spinlock_t *ptl;
> @@ -240,11 +240,11 @@ static int mfill_zeropage_pte(struct mm_struct *dst_mm,
>  }
>  
>  /* Handles UFFDIO_CONTINUE for all shmem VMAs (shared or private). */
> -static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
> -				pmd_t *dst_pmd,
> -				struct vm_area_struct *dst_vma,
> -				unsigned long dst_addr,
> -				bool wp_copy)
> +static int mfill_atomic_pte_continue(struct mm_struct *dst_mm,
> +				     pmd_t *dst_pmd,
> +				     struct vm_area_struct *dst_vma,
> +				     unsigned long dst_addr,
> +				     bool wp_copy)
>  {
>  	struct inode *inode = file_inode(dst_vma->vm_file);
>  	pgoff_t pgoff = linear_page_index(dst_vma, dst_addr);
> @@ -307,10 +307,10 @@ static pmd_t *mm_alloc_pmd(struct mm_struct *mm, unsigned long address)
>  
>  #ifdef CONFIG_HUGETLB_PAGE
>  /*
> - * __mcopy_atomic processing for HUGETLB vmas.  Note that this routine is
> + * mfill_atomic processing for HUGETLB vmas.  Note that this routine is
>   * called with mmap_lock held, it will release mmap_lock before returning.
>   */
> -static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
> +static __always_inline ssize_t mfill_atomic_hugetlb(struct mm_struct *dst_mm,
>  					      struct vm_area_struct *dst_vma,
>  					      unsigned long dst_start,
>  					      unsigned long src_start,
> @@ -411,7 +411,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
>  			goto out_unlock;
>  		}
>  
> -		err = hugetlb_mcopy_atomic_pte(dst_mm, dst_pte, dst_vma,
> +		err = hugetlb_mfill_atomic_pte(dst_mm, dst_pte, dst_vma,
>  					       dst_addr, src_addr, mode, &page,
>  					       wp_copy);
>  
> @@ -463,7 +463,7 @@ static __always_inline ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
>  }
>  #else /* !CONFIG_HUGETLB_PAGE */
>  /* fail at build time if gcc attempts to use this */
> -extern ssize_t __mcopy_atomic_hugetlb(struct mm_struct *dst_mm,
> +extern ssize_t mfill_atomic_hugetlb(struct mm_struct *dst_mm,
>  				      struct vm_area_struct *dst_vma,
>  				      unsigned long dst_start,
>  				      unsigned long src_start,
> @@ -484,8 +484,8 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
>  	ssize_t err;
>  
>  	if (mode == MCOPY_ATOMIC_CONTINUE) {
> -		return mcontinue_atomic_pte(dst_mm, dst_pmd, dst_vma, dst_addr,
> -					    wp_copy);
> +		return mfill_atomic_pte_continue(dst_mm, dst_pmd, dst_vma,
> +						 dst_addr, wp_copy);
>  	}
>  
>  	/*
> @@ -500,11 +500,11 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
>  	 */
>  	if (!(dst_vma->vm_flags & VM_SHARED)) {
>  		if (mode == MCOPY_ATOMIC_NORMAL)
> -			err = mcopy_atomic_pte(dst_mm, dst_pmd, dst_vma,
> -					       dst_addr, src_addr, page,
> -					       wp_copy);
> +			err = mfill_atomic_pte_copy(dst_mm, dst_pmd, dst_vma,
> +						    dst_addr, src_addr, page,
> +						    wp_copy);
>  		else
> -			err = mfill_zeropage_pte(dst_mm, dst_pmd,
> +			err = mfill_atomic_pte_zeropage(dst_mm, dst_pmd,
>  						 dst_vma, dst_addr);
>  	} else {
>  		err = shmem_mfill_atomic_pte(dst_mm, dst_pmd, dst_vma,
> @@ -516,13 +516,13 @@ static __always_inline ssize_t mfill_atomic_pte(struct mm_struct *dst_mm,
>  	return err;
>  }
>  
> -static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
> -					      unsigned long dst_start,
> -					      unsigned long src_start,
> -					      unsigned long len,
> -					      enum mcopy_atomic_mode mcopy_mode,
> -					      atomic_t *mmap_changing,
> -					      __u64 mode)
> +static __always_inline ssize_t mfill_atomic(struct mm_struct *dst_mm,
> +					    unsigned long dst_start,
> +					    unsigned long src_start,
> +					    unsigned long len,
> +					    enum mcopy_atomic_mode mcopy_mode,
> +					    atomic_t *mmap_changing,
> +					    __u64 mode)
>  {
>  	struct vm_area_struct *dst_vma;
>  	ssize_t err;
> @@ -588,9 +588,9 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
>  	 * If this is a HUGETLB vma, pass off to appropriate routine
>  	 */
>  	if (is_vm_hugetlb_page(dst_vma))
> -		return  __mcopy_atomic_hugetlb(dst_mm, dst_vma, dst_start,
> -					       src_start, len, mcopy_mode,
> -					       wp_copy);
> +		return  mfill_atomic_hugetlb(dst_mm, dst_vma, dst_start,
> +					     src_start, len, mcopy_mode,
> +					     wp_copy);
>  
>  	if (!vma_is_anonymous(dst_vma) && !vma_is_shmem(dst_vma))
>  		goto out_unlock;
> @@ -688,26 +688,26 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm,
>  	return copied ? copied : err;
>  }
>  
> -ssize_t mcopy_atomic(struct mm_struct *dst_mm, unsigned long dst_start,
> -		     unsigned long src_start, unsigned long len,
> -		     atomic_t *mmap_changing, __u64 mode)
> +ssize_t mfill_atomic_copy(struct mm_struct *dst_mm, unsigned long dst_start,
> +			  unsigned long src_start, unsigned long len,
> +			  atomic_t *mmap_changing, __u64 mode)
>  {
> -	return __mcopy_atomic(dst_mm, dst_start, src_start, len,
> -			      MCOPY_ATOMIC_NORMAL, mmap_changing, mode);
> +	return mfill_atomic(dst_mm, dst_start, src_start, len,
> +			    MCOPY_ATOMIC_NORMAL, mmap_changing, mode);
>  }
>  
> -ssize_t mfill_zeropage(struct mm_struct *dst_mm, unsigned long start,
> -		       unsigned long len, atomic_t *mmap_changing)
> +ssize_t mfill_atomic_zeropage(struct mm_struct *dst_mm, unsigned long start,
> +			      unsigned long len, atomic_t *mmap_changing)
>  {
> -	return __mcopy_atomic(dst_mm, start, 0, len, MCOPY_ATOMIC_ZEROPAGE,
> -			      mmap_changing, 0);
> +	return mfill_atomic(dst_mm, start, 0, len, MCOPY_ATOMIC_ZEROPAGE,
> +			    mmap_changing, 0);
>  }
>  
> -ssize_t mcopy_continue(struct mm_struct *dst_mm, unsigned long start,
> -		       unsigned long len, atomic_t *mmap_changing)
> +ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long start,
> +			      unsigned long len, atomic_t *mmap_changing)
>  {
> -	return __mcopy_atomic(dst_mm, start, 0, len, MCOPY_ATOMIC_CONTINUE,
> -			      mmap_changing, 0);
> +	return mfill_atomic(dst_mm, start, 0, len, MCOPY_ATOMIC_CONTINUE,
> +			    mmap_changing, 0);
>  }
>  
>  long uffd_wp_range(struct mm_struct *dst_mm, struct vm_area_struct *dst_vma,
> -- 
> 2.40.0.rc1.284.g88254d51c5-goog
>
Mike Rapoport March 9, 2023, 9:10 a.m. UTC | #2
On Wed, Mar 08, 2023 at 02:19:32PM -0800, Axel Rasmussen wrote:
> UFFDIO_COPY already has UFFDIO_COPY_MODE_WP, so when installing a new
> PTE to resolve a missing fault, one can install a write-protected one.
> This is useful when using UFFDIO_REGISTER_MODE_{MISSING,WP} in
> combination.
> 
> This was motivated by testing HugeTLB HGM [1], and in particular its
> interaction with userfaultfd features. Existing userfaultfd code
> supports using WP and MINOR modes together (i.e. you can register an
> area with both enabled), but without this CONTINUE flag the combination
> is in practice unusable.
> 
> So, add an analogous UFFDIO_CONTINUE_MODE_WP, which does the same thing
> as UFFDIO_COPY_MODE_WP, but for *minor* faults.
> 
> Update the selftest to do some very basic exercising of the new flag.
> 
> [1]: https://patchwork.kernel.org/project/linux-mm/cover/20230218002819.1486479-1-jthoughton@google.com/
> 
> Acked-by: Peter Xu <peterx@redhat.com>
> Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>

Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>

> ---
>  fs/userfaultfd.c                         | 8 ++++++--
>  include/linux/userfaultfd_k.h            | 3 ++-
>  include/uapi/linux/userfaultfd.h         | 7 +++++++
>  mm/userfaultfd.c                         | 5 +++--
>  tools/testing/selftests/mm/userfaultfd.c | 4 ++++
>  5 files changed, 22 insertions(+), 5 deletions(-)
> 
> diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
> index 005e5e306266..14059a0861bf 100644
> --- a/include/uapi/linux/userfaultfd.h
> +++ b/include/uapi/linux/userfaultfd.h
> @@ -297,6 +297,13 @@ struct uffdio_writeprotect {
>  struct uffdio_continue {
>  	struct uffdio_range range;
>  #define UFFDIO_CONTINUE_MODE_DONTWAKE		((__u64)1<<0)
> +	/*
> +	 * UFFDIO_CONTINUE_MODE_WP will map the page write protected on
> +	 * the fly.  UFFDIO_CONTINUE_MODE_WP is available only if the
> +	 * write protected ioctl is implemented for the range
> +	 * according to the uffdio_register.ioctls.
> +	 */
> +#define UFFDIO_CONTINUE_MODE_WP			((__u64)1<<1)

Please add the description of the new flag to Documentation/ and to the
userfaultfd man pages.

>  	__u64 mode;
>  
>  	/*
Axel Rasmussen March 9, 2023, 10:27 p.m. UTC | #3
On Thu, Mar 9, 2023 at 1:11 AM Mike Rapoport <rppt@kernel.org> wrote:
>
> On Wed, Mar 08, 2023 at 02:19:32PM -0800, Axel Rasmussen wrote:
> > UFFDIO_COPY already has UFFDIO_COPY_MODE_WP, so when installing a new
> > PTE to resolve a missing fault, one can install a write-protected one.
> > This is useful when using UFFDIO_REGISTER_MODE_{MISSING,WP} in
> > combination.
> >
> > This was motivated by testing HugeTLB HGM [1], and in particular its
> > interaction with userfaultfd features. Existing userfaultfd code
> > supports using WP and MINOR modes together (i.e. you can register an
> > area with both enabled), but without this CONTINUE flag the combination
> > is in practice unusable.
> >
> > So, add an analogous UFFDIO_CONTINUE_MODE_WP, which does the same thing
> > as UFFDIO_COPY_MODE_WP, but for *minor* faults.
> >
> > Update the selftest to do some very basic exercising of the new flag.
> >
> > [1]: https://patchwork.kernel.org/project/linux-mm/cover/20230218002819.1486479-1-jthoughton@google.com/
> >
> > Acked-by: Peter Xu <peterx@redhat.com>
> > Signed-off-by: Axel Rasmussen <axelrasmussen@google.com>
>
> Acked-by: Mike Rapoport (IBM) <rppt@kernel.org>
>
> > ---
> >  fs/userfaultfd.c                         | 8 ++++++--
> >  include/linux/userfaultfd_k.h            | 3 ++-
> >  include/uapi/linux/userfaultfd.h         | 7 +++++++
> >  mm/userfaultfd.c                         | 5 +++--
> >  tools/testing/selftests/mm/userfaultfd.c | 4 ++++
> >  5 files changed, 22 insertions(+), 5 deletions(-)
> >
> > diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h
> > index 005e5e306266..14059a0861bf 100644
> > --- a/include/uapi/linux/userfaultfd.h
> > +++ b/include/uapi/linux/userfaultfd.h
> > @@ -297,6 +297,13 @@ struct uffdio_writeprotect {
> >  struct uffdio_continue {
> >       struct uffdio_range range;
> >  #define UFFDIO_CONTINUE_MODE_DONTWAKE                ((__u64)1<<0)
> > +     /*
> > +      * UFFDIO_CONTINUE_MODE_WP will map the page write protected on
> > +      * the fly.  UFFDIO_CONTINUE_MODE_WP is available only if the
> > +      * write protected ioctl is implemented for the range
> > +      * according to the uffdio_register.ioctls.
> > +      */
> > +#define UFFDIO_CONTINUE_MODE_WP                      ((__u64)1<<1)
>
> Please add the description of the new flag to Documentation/ and to the
> userfaultfd man pages.

Funny enough, neither flag is mentioned in Documentation/ today - I'll
add a short passage about both.

Happy to update the man pages as well, I'll send that patch separately.

Thanks for reviewing!

>
> >       __u64 mode;
> >
> >       /*
>
> --
> Sincerely yours,
> Mike.