[v2,0/3] netfs, afs: Fix netfs_write_begin and THP handling

Message ID 162391823192.1173366.9740514875196345746.stgit@warthog.procyon.org.uk

Message

David Howells June 17, 2021, 8:23 a.m. UTC
Here are some patches to fix netfs_write_begin() and the handling of THPs in
that and afs_write_begin/end() in the following ways:

 (1) Use offset_in_thp() rather than manually calculating the offset into
     the page.

 (2) In the future, the len parameter may extend beyond the page allocated.
     This is because the page allocation is deferred to write_begin() and
     that gets to decide what size of THP to allocate.

 (3) In netfs_write_begin(), extract the decision about whether to skip a
     page out to its own helper and have that clear around the region to be
     written, but not clear that region.  This requires the filesystem to
     patch it up afterwards if the hole doesn't get completely filled.

 (4) Due to (3), afs_write_end() now needs to handle a short data write into
     the page by generic_perform_write().  I've adopted an analogous
     approach to ceph of just returning 0 in this case and letting the
     caller go round again.
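
As a rough user-space illustration of points (1) to (3), the sketch below
models a page as a plain byte buffer whose size is a power of two.  The names
offset_in_block(), clamp_len() and zero_around() are hypothetical stand-ins
invented for this example; they are not the kernel helpers and not the code
in the patches.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for what offset_in_thp() provides: the offset of
 * file position pos inside a block of block_size bytes.  block_size is
 * assumed to be a power of two, so a mask is enough.
 */
size_t offset_in_block(unsigned long long pos, size_t block_size)
{
	assert(block_size != 0 && (block_size & (block_size - 1)) == 0);
	return (size_t)(pos & (block_size - 1));
}

/*
 * Point (2): len may reach past the end of the block that was actually
 * allocated, so trim it to what fits after offset.
 */
size_t clamp_len(size_t offset, size_t len, size_t block_size)
{
	size_t room = block_size - offset;

	return len < room ? len : room;
}

/*
 * Point (3): clear the block on both sides of the region about to be
 * written, leaving [offset, offset + len) untouched.  The caller is
 * expected to have clamped len first.
 */
void zero_around(unsigned char *block, size_t block_size,
		 size_t offset, size_t len)
{
	memset(block, 0, offset);
	memset(block + offset + len, 0, block_size - (offset + len));
}
```

Because the written region itself is left alone, a copy that stops short
leaves stale bytes in the middle of the block, which is why point (4) has to
deal with short writes.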

I wonder if generic_perform_write() should pass in a flag indicating
whether this is the first attempt or a second attempt at this, and on the
second attempt we just completely prefill the page and just let the partial
write stand - which we have to do if the page was already uptodate when we
started.

The patches can be found here:

	https://git.kernel.org/pub/scm/linux/kernel/git/dhowells/linux-fs.git/log/?h=afs-fixes

David

Link: https://lore.kernel.org/r/20210613233345.113565-1-jlayton@kernel.org/
Link: https://lore.kernel.org/r/162367681795.460125.11729955608839747375.stgit@warthog.procyon.org.uk/

Changes
=======

ver #2:
   - Removed a var that's no longer used (spotted by the kernel test robot)
   - Removed a forgotten "noinline".

ver #1:
   - Prefixed Jeff's new helper with "netfs_".
   - Don't call zero_user_segments() for a full-page write.
   - Altered the beyond-last-page check to avoid a DIV.
   - Removed redundant zero-length-file check.
   - Added patches to fix afs.
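
One way to picture the ver #1 items that touch the skip decision is the
user-space model below.  It is only a sketch under assumptions:
can_skip_read() and its parameters are hypothetical names made up for this
example, the block size is assumed to be a power of two, and the real helper
in the patch may be structured differently.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical model of "can the read be skipped before this write?".
 * Returns true if no read is needed; *needs_zeroing says whether the
 * bytes around the written region must be cleared first.
 */
bool can_skip_read(unsigned long long pos, size_t len,
		   unsigned long long file_size, size_t block_size,
		   bool *needs_zeroing)
{
	size_t offset = (size_t)(pos & (block_size - 1));

	*needs_zeroing = false;

	/* Full-block write: every byte is overwritten, nothing to zero. */
	if (offset == 0 && len >= block_size)
		return true;

	/*
	 * Block starts at or beyond EOF.  Comparing byte positions avoids
	 * dividing to get block indices, and a zero-length file is covered
	 * here without a separate check.
	 */
	if (pos - offset >= file_size) {
		*needs_zeroing = true;
		return true;
	}

	/* Write runs from the start of the block to EOF or beyond. */
	if (offset == 0 && pos + len >= file_size) {
		*needs_zeroing = true;
		return true;
	}

	/* Existing data overlaps the block and is not fully replaced. */
	return false;
}
```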

---
David Howells (2):
      afs: Handle len being extending over page end in write_begin/write_end
      afs: Fix afs_write_end() to handle short writes

Jeff Layton (1):
      netfs: fix test for whether we can skip read when writing beyond EOF


 fs/afs/write.c         | 12 +++++++++--
 fs/netfs/read_helper.c | 49 +++++++++++++++++++++++++++++++-----------
 2 files changed, 46 insertions(+), 15 deletions(-)

Comments

Al Viro June 18, 2021, 3:46 a.m. UTC | #1
On Thu, Jun 17, 2021 at 09:23:51AM +0100, David Howells wrote:
> 
> Here are some patches to fix netfs_write_begin() and the handling of THPs in
> that and afs_write_begin/end() in the following ways:
> 
>  (1) Use offset_in_thp() rather than manually calculating the offset into
>      the page.
> 
>  (2) In the future, the len parameter may extend beyond the page allocated.
>      This is because the page allocation is deferred to write_begin() and
>      that gets to decide what size of THP to allocate.
> 
>  (3) In netfs_write_begin(), extract the decision about whether to skip a
>      page out to its own helper and have that clear around the region to be
>      written, but not clear that region.  This requires the filesystem to
>      patch it up afterwards if the hole doesn't get completely filled.
> 
>  (4) Due to (3), afs_write_end() now needs to handle a short data write into
>      the page by generic_perform_write().  I've adopted an analogous
>      approach to ceph of just returning 0 in this case and letting the
>      caller go round again.

Series looks sane.  I'd like to hear about the thp-related plans in
more detail, but that's a separate story.

> I wonder if generic_perform_write() should pass in a flag indicating
> whether this is the first attempt or a second attempt at this, and on the
> second attempt we just completely prefill the page and just let the partial
> write stand - which we have to do if the page was already uptodate when we
> started.

Not really - we'll simply get a shorter chunk next time around (with
the patches in -next right now it'll be "the amount we'd actually
managed to copy this time around" in case ->write_begin() tells us
to take a hike), and that shorter chunk is what ->write_begin() will
see.  No need for the flags...
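
The retry behaviour described above can be modelled in user space roughly as
follows.  This is a simplified sketch, not the generic_perform_write() code:
struct sim, sim_copy(), sim_write_end() and sim_write_chunk() are
hypothetical names, and the "fault" is simulated by a fixed byte limit.

```c
#include <stddef.h>

/* Hypothetical simulation state. */
struct sim {
	size_t fault_at;	/* the copy stops after this many bytes */
	int attempts;		/* number of begin/copy/end rounds made */
};

/* Simulated copy from the caller's buffer: may come up short. */
size_t sim_copy(const struct sim *s, size_t bytes)
{
	return bytes < s->fault_at ? bytes : s->fault_at;
}

/*
 * Simulated ->write_end() that takes the "return 0 on a short write"
 * approach: a partial copy is rejected outright.
 */
size_t sim_write_end(size_t len, size_t copied)
{
	return copied < len ? 0 : copied;
}

/*
 * Simulated caller loop: when write_end rejects a short copy, go round
 * again asking only for the amount that was actually copied, so the next
 * round sees a shorter chunk and no "second attempt" flag is needed.
 */
size_t sim_write_chunk(struct sim *s, size_t bytes)
{
	for (;;) {
		size_t copied = sim_copy(s, bytes);
		size_t status = sim_write_end(bytes, copied);

		s->attempts++;
		if (status != 0)
			return status;
		if (copied == 0)
			return 0;	/* no progress at all: give up */
		bytes = copied;		/* shorter chunk next time around */
	}
}
```

With a fault after 100 bytes, a 4096-byte request is rejected on the first
round and accepted as a 100-byte write on the second.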