Message ID | 20241003010512.58559-1-batrick@batbytes.com |
---|---|
State | Superseded |
Headers | show |
Series | ceph: fix cap ref leak via netfs init_request | expand |
Patrick Donnelly <batrick@batbytes.com> wrote: > Log recovered from a user's cluster: > > <7>[ 5413.970692] ceph: get_cap_refs 00000000958c114b ret 1 got Fr > <7>[ 5413.970695] ceph: start_read 00000000958c114b, no cache cap > ... > <7>[ 5473.934609] ceph: my wanted = Fr, used = Fr, dirty - > <7>[ 5473.934616] ceph: revocation: pAsLsXsFr -> pAsLsXs (revoking Fr) > <7>[ 5473.934632] ceph: __ceph_caps_issued 00000000958c114b cap 00000000f7784259 issued pAsLsXs > <7>[ 5473.934638] ceph: check_caps 10000000e68.fffffffffffffffe file_want - used Fr dirty - flushing - issued pAsLsXs revoking Fr retain pAsLsXsFsr AUTHONLY NOINVAL FLUSH_FORCE > > The MDS subsequently complains that the kernel client is late releasing caps. > > Approximately, a series of changes to this code by the three commits cited > below resulted in subtle resource cleanup to be missed. The main culprit is the > change in error handling in 2d31604 which meant that a failure in init_request > would no longer cause cleanup to be called. That would prevent the > ceph_put_cap_refs which would cleanup the leaked cap ref. > > Closes: https://tracker.ceph.com/issues/67008 > Fixes: 49870056005ca9387e5ee31451991491f99cc45f ("ceph: convert ceph_readpages to ceph_readahead") > Fixes: 2de160417315b8d64455fe03e9bb7d3308ac3281 ("netfs: Change ->init_request() to return an error code") > Fixes: a5c9dc4451394b2854493944dcc0ff71af9705a3 ("ceph: Make ceph_init_request() check caps on readahead") Note that you only need the first 12 digits of the SHA1 sum. > Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> > Cc: stable@vger.kernel.org Reviewed-by: David Howells <dhowells@redhat.com>
Hi Ilya, Are you going to pick this up, or should I ask Christian to take it through the vfs tree? David
On Fri, Oct 4, 2024 at 5:37 PM David Howells <dhowells@redhat.com> wrote: > > Hi Ilya, > > Are you going to pick this up, or should I ask Christian to take it through > the vfs tree? Hi David, I picked it up yesterday. Thanks, Ilya
On 10/3/24 09:05, Patrick Donnelly wrote: > From: Patrick Donnelly <pdonnell@redhat.com> > > Log recovered from a user's cluster: > > <7>[ 5413.970692] ceph: get_cap_refs 00000000958c114b ret 1 got Fr > <7>[ 5413.970695] ceph: start_read 00000000958c114b, no cache cap > ... > <7>[ 5473.934609] ceph: my wanted = Fr, used = Fr, dirty - > <7>[ 5473.934616] ceph: revocation: pAsLsXsFr -> pAsLsXs (revoking Fr) > <7>[ 5473.934632] ceph: __ceph_caps_issued 00000000958c114b cap 00000000f7784259 issued pAsLsXs > <7>[ 5473.934638] ceph: check_caps 10000000e68.fffffffffffffffe file_want - used Fr dirty - flushing - issued pAsLsXs revoking Fr retain pAsLsXsFsr AUTHONLY NOINVAL FLUSH_FORCE > > The MDS subsequently complains that the kernel client is late releasing caps. > > Approximately, a series of changes to this code by the three commits cited > below resulted in subtle resource cleanup to be missed. The main culprit is the > change in error handling in 2d31604 which meant that a failure in init_request > would no longer cause cleanup to be called. That would prevent the > ceph_put_cap_refs which would cleanup the leaked cap ref. > > Closes: https://tracker.ceph.com/issues/67008 > Fixes: 49870056005ca9387e5ee31451991491f99cc45f ("ceph: convert ceph_readpages to ceph_readahead") > Fixes: 2de160417315b8d64455fe03e9bb7d3308ac3281 ("netfs: Change ->init_request() to return an error code") > Fixes: a5c9dc4451394b2854493944dcc0ff71af9705a3 ("ceph: Make ceph_init_request() check caps on readahead") > Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> > Cc: stable@vger.kernel.org > --- > fs/ceph/addr.c | 5 ++++- > 1 file changed, 4 insertions(+), 1 deletion(-) > > diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c > index 53fef258c2bc..702c6a730b70 100644 > --- a/fs/ceph/addr.c > +++ b/fs/ceph/addr.c > @@ -489,8 +489,11 @@ static int ceph_init_request(struct netfs_io_request *rreq, struct file *file) > rreq->io_streams[0].sreq_max_len = fsc->mount_options->rsize; > > out: > - if (ret < 0) > + if (ret < 0) { > + if (got) > + ceph_put_cap_refs(ceph_inode(inode), got); > kfree(priv); > + } > > return ret; > } > > base-commit: e32cde8d2bd7d251a8f9b434143977ddf13dcec6 Good catch! Reviewed-by: Xiubo Li <xiubli@redhat.com>
diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index 53fef258c2bc..702c6a730b70 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -489,8 +489,11 @@ static int ceph_init_request(struct netfs_io_request *rreq, struct file *file) rreq->io_streams[0].sreq_max_len = fsc->mount_options->rsize; out: - if (ret < 0) + if (ret < 0) { + if (got) + ceph_put_cap_refs(ceph_inode(inode), got); kfree(priv); + } return ret; }