Message ID | 20230824095551.134118-1-xiubli@redhat.com |
---|---|
State | New |
Series | ceph: skip reconnecting if MDS is not ready |
On Thu, Aug 24, 2023 at 3:28 PM <xiubli@redhat.com> wrote:
>
> From: Xiubo Li <xiubli@redhat.com>
>
> When the MDS closes the session, the kclient will try to reconnect to
> it immediately, but if the MDS has just restarted and is not ready
> yet, such as still being in the up:replay state with the sessionmap
> journal logs not yet replayed, the MDS will close the session.
>
> The kclient could then remove the session, and later, when the
> mdsmap is in the RECONNECT phase, it will skip reconnecting. But the
> MDS will wait until timeout and then evict the kclient.
>
> Just skip sending the reconnection request until the MDS is ready.
>
> URL: https://tracker.ceph.com/issues/62489
> Signed-off-by: Xiubo Li <xiubli@redhat.com>
> ---
>  fs/ceph/mds_client.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 9aae39289b43..a9ef93411679 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -5809,7 +5809,8 @@ static void mds_peer_reset(struct ceph_connection *con)
>
>  	pr_warn_client(mdsc->fsc->client, "mds%d closed our session\n",
>  		       s->s_mds);
> -	if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO)
> +	if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO &&
> +	    ceph_mdsmap_get_state(mdsc->mdsmap, s->s_mds) >= CEPH_MDS_STATE_RECONNECT)
>  		send_mds_reconnect(mdsc, s);
>  }
>
> --
> 2.39.1
>

Tested-by: Venky Shankar <vshankar@redhat.com>
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 9aae39289b43..a9ef93411679 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -5809,7 +5809,8 @@ static void mds_peer_reset(struct ceph_connection *con)
 
 	pr_warn_client(mdsc->fsc->client, "mds%d closed our session\n",
 		       s->s_mds);
-	if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO)
+	if (READ_ONCE(mdsc->fsc->mount_state) != CEPH_MOUNT_FENCE_IO &&
+	    ceph_mdsmap_get_state(mdsc->mdsmap, s->s_mds) >= CEPH_MDS_STATE_RECONNECT)
 		send_mds_reconnect(mdsc, s);
 }
 
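The new condition can be read as a small predicate: reconnect only when the mount is not fenced and the MDS state has reached at least RECONNECT. The following is a minimal userspace sketch of that logic, not the kernel code. All `sketch_*` names are hypothetical, and the numeric state values are assumptions chosen so that REPLAY and RESOLVE order below RECONNECT; the real constants should be checked in the kernel's Ceph headers.

```c
#include <stdbool.h>

/*
 * Hypothetical stand-ins for the MDS states. The numbers are
 * assumptions for illustration; only their relative order matters
 * here (REPLAY < RESOLVE < RECONNECT < ACTIVE).
 */
enum sketch_mds_state {
	SKETCH_MDS_STATE_REPLAY    = 8,
	SKETCH_MDS_STATE_RESOLVE   = 9,
	SKETCH_MDS_STATE_RECONNECT = 10,
	SKETCH_MDS_STATE_ACTIVE    = 13,
};

/* Hypothetical stand-in for the mount state checked in the patch. */
enum sketch_mount_state {
	SKETCH_MOUNT_MOUNTED,
	SKETCH_MOUNT_FENCE_IO,
};

/*
 * Mirrors the shape of the patched condition: skip the reconnect
 * when I/O is fenced, or when the MDS has not yet reached the
 * RECONNECT state (for example, it is still replaying its journal).
 */
static bool sketch_should_reconnect(enum sketch_mount_state mount_state,
				    int mds_state)
{
	return mount_state != SKETCH_MOUNT_FENCE_IO &&
	       mds_state >= SKETCH_MDS_STATE_RECONNECT;
}
```

Under these assumptions, a peer reset seen while the MDS is still in the replay state returns false, so no reconnect request is sent and the session is left for the later RECONNECT phase.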