[v2,0/2] libceph: fix sparse-read failure bug

Message ID 20231208160601.124892-1-xiubli@redhat.com

Xiubo Li Dec. 8, 2023, 4:05 p.m. UTC
From: Xiubo Li <xiubli@redhat.com>

The debug logs:

7271665 <7>[120727.292870] libceph:  get_reply tid 5526 000000002bfe53c9
7271666 <7>[120727.292875] libceph:  get_osd 000000009e8420b7 4 -> 5
7271667 <7>[120727.292882] libceph:  prep_next_sparse_read: [2] starting new sparse read req. srlen=0x7000
7271668 <7>[120727.292887] libceph:  prep_next_sparse_read: [2] new sparse read op at idx 0 0x60000~0x7000
7271669 <7>[120727.292894] libceph:  [2] got 1 extents
7271670 <7>[120727.292900] libceph:  [2] ext 0 off 0x60000 len 0x4000
7271671 <7>[120727.292912] libceph:  prep_next_sparse_read: [2] completed extent array len 1 cursor->resid 12288
7271672 <7>[120727.292917] libceph:  read_partial left: 21, have: 0, con->v1.in_base_pos: 53
7271673 <7>[120727.292923] libceph:  read_partial left: 7, have: 14, con->v1.in_base_pos: 67     ====> there were 7 bytes not received
7271674 <7>[120727.292928] libceph:  read_partial return 0
7271675 <7>[120727.292931] libceph:  try_read done on 00000000ddd953f1 ret 0
7271676 <7>[120727.292935] libceph:  try_write start 00000000ddd953f1 state 12
7271677 <7>[120727.292939] libceph:  try_write out_kvec_bytes 0
7271678 <7>[120727.292943] libceph:  try_write nothing else to write.
7271679 <7>[120727.292948] libceph:  try_write done on 00000000ddd953f1 ret 0
7271680 <7>[120727.292955] libceph:  put_osd 000000009e8420b7 5 -> 4
7271681 <7>[120727.293021] libceph:  ceph_sock_data_ready 00000000ddd953f1 state = 12, queueing work
7271682 <7>[120727.293029] libceph:  get_osd 000000009e8420b7 4 -> 5
7271683 <7>[120727.293041] libceph:  queue_con_delay 00000000ddd953f1 0
7271684 <7>[120727.293134] libceph:  try_read start 00000000ddd953f1 state 12
7271685 <7>[120727.293141] libceph:  try_read tag 7 in_base_pos 67
7271686 <7>[120727.293145] libceph:  read_partial_message con 00000000ddd953f1 msg 000000002bfe53c9
7271687 <7>[120727.293150] libceph:  read_partial return 1
7271688 <7>[120727.293154] libceph:  read_partial left: 7, have: 14, con->v1.in_base_pos: 67     ====> the remaining 7 bytes arrived
7271689 <7>[120727.293189] libceph:  read_partial return 1
7271690 <7>[120727.293193] libceph:  read_partial_message got msg 000000002bfe53c9 164 (216900879) + 0 (0) + 16408 (1227708997)
7271691 <7>[120727.293203] libceph:  ===== 000000002bfe53c9 3092 from osd2 43=osd_opreply len 164+0+16408 (216900879 0 1227708997) =====
7271692 <7>[120727.293211] libceph:  handle_reply msg 000000002bfe53c9 tid 5526                  ====> the request finished successfully
7271693 <7>[120727.293217] libceph:  handle_reply req 00000000b7727657 tid 5526 flags 0x400015 pgid 3.55 epoch 52 attempt 0 v 0'0 uv 2275
7271694 <7>[120727.293225] libceph:   req 00000000b7727657 tid 5526 op 0 rval 0 len 16408
7271695 <7>[120727.293231] libceph:  handle_reply req 00000000b7727657 tid 5526 result 0 data_len 16408
7271696 <7>[120727.293236] libceph:  finish_request req 00000000b7727657 tid 5526
7271697 <7>[120727.293241] libceph:  unlink_request osd 000000009e8420b7 osd2 req 00000000b7727657 tid 5526
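
The read_partial lines above (left: 21 -> left: 7, then "return 0", then
"return 1" once the remaining 7 bytes arrive) are consistent with a
resumable read that tracks its progress in con->v1.in_base_pos. A minimal
userspace sketch of that pattern follows. It is not the libceph code:
fake_sock, fake_recv, fake_con and read_partial_sketch are hypothetical
names, and the socket is modelled as a buffer that releases bytes in bursts.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a non-blocking socket: bytes arrive in bursts. */
struct fake_sock {
	const unsigned char *data;	/* the full byte stream */
	size_t pos;			/* next byte to hand out */
	size_t avail;			/* bytes readable right now */
};

/* Hypothetical connection state; in_base_pos mirrors the value in the log. */
struct fake_con {
	struct fake_sock sock;
	int in_base_pos;
};

/*
 * Copy up to len readable bytes into buf. Returns the number of bytes
 * copied, or 0 when nothing is readable (assumption: "would block" is
 * reported as 0 rather than as an error).
 */
static int fake_recv(struct fake_sock *s, void *buf, size_t len)
{
	size_t n = len < s->avail ? len : s->avail;

	memcpy(buf, s->data + s->pos, n);
	s->pos += n;
	s->avail -= n;
	return (int)n;
}

/*
 * Read an object of `size` bytes that ends at stream offset `end`.
 * Returns 1 when the object is complete, 0 when more data is needed.
 * Progress is kept in con->in_base_pos, so a later call resumes where
 * the previous one stopped.
 */
static int read_partial_sketch(struct fake_con *con, int end, int size,
			       void *object)
{
	while (con->in_base_pos < end) {
		int left = end - con->in_base_pos;	/* bytes still missing */
		int have = size - left;			/* bytes already stored */
		int ret;

		assert(have >= 0 && left <= size);
		ret = fake_recv(&con->sock, (unsigned char *)object + have,
				(size_t)left);
		if (ret <= 0)
			return ret;	/* nothing readable: retry later */
		con->in_base_pos += ret;
	}
	return 1;
}
```

With in_base_pos = 53, size = 21 and end = 74 (53 + 21, taken from the log),
a first call that sees only 14 readable bytes returns 0 and leaves
in_base_pos at 67. A second call, made after the last 7 bytes become
readable, returns 1 with in_base_pos at 74.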



V2:
- fix the sparse-read bug in the sparse-read code instead

Xiubo Li (2):
  libceph: fail the sparse-read if there still has data in socket
  libceph: just wait for more data to be available on the socket

 include/linux/ceph/osd_client.h |  1 +
 net/ceph/osd_client.c           | 10 ++++++----
 2 files changed, 7 insertions(+), 4 deletions(-)