Message ID | 20220308002747.122682-10-michael.christie@oracle.com |
---|---|
State | Superseded |
Headers | show |
Series | misc iscsi patches | expand |
On 3/7/22 16:27, Mike Christie wrote: > We currently require that the back_lock is held when calling the functions > that manipulate the iscsi_task refcount. The only reason for this is to > handle races where we are handing SCSI-ml eh callbacks and the cmd is > completing at the same time the normal completion path is running, and we > can't return from the EH callback until the driver has stopped accessing > the cmd. By holding the back_lock while also accessing the task->state > makes it simple to check that a cmd is completing and also get/put a > refcount at the same time. > > The problem is that we don't want to take the back_lock from the xmit path > for normal IO since it causes contention with the completion path if the > user has chosen to try and split those paths on different CPUs (in this > case abusing the CPUs and igoring caching improves perf for some uses). > > This patch begins to remove the back_lock requirement for > iscsi_get/put_task by removing the requirement for the get path. Instead > of always holding the back_lock we detect if something has done the last > put and is about to call iscsi_free_task. The next patch will then allow > iscsi code to do the last put on a task and only grab the back_lock if > the refcount is now zero and it's going to call iscsi_free_task. > > Signed-off-by: Mike Christie <michael.christie@oracle.com> > --- > drivers/scsi/be2iscsi/be_main.c | 19 ++++++- > drivers/scsi/libiscsi.c | 95 +++++++++++++++++++++++---------- > drivers/scsi/libiscsi_tcp.c | 5 +- > include/scsi/libiscsi.h | 2 +- > 4 files changed, 88 insertions(+), 33 deletions(-) > > diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c > index ab55681145f8..26e5446ac237 100644 > --- a/drivers/scsi/be2iscsi/be_main.c > +++ b/drivers/scsi/be2iscsi/be_main.c > @@ -231,6 +231,7 @@ static int beiscsi_eh_abort(struct scsi_cmnd *sc) > cls_session = starget_to_session(scsi_target(sc->device)); > session = cls_session->dd_data; > > +completion_check: > /* check if we raced, task just got cleaned up under us */ > spin_lock_bh(&session->back_lock); > if (!abrt_task || !abrt_task->sc) { > @@ -238,7 +239,13 @@ static int beiscsi_eh_abort(struct scsi_cmnd *sc) > return SUCCESS; > } > /* get a task ref till FW processes the req for the ICD used */ > - __iscsi_get_task(abrt_task); > + if (!iscsi_get_task(abrt_task)) { > + spin_unlock(&session->back_lock); > + /* We are just about to call iscsi_free_task so wait for it. */ > + udelay(5); > + goto completion_check; > + } > + > abrt_io_task = abrt_task->dd_data; > conn = abrt_task->conn; > beiscsi_conn = conn->dd_data; > @@ -323,7 +330,15 @@ static int beiscsi_eh_device_reset(struct scsi_cmnd *sc) > } > > /* get a task ref till FW processes the req for the ICD used */ > - __iscsi_get_task(task); > + if (!iscsi_get_task(task)) { > + /* > + * The task has completed in the driver and is > + * completing in libiscsi. Just ignore it here. When we > + * call iscsi_eh_device_reset, it will wait for us. > + */ > + continue; > + } > + > io_task = task->dd_data; > /* mark WRB invalid which have been not processed by FW yet */ > if (is_chip_be2_be3r(phba)) { > diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c > index 5c74ab92725f..a2d0daf5bd60 100644 > --- a/drivers/scsi/libiscsi.c > +++ b/drivers/scsi/libiscsi.c > @@ -83,6 +83,8 @@ MODULE_PARM_DESC(debug_libiscsi_eh, > "%s " dbg_fmt, __func__, ##arg); \ > } while (0); > > +#define ISCSI_CMD_COMPL_WAIT 5 > + > inline void iscsi_conn_queue_xmit(struct iscsi_conn *conn) > { > struct Scsi_Host *shost = conn->session->host; > @@ -482,11 +484,11 @@ static void iscsi_free_task(struct iscsi_task *task) > } > } > > -void __iscsi_get_task(struct iscsi_task *task) > +bool iscsi_get_task(struct iscsi_task *task) > { > - refcount_inc(&task->refcount); > + return refcount_inc_not_zero(&task->refcount); > } > -EXPORT_SYMBOL_GPL(__iscsi_get_task); > +EXPORT_SYMBOL_GPL(iscsi_get_task); > > void __iscsi_put_task(struct iscsi_task *task) > { > @@ -600,20 +602,17 @@ static bool cleanup_queued_task(struct iscsi_task *task) > } > > /* > - * session frwd lock must be held and if not called for a task that is still > - * pending or from the xmit thread, then xmit thread must be suspended > + * session back and frwd lock must be held and if not called for a task that > + * is still pending or from the xmit thread, then xmit thread must be suspended > */ > -static void fail_scsi_task(struct iscsi_task *task, int err) > +static void __fail_scsi_task(struct iscsi_task *task, int err) > { > struct iscsi_conn *conn = task->conn; > struct scsi_cmnd *sc; > int state; > > - spin_lock_bh(&conn->session->back_lock); > - if (cleanup_queued_task(task)) { > - spin_unlock_bh(&conn->session->back_lock); > + if (cleanup_queued_task(task)) > return; > - } > > if (task->state == ISCSI_TASK_PENDING) { > /* > @@ -632,7 +631,15 @@ static void fail_scsi_task(struct iscsi_task *task, int err) > sc->result = err << 16; > scsi_set_resid(sc, scsi_bufflen(sc)); > iscsi_complete_task(task, state); > - spin_unlock_bh(&conn->session->back_lock); > +} > + > +static void fail_scsi_task(struct iscsi_task *task, int err) > +{ > + struct iscsi_session *session = task->conn->session; > + > + spin_lock_bh(&session->back_lock); > + __fail_scsi_task(task, err); > + spin_unlock_bh(&session->back_lock); > } > > static int iscsi_prep_mgmt_task(struct iscsi_conn *conn, > @@ -1449,8 +1456,17 @@ static int iscsi_xmit_task(struct iscsi_conn *conn, struct iscsi_task *task, > spin_lock_bh(&conn->session->back_lock); > > if (!conn->task) { > - /* Take a ref so we can access it after xmit_task() */ > - __iscsi_get_task(task); > + /* > + * Take a ref so we can access it after xmit_task(). > + * > + * This should never fail because the failure paths will have > + * stopped the xmit thread. WARN on move on. > + */ > + if (!iscsi_get_task(task)) { > + spin_unlock_bh(&conn->session->back_lock); > + WARN_ON_ONCE(1); > + return 0; > + } > } else { > /* Already have a ref from when we failed to send it last call */ > conn->task = NULL; > @@ -1492,7 +1508,7 @@ static int iscsi_xmit_task(struct iscsi_conn *conn, struct iscsi_task *task, > * get an extra ref that is released next time we access it > * as conn->task above. > */ > - __iscsi_get_task(task); > + iscsi_get_task(task); > conn->task = task; > } > > @@ -1907,6 +1923,7 @@ static void fail_scsi_tasks(struct iscsi_conn *conn, u64 lun, int error) > struct iscsi_task *task; > int i; > > +restart_cmd_loop: > spin_lock_bh(&session->back_lock); > for (i = 0; i < session->cmds_max; i++) { > task = session->cmds[i]; > @@ -1915,22 +1932,25 @@ static void fail_scsi_tasks(struct iscsi_conn *conn, u64 lun, int error) > > if (lun != -1 && lun != task->sc->device->lun) > continue; > - > - __iscsi_get_task(task); > - spin_unlock_bh(&session->back_lock); > + /* > + * The cmd is completing but if this is called from an eh > + * callout path then when we return scsi-ml owns the cmd. Wait > + * for the completion path to finish freeing the cmd. > + */ > + if (!iscsi_get_task(task)) { > + spin_unlock_bh(&session->back_lock); > + spin_unlock_bh(&session->frwd_lock); > + udelay(ISCSI_CMD_COMPL_WAIT); > + spin_lock_bh(&session->frwd_lock); > + goto restart_cmd_loop; > + } > > ISCSI_DBG_SESSION(session, > "failing sc %p itt 0x%x state %d\n", > task->sc, task->itt, task->state); > - fail_scsi_task(task, error); > - > - spin_unlock_bh(&session->frwd_lock); > - iscsi_put_task(task); > - spin_lock_bh(&session->frwd_lock); > - > - spin_lock_bh(&session->back_lock); > + __fail_scsi_task(task, error); > + __iscsi_put_task(task); > } > - > spin_unlock_bh(&session->back_lock); > } > > @@ -2035,7 +2055,16 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc) > spin_unlock(&session->back_lock); > goto done; > } > - __iscsi_get_task(task); > + if (!iscsi_get_task(task)) { > + /* > + * Racing with the completion path right now, so give it more > + * time so that path can complete it like normal. > + */ > + rc = BLK_EH_RESET_TIMER; > + task = NULL; > + spin_unlock(&session->back_lock); > + goto done; > + } > spin_unlock(&session->back_lock); > > if (session->state != ISCSI_STATE_LOGGED_IN) { > @@ -2282,6 +2311,7 @@ int iscsi_eh_abort(struct scsi_cmnd *sc) > > ISCSI_DBG_EH(session, "aborting sc %p\n", sc); > > +completion_check: > mutex_lock(&session->eh_mutex); > spin_lock_bh(&session->frwd_lock); > /* > @@ -2321,13 +2351,20 @@ int iscsi_eh_abort(struct scsi_cmnd *sc) > return SUCCESS; > } > > + if (!iscsi_get_task(task)) { > + spin_unlock(&session->back_lock); > + spin_unlock_bh(&session->frwd_lock); > + mutex_unlock(&session->eh_mutex); > + /* We are just about to call iscsi_free_task so wait for it. */ > + udelay(ISCSI_CMD_COMPL_WAIT); > + goto completion_check; > + } > + > + ISCSI_DBG_EH(session, "aborting [sc %p itt 0x%x]\n", sc, task->itt); > conn = session->leadconn; > iscsi_get_conn(conn->cls_conn); > conn->eh_abort_cnt++; > age = session->age; > - > - ISCSI_DBG_EH(session, "aborting [sc %p itt 0x%x]\n", sc, task->itt); > - __iscsi_get_task(task); > spin_unlock(&session->back_lock); > > if (task->state == ISCSI_TASK_PENDING) { > diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c > index 883005757ddb..defe08142b75 100644 > --- a/drivers/scsi/libiscsi_tcp.c > +++ b/drivers/scsi/libiscsi_tcp.c > @@ -558,7 +558,10 @@ static int iscsi_tcp_r2t_rsp(struct iscsi_conn *conn, struct iscsi_hdr *hdr) > return 0; > } > task->last_xfer = jiffies; > - __iscsi_get_task(task); > + if (!iscsi_get_task(task)) { > + /* Let the path that got the early rsp complete it */ > + return 0; > + } > > tcp_conn = conn->dd_data; > rhdr = (struct iscsi_r2t_rsp *)tcp_conn->in.hdr; > diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h > index 522fd16f1dbb..9032a214104c 100644 > --- a/include/scsi/libiscsi.h > +++ b/include/scsi/libiscsi.h > @@ -470,7 +470,7 @@ extern struct iscsi_task *iscsi_itt_to_task(struct iscsi_conn *, itt_t); > extern void iscsi_requeue_task(struct iscsi_task *task); > extern void iscsi_put_task(struct iscsi_task *task); > extern void __iscsi_put_task(struct iscsi_task *task); > -extern void __iscsi_get_task(struct iscsi_task *task); > +extern bool iscsi_get_task(struct iscsi_task *task); > extern void iscsi_complete_scsi_task(struct iscsi_task *task, > uint32_t exp_cmdsn, uint32_t max_cmdsn); > Reviewed-by: Lee Duncan <lduncan@suse.com>
On 3/7/22 16:27, Mike Christie wrote: > We currently require that the back_lock is held when calling the functions > that manipulate the iscsi_task refcount. The only reason for this is to > handle races where we are handing SCSI-ml eh callbacks and the cmd is handing -> handling > completing at the same time the normal completion path is running, and we > can't return from the EH callback until the driver has stopped accessing > the cmd. By holding the back_lock while also accessing the task->state > makes it simple to check that a cmd is completing and also get/put a > refcount at the same time. This last sentence doesn't completely parse. Maybe leave off the work "By"? > > The problem is that we don't want to take the back_lock from the xmit path > for normal IO since it causes contention with the completion path if the > user has chosen to try and split those paths on different CPUs (in this > case abusing the CPUs and igoring caching improves perf for some uses). > > This patch begins to remove the back_lock requirement for > iscsi_get/put_task by removing the requirement for the get path. Instead > of always holding the back_lock we detect if something has done the last > put and is about to call iscsi_free_task. The next patch will then allow > iscsi code to do the last put on a task and only grab the back_lock if > the refcount is now zero and it's going to call iscsi_free_task. > > Signed-off-by: Mike Christie <michael.christie@oracle.com> > --- > drivers/scsi/be2iscsi/be_main.c | 19 ++++++- > drivers/scsi/libiscsi.c | 95 +++++++++++++++++++++++---------- > drivers/scsi/libiscsi_tcp.c | 5 +- > include/scsi/libiscsi.h | 2 +- > 4 files changed, 88 insertions(+), 33 deletions(-) > > diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c > index ab55681145f8..26e5446ac237 100644 > --- a/drivers/scsi/be2iscsi/be_main.c > +++ b/drivers/scsi/be2iscsi/be_main.c > @@ -231,6 +231,7 @@ static int beiscsi_eh_abort(struct scsi_cmnd *sc) > cls_session = starget_to_session(scsi_target(sc->device)); > session = cls_session->dd_data; > > +completion_check: > /* check if we raced, task just got cleaned up under us */ > spin_lock_bh(&session->back_lock); > if (!abrt_task || !abrt_task->sc) { > @@ -238,7 +239,13 @@ static int beiscsi_eh_abort(struct scsi_cmnd *sc) > return SUCCESS; > } > /* get a task ref till FW processes the req for the ICD used */ > - __iscsi_get_task(abrt_task); > + if (!iscsi_get_task(abrt_task)) { > + spin_unlock(&session->back_lock); > + /* We are just about to call iscsi_free_task so wait for it. */ > + udelay(5); > + goto completion_check; > + } > + > abrt_io_task = abrt_task->dd_data; > conn = abrt_task->conn; > beiscsi_conn = conn->dd_data; > @@ -323,7 +330,15 @@ static int beiscsi_eh_device_reset(struct scsi_cmnd *sc) > } > > /* get a task ref till FW processes the req for the ICD used */ > - __iscsi_get_task(task); > + if (!iscsi_get_task(task)) { > + /* > + * The task has completed in the driver and is > + * completing in libiscsi. Just ignore it here. When we > + * call iscsi_eh_device_reset, it will wait for us. > + */ > + continue; > + } > + > io_task = task->dd_data; > /* mark WRB invalid which have been not processed by FW yet */ > if (is_chip_be2_be3r(phba)) { > diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c > index 5c74ab92725f..a2d0daf5bd60 100644 > --- a/drivers/scsi/libiscsi.c > +++ b/drivers/scsi/libiscsi.c > @@ -83,6 +83,8 @@ MODULE_PARM_DESC(debug_libiscsi_eh, > "%s " dbg_fmt, __func__, ##arg); \ > } while (0); > > +#define ISCSI_CMD_COMPL_WAIT 5 > + > inline void iscsi_conn_queue_xmit(struct iscsi_conn *conn) > { > struct Scsi_Host *shost = conn->session->host; > @@ -482,11 +484,11 @@ static void iscsi_free_task(struct iscsi_task *task) > } > } > > -void __iscsi_get_task(struct iscsi_task *task) > +bool iscsi_get_task(struct iscsi_task *task) > { > - refcount_inc(&task->refcount); > + return refcount_inc_not_zero(&task->refcount); > } > -EXPORT_SYMBOL_GPL(__iscsi_get_task); > +EXPORT_SYMBOL_GPL(iscsi_get_task); > > void __iscsi_put_task(struct iscsi_task *task) > { > @@ -600,20 +602,17 @@ static bool cleanup_queued_task(struct iscsi_task *task) > } > > /* > - * session frwd lock must be held and if not called for a task that is still > - * pending or from the xmit thread, then xmit thread must be suspended > + * session back and frwd lock must be held and if not called for a task that > + * is still pending or from the xmit thread, then xmit thread must be suspended > */ > -static void fail_scsi_task(struct iscsi_task *task, int err) > +static void __fail_scsi_task(struct iscsi_task *task, int err) > { > struct iscsi_conn *conn = task->conn; > struct scsi_cmnd *sc; > int state; > > - spin_lock_bh(&conn->session->back_lock); > - if (cleanup_queued_task(task)) { > - spin_unlock_bh(&conn->session->back_lock); > + if (cleanup_queued_task(task)) > return; > - } > > if (task->state == ISCSI_TASK_PENDING) { > /* > @@ -632,7 +631,15 @@ static void fail_scsi_task(struct iscsi_task *task, int err) > sc->result = err << 16; > scsi_set_resid(sc, scsi_bufflen(sc)); > iscsi_complete_task(task, state); > - spin_unlock_bh(&conn->session->back_lock); > +} > + > +static void fail_scsi_task(struct iscsi_task *task, int err) > +{ > + struct iscsi_session *session = task->conn->session; > + > + spin_lock_bh(&session->back_lock); > + __fail_scsi_task(task, err); > + spin_unlock_bh(&session->back_lock); > } > > static int iscsi_prep_mgmt_task(struct iscsi_conn *conn, > @@ -1449,8 +1456,17 @@ static int iscsi_xmit_task(struct iscsi_conn *conn, struct iscsi_task *task, > spin_lock_bh(&conn->session->back_lock); > > if (!conn->task) { > - /* Take a ref so we can access it after xmit_task() */ > - __iscsi_get_task(task); > + /* > + * Take a ref so we can access it after xmit_task(). > + * > + * This should never fail because the failure paths will have > + * stopped the xmit thread. WARN on move on. > + */ > + if (!iscsi_get_task(task)) { > + spin_unlock_bh(&conn->session->back_lock); > + WARN_ON_ONCE(1); > + return 0; > + } > } else { > /* Already have a ref from when we failed to send it last call */ > conn->task = NULL; > @@ -1492,7 +1508,7 @@ static int iscsi_xmit_task(struct iscsi_conn *conn, struct iscsi_task *task, > * get an extra ref that is released next time we access it > * as conn->task above. > */ > - __iscsi_get_task(task); > + iscsi_get_task(task); > conn->task = task; > } > > @@ -1907,6 +1923,7 @@ static void fail_scsi_tasks(struct iscsi_conn *conn, u64 lun, int error) > struct iscsi_task *task; > int i; > > +restart_cmd_loop: > spin_lock_bh(&session->back_lock); > for (i = 0; i < session->cmds_max; i++) { > task = session->cmds[i]; > @@ -1915,22 +1932,25 @@ static void fail_scsi_tasks(struct iscsi_conn *conn, u64 lun, int error) > > if (lun != -1 && lun != task->sc->device->lun) > continue; > - > - __iscsi_get_task(task); > - spin_unlock_bh(&session->back_lock); > + /* > + * The cmd is completing but if this is called from an eh > + * callout path then when we return scsi-ml owns the cmd. Wait > + * for the completion path to finish freeing the cmd. > + */ > + if (!iscsi_get_task(task)) { > + spin_unlock_bh(&session->back_lock); > + spin_unlock_bh(&session->frwd_lock); > + udelay(ISCSI_CMD_COMPL_WAIT); > + spin_lock_bh(&session->frwd_lock); > + goto restart_cmd_loop; > + } > > ISCSI_DBG_SESSION(session, > "failing sc %p itt 0x%x state %d\n", > task->sc, task->itt, task->state); > - fail_scsi_task(task, error); > - > - spin_unlock_bh(&session->frwd_lock); > - iscsi_put_task(task); > - spin_lock_bh(&session->frwd_lock); > - > - spin_lock_bh(&session->back_lock); > + __fail_scsi_task(task, error); > + __iscsi_put_task(task); > } > - > spin_unlock_bh(&session->back_lock); > } > > @@ -2035,7 +2055,16 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc) > spin_unlock(&session->back_lock); > goto done; > } > - __iscsi_get_task(task); > + if (!iscsi_get_task(task)) { > + /* > + * Racing with the completion path right now, so give it more > + * time so that path can complete it like normal. > + */ > + rc = BLK_EH_RESET_TIMER; > + task = NULL; > + spin_unlock(&session->back_lock); > + goto done; > + } > spin_unlock(&session->back_lock); > > if (session->state != ISCSI_STATE_LOGGED_IN) { > @@ -2282,6 +2311,7 @@ int iscsi_eh_abort(struct scsi_cmnd *sc) > > ISCSI_DBG_EH(session, "aborting sc %p\n", sc); > > +completion_check: > mutex_lock(&session->eh_mutex); > spin_lock_bh(&session->frwd_lock); > /* > @@ -2321,13 +2351,20 @@ int iscsi_eh_abort(struct scsi_cmnd *sc) > return SUCCESS; > } > > + if (!iscsi_get_task(task)) { > + spin_unlock(&session->back_lock); > + spin_unlock_bh(&session->frwd_lock); > + mutex_unlock(&session->eh_mutex); > + /* We are just about to call iscsi_free_task so wait for it. */ > + udelay(ISCSI_CMD_COMPL_WAIT); > + goto completion_check; > + } > + > + ISCSI_DBG_EH(session, "aborting [sc %p itt 0x%x]\n", sc, task->itt); > conn = session->leadconn; > iscsi_get_conn(conn->cls_conn); > conn->eh_abort_cnt++; > age = session->age; > - > - ISCSI_DBG_EH(session, "aborting [sc %p itt 0x%x]\n", sc, task->itt); > - __iscsi_get_task(task); > spin_unlock(&session->back_lock); > > if (task->state == ISCSI_TASK_PENDING) { > diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c > index 883005757ddb..defe08142b75 100644 > --- a/drivers/scsi/libiscsi_tcp.c > +++ b/drivers/scsi/libiscsi_tcp.c > @@ -558,7 +558,10 @@ static int iscsi_tcp_r2t_rsp(struct iscsi_conn *conn, struct iscsi_hdr *hdr) > return 0; > } > task->last_xfer = jiffies; > - __iscsi_get_task(task); > + if (!iscsi_get_task(task)) { > + /* Let the path that got the early rsp complete it */ > + return 0; > + } > > tcp_conn = conn->dd_data; > rhdr = (struct iscsi_r2t_rsp *)tcp_conn->in.hdr; > diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h > index 522fd16f1dbb..9032a214104c 100644 > --- a/include/scsi/libiscsi.h > +++ b/include/scsi/libiscsi.h > @@ -470,7 +470,7 @@ extern struct iscsi_task *iscsi_itt_to_task(struct iscsi_conn *, itt_t); > extern void iscsi_requeue_task(struct iscsi_task *task); > extern void iscsi_put_task(struct iscsi_task *task); > extern void __iscsi_put_task(struct iscsi_task *task); > -extern void __iscsi_get_task(struct iscsi_task *task); > +extern bool iscsi_get_task(struct iscsi_task *task); > extern void iscsi_complete_scsi_task(struct iscsi_task *task, > uint32_t exp_cmdsn, uint32_t max_cmdsn); > Reviewed-by: Lee Duncan <lduncan@suse.com>
diff --git a/drivers/scsi/be2iscsi/be_main.c b/drivers/scsi/be2iscsi/be_main.c index ab55681145f8..26e5446ac237 100644 --- a/drivers/scsi/be2iscsi/be_main.c +++ b/drivers/scsi/be2iscsi/be_main.c @@ -231,6 +231,7 @@ static int beiscsi_eh_abort(struct scsi_cmnd *sc) cls_session = starget_to_session(scsi_target(sc->device)); session = cls_session->dd_data; +completion_check: /* check if we raced, task just got cleaned up under us */ spin_lock_bh(&session->back_lock); if (!abrt_task || !abrt_task->sc) { @@ -238,7 +239,13 @@ static int beiscsi_eh_abort(struct scsi_cmnd *sc) return SUCCESS; } /* get a task ref till FW processes the req for the ICD used */ - __iscsi_get_task(abrt_task); + if (!iscsi_get_task(abrt_task)) { + spin_unlock(&session->back_lock); + /* We are just about to call iscsi_free_task so wait for it. */ + udelay(5); + goto completion_check; + } + abrt_io_task = abrt_task->dd_data; conn = abrt_task->conn; beiscsi_conn = conn->dd_data; @@ -323,7 +330,15 @@ static int beiscsi_eh_device_reset(struct scsi_cmnd *sc) } /* get a task ref till FW processes the req for the ICD used */ - __iscsi_get_task(task); + if (!iscsi_get_task(task)) { + /* + * The task has completed in the driver and is + * completing in libiscsi. Just ignore it here. When we + * call iscsi_eh_device_reset, it will wait for us. + */ + continue; + } + io_task = task->dd_data; /* mark WRB invalid which have been not processed by FW yet */ if (is_chip_be2_be3r(phba)) { diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c index 5c74ab92725f..a2d0daf5bd60 100644 --- a/drivers/scsi/libiscsi.c +++ b/drivers/scsi/libiscsi.c @@ -83,6 +83,8 @@ MODULE_PARM_DESC(debug_libiscsi_eh, "%s " dbg_fmt, __func__, ##arg); \ } while (0); +#define ISCSI_CMD_COMPL_WAIT 5 + inline void iscsi_conn_queue_xmit(struct iscsi_conn *conn) { struct Scsi_Host *shost = conn->session->host; @@ -482,11 +484,11 @@ static void iscsi_free_task(struct iscsi_task *task) } } -void __iscsi_get_task(struct iscsi_task *task) +bool iscsi_get_task(struct iscsi_task *task) { - refcount_inc(&task->refcount); + return refcount_inc_not_zero(&task->refcount); } -EXPORT_SYMBOL_GPL(__iscsi_get_task); +EXPORT_SYMBOL_GPL(iscsi_get_task); void __iscsi_put_task(struct iscsi_task *task) { @@ -600,20 +602,17 @@ static bool cleanup_queued_task(struct iscsi_task *task) } /* - * session frwd lock must be held and if not called for a task that is still - * pending or from the xmit thread, then xmit thread must be suspended + * session back and frwd lock must be held and if not called for a task that + * is still pending or from the xmit thread, then xmit thread must be suspended */ -static void fail_scsi_task(struct iscsi_task *task, int err) +static void __fail_scsi_task(struct iscsi_task *task, int err) { struct iscsi_conn *conn = task->conn; struct scsi_cmnd *sc; int state; - spin_lock_bh(&conn->session->back_lock); - if (cleanup_queued_task(task)) { - spin_unlock_bh(&conn->session->back_lock); + if (cleanup_queued_task(task)) return; - } if (task->state == ISCSI_TASK_PENDING) { /* @@ -632,7 +631,15 @@ static void fail_scsi_task(struct iscsi_task *task, int err) sc->result = err << 16; scsi_set_resid(sc, scsi_bufflen(sc)); iscsi_complete_task(task, state); - spin_unlock_bh(&conn->session->back_lock); +} + +static void fail_scsi_task(struct iscsi_task *task, int err) +{ + struct iscsi_session *session = task->conn->session; + + spin_lock_bh(&session->back_lock); + __fail_scsi_task(task, err); + spin_unlock_bh(&session->back_lock); } static int iscsi_prep_mgmt_task(struct iscsi_conn *conn, @@ -1449,8 +1456,17 @@ static int iscsi_xmit_task(struct iscsi_conn *conn, struct iscsi_task *task, spin_lock_bh(&conn->session->back_lock); if (!conn->task) { - /* Take a ref so we can access it after xmit_task() */ - __iscsi_get_task(task); + /* + * Take a ref so we can access it after xmit_task(). + * + * This should never fail because the failure paths will have + * stopped the xmit thread. WARN on move on. + */ + if (!iscsi_get_task(task)) { + spin_unlock_bh(&conn->session->back_lock); + WARN_ON_ONCE(1); + return 0; + } } else { /* Already have a ref from when we failed to send it last call */ conn->task = NULL; @@ -1492,7 +1508,7 @@ static int iscsi_xmit_task(struct iscsi_conn *conn, struct iscsi_task *task, * get an extra ref that is released next time we access it * as conn->task above. */ - __iscsi_get_task(task); + iscsi_get_task(task); conn->task = task; } @@ -1907,6 +1923,7 @@ static void fail_scsi_tasks(struct iscsi_conn *conn, u64 lun, int error) struct iscsi_task *task; int i; +restart_cmd_loop: spin_lock_bh(&session->back_lock); for (i = 0; i < session->cmds_max; i++) { task = session->cmds[i]; @@ -1915,22 +1932,25 @@ static void fail_scsi_tasks(struct iscsi_conn *conn, u64 lun, int error) if (lun != -1 && lun != task->sc->device->lun) continue; - - __iscsi_get_task(task); - spin_unlock_bh(&session->back_lock); + /* + * The cmd is completing but if this is called from an eh + * callout path then when we return scsi-ml owns the cmd. Wait + * for the completion path to finish freeing the cmd. + */ + if (!iscsi_get_task(task)) { + spin_unlock_bh(&session->back_lock); + spin_unlock_bh(&session->frwd_lock); + udelay(ISCSI_CMD_COMPL_WAIT); + spin_lock_bh(&session->frwd_lock); + goto restart_cmd_loop; + } ISCSI_DBG_SESSION(session, "failing sc %p itt 0x%x state %d\n", task->sc, task->itt, task->state); - fail_scsi_task(task, error); - - spin_unlock_bh(&session->frwd_lock); - iscsi_put_task(task); - spin_lock_bh(&session->frwd_lock); - - spin_lock_bh(&session->back_lock); + __fail_scsi_task(task, error); + __iscsi_put_task(task); } - spin_unlock_bh(&session->back_lock); } @@ -2035,7 +2055,16 @@ enum blk_eh_timer_return iscsi_eh_cmd_timed_out(struct scsi_cmnd *sc) spin_unlock(&session->back_lock); goto done; } - __iscsi_get_task(task); + if (!iscsi_get_task(task)) { + /* + * Racing with the completion path right now, so give it more + * time so that path can complete it like normal. + */ + rc = BLK_EH_RESET_TIMER; + task = NULL; + spin_unlock(&session->back_lock); + goto done; + } spin_unlock(&session->back_lock); if (session->state != ISCSI_STATE_LOGGED_IN) { @@ -2282,6 +2311,7 @@ int iscsi_eh_abort(struct scsi_cmnd *sc) ISCSI_DBG_EH(session, "aborting sc %p\n", sc); +completion_check: mutex_lock(&session->eh_mutex); spin_lock_bh(&session->frwd_lock); /* @@ -2321,13 +2351,20 @@ int iscsi_eh_abort(struct scsi_cmnd *sc) return SUCCESS; } + if (!iscsi_get_task(task)) { + spin_unlock(&session->back_lock); + spin_unlock_bh(&session->frwd_lock); + mutex_unlock(&session->eh_mutex); + /* We are just about to call iscsi_free_task so wait for it. */ + udelay(ISCSI_CMD_COMPL_WAIT); + goto completion_check; + } + + ISCSI_DBG_EH(session, "aborting [sc %p itt 0x%x]\n", sc, task->itt); conn = session->leadconn; iscsi_get_conn(conn->cls_conn); conn->eh_abort_cnt++; age = session->age; - - ISCSI_DBG_EH(session, "aborting [sc %p itt 0x%x]\n", sc, task->itt); - __iscsi_get_task(task); spin_unlock(&session->back_lock); if (task->state == ISCSI_TASK_PENDING) { diff --git a/drivers/scsi/libiscsi_tcp.c b/drivers/scsi/libiscsi_tcp.c index 883005757ddb..defe08142b75 100644 --- a/drivers/scsi/libiscsi_tcp.c +++ b/drivers/scsi/libiscsi_tcp.c @@ -558,7 +558,10 @@ static int iscsi_tcp_r2t_rsp(struct iscsi_conn *conn, struct iscsi_hdr *hdr) return 0; } task->last_xfer = jiffies; - __iscsi_get_task(task); + if (!iscsi_get_task(task)) { + /* Let the path that got the early rsp complete it */ + return 0; + } tcp_conn = conn->dd_data; rhdr = (struct iscsi_r2t_rsp *)tcp_conn->in.hdr; diff --git a/include/scsi/libiscsi.h b/include/scsi/libiscsi.h index 522fd16f1dbb..9032a214104c 100644 --- a/include/scsi/libiscsi.h +++ b/include/scsi/libiscsi.h @@ -470,7 +470,7 @@ extern struct iscsi_task *iscsi_itt_to_task(struct iscsi_conn *, itt_t); extern void iscsi_requeue_task(struct iscsi_task *task); extern void iscsi_put_task(struct iscsi_task *task); extern void __iscsi_put_task(struct iscsi_task *task); -extern void __iscsi_get_task(struct iscsi_task *task); +extern bool iscsi_get_task(struct iscsi_task *task); extern void iscsi_complete_scsi_task(struct iscsi_task *task, uint32_t exp_cmdsn, uint32_t max_cmdsn);
We currently require that the back_lock is held when calling the functions that manipulate the iscsi_task refcount. The only reason for this is to handle races where we are handing SCSI-ml eh callbacks and the cmd is completing at the same time the normal completion path is running, and we can't return from the EH callback until the driver has stopped accessing the cmd. By holding the back_lock while also accessing the task->state makes it simple to check that a cmd is completing and also get/put a refcount at the same time. The problem is that we don't want to take the back_lock from the xmit path for normal IO since it causes contention with the completion path if the user has chosen to try and split those paths on different CPUs (in this case abusing the CPUs and igoring caching improves perf for some uses). This patch begins to remove the back_lock requirement for iscsi_get/put_task by removing the requirement for the get path. Instead of always holding the back_lock we detect if something has done the last put and is about to call iscsi_free_task. The next patch will then allow iscsi code to do the last put on a task and only grab the back_lock if the refcount is now zero and it's going to call iscsi_free_task. Signed-off-by: Mike Christie <michael.christie@oracle.com> --- drivers/scsi/be2iscsi/be_main.c | 19 ++++++- drivers/scsi/libiscsi.c | 95 +++++++++++++++++++++++---------- drivers/scsi/libiscsi_tcp.c | 5 +- include/scsi/libiscsi.h | 2 +- 4 files changed, 88 insertions(+), 33 deletions(-)