Message ID | 20240904210304.2947789-1-bvanassche@acm.org |
---|---|
State | New |
Series | [v3] sd: Retry START STOP UNIT commands |
On 2024/09/05 6:03, Bart Van Assche wrote:
> During system resume, sd_start_stop_device() submits a START STOP UNIT
> command to the SCSI device that is being resumed. That command is not
> retried in case of a unit attention and hence may fail. An example:
>
> [16575.983359] sd 0:0:0:3: [sdd] Starting disk
> [16575.983693] sd 0:0:0:3: [sdd] Start/Stop Unit failed: Result: hostbyte=0x00 driverbyte=DRIVER_OK
> [16575.983712] sd 0:0:0:3: [sdd] Sense Key : 0x6
> [16575.983730] sd 0:0:0:3: [sdd] ASC=0x29 ASCQ=0x0
> [16575.983738] sd 0:0:0:3: PM: dpm_run_callback(): scsi_bus_resume+0x0/0xa0 returns -5
> [16575.983783] sd 0:0:0:3: PM: failed to resume async: error -5
>
> Make the SCSI core retry the START STOP UNIT command if the device
> reports that it has been powered on or that it has been reset.
>
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Mike Christie <michael.christie@oracle.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>

Looks OK to me.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Bart,

> During system resume, sd_start_stop_device() submits a START STOP UNIT
> command to the SCSI device that is being resumed. That command is not
> retried in case of a unit attention and hence may fail. An example:

Thanks for making this change!

Applied to 6.12/scsi-staging.
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 9db86943d04c..9f09060ab401 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -4093,9 +4093,38 @@ static int sd_start_stop_device(struct scsi_disk *sdkp, int start)
 {
 	unsigned char cmd[6] = { START_STOP };	/* START_VALID */
 	struct scsi_sense_hdr sshdr;
+	struct scsi_failure failure_defs[] = {
+		{
+			/* Power on, reset, or bus device reset occurred */
+			.sense = UNIT_ATTENTION,
+			.asc = 0x29,
+			.ascq = 0,
+			.result = SAM_STAT_CHECK_CONDITION,
+		},
+		{
+			/* Power on occurred */
+			.sense = UNIT_ATTENTION,
+			.asc = 0x29,
+			.ascq = 1,
+			.result = SAM_STAT_CHECK_CONDITION,
+		},
+		{
+			/* SCSI bus reset */
+			.sense = UNIT_ATTENTION,
+			.asc = 0x29,
+			.ascq = 2,
+			.result = SAM_STAT_CHECK_CONDITION,
+		},
+		{}
+	};
+	struct scsi_failures failures = {
+		.total_allowed = 3,
+		.failure_definitions = failure_defs,
+	};
 	const struct scsi_exec_args exec_args = {
 		.sshdr = &sshdr,
 		.req_flags = BLK_MQ_REQ_PM,
+		.failures = &failures,
 	};
 	struct scsi_device *sdp = sdkp->device;
 	int res;
During system resume, sd_start_stop_device() submits a START STOP UNIT
command to the SCSI device that is being resumed. That command is not
retried in case of a unit attention and hence may fail. An example:

[16575.983359] sd 0:0:0:3: [sdd] Starting disk
[16575.983693] sd 0:0:0:3: [sdd] Start/Stop Unit failed: Result: hostbyte=0x00 driverbyte=DRIVER_OK
[16575.983712] sd 0:0:0:3: [sdd] Sense Key : 0x6
[16575.983730] sd 0:0:0:3: [sdd] ASC=0x29 ASCQ=0x0
[16575.983738] sd 0:0:0:3: PM: dpm_run_callback(): scsi_bus_resume+0x0/0xa0 returns -5
[16575.983783] sd 0:0:0:3: PM: failed to resume async: error -5

Make the SCSI core retry the START STOP UNIT command if the device
reports that it has been powered on or that it has been reset.

Cc: Damien Le Moal <dlemoal@kernel.org>
Cc: Mike Christie <michael.christie@oracle.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---

Changes compared to v2:
 - Dropped the SCMD_RETRY_PASSTHROUGH flag and use the SCSI failure mechanism
   instead.

Changes compared to v1:
 - Renamed SCMD_RETRY_PASST_ON_UA into SCMD_RETRY_PASSTHROUGH.

 drivers/scsi/sd.c | 29 +++++++++++++++++++++++++++++
 1 file changed, 29 insertions(+)
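
For readers unfamiliar with the failure table that this patch hands to the SCSI core, the following is a rough user-space sketch of the idea, not kernel code. Every name in it (`failure_def`, `cmd_result`, `is_retryable`, `submit_with_retries`, `fake_device`, `fake_submit`) is hypothetical and chosen for illustration only. It assumes that a command is resubmitted when its status and sense data match a table entry, and that `total_allowed` caps the number of retries across all entries. The real matching logic lives in the SCSI core and handles more cases than this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SENSE_UNIT_ATTENTION   0x06 /* sense key, as in the log above */
#define STATUS_GOOD            0x00
#define STATUS_CHECK_CONDITION 0x02

/* Hypothetical stand-in for one retryable failure description. */
struct failure_def {
	uint8_t status;
	uint8_t sense;
	uint8_t asc;
	uint8_t ascq;
};

/* Hypothetical stand-in for the outcome of one command submission. */
struct cmd_result {
	uint8_t status;
	uint8_t sense;
	uint8_t asc;
	uint8_t ascq;
};

/* The three ASC 0x29 unit attentions listed in the patch. */
static const struct failure_def start_stop_retryable[] = {
	{ STATUS_CHECK_CONDITION, SENSE_UNIT_ATTENTION, 0x29, 0x00 },
	{ STATUS_CHECK_CONDITION, SENSE_UNIT_ATTENTION, 0x29, 0x01 },
	{ STATUS_CHECK_CONDITION, SENSE_UNIT_ATTENTION, 0x29, 0x02 },
};

/* True if the result matches any entry of the table. */
static bool is_retryable(const struct cmd_result *r)
{
	size_t n = sizeof(start_stop_retryable) / sizeof(start_stop_retryable[0]);

	for (size_t i = 0; i < n; i++) {
		const struct failure_def *f = &start_stop_retryable[i];

		if (r->status == f->status && r->sense == f->sense &&
		    r->asc == f->asc && r->ascq == f->ascq)
			return true;
	}
	return false;
}

typedef struct cmd_result (*submit_fn)(void *ctx);

/*
 * Submit a command and resubmit it while the result matches the table,
 * up to total_allowed retries. Returns the number of submissions made;
 * the last result is stored in *out.
 */
static int submit_with_retries(submit_fn submit, void *ctx,
			       int total_allowed, struct cmd_result *out)
{
	int attempts = 0;
	int retries = 0;

	assert(submit != NULL && out != NULL);

	for (;;) {
		*out = submit(ctx);
		attempts++;
		if (!is_retryable(out) || retries >= total_allowed)
			return attempts;
		retries++;
	}
}

/* Fake device that reports a fixed number of unit attentions first. */
struct fake_device {
	int pending_unit_attentions;
};

static struct cmd_result fake_submit(void *ctx)
{
	struct fake_device *dev = ctx;

	if (dev->pending_unit_attentions > 0) {
		dev->pending_unit_attentions--;
		return (struct cmd_result){ STATUS_CHECK_CONDITION,
					    SENSE_UNIT_ATTENTION, 0x29, 0x00 };
	}
	return (struct cmd_result){ STATUS_GOOD, 0, 0, 0 };
}
```

In this model, a `fake_device` with one pending unit attention succeeds on its second submission, which corresponds to the resume failure in the log above being absorbed by a retry. A device that keeps reporting the unit attention is submitted four times with a limit of 3 (the initial attempt plus three retries) before the error is returned to the caller.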