Message ID | 20230112140412.667308-1-niklas.cassel@wdc.com |
---|---|
Headers | show |
Series | Add Command Duration Limits support | expand |
On 1/12/23 15:03, Niklas Cassel wrote: > Current ata_scsi_set_sense() calls scsi_build_sense(), which, > in addition to setting the sense data, unconditionally sets the > scsicmd->result to SAM_STAT_CHECK_CONDITION. > > For Command Duration Limits policy 0xD: > The device shall complete the command without error (SAM_STAT_GOOD) > with the additional sense code set to DATA CURRENTLY UNAVAILABLE. > > It is perfectly fine to have sense data for a command that returned > completion without error. > > In order to support for CDL policy 0xD, we have to remove this > assumption that having sense data means that the command failed > (SAM_STAT_CHECK_CONDITION). > > Add a new parameter to ata_scsi_set_sense() to allow us to set > sense data without unconditionally setting SAM_STAT_CHECK_CONDITION. > This new parameter will be used in a follow-up patch. > > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> > --- > drivers/ata/libata-eh.c | 3 ++- > drivers/ata/libata-sata.c | 4 ++-- > drivers/ata/libata-scsi.c | 38 ++++++++++++++++++++------------------ > drivers/ata/libata.h | 4 ++-- > 4 files changed, 26 insertions(+), 23 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes
On 1/12/23 15:03, Niklas Cassel wrote: > In SCSI, we get the sense data as part of the completion, for ATA > however, we need to fetch the sense data as an extra step. For an > aborted ATA command the sense data is fetched via libata's > ->eh_strategy_handler(). > > For Command Duration Limits policy 0xD: > The device shall complete the command without error with the additional > sense code set to DATA CURRENTLY UNAVAILABLE. > > In order to handle this policy in libata, we intend to send a successful > command via SCSI EH, and let libata's ->eh_strategy_handler() fetch the > sense data for the good command. This is similar to how we handle an > aborted ATA command, just that we need to read the Successful NCQ > Commands log instead of the NCQ Command Error log. > > When we get a SATA completion with successful commands, ATA_SENSE will > be set, indicating that some commands in the completion have sense data. > > The sense_valid bitmask in the Sense Data for Successful NCQ Commands > log will inform exactly which commands that had sense data, which might > be a subset of all the commands that was completed in the same > completion. (Yet all will have ATA_SENSE set, since the status is per > completion.) > > The successful commands that have e.g. a "DATA CURRENTLY UNAVAILABLE" > sense data will have a SCSI ML byte set, so scsi_eh_flush_done_q() will > not set the scmd->result to DID_TIME_OUT for these commands. However, > the successful commands that did not have sense data, must not get their > result marked as DID_TIME_OUT by SCSI EH. > > Add a new flag SCMD_EH_SUCCESS_CMD, which tells SCSI EH to not mark a > command as DID_TIME_OUT, even if it has scmd->result == SAM_STAT_GOOD. > > This will be used by libata in a follow-up patch. > > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> > --- > drivers/scsi/scsi_error.c | 3 ++- > include/scsi/scsi_cmnd.h | 5 +++++ > 2 files changed, 7 insertions(+), 1 deletion(-) > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > index 2aa2c2aee6e7..51aa5c1e31b5 100644 > --- a/drivers/scsi/scsi_error.c > +++ b/drivers/scsi/scsi_error.c > @@ -2165,7 +2165,8 @@ void scsi_eh_flush_done_q(struct list_head *done_q) > * scsi_eh_get_sense), scmd->result is already > * set, do not set DID_TIME_OUT. > */ > - if (!scmd->result) > + if (!scmd->result && > + !(scmd->flags & SCMD_EH_SUCCESS_CMD)) > scmd->result |= (DID_TIME_OUT << 16); > SCSI_LOG_ERROR_RECOVERY(3, > scmd_printk(KERN_INFO, scmd, Wouldn't it be better to use '!scmd->result && !scsi_sense_valid(scmd)' instead of a new flag? After all, if we have a valid sense code we _have_ been able to communicate with the device. And as we did so it's questionable whether it should count as a command time out ... Cheers, Hannes
On 1/12/23 15:03, Niklas Cassel wrote: > From: Damien Le Moal <damien.lemoal@opensource.wdc.com> > > Allow scsi_mode_sense() to retrieve sub-pages of mode pages by adding > the subpage argument. Change all the current caller sites to specify > the subpage 0. > > Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> > --- > drivers/scsi/scsi_lib.c | 4 +++- > drivers/scsi/scsi_transport_sas.c | 2 +- > drivers/scsi/sd.c | 9 ++++----- > drivers/scsi/sr.c | 2 +- > include/scsi/scsi_device.h | 5 +++-- > 5 files changed, 12 insertions(+), 10 deletions(-) > Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes
On 1/13/23 20:55, Hannes Reinecke wrote: > On 1/12/23 15:03, Niklas Cassel wrote: >> From: Damien Le Moal <damien.lemoal@opensource.wdc.com> >> >> Introduce the IOPRIO_CLASS_DL priority class to indicate that IOs should >> be executed using duration-limits targets. The duration target to apply >> to a command is indicated using the priority level. Up to 8 levels are >> supported, with level 0 indiating "no limit". >> >> This priority class has effect only if the target device supports the >> command duration limits feature and this feature is enabled by the user. >> In BFQ and mq-deadline, all requests with this new priority class are >> handled using the highest priority class RT and priority level 0. >> >> Signed-off-by: Damien Le Moal <damien.lemoal@opensource.wdc.com> >> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> >> --- >> block/bfq-iosched.c | 10 ++++++++++ >> block/blk-ioprio.c | 3 +++ >> block/ioprio.c | 3 ++- >> block/mq-deadline.c | 1 + >> include/linux/ioprio.h | 2 +- >> include/uapi/linux/ioprio.h | 7 +++++++ >> 6 files changed, 24 insertions(+), 2 deletions(-) >> > I wonder. > > _Normally_ a command timeout is only in force once the command is being > handed off to the driver. As such it doesn't apply for any scheduling > being done before that; most notably the I/O scheduler is not affected > by any command timeout. > > And I was under the impression that CDL really only allows the drive to > impose a command timeout on its own. > (Pray correct me if I'm mistaken) The CDL descriptors to be used for read/write commands are defined by the user and set on the drive (write log command). By default, the drive does not have any CDL descriptor set, so no limit == regular behavior (no timeout aborts). Also keep in mind that CDLs may be of the (1) "best effort" flavor, or the (2) "abort" flavor. For (2), a command missing its CDL limit will be aborted by the device (limit set by the user !). But for (1), the drive will continue executing the command even if the CDL limit is exceeded, so no timeout error. > Hence: does CDL really impinge on the I/O scheduler? Or shouldn't we > treat CDL just like a 'normal' command timeout, only to be activated > when normal command timeout is enabled? No impact on the IO scheduler, at least for now. But given that CDL is all about controlling IO latency, we would miss that goal if we have an IO scheduler that excessively delays a request with a short CDL set... The main point of this patch is to have CDL commands treated similarly to high priority commands to avoid such delays. This is also consistant with the fact that CDL and ATA NCQ priority commands are mutually exclusive features. They cannot be used together. So there we cannot have a mix of CDL and NCQ prio commands together to the same drive.
On Fri, Jan 13, 2023 at 08:57:37AM +0100, Hannes Reinecke wrote: > On 1/12/23 15:03, Niklas Cassel wrote: > > In SCSI, we get the sense data as part of the completion, for ATA > > however, we need to fetch the sense data as an extra step. For an > > aborted ATA command the sense data is fetched via libata's > > ->eh_strategy_handler(). > > > > For Command Duration Limits policy 0xD: > > The device shall complete the command without error with the additional > > sense code set to DATA CURRENTLY UNAVAILABLE. > > > > In order to handle this policy in libata, we intend to send a successful > > command via SCSI EH, and let libata's ->eh_strategy_handler() fetch the > > sense data for the good command. This is similar to how we handle an > > aborted ATA command, just that we need to read the Successful NCQ > > Commands log instead of the NCQ Command Error log. > > > > When we get a SATA completion with successful commands, ATA_SENSE will > > be set, indicating that some commands in the completion have sense data. > > > > The sense_valid bitmask in the Sense Data for Successful NCQ Commands > > log will inform exactly which commands that had sense data, which might > > be a subset of all the commands that was completed in the same > > completion. (Yet all will have ATA_SENSE set, since the status is per > > completion.) > > > > The successful commands that have e.g. a "DATA CURRENTLY UNAVAILABLE" > > sense data will have a SCSI ML byte set, so scsi_eh_flush_done_q() will > > not set the scmd->result to DID_TIME_OUT for these commands. However, > > the successful commands that did not have sense data, must not get their > > result marked as DID_TIME_OUT by SCSI EH. > > > > Add a new flag SCMD_EH_SUCCESS_CMD, which tells SCSI EH to not mark a > > command as DID_TIME_OUT, even if it has scmd->result == SAM_STAT_GOOD. > > > > This will be used by libata in a follow-up patch. > > > > Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> > > --- > > drivers/scsi/scsi_error.c | 3 ++- > > include/scsi/scsi_cmnd.h | 5 +++++ > > 2 files changed, 7 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c > > index 2aa2c2aee6e7..51aa5c1e31b5 100644 > > --- a/drivers/scsi/scsi_error.c > > +++ b/drivers/scsi/scsi_error.c > > @@ -2165,7 +2165,8 @@ void scsi_eh_flush_done_q(struct list_head *done_q) > > * scsi_eh_get_sense), scmd->result is already > > * set, do not set DID_TIME_OUT. > > */ > > - if (!scmd->result) > > + if (!scmd->result && > > + !(scmd->flags & SCMD_EH_SUCCESS_CMD)) > > scmd->result |= (DID_TIME_OUT << 16); > > SCSI_LOG_ERROR_RECOVERY(3, > > scmd_printk(KERN_INFO, scmd, > Wouldn't it be better to use '!scmd->result && !scsi_sense_valid(scmd)' > instead of a new flag? > After all, if we have a valid sense code we _have_ been able to communicate > with the device. And as we did so it's questionable whether it should count > as a command time out ... Hello Hannes, Thanks a lot for helping out reviewing this series! Unfortunately, your suggestion won't work. Let me explain: When you get a FIS, the ACT register will have a bit set for each command that finished, however, all the commands will share a single STATUS value (since there is just a shared STATUS field in the FIS). So let's say that tags 0-3 got finished (i.e. bits 0-3 are set in the ACT field) and the STATUS field has the "Sense Data Available" bit set. This just tells us that at least one of tags 0-3 has sense data. In order to know which of these tags that actually has sense data, we need to read the "Sense Data for Successful NCQ Commands log", which contains a sense_valid bitmask (which contains one bit for each of the 32 tags). So reading the "Sense Data for Successful NCQ Commands log" might tell us that just tag 0-1 have sense data. So, libata calls ata_qc_schedule_eh() on tags 0-3, wait until SCSI calls libata .eh_strategy_handler(). libata .eh_strategy_handler() will read the "Sense Data for Successful NCQ Commands log", which will see that there is sense data for tags 0-1, and will add sense data for those commands, and call scsi_check_sense() for tags 0-1. ata_eh_finish() will finally be called, to determine what to do with the commands that belonged to EH. The code looks like this: if (qc->flags & ATA_QCFLAG_SENSE_VALID || qc->flags & ATA_QCFLAG_EH_SUCCESS_CMD) { ata_eh_qc_complete(qc); } So it will call complete for all 4 tags, regardless is they had sense data or not. scsi_eh_flush_done_q() will soon be called, and since ata_eh_qc_complete() sets scmd->retries == scmd->allowed, none of the four commands will be retired. if (!scmd->result && !(scmd->flags & SCMD_EH_SUCCESS_CMD)) scmd->result |= (DID_TIME_OUT << 16); The 2 commands with sense data will not get DID_TIMEOUT, because scmd->result has the SCSI ML byte set (which is set by scsi_check_sense()). The 2 commands without sense data will have scmd->result == 0, so they will get DID_TIME_OUT set if we don't have the extra !(scmd->flags & SCMD_EH_SUCCESS_CMD)) condition. SCSI could add an additional check for: if (!scmd->result && !scsi_sense_valid(scmd) && !(scmd->flags & SCMD_EH_SUCCESS_CMD)) scmd->result |= (DID_TIME_OUT << 16); so that a command with does have sense data, but scsi_check_sense() did not set any SCSI ML byte, does not get DID_TIME_OUT set. However, for CDL policy 0xD, which this patch cares about, we would still need the "&& !(scmd->flags & SCMD_EH_SUCCESS_CMD))", so at least from a CDL perspective, I don't see how any benefit of also adding a check for "&& !scsi_sense_valid(scmd)". Kind regards, Niklas
On 1/16/23 13:43, Niklas Cassel wrote: > On Fri, Jan 13, 2023 at 08:57:37AM +0100, Hannes Reinecke wrote: >> On 1/12/23 15:03, Niklas Cassel wrote: >>> In SCSI, we get the sense data as part of the completion, for ATA >>> however, we need to fetch the sense data as an extra step. For an >>> aborted ATA command the sense data is fetched via libata's >>> ->eh_strategy_handler(). >>> >>> For Command Duration Limits policy 0xD: >>> The device shall complete the command without error with the additional >>> sense code set to DATA CURRENTLY UNAVAILABLE. >>> >>> In order to handle this policy in libata, we intend to send a successful >>> command via SCSI EH, and let libata's ->eh_strategy_handler() fetch the >>> sense data for the good command. This is similar to how we handle an >>> aborted ATA command, just that we need to read the Successful NCQ >>> Commands log instead of the NCQ Command Error log. >>> >>> When we get a SATA completion with successful commands, ATA_SENSE will >>> be set, indicating that some commands in the completion have sense data. >>> >>> The sense_valid bitmask in the Sense Data for Successful NCQ Commands >>> log will inform exactly which commands that had sense data, which might >>> be a subset of all the commands that was completed in the same >>> completion. (Yet all will have ATA_SENSE set, since the status is per >>> completion.) >>> >>> The successful commands that have e.g. a "DATA CURRENTLY UNAVAILABLE" >>> sense data will have a SCSI ML byte set, so scsi_eh_flush_done_q() will >>> not set the scmd->result to DID_TIME_OUT for these commands. However, >>> the successful commands that did not have sense data, must not get their >>> result marked as DID_TIME_OUT by SCSI EH. >>> >>> Add a new flag SCMD_EH_SUCCESS_CMD, which tells SCSI EH to not mark a >>> command as DID_TIME_OUT, even if it has scmd->result == SAM_STAT_GOOD. >>> >>> This will be used by libata in a follow-up patch. >>> >>> Signed-off-by: Niklas Cassel <niklas.cassel@wdc.com> >>> --- >>> drivers/scsi/scsi_error.c | 3 ++- >>> include/scsi/scsi_cmnd.h | 5 +++++ >>> 2 files changed, 7 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c >>> index 2aa2c2aee6e7..51aa5c1e31b5 100644 >>> --- a/drivers/scsi/scsi_error.c >>> +++ b/drivers/scsi/scsi_error.c >>> @@ -2165,7 +2165,8 @@ void scsi_eh_flush_done_q(struct list_head *done_q) >>> * scsi_eh_get_sense), scmd->result is already >>> * set, do not set DID_TIME_OUT. >>> */ >>> - if (!scmd->result) >>> + if (!scmd->result && >>> + !(scmd->flags & SCMD_EH_SUCCESS_CMD)) >>> scmd->result |= (DID_TIME_OUT << 16); >>> SCSI_LOG_ERROR_RECOVERY(3, >>> scmd_printk(KERN_INFO, scmd, >> Wouldn't it be better to use '!scmd->result && !scsi_sense_valid(scmd)' >> instead of a new flag? >> After all, if we have a valid sense code we _have_ been able to communicate >> with the device. And as we did so it's questionable whether it should count >> as a command time out ... > > Hello Hannes, > > Thanks a lot for helping out reviewing this series! > > Unfortunately, your suggestion won't work. > > > Let me explain: > > When you get a FIS, the ACT register will have a bit set for each > command that finished, however, all the commands will share a single > STATUS value (since there is just a shared STATUS field in the FIS). > > So let's say that tags 0-3 got finished (i.e. bits 0-3 are set in the > ACT field) and the STATUS field has the "Sense Data Available" bit set. > > This just tells us that at least one of tags 0-3 has sense data. > > > In order to know which of these tags that actually has sense data, > we need to read the "Sense Data for Successful NCQ Commands log", > which contains a sense_valid bitmask (which contains one bit for > each of the 32 tags). > > So reading the "Sense Data for Successful NCQ Commands log" might > tell us that just tag 0-1 have sense data. > > So, libata calls ata_qc_schedule_eh() on tags 0-3, wait until SCSI calls > libata .eh_strategy_handler(). libata .eh_strategy_handler() will read the > "Sense Data for Successful NCQ Commands log", which will see that there is > sense data for tags 0-1, and will add sense data for those commands, and > call scsi_check_sense() for tags 0-1. > > ata_eh_finish() will finally be called, to determine what to do with the > commands that belonged to EH. > > The code looks like this: > if (qc->flags & ATA_QCFLAG_SENSE_VALID || > qc->flags & ATA_QCFLAG_EH_SUCCESS_CMD) { > ata_eh_qc_complete(qc); > } > > So it will call complete for all 4 tags, regardless is they had sense data > or not. > > > scsi_eh_flush_done_q() will soon be called, and since ata_eh_qc_complete() > sets scmd->retries == scmd->allowed, none of the four commands will be retired. > > if (!scmd->result && > !(scmd->flags & SCMD_EH_SUCCESS_CMD)) > scmd->result |= (DID_TIME_OUT << 16); > > The 2 commands with sense data will not get DID_TIMEOUT, > because scmd->result has the SCSI ML byte set > (which is set by scsi_check_sense()). > > The 2 commands without sense data will have scmd->result == 0, > so they will get DID_TIME_OUT set if we don't have the extra > !(scmd->flags & SCMD_EH_SUCCESS_CMD)) condition. > > > SCSI could add an additional check for: > > if (!scmd->result && !scsi_sense_valid(scmd) && > !(scmd->flags & SCMD_EH_SUCCESS_CMD)) > scmd->result |= (DID_TIME_OUT << 16); > > so that a command with does have sense data, but scsi_check_sense() > did not set any SCSI ML byte, does not get DID_TIME_OUT set. > > However, for CDL policy 0xD, which this patch cares about, > we would still need the "&& !(scmd->flags & SCMD_EH_SUCCESS_CMD))", > so at least from a CDL perspective, I don't see how any benefit of > also adding a check for "&& !scsi_sense_valid(scmd)". > Right, I see. Thanks for the explanation. You can add: Reviewed-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes