Message ID | 1601268657-940-1-git-send-email-muneendra.kumar@broadcom.com |
---|---|
Series | scsi: Support to handle Intermittent errors |
On 9/27/20 11:50 PM, Muneendra wrote:
> This patch adds a support to prevent retries of all the pending/inflight
> io's after an abort succeeds on a particular device when transport
> connectivity to the device is encountering intermittent errors.
>
> Intermittent connectivity is a condition that can be detected by transport
> fabric notifications. A service can monitor the ELS notifications and
> take action on all the outstanding io's of a scsi device at that instant.
>

Is the service mentioned above a new daemon or is it integrated into
something like multipathd?

What does the part about monitoring ELS notifications mean? Is the
service just doing something like a ELS ECHO, or is it able to watch
the IO on the wire/card (like if you did tcpdump and watched iscsi/tcp
traffic) or is it something completely different?
On 10/2/2020 10:01 AM, Mike Christie wrote:
> On 9/27/20 11:50 PM, Muneendra wrote:
>> This patch adds a support to prevent retries of all the pending/inflight
>> io's after an abort succeeds on a particular device when transport
>> connectivity to the device is encountering intermittent errors.
>>
>> Intermittent connectivity is a condition that can be detected by transport
>> fabric notifications. A service can monitor the ELS notifications and
>> take action on all the outstanding io's of a scsi device at that instant.
>>
>
> Is the service mentioned above a new daemon or is it integrated into
> something like multipathd?
>
> What does the part about monitoring ELS notifications mean? Is the
> service just doing something like a ELS ECHO, or is it able to watch
> the IO on the wire/card (like if you did tcpdump and watched iscsi/tcp
> traffic) or is it something completely different?
>

For the last part.... the FC drivers, when receiving FC FPIN ELS's are
calling a scsi transport routine with the FPIN payload. The transport
is pushing this as an "event" via netlink. An app bound to the local
address used by the scsi transport can receive the event and parse it.

This is a new daemon, specific to FC, which monitors for FPIN events,
parses the related topology devices, then interacts with sysfs and
possibly multipath based on what it's seeing from the fabric.

-- james
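
For illustration only, the receive-and-parse step described above could look roughly like the sketch below. It is not the daemon from this series. The netlink protocol number, multicast group, magic value, message type, FPIN event code and the field layout are all assumptions taken from one reading of the kernel's uapi headers (linux/netlink.h, scsi/scsi_netlink.h, scsi/scsi_netlink_fc.h) and should be checked against the running kernel. The function names parse_fc_event and listen are hypothetical.

```python
import socket
import struct

# Assumed values; verify against the kernel uapi headers before relying on them.
NETLINK_SCSITRANSPORT = 18       # netlink protocol used by the scsi transport
SCSI_NL_GRP_FC_EVENTS = 1 << 2   # multicast group carrying FC events
SCSI_NL_MAGIC = 0xA1B2
SCSI_NL_TRANSPORT_FC = 1
FC_NL_ASYNC_EVENT = 0x0100
FCH_EVT_LINK_FPIN = 0x501

# struct nlmsghdr: len, type, flags, seq, pid (16 bytes)
NLMSG_HDR = struct.Struct("=IHHII")
# Assumed struct fc_nl_event up to (not including) event_data:
# version, transport, magic, msgtype, msglen, seconds, vendor_id,
# host_no, event_datalen, event_num, event_code (36 bytes)
FC_NL_EVENT = struct.Struct("=BBHHHQQHHII")


def parse_fc_event(buf):
    """Parse one netlink message; return a dict, or None if it is not an FC event."""
    if len(buf) < NLMSG_HDR.size + FC_NL_EVENT.size:
        return None
    body = buf[NLMSG_HDR.size:]
    (_version, transport, magic, msgtype, _msglen, _seconds, _vendor_id,
     host_no, datalen, event_num, event_code) = FC_NL_EVENT.unpack_from(body)
    if (magic != SCSI_NL_MAGIC or transport != SCSI_NL_TRANSPORT_FC
            or msgtype != FC_NL_ASYNC_EVENT):
        return None
    # The event payload (for FPIN, the ELS payload) follows the fixed fields.
    data = body[FC_NL_EVENT.size:FC_NL_EVENT.size + datalen]
    return {
        "host_no": host_no,
        "event_num": event_num,
        "event_code": event_code,
        "is_fpin": event_code == FCH_EVT_LINK_FPIN,
        "data": data,
    }


def listen():
    """Bind to the FC event group and print FPIN events (Linux only)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_RAW, NETLINK_SCSITRANSPORT)
    sock.bind((0, SCSI_NL_GRP_FC_EVENTS))  # (pid, groups); pid 0 lets the kernel pick
    while True:
        event = parse_fc_event(sock.recv(65536))
        if event and event["is_fpin"]:
            print("FPIN on host%d: %d payload bytes"
                  % (event["host_no"], len(event["data"])))
```

Decoding the FPIN descriptors inside the payload, and acting on them through sysfs or multipath, is left out; that part depends on what the daemon decides to do with each notification.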