Message ID | 20231220162658.12392-1-dwagner@suse.de |
---|---|
State | New |
Headers | show |
Series | scsi: lpfc: use unsigned type for num_sge | expand |
The change is good, however, I don't think this was really the problem. We would like to know if this patch really solved an issue they observed. A good data point to know is what adapter they're using. If that adapter supports hybrid sgl (i.e. phba->cfg_xpsgl), then we would have set the max sg_tablesize = LPFC_MAX_SG_TABLESIZE = 0xffff. But even then, this patch implies that dma_map_sg() returned a crazy huge amount with the MSB set. On Wed, Dec 20, 2023 at 8:29 AM Daniel Wagner <dwagner@suse.de> wrote: > From: Hannes Reinecke <hare@suse.de> > > LUNs going into “failed ready running” state observed on >1T and on > even numbers of size (2T, 4T, 6T, 8T and 10T). The issue occurs when > DIF is enabled at the host. > > The kernel logs: > > Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0 > > The host lpfc driver is failing to setup scatter/gather list > (protection data) for the IO's. > > The return type lpfc_bg_setup_sgl()/lpfc_bg_setup_sgl_prot() causes > the compiler to remove the most significant bit. Use an unsigned type > instead. > > Signed-off-by: Hannes Reinecke <hare@suse.de> > [dwagner: added commit message] > Signed-off-by: Daniel Wagner <dwagner@suse.de> > --- > drivers/scsi/lpfc/lpfc_scsi.c | 12 ++++++------ > 1 file changed, 6 insertions(+), 6 deletions(-) > > diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c > index d26941b131fd..bf879d81846b 100644 > --- a/drivers/scsi/lpfc/lpfc_scsi.c > +++ b/drivers/scsi/lpfc/lpfc_scsi.c > @@ -1918,7 +1918,7 @@ lpfc_bg_setup_bpl_prot(struct lpfc_hba *phba, struct > scsi_cmnd *sc, > * > * Returns the number of SGEs added to the SGL. > **/ > -static int > +static uint32_t > lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc, > struct sli4_sge *sgl, int datasegcnt, > struct lpfc_io_buf *lpfc_cmd) > @@ -1926,8 +1926,8 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct > scsi_cmnd *sc, > struct scatterlist *sgde = NULL; /* s/g data entry */ > struct sli4_sge_diseed *diseed = NULL; > dma_addr_t physaddr; > - int i = 0, num_sge = 0, status; > - uint32_t reftag; > + int i = 0, status; > + uint32_t reftag, num_sge = 0; > uint8_t txop, rxop; > #ifdef CONFIG_SCSI_LPFC_DEBUG_FS > uint32_t rc; > @@ -2099,7 +2099,7 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct > scsi_cmnd *sc, > * > * Returns the number of SGEs added to the SGL. > **/ > -static int > +static uint32_t > lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc, > struct sli4_sge *sgl, int datacnt, int protcnt, > struct lpfc_io_buf *lpfc_cmd) > @@ -2123,8 +2123,8 @@ lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct > scsi_cmnd *sc, > uint32_t rc; > #endif > uint32_t checking = 1; > - uint32_t dma_offset = 0; > - int num_sge = 0, j = 2; > + uint32_t dma_offset = 0, num_sge = 0; > + int j = 2; > struct sli4_hybrid_sgl *sgl_xtra = NULL; > > sgpe = scsi_prot_sglist(sc); > -- > 2.43.0 > >
Hi Dick, On Fri, Dec 22, 2023 at 10:04:50AM -0800, Dick Kennedy wrote: > The change is good, however, I don't think this was really the > problem. I tried to write the commit message based on the bug report we got. So yes, it's possible the it is not correct as I was not really involved and might missinterpret it. > We would like to know if this patch really solved an issue they > observed. Yes, it fixes the reported problem. > A good data point to know is what adapter they're using. A bunch of different HPE cards which show this log entry: SN1700E, SN1610E and SN1200E. > If that adapter supports hybrid sgl (i.e. phba->cfg_xpsgl), then we > would have set the max sg_tablesize = LPFC_MAX_SG_TABLESIZE = 0xffff. > > But even then, this patch implies that dma_map_sg() returned a crazy > huge amount with the MSB set. Sure, though this seems to be the case. One noteworthy information is that DIF needs to be enabled to trigger it: # cat /sys/module/lpfc/parameters/lpfc_enable_bg 1 # cat /sys/module/lpfc/parameters/lpfc_prot_guard 2 Thanks, Daniel
On Wed, Jan 17, 2024 at 11:56:27AM +0100, Daniel Wagner wrote: > Hi Dick, > > On Fri, Dec 22, 2023 at 10:04:50AM -0800, Dick Kennedy wrote: > > The change is good, however, I don't think this was really the > > problem. > > I tried to write the commit message based on the bug report we got. So > yes, it's possible the it is not correct as I was not really involved > and might missinterpret it. Any chance to get this moving forward?
Sorry, I got distracted. The change is fine. On Wed, Jan 31, 2024 at 6:41 AM Daniel Wagner <dwagner@suse.de> wrote: > On Wed, Jan 17, 2024 at 11:56:27AM +0100, Daniel Wagner wrote: > > Hi Dick, > > > > On Fri, Dec 22, 2023 at 10:04:50AM -0800, Dick Kennedy wrote: > > > The change is good, however, I don't think this was really the > > > problem. > > > > I tried to write the commit message based on the bug report we got. So > > yes, it's possible the it is not correct as I was not really involved > > and might missinterpret it. > > Any chance to get this moving forward? >
On Wed, 20 Dec 2023 17:26:58 +0100, Daniel Wagner wrote: > LUNs going into “failed ready running” state observed on >1T and on > even numbers of size (2T, 4T, 6T, 8T and 10T). The issue occurs when > DIF is enabled at the host. > > The kernel logs: > > Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0 > > [...] Applied to 6.8/scsi-fixes, thanks! [1/1] scsi: lpfc: use unsigned type for num_sge https://git.kernel.org/mkp/scsi/c/d6c1b19153f9
diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c index d26941b131fd..bf879d81846b 100644 --- a/drivers/scsi/lpfc/lpfc_scsi.c +++ b/drivers/scsi/lpfc/lpfc_scsi.c @@ -1918,7 +1918,7 @@ lpfc_bg_setup_bpl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc, * * Returns the number of SGEs added to the SGL. **/ -static int +static uint32_t lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc, struct sli4_sge *sgl, int datasegcnt, struct lpfc_io_buf *lpfc_cmd) @@ -1926,8 +1926,8 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc, struct scatterlist *sgde = NULL; /* s/g data entry */ struct sli4_sge_diseed *diseed = NULL; dma_addr_t physaddr; - int i = 0, num_sge = 0, status; - uint32_t reftag; + int i = 0, status; + uint32_t reftag, num_sge = 0; uint8_t txop, rxop; #ifdef CONFIG_SCSI_LPFC_DEBUG_FS uint32_t rc; @@ -2099,7 +2099,7 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc, * * Returns the number of SGEs added to the SGL. **/ -static int +static uint32_t lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc, struct sli4_sge *sgl, int datacnt, int protcnt, struct lpfc_io_buf *lpfc_cmd) @@ -2123,8 +2123,8 @@ lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc, uint32_t rc; #endif uint32_t checking = 1; - uint32_t dma_offset = 0; - int num_sge = 0, j = 2; + uint32_t dma_offset = 0, num_sge = 0; + int j = 2; struct sli4_hybrid_sgl *sgl_xtra = NULL; sgpe = scsi_prot_sglist(sc);