diff mbox series

scsi: lpfc: use unsigned type for num_sge

Message ID 20231220162658.12392-1-dwagner@suse.de
State New
Headers show
Series scsi: lpfc: use unsigned type for num_sge | expand

Commit Message

Daniel Wagner Dec. 20, 2023, 4:26 p.m. UTC
From: Hannes Reinecke <hare@suse.de>

LUNs going into “failed ready running” state observed on >1T and on
even numbers of size (2T, 4T, 6T, 8T and 10T). The issue occurs when
DIF is enabled at the host.

The kernel logs:

  Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0

The host lpfc driver is failing to setup scatter/gather list
(protection data) for the IO's.

The return type lpfc_bg_setup_sgl()/lpfc_bg_setup_sgl_prot() causes
the compiler to remove the most significant bit. Use an unsigned type
instead.

Signed-off-by: Hannes Reinecke <hare@suse.de>
[dwagner: added commit message]
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
 drivers/scsi/lpfc/lpfc_scsi.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

Comments

Dick Kennedy Dec. 22, 2023, 6:04 p.m. UTC | #1
The change is good, however, I  don't think this was really the problem.

We would  like to know if this patch really solved an issue they observed.

 A good data point to know is what adapter they're using.
 If that adapter supports hybrid sgl (i.e. phba->cfg_xpsgl), then we
 would have set the max sg_tablesize = LPFC_MAX_SG_TABLESIZE = 0xffff.

 But even then, this patch implies that dma_map_sg() returned a crazy
 huge amount with the MSB set.

On Wed, Dec 20, 2023 at 8:29 AM Daniel Wagner <dwagner@suse.de> wrote:

> From: Hannes Reinecke <hare@suse.de>
>
> LUNs going into “failed ready running” state observed on >1T and on
> even numbers of size (2T, 4T, 6T, 8T and 10T). The issue occurs when
> DIF is enabled at the host.
>
> The kernel logs:
>
>   Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0
>
> The host lpfc driver is failing to setup scatter/gather list
> (protection data) for the IO's.
>
> The return type lpfc_bg_setup_sgl()/lpfc_bg_setup_sgl_prot() causes
> the compiler to remove the most significant bit. Use an unsigned type
> instead.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> [dwagner: added commit message]
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>  drivers/scsi/lpfc/lpfc_scsi.c | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
> index d26941b131fd..bf879d81846b 100644
> --- a/drivers/scsi/lpfc/lpfc_scsi.c
> +++ b/drivers/scsi/lpfc/lpfc_scsi.c
> @@ -1918,7 +1918,7 @@ lpfc_bg_setup_bpl_prot(struct lpfc_hba *phba, struct
> scsi_cmnd *sc,
>   *
>   * Returns the number of SGEs added to the SGL.
>   **/
> -static int
> +static uint32_t
>  lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
>                 struct sli4_sge *sgl, int datasegcnt,
>                 struct lpfc_io_buf *lpfc_cmd)
> @@ -1926,8 +1926,8 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct
> scsi_cmnd *sc,
>         struct scatterlist *sgde = NULL; /* s/g data entry */
>         struct sli4_sge_diseed *diseed = NULL;
>         dma_addr_t physaddr;
> -       int i = 0, num_sge = 0, status;
> -       uint32_t reftag;
> +       int i = 0, status;
> +       uint32_t reftag, num_sge = 0;
>         uint8_t txop, rxop;
>  #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
>         uint32_t rc;
> @@ -2099,7 +2099,7 @@ lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct
> scsi_cmnd *sc,
>   *
>   * Returns the number of SGEs added to the SGL.
>   **/
> -static int
> +static uint32_t
>  lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
>                 struct sli4_sge *sgl, int datacnt, int protcnt,
>                 struct lpfc_io_buf *lpfc_cmd)
> @@ -2123,8 +2123,8 @@ lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct
> scsi_cmnd *sc,
>         uint32_t rc;
>  #endif
>         uint32_t checking = 1;
> -       uint32_t dma_offset = 0;
> -       int num_sge = 0, j = 2;
> +       uint32_t dma_offset = 0, num_sge = 0;
> +       int j = 2;
>         struct sli4_hybrid_sgl *sgl_xtra = NULL;
>
>         sgpe = scsi_prot_sglist(sc);
> --
> 2.43.0
>
>
Daniel Wagner Jan. 17, 2024, 10:56 a.m. UTC | #2
Hi Dick,

On Fri, Dec 22, 2023 at 10:04:50AM -0800, Dick Kennedy wrote:
> The change is good, however, I  don't think this was really the
> problem.

I tried to write the commit message based on the bug report we got. So
yes, it's possible the it is not correct as I was not really involved
and might missinterpret it.

> We would  like to know if this patch really solved an issue they
> observed.

Yes, it fixes the reported problem.

>  A good data point to know is what adapter they're using.

A bunch of different HPE cards which show this log entry: SN1700E,
SN1610E and SN1200E.

>  If that adapter supports hybrid sgl (i.e. phba->cfg_xpsgl), then we
>  would have set the max sg_tablesize = LPFC_MAX_SG_TABLESIZE = 0xffff.
> 
>  But even then, this patch implies that dma_map_sg() returned a crazy
>  huge amount with the MSB set.

Sure, though this seems to be the case.

One noteworthy information is that DIF needs to be enabled to trigger
it:

# cat /sys/module/lpfc/parameters/lpfc_enable_bg
1
# cat /sys/module/lpfc/parameters/lpfc_prot_guard
2

Thanks,
Daniel
Daniel Wagner Jan. 31, 2024, 2:41 p.m. UTC | #3
On Wed, Jan 17, 2024 at 11:56:27AM +0100, Daniel Wagner wrote:
> Hi Dick,
> 
> On Fri, Dec 22, 2023 at 10:04:50AM -0800, Dick Kennedy wrote:
> > The change is good, however, I  don't think this was really the
> > problem.
> 
> I tried to write the commit message based on the bug report we got. So
> yes, it's possible the it is not correct as I was not really involved
> and might missinterpret it.

Any chance to get this moving forward?
Dick Kennedy Jan. 31, 2024, 2:43 p.m. UTC | #4
Sorry, I got distracted.
The change is fine.

On Wed, Jan 31, 2024 at 6:41 AM Daniel Wagner <dwagner@suse.de> wrote:

> On Wed, Jan 17, 2024 at 11:56:27AM +0100, Daniel Wagner wrote:
> > Hi Dick,
> >
> > On Fri, Dec 22, 2023 at 10:04:50AM -0800, Dick Kennedy wrote:
> > > The change is good, however, I  don't think this was really the
> > > problem.
> >
> > I tried to write the commit message based on the bug report we got. So
> > yes, it's possible the it is not correct as I was not really involved
> > and might missinterpret it.
>
> Any chance to get this moving forward?
>
Martin K. Petersen Feb. 6, 2024, 2:08 a.m. UTC | #5
On Wed, 20 Dec 2023 17:26:58 +0100, Daniel Wagner wrote:

> LUNs going into “failed ready running” state observed on >1T and on
> even numbers of size (2T, 4T, 6T, 8T and 10T). The issue occurs when
> DIF is enabled at the host.
> 
> The kernel logs:
> 
>   Cannot setup S/G List for HBAIO segs 1/1 SGL 512 SCSI 256: 3 0
> 
> [...]

Applied to 6.8/scsi-fixes, thanks!

[1/1] scsi: lpfc: use unsigned type for num_sge
      https://git.kernel.org/mkp/scsi/c/d6c1b19153f9
diff mbox series

Patch

diff --git a/drivers/scsi/lpfc/lpfc_scsi.c b/drivers/scsi/lpfc/lpfc_scsi.c
index d26941b131fd..bf879d81846b 100644
--- a/drivers/scsi/lpfc/lpfc_scsi.c
+++ b/drivers/scsi/lpfc/lpfc_scsi.c
@@ -1918,7 +1918,7 @@  lpfc_bg_setup_bpl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
  *
  * Returns the number of SGEs added to the SGL.
  **/
-static int
+static uint32_t
 lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 		struct sli4_sge *sgl, int datasegcnt,
 		struct lpfc_io_buf *lpfc_cmd)
@@ -1926,8 +1926,8 @@  lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 	struct scatterlist *sgde = NULL; /* s/g data entry */
 	struct sli4_sge_diseed *diseed = NULL;
 	dma_addr_t physaddr;
-	int i = 0, num_sge = 0, status;
-	uint32_t reftag;
+	int i = 0, status;
+	uint32_t reftag, num_sge = 0;
 	uint8_t txop, rxop;
 #ifdef CONFIG_SCSI_LPFC_DEBUG_FS
 	uint32_t rc;
@@ -2099,7 +2099,7 @@  lpfc_bg_setup_sgl(struct lpfc_hba *phba, struct scsi_cmnd *sc,
  *
  * Returns the number of SGEs added to the SGL.
  **/
-static int
+static uint32_t
 lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 		struct sli4_sge *sgl, int datacnt, int protcnt,
 		struct lpfc_io_buf *lpfc_cmd)
@@ -2123,8 +2123,8 @@  lpfc_bg_setup_sgl_prot(struct lpfc_hba *phba, struct scsi_cmnd *sc,
 	uint32_t rc;
 #endif
 	uint32_t checking = 1;
-	uint32_t dma_offset = 0;
-	int num_sge = 0, j = 2;
+	uint32_t dma_offset = 0, num_sge = 0;
+	int j = 2;
 	struct sli4_hybrid_sgl *sgl_xtra = NULL;
 
 	sgpe = scsi_prot_sglist(sc);