Message ID | 20201201082730.24158-1-njavali@marvell.com |
---|---|
Headers | show |
Series | qla2xxx bug fixes | expand |
Hi Daniel, Comments inline.. > -----Original Message----- > From: Daniel Wagner <dwagner@suse.de> > Sent: Tuesday, December 1, 2020 2:32 PM > To: Nilesh Javali <njavali@marvell.com> > Cc: martin.petersen@oracle.com; linux-scsi@vger.kernel.org; GR-QLogic- > Storage-Upstream <GR-QLogic-Storage-Upstream@marvell.com> > Subject: Re: [PATCH 05/15] qla2xxx: Don't check for fw_started while posting > nvme command > > On Tue, Dec 01, 2020 at 12:27:20AM -0800, Nilesh Javali wrote: > > From: Saurav Kashyap <skashyap@marvell.com> > > > > NVMe commands can come only after successful addition of rport and nvme > > connect, and rport is only registered after FW started bit is set. Remove the > > redundant check. > > > > Signed-off-by: Saurav Kashyap <skashyap@marvell.com> > > Signed-off-by: Nilesh Javali <njavali@marvell.com> > > --- > > drivers/scsi/qla2xxx/qla_nvme.c | 12 ++++-------- > > 1 file changed, 4 insertions(+), 8 deletions(-) > > > > diff --git a/drivers/scsi/qla2xxx/qla_nvme.c > b/drivers/scsi/qla2xxx/qla_nvme.c > > index b7a1dc24db38..d4159d5a4ffd 100644 > > --- a/drivers/scsi/qla2xxx/qla_nvme.c > > +++ b/drivers/scsi/qla2xxx/qla_nvme.c > > @@ -554,19 +554,15 @@ static int qla_nvme_post_cmd(struct > nvme_fc_local_port *lport, > > > > fcport = qla_rport->fcport; > > > > - if (!qpair || !fcport) > > - return -ENODEV; > > - > > - if (!qpair->fw_started || fcport->deleted) > > + if (unlikely(!qpair || !fcport || fcport->deleted)) > > return -EBUSY; > > This reverts the fix from patch #1 in this series. What's the reasoning > that needs to return EBUSY when !qpair || !fcport is true? <SK> Ideally driver should not hit (!qpair || !fcport) case. The patch was to remove fw_started flag and consolidate other checks. We want IO to retry until remote port is deleted and below condition is hit. ----------<condition>-------- if (!(fcport->nvme_flag & NVME_FLAG_REGISTERED)) return rval; ----------<condition>--------- Thanks, ~Saurav
Hi Saurav On Tue, Dec 01, 2020 at 09:39:05AM +0000, Saurav Kashyap wrote: > > > --- a/drivers/scsi/qla2xxx/qla_nvme.c > > > +++ b/drivers/scsi/qla2xxx/qla_nvme.c > > > @@ -554,19 +554,15 @@ static int qla_nvme_post_cmd(struct > > nvme_fc_local_port *lport, > > > > > > fcport = qla_rport->fcport; > > > > > > - if (!qpair || !fcport) > > > - return -ENODEV; > > > - > > > - if (!qpair->fw_started || fcport->deleted) > > > + if (unlikely(!qpair || !fcport || fcport->deleted)) > > > return -EBUSY; > > > > This reverts the fix from patch #1 in this series. What's the reasoning > > that needs to return EBUSY when !qpair || !fcport is true? > > Ideally driver should not hit (!qpair || !fcport) case. The patch was > to remove fw_started flag and consolidate other checks. Looking again on the patch I think I got confused. > We want IO to retry until remote port is deleted and below condition is hit. The result of this patch is that in EBUSY will be returned in all the cases, not just for the case of fcport->deleted. So all is good from my point of view. Thanks for explaining. Sorry for the noise. Thanks, Daniel
> On Dec 1, 2020, at 2:27 AM, Nilesh Javali <njavali@marvell.com> wrote: > > From: Daniel Wagner <dwagner@suse.de> > > When the fcport is about to be deleted we should return EBUSY instead of > ENODEV. Only for EBUSY will the request be requeued in a multipath setup. > > Also return EBUSY when the firmware has not yet started to avoid dropping > the request. > > Link: https://lore.kernel.org/r/20201014073048.36219-1-dwagner@suse.de > Reviewed-by: Arun Easi <aeasi@marvell.com> > Signed-off-by: Daniel Wagner <dwagner@suse.de> > Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> > --- > drivers/scsi/qla2xxx/qla_nvme.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c > index 1f9005125313..b7a1dc24db38 100644 > --- a/drivers/scsi/qla2xxx/qla_nvme.c > +++ b/drivers/scsi/qla2xxx/qla_nvme.c > @@ -554,10 +554,12 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport, > > fcport = qla_rport->fcport; > > - if (!qpair || !fcport || (qpair && !qpair->fw_started) || > - (fcport && fcport->deleted)) > + if (!qpair || !fcport) > return -ENODEV; > > + if (!qpair->fw_started || fcport->deleted) > + return -EBUSY; > + > vha = fcport->vha; > > if (!(fcport->nvme_flag & NVME_FLAG_REGISTERED)) > -- > 2.19.0.rc0 > Looks Good. Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> -- Himanshu Madhani Oracle Linux Engineering
> On Dec 1, 2020, at 2:27 AM, Nilesh Javali <njavali@marvell.com> wrote: > > From: Quinn Tran <qutran@marvell.com> > > Driver created too many QPairs(126) with 28xx adapter. > Limit the number of CPUs to lower wasted resources. > > Signed-off-by: Quinn Tran <qutran@marvell.com> > Signed-off-by: Nilesh Javali <njavali@marvell.com> > --- > drivers/scsi/qla2xxx/qla_isr.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c > index a24b82de4aab..77dd7630c3f8 100644 > --- a/drivers/scsi/qla2xxx/qla_isr.c > +++ b/drivers/scsi/qla2xxx/qla_isr.c > @@ -3952,10 +3952,12 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp) > if (USER_CTRL_IRQ(ha) || !ha->mqiobase) { > /* user wants to control IRQ setting for target mode */ > ret = pci_alloc_irq_vectors(ha->pdev, min_vecs, > - ha->msix_count, PCI_IRQ_MSIX); > + min((u16)ha->msix_count, (u16)num_online_cpus()), > + PCI_IRQ_MSIX); > } else > ret = pci_alloc_irq_vectors_affinity(ha->pdev, min_vecs, > - ha->msix_count, PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, > + min((u16)ha->msix_count, (u16)num_online_cpus()), > + PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, > &desc); > > if (ret < 0) { > -- > 2.19.0.rc0 > Looks good. Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> -- Himanshu Madhani Oracle Linux Engineering
> On Dec 1, 2020, at 2:27 AM, Nilesh Javali <njavali@marvell.com> wrote: > > From: Saurav Kashyap <skashyap@marvell.com> > > NVMe commands can come only after successful addition of rport and nvme > connect, and rport is only registered after FW started bit is set. Remove the > redundant check. > > Signed-off-by: Saurav Kashyap <skashyap@marvell.com> > Signed-off-by: Nilesh Javali <njavali@marvell.com> > --- > drivers/scsi/qla2xxx/qla_nvme.c | 12 ++++-------- > 1 file changed, 4 insertions(+), 8 deletions(-) > > diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c > index b7a1dc24db38..d4159d5a4ffd 100644 > --- a/drivers/scsi/qla2xxx/qla_nvme.c > +++ b/drivers/scsi/qla2xxx/qla_nvme.c > @@ -554,19 +554,15 @@ static int qla_nvme_post_cmd(struct nvme_fc_local_port *lport, > > fcport = qla_rport->fcport; > > - if (!qpair || !fcport) > - return -ENODEV; > - > - if (!qpair->fw_started || fcport->deleted) > + if (unlikely(!qpair || !fcport || fcport->deleted)) > return -EBUSY; > > - vha = fcport->vha; > - > if (!(fcport->nvme_flag & NVME_FLAG_REGISTERED)) > return -ENODEV; > > - if (test_bit(ABORT_ISP_ACTIVE, &vha->dpc_flags) || > - (qpair && !qpair->fw_started) || fcport->deleted) > + vha = fcport->vha; > + > + if (test_bit(ABORT_ISP_ACTIVE, &vha->dpc_flags)) > return -EBUSY; > > /* > -- > 2.19.0.rc0 > Looks Good Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> -- Himanshu Madhani Oracle Linux Engineering
> On Dec 1, 2020, at 2:27 AM, Nilesh Javali <njavali@marvell.com> wrote: > > From: Arun Easi <aeasi@marvell.com> > > Crash stack: > [576544.715489] Unable to handle kernel paging request for data at address 0xd00000000f970000 > [576544.715497] Faulting instruction address: 0xd00000000f880f64 > [576544.715503] Oops: Kernel access of bad area, sig: 11 [#1] > [576544.715506] SMP NR_CPUS=2048 NUMA pSeries > : > [576544.715703] NIP [d00000000f880f64] .qla27xx_fwdt_template_valid+0x94/0x100 [qla2xxx] > [576544.715722] LR [d00000000f7952dc] .qla24xx_load_risc_flash+0x2fc/0x590 [qla2xxx] > [576544.715726] Call Trace: > [576544.715731] [c0000004d0ffb000] [c0000006fe02c350] 0xc0000006fe02c350 (unreliable) > [576544.715750] [c0000004d0ffb080] [d00000000f7952dc] .qla24xx_load_risc_flash+0x2fc/0x590 [qla2xxx] > [576544.715770] [c0000004d0ffb170] [d00000000f7aa034] .qla81xx_load_risc+0x84/0x1a0 [qla2xxx] > [576544.715789] [c0000004d0ffb210] [d00000000f79f7c8] .qla2x00_setup_chip+0xc8/0x910 [qla2xxx] > [576544.715808] [c0000004d0ffb300] [d00000000f7a631c] .qla2x00_initialize_adapter+0x4dc/0xb00 [qla2xxx] > [576544.715826] [c0000004d0ffb3e0] [d00000000f78ce28] .qla2x00_probe_one+0xf08/0x2200 [qla2xxx] > > Fixes: f73cb695d3ec ("[SCSI] qla2xxx: Add support for ISP2071.") > Signed-off-by: Arun Easi <aeasi@marvell.com> > Signed-off-by: Nilesh Javali <njavali@marvell.com> I think this needs stable tag as well. > --- > drivers/scsi/qla2xxx/qla_tmpl.c | 9 +++++---- > drivers/scsi/qla2xxx/qla_tmpl.h | 2 +- > 2 files changed, 6 insertions(+), 5 deletions(-) > > diff --git a/drivers/scsi/qla2xxx/qla_tmpl.c b/drivers/scsi/qla2xxx/qla_tmpl.c > index 84f4416d366f..a6bb1c0e2245 100644 > --- a/drivers/scsi/qla2xxx/qla_tmpl.c > +++ b/drivers/scsi/qla2xxx/qla_tmpl.c > @@ -928,7 +928,8 @@ qla27xx_template_checksum(void *p, ulong size) > static inline int > qla27xx_verify_template_checksum(struct qla27xx_fwdt_template *tmp) > { > - return qla27xx_template_checksum(tmp, tmp->template_size) == 0; > + return qla27xx_template_checksum(tmp, > + le32_to_cpu(tmp->template_size)) == 0; > } > > static inline int > @@ -944,7 +945,7 @@ qla27xx_execute_fwdt_template(struct scsi_qla_host *vha, > ulong len = 0; > > if (qla27xx_fwdt_template_valid(tmp)) { > - len = tmp->template_size; > + len = le32_to_cpu(tmp->template_size); > tmp = memcpy(buf, tmp, len); > ql27xx_edit_template(vha, tmp); > qla27xx_walk_template(vha, tmp, buf, &len); > @@ -960,7 +961,7 @@ qla27xx_fwdt_calculate_dump_size(struct scsi_qla_host *vha, void *p) > ulong len = 0; > > if (qla27xx_fwdt_template_valid(tmp)) { > - len = tmp->template_size; > + len = le32_to_cpu(tmp->template_size); > qla27xx_walk_template(vha, tmp, NULL, &len); > } > > @@ -972,7 +973,7 @@ qla27xx_fwdt_template_size(void *p) > { > struct qla27xx_fwdt_template *tmp = p; > > - return tmp->template_size; > + return le32_to_cpu(tmp->template_size); > } > > int > diff --git a/drivers/scsi/qla2xxx/qla_tmpl.h b/drivers/scsi/qla2xxx/qla_tmpl.h > index c47184db5081..6e0987edfceb 100644 > --- a/drivers/scsi/qla2xxx/qla_tmpl.h > +++ b/drivers/scsi/qla2xxx/qla_tmpl.h > @@ -12,7 +12,7 @@ > struct __packed qla27xx_fwdt_template { > __le32 template_type; > __le32 entry_offset; > - uint32_t template_size; > + __le32 template_size; > uint32_t count; /* borrow field for running/residual count */ > > __le32 entry_count; > -- > 2.19.0.rc0 > Otherwise.. Looks good Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> -- Himanshu Madhani Oracle Linux Engineering
> On Dec 1, 2020, at 2:27 AM, Nilesh Javali <njavali@marvell.com> wrote: > > From: Saurav Kashyap <skashyap@marvell.com> > > Driver unload with IOs causes server to crash. > Return IO with retry if fcport undergoing deletion. > Any call stack or panic signature to share in commit message? > Signed-off-by: Saurav Kashyap <skashyap@marvell.com> > Signed-off-by: Nilesh Javali <njavali@marvell.com> > --- > drivers/scsi/qla2xxx/qla_os.c | 8 ++++---- > 1 file changed, 4 insertions(+), 4 deletions(-) > > diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c > index a75edba2b334..be9d10092dd3 100644 > --- a/drivers/scsi/qla2xxx/qla_os.c > +++ b/drivers/scsi/qla2xxx/qla_os.c > @@ -884,8 +884,8 @@ qla2xxx_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd) > goto qc24_fail_command; > } > > - if (!fcport) { > - cmd->result = DID_NO_CONNECT << 16; > + if (!fcport || fcport->deleted) { > + cmd->result = DID_IMM_RETRY << 16; > goto qc24_fail_command; > } > > @@ -966,8 +966,8 @@ qla2xxx_mqueuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd, > goto qc24_fail_command; > } > > - if (!fcport) { > - cmd->result = DID_NO_CONNECT << 16; > + if (!fcport || fcport->deleted) { > + cmd->result = DID_IMM_RETRY << 16; > goto qc24_fail_command; > } > > -- > 2.19.0.rc0 > Patch itself looks good. Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> -- Himanshu Madhani Oracle Linux Engineering
> On Dec 1, 2020, at 2:27 AM, Nilesh Javali <njavali@marvell.com> wrote: > > Signed-off-by: Nilesh Javali <njavali@marvell.com> > --- > drivers/scsi/qla2xxx/qla_version.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/drivers/scsi/qla2xxx/qla_version.h b/drivers/scsi/qla2xxx/qla_version.h > index c2d4da52f4a9..ccec858875dd 100644 > --- a/drivers/scsi/qla2xxx/qla_version.h > +++ b/drivers/scsi/qla2xxx/qla_version.h > @@ -6,9 +6,9 @@ > /* > * Driver version > */ > -#define QLA2XXX_VERSION "10.02.00.103-k" > +#define QLA2XXX_VERSION "10.02.00.104-k" > > #define QLA_DRIVER_MAJOR_VER 10 > #define QLA_DRIVER_MINOR_VER 2 > #define QLA_DRIVER_PATCH_VER 0 > -#define QLA_DRIVER_BETA_VER 103 > +#define QLA_DRIVER_BETA_VER 104 > -- > 2.19.0.rc0 > Reviewed-by: Himanshu Madhani <himanshu.madhani@oracle.com> -- Himanshu Madhani Oracle Linux Engineering