From patchwork Mon Oct 26 04:18:20 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Chan
X-Patchwork-Id: 298610
From: Michael Chan
To: kuba@kernel.org
Cc: netdev@vger.kernel.org, gospo@broadcom.com
Subject: [PATCH net 4/5] bnxt_en: Check abort error state in bnxt_open_nic().
Date: Mon, 26 Oct 2020 00:18:20 -0400
Message-Id: <1603685901-17917-5-git-send-email-michael.chan@broadcom.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1603685901-17917-1-git-send-email-michael.chan@broadcom.com>
References: <1603685901-17917-1-git-send-email-michael.chan@broadcom.com>
X-Mailing-List: netdev@vger.kernel.org

bnxt_open_nic() is called during configuration changes that require
the NIC to be closed and then opened.  This call is protected by
rtnl_lock.  Firmware reset can be happening at the same time.  Only
critical portions of the entire firmware reset sequence are protected
by the rtnl_lock.
It is possible that bnxt_open_nic() can be called when the firmware
reset sequence is aborting.  In that case, bnxt_open_nic() needs to
check if the ABORT_ERR flag is set and abort if it is.  The
configuration change that resulted in the bnxt_open_nic() call will
fail but the NIC will be brought to a consistent IF_DOWN state.

Without this patch, if bnxt_open_nic() were to continue in this error
state, it may crash like this:

[ 1648.659736] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 1648.659768] IP: [] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.659796] PGD 101e1b3067 PUD 101e1b2067 PMD 0
[ 1648.659813] Oops: 0000 [#1] SMP
[ 1648.659825] Modules linked in: xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT nf_reject_ipv4 tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter sunrpc dell_smbios dell_wmi_descriptor dcdbas amd64_edac_mod edac_mce_amd kvm_amd kvm irqbypass crc32_pclmul ghash_clmulni_intel aesni_intel lrw gf128mul glue_helper ablk_helper vfat cryptd fat pcspkr ipmi_ssif sg k10temp i2c_piix4 wmi ipmi_si ipmi_devintf ipmi_msghandler tpm_crb acpi_power_meter sch_fq_codel ip_tables xfs libcrc32c sd_mod crc_t10dif crct10dif_generic mgag200 i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm ahci drm libahci megaraid_sas crct10dif_pclmul crct10dif_common
[ 1648.660063]  tg3 libata crc32c_intel bnxt_en(OE) drm_panel_orientation_quirks devlink ptp pps_core dm_mirror dm_region_hash dm_log dm_mod fuse
[ 1648.660105] CPU: 13 PID: 3867 Comm: ethtool Kdump: loaded Tainted: G OE ------------ 3.10.0-1152.el7.x86_64 #1
[ 1648.660911] Hardware name: Dell Inc. PowerEdge R7515/0R4CNN, BIOS 1.2.14 01/28/2020
[ 1648.661662] task: ffff94e64cbc9080 ti: ffff94f55df1c000 task.ti: ffff94f55df1c000
[ 1648.662409] RIP: 0010:[] [] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.663171] RSP: 0018:ffff94f55df1fba8 EFLAGS: 00010202
[ 1648.663927] RAX: 0000000000000000 RBX: ffff94e6827e0000 RCX: 0000000000000000
[ 1648.664684] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff94e6827e08c0
[ 1648.665433] RBP: ffff94f55df1fc20 R08: 00000000000001ff R09: 0000000000000008
[ 1648.666184] R10: 0000000000000d53 R11: ffff94f55df1f7ce R12: ffff94e6827e08c0
[ 1648.666940] R13: ffff94e6827e08c0 R14: ffff94e6827e08c0 R15: ffffffffb9115e40
[ 1648.667695] FS: 00007f8aadba5740(0000) GS:ffff94f57eb40000(0000) knlGS:0000000000000000
[ 1648.668447] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1648.669202] CR2: 0000000000000000 CR3: 0000001022772000 CR4: 0000000000340fe0
[ 1648.669966] Call Trace:
[ 1648.670730] [] ? bnxt_need_reserve_rings+0x9d/0x170 [bnxt_en]
[ 1648.671496] [] __bnxt_open_nic+0x8a/0x9a0 [bnxt_en]
[ 1648.672263] [] ? bnxt_close_nic+0x59/0x1b0 [bnxt_en]
[ 1648.673031] [] bnxt_open_nic+0x1b/0x50 [bnxt_en]
[ 1648.673793] [] bnxt_set_ringparam+0x6c/0xa0 [bnxt_en]
[ 1648.674550] [] dev_ethtool+0x1334/0x21a0
[ 1648.675306] [] dev_ioctl+0x1ef/0x5f0
[ 1648.676061] [] sock_do_ioctl+0x4d/0x60
[ 1648.676810] [] sock_ioctl+0x1eb/0x2d0
[ 1648.677548] [] do_vfs_ioctl+0x3a0/0x5b0
[ 1648.678282] [] ? __do_page_fault+0x238/0x500
[ 1648.679016] [] SyS_ioctl+0xa1/0xc0
[ 1648.679745] [] system_call_fastpath+0x25/0x2a
[ 1648.680461] Code: 9e 60 01 00 00 0f 1f 40 00 45 8b 8e 48 01 00 00 31 c9 45 85 c9 0f 8e 73 01 00 00 66 0f 1f 44 00 00 49 8b 86 a8 00 00 00 48 63 d1 <48> 8b 14 d0 48 85 d2 0f 84 46 01 00 00 41 8b 86 44 01 00 00 c7
[ 1648.681986] RIP [] bnxt_alloc_mem+0x50a/0x1140 [bnxt_en]
[ 1648.682724] RSP
[ 1648.683451] CR2: 0000000000000000

Fixes: ec5d31e3c15d ("bnxt_en: Handle firmware reset status during IF_UP.")
Reviewed-by: Vasundhara Volam
Reviewed-by: Pavan Chebbi
Signed-off-by: Michael Chan
---
 drivers/net/ethernet/broadcom/bnxt/bnxt.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index 8012386b4a0f..0165f70dba74 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -9779,7 +9779,10 @@ int bnxt_open_nic(struct bnxt *bp, bool irq_re_init, bool link_re_init)
 {
 	int rc = 0;
 
-	rc = __bnxt_open_nic(bp, irq_re_init, link_re_init);
+	if (test_bit(BNXT_STATE_ABORT_ERR, &bp->state))
+		rc = -EIO;
+	if (!rc)
+		rc = __bnxt_open_nic(bp, irq_re_init, link_re_init);
 	if (rc) {
 		netdev_err(bp->dev, "nic open fail (rc: %x)\n", rc);
 		dev_close(bp->dev);
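
The control flow added to bnxt_open_nic() can be illustrated outside the
kernel with a minimal user-space sketch.  Everything below is a
hypothetical model and not driver code: struct nic, STATE_ABORT_ERR,
open_nic(), do_open() and force_close() are stand-in names, and a plain
bit mask replaces test_bit().  It is meant to show only that, once the
abort flag is set, the open path returns -EIO without reaching the real
open step and the device is left closed.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's state bit; not the real value. */
#define STATE_ABORT_ERR (1UL << 0)

/* Hypothetical, heavily reduced model of the device. */
struct nic {
	unsigned long state;	/* bit mask of state flags */
	bool is_open;		/* true once the open path has completed */
	int open_attempts;	/* how many times do_open() actually ran */
};

/* Stand-in for the real open path (__bnxt_open_nic() in the patch). */
static int do_open(struct nic *n)
{
	n->open_attempts++;
	n->is_open = true;
	return 0;
}

/* Stand-in for dev_close(): leave the device in a consistent down state. */
static void force_close(struct nic *n)
{
	n->is_open = false;
}

/* Mirrors the patched control flow: check the abort flag before opening. */
static int open_nic(struct nic *n)
{
	int rc = 0;

	if (n->state & STATE_ABORT_ERR)
		rc = -EIO;
	if (!rc)
		rc = do_open(n);
	if (rc)
		force_close(n);
	return rc;
}
```

With the flag clear, open_nic() returns 0 and the model is marked open.
With STATE_ABORT_ERR set, it returns -EIO, do_open() is never called,
and the model stays closed.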