Message ID | 20201028142433.18501-1-kitakar@gmail.com |
---|---|
Series | mwifiex: disable ps_mode by default for stability |

On Wed, Oct 28, 2020 at 11:24:30PM +0900, Tsuchiya Yuto wrote:
> Hello all,
>
> On Microsoft Surface devices (PCIe-88W8897), we are observing stability
> issues when ps_mode (IEEE power_save) is enabled, which eventually cause
> a firmware crash. Especially on 5GHz APs, the connection is completely
> unstable and almost unusable.
>
> I think the most desirable change is to fix the ps_mode itself. But it
> seems to be hard work [1], I'm afraid we have to go this way.
>
> Therefore, the first patch of this series disables the ps_mode by default
> instead of enabling it on driver init. I'm not sure if explicitly
> disabling it is really required or not. I don't have access to the details
> of this chip. Let me know if it's enough to just remove the code that
> enables ps_mode.
>
> The second patch adds a new module parameter named "allow_ps_mode". Since
> other wifi drivers just disable power_save by default by module parameter
> like this, I also added this.
>
> The third patch adds a message when ps_mode will be changed. Useful when
> diagnosing connection issues.
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=109681

Can you attach this to the actual patch as a BugLink: tag?

On Wed, Oct 28, 2020 at 7:04 PM Tsuchiya Yuto <kitakar@gmail.com> wrote:
>
> On Microsoft Surface devices (PCIe-88W8897), the ps_mode causes
> connection unstable, especially with 5GHz APs. Then, it eventually causes
> fw crash.
>
> This commit disables ps_mode by default instead of enabling it.
>
> Required code is extracted from mwifiex_drv_set_power().
>
> Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>

You should read up on WIPHY_FLAG_PS_ON_BY_DEFAULT and
CONFIG_CFG80211_DEFAULT_PS, and set/respect those appropriately (hint:
mwifiex sets WIPHY_FLAG_PS_ON_BY_DEFAULT, and your patch makes this a
lie). Also, this seems like a quirk that you haven't properly worked
out -- if you're working on a quirk framework in your other series,
you should just key into that.

For the record, Chrome OS supports plenty of mwifiex systems with 8897
(SDIO only) and 8997 (PCIe), with PS enabled, and you're hurting
those. Your problem sounds to be exclusively a problem with the PCIe
8897 firmware.

As-is, NAK.

Brian

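[Editorial sketch, not part of the thread.] The review point above is that the default advertised to the wireless core and the state the driver programs at init must agree. The following is a minimal user-space model of that idea in plain C, not kernel code: `MODEL_PS_ON_BY_DEFAULT`, `struct model_dev`, `model_init_ps()` and `model_ps_consistent()` are hypothetical names invented for illustration, and the flag value is not the kernel's.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the wiphy flag named in the review.
 * The bit value is arbitrary and is NOT the kernel's definition. */
#define MODEL_PS_ON_BY_DEFAULT (1u << 0)

/* Hypothetical model of the two places where the PS default lives. */
struct model_dev {
	uint32_t advertised_flags; /* what the driver tells the wireless core */
	bool fw_ps_enabled;        /* what the driver programmed at init */
};

/* Apply one decision to both places so they cannot disagree. */
static void model_init_ps(struct model_dev *dev, bool ps_default)
{
	if (ps_default)
		dev->advertised_flags |= MODEL_PS_ON_BY_DEFAULT;
	else
		dev->advertised_flags &= ~MODEL_PS_ON_BY_DEFAULT;
	dev->fw_ps_enabled = ps_default;
}

/* True when the advertised default matches the programmed state. */
static bool model_ps_consistent(const struct model_dev *dev)
{
	bool advertised = (dev->advertised_flags & MODEL_PS_ON_BY_DEFAULT) != 0;

	return advertised == dev->fw_ps_enabled;
}
```

In this model, a device that keeps the flag set while leaving `fw_ps_enabled` false is the inconsistent case the review objects to; calling `model_init_ps(&dev, false)` clears both together.
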
On Thu, Oct 29, 2020 at 8:29 PM Brian Norris <briannorris@chromium.org> wrote:
> On Wed, Oct 28, 2020 at 7:04 PM Tsuchiya Yuto <kitakar@gmail.com> wrote:

...

> For the record, Chrome OS supports plenty of mwifiex systems with 8897
> (SDIO only) and 8997 (PCIe), with PS enabled, and you're hurting
> those. Your problem sounds to be exclusively a problem with the PCIe
> 8897 firmware.

And this feeling (that it's a FW issue) is what I have. But the problem
here is that Marvell didn't fix and probably won't fix their FW...

Just wondering if Google (and MS in their turn) use different
firmwares to what we have available in Linux.

On Thu, Oct 29, 2020 at 11:37 AM Andy Shevchenko <andy.shevchenko@gmail.com> wrote:
> And this feeling (that it's a FW issue) is what I have. But the problem
> here is that Marvell didn't fix and probably won't fix their FW...

Sure, I wouldn't hold your breath. So some of these tactics (disabling
PS, etc.) may be valid, but you have to do them smartly, acknowledging
that there are other (more stable) firmwares and chips in use for this
same driver.

> Just wondering if Google (and MS in their turn) use different
> firmwares to what we have available in Linux.

No clue about MS. But Chrom{e,ium} OS generally publishes all this
stuff where possible. You can see what we use here:

https://chromium.googlesource.com/chromiumos/third_party/linux-firmware/+/HEAD/mrvl/
https://chromium.googlesource.com/chromiumos/third_party/marvell/+/HEAD/

We try to stay somewhat in sync / parallel with "upstream"
linux-firmware, and strongly encourage vendors to send the same
binaries upstream when they hand them to us, but there are exceptions
and oversights (e.g., old products might have used a different
firmware branch).

Notably, I'll repeat: we (Chrome OS) don't actually support the PCIe
variant of 8897, so the report in question ("PCIe-88W8897") has no
equivalent in a supported Chrome OS system (even if there are binaries
in the links above, we don't use them). I would not be surprised if
there are an enormous number of firmware bugs there, as there were
initially for PCIe-88W8997 (which we do support).

Brian

On Thu, 2020-10-29 at 11:25 -0700, Brian Norris wrote:
> On Wed, Oct 28, 2020 at 7:04 PM Tsuchiya Yuto <kitakar@gmail.com> wrote:
> >
> > On Microsoft Surface devices (PCIe-88W8897), the ps_mode causes
> > connection unstable, especially with 5GHz APs. Then, it eventually causes
> > fw crash.
> >
> > This commit disables ps_mode by default instead of enabling it.
> >
> > Required code is extracted from mwifiex_drv_set_power().
> >
> > Signed-off-by: Tsuchiya Yuto <kitakar@gmail.com>
>
> You should read up on WIPHY_FLAG_PS_ON_BY_DEFAULT and
> CONFIG_CFG80211_DEFAULT_PS, and set/respect those appropriately (hint:
> mwifiex sets WIPHY_FLAG_PS_ON_BY_DEFAULT, and your patch makes this a
> lie). Also, this seems like a quirk that you haven't properly worked
> out -- if you're working on a quirk framework in your other series,
> you should just key into that.

Thanks for the review! I didn't know about the flag, much appreciated.

By setting the flag to false explicitly, indeed userspace doesn't try to
enable power_save now, at least for this short amount of time. I wonder
if I can drop the second patch (adding the module parameter) now. But I
still want to make sure that power_save won't be enabled by userspace
tools by default.

Regarding quirks, I also don't want to break existing users. So, of
course I can try to use the quirk framework if we really can't fix the
firmware.

> For the record, Chrome OS supports plenty of mwifiex systems with 8897
> (SDIO only) and 8997 (PCIe), with PS enabled, and you're hurting
> those. Your problem sounds to be exclusively a problem with the PCIe
> 8897 firmware.

Actually, I already know that some Chromebooks use these mwifiex cards
(but not our PCIe-88W8897) because I personally like chromiumos. I'm
always wondering what the difference is. If the difference is firmware,
our PCIe-88W8897 firmware should really be fixed instead of this stupid
series.

Yes, I'm sorry that I know this series is just a stupid one but I have to
send this anyway because this stability issue has not been fixed for a
long time. I should have added this buglink to every commit as well:

BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681

If the firmware can't be fixed, I'm afraid I have to go this way. It makes
no sense to keep enabling power_save for the affected devices if we know
it's broken.

> As-is, NAK.
>
> Brian

On Fri, Oct 30, 2020 at 1:04 AM Tsuchiya Yuto <kitakar@gmail.com> wrote:
> On Thu, 2020-10-29 at 11:25 -0700, Brian Norris wrote:
> > For the record, Chrome OS supports plenty of mwifiex systems with 8897
> > (SDIO only) and 8997 (PCIe), with PS enabled, and you're hurting
> > those. Your problem sounds to be exclusively a problem with the PCIe
> > 8897 firmware.
>
> Actually, I already know that some Chromebooks use these mwifiex cards
> (but not our PCIe-88W8897) because I personally like chromiumos. I'm
> always wondering what the difference is. If the difference is firmware,
> our PCIe-88W8897 firmware should really be fixed instead of this stupid
> series.

PCIe is a very different beast. (For one, it uses DMA and
memory-mapped registers, where SDIO has neither.) It was a very
difficult slog to get PCIe/8997 working reliably for the few
Chromebooks that shipped it, and lots of that work is in firmware. I
would not be surprised if the PCIe-related changes Marvell made for
8997 never fed back into their PCIe-8897 firmware. Or maybe they only
ever launched PCIe-8897 for Windows, and the Windows driver included
workarounds that were never published to their Linux driver. But now
I'm just speculating.

> Yes, I'm sorry that I know this series is just a stupid one but I have to
> send this anyway because this stability issue has not been fixed for a
> long time. I should have added this buglink to every commit as well:
>
> BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=109681
>
> If the firmware can't be fixed, I'm afraid I have to go this way. It makes
> no sense to keep enabling power_save for the affected devices if we know
> it's broken.

Condolences and sympathy, seriously. You likely have little chance of
getting the firmware fixed, so without new information (e.g., other
workarounds?), this is probably the right way to go.

Brian

Hello! Please CC me in future for mwifiex discussion :-)

On Wednesday 28 October 2020 23:24:30 Tsuchiya Yuto wrote:
> Hello all,
>
> On Microsoft Surface devices (PCIe-88W8897), we are observing stability
> issues when ps_mode (IEEE power_save) is enabled, which eventually cause
> a firmware crash. Especially on 5GHz APs, the connection is completely
> unstable and almost unusable.
>
> I think the most desirable change is to fix the ps_mode itself. But it
> seems to be hard work [1], I'm afraid we have to go this way.
>
> Therefore, the first patch of this series disables the ps_mode by default
> instead of enabling it on driver init. I'm not sure if explicitly
> disabling it is really required or not. I don't have access to the details
> of this chip. Let me know if it's enough to just remove the code that
> enables ps_mode.
>
> The second patch adds a new module parameter named "allow_ps_mode". Since
> other wifi drivers just disable power_save by default by module parameter
> like this, I also added this.
>
> The third patch adds a message when ps_mode will be changed. Useful when
> diagnosing connection issues.

There are more issues with the power save API and its implementation in
mwifiex. See my email for more details:

https://lore.kernel.org/linux-wireless/20200609111544.v7u5ort3yk4s7coy@pali/T/#u

These patches would just break the power save API and the status
reported to userspace even more, due to the WIPHY_FLAG_PS_ON_BY_DEFAULT
and CONFIG_CFG80211_DEFAULT_PS options.

I would suggest first fixing the issues mentioned in my email and then
providing a way to blacklist or whitelist the power save feature
depending on firmware or card/chip version.
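
[Editorial sketch, not part of the thread.] The closing suggestion, choosing the power save default per card/chip instead of for every mwifiex device, could look roughly like the lookup below. It is a self-contained user-space model under stated assumptions, not driver code: `enum model_bus`, `struct model_ps_quirk`, `model_ps_off_table` and `model_ps_default()` are hypothetical names, and the single table entry only reflects the PCIe-88W8897 combination reported in this thread. Matching on firmware version, also mentioned above, is left out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bus identifiers for the model. */
enum model_bus { MODEL_BUS_SDIO, MODEL_BUS_PCIE, MODEL_BUS_USB };

/* One entry names a bus/chip combination where PS should default to off. */
struct model_ps_quirk {
	enum model_bus bus;
	unsigned int chip; /* e.g. 8897 for 88W8897 */
};

/* Assumption: only the combination reported in the thread is listed. */
static const struct model_ps_quirk model_ps_off_table[] = {
	{ MODEL_BUS_PCIE, 8897 },
};

/* Return the default PS state: on unless the device matches a quirk. */
static bool model_ps_default(enum model_bus bus, unsigned int chip)
{
	size_t n = sizeof(model_ps_off_table) / sizeof(model_ps_off_table[0]);

	for (size_t i = 0; i < n; i++) {
		if (model_ps_off_table[i].bus == bus &&
		    model_ps_off_table[i].chip == chip)
			return false;
	}
	return true;
}
```

With this shape, SDIO 8897 and PCIe 8997 keep power save on by default, which is the behaviour the review asked not to regress, while PCIe 8897 alone defaults to off.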