[v3,0/8] wfx: implement Remain On Channel

Message ID 20231004172843.195332-1-jerome.pouiller@silabs.com

Message

Jérôme Pouiller Oct. 4, 2023, 5:28 p.m. UTC
Hello,

Apart from the first 3 patches, this series implements Remain On Channel for
WF200 chips. The implementation is a bit twisted (I hijack the scan feature
to implement RoC). However, it has been extensively tested with
DPP/EasyConnect and I have not noticed any issues.

v3:
  - Patches 5 and 6 have been squashed.
  - Reordered patches 5 to 9. It was not so easy, but I guarantee the final
    code is the same and every patch compiles.
v2:
  - Rebased on the latest stable tree

Jérôme Pouiller (8):
  wifi: wfx: fix power_save setting when AP is stopped
  wifi: wfx: relocate wfx_rate_mask_to_hw()
  wifi: wfx: move wfx_skb_*() out of the header file
  wifi: wfx: introduce hif_scan_uniq()
  wifi: wfx: simplify exclusion between scan and Rx filters
  wifi: wfx: scan_lock is global to the device
  wifi: wfx: allow to send frames during ROC
  wifi: wfx: implement wfx_remain_on_channel()

 drivers/net/wireless/silabs/wfx/data_tx.c | 54 ++++++++++++++++---
 drivers/net/wireless/silabs/wfx/data_tx.h | 21 ++------
 drivers/net/wireless/silabs/wfx/hif_tx.c  | 43 +++++++++++++++
 drivers/net/wireless/silabs/wfx/hif_tx.h  |  1 +
 drivers/net/wireless/silabs/wfx/main.c    |  5 ++
 drivers/net/wireless/silabs/wfx/queue.c   | 38 ++++++++++---
 drivers/net/wireless/silabs/wfx/queue.h   |  1 +
 drivers/net/wireless/silabs/wfx/scan.c    | 66 ++++++++++++++++++++++-
 drivers/net/wireless/silabs/wfx/scan.h    |  6 +++
 drivers/net/wireless/silabs/wfx/sta.c     | 41 +++++---------
 drivers/net/wireless/silabs/wfx/sta.h     |  1 -
 drivers/net/wireless/silabs/wfx/wfx.h     |  8 +--
 12 files changed, 218 insertions(+), 67 deletions(-)

Comments

Jérôme Pouiller Nov. 27, 2024, 9:18 a.m. UTC | #1
On Tuesday 26 November 2024 16:54:12 CET Sverdlin, Alexander wrote:
> Thanks for the quick reply Jerome,
> 
> On Tue, 2024-11-26 at 15:45 +0100, Jérôme Pouiller wrote:
> > > > +             for (i = 0; i < num_queues; i++) {
> > > > +                     skb = skb_dequeue(&queues[i]->offchan);
> > >                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > >
> > > Nevertheless, the lockdep splat comes from here, because
> > > wfx_tx_queues_init() has never been called for wvif->id == 2.
> > >
> > > What was your original plan for this to happen?
> > > Do we need an explicit analogue of wfx_add_interface() for vif->id 2 somewhere?
> > >
> > > I still have not come up with a reproducer yet, but you definitely have more
> > > insight into this code, maybe this will ring some bells on your side...
> > >
> > > PS. It's v6.11, even though it's exactly the same splat as in
> > > "staging: wfx: fix potential use before init", but the patch is indeed included.
> >
> > Yes, probably a very similar issue to "staging: wfx: fix potential use
> > before init". I don't believe the issue is related to wvif->id == 2.
> >
> > You have only produced this issue once, that's it?
> >
> > I wonder why this does not happen with queues[i]->normal and
> > queues[i]->cab. Is it because queues[i]->offchan is the first to be
> > checked? Or mutex_is_locked(&wdev->scan_lock) has an impact in the
> > process?
> >
> > In wfx_add_interface(), the list of wvif is protected by conf_lock.
> > However, wfx_tx_queues_get_skb() is not protected by conf_lock. We
> > initialize struct wvif before adding it to the wvif list and we
> > consider that sufficient. However, after reading memory-barriers.txt
> > again, that is probably a wrong assumption.
> 
> I've actually disassembled the stack trace exactly to offchan processing.
> I have no idea why the kernel sends offchan on a non-configured idle interface,
> I still need to come up with a reproducer.
> 
> But as soon as there is an offchan in the sorted list of "queues" (coming
> originally from VIF 0:
> 
> void wfx_tx(struct ieee80211_hw *hw, struct ieee80211_tx_control *control, struct sk_buff *skb)
> {
> ...
>         if (tx_info->control.vif)
>                 wvif = (struct wfx_vif *)tx_info->control.vif->drv_priv;
>         else
>                 wvif = wvif_iterate(wdev, NULL);
>                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> Puts any offchan into offchan not of VIF 2, but of the only VIF 0...

Note that skb_dequeue(&queues[i]->offchan) is called whether or not there is
a frame in the offchan queue. In fact, wfx_tx_queues_get_skb() can be called
even if all the Tx queues are empty (this happens when the wake-up event
comes from the device).

So the reproducer involves wfx_add_interface() and a not-yet-identified
event (that could be an IRQ and a Tx frame) that wakes up the bh workqueue.
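
To make the suspected ordering problem concrete, here is a minimal userspace
sketch. It is not the driver code: struct fake_queue, struct fake_vif,
fake_dequeue(), fake_add_interface() and fake_tx_get() are hypothetical
stand-ins, and C11 atomics stand in for what would presumably be
smp_store_release()/smp_load_acquire() or rcu_assign_pointer()/
rcu_dereference() in the kernel. It models two points from this thread: the
dequeue takes the queue lock even when the queue is empty, and the vif must
be fully initialized before it becomes visible to the Tx path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a queue whose lock must be set up before use. */
struct fake_queue {
	bool lock_initialized;	/* models spin_lock_init() having run */
	int len;		/* number of queued frames */
};

/* Hypothetical stand-in for a vif owning an offchan queue. */
struct fake_vif {
	struct fake_queue offchan;
};

/* Slot read by the Tx path; NULL means no interface has been published. */
static _Atomic(struct fake_vif *) published_vif;

/*
 * Models skb_dequeue(): the lock is taken before emptiness is checked.
 * Returns -1 if the lock was never initialized (the lockdep splat analogue),
 * 0 if the queue is empty, 1 if a frame was dequeued.
 */
static int fake_dequeue(struct fake_queue *q)
{
	if (!q->lock_initialized)
		return -1;
	if (q->len == 0)
		return 0;
	q->len--;
	return 1;
}

/* Initialize first, then publish with release semantics. */
static void fake_add_interface(struct fake_vif *vif)
{
	vif->offchan.len = 0;
	vif->offchan.lock_initialized = true;
	/* Release: a reader that sees the pointer also sees the init above. */
	atomic_store_explicit(&published_vif, vif, memory_order_release);
}

/* Tx path: acquire-load the pointer, then dequeue even if nothing is queued. */
static int fake_tx_get(void)
{
	struct fake_vif *vif =
		atomic_load_explicit(&published_vif, memory_order_acquire);

	if (!vif)
		return 0;
	return fake_dequeue(&vif->offchan);
}
```

With a plain (non-release) store, nothing in the language would forbid the
reader from observing the pointer before the initialization, which is the
kind of window suspected above.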

> I think you are right, this could only be offchan queue of VIF 0.
> But then it's just a race of TX workqueue against
> wfx_remove_interface()/wfx_add_interface() pair (which I see regularly).

We have reached the same conclusion.

> We probably need RCU in the TX path and NetLink lock in the VIF add/remove
> path similar to other network drivers...
> I can try to come up with a patch for this...

I wonder if there is a way to iterate over the vifs using the cfg80211/mac80211
API rather than maintaining a list of vifs in the driver.

[...]