mbox series

[RFC,00/13] Convert mac80211 to TXQs only

Message ID 20250127162625.20747-1-Alexander@wetzel-home.de
Headers show
Series Convert mac80211 to TXQs only | expand

Message

Alexander Wetzel Jan. 27, 2025, 4:26 p.m. UTC
This converts mac80211 to only use TXQs internally, moves TX into
a kthread and picks up some easy optimizations on the way.

The patches should all have merge-quality and are working both with
iwlmvm (as station) and with the hostapd tests without any known issues.
(Short of not yet including the PS and filtered frame migration to TXQs.)

The first three patches are mostly unrelated for the TXQ move.
When these are ok for you I could submit them independently, cutting
them off here.

Here a short overview of the patches in the series:
  wifi: mac80211: Fix virtual monitor interface creation
	This basically covers questions I have about the monitor handling,
	just posed as a patch. (It has it's own comment section).

  wifi: mac80211: Fix debugfs handling for virtual monitor
	Found during QA testing of this series. Otherwise unrelated.
	
  wifi: mac80211: Cleanup sta TXQs on flush
	Another thing I found during QA while working on an unrelated
	regression I caused.

  wifi: mac80211: Always provide the MMPDU TXQ
	When only using TXQs for TX we can't have this queue optionally.
	We probably want to discuss how to make that mandatory for all
	drivers with native TXQ support. But this works, allowing to
	soften the probably desired updates to (some) TXQ drivers.

  wifi: mac80211: Convert vif->txq to an array
	We need some more TXQs for the patch below. STA TXQs are already
	in an array, so just put the VIF TXQ into one, too.

  wifi: mac80211: Add new TX queues to replace legacy TX
	This here starts the "core" of the patch series.
	It's creating all the missing TXQs and updates the support
	function for them. It also directly switches traffic to them,
	when possible. (Only offchannel is sticking to legacy TX here.)

  wifi: mac80211: Stop using legacy TX path
	This is dropping the legacy TX support and moves offchannel TX to
	the new alternate TXQ path: So far named TXQ_NOQUEUE.
	With that mac80211 has two TX paths both using TXQ:
	 - The existing one, which uses the TXQ for queuing and
	 - TXQ_NOQUEUE. Which just puts frames into a TXQ and
	   immediately sends out the frame by also calling drv_tx() for
	   it. There never can be more than one frame in any of these
	   TXQs. They never see a wake_tx_queue call by the driver or
	   mac80211.

  wifi: mac80211: Call ieee80211_tx_h_select_key only once
	This is just a simple drive-by optimization. Not needed.

  wifi: mac80211: Rename IEEE80211_TX_INTFL_OFFCHAN_TX_OK
	Also functionality irrelevant. Just seems to be a good way to use
	that as the official selector for the TXQ_NOQUEUE path and
	represent that in the name.

  wifi: mac80211: Simplify AMPDU handling
	Also a kind of drive-by optimization. Uses TXQs to buffer frames
	when AMPDU is started/stopped and even gets rid
	IEEE80211_QUEUE_STOP_REASON_AGGREGATION.

  wifi: mac80211: Migrate TX to kthread
	Moves all TX operation except TXQ_NOQUEUE to a new kthread.
	This hooks into the existing txq scheduling and uses local->active_txqs
	to determine which TXQs need to run. We may also consider 
	forcing all drivers to use ieee80211_txq_schedule_start() to get
	rid of the code figuring that out per run...

  wifi: mac80211: Drop wake_txqs_tasklet
	Another drive-by cleanup/optimization which became possible due
	to the kthread patch above.

  wifi: mac80211: Cleanup *ieee80211_wake_txq* naming
	And finally a patch just renaming a few functions. Not sure if
	that avoids or adds confusion... 

As an outline:
When there are no fundamental concerns of the mayor changes in this
series I would try again to pick apart the PS buffering and filtered
frame mess. Which probably will have to be in one patch. Last time I tried
that without kthread it was not possible to rip it out independently.

Comments

Johannes Berg Jan. 29, 2025, 7:35 a.m. UTC | #1
On Mon, 2025-01-27 at 17:26 +0100, Alexander Wetzel wrote:
> Virtual monitor interfaces should only be created when all of the
> following conditions are fulfilled:
>  - All remaining enabled interfaces are monitor interface(s),
>  - %IEEE80211_HW_NO_VIRTUAL_MONITOR isn't set by the driver and
>  - %MONITOR_FLAG_ACTIVE and %MONITOR_FLAG_COOK_FRAMES isn't set on any
>    of the enabled (monitor) interfaces.
> 
> ieee80211_add_virtual_monitor() will then setup the virtual monitor
> interface for local->monitor_sdata and - when
> %IEEE80211_HW_WANT_MONITOR_VIF is set - inform the driver about the
> virtual monitor interface.
> 
> But assuming %IEEE80211_HW_NO_VIRTUAL_MONITOR is not set, the current
> code still creates a virtual monitor interface when:
> 
>  1) We have one non-monitor and one monitor interface with
>     %MONITOR_FLAG_ACTIVE enabled and then delete the non-monitor
>     interface.

That sounds like a bug, indeed. We shouldn't count monitors with ACTIVE
in local->monitors at all, I think.

>  2) We only have monitor interfaces enabled on resume while at least one
>     has %MONITOR_FLAG_ACTIVE set.

Which would also address this, I'd guess?

> Fix the logic to follow the above stated rules and fixed/updated related
> checks not following the logic.
> 
> Signed-off-by: Alexander Wetzel <Alexander@wetzel-home.de>
> ---
> I have problems understanding the current logic of how monitor
> interfaces are handled. For me it looks like we use local->monitors
> sometimes to count all monitor interfaces - except the ones using
> %MONITOR_FLAG_COOK_FRAMES - and at other times to count interfaces
> linked to the virtual monitor interface.
> 
> This patch tries to fix the logic according to my understanding.
> 
> Noteworthy here is the wording in 0d9c2beed116 ("wifi: mac80211:
> fix monitor channel with chanctx emulation"):
>      ...
>      Fix this by always allocating the virtual monitor sdata, and
>      simply not telling the driver about it unless it wanted to.
>      This way, we have an interface/sdata to bind the chanctx to,
>      and the emulation can work correctly.
> 
> But the commit with this statement is not changing the condition to call
> ieee80211_add_virtual_monitor(), contradicting these sentences.
> 
> From ieee80211_do_open():
>         case NL80211_IFTYPE_MONITOR:
>                 if (sdata->u.mntr.flags & MONITOR_FLAG_COOK_FRAMES) {
>                         local->cooked_mntrs++;
>                         break;
>                 }
> 
>                 if (sdata->u.mntr.flags & MONITOR_FLAG_ACTIVE) {
>                         res = drv_add_interface(local, sdata);
>                         if (res)
>                                 goto err_stop;
>                 } else if (local->monitors == 0 && local->open_count == 0) {
>                         res = ieee80211_add_virtual_monitor(local);


The "always" there perhaps should've said "whenever we have monitor
active", obviously we don't want a virtual monitor when there's no
monitor at all.

The "cooked" is completely different, and not really a monitor, it's
indeed used for ancient versions of hostapd that don't know nl80211 yet
and need to receive all frames not otherwise handled by the kernel via
monitor.

We can probably justify removing that, it was only used by hostapd
versions between April 2009 and December 2011, from
0915d02c3c08 ("wpa_supplicant AP: Add management frame RX for nl80211")
to
a11241fa1149 ("nl80211: Use nl80211 for mgmt TX/RX in AP mode")


The "active" monitor is something we still want, and indeed it doesn't
work for iwlwifi (which simply doesn't support it), but that should
indeed always be added to the driver - drivers that support it need the
vif so they have the MAC address to send ACKs for (to program it into
the device.)

> +++ b/net/mac80211/cfg.c
> @@ -4366,10 +4366,11 @@ static int ieee80211_cfg_get_channel(struct wiphy *wiphy,
>  	if (chanctx_conf) {
>  		*chandef = link->conf->chanreq.oper;
>  		ret = 0;
> -	} else if (!ieee80211_hw_check(&local->hw, NO_VIRTUAL_MONITOR) &&
> -		   local->open_count > 0 &&
> -		   local->open_count == local->monitors &&
> -		   sdata->vif.type == NL80211_IFTYPE_MONITOR) {
> +	} else if (local->virt_monitors &&
> +		   local->open_count == local->virt_monitors &&
> +		   sdata->vif.type == NL80211_IFTYPE_MONITOR &&
> +		   !(sdata->u.mntr.flags & (MONITOR_FLAG_ACTIVE |
> +					    MONITOR_FLAG_COOK_FRAMES))) {
>  		*chandef = local->monitor_chanreq.oper;
>  		ret = 0;

Not sure this is really all that relevant - if you have (only) active
monitors then it should still be correct, and having only cooked
monitors is pointless, not even sure it's supported at all. So I'm not
sure I see a need to change this?

Overall this seems needlessly complex.

I'd go for two changes:
 * remove cooked monitor support entirely, unused since Dec 2011.
 * not count active monitors in local->monitors, which should address
   the other issues?

johannes
Johannes Berg Jan. 29, 2025, 8:21 a.m. UTC | #2
Hi,
Nice! I haven't gotten through it all yet, just some comments here also:
> 
> Here a short overview of the patches in the series:
>   wifi: mac80211: Fix virtual monitor interface creation
> 	This basically covers questions I have about the monitor
handling,
> 	just posed as a patch. (It has it's own comment section).

Right, see my comments there.

>   wifi: mac80211: Fix debugfs handling for virtual monitor
> 	Found during QA testing of this series. Otherwise unrelated.

Haven't really figured this one out, I should ask Benjamin to take a
look.

>   wifi: mac80211: Cleanup sta TXQs on flush
> 	Another thing I found during QA while working on an unrelated
> 	regression I caused.

That looks fine, I think.

>   wifi: mac80211: Always provide the MMPDU TXQ
> 	When only using TXQs for TX we can't have this queue optionally.
> 	We probably want to discuss how to make that mandatory for all
> 	drivers with native TXQ support. But this works, allowing to
> 	soften the probably desired updates to (some) TXQ drivers.

Yeah not sure how we could deal with the drivers - after all none other
than iwlwifi opted in to this, I think, even through it exists for quite
a while now.

>   wifi: mac80211: Convert vif->txq to an array
> 	We need some more TXQs for the patch below. STA TXQs are already
> 	in an array, so just put the VIF TXQ into one, too.

Sure, looks mostly fine.

>   wifi: mac80211: Add new TX queues to replace legacy TX
> 	This here starts the "core" of the patch series.
> 	It's creating all the missing TXQs and updates the support
> 	function for them. It also directly switches traffic to them,
> 	when possible. (Only offchannel is sticking to legacy TX here.)

I only got to this one for the individual reviews.

>   wifi: mac80211: Stop using legacy TX path
> 	This is dropping the legacy TX support and moves offchannel TX to
> 	the new alternate TXQ path: So far named TXQ_NOQUEUE.
> 	With that mac80211 has two TX paths both using TXQ:
> 	 - The existing one, which uses the TXQ for queuing and
> 	 - TXQ_NOQUEUE. Which just puts frames into a TXQ and
> 	   immediately sends out the frame by also calling drv_tx() for
> 	   it. There never can be more than one frame in any of these
> 	   TXQs. They never see a wake_tx_queue call by the driver or
> 	   mac80211.

Just reading the description I wondered what we gain here, but of course
all the internal code path differences are removed, that's nice :-)

>   wifi: mac80211: Call ieee80211_tx_h_select_key only once
> 	This is just a simple drive-by optimization. Not needed.

Also I guess independent?

>   wifi: mac80211: Rename IEEE80211_TX_INTFL_OFFCHAN_TX_OK
> 	Also functionality irrelevant. Just seems to be a good way to use
> 	that as the official selector for the TXQ_NOQUEUE path and
> 	represent that in the name.

makes sense

>   wifi: mac80211: Simplify AMPDU handling
> 	Also a kind of drive-by optimization. Uses TXQs to buffer frames
> 	when AMPDU is started/stopped and even gets rid
> 	IEEE80211_QUEUE_STOP_REASON_AGGREGATION.

That's nice though, also a really nice cleanup. Not sure I'd necessarily
call it an optimisation.

>   wifi: mac80211: Migrate TX to kthread
> 	Moves all TX operation except TXQ_NOQUEUE to a new kthread.
> 	This hooks into the existing txq scheduling and uses local->active_txqs
> 	to determine which TXQs need to run. We may also consider 
> 	forcing all drivers to use ieee80211_txq_schedule_start() to get
> 	rid of the code figuring that out per run...

Forcing drivers is always hard ... will need to take a closer look at
this.

>   wifi: mac80211: Drop wake_txqs_tasklet
> 	Another drive-by cleanup/optimization which became possible due
> 	to the kthread patch above.

Feels a bit like that should just be part of the kthread patch? Why keep
a useless tasklet around after that?

>   wifi: mac80211: Cleanup *ieee80211_wake_txq* naming
> 	And finally a patch just renaming a few functions. Not sure if
> 	that avoids or adds confusion... 

:)

> As an outline:
> When there are no fundamental concerns of the mayor changes in this
> series I would try again to pick apart the PS buffering and filtered
> frame mess. Which probably will have to be in one patch. Last time I tried
> that without kthread it was not possible to rip it out independently.

I don't see _fundamental_ concerns right now, but I'd probably still
hold off on that for a while, perhaps even put that into a separate
kernel release just in case?

> From a performance point of view this series seems to be ok...

:)

> I did run some quick tests with my "normal" home network close to an
> stil unpatched AP: 60s long tests with iperf3 using tcp got between
> 484-524 MBit/s while the patched kernel was between 471-521MBit/s using
> SMP. Uniprocessor performance is better in both cases, 484-530MBit/s for
> the original and 563-676 MBIT/s for the patched variant.

Interesting.

Thanks for doing this! I need to review it much more carefully, but it
looks really nice as far as I've seen.

johannes
Jeff Johnson Jan. 29, 2025, 10:15 p.m. UTC | #3
On 1/27/2025 8:26 AM, Alexander Wetzel wrote:
> This converts mac80211 to only use TXQs internally, moves TX into
> a kthread and picks up some easy optimizations on the way.

It would be awesome if the cover letter actually described why these changes
are being proposed...
Johannes Berg Jan. 30, 2025, 8 a.m. UTC | #4
On Wed, 2025-01-29 at 14:15 -0800, Jeff Johnson wrote:
> On 1/27/2025 8:26 AM, Alexander Wetzel wrote:
> > This converts mac80211 to only use TXQs internally, moves TX into
> > a kthread and picks up some easy optimizations on the way.
> 
> It would be awesome if the cover letter actually described why these changes
> are being proposed...

:)

I guess we've been discussing this for years, on and off, and it's
mostly because of the code cleanups this enables. Some are here, but as
Alexanders says it'll enable also more cleanups on PS buffering etc.

johannes