[v2,0/4] wifi: ath12k: fix dest ring-buffer corruption

Message ID 20250604144509.28374-1-johan+linaro@kernel.org

Message

Johan Hovold June 4, 2025, 2:45 p.m. UTC
As a follow up to commit:

	b67d2cf14ea ("wifi: ath12k: fix ring-buffer corruption")

add the remaining missing memory barriers to make sure that destination
ring descriptors are read after the head pointers to avoid using stale
data on weakly ordered architectures like aarch64.

Also switch back to plain accesses for the descriptor fields which is
sufficient after the memory barrier.

New in v2 are two patches that add the missing barriers also for source
rings and when updating the tail pointer for destination rings.

To avoid leaking ring details from the "hal" (lmac or non-lmac), the
barriers are added to the ath12k_hal_srng_access_end() helper. For
symmetry I therefore also moved the dest ring barriers into
ath12k_hal_srng_access_begin() and made the barrier conditional.

[ Due to this change I did not add Miaoqing's reviewed-by tag. ]

Johan


Changes in v2:
 - add tested-on tags to plain access patch
 - move destination barriers into begin helper
 - fix source ring corruption (new patch)
 - fix dest ring corruption when ring is full (new patch)


Johan Hovold (4):
  wifi: ath12k: fix dest ring-buffer corruption
  wifi: ath12k: use plain access for descriptor length
  wifi: ath12k: fix source ring-buffer corruption
  wifi: ath12k: fix dest ring-buffer corruption when ring is full

 drivers/net/wireless/ath/ath12k/ce.c  |  3 --
 drivers/net/wireless/ath/ath12k/hal.c | 40 ++++++++++++++++++++++-----
 2 files changed, 33 insertions(+), 10 deletions(-)
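
For readers less familiar with the ordering problem described in the cover
letter, the following is a minimal userspace sketch of it in C11. It is not
the ath12k code: struct demo_dst_ring and the demo_* helpers are hypothetical,
demo_produce() stands in for the device, and
atomic_thread_fence(memory_order_acquire) is used only as a rough userspace
analogue of dma_rmb().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_RING_SIZE 8u	/* hypothetical size, for illustration only */

struct demo_dst_ring {
	uint32_t desc[DEMO_RING_SIZE];	/* descriptors written by the producer */
	_Atomic uint32_t hp;		/* head pointer published by the producer */
	uint32_t cached_hp;		/* consumer's snapshot of hp */
	uint32_t tp;			/* consumer's tail pointer */
};

/* Producer (the device in a real driver): write the descriptor first,
 * then publish the new head pointer. No ring-full check in this sketch.
 */
void demo_produce(struct demo_dst_ring *r, uint32_t val)
{
	uint32_t hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	r->desc[hp % DEMO_RING_SIZE] = val;
	atomic_store_explicit(&r->hp, hp + 1, memory_order_release);
}

/* Consumer: snapshot the head pointer, then make sure that descriptors
 * are read after the head pointer (stand-in for dma_rmb()).
 */
void demo_dst_access_begin(struct demo_dst_ring *r)
{
	r->cached_hp = atomic_load_explicit(&r->hp, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
}

/* Returns 1 and stores the next descriptor in *val, or 0 if the ring is
 * empty according to the cached head pointer. Plain descriptor reads are
 * sufficient here because of the fence in demo_dst_access_begin().
 */
int demo_dst_get_next(struct demo_dst_ring *r, uint32_t *val)
{
	assert(val != NULL);

	if (r->tp == r->cached_hp)
		return 0;

	*val = r->desc[r->tp % DEMO_RING_SIZE];
	r->tp++;
	return 1;
}
```

Without the fence, a weakly ordered CPU would be allowed to satisfy the
descriptor load before the head pointer load, which is the stale-data case
the series is about.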

Comments

Baochen Qiang June 5, 2025, 8:41 a.m. UTC | #1
On 6/4/2025 10:45 PM, Johan Hovold wrote:
> Add the missing memory barrier to make sure that destination ring
> descriptors are read after the head pointers to avoid using stale data
> on weakly ordered architectures like aarch64.
> 
> The barrier is added to the ath12k_hal_srng_access_begin() helper for
> symmetry with follow-on fixes for source ring buffer corruption which
> will add barriers to ath12k_hal_srng_access_end().
> 
> Note that this may fix the empty descriptor issue recently worked around
> by commit 51ad34a47e9f ("wifi: ath12k: Add drop descriptor handling for
> monitor ring").

Why? I would expect drunk cookies are valid in case of HAL_MON_DEST_INFO0_EMPTY_DESC,
rather than anything caused by reordering.

> 
> Tested-on: WCN7850 hw2.0 WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3
> 
> Fixes: d889913205cf ("wifi: ath12k: driver for Qualcomm Wi-Fi 7 devices")
> Cc: stable@vger.kernel.org	# 6.3
> Signed-off-by: Johan Hovold <johan+linaro@kernel.org>
> ---
>  drivers/net/wireless/ath/ath12k/ce.c  |  3 ---
>  drivers/net/wireless/ath/ath12k/hal.c | 17 ++++++++++++++---
>  2 files changed, 14 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/wireless/ath/ath12k/ce.c b/drivers/net/wireless/ath/ath12k/ce.c
> index 740586fe49d1..b66d23d6b2bd 100644
> --- a/drivers/net/wireless/ath/ath12k/ce.c
> +++ b/drivers/net/wireless/ath/ath12k/ce.c
> @@ -343,9 +343,6 @@ static int ath12k_ce_completed_recv_next(struct ath12k_ce_pipe *pipe,
>  		goto err;
>  	}
>  
> -	/* Make sure descriptor is read after the head pointer. */
> -	dma_rmb();
> -
>  	*nbytes = ath12k_hal_ce_dst_status_get_length(desc);
>  
>  	*skb = pipe->dest_ring->skb[sw_index];
> diff --git a/drivers/net/wireless/ath/ath12k/hal.c b/drivers/net/wireless/ath/ath12k/hal.c
> index 91d5126ca149..9eea13ed5565 100644
> --- a/drivers/net/wireless/ath/ath12k/hal.c
> +++ b/drivers/net/wireless/ath/ath12k/hal.c
> @@ -2126,13 +2126,24 @@ void *ath12k_hal_srng_src_get_next_reaped(struct ath12k_base *ab,
>  
>  void ath12k_hal_srng_access_begin(struct ath12k_base *ab, struct hal_srng *srng)
>  {
> +	u32 hp;
> +
>  	lockdep_assert_held(&srng->lock);
>  
> -	if (srng->ring_dir == HAL_SRNG_DIR_SRC)
> +	if (srng->ring_dir == HAL_SRNG_DIR_SRC) {
>  		srng->u.src_ring.cached_tp =
>  			*(volatile u32 *)srng->u.src_ring.tp_addr;
> -	else
> -		srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
> +	} else {
> +		hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
> +
> +		if (hp != srng->u.dst_ring.cached_hp) {

This consumes additional CPU cycles in hot path, which is a concern to me.

Based on that, I prefer the v1 implementation.

> +			srng->u.dst_ring.cached_hp = hp;
> +			/* Make sure descriptor is read after the head
> +			 * pointer.
> +			 */
> +			dma_rmb();
> +		}
> +	}
>  }
>  
>  /* Update cached ring head/tail pointers to HW. ath12k_hal_srng_access_begin()

Baochen Qiang June 5, 2025, 8:51 a.m. UTC | #2
On 6/5/2025 4:44 PM, Johan Hovold wrote:
> On Thu, Jun 05, 2025 at 04:37:13PM +0800, Baochen Qiang wrote:
>> On 6/4/2025 10:45 PM, Johan Hovold wrote:
>>> As a follow up to commit:
>>>
>>> 	b67d2cf14ea ("wifi: ath12k: fix ring-buffer corruption")
>>>
>>> add the remaining missing memory barriers to make sure that destination
>>> ring descriptors are read after the head pointers to avoid using stale
>>> data on weakly ordered architectures like aarch64.
>>>
>>> Also switch back to plain accesses for the descriptor fields which is
>>> sufficient after the memory barrier.
>>>
>>> New in v2 are two patches that add the missing barriers also for source
>>> rings and when updating the tail pointer for destination rings.
>>>
>>> To avoid leaking ring details from the "hal" (lmac or non-lmac), the
>>> barriers are added to the ath12k_hal_srng_access_end() helper. For
>>
>> Could you elaborate? what do you mean by "leaking ring details from the 'hal'"?
> 
> The type of barrier needed depends on the type of the ring. If we add
> the barrier directly in the caller, the caller would need to know what
> kind of ring (lmac or non-lmac) it is operating on, something which is
> currently abstracted away in the hal helpers.
> 

Thanks, I get your point. I can see the difference in patch [3/4].
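
The abstraction argument above can be illustrated with a small userspace
sketch. Everything in it is hypothetical (struct demo_src_ring, the
shared_mem_hp flag, demo_reg_write()), and it rests on an assumption that is
not spelled out in this thread: that one ring type publishes its head pointer
through shared memory and therefore needs an explicit barrier, while the other
goes through a register write accessor that already orders earlier memory
writes. The fences counter exists only to make the difference observable.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_src_ring {
	bool shared_mem_hp;		/* hypothetical: models one ring type */
	uint32_t hp;			/* software head pointer */
	_Atomic uint32_t published_hp;	/* value visible to the consumer */
	unsigned int fences;		/* explicit fences issued (illustration) */
};

/* Stand-in for a register write accessor that is assumed to order prior
 * memory writes on its own; modelled here as a release store.
 */
void demo_reg_write(struct demo_src_ring *r, uint32_t val)
{
	atomic_store_explicit(&r->published_hp, val, memory_order_release);
}

/* The helper hides the ring type from the caller and picks the ordering
 * primitive itself.
 */
void demo_src_access_end(struct demo_src_ring *r)
{
	assert(r != NULL);

	if (r->shared_mem_hp) {
		/* Make sure descriptors are written before the head
		 * pointer (stand-in for an explicit write barrier).
		 */
		atomic_thread_fence(memory_order_release);
		r->fences++;
		atomic_store_explicit(&r->published_hp, r->hp,
				      memory_order_relaxed);
	} else {
		demo_reg_write(r, r->hp);
	}
}
```

A caller only ever calls demo_src_access_end(); if the barrier were placed in
the caller instead, the caller would have to test shared_mem_hp itself.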

>>> symmetry I therefore moved also the dest ring barriers into
>>> ath12k_hal_srng_access_begin() and made the barrier conditional.
> 
> Johan

Johan Hovold June 5, 2025, 10 a.m. UTC | #3
On Thu, Jun 05, 2025 at 04:41:32PM +0800, Baochen Qiang wrote:
> On 6/4/2025 10:45 PM, Johan Hovold wrote:
> > Add the missing memory barrier to make sure that destination ring
> > descriptors are read after the head pointers to avoid using stale data
> > on weakly ordered architectures like aarch64.
> > 
> > The barrier is added to the ath12k_hal_srng_access_begin() helper for
> > symmetry with follow-on fixes for source ring buffer corruption which
> > will add barriers to ath12k_hal_srng_access_end().
> > 
> > Note that this may fix the empty descriptor issue recently worked around
> > by commit 51ad34a47e9f ("wifi: ath12k: Add drop descriptor handling for
> > monitor ring").
> 
> Why? I would expect drunk cookies are valid in case of HAL_MON_DEST_INFO0_EMPTY_DESC,
> rather than anything caused by reordering.

Based on a quick look it seemed like this could possibly fall in the
same category as some of the other workarounds I've spotted while
looking into these ordering issues (e.g. f9fff67d2d7c ("wifi: ath11k:
Fix SKB corruption in REO destination ring")).

If you say this one is clearly unrelated, I'll drop the comment.

> > @@ -343,9 +343,6 @@ static int ath12k_ce_completed_recv_next(struct ath12k_ce_pipe *pipe,
> >  		goto err;
> >  	}
> >  
> > -	/* Make sure descriptor is read after the head pointer. */
> > -	dma_rmb();
> > -
> >  	*nbytes = ath12k_hal_ce_dst_status_get_length(desc);
> >  
> >  	*skb = pipe->dest_ring->skb[sw_index];
> > diff --git a/drivers/net/wireless/ath/ath12k/hal.c b/drivers/net/wireless/ath/ath12k/hal.c
> > index 91d5126ca149..9eea13ed5565 100644
> > --- a/drivers/net/wireless/ath/ath12k/hal.c
> > +++ b/drivers/net/wireless/ath/ath12k/hal.c
> > @@ -2126,13 +2126,24 @@ void *ath12k_hal_srng_src_get_next_reaped(struct ath12k_base *ab,
> >  
> >  void ath12k_hal_srng_access_begin(struct ath12k_base *ab, struct hal_srng *srng)
> >  {
> > +	u32 hp;
> > +
> >  	lockdep_assert_held(&srng->lock);
> >  
> > -	if (srng->ring_dir == HAL_SRNG_DIR_SRC)
> > +	if (srng->ring_dir == HAL_SRNG_DIR_SRC) {
> >  		srng->u.src_ring.cached_tp =
> >  			*(volatile u32 *)srng->u.src_ring.tp_addr;
> > -	else
> > -		srng->u.dst_ring.cached_hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
> > +	} else {
> > +		hp = READ_ONCE(*srng->u.dst_ring.hp_addr);
> > +
> > +		if (hp != srng->u.dst_ring.cached_hp) {
> 
> This consumes additional CPU cycles in hot path, which is a concern to me.
> 
> Based on that, I prefer the v1 implementation.

The conditional avoids a memory barrier in case the ring is empty, so
for all callers but ath12k_ce_completed_recv_next() it's an improvement
over v1 in that sense.

I could make the barrier unconditional, which will only add one barrier
to ath12k_ce_completed_recv_next() in case the ring is empty compared to
v1. Perhaps that's a good compromise if you worry about the extra
comparison?

I very much want to avoid having both explicit barriers in the caller
and barriers in the hal end() helper. I think it should be either or.
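
To make the trade-off between the two variants concrete, here is a rough
userspace model that just counts barriers. The names (struct demo_ring_state,
demo_begin_conditional(), demo_begin_unconditional()) are hypothetical, and
atomic_thread_fence(memory_order_acquire) again only approximates dma_rmb();
it says nothing about the actual cycle cost on real hardware.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

struct demo_ring_state {
	_Atomic uint32_t hp;		/* head pointer written by the producer */
	uint32_t cached_hp;		/* consumer's snapshot */
	unsigned int fences;		/* barriers issued (illustration only) */
};

/* v2-style: one extra comparison, but no barrier when the head pointer
 * has not moved since the last call.
 */
void demo_begin_conditional(struct demo_ring_state *r)
{
	uint32_t hp;

	assert(r != NULL);
	hp = atomic_load_explicit(&r->hp, memory_order_relaxed);

	if (hp != r->cached_hp) {
		r->cached_hp = hp;
		/* Make sure descriptor is read after the head pointer. */
		atomic_thread_fence(memory_order_acquire);
		r->fences++;
	}
}

/* Suggested compromise: no comparison, but a barrier on every call,
 * including when the ring is empty.
 */
void demo_begin_unconditional(struct demo_ring_state *r)
{
	assert(r != NULL);
	r->cached_hp = atomic_load_explicit(&r->hp, memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);
	r->fences++;
}
```

Polling an empty ring three times issues zero barriers with the conditional
helper and three with the unconditional one; once the head pointer moves,
both issue a barrier on the next call.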
 
> > +			srng->u.dst_ring.cached_hp = hp;
> > +			/* Make sure descriptor is read after the head
> > +			 * pointer.
> > +			 */
> > +			dma_rmb();
> > +		}
> > +	}

Johan