Message ID | 20221127144116.1418083-8-jic23@kernel.org |
---|---|
State | New |
Headers | show |
Series | Input: Fix insufficent DMA alignment. | expand |
On Sun, 27 Nov 2022 14:41:14 +0000 Jonathan Cameron <jic23@kernel.org> wrote: > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > The use of ____cacheline_aligned to ensure a buffer is DMA safe only > enforces the start of the buffer alignment. In this case, sufficient > alignment is already ensured by the use of kzalloc(). > ____cacheline_aligned does not ensure that no other members of the > structure are placed in the same cacheline after the end of the > buffer marked. Thus to ensure a DMA safe buffer it must be at the end > of the structure. This move is unnecessary, because the cacheline is 16 bytes and the buffer is 64 bytes. > Whilst here switch from ____cacheline_aligned to > __aligned(ARCH_KMALLOC_MINALIGN) as on some architectures, with variable > sized cacheline lines across their cache hierarchy, require this > greater alignment guarantee for DMA safety. Make this change throughout > the driver as it reduces need for a reader to know about the particular > architecture. This change looks ok. - Lauri
On Sun, 27 Nov 2022 18:48:44 +0200 Lauri Kasanen <cand@gmx.com> wrote: > On Sun, 27 Nov 2022 14:41:14 +0000 > Jonathan Cameron <jic23@kernel.org> wrote: > > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > > The use of ____cacheline_aligned to ensure a buffer is DMA safe only > > enforces the start of the buffer alignment. In this case, sufficient > > alignment is already ensured by the use of kzalloc(). > > ____cacheline_aligned does not ensure that no other members of the > > structure are placed in the same cacheline after the end of the > > buffer marked. Thus to ensure a DMA safe buffer it must be at the end > > of the structure. > > This move is unnecessary, because the cacheline is 16 bytes and the > buffer is 64 bytes. Ah. That particular option hadn't occurred to me (and I'd failed to notice how big the buffer is :( ). The marking isn't needed at all then as the allocation is already guaranteed to be sufficiently aligned. However, maybe that is a bit subtle and having some sort of marking is useful. Curious question though, why is the buffer so big? Each struct joydata is 8 bytes I think, but the driver only accesses 4 of them. Is the hardware putting garbage into the remaining 2 cachelines or is there something subtle going on? Or given my earlier success, maybe I'm misreading the code entirely. Jonathan > > > Whilst here switch from ____cacheline_aligned to > > __aligned(ARCH_KMALLOC_MINALIGN) as on some architectures, with variable > > sized cacheline lines across their cache hierarchy, require this > > greater alignment guarantee for DMA safety. Make this change throughout > > the driver as it reduces need for a reader to know about the particular > > architecture. > > This change looks ok. > > - Lauri
On Sun, 27 Nov 2022 18:01:26 +0000 Jonathan Cameron <jic23@kernel.org> wrote: > > This move is unnecessary, because the cacheline is 16 bytes and the > > buffer is 64 bytes. > > Ah. That particular option hadn't occurred to me (and I'd failed to notice > how big the buffer is :( ). > The marking isn't needed at all then as the allocation is already > guaranteed to be sufficiently aligned. However, maybe that is a bit subtle > and having some sort of marking is useful. You can replace the __cacheline annotation with a comment, that's totally fine. > Curious question though, why is the buffer so big? > Each struct joydata is 8 bytes I think, but the driver only accesses 4 of them. > > Is the hardware putting garbage into the remaining 2 cachelines or is there > something subtle going on? That chip operates on 64-byte units. I don't remember whether the remaining area is garbage or the input bytes or something else. - Lauri
On Sun, Nov 27, 2022 at 06:48:44PM +0200, Lauri Kasanen wrote: > On Sun, 27 Nov 2022 14:41:14 +0000 > Jonathan Cameron <jic23@kernel.org> wrote: > > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > > The use of ____cacheline_aligned to ensure a buffer is DMA safe only > > enforces the start of the buffer alignment. In this case, sufficient > > alignment is already ensured by the use of kzalloc(). > > ____cacheline_aligned does not ensure that no other members of the > > structure are placed in the same cacheline after the end of the > > buffer marked. Thus to ensure a DMA safe buffer it must be at the end > > of the structure. > > This move is unnecessary, because the cacheline is 16 bytes and the > buffer is 64 bytes. I think it is still worth moving it or alternatively adding a comment why we believe the following member will not be sharing cacheline with the buffer. Thanks.
diff --git a/drivers/input/joystick/n64joy.c b/drivers/input/joystick/n64joy.c index 9dbca366613e..d8c50103c108 100644 --- a/drivers/input/joystick/n64joy.c +++ b/drivers/input/joystick/n64joy.c @@ -44,12 +44,12 @@ static const char *n64joy_phys[MAX_CONTROLLERS] = { }; struct n64joy_priv { - u64 si_buf[8] ____cacheline_aligned; struct timer_list timer; struct mutex n64joy_mutex; struct input_dev *n64joy_dev[MAX_CONTROLLERS]; u32 __iomem *reg_base; u8 n64joy_opened; + u64 si_buf[8] __aligned(ARCH_KMALLOC_MINALIGN); }; struct joydata { @@ -129,7 +129,7 @@ static void n64joy_exec_pif(struct n64joy_priv *priv, const u64 in[8]) local_irq_restore(flags); } -static const u64 polldata[] ____cacheline_aligned = { +static const u64 polldata[] __aligned(ARCH_KMALLOC_MINALIGN) = { 0xff010401ffffffff, 0xff010401ffffffff, 0xff010401ffffffff, @@ -222,7 +222,7 @@ static void n64joy_close(struct input_dev *dev) mutex_unlock(&priv->n64joy_mutex); } -static const u64 __initconst scandata[] ____cacheline_aligned = { +static const u64 __initconst scandata[] __aligned(ARCH_KMALLOC_MINALIGN) = { 0xff010300ffffffff, 0xff010300ffffffff, 0xff010300ffffffff,