Message ID | 20221127144116.1418083-1-jic23@kernel.org |
---|---|
Headers | show |
Series | Input: Fix insufficent DMA alignment. | expand |
On Sun, 27 Nov 2022 14:41:13 +0000 Jonathan Cameron <jic23@kernel.org> wrote: > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > On some architectures (e.g. arm64), ____cachline_aligned only aligns > to the cacheline size of the L1 cache size. L1_CACHE_BYTES in > arch64/include/asm/cache.h Unfortunately DMA safety on these > architectures requires the buffer no share a last level cache cacheline > given by ARCH_DMA_MINALIGN which has a greater granularity. > ARCH_DMA_MINALIGN is not defined for all architectures, but when it is > defined it is used to set the size of ARCH_KMALLOC_MINALIGN > to allow DMA safe buffer allocations. > > As such the correct alignment requirement is > __aligned(ARCH_KMALLOC_MINALIGN). > This has recently been fixed in other subsystems such as IIO. > > Presumably this part is little used on boards where this could actually > matter so this is mostly about removing code that might be coppied > elsewhere. > > Fixes: 4feacbc24eea ("Input: add new driver for the Surface 3") > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Sigh. Sunday afternoon incompetence on my part... Dmitry, if everything else is fine with this series I can resend this or if you are feeling generous feel free to fix it up whilst applying ;) Jonathan > --- > drivers/input/touchscreen/surface3_spi.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/input/touchscreen/surface3_spi.c b/drivers/input/touchscreen/surface3_spi.c > index 1da23e5585a0..6c884fc2b332 100644 > --- a/drivers/input/touchscreen/surface3_spi.c > +++ b/drivers/input/touchscreen/surface3_spi.c > @@ -32,7 +32,7 @@ struct surface3_ts_data { > struct input_dev *pen_input_dev; > int pen_tool; > > - u8 rd_buf[SURFACE3_PACKET_SIZE] ____cacheline_aligned; > + u8 rd_buf[SURFACE3_PACKET_SIZE] __aligned(ARCH_KMALLOC_MINALIGN; Missing bracket. > }; > > struct surface3_ts_data_finger {
> -----Original Message----- > From: Jonathan Cameron <jic23@kernel.org> > Sent: Sonntag, 27. November 2022 15:41 > To: linux-input@vger.kernel.org; Dmitry Torokhov > <dmitry.torokhov@gmail.com> > Cc: Jonathan Cameron <Jonathan.Cameron@huawei.com>; Hennerich, Michael > <Michael.Hennerich@analog.com> > Subject: [PATCH 2/9] Input: ad714x - Fix padding for DMA safe buffers. > > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > On some architectures (e.g. arm64), ____cachline_aligned only aligns to the > cacheline size of the L1 cache size. L1_CACHE_BYTES in > arch64/include/asm/cache.h Unfortunately DMA safety on these architectures > requires the buffer no share a last level cache cacheline given by > ARCH_DMA_MINALIGN which has a greater granularity. > ARCH_DMA_MINALIGN is not defined for all architectures, but when it is > defined it is used to set the size of ARCH_KMALLOC_MINALIGN to allow DMA > safe buffer allocations. > > As such the correct alignment requirement is > __aligned(ARCH_KMALLOC_MINALIGN). > This has recently been fixed in other subsystems such as IIO. > > Fixes tag is inprecise because there may not have been any architectures > where the two values were different at the time of the earlier fix. > > Fixes: c0409feb8689 ("Input: ad714x - use DMA-safe buffers for spi_write()") > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> > Cc: Michael Hennerich <michael.hennerich@analog.com> Acked-by: Michael Hennerich <michael.hennerich@analog.com> > --- > drivers/input/misc/ad714x.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/input/misc/ad714x.h b/drivers/input/misc/ad714x.h index > af847b5f0d0e..2b8b901183be 100644 > --- a/drivers/input/misc/ad714x.h > +++ b/drivers/input/misc/ad714x.h > @@ -41,7 +41,7 @@ struct ad714x_chip { > unsigned product; > unsigned version; > > - __be16 xfer_buf[16] ____cacheline_aligned; > + __be16 xfer_buf[16] __aligned(ARCH_KMALLOC_MINALIGN); > > }; > > -- > 2.38.1
Hi Jonathan, On Sun, Nov 27, 2022 at 02:41:07PM +0000, Jonathan Cameron wrote: > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > This problem was discovered in IIO as a side effect of the discussions about > relaxing kmalloc alignment on arm64 and resulted in a series of large > patch sets. > > https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/ > > Unsurprisingly there are cases of it in other subsystems. > > The short version of this is that there are a few known arm64 chips where > ___cacheline_aligned enforces 64 byte alignment which is what we typically > want for performance optimization as the size of the L1 cache lines. > However, further out in the cache hierarchy we have caches with 128 byte > lines. Those are the ones that matter for DMA safety. > So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which > in this case is 128 bytes. I wonder if we could have something like ____dmasafe_aligned instead of sprinkling ARCH_KMALLOC_MINALIGN around? > > There is one other use of ____cacheline_aligned in input: > joystick/iforce/iforce-usb.c > > Whilst suspicious I'm not sure enough of the requirements of USB to > know if they are there for DMA safety or some other constraint. Yes, USB has requirements similar to SPI. Thanks.
On Mon, 28 Nov 2022 10:16:31 -0800 Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote: > Hi Jonathan, > > On Sun, Nov 27, 2022 at 02:41:07PM +0000, Jonathan Cameron wrote: > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > > This problem was discovered in IIO as a side effect of the discussions about > > relaxing kmalloc alignment on arm64 and resulted in a series of large > > patch sets. > > > > https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/ > > > > Unsurprisingly there are cases of it in other subsystems. > > > > The short version of this is that there are a few known arm64 chips where > > ___cacheline_aligned enforces 64 byte alignment which is what we typically > > want for performance optimization as the size of the L1 cache lines. > > However, further out in the cache hierarchy we have caches with 128 byte > > lines. Those are the ones that matter for DMA safety. > > So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which > > in this case is 128 bytes. > > I wonder if we could have something like ____dmasafe_aligned instead of > sprinkling ARCH_KMALLOC_MINALIGN around? I agree in principle and eventually that will be ARCH_DMA_MINALIGN. But it isn't useable yet for backports. Catalin has worked on several approaches to reducing the alignment of kmalloc. I may well be lagging on the current plan... https://lore.kernel.org/linux-arm-kernel/20221106220143.2129263-1-catalin.marinas@arm.com/#r The way crypto and IIO handled this was to add a local define IIO_DMA_MINALIGN, CRYPTO_MINALIGN Catalin, if you do forwards with making ARCH_DMA_MINALIGN global available, make sure to include IIO_DMA_MINALIGN. I'm fine with the approach you used for crypto of changing the local define. Note that in IIO this typically only bloats a single structure per device instance, so it's not worth the complexity of dynamic alignment. As we are marking structure elements with it, in all cases they are multiples of IIO_DMA_MINALIGN in size, so everything is easy. On the other side of things, we might be able to relax most of these alignment tricks in IIO (and input) if swiotlb is the approach eventually used. I'm personally not that keen on that transition but meh, this stuff usually involves a serial bus, so the extra bounce isnt' too painful. Note that we regularly do fun things like setting tx buffer to be ARCH_KMALLOC_MINALIGN and the rx buffer to be that +n bytes on the basis that a given serial bus master won't trash itself and hence this is safe. I think your approach will bounce all of those which is somewhat unfortunate. Other fun things include DMA mapping a series of small buffers in one allocation for a series of serial messages. Again, that is guaranteed to be safe. Jonathan > > > > > There is one other use of ____cacheline_aligned in input: > > joystick/iforce/iforce-usb.c > > > > Whilst suspicious I'm not sure enough of the requirements of USB to > > know if they are there for DMA safety or some other constraint. > > Yes, USB has requirements similar to SPI. > > Thanks. >
On Tue, Nov 29, 2022 at 09:18:28AM +0000, Jonathan Cameron wrote: > On Mon, 28 Nov 2022 10:16:31 -0800 > Dmitry Torokhov <dmitry.torokhov@gmail.com> wrote: > > > Hi Jonathan, > > > > On Sun, Nov 27, 2022 at 02:41:07PM +0000, Jonathan Cameron wrote: > > > From: Jonathan Cameron <Jonathan.Cameron@huawei.com> > > > > > > This problem was discovered in IIO as a side effect of the discussions about > > > relaxing kmalloc alignment on arm64 and resulted in a series of large > > > patch sets. > > > > > > https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/ > > > > > > Unsurprisingly there are cases of it in other subsystems. > > > > > > The short version of this is that there are a few known arm64 chips where > > > ___cacheline_aligned enforces 64 byte alignment which is what we typically > > > want for performance optimization as the size of the L1 cache lines. > > > However, further out in the cache hierarchy we have caches with 128 byte > > > lines. Those are the ones that matter for DMA safety. > > > So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which > > > in this case is 128 bytes. > > > > I wonder if we could have something like ____dmasafe_aligned instead of > > sprinkling ARCH_KMALLOC_MINALIGN around? > > I agree in principle and eventually that will be ARCH_DMA_MINALIGN. > But it isn't useable yet for backports. Sorry, I do not follow. Are talking about backports because the code is broken in the mainline right now, or it will become broken when Catalin's changes land? And even if it is broken right now why can't we add #define ____dmasafe_aligned __attribute__((__aligned__(ARCH_KMALLOC_MINALIGN))) somewhere in include/linux/cache.h? Then you can tweak it as needed independently of kmalloc alignment and without need to touch drivers again and it should be easy to backport still. Thanks.
From: Jonathan Cameron <Jonathan.Cameron@huawei.com> This problem was discovered in IIO as a side effect of the discussions about relaxing kmalloc alignment on arm64 and resulted in a series of large patch sets. https://lore.kernel.org/linux-iio/20220508175712.647246-1-jic23@kernel.org/ Unsurprisingly there are cases of it in other subsystems. The short version of this is that there are a few known arm64 chips where ___cacheline_aligned enforces 64 byte alignment which is what we typically want for performance optimization as the size of the L1 cache lines. However, further out in the cache hierarchy we have caches with 128 byte lines. Those are the ones that matter for DMA safety. So we need the larger alignment guarantees of ARCH_KMALLOC_MINALIGN which in this case is 128 bytes. There is one other use of ____cacheline_aligned in input: joystick/iforce/iforce-usb.c Whilst suspicious I'm not sure enough of the requirements of USB to know if they are there for DMA safety or some other constraint. Jonathan Cc: Daniel Mack <daniel@zonque.org> Cc: Michael Hennerich <michael.hennerich@analog.com> Cc: Tomohiro Yoshidomi <sylph23k@gmail.com> Cc: Javier Martinez Canillas <javier@dowhile0.org> Cc: Linus Walleij <linus.walleij@linaro.org> Cc: Benjamin Tissoires <benjamin.tissoires@redhat.com> Cc: Lauri Kasanen <cand@gmx.com> Cc: Daniel Hung-yu Wu <hywu@google.com> Jonathan Cameron (9): Input: psxpad - Fix padding for DMA safe buffers. Input: ad714x - Fix padding for DMA safe buffers. Input: ad7887 - Fix padding for DMA safe buffers. Input: ads7846 - Fix padding for DMA safe buffers. Input: cyttsp - Fix padding for DMA safe buffers. Input: surface3 - Fix padding for DMA safe buffers. Input: n64joy - Fix DMA buffer alignment. Input: atmel_captouch - Avoid suspect DMA buffer alignment. Input: elants - Fix suspect DMA buffer alignment drivers/input/joystick/n64joy.c | 6 +++--- drivers/input/joystick/psxpad-spi.c | 4 ++-- drivers/input/misc/ad714x.h | 2 +- drivers/input/misc/atmel_captouch.c | 2 +- drivers/input/touchscreen/ad7877.c | 4 ++-- drivers/input/touchscreen/ads7846.c | 4 ++-- drivers/input/touchscreen/cyttsp_core.h | 2 +- drivers/input/touchscreen/elants_i2c.c | 2 +- drivers/input/touchscreen/surface3_spi.c | 2 +- 9 files changed, 14 insertions(+), 14 deletions(-)