[v2,0/3] phase out CONFIG_VIRT_TO_BUS

Message ID	20220617125750.728590-1-arnd@kernel.org
Headers
Return-Path: <linux-scsi-owner@kernel.org>
From: Arnd Bergmann <arnd@kernel.org>
To: linux-scsi@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Arnd Bergmann <arnd@arndb.de>, Jakub Kicinski <kuba@kernel.org>,
 Christoph Hellwig <hch@infradead.org>,
 Marek Szyprowski <m.szyprowski@samsung.com>,
 Robin Murphy <robin.murphy@arm.com>, iommu@lists.linux-foundation.org,
 Khalid Aziz <khalid@gonehiking.org>,
 "Maciej W . Rozycki" <macro@orcam.me.uk>, Matt Wang <wwentao@vmware.com>,
 Miquel van Smoorenburg <mikevs@xs4all.net>,
 Mark Salyzyn <salyzyn@android.com>, linuxppc-dev@lists.ozlabs.org,
 linux-arch@vger.kernel.org, linux-alpha@vger.kernel.org,
 linux-m68k@lists.linux-m68k.org, linux-parisc@vger.kernel.org,
 Denis Efremov <efremov@linux.com>
Subject: [PATCH v2 0/3] phase out CONFIG_VIRT_TO_BUS
Date: Fri, 17 Jun 2022 14:57:47 +0200
Message-Id: <20220617125750.728590-1-arnd@kernel.org>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk

Series	phase out CONFIG_VIRT_TO_BUS
  [v2,0/3] phase out CONFIG_VIRT_TO_BUS
  [v2,1/3] scsi: dpt_i2o: drop stale VIRT_TO_BUS dependency
  [v2,2/3] scsi: BusLogic remove bus_to_virt
  [v2,3/3] arch/*/: remove CONFIG_VIRT_TO_BUS

Arnd Bergmann June 17, 2022, 12:57 p.m. UTC

From: Arnd Bergmann <arnd@arndb.de>

The virt_to_bus/bus_to_virt interface has been deprecated for
decades. After Jakub Kicinski put a lot of work into cleaning out the
network drivers using them, there are only a couple of other drivers
left, which can all be removed or otherwise cleaned up, to remove the
old interface for good.

Any out of tree drivers using virt_to_bus() should be converted to
using the dma-mapping interfaces, typically dma_alloc_coherent()
or dma_map_single().

There are a few m68k and ppc32 specific drivers that keep using the
interfaces, but these are all guarded with architecture-specific
Kconfig dependencies, and are not actually broken.

There are still a number of drivers that are using virt_to_phys()
and phys_to_virt() in place of dma-mapping operations, and these
are often broken, but they are out of scope for this series.

I would like the first two patches to either get merged through
the SCSI tree, or get an Ack from the SCSI maintainers so I can
merge them through the asm-generic tree.

      Arnd

---
Changes since v1:
 - dropped VME patches that are already in staging-next
 - dropped media patch that gets merged independently
 - added a networking patch and dropped it again after it got merged
 - replace BusLogic removal with a workaround

Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org> # dma-mapping
Cc: Marek Szyprowski <m.szyprowski@samsung.com> # dma-mapping
Cc: Robin Murphy <robin.murphy@arm.com> # dma-mapping
Cc: iommu@lists.linux-foundation.org
Cc: Khalid Aziz <khalid@gonehiking.org> # buslogic
Cc: Maciej W. Rozycki <macro@orcam.me.uk> # buslogic
Cc: Matt Wang <wwentao@vmware.com> # buslogic
Cc: Miquel van Smoorenburg <mikevs@xs4all.net> # dpt_i2o
Cc: Mark Salyzyn <salyzyn@android.com> # dpt_i2o
Cc: linux-scsi@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arch@vger.kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-parisc@vger.kernel.org
Cc: Denis Efremov <efremov@linux.com> # floppy

Arnd Bergmann (3):
  scsi: dpt_i2o: drop stale VIRT_TO_BUS dependency
  scsi: BusLogic remove bus_to_virt
  arch/*/: remove CONFIG_VIRT_TO_BUS

 .../core-api/bus-virt-phys-mapping.rst        | 220 ------------------
 Documentation/core-api/dma-api-howto.rst      |  14 --
 Documentation/core-api/index.rst              |   1 -
 .../translations/zh_CN/core-api/index.rst     |   1 -
 arch/alpha/Kconfig                            |   1 -
 arch/alpha/include/asm/floppy.h               |   2 +-
 arch/alpha/include/asm/io.h                   |   8 +-
 arch/ia64/Kconfig                             |   1 -
 arch/ia64/include/asm/io.h                    |   8 -
 arch/m68k/Kconfig                             |   1 -
 arch/m68k/include/asm/virtconvert.h           |   4 +-
 arch/microblaze/Kconfig                       |   1 -
 arch/microblaze/include/asm/io.h              |   2 -
 arch/mips/Kconfig                             |   1 -
 arch/mips/include/asm/io.h                    |   9 -
 arch/parisc/Kconfig                           |   1 -
 arch/parisc/include/asm/floppy.h              |   4 +-
 arch/parisc/include/asm/io.h                  |   2 -
 arch/powerpc/Kconfig                          |   1 -
 arch/powerpc/include/asm/io.h                 |   2 -
 arch/riscv/include/asm/page.h                 |   1 -
 arch/x86/Kconfig                              |   1 -
 arch/x86/include/asm/io.h                     |   9 -
 arch/xtensa/Kconfig                           |   1 -
 arch/xtensa/include/asm/io.h                  |   3 -
 drivers/scsi/BusLogic.c                       |  27 ++-
 drivers/scsi/Kconfig                          |   4 +-
 drivers/scsi/dpt_i2o.c                        |   4 +-
 include/asm-generic/io.h                      |  14 --
 mm/Kconfig                                    |   8 -
 30 files changed, 30 insertions(+), 326 deletions(-)
 delete mode 100644 Documentation/core-api/bus-virt-phys-mapping.rst

Michael Schmitz June 18, 2022, 1:06 a.m. UTC | #1

Arnd,

On 18.06.2022 at 00:57, Arnd Bergmann wrote:
> From: Arnd Bergmann <arnd@arndb.de>
>
> All architecture-independent users of virt_to_bus() and bus_to_virt()
> have been fixed to use the dma mapping interfaces or have been
> removed now.  This means the definitions on most architectures, and the
> CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed.
>
> The only exceptions to this are a few network and scsi drivers for m68k
> Amiga and VME machines and ppc32 Macintosh. These drivers work correctly
> with the old interfaces and are probably not worth changing.

The Amiga SCSI drivers are all old WD33C93 ones, and replacing 
virt_to_bus by virt_to_phys in the dma_setup() function there would 
cause no functional change at all.

drivers/vme/bridges/vme_ca91cx42.c hasn't been used at all on m68k (it 
is a PCI-to-VME bridge chipset driver that would be needed on 
architectures that natively use a PCI bus). I haven't found anything 
that selects that driver, so not sure it is even still in use??

That would allow you to drop the remaining virt_to_bus define from 
arch/m68k/include/asm/virtconvert.h.

I could submit a patch to convert the Amiga SCSI drivers to use 
virt_to_phys if Geert and the SCSI maintainers think it's worth the churn.

32bit powerpc is a different matter though.

Cheers,

	Michael

Arnd Bergmann June 24, 2022, 9:10 a.m. UTC | #2

On Sat, Jun 18, 2022 at 3:06 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
> On 18.06.2022 at 00:57, Arnd Bergmann wrote:
> >
> > All architecture-independent users of virt_to_bus() and bus_to_virt()
> > have been fixed to use the dma mapping interfaces or have been
> > removed now.  This means the definitions on most architectures, and the
> > CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed.
> >
> > The only exceptions to this are a few network and scsi drivers for m68k
> > Amiga and VME machines and ppc32 Macintosh. These drivers work correctly
> > with the old interfaces and are probably not worth changing.
>
> The Amiga SCSI drivers are all old WD33C93 ones, and replacing
> virt_to_bus by virt_to_phys in the dma_setup() function there would
> cause no functional change at all.

Ok, thanks for taking a look here.

> drivers/vme/bridges/vme_ca91cx42.c hasn't been used at all on m68k (it
> is a PCI-to-VME bridge chipset driver that would be needed on
> architectures that natively use a PCI bus). I haven't found anything
> that selects that driver, so not sure it is even still in use??

It's gone now, Greg has already taken my patches for this through
the staging tree.

> That would allow you to drop the remaining virt_to_bus define from
> arch/m68k/include/asm/virtconvert.h.
>
> I could submit a patch to convert the Amiga SCSI drivers to use
> virt_to_phys if Geert and the SCSI maintainers think it's worth the churn.

I don't think using virt_to_phys() is an improvement here, as
virt_to_bus() was originally meant as a better abstraction to
replace the use of virt_to_phys() to make drivers portable, before
it got replaced by the dma-mapping interface in turn.

It looks like the Amiga SCSI drivers have an open-coded version of
what dma_map_single() does, to do bounce buffering and cache
management. The ideal solution would be to convert the drivers
to actually use the appropriate dma-mapping interfaces and remove
this custom code.

The same could be done for the two vme drivers (scsi/mvme147.c
and ethernet/82596.c), which do the cache management but
apparently don't need swiotlb bounce buffering.

Rewriting the drivers to modern APIs is of course non-trivial,
and if you want a shortcut here, I would suggest introducing
platform specific helpers similar to isa_virt_to_bus() and call
them amiga_virt_to_bus() and vme_virt_to_bus(), respectively.

Putting these into a platform specific header file at least helps
clarify that both the helper functions and the drivers using them
are non-portable.

> 32bit powerpc is a different matter though.

It's similar, but unrelated. The two apple ethernet drivers
(bmac and mace) can again either get changed to use the
dma-mapping interfaces, or get a custom pmac_virt_to_bus()/
pmac_bus_to_virt() helper.

There is also drivers/tty/serial/cpm_uart/cpm_uart_cpm2.c,
which I think just needs a trivial change, but I'm not sure
how to do it correctly.

      Arnd

Michael Schmitz June 26, 2022, 5:21 a.m. UTC | #3

Arnd,

On 24.06.2022 at 21:10, Arnd Bergmann wrote:
> On Sat, Jun 18, 2022 at 3:06 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
>> On 18.06.2022 at 00:57, Arnd Bergmann wrote:
>>>
>>> All architecture-independent users of virt_to_bus() and bus_to_virt()
>>> have been fixed to use the dma mapping interfaces or have been
>>> removed now.  This means the definitions on most architectures, and the
>>> CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed.
>>>
>>> The only exceptions to this are a few network and scsi drivers for m68k
>>> Amiga and VME machines and ppc32 Macintosh. These drivers work correctly
>>> with the old interfaces and are probably not worth changing.
>>
>> The Amiga SCSI drivers are all old WD33C93 ones, and replacing
>> virt_to_bus by virt_to_phys in the dma_setup() function there would
>> cause no functional change at all.
>
> Ok, thanks for taking a look here.
>
>> drivers/vme/bridges/vme_ca91cx42.c hasn't been used at all on m68k (it
>> is a PCI-to-VME bridge chipset driver that would be needed on
>> architectures that natively use a PCI bus). I haven't found anything
>> that selects that driver, so not sure it is even still in use??
>
> It's gone now, Greg has already taken my patches for this through
> the staging tree.

One less to worry about, thanks.

>> That would allow you to drop the remaining virt_to_bus define from
>> arch/m68k/include/asm/virtconvert.h.
>>
>> I could submit a patch to convert the Amiga SCSI drivers to use
>> virt_to_phys if Geert and the SCSI maintainers think it's worth the churn.
>
> I don't think using virt_to_phys() is an improvement here, as
> virt_to_bus() was originally meant as a better abstraction to
> replace the use of virt_to_phys() to make drivers portable, before
> it got replaced by the dma-mapping interface in turn.
>
> It looks like the Amiga SCSI drivers have an open-coded version of
> what dma_map_single() does, to do bounce buffering and cache
> management. The ideal solution would be to convert the drivers
> to actually use the appropriate dma-mapping interfaces and remove
> this custom code.

I've taken another look at these drivers' dma_setup() functions and they 
all look much more complex than the Amiga ESP drivers (which do use the 
dma-mapping interface for parts of the DMA setup). From my limited 
understanding, the difference between the ESP and WD33C93 drivers is 
that the former are used on 040/060 accelerator boards only (where the 
processor does do bus snooping and DMA can access all of RAM). The 
latter ones would need cache management, could only use non-coherent 
mappings and would require special case handling for DMA-inaccessible 
RAM inside a device-specific dma ops' map_page() function.

That's several bridges too far for me ... I have no Amiga hardware 
whatsoever, and know no one who could test changes to WD33C93 drivers 
for me.

What I have is a NCR5380 with the proverbial 'pathological DMA' 
integration example (and its driver was never changed to even use 
virt_to_bus()!). I might learn enough about using the dma-mapping API on 
that one eventually (though the requirement for at least 1 MB swiotlb 
bounce buffers looks hard to meet), and use that to convert the WD33C93 
drivers, but it would still remain untested.

> The same could be done for the two vme drivers (scsi/mvme147.c
> and ethernet/82596.c), which do the cache management but
> apparently don't need swiotlb bounce buffering.
>
> Rewriting the drivers to modern APIs is of course non-trivial,
> and if you want a shortcut here, I would suggest introducing
> platform specific helpers similar to isa_virt_to_bus() and call
> them amiga_virt_to_bus() and vme_virt_to_bus(), respectively.

I don't think Amiga and m68k VME differ at all in that respect, so might 
just call it m68k_virt_to_bus() for now.

> Putting these into a platform specific header file at least helps
> clarify that both the helper functions and the drivers using them
> are non-portable.

There are no platform specific header files other than asm/amigahw.h and 
asm/mvme147hw.h, currently only holding register address definitions. 
Would it be OK to add m68k_virt_to_bus() in there if it can't remain in 
asm/virtconvert.h, Geert?

>
>> 32bit powerpc is a different matter though.
>
> It's similar, but unrelated. The two apple ethernet drivers
> (bmac and mace) can again either get changed to use the
> dma-mapping interfaces, or get a custom pmac_virt_to_bus()/
> pmac_bus_to_virt() helper.

Hmmm - I see Finn had done the DMA API conversion on macmace.c which 
might give some hints on what to do about mace.c ... no idea about 
bmac.c though. And again, haven't got hardware to test, so custom 
helpers is it, then.

Cheers,

	Michael


Arnd Bergmann June 26, 2022, 8:36 a.m. UTC | #4

On Sun, Jun 26, 2022 at 7:21 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
> > The same could be done for the two vme drivers (scsi/mvme147.c
> > and ethernet/82596.c), which do the cache management but
> > apparently don't need swiotlb bounce buffering.
> >
> > Rewriting the drivers to modern APIs is of course non-trivial,
> > and if you want a shortcut here, I would suggest introducing
> > platform specific helpers similar to isa_virt_to_bus() and call
> > them amiga_virt_to_bus() and vme_virt_to_bus(), respectively.
>
> I don't think Amiga and m68k VME differ at all in that respect, so might
> just call it m68k_virt_to_bus() for now.
>
> > Putting these into a platform specific header file at least helps
> > clarify that both the helper functions and the drivers using them
> > are non-portable.
>
> There are no platform specific header files other than asm/amigahw.h and
> asm/mvme147hw.h, currently only holding register address definitions.
> Would it be OK to add m68k_virt_to_bus() in there if it can't remain in
> asm/virtconvert.h, Geert?

In that case, I would just leave it under the current name and not change
m68k at all. I don't like the m68k_virt_to_bus() name because there is
not anything CPU specific in what it does, and keeping it in a common
header does nothing to prevent it from being used on other platforms
either.

> >> 32bit powerpc is a different matter though.
> >
> > It's similar, but unrelated. The two apple ethernet drivers
> > (bmac and mace) can again either get changed to use the
> > dma-mapping interfaces, or get a custom pmac_virt_to_bus()/
> > pmac_bus_to_virt() helper.
>
> Hmmm - I see Finn had done the DMA API conversion on macmace.c which
> might give some hints on what to do about mace.c ... no idea about
> bmac.c though. And again, haven't got hardware to test, so custom
> helpers is it, then.

Ok.

          Arnd

Michael Schmitz June 27, 2022, 8:09 a.m. UTC | #5

Arnd,

On 26.06.2022 at 20:36, Arnd Bergmann wrote:
>> There are no platform specific header files other than asm/amigahw.h and
>> asm/mvme147hw.h, currently only holding register address definitions.
>> Would it be OK to add m68k_virt_to_bus() in there if it can't remain in
>> asm/virtconvert.h, Geert?
>
> In that case, I would just leave it under the current name and not change
> m68k at all. I don't like the m68k_virt_to_bus() name because there is
> not anything CPU specific in what it does, and keeping it in a common
> header does nothing to prevent it from being used on other platforms
> either.

Fair enough.

>>>> 32bit powerpc is a different matter though.
>>>
>>> It's similar, but unrelated. The two apple ethernet drivers
>>> (bmac and mace) can again either get changed to use the
>>> dma-mapping interfaces, or get a custom pmac_virt_to_bus()/
>>> pmac_bus_to_virt() helper.
>>
>> Hmmm - I see Finn had done the DMA API conversion on macmace.c which
>> might give some hints on what to do about mace.c ... no idea about
>> bmac.c though. And again, haven't got hardware to test, so custom
>> helpers is it, then.
>
> Ok.

Again, no platform specific headers to shift renamed helpers to, so may 
as well keep this as-is.

Cheers,

	Michael



Geert Uytterhoeven June 27, 2022, 8:26 a.m. UTC | #6

Hi Michael,

On Sat, Jun 18, 2022 at 3:06 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
> On 18.06.2022 at 00:57, Arnd Bergmann wrote:
> > From: Arnd Bergmann <arnd@arndb.de>
> >
> > All architecture-independent users of virt_to_bus() and bus_to_virt()
> > have been fixed to use the dma mapping interfaces or have been
> > removed now.  This means the definitions on most architectures, and the
> > CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed.
> >
> > The only exceptions to this are a few network and scsi drivers for m68k
> > Amiga and VME machines and ppc32 Macintosh. These drivers work correctly
> > with the old interfaces and are probably not worth changing.
>
> The Amiga SCSI drivers are all old WD33C93 ones, and replacing
> virt_to_bus by virt_to_phys in the dma_setup() function there would
> cause no functional change at all.

FTR, the sgiwd93 driver uses dma_map_single().

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

Michael Schmitz June 27, 2022, 9:12 p.m. UTC | #7

Hi Geert,

On 27/06/22 20:26, Geert Uytterhoeven wrote:
> Hi Michael,
>
> On Sat, Jun 18, 2022 at 3:06 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
>> On 18.06.2022 at 00:57, Arnd Bergmann wrote:
>>> From: Arnd Bergmann <arnd@arndb.de>
>>>
>>> All architecture-independent users of virt_to_bus() and bus_to_virt()
>>> have been fixed to use the dma mapping interfaces or have been
>>> removed now.  This means the definitions on most architectures, and the
>>> CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed.
>>>
>>> The only exceptions to this are a few network and scsi drivers for m68k
>>> Amiga and VME machines and ppc32 Macintosh. These drivers work correctly
>>> with the old interfaces and are probably not worth changing.
>> The Amiga SCSI drivers are all old WD33C93 ones, and replacing
>> virt_to_bus by virt_to_phys in the dma_setup() function there would
>> cause no functional change at all.
> FTR, the sgiwd93 driver uses dma_map_single().

Thanks! From what I see, it doesn't have to deal with bounce buffers 
though?

Cheers,

     Michael



Michael Schmitz June 28, 2022, 3:25 a.m. UTC | #8

Hi Geert,

On 28.06.2022 at 09:12, Michael Schmitz wrote:
> Hi Geert,
>
> On 27/06/22 20:26, Geert Uytterhoeven wrote:
>> Hi Michael,
>>
>> On Sat, Jun 18, 2022 at 3:06 AM Michael Schmitz <schmitzmic@gmail.com>
>> wrote:
>>> On 18.06.2022 at 00:57, Arnd Bergmann wrote:
>>>> From: Arnd Bergmann <arnd@arndb.de>
>>>>
>>>> All architecture-independent users of virt_to_bus() and bus_to_virt()
>>>> have been fixed to use the dma mapping interfaces or have been
>>>> removed now.  This means the definitions on most architectures, and the
>>>> CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed.
>>>>
>>>> The only exceptions to this are a few network and scsi drivers for m68k
>>>> Amiga and VME machines and ppc32 Macintosh. These drivers work
>>>> correctly
>>>> with the old interfaces and are probably not worth changing.
>>> The Amiga SCSI drivers are all old WD33C93 ones, and replacing
>>> virt_to_bus by virt_to_phys in the dma_setup() function there would
>>> cause no functional change at all.
>> FTR, the sgiwd93 driver uses dma_map_single().
>
> Thanks! From what I see, it doesn't have to deal with bounce buffers
> though?

Leaving the bounce buffer handling in place, and taking a few other 
liberties - this is what converting the easiest case (a3000 SCSI) might 
look like. Any obvious mistakes? The mvme147 driver would be very 
similar to handle (after conversion to a platform device).

The driver allocates bounce buffers using kmalloc if it hits an 
unaligned data buffer - can such buffers still even happen these days? 
If I understand dma_map_single() correctly, the resulting dma handle 
would be equally misaligned?

To allocate a bounce buffer, would it be OK to use dma_alloc_coherent() 
even though AFAIU memory used for DMA buffers generally isn't consistent 
on m68k?

Thinking ahead to the other two Amiga drivers - I wonder whether 
allocating a static bounce buffer or a DMA pool at driver init is likely 
to succeed if the kernel runs from the low 16 MB RAM chunk? It certainly 
won't succeed if the kernel runs from a higher memory address, so the 
present bounce buffer logic around amiga_chip_alloc() might still need 
to be used here.

Leaves the question whether converting the gvp11 and a2091 drivers is 
actually worth it, if bounce buffers still have to be handled explicitly.

Untested (except for compile testing), un-checkpatched, don't try this 
on any disk with valuable data ...

Cheers,

	Michael

Geert Uytterhoeven June 28, 2022, 7:03 a.m. UTC | #9

Hi Michael,

On Tue, Jun 28, 2022 at 5:26 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
> On 28.06.2022 at 09:12, Michael Schmitz wrote:
> > On 27/06/22 20:26, Geert Uytterhoeven wrote:
> >> On Sat, Jun 18, 2022 at 3:06 AM Michael Schmitz <schmitzmic@gmail.com>
> >> wrote:
> >>> On 18.06.2022 at 00:57, Arnd Bergmann wrote:
> >>>> From: Arnd Bergmann <arnd@arndb.de>
> >>>>
> >>>> All architecture-independent users of virt_to_bus() and bus_to_virt()
> >>>> have been fixed to use the dma mapping interfaces or have been
> >>>> removed now.  This means the definitions on most architectures, and the
> >>>> CONFIG_VIRT_TO_BUS symbol are now obsolete and can be removed.
> >>>>
> >>>> The only exceptions to this are a few network and scsi drivers for m68k
> >>>> Amiga and VME machines and ppc32 Macintosh. These drivers work
> >>>> correctly
> >>>> with the old interfaces and are probably not worth changing.
> >>> The Amiga SCSI drivers are all old WD33C93 ones, and replacing
> >>> virt_to_bus by virt_to_phys in the dma_setup() function there would
> >>> cause no functional change at all.
> >> FTR, the sgiwd93 driver uses dma_map_single().
> >
> > Thanks! From what I see, it doesn't have to deal with bounce buffers
> > though?
>
> Leaving the bounce buffer handling in place, and taking a few other
> liberties - this is what converting the easiest case (a3000 SCSI) might
> look like. Any obvious mistakes? The mvme147 driver would be very
> similar to handle (after conversion to a platform device).

Thanks, looks reasonable.

> The driver allocates bounce buffers using kmalloc if it hits an
> unaligned data buffer - can such buffers still even happen these days?

No idea.

> If I understand dma_map_single() correctly, the resulting dma handle
> would be equally misaligned?
>
> To allocate a bounce buffer, would it be OK to use dma_alloc_coherent()
> even though AFAIU memory used for DMA buffers generally isn't consistent
> on m68k?
>
> Thinking ahead to the other two Amiga drivers - I wonder whether
> allocating a static bounce buffer or a DMA pool at driver init is likely
> to succeed if the kernel runs from the low 16 MB RAM chunk? It certainly
> won't succeed if the kernel runs from a higher memory address, so the
> present bounce buffer logic around amiga_chip_alloc() might still need
> to be used here.
>
> Leaves the question whether converting the gvp11 and a2091 drivers is
> actually worth it, if bounce buffers still have to be handled explicitly.

A2091 should be straight-forward, as A3000 is basically A2091 on the
motherboard (comparing the two drivers, looks like someone's been
sprinkling mb()s over the A3000 driver).

I don't have any of these SCSI host adapters (not counting the A590
(~A2091) expansion of the old A500, which is not Linux-capable, and
 hasn't been powered on for 20 years).

Gr{oetje,eeting}s,

                        Geert


Arnd Bergmann June 28, 2022, 7:08 a.m. UTC | #10

On Tue, Jun 28, 2022 at 5:25 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
> On 28.06.2022 at 09:12, Michael Schmitz wrote:
>
> Leaving the bounce buffer handling in place, and taking a few other
> liberties - this is what converting the easiest case (a3000 SCSI) might
> look like. Any obvious mistakes? The mvme147 driver would be very
> similar to handle (after conversion to a platform device).
>
> The driver allocates bounce buffers using kmalloc if it hits an
> unaligned data buffer - can such buffers still even happen these days?
> If I understand dma_map_single() correctly, the resulting dma handle
> would be equally misaligned?
>
> To allocate a bounce buffer, would it be OK to use dma_alloc_coherent()
> even though AFAIU memory used for DMA buffers generally isn't consistent
> on m68k?

I think it makes sense to skip the bounce buffering as you do here:
the only standardized way we have for integrating that part is to
use the swiotlb infrastructure, but as you mentioned earlier that
part is probably too resource-heavy here for Amiga.

I see two other problems with your patch though:

a) you still duplicate the cache handling: the cache_clear()/cache_push()
is supposed to already be done by dma_map_single() when the device
is not cache-coherent.

b) The bounce buffer is never mapped here, instead you have the
virt_to_phys() here, which is not the same. I think you need to map
the pointer that actually gets passed down to the device after deciding
to use a bounce buffer or not.
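
Point (b) can be modelled outside the kernel. What follows is a minimal
userspace sketch and not the a3000 patch: model_dma_map_single() is a
hypothetical stand-in for dma_map_single(), the 4-byte alignment rule is
an assumption made for illustration, and the "DMA handle" is simply the
pointer value. It only shows the ordering: pick the caller's buffer or
the bounce buffer first, then map the pointer the device will really see.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed alignment requirement of the DMA engine (illustrative only). */
#define MODEL_DMA_ALIGN 4u

typedef uint64_t model_dma_addr_t;

struct model_xfer {
	void *dev_buf;            /* pointer actually handed to the device */
	model_dma_addr_t handle;  /* result of mapping dev_buf */
	int bounced;              /* non-zero if the bounce buffer is used */
};

/* Hypothetical policy: misaligned buffers must be bounced. */
static int model_needs_bounce(const void *buf)
{
	return ((uintptr_t)buf & (MODEL_DMA_ALIGN - 1)) != 0;
}

/*
 * Hypothetical stand-in for dma_map_single(). In the kernel the real
 * call also does the cache maintenance for non-coherent devices, so
 * the driver would not repeat it by hand.
 */
static model_dma_addr_t model_dma_map_single(void *cpu_addr, size_t len)
{
	(void)len;
	return (model_dma_addr_t)(uintptr_t)cpu_addr;
}

/*
 * Decide on bouncing first, then map whichever pointer was chosen.
 * The caller guarantees that bounce can hold len bytes.
 */
void model_dma_setup(struct model_xfer *x, void *data, size_t len,
		     void *bounce, int dir_out)
{
	x->bounced = model_needs_bounce(data);
	x->dev_buf = x->bounced ? bounce : data;

	/* For writes to the device, fill the bounce buffer before mapping. */
	if (x->bounced && dir_out)
		memcpy(bounce, data, len);

	x->handle = model_dma_map_single(x->dev_buf, len);
}
```

With this ordering the handle always describes x->dev_buf, so there is
no separate virt_to_phys() on the bounce buffer path.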

     Arnd

Michael Schmitz June 28, 2022, 9:03 p.m. UTC | #11

Hi Geert,

On 28/06/22 19:03, Geert Uytterhoeven wrote:
>
>> Leaving the bounce buffer handling in place, and taking a few other
>> liberties - this is what converting the easiest case (a3000 SCSI) might
>> look like. Any obvious mistakes? The mvme147 driver would be very
>> similar to handle (after conversion to a platform device).
> Thanks, looks reasonable.

Thanks, I'll take care of Arnd's comments and post a corrected version
later.

>> The driver allocates bounce buffers using kmalloc if it hits an
>> unaligned data buffer - can such buffers still even happen these days?
> No idea.

Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether this
code path is still being used.
>
>> If I understand dma_map_single() correctly, the resulting dma handle
>> would be equally misaligned?
>>
>> To allocate a bounce buffer, would it be OK to use dma_alloc_coherent()
>> even though AFAIU memory used for DMA buffers generally isn't consistent
>> on m68k?
>>
>> Thinking ahead to the other two Amiga drivers - I wonder whether
>> allocating a static bounce buffer or a DMA pool at driver init is likely
>> to succeed if the kernel runs from the low 16 MB RAM chunk? It certainly
>> won't succeed if the kernel runs from a higher memory address, so the
>> present bounce buffer logic around amiga_chip_alloc() might still need
>> to be used here.
>>
>> Leaves the question whether converting the gvp11 and a2091 drivers is
>> actually worth it, if bounce buffers still have to be handled explicitly.
> A2091 should be straight-forward, as A3000 is basically A2091 on the
> motherboard (comparing the two drivers, looks like someone's been
> sprinkling mb()s over the A3000 driver).

Yep, and at least the ones in the dma_setup() function are there for no 
reason (the compiler won't reorder stores around the cache flush calls, 
I hope?).

That just leaves the 24 bit DMA mask there (and the likely need for bounce buffers).
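
To make the 24 bit limit concrete, here is a small userspace sketch of
the range check that decides whether a buffer is reachable. The names
MODEL_DMA_BIT_MASK() and model_dma_capable() are hypothetical; the macro
mirrors the shape of the kernel's DMA_BIT_MASK(), but none of this is
taken from the gvp11 or a2091 drivers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as the kernel's DMA_BIT_MASK(n); name is illustrative. */
#define MODEL_DMA_BIT_MASK(n) \
	(((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/*
 * Return non-zero if the whole range [bus_addr, bus_addr + len) lies
 * at or below the mask. A buffer that fails this check would need a
 * bounce buffer in memory the device can reach.
 */
int model_dma_capable(uint64_t bus_addr, size_t len, uint64_t mask)
{
	uint64_t end;

	if (len == 0)
		return 0;

	end = bus_addr + (uint64_t)len - 1;

	/* end < bus_addr means the addition wrapped around. */
	return end >= bus_addr && end <= mask;
}
```

With a 24 bit mask the last reachable byte is at 0x00ffffff, i.e. the
low 16 MB mentioned earlier in the thread.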

> I don't have any of these SCSI host adapters (not counting the A590
> (~A2091) expansion of the old A500, which is not Linux-capable, and
>   hasn't been powered on for 20 years).

I wonder whether kullervo has survived - that one was an A3000. Should 
have gone to Adrian a few years ago...

Cheers,

     Michael



Michael Schmitz June 28, 2022, 9:38 p.m. UTC | #12

Hi Arnd,

On 28/06/22 19:08, Arnd Bergmann wrote:
> On Tue, Jun 28, 2022 at 5:25 AM Michael Schmitz <schmitzmic@gmail.com> wrote:
>> On 28.06.2022 at 09:12, Michael Schmitz wrote:
>>
>> Leaving the bounce buffer handling in place, and taking a few other
>> liberties - this is what converting the easiest case (a3000 SCSI) might
>> look like. Any obvious mistakes? The mvme147 driver would be very
>> similar to handle (after conversion to a platform device).
>>
>> The driver allocates bounce buffers using kmalloc if it hits an
>> unaligned data buffer - can such buffers still even happen these days?
>> If I understand dma_map_single() correctly, the resulting dma handle
>> would be equally misaligned?
>>
>> To allocate a bounce buffer, would it be OK to use dma_alloc_coherent()
>> even though AFAIU memory used for DMA buffers generally isn't consistent
>> on m68k?
> I think it makes sense to skip the bounce buffering as you do here:
> the only standardized way we have for integrating that part is to
> use the swiotlb infrastructure, but as you mentioned earlier that
> part is probably too resource-heavy here for Amiga.

OK, leaving the old custom logic in place makes it easier to convert the
24 bit DMA drivers.
>
> I see two other problems with your patch though:
>
> a) you still duplicate the cache handling: the cache_clear()/cache_push()
> is supposed to already be done by dma_map_single() when the device
> is not cache-coherent.

That's one of the 'liberties' I alluded to. The reason I left these in 
is that I'm none too certain what device feature the DMA API uses to 
decide a device isn't cache-coherent. If it's dev->coherent_dma_mask, 
the way I set up the device in the a3000 driver should leave the 
coherent mask unchanged. For the Zorro drivers, devices are set up to 
use the same storage to store normal and coherent masks - something we 
most likely want to change. I need to think about the ramifications of 
that.

Note that zorro_esp.c uses dma_sync_single_for_device() and uses a 32 
bit coherent DMA mask which does work OK. I might ask Adrian to test a 
change to only set dev->dma_mask, and drop the 
dma_sync_single_for_device() calls if there's any doubt on this aspect.

> b) The bounce buffer is never mapped here, instead you have the
> virt_to_phys() here, which is not the same. I think you need to map
> the pointer that actually gets passed down to the device after deciding
> to use a bounce buffer or not.

I hadn't realized that I can map the bounce buffer just as it's done for 
the SCp data buffer. Should have been obvious, but I'm still learning 
about the DMA API.
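
As a rough userspace sketch of point (b), not the actual a3000 code: decide first whether the bounce buffer is needed, then map whichever pointer comes out of that decision. The function name, the mask parameter and the to_device flag are made up for illustration; in a driver, the returned pointer is what would be handed to dma_map_single() in place of the virt_to_phys() call.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: return the buffer the device should really see.
 * xfer_mask is (required alignment - 1), e.g. 3 for four-byte alignment.
 * The caller owns 'bounce' and must make it at least 'len' bytes long.
 */
void *sketch_pick_dma_buffer(void *data, void *bounce, size_t len,
                             uintptr_t xfer_mask, int to_device)
{
    /* Aligned buffers can be used directly. */
    if (((uintptr_t)data & xfer_mask) == 0)
        return data;

    /* Outgoing data has to be staged in the bounce buffer first. */
    if (to_device)
        memcpy(bounce, data, len);

    return bounce;
}
```

For transfers from the device, the data would have to be copied back out of the bounce buffer after unmapping; that step is left out here.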

I've updated the patch now, will re-send as part of a complete series 
once done.

Cheers,

     Michael

>
>       Arnd

Arnd Bergmann June 28, 2022, 9:50 p.m. UTC | #13

On Tue, Jun 28, 2022 at 11:03 PM Michael Schmitz <schmitzmic@gmail.com> wrote:
> On 28/06/22 19:03, Geert Uytterhoeven wrote:
> >> The driver allocates bounce buffers using kmalloc if it hits an
> >> unaligned data buffer - can such buffers still even happen these days?
> > No idea.
> Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether this
> code path is still being used.

kmalloc() guarantees alignment to the next power-of-two size or
KMALLOC_MIN_ALIGN, whichever is bigger. On m68k this means it
is cacheline aligned.

      Arnd

Arnd Bergmann June 28, 2022, 9:55 p.m. UTC | #14

On Tue, Jun 28, 2022 at 11:38 PM Michael Schmitz <schmitzmic@gmail.com> wrote:
> On 28/06/22 19:08, Arnd Bergmann wrote:
> > I see two other problems with your patch though:
> >
> > a) you still duplicate the cache handling: the cache_clear()/cache_push()
> > is supposed to already be done by dma_map_single() when the device
> > is not cache-coherent.
>
> That's one of the 'liberties' I alluded to. The reason I left these in
> is that I'm none too certain what device feature the DMA API uses to
> decide a device isn't cache-coherent. If it's dev->coherent_dma_mask,
> the way I set up the device in the a3000 driver should leave the
> coherent mask unchanged. For the Zorro drivers, devices are set up to
> use the same storage to store normal and coherent masks - something we
> most likely want to change. I need to think about the ramifications of
> that.
>
> Note that zorro_esp.c uses dma_sync_single_for_device() and uses a 32
> bit coherent DMA mask which does work OK. I might ask Adrian to test a
> change to only set dev->dma_mask, and drop the
> dma_sync_single_for_device() calls if there's any doubt on this aspect.

The "coherent_mask" is independent of the cache flushing. On some
architectures, a device can indicate whether it needs cache management
or not to guarantee coherency, but on m68k it appears that we always
assume it does, see arch/m68k/kernel/dma.c

> > b) The bounce buffer is never mapped here, instead you have the
> > virt_to_phys() here, which is not the same. I think you need to map
> > the pointer that actually gets passed down to the device after deciding
> > to use a bounce buffer or not.
>
> I hadn't realized that I can map the bounce buffer just as it's done for
> the SCp data buffer. Should have been obvious, but I'm still learning
> about the DMA API.
>
> I've updated the patch now, will re-send as part of a complete series
> once done.

I suppose you can just drop the bounce buffer if this just comes
from kmalloc().

       Arnd

Michael Schmitz June 28, 2022, 11:09 p.m. UTC | #15

Hi Arnd,

On 29/06/22 09:50, Arnd Bergmann wrote:
> On Tue, Jun 28, 2022 at 11:03 PM Michael Schmitz <schmitzmic@gmail.com> wrote:
>> On 28/06/22 19:03, Geert Uytterhoeven wrote:
>>>> The driver allocates bounce buffers using kmalloc if it hits an
>>>> unaligned data buffer - can such buffers still even happen these days?
>>> No idea.
>> Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether this
>> code path is still being used.
> kmalloc() guarantees alignment to the next power-of-two size or
> KMALLOC_MIN_ALIGN, whichever is bigger. On m68k this means it
> is cacheline aligned.

And all SCSI buffers are allocated using kmalloc? No way at all for user 
space to pass unaligned data?

(SCSI is a weird beast - I have used a SCSI DAT tape driver many many 
years ago, which broke all sorts of assumptions about transfer block 
sizes ... but that might actually have been in the v0.99 days, many 
rewrites of SCSI midlevel ago).

Just being cautious, as getting any of this tested will be a stretch.

Cheers,

     Michael

>
>        Arnd

Michael Schmitz June 28, 2022, 11:43 p.m. UTC | #16

Hi Arnd,

On 29/06/22 09:55, Arnd Bergmann wrote:
> On Tue, Jun 28, 2022 at 11:38 PM Michael Schmitz <schmitzmic@gmail.com> wrote:
>> On 28/06/22 19:08, Arnd Bergmann wrote:
>>> I see two other problems with your patch though:
>>>
>>> a) you still duplicate the cache handling: the cache_clear()/cache_push()
>>> is supposed to already be done by dma_map_single() when the device
>>> is not cache-coherent.
>> That's one of the 'liberties' I alluded to. The reason I left these in
>> is that I'm none too certain what device feature the DMA API uses to
>> decide a device isn't cache-coherent. If it's dev->coherent_dma_mask,
>> the way I set up the device in the a3000 driver should leave the
>> coherent mask unchanged. For the Zorro drivers, devices are set up to
>> use the same storage to store normal and coherent masks - something we
>> most likely want to change. I need to think about the ramifications of
>> that.
>>
>> Note that zorro_esp.c uses dma_sync_single_for_device() and uses a 32
>> bit coherent DMA mask which does work OK. I might ask Adrian to test a
>> change to only set dev->dma_mask, and drop the
>> dma_sync_single_for_device() calls if there's any doubt on this aspect.
> The "coherent_mask" is independent of the cache flushing. On some
> architectures, a device can indicate whether it needs cache management
> or not to guarantee coherency, but on m68k it appears that we always
> assume it does, see arch/m68k/kernel/dma.c

Thanks - what I see there indicates that on the relevant platforms, 
pages mapped for DMA have their page table cache bits modified to make 
them non-cacheable (and I suppose unmapping restores the default cache 
bits). That means I should use dma_set_mask_and_coherent() here to take 
advantage of this, and no need to mess around with 
dma_sync_single_for_device() in the drivers' dma_setup() functions.

>>> b) The bounce buffer is never mapped here, instead you have the
>>> virt_to_phys() here, which is not the same. I think you need to map
>>> the pointer that actually gets passed down to the device after deciding
>>> to use a bounce buffer or not.
>> I hadn't realized that I can map the bounce buffer just as it's done for
>> the SCp data buffer. Should have been obvious, but I'm still learning
>> about the DMA API.
>>
>> I've updated the patch now, will re-send as part of a complete series
>> once done.
> I suppose you can just drop the bounce buffer if this just comes
> from kmalloc().

That's only true for a3000 and mvme147 though.

Cheers,

     Michael

>
>         Arnd

Bart Van Assche June 28, 2022, 11:50 p.m. UTC | #17

On 6/28/22 16:09, Michael Schmitz wrote:
> On 29/06/22 09:50, Arnd Bergmann wrote:
>> On Tue, Jun 28, 2022 at 11:03 PM Michael Schmitz 
>> <schmitzmic@gmail.com> wrote:
>>> On 28/06/22 19:03, Geert Uytterhoeven wrote:
>>>>> The driver allocates bounce buffers using kmalloc if it hits an
>>>>> unaligned data buffer - can such buffers still even happen these days?
>>>> No idea.
>>> Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether this
>>> code path is still being used.
>> kmalloc() guarantees alignment to the next power-of-two size or
>> KMALLOC_MIN_ALIGN, whichever is bigger. On m68k this means it
>> is cacheline aligned.
> 
> And all SCSI buffers are allocated using kmalloc? No way at all for user 
> space to pass unaligned data?
> 
> (SCSI is a weird beast - I have used a SCSI DAT tape driver many many 
> years ago, which broke all sorts of assumptions about transfer block 
> sizes ... but that might actually have been in the v0.99 days, many 
> rewrites of SCSI midlevel ago).
> 
> Just being cautious, as getting any of this tested will be a stretch.

An example of a user space application that passes an SG I/O data buffer 
to the kernel that is aligned to a four byte boundary but not to an 
eight byte boundary if the -s (scattered) command line option is used: 
https://github.com/osandov/blktests/blob/master/src/discontiguous-io.cpp

Bart.

Michael Schmitz June 29, 2022, 12:01 a.m. UTC | #18

Hi Bart,

On 29/06/22 11:50, Bart Van Assche wrote:
> On 6/28/22 16:09, Michael Schmitz wrote:
>> On 29/06/22 09:50, Arnd Bergmann wrote:
>>> On Tue, Jun 28, 2022 at 11:03 PM Michael Schmitz 
>>> <schmitzmic@gmail.com> wrote:
>>>> On 28/06/22 19:03, Geert Uytterhoeven wrote:
>>>>>> The driver allocates bounce buffers using kmalloc if it hits an
>>>>>> unaligned data buffer - can such buffers still even happen these 
>>>>>> days?
>>>>> No idea.
>>>> Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether 
>>>> this
>>>> code path is still being used.
>>> kmalloc() guarantees alignment to the next power-of-two size or
>>> KMALLOC_MIN_ALIGN, whichever is bigger. On m68k this means it
>>> is cacheline aligned.
>>
>> And all SCSI buffers are allocated using kmalloc? No way at all for 
>> user space to pass unaligned data?
>>
>> (SCSI is a weird beast - I have used a SCSI DAT tape driver many many 
>> years ago, which broke all sorts of assumptions about transfer block 
>> sizes ... but that might actually have been in the v0.99 days, many 
>> rewrites of SCSI midlevel ago).
>>
>> Just being cautious, as getting any of this tested will be a stretch.
>
> An example of a user space application that passes an SG I/O data 
> buffer to the kernel that is aligned to a four byte boundary but not 
> to an eight byte boundary if the -s (scattered) command line option is 
> used: 
> https://github.com/osandov/blktests/blob/master/src/discontiguous-io.cpp

Thanks - four byte alignment actually wouldn't be an issue for me. It's 
two byte or smaller that would trip up the SCSI DMA.

While I'm sure such an even more pathological test case could be 
written, I was rather worried about st.c and sr.c input ...

Cheers,

     Michael

>
> Bart.

Michael Schmitz June 29, 2022, 12:14 a.m. UTC | #19

Hi Bart,

On 29/06/22 12:01, Michael Schmitz wrote:
>
>> An example of a user space application that passes an SG I/O data 
>> buffer to the kernel that is aligned to a four byte boundary but not 
>> to an eight byte boundary if the -s (scattered) command line option 
>> is used: 
>> https://github.com/osandov/blktests/blob/master/src/discontiguous-io.cpp
>
> Thanks - four byte alignment actually wouldn't be an issue for me. 
> It's two byte or smaller that would trip up the SCSI DMA.
>
> While I'm sure such an even more pathological test case could be 
> written, I was rather worried about st.c and sr.c input ...
Never mind - I just saw that m68k defines ARCH_DMA_MINALIGN to be four bytes. 
Should be safe for all that matters, then.

Cheers,

     Michael

Christoph Hellwig June 29, 2022, 6:21 a.m. UTC | #20

On Wed, Jun 29, 2022 at 11:09:00AM +1200, Michael Schmitz wrote:
> And all SCSI buffers are allocated using kmalloc? No way at all for user
> space to pass unaligned data?

Most that you will see actually comes from the page allocator.  But
the block layer has a dma_alignment limit, and when userspace sends
I/O that is not properly aligned it will be bounce buffered before
it is sent to the driver.
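
The kind of check described here can be modelled in a few lines of plain C. This is only a userspace sketch: the function name is invented, and the mask is passed in directly rather than read from a request queue. A buffer counts as misaligned if either its start address or its length has any of the mask bits set.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model of an alignment test against a DMA alignment mask.
 * mask is (alignment - 1), e.g. 3 for four-byte alignment.
 * Returns true when the buffer would have to be bounce buffered.
 */
bool sketch_needs_bounce(uintptr_t addr, size_t len, unsigned int mask)
{
    return ((addr | len) & mask) != 0;
}
```

With a mask of 7, a buffer starting at 0x1004 is rejected even though it is four-byte aligned, which is the case the blktests program mentioned earlier in the thread exercises.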

Christoph Hellwig June 29, 2022, 6:25 a.m. UTC | #21

On Wed, Jun 29, 2022 at 09:38:00AM +1200, Michael Schmitz wrote:
> That's one of the 'liberties' I alluded to. The reason I left these in is
> that I'm none too certain what device feature the DMA API uses to decide a
> device isn't cache-coherent.

The DMA API does not look at device features at all.  It needs to be
told so by the platform code.  Once an architecture implements the
hooks to support non-coherent DMA all devices are treated as
non-coherent by default unless overridden by the architecture either
globally (using the global dma_default_coherent variable) or per-device
(using the dev->dma_coherent field, usually set by arch_setup_dma_ops).
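
The decision order described above can be written down as a small model. This is a userspace sketch with invented names, not the kernel's implementation: the three inputs stand in for "the architecture has non-coherent hooks", the global default, and a per-device setting. Note that no DMA mask appears anywhere among the inputs.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a per-device setting made by arch code. */
enum sketch_override {
    SKETCH_NO_OVERRIDE,
    SKETCH_FORCE_COHERENT,
    SKETCH_FORCE_NONCOHERENT,
};

/* Returns true if the modelled device is treated as cache-coherent. */
bool sketch_dev_is_coherent(bool arch_has_noncoherent_hooks,
                            bool default_coherent,
                            enum sketch_override per_device)
{
    /* Without the hooks, nothing is ever treated as non-coherent. */
    if (!arch_has_noncoherent_hooks)
        return true;

    /* A per-device setting wins over the global default. */
    if (per_device == SKETCH_FORCE_COHERENT)
        return true;
    if (per_device == SKETCH_FORCE_NONCOHERENT)
        return false;

    return default_coherent;
}
```

An architecture that implements the hooks and sets neither override corresponds to sketch_dev_is_coherent(true, false, SKETCH_NO_OVERRIDE), which yields non-coherent for every device.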

> If it's dev->coherent_dma_mask, the way I set
> up the device in the a3000 driver should leave the coherent mask unchanged.
> For the Zorro drivers, devices are set up to use the same storage to store
> normal and coherent masks - something we most likely want to change. I need
> to think about the ramifications of that.

No, the coherent mask is slightly misnamed and not actually related.

David Laight June 30, 2022, 8:04 a.m. UTC | #22

From: Michael Schmitz
> Sent: 29 June 2022 00:09
> 
> Hi Arnd,
> 
> On 29/06/22 09:50, Arnd Bergmann wrote:
> > On Tue, Jun 28, 2022 at 11:03 PM Michael Schmitz <schmitzmic@gmail.com> wrote:
> >> On 28/06/22 19:03, Geert Uytterhoeven wrote:
> >>>> The driver allocates bounce buffers using kmalloc if it hits an
> >>>> unaligned data buffer - can such buffers still even happen these days?
> >>> No idea.
> >> Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether this
> >> code path is still being used.
> > kmalloc() guarantees alignment to the next power-of-two size or
> > KMALLOC_MIN_ALIGN, whichever is bigger. On m68k this means it
> > is cacheline aligned.
> 
> And all SCSI buffers are allocated using kmalloc? No way at all for user
> space to pass unaligned data?

I didn't think kmalloc() gave any such guarantee about alignment.
There are cache-line alignment requirements on systems with non-coherent
dma, but otherwise the alignment can be much smaller.

One of the allocators adds a header to each item, IIRC that can
lead to 'unexpected' alignments - especially on m68k.

dma_alloc_coherent() does align to next 'power of 2'.
And sometimes you need (eg) 16k allocates that are 16k aligned.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Christophe Leroy June 30, 2022, 9:40 a.m. UTC | #23

Le 30/06/2022 à 10:04, David Laight a écrit :
> From: Michael Schmitz
>> Sent: 29 June 2022 00:09
>>
>> Hi Arnd,
>>
>> On 29/06/22 09:50, Arnd Bergmann wrote:
>>> On Tue, Jun 28, 2022 at 11:03 PM Michael Schmitz <schmitzmic@gmail.com> wrote:
>>>> On 28/06/22 19:03, Geert Uytterhoeven wrote:
>>>>>> The driver allocates bounce buffers using kmalloc if it hits an
>>>>>> unaligned data buffer - can such buffers still even happen these days?
>>>>> No idea.
>>>> Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether this
>>>> code path is still being used.
>>> kmalloc() guarantees alignment to the next power-of-two size or
>>> KMALLOC_MIN_ALIGN, whichever is bigger. On m68k this means it
>>> is cacheline aligned.
>>
>> And all SCSI buffers are allocated using kmalloc? No way at all for user
>> space to pass unaligned data?
> 
> I didn't think kmalloc() gave any such guarantee about alignment.

It does since commit 59bb47985c1d ("mm, sl[aou]b: guarantee natural 
alignment for kmalloc(power-of-two)")
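
Going only by the title of that commit, the guarantee can be modelled as below. This is a userspace sketch with invented names, and the minimum alignment is a parameter because the real value is architecture specific. A power-of-two size gets alignment equal to the size, and any other size only gets the minimum.

```c
#include <stddef.h>

/* True for 1, 2, 4, 8, ... */
int sketch_is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Hypothetical model of the alignment guaranteed for an allocation of
 * 'size' bytes, given an assumed architecture minimum 'min_align'.
 */
size_t sketch_kmalloc_align(size_t size, size_t min_align)
{
    /* Natural alignment for power-of-two sizes above the minimum. */
    if (sketch_is_power_of_two(size) && size > min_align)
        return size;

    /* Everything else is only promised the minimum. */
    return min_align;
}
```

With an assumed minimum of 4, a 64 byte request is modelled as 64 byte aligned, while an odd-sized 47 byte request gets no more than 4.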

Christophe

> There are cache-line alignment requirements on systems with non-coherent
> dma, but otherwise the alignment can be much smaller.
> 
> One of the allocators adds a header to each item, IIRC that can
> lead to 'unexpected' alignments - especially on m68k.
> 
> dma_alloc_coherent() does align to next 'power of 2'.
> And sometimes you need (eg) 16k allocates that are 16k aligned.
> 
> 	David
> 

David Laight June 30, 2022, 10:32 a.m. UTC | #24

From: Christophe Leroy
> Sent: 30 June 2022 10:40
> 
> Le 30/06/2022 à 10:04, David Laight a écrit :
> > From: Michael Schmitz
> >> Sent: 29 June 2022 00:09
> >>
> >> Hi Arnd,
> >>
> >> On 29/06/22 09:50, Arnd Bergmann wrote:
> >>> On Tue, Jun 28, 2022 at 11:03 PM Michael Schmitz <schmitzmic@gmail.com> wrote:
> >>>> On 28/06/22 19:03, Geert Uytterhoeven wrote:
> >>>>>> The driver allocates bounce buffers using kmalloc if it hits an
> >>>>>> unaligned data buffer - can such buffers still even happen these days?
> >>>>> No idea.
> >>>> Hmmm - I think I'll stick a WARN_ONCE() in there so we know whether this
> >>>> code path is still being used.
> >>> kmalloc() guarantees alignment to the next power-of-two size or
> >>> KMALLOC_MIN_ALIGN, whichever is bigger. On m68k this means it
> >>> is cacheline aligned.
> >>
> >> And all SCSI buffers are allocated using kmalloc? No way at all for user
> >> space to pass unaligned data?
> >
> > I didn't think kmalloc() gave any such guarantee about alignment.
> 
> It does since commit 59bb47985c1d ("mm, sl[aou]b: guarantee natural
> alignment for kmalloc(power-of-two)")

Looks like it is done for 'power-of-two' less than PAGE_SIZE.
This may not help scsi tape writes which could easily be (say) 47 bytes.
I think that only guarantees 2 byte alignment on m68k.
(Although increasing the min-alignment on m68k to 4 (or even 8)
will probably make no measurable difference.)

What happens above PAGE_SIZE?
Any structure with a trailing [] field could easily request
'64k + a_bit' bytes.
You don't really want to extend this to 128k - but I suspect
that is what happens.

	David
 

> 
> Christophe
> 
> > There are cache-line alignment requirements on systems with non-coherent
> > dma, but otherwise the alignment can be much smaller.
> >
> > One of the allocators adds a header to each item, IIRC that can
> > lead to 'unexpected' alignments - especially on m68k.
> >
> > dma_alloc_coherent() does align to next 'power of 2'.
> > And sometimes you need (eg) 16k allocates that are 16k aligned.
> >
> > 	David
> >


Michael Schmitz June 30, 2022, 7:21 p.m. UTC | #25

Hi Christoph,

On 29/06/22 18:21, Christoph Hellwig wrote:
> On Wed, Jun 29, 2022 at 11:09:00AM +1200, Michael Schmitz wrote:
>> And all SCSI buffers are allocated using kmalloc? No way at all for user
>> space to pass unaligned data?
> Most that you will see actually comes from the page allocator.  But
> the block layer has a dma_alignment limit, and when userspace sends
> I/O that is not properly aligned it will be bounce buffered before
> it is sent to the driver.

That limit is set to L1_CACHE_BYTES on m68k so we're good here.

Thanks,

     Michael

Michael Schmitz June 30, 2022, 7:26 p.m. UTC | #26

Hi Christoph,

On 29/06/22 18:25, Christoph Hellwig wrote:
> On Wed, Jun 29, 2022 at 09:38:00AM +1200, Michael Schmitz wrote:
>> That's one of the 'liberties' I alluded to. The reason I left these in is
>> that I'm none too certain what device feature the DMA API uses to decide a
>> device isn't cache-coherent.
> The DMA API does not look at device features at all.  It needs to be
> told so by the platform code.  Once an architecture implements the
> hooks to support non-coherent DMA all devices are treated as
> non-coherent by default unless overridden by the architecture either
> globally (using the global dma_default_coherent variable) or per-device
> (using the dev->dma_coherent field, usually set by arch_setup_dma_ops).
Haven't got any of that, so non-coherent DMA is all we can use (even 
though some of the RAM used for bounce buffers may actually be coherent 
due to the page table cache bits).
>
>> If it's dev->coherent_dma_mask, the way I set
>> up the device in the a3000 driver should leave the coherent mask unchanged.
>> For the Zorro drivers, devices are set up to use the same storage to store
>> normal and coherent masks - something we most likely want to change. I need
>> to think about the ramifications of that.
> No, the coherent mask is slightly misnamed and not actually related.

Thanks, that had me confused.

Cheers,

     Michael
