Message ID | 1334332076-28489-3-git-send-email-t.stanislaws@samsung.com |
---|---|
State | Superseded, archived |

Hi Tomasz,

Thanks for the patch.

On Friday 13 April 2012 17:47:44 Tomasz Stanislawski wrote:
> This patch adds description and usage examples for importing
> DMABUF file descriptor in V4L2.
>
> Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>

[snip]

> diff --git a/Documentation/DocBook/media/v4l/io.xml
> b/Documentation/DocBook/media/v4l/io.xml index b815929..dc5979d 100644
> --- a/Documentation/DocBook/media/v4l/io.xml
> +++ b/Documentation/DocBook/media/v4l/io.xml
> @@ -472,6 +472,162 @@ rest should be evident.</para>
> </footnote></para>
> </section>
>
> + <section id="dmabuf">
> + <title>Streaming I/O (DMA buffer importing)</title>

This section is very similar to the Streaming I/O (User Pointers) section. Do
you think we should merge the two? I could handle that if you want.

On 04/17/2012 01:25 AM, Laurent Pinchart wrote:
> Hi Tomasz,
>
> Thanks for the patch.
>
> On Friday 13 April 2012 17:47:44 Tomasz Stanislawski wrote:
>> This patch adds description and usage examples for importing
>> DMABUF file descriptor in V4L2.
>>
>> Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>
> [snip]
>
>> diff --git a/Documentation/DocBook/media/v4l/io.xml
>> b/Documentation/DocBook/media/v4l/io.xml index b815929..dc5979d 100644
>> --- a/Documentation/DocBook/media/v4l/io.xml
>> +++ b/Documentation/DocBook/media/v4l/io.xml
>> @@ -472,6 +472,162 @@ rest should be evident.</para>
>> </footnote></para>
>> </section>
>>
>> + <section id="dmabuf">
>> + <title>Streaming I/O (DMA buffer importing)</title>
>
> This section is very similar to the Streaming I/O (User Pointers) section. Do
> you think we should merge the two ? I could handle that if you want.
>

Hi Laurent,

One may find similar sentences in MMAP, USERPTR and DMABUF.
Maybe the common parts, like the description of STREAMON/OFF and
QBUF/DQBUF shuffling, should be moved to a separate section
like "Streaming" :).

Maybe it is worth introducing a separate patch for this change.

Frankly, I would prefer to keep the Doc in its current form until
importer support gets merged. The Doc could be fixed later.

BTW, what is the sense of merging the userptr and dmabuf sections
if userptr is going to be dropped in the long term?

Regards,
Tomasz Stanislawski

On 19-04-2012 11:32, Tomasz Stanislawski wrote:
> On 04/17/2012 01:25 AM, Laurent Pinchart wrote:
>> Hi Tomasz,
>>
>> Thanks for the patch.
>>
>> On Friday 13 April 2012 17:47:44 Tomasz Stanislawski wrote:
>>> This patch adds description and usage examples for importing
>>> DMABUF file descriptor in V4L2.
>>>
>>> Signed-off-by: Tomasz Stanislawski <t.stanislaws@samsung.com>
>>> Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
>>
>> [snip]
>>
>>> diff --git a/Documentation/DocBook/media/v4l/io.xml
>>> b/Documentation/DocBook/media/v4l/io.xml index b815929..dc5979d 100644
>>> --- a/Documentation/DocBook/media/v4l/io.xml
>>> +++ b/Documentation/DocBook/media/v4l/io.xml
>>> @@ -472,6 +472,162 @@ rest should be evident.</para>
>>> </footnote></para>
>>> </section>
>>>
>>> + <section id="dmabuf">
>>> + <title>Streaming I/O (DMA buffer importing)</title>
>>
>> This section is very similar to the Streaming I/O (User Pointers) section. Do
>> you think we should merge the two ? I could handle that if you want.
>>
>
> Hi Laurent,
>
> One may find similar sentences in MMAP, USERPTR and DMABUF.
> Maybe the common parts like description of STREAMON/OFF,
> QBUF/DQBUF shuffling should be moved to separate section
> like "Streaming" :).
>
> Maybe it is worth to introduce a separate patch for this change.
>
> Frankly, I would prefer to keep the Doc in the current form till
> importer support gets merged. Later the Doc could be fixed.
>
> BTW. What is the sense of merging userptr and dmabuf section
> if userptr is going to dropped in long-term?

I haven't read the rest of the thread, so sorry if I'm making wrong assumptions...
Am I misunderstanding, or are you saying that you want to drop userptr
from the V4L2 API in the long term?

>
> Regards,
> Tomasz Stanislawski

On 19-04-2012 11:32, Tomasz Stanislawski wrote:

> Hi Laurent,
>
> One may find similar sentences in MMAP, USERPTR and DMABUF.
> Maybe the common parts like description of STREAMON/OFF,
> QBUF/DQBUF shuffling should be moved to separate section
> like "Streaming" :).
>
> Maybe it is worth to introduce a separate patch for this change.
>
> Frankly, I would prefer to keep the Doc in the current form till
> importer support gets merged. Later the Doc could be fixed.
>
> BTW. What is the sense of merging userptr and dmabuf section
> if userptr is going to dropped in long-term?

I haven't read the rest of the thread yet, so sorry if I'm making wrong assumptions...
Am I misunderstanding, or are you saying that you want to drop userptr
from the V4L2 API in the long term? If so, why?

Regards,
Mauro

Hi Mauro,

On 04/19/2012 10:37 PM, Mauro Carvalho Chehab wrote:
> On 19-04-2012 11:32, Tomasz Stanislawski wrote:
>
>> Hi Laurent,
>>
>> One may find similar sentences in MMAP, USERPTR and DMABUF.
>> Maybe the common parts like description of STREAMON/OFF,
>> QBUF/DQBUF shuffling should be moved to separate section
>> like "Streaming" :).
>>
>> Maybe it is worth to introduce a separate patch for this change.
>>
>> Frankly, I would prefer to keep the Doc in the current form till
>> importer support gets merged. Later the Doc could be fixed.
>>
>> BTW. What is the sense of merging userptr and dmabuf section
>> if userptr is going to dropped in long-term?
>
> I didn't read yet the rest of the thread, so sorry, if I'm making wrong assumptions...
> Am I understanding wrong or are you saying that you want to drop userptr
> from V4L2 API in long-term? If so, why?

Dropping userptr is just a brainstorming idea.
It turned out that userptr is not a good means
of exchanging buffers between two devices.

USERPTR simplifies userspace code but introduces
a lot of complexity for the kernel drivers
and frameworks.

The problem is that memory mmaped to userspace may
not be part of the system memory. This often happens for
devices that use remap_pfn or dma_mmap_* to mmap the
memory to userspace.

It was found empirically that another device cannot
access this kind of memory without platform-specific
hacks or workarounds.

DMABUF was introduced to help in such cases.

The basic short-term idea is to drop userptr support for
buffers that are MMAPed by another device.

Userptr will then be used for memory allocated using malloc
(anonymous pages) or (maybe) mmaped files. There are of
course cache synchronization problems, but these are
a lesser concern.

However, this approach will work only for devices that
have their own IOMMU, which can be configured to access system
memory. Otherwise, the memory has to be copied anyway
to the device's own buffers.

Moreover, copying a large amount of data should not happen
in kernel space.

All these reasons make userptr an unreliable feature that is
complex to implement.

So my rough idea was to remove USERPTR support from kernel
drivers (if possible, of course) and to provide an emulation
layer in userspace code like libv4l2.

Please note that it is only a rough idea. Just brainstorming :)
It is *too early* to start any discussion on this topic,
especially until DMABUF is mature enough to become a good
alternative to userptr.

Regards,
Tomasz Stanislawski

>
> Regards,
> Mauro
>

On Fri, 20 Apr 2012 10:41:37 +0200, Tomasz Stanislawski
<t.stanislaws@samsung.com> wrote:
>> Am I understanding wrong or are you saying that you want to drop userptr
>> from V4L2 API in long-term? If so, why?
>
> Dropping userptr is just some brainstorming idea.
> It was found out that userptr is not a good mean
> for buffer exchange between to two devices.

I can believe that. But I am also inclined to believe that DMABUF is
targeted at device-to-device transfers, while USERPTR is targeted at
device-to-user (or user-to-device) transfers. Are you saying applications
should use DMABUF and memory map the buffers? Or would you care to explain
how DMABUF addresses the problem space of USERPTR?

> The USERPTR simplifies userspace code but introduce
> a lot of complexity problems for the kernel drivers
> and frameworks.

It is not only a simplification. In some cases, USERPTR is the only I/O
method that supports zero copy in pretty much any circumstance. When the
user cannot reliably predict the maximum number of required buffers,
predicts a value larger than the device will negotiate, or needs buffers to
outlive STREAMOFF (?), MMAP requires memory copying. USERPTR does not.

Now, I do realize that some devices cannot support USERPTR efficiently;
those devices should not support USERPTR. But for those devices that can, it
seems quite a nice performance enhancement.

-- 
Rémi Denis-Courmont
Sent from my collocated server

Hi Remi,

On 04/20/2012 12:56 PM, Rémi Denis-Courmont wrote:
> On Fri, 20 Apr 2012 10:41:37 +0200, Tomasz Stanislawski
> <t.stanislaws@samsung.com> wrote:
>>> Am I understanding wrong or are you saying that you want to drop userptr
>>> from V4L2 API in long-term? If so, why?
>>
>> Dropping userptr is just some brainstorming idea.
>> It was found out that userptr is not a good mean
>> for buffer exchange between to two devices.
>
> I can believe that. But I am also inclined to believe that DMABUF is
> targetted at device-to-device transfer, while USERPTR is targetted at
> device-to-user (or user-to-device) transfers. Are you saying applications
> should use DMABUF and memory map the buffers?

No. As I said before, it is *too early* to drop userptr and expect
applications to use only DMABUF and MMAP. This was just a hypothetical idea.
DMABUF is dedicated to dev-dev transfers. However, given how quickly
DMABUF extensions are appearing, it may be expected
that one day it will support creating a DMA buffer from a user pointer.
There are already extensions for MMAP and cache synchronization.
Who knows what will happen in future versions.

However, these are only hypothetical issues.

> Or would you care to explain
> how DMABUF addresses the problem space of USERPTR?

By allowing a DMABUF to be attached to some userptr using some new magic IOCTL.
I think that sooner or later someone will find this feature useful.

>
>> The USERPTR simplifies userspace code but introduce
>> a lot of complexity problems for the kernel drivers
>> and frameworks.
>
> It is not only a simplification. In some cases, USERPTR is the only I/O
> method that supports zero copy in pretty much any circumstance.

Only for devices that have their own IOMMU that can access system memory.
Moreover, the userptr must come from malloc or be a mmaped file.
The other case is drivers that touch memory using the CPU in kernel
space, like VIVI or USB drivers.

> When the user cannot reliably predict the maximum number of required buffers,
> predicts a value larger than the device will negotiate, or needs buffers to
> outlive STREAMOFF (?), MMAP requires memory copying. USERPTR does not.

What does "outlive STREAMOFF" mean in this context?

Anyway, IMO allocating the buffers at VIDIOC_REQBUFS was not the best
idea, because it introduces an allocation overhead when negotiating
the number of buffers. Allocating at mmap was too late. There is a
need for some intermediate state between REQBUFS and mmap. The ioctl
BUF_PREPARE may help here.

Can you give me an example where a sane application is forced to negotiate a
larger number of buffers than it is actually going to use?

>
> Now, I do realize that some devices cannot support USERPTR efficiently,
> then they should not support USERPTR.

The problem is that there is *NO* device that can handle USERPTR reliably.
They can handle USERPTR generated by malloc or the page cache (not sure).
Memory mmaped from other devices, frameworks etc. may or may not work.
Even if the device has its own IOMMU, the DMA layer provides no generic way to
transform a mapping from one device into a mapping in some other device.

It is done using platform-dependent hacks like extracting PFNs from mappings,
hack-forming them into struct pages or scatterlists, mapping them and hoping
that the memory is not going to be released in some other thread.

The only sure way is to copy data from the userptr to an MMAP buffer.

> But for those devices that can, it
> seems quite a nice performance enhancement.

Userptr has its niches where it works pretty well, like web cams or VIVI.

I am saying that if DMABUF ever becomes a good alternative to USERPTR,
then maybe we should consider encouraging new drivers to drop USERPTR
as an 'obsolete' feature and providing some emulation layer in libv4l2
for legacy applications.

Regards,
Tomasz Stanislawski

On Fri, 20 Apr 2012 14:25:01 +0200, Tomasz Stanislawski
<t.stanislaws@samsung.com> wrote:
>>> The USERPTR simplifies userspace code but introduce
>>> a lot of complexity problems for the kernel drivers
>>> and frameworks.
>>
>> It is not only a simplification. In some cases, USERPTR is the only I/O
>> method that supports zero copy in pretty much any circumstance.
>
> Only for devices that have its own IOMMU that can access system memory.

Newer versions of the UVC driver have USERPTR, and seemingly gspca does
too. That is practically all USB capture devices... That might be
irrelevant for a smartphone manufacturer. That is very relevant for desktop
applications.

> Moreover the userptr must come from malloc or be a mmaped file.
> The other case are drivers that touch memory using CPU in the kernel
> space like VIVI or USB drivers.

I'd argue that USB is the most common case of V4L2 on the desktop...

>> When the user cannot reliably predict the maximum number of required
>> buffers, predicts a value larger than the device will negotiate, or
>> needs buffers to outlive STREAMOFF (?), MMAP requires memory copying.
>> USERPTR does not.
>
> What does outlive STREAMOFF means in this context?

Depending on how your multimedia pipeline is built, it is plausible that the
V4L2 source is shut down (STREAMOFF then close()) before buffers coming from
it are released/destroyed downstream. I might be wrong, but I would expect
that V4L2 MMAP buffers become invalid after STREAMOFF+close()?

> Anyway, IMO allocation of the buffers at VIDIOC_REQBUFS was not the best
> idea because it introduces an allocation overhead for negotiations of
> the number of the buffers. An allocation at mmap was to late. There is a
> need for some intermediate state between REQBUFS and mmap. The ioctl
> BUF_PREPARE may help here.
>
> Can you give me an example of a sane application is forced to negotiate a
> larger number of buffers than it is actually going to use?

Outside the embedded world, the application typically does not know what
the latency of the multimedia pipeline is. If the latency is not known, the
number of buffers needed for zero copy cannot be precomputed for REQBUFS,
say:

count = 1 + latency / frame interval.

Even for a trivial analog TV viewer application, lip synchronization
requires picture frames to be buffered long enough to account for
the latency of the audio input, dejitter, filtering and audio output. Those
values are usually not well determined at the time of requesting buffers
from the video capture device. Also, the application may want to play nice
with PulseAudio. Then it will get very long audio buffers with very few
audio periods... more latency.

It gets harder or outright impossible for frameworks dealing with
complicated or arbitrary pipelines such as LibVLC or gstreamer. There is
far too much unpredictability and variability downstream of the V4L2 source
to estimate latency and infer the number of buffers needed.

>> Now, I do realize that some devices cannot support USERPTR efficiently,
>> then they should not support USERPTR.
>
> The problem is not there is *NO* device that can handle USERPTR reliably.
> The can handle USERPTR generated by malloc or page cache (not sure).
> Memory mmaped from other devices, frameworks etc may or may not work.
> Even if the device has its IOMMU the DMA layer provides no generic way to
> transform from one device to the mapping in some other device.

I'm not saying that USERPTR should replace DMABUF. I'm saying USERPTR has
advantages over MMAP that DMABUF does not seem to cover as yet (if only
libv4l2 would not inhibit USERPTR...).

I'm definitely not saying that applications should rely on USERPTR being
supported. We agree that not all devices can support USERPTR.

> The userptr has its niches were it works pretty well like Web cams or VIVI.
> I am saying that if ever DMABUF becomes a good alternative for USERPTR
> than maybe we should consider encouraging dropping USERPTR in the new
> drivers as 'obsolete' feature and providing some emulation layer in libv4l2
> for legacy applications.

Sure.

-- 
Rémi Denis-Courmont
Sent from my collocated server
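
For readers following along, the estimate in the message above (count = 1 + latency / frame interval) can be sketched in C. This is only an illustration: `estimate_buffer_count` is a hypothetical helper, not a V4L2 call, and rounding the division up is an assumption of this sketch rather than something stated in the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of V4L2): estimate how many capture buffers
 * a zero-copy pipeline needs, following count = 1 + latency / frame interval.
 * The division is rounded up so that a partially elapsed frame interval
 * still gets a buffer of its own; that rounding is an assumption here.
 */
static unsigned int estimate_buffer_count(uint64_t latency_us,
                                          uint64_t frame_interval_us)
{
    assert(frame_interval_us > 0);
    return 1u + (unsigned int)((latency_us + frame_interval_us - 1) /
                               frame_interval_us);
}
```

With a 40 ms frame interval (25 fps) and 200 ms of downstream latency this yields 6 buffers. The difficulty described in the message is that `latency_us` is usually not known when REQBUFS is called, so the result can only be a guess at that point.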

On 20-04-2012 07:56, Rémi Denis-Courmont wrote:
> On Fri, 20 Apr 2012 10:41:37 +0200, Tomasz Stanislawski
> <t.stanislaws@samsung.com> wrote:
>>> Am I understanding wrong or are you saying that you want to drop userptr
>>> from V4L2 API in long-term? If so, why?
>>
>> Dropping userptr is just some brainstorming idea.
>> It was found out that userptr is not a good mean
>> for buffer exchange between to two devices.
>
> I can believe that. But I am also inclined to believe that DMABUF is
> targetted at device-to-device transfer, while USERPTR is targetted at
> device-to-user (or user-to-device) transfers. Are you saying applications
> should use DMABUF and memory map the buffers? Or would you care to explain
> how DMABUF addresses the problem space of USERPTR?

I agree with Rémi. Userptr was never meant to be used for dev2dev transfers.
The overlay mode was designed for that. I remember pointing this out a few
times on the mailing list. DMABUF is the proper replacement for the overlay
mode and, once it is fully implemented, we can deprecate and remove the
overlay mode.

>
>> The USERPTR simplifies userspace code but introduce
>> a lot of complexity problems for the kernel drivers
>> and frameworks.
>
> It is not only a simplification. In some cases, USERPTR is the only I/O
> method that supports zero copy in pretty much any circumstance. When the
> user cannot reliably predict the maximum number of required buffers,
> predicts a value larger than the device will negotiate, or needs buffers to
> outlive STREAMOFF (?), MMAP requires memory copying. USERPTR does not.

Yes, that's my understanding too. USERPTR helps to avoid buffer copying.

>
> Now, I do realize that some devices cannot support USERPTR efficiently,
> then they should not support USERPTR. But for those devices that can, it
> seems quite a nice performance enhancement.

Agreed.

A quick note about that: for USB devices, with the current implementations,
there will always be a copy inside the kernel, as the USB and other transport
headers must be removed. For them, the cost of MMAP and USERPTR is the same
(not all USB drivers export USERPTR, because of a limitation in
videobuf-vmalloc).

>> The problem is that memory mmaped to the userspace may
>> not be a part of the system memory. It often happens for
>> devices that use remap_pfn or dma_mmap_* to mmap the
>> memory to the userspace.
>>
>> It is was empirically conjured the it is not possible
>> to access this kind of memory by the other device
>> without a platform-specific hacks or workarounds.

As I warned in the past: USERPTR was never meant to be used for dev2dev
transfers.

>>
>> The DMABUF was introduced to help in such a case.
>>
>> The basic short-term idea is to drop userptr support for
>> buffers that are MMAPed by other device.

You should, instead, just drop userptr support on devices where DMA
scatter/gather is not supported, and migrate all dev2dev use cases to DMABUF.

>>
>> The userptr will be used for memory allocated using malloc
>> (anonymous pages) or (maybe) mmaped files. There are of
>> course cache synchronization problems but there are
>> a lesser concern.
>>
>> However this approach will work only for devices that
>> have its own IOMMU which can be configured to access system
>> memory. Otherwise, the memory has to copied anyway
>> to device's own buffers.
>>
>> Moreover copying a large amount of data should not happen
>> in the kernel-space.
>>
>> All the reasons make userptr an unreliable and complex to
>> implement feature.
>>
>> So my rough-idea was to remove USERPTR support from kernel
>> drivers (if possible of course) and to provide an emulation
>> layer in the userspace code like libv4l2.
>>
>> Please note that it is only a rough idea. Just brainstorming :)

> It is *too early* to start any discussion on this topic.
> Especially until DMABUF is mature enough to become a good
> alternative for userptr.

Looking at the whole picture, dropping USERPTR would only make
sense if it is broken for dev2user (or user2dev) transfers.

Dropping its usage for dev2dev transfers makes sense, once
DMABUF is implemented.

Yet, if some userspace application wants to abuse USERPTR in order
to use it for dev2dev transfers, there's not much that can be done at
the kernel level.

It makes sense to put a big warning in the V4L2 docs saying that this
is not officially supported and can cause all sorts of issues on
the machine/system.

Regards,
Mauro

On 20-04-2012 09:25, Tomasz Stanislawski wrote:
> Hi Remi,

>> Now, I do realize that some devices cannot support USERPTR efficiently,
>> then they should not support USERPTR.
>
> The problem is not there is *NO* device that can handle USERPTR reliably.
> The can handle USERPTR generated by malloc or page cache (not sure).
> Memory mmaped from other devices, frameworks etc may or may not work.
> Even if the device has its IOMMU the DMA layer provides no generic way to
> transform from one device to the mapping in some other device.
>
> It is done using platform-defendant hacks like extracting PFNs from mappings,
> hack-forming them into struct pages or scatterlists, mapping it and hoping
> that the memory is not going to release it in some other thread.
>
> The only sure way is to copy data from userptr to MMAP buffer.

Everything you're describing is related to the userptr abuse that happened on
embedded devices: using it for something it was never meant for (dev2dev).

Until the DMABUF patches are applied, there's just one mode defined in
the V4L2 API for dev2dev: overlay mode[1].

Most embedded applications and drivers decided, instead of using overlay mode,
to abuse userptr to do dev2dev. As you've pointed out, it was noticed in
practice that this sometimes fails.

Yes, such abuse should be dropped, and DMABUF is the right way to address it.

That doesn't mean that USERPTR should be dropped for what it was
originally created for: dev2user or user2dev.

Regards,
Mauro

[1] Even so, not all PC motherboards are capable of supporting the overlay
mode: it is known that several chipsets have problems in their DMA engines,
which cause data loss when a DMA transfer happens without passing
through the system main memory (PCI2PCI transfers). So, drivers check
the PCI quirks table to detect whether dev2dev is supported before exposing
overlay mode to userspace.

Hi Rémi,

On Friday 20 April 2012 15:03:17 Rémi Denis-Courmont wrote:
> On Fri, 20 Apr 2012 14:25:01 +0200, Tomasz Stanislawski wrote:
> >>> The USERPTR simplifies userspace code but introduce
> >>> a lot of complexity problems for the kernel drivers
> >>> and frameworks.
> >>
> >> It is not only a simplification. In some cases, USERPTR is the only I/O
> >> method that supports zero copy in pretty much any circumstance.
> >
> > Only for devices that have its own IOMMU that can access system memory.
>
> Newer versions of the UVC driver have USERTPR, and simingly gspca seems
> too. That is practically all USB capture devices... That might be
> irrelevant for a smartphone manufacturer. That is very relevant for desktop
> applications.
>
> > Moreover the userptr must come from malloc or be a mmaped file.
> > The other case are drivers that touch memory using CPU in the kernel
> > space like VIVI or USB drivers.
>
> I'd argue that USB is the most common case of V4L2 on the desktop...
>
> >> When the user cannot reliably predict the maximum number of required
> >> buffers, predicts a value larger than the device will negotiate, or
> >> needs buffers to outlive STREAMOFF (?), MMAP requires memory copying.
> >> USERPTR does not.
> >
> > What does outlive STREAMOFF means in this context?
>
> Depending how your multimedia pipeline is built, it is plausible that the
> V4L2 source is shutdown (STREAMOFF then close()) before buffers coming from
> it are released/destroyed downstream. I might be wrong, but I would expect
> that V4L2 MMAP buffers become invalid after STREAMOFF+close()?

If the buffer is mmap()ed to userspace, it will not be freed before being
munmap()ed.

> > Anyway, IMO allocation of the buffers at VIDIOC_REQBUFS was not the best
> > idea because it introduces an allocation overhead for negotiations of
> > the number of the buffers. An allocation at mmap was to late. There is a
> > need for some intermediate state between REQBUFS and mmap. The ioctl
> > BUF_PREPARE may help here.
> >
> > Can you give me an example of a sane application is forced to negotiate
> > a larger number of buffers than it is actually going to use?
>
> Outside the embedded world, the application typically does not know what
> the latency of the multimedia pipeline is. If the latency is not known, the
> number of buffers needed for zero copy cannot be precomputed for REQBUFS,
> say:
>
> count = 1 + latency / frame interval.
>
> Even for a trivial analog TV viewer application, lip synchronization
> requires picture frames to be bufferred to be long enough to account for
> the latency of the audio input, dejitter, filtering and audio output. Those
> values are usually not well determined at the time of requesting buffers
> from the video capture device. Also the application may want to play nice
> with PulseAudio. Then it will get very long audio buffers with very few
> audio periods... more latency.
>
> It gets harder or outright impossible for frameworks dealing with
> complicated or arbitrary pipelines such as LibVLC or gstreamer. There is
> far too much unpredictability and variability downstream of the V4L2 source
> to estimate latency, and infer the number of buffers needed.

If I'm not mistaken, VIDIOC_CREATE_BUFS allows you to create additional buffers
at runtime. You can thus cope with a latency increase (provided that the
allocation overhead isn't prohibitive, in which case you're stuck whatever
method you select). Deleting buffers at runtime is currently not possible
though.

> >> Now, I do realize that some devices cannot support USERPTR efficiently,
> >> then they should not support USERPTR.
> >
> > The problem is not there is *NO* device that can handle USERPTR reliably.
> > The can handle USERPTR generated by malloc or page cache (not sure).
> > Memory mmaped from other devices, frameworks etc may or may not work.
> > Even if the device has its IOMMU the DMA layer provides no generic way to
> > transform from one device to the mapping in some other device.
>
> I'm not saying that USERPTR should replace DMABUF. I'm saying USERPTR has
> advantages over MMAP that DMABUF does not seem to cover as yet (if only
> libv4l2 would not inhibit USERPTR...).
>
> I'm definitely not saying that applications should rely on USERPTR being
> supported. We agree that not all devices can support USERPTR.
>
> > The userptr has its niches were it works pretty well like Web cams or
> > VIVI.
> >
> > I am saying that if ever DMABUF becomes a good alternative for USERPTR
> > than maybe we should consider encouraging dropping USERPTR in the new
> > drivers as 'obsolete' feature and providing some emulation layer in
> > libv4l2 for legacy applications.
>
> Sure.

Hi Mauro,

On Friday, April 20, 2012 3:37 PM Mauro Carvalho Chehab wrote:

(snipped)

> >> So my rough-idea was to remove USERPTR support from kernel
> >> drivers (if possible of course) and to provide an emulation
> >> layer in the userspace code like libv4l2.
> >>
> >> Please note that it is only a rough idea. Just brainstorming :)
>
> > It is *too early* to start any discussion on this topic.
> > Especially until DMABUF is mature enough to become a good
> > alternative for userptr.
>
> Looking at the hole picture, dropping USERPTR would only make
> sense if it is broken on dev2user (or user2dev) transfers.
>
> Dropping its usage on dev2dev transfers makes sense, after having
> DMABUF implemented.
>
> Yet, if some userspace application wants to abuse of USERPTR in order
> to use it for dev2dev transfer, there's not much that can be done at
> Kernel level.
>
> It makes sense to put a big warn at the V4L2 Docs telling that this
> is not officially supported and can cause all sorts of issues at
> the machine/system.

Please note that all current drivers which use videobuf/videobuf2-dma-contig
are able to use the userptr memory access method only with physically contiguous
memory. This means that in fact they work only with buffers that come from other
devices, and dev2dev transfers are the only possibility. malloc()ed memory
buffers are rejected.

Best regards

On 23-04-2012 07:50, Marek Szyprowski wrote:
> Hi Mauro,
>
> On Friday, April 20, 2012 3:37 PM Mauro Carvalho Chehab wrote:
>
> (snipped)
>
>>>> So my rough-idea was to remove USERPTR support from kernel
>>>> drivers (if possible of course) and to provide an emulation
>>>> layer in the userspace code like libv4l2.
>>>>
>>>> Please note that it is only a rough idea. Just brainstorming :)
>>
>>> It is *too early* to start any discussion on this topic.
>>> Especially until DMABUF is mature enough to become a good
>>> alternative for userptr.
>>
>> Looking at the hole picture, dropping USERPTR would only make
>> sense if it is broken on dev2user (or user2dev) transfers.
>>
>> Dropping its usage on dev2dev transfers makes sense, after having
>> DMABUF implemented.
>>
>> Yet, if some userspace application wants to abuse of USERPTR in order
>> to use it for dev2dev transfer, there's not much that can be done at
>> Kernel level.
>>
>> It makes sense to put a big warn at the V4L2 Docs telling that this
>> is not officially supported and can cause all sorts of issues at
>> the machine/system.
>
> Please note that all current drivers which use videobuf/videobuf2-dma-contig
> are able to use userptr memory access method only with physically contiguous
> memory.

Yes.

> This means that in fact they work only buffers, which come from other
> devices and dev2dev transfers are the only possibility. malloc()ed memory
> buffers are rejected.

Fragmented buffers can be detected at the kernel level, and VB/VB2 can
refuse fragmented memory when the hardware doesn't support it.

However, checking whether the buffer is fragmented is not a safe way to detect
that the buffer will be used for a dev2dev transfer. If the buffers are
allocated with malloc() very soon after boot time, or if they are allocated
in some different way (like reducing the max RAM area addressed by the
kernel, or using CMA or a similar approach), it could be possible to use
videobuf(1/2)-dma-contig for userptr with user2dev/dev2user transfers.
This is actually done in some cases (for example, where the capture device
only supports contiguous buffers).

If, for some reason, the hardware doesn't support dev2dev transfers in a
reliable way, some other strategy should be used.

Regards,
Mauro
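
To make the "fragmented buffers can be detected" point concrete, here is a small userspace sketch of the underlying check. It is not the videobuf/videobuf2 code: `pfns_are_contiguous` is a hypothetical helper, and it assumes the page frame numbers backing the user buffer have already been collected into an array, which in a real driver happens while pinning the user pages.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: returns true when the n page frame numbers in pfn[]
 * describe one physically contiguous run (each PFN is its predecessor + 1).
 * An empty or single-page buffer is trivially contiguous.
 */
static bool pfns_are_contiguous(const unsigned long *pfn, size_t n)
{
    for (size_t i = 1; i < n; i++) {
        if (pfn[i] != pfn[i - 1] + 1)
            return false;
    }
    return true;
}
```

A dma-contig style allocator would reject a userptr buffer for which such a check fails. As the message above points out, passing the check says nothing about where the memory came from, so it cannot tell a dev2dev buffer apart from a malloc()ed buffer that happens to be contiguous.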
diff --git a/Documentation/DocBook/media/v4l/compat.xml b/Documentation/DocBook/media/v4l/compat.xml index bce97c5..2a2083d 100644 --- a/Documentation/DocBook/media/v4l/compat.xml +++ b/Documentation/DocBook/media/v4l/compat.xml @@ -2523,6 +2523,10 @@ ioctls.</para> <listitem> <para>Selection API. <xref linkend="selection-api" /></para> </listitem> + <listitem> + <para>Importing DMABUF file descriptors as a new IO method described + in <xref linkend="dmabuf" />.</para> + </listitem> </itemizedlist> </section> diff --git a/Documentation/DocBook/media/v4l/io.xml b/Documentation/DocBook/media/v4l/io.xml index b815929..dc5979d 100644 --- a/Documentation/DocBook/media/v4l/io.xml +++ b/Documentation/DocBook/media/v4l/io.xml @@ -472,6 +472,162 @@ rest should be evident.</para> </footnote></para> </section> + <section id="dmabuf"> + <title>Streaming I/O (DMA buffer importing)</title> + + <note> + <title>Experimental</title> + <para>This is an <link linkend="experimental"> experimental </link> + interface and may change in the future.</para> + </note> + +<para>The DMABUF framework provides a generic mean for sharing buffers between + multiple devices. Device drivers that support DMABUF can export a DMA buffer +to userspace as a file descriptor (known as the exporter role), import a DMA +buffer from userspace using a file descriptor previously exported for a +different or the same device (known as the importer role), or both. This +section describes the DMABUF importer role API in V4L2.</para> + +<para>Input and output devices support the streaming I/O method when the +<constant>V4L2_CAP_STREAMING</constant> flag in the +<structfield>capabilities</structfield> field of &v4l2-capability; returned by +the &VIDIOC-QUERYCAP; ioctl is set. 
Whether importing DMA buffers through
+DMABUF file descriptors is supported is determined by calling the
+&VIDIOC-REQBUFS; ioctl with the memory type set to
+<constant>V4L2_MEMORY_DMABUF</constant>.</para>
+
+ <para>This I/O method is dedicated to sharing DMA buffers between V4L and
+other APIs. Buffers (planes) are allocated by a driver on behalf of the
+application, and exported to the application as file descriptors using an API
+specific to the allocator driver. Only such file descriptors are exchanged;
+the descriptors and meta-information are passed in &v4l2-buffer; (or in
+&v4l2-plane; in the multi-planar API case). The driver must be switched into
+DMABUF I/O mode by calling the &VIDIOC-REQBUFS; with the desired buffer type.
+No buffers (planes) are allocated beforehand; consequently they are not indexed
+and cannot be queried like mapped buffers with the
+<constant>VIDIOC_QUERYBUF</constant> ioctl.</para>
+
+ <example>
+ <title>Initiating streaming I/O with DMABUF file descriptors</title>
+
+ <programlisting>
+&v4l2-requestbuffers; reqbuf;
+
+memset (&reqbuf, 0, sizeof (reqbuf));
+reqbuf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+reqbuf.memory = V4L2_MEMORY_DMABUF;
+
+if (ioctl (fd, &VIDIOC-REQBUFS;, &reqbuf) == -1) {
+    if (errno == EINVAL)
+        printf ("Video capturing or DMABUF streaming is not supported\n");
+    else
+        perror ("VIDIOC_REQBUFS");
+
+    exit (EXIT_FAILURE);
+}
+ </programlisting>
+ </example>
+
+ <para>Buffer (plane) file descriptors are passed on the fly with the &VIDIOC-QBUF;
+ioctl.
In case of multiplanar buffers, every plane can be associated with a
+different DMABUF descriptor. Although buffers are commonly cycled, applications
+can pass a different DMABUF descriptor at each <constant>VIDIOC_QBUF</constant>
+call.</para>
+
+ <example>
+ <title>Queueing DMABUF using single plane API</title>
+
+ <programlisting>
+int buffer_queue(int v4lfd, int index, int dmafd)
+{
+    &v4l2-buffer; buf;
+
+    memset(&buf, 0, sizeof buf);
+    buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE;
+    buf.memory = V4L2_MEMORY_DMABUF;
+    buf.index = index;
+    buf.m.fd = dmafd;
+
+    if (ioctl (v4lfd, &VIDIOC-QBUF;, &buf) == -1) {
+        perror ("VIDIOC_QBUF");
+        return -1;
+    }
+
+    return 0;
+}
+ </programlisting>
+ </example>
+
+ <example>
+ <title>Queueing DMABUF using multi plane API</title>
+
+ <programlisting>
+int buffer_queue_mp(int v4lfd, int index, int dmafd[], int n_planes)
+{
+    &v4l2-buffer; buf;
+    &v4l2-plane; planes[VIDEO_MAX_PLANES];
+    int i;
+
+    memset(&buf, 0, sizeof buf);
+    buf.type = V4L2_BUF_TYPE_VIDEO_CAPTURE_MPLANE;
+    buf.memory = V4L2_MEMORY_DMABUF;
+    buf.index = index;
+    buf.m.planes = planes;
+    buf.length = n_planes;
+
+    memset(&planes, 0, sizeof planes);
+
+    for (i = 0; i < n_planes; ++i)
+        buf.m.planes[i].m.fd = dmafd[i];
+
+    if (ioctl (v4lfd, &VIDIOC-QBUF;, &buf) == -1) {
+        perror ("VIDIOC_QBUF");
+        return -1;
+    }
+
+    return 0;
+}
+ </programlisting>
+ </example>
+
+ <para>Filled or displayed buffers are dequeued with the
+&VIDIOC-DQBUF; ioctl. The driver can unpin the buffer at any
+time between the completion of the DMA and this ioctl. The memory is
+also unpinned when &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; is called, or
+when the device is closed.</para>
+
+ <para>For capturing applications it is customary to enqueue a
+number of empty buffers, to start capturing and enter the read loop.
+Here the application waits until a filled buffer can be dequeued, and
+re-enqueues the buffer when the data is no longer needed.
Output
+applications fill and enqueue buffers; when enough buffers are stacked
+up, output is started. In the write loop, when the application
+runs out of free buffers it must wait until an empty buffer can be
+dequeued and reused. Two methods exist to suspend execution of the
+application until one or more buffers can be dequeued. By default
+<constant>VIDIOC_DQBUF</constant> blocks when no buffer is in the
+outgoing queue. When the <constant>O_NONBLOCK</constant> flag was
+given to the &func-open; function, <constant>VIDIOC_DQBUF</constant>
+returns immediately with an &EAGAIN; when no buffer is available. The
+&func-select; and &func-poll; functions are always available.</para>
+
+ <para>To start and stop capturing or output, applications call the
+&VIDIOC-STREAMON; and &VIDIOC-STREAMOFF; ioctls. Note that
+<constant>VIDIOC_STREAMOFF</constant> removes all buffers from both queues and
+unlocks/unpins all buffers as a side effect. Since there is no notion of doing
+anything "now" on a multitasking system, if an application needs to synchronize
+with another event it should examine the &v4l2-buffer;
+<structfield>timestamp</structfield> of captured buffers, or set the field
+before enqueuing buffers for output.</para>
+
+ <para>Drivers implementing DMABUF importing I/O must support the
+<constant>VIDIOC_REQBUFS</constant>, <constant>VIDIOC_QBUF</constant>,
+<constant>VIDIOC_DQBUF</constant>, <constant>VIDIOC_STREAMON</constant> and
+<constant>VIDIOC_STREAMOFF</constant> ioctls, and the <function>select()</function>
+and <function>poll()</function> functions.</para>
+
+ </section>
+
 <section id="async">
 <title>Asynchronous I/O</title>
@@ -671,6 +827,14 @@ memory, set by the application. See <xref linkend="userp" /> for details.
 <structname>v4l2_buffer</structname> structure.</entry>
 </row>
 <row>
+ <entry></entry>
+ <entry>int</entry>
+ <entry><structfield>fd</structfield></entry>
+ <entry>For the single-plane API and when
+<structfield>memory</structfield> is <constant>V4L2_MEMORY_DMABUF</constant> this
+is the file descriptor associated with a DMABUF buffer.</entry>
+ </row>
+ <row>
 <entry>__u32</entry>
 <entry><structfield>length</structfield></entry>
 <entry></entry>
@@ -746,6 +910,15 @@ should set this to 0.</entry>
 </entry>
 </row>
 <row>
+ <entry></entry>
+ <entry>int</entry>
+ <entry><structfield>fd</structfield></entry>
+ <entry>When the memory type in the containing &v4l2-buffer; is
+ <constant>V4L2_MEMORY_DMABUF</constant>, this is a file
+ descriptor associated with a DMABUF buffer, similar to the
+ <structfield>fd</structfield> field in &v4l2-buffer;.</entry>
+ </row>
+ <row>
 <entry>__u32</entry>
 <entry><structfield>data_offset</structfield></entry>
 <entry></entry>
@@ -980,6 +1153,12 @@ pointer</link> I/O.</entry>
 <entry>3</entry>
 <entry>[to do]</entry>
 </row>
+ <row>
+ <entry><constant>V4L2_MEMORY_DMABUF</constant></entry>
+ <entry>4</entry>
+ <entry>The buffer is used for <link linkend="dmabuf">DMA shared
+buffer</link> I/O.</entry>
+ </row>
 </tbody>
 </tgroup>
 </table>
diff --git a/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml b/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml
index 73ae8a6..adc92be 100644
--- a/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml
+++ b/Documentation/DocBook/media/v4l/vidioc-create-bufs.xml
@@ -98,6 +98,7 @@ information.</para>
 <entry><structfield>memory</structfield></entry>
 <entry>Applications set this field to
 <constant>V4L2_MEMORY_MMAP</constant> or
+<constant>V4L2_MEMORY_DMABUF</constant> or
 <constant>V4L2_MEMORY_USERPTR</constant>.</entry>
 </row>
 <row>
diff --git a/Documentation/DocBook/media/v4l/vidioc-qbuf.xml b/Documentation/DocBook/media/v4l/vidioc-qbuf.xml
index 9caa49a..cb5f5ff 100644
---
a/Documentation/DocBook/media/v4l/vidioc-qbuf.xml
+++ b/Documentation/DocBook/media/v4l/vidioc-qbuf.xml
@@ -112,6 +112,21 @@ they cannot be swapped out to disk. Buffers remain locked until dequeued, until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl is called, or until the device is closed.</para>
+ <para>To enqueue a <link linkend="dmabuf">DMABUF</link> buffer applications
+set the <structfield>memory</structfield> field to
+<constant>V4L2_MEMORY_DMABUF</constant> and the <structfield>m.fd</structfield>
+field to a file descriptor associated with a DMABUF buffer. When the multi-planar API is
+used, the <structfield>m.fd</structfield> fields of the passed array of &v4l2-plane;
+have to be used instead. When <constant>VIDIOC_QBUF</constant> is called with a
+pointer to this structure the driver sets the
+<constant>V4L2_BUF_FLAG_QUEUED</constant> flag and clears the
+<constant>V4L2_BUF_FLAG_MAPPED</constant> and
+<constant>V4L2_BUF_FLAG_DONE</constant> flags in the
+<structfield>flags</structfield> field, or it returns an error code. This
+ioctl locks the buffer. Buffers remain locked until dequeued,
+until the &VIDIOC-STREAMOFF; or &VIDIOC-REQBUFS; ioctl is called, or until the
+device is closed.</para>
+
 <para>Applications call the <constant>VIDIOC_DQBUF</constant> ioctl to dequeue a filled (capturing) or displayed (output) buffer from the driver's outgoing queue. They just set the
diff --git a/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml b/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml
index 7be4b1d..e3e709b 100644
--- a/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml
+++ b/Documentation/DocBook/media/v4l/vidioc-reqbufs.xml
@@ -48,28 +48,30 @@
 <refsect1>
 <title>Description</title>
- <para>This ioctl is used to initiate <link linkend="mmap">memory
-mapped</link> or <link linkend="userp">user pointer</link>
-I/O.
Memory mapped buffers are located in device memory and must be
-allocated with this ioctl before they can be mapped into the
-application's address space. User buffers are allocated by
-applications themselves, and this ioctl is merely used to switch the
-driver into user pointer I/O mode and to setup some internal structures.</para>
+<para>This ioctl is used to initiate <link linkend="mmap">memory mapped</link>,
+<link linkend="userp">user pointer</link> or <link
+linkend="dmabuf">DMABUF</link> based I/O. Memory mapped buffers are located in
+device memory and must be allocated with this ioctl before they can be mapped
+into the application's address space. User buffers are allocated by
+applications themselves, and this ioctl is merely used to switch the driver
+into user pointer I/O mode and to setup some internal structures.
+Similarly, DMABUF buffers are allocated by applications through a device
+driver, and this ioctl only configures the driver into DMABUF I/O mode without
+performing any direct allocation.</para>
- <para>To allocate device buffers applications initialize all
-fields of the <structname>v4l2_requestbuffers</structname> structure.
-They set the <structfield>type</structfield> field to the respective
-stream or buffer type, the <structfield>count</structfield> field to
-the desired number of buffers, <structfield>memory</structfield>
-must be set to the requested I/O method and the <structfield>reserved</structfield> array
-must be zeroed. When the ioctl
-is called with a pointer to this structure the driver will attempt to allocate
-the requested number of buffers and it stores the actual number
-allocated in the <structfield>count</structfield> field. It can be
-smaller than the number requested, even zero, when the driver runs out
-of free memory. A larger number is also possible when the driver requires
-more buffers to function correctly.
For example video output requires at least two buffers,
-one displayed and one filled by the application.</para>
+ <para>To allocate device buffers applications initialize all fields of the
+<structname>v4l2_requestbuffers</structname> structure. They set the
+<structfield>type</structfield> field to the respective stream or buffer type,
+the <structfield>count</structfield> field to the desired number of buffers,
+<structfield>memory</structfield> must be set to the requested I/O method and
+the <structfield>reserved</structfield> array must be zeroed. When the ioctl is
+called with a pointer to this structure the driver will attempt to allocate the
+requested number of buffers and it stores the actual number allocated in the
+<structfield>count</structfield> field. It can be smaller than the number
+requested, even zero, when the driver runs out of free memory. A larger number
+is also possible when the driver requires more buffers to function correctly.
+For example video output requires at least two buffers, one displayed and one
+filled by the application.</para>
 <para>When the I/O method is not supported the ioctl returns an &EINVAL;.</para>
@@ -102,7 +104,8 @@ as the &v4l2-format; <structfield>type</structfield> field. See <xref
 <entry>&v4l2-memory;</entry>
 <entry><structfield>memory</structfield></entry>
 <entry>Applications set this field to
-<constant>V4L2_MEMORY_MMAP</constant> or
+<constant>V4L2_MEMORY_MMAP</constant>,
+<constant>V4L2_MEMORY_DMABUF</constant> or
 <constant>V4L2_MEMORY_USERPTR</constant>.</entry>
 </row>
 <row>