Message ID | 1458007931-14432-1-git-send-email-thunder.leizhen@huawei.com |
---|---|
State | New |
Headers | show |
On Tue, Mar 15, 2016 at 10:12:11AM +0800, Zhen Lei wrote: > 1. In swiotlb_alloc_coherent, the branch of __get_free_pages. Directly > return vaddr on success, and pass vaddr to free_pages on failure. > 2. So, we can directly transparent pass vaddr from __dma_free to > swiotlb_free_coherent, keep consistent with swiotlb_alloc_coherent. > > This patch have no functional change, I don't think so. > but can obtain a bit performance improvement. Have you actually measured it? > diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c > index a6e757c..b2f2834 100644 > --- a/arch/arm64/mm/dma-mapping.c > +++ b/arch/arm64/mm/dma-mapping.c > @@ -187,8 +187,6 @@ static void __dma_free(struct device *dev, size_t size, > void *vaddr, dma_addr_t dma_handle, > struct dma_attrs *attrs) > { > - void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle)); > - > size = PAGE_ALIGN(size); > > if (!is_device_dma_coherent(dev)) { > @@ -196,7 +194,7 @@ static void __dma_free(struct device *dev, size_t size, > return; > vunmap(vaddr); > } > - __dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs); > + __dma_free_coherent(dev, size, vaddr, dma_handle, attrs); > } What happens when !is_device_dma_coherent(dev)? (hint: read two lines above __dma_free_coherent). -- Catalin
On 2016/3/15 23:37, Catalin Marinas wrote: > On Tue, Mar 15, 2016 at 10:12:11AM +0800, Zhen Lei wrote: >> 1. In swiotlb_alloc_coherent, the branch of __get_free_pages. Directly >> return vaddr on success, and pass vaddr to free_pages on failure. >> 2. So, we can directly transparent pass vaddr from __dma_free to >> swiotlb_free_coherent, keep consistent with swiotlb_alloc_coherent. >> >> This patch have no functional change, > > I don't think so. > >> but can obtain a bit performance improvement. > > Have you actually measured it? I have not run any performance testing, but reduced a line of code. So I said "a bit". > >> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c >> index a6e757c..b2f2834 100644 >> --- a/arch/arm64/mm/dma-mapping.c >> +++ b/arch/arm64/mm/dma-mapping.c >> @@ -187,8 +187,6 @@ static void __dma_free(struct device *dev, size_t size, >> void *vaddr, dma_addr_t dma_handle, >> struct dma_attrs *attrs) >> { >> - void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle)); >> - >> size = PAGE_ALIGN(size); >> >> if (!is_device_dma_coherent(dev)) { >> @@ -196,7 +194,7 @@ static void __dma_free(struct device *dev, size_t size, >> return; >> vunmap(vaddr); >> } >> - __dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs); >> + __dma_free_coherent(dev, size, vaddr, dma_handle, attrs); >> } > > What happens when !is_device_dma_coherent(dev)? (hint: read two lines > above __dma_free_coherent). > The whole function of __dma_free as below: (nobody use swiotlb_addr except __dma_free_coherent) static void __dma_free(struct device *dev, size_t size, void *vaddr, dma_addr_t dma_handle, struct dma_attrs *attrs) { void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle)); size = PAGE_ALIGN(size); if (!is_device_dma_coherent(dev)) { if (__free_from_pool(vaddr, size)) return; vunmap(vaddr); } __dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs); }
On 2016/3/16 9:56, Leizhen (ThunderTown) wrote: > > > On 2016/3/15 23:37, Catalin Marinas wrote: >> On Tue, Mar 15, 2016 at 10:12:11AM +0800, Zhen Lei wrote: >>> 1. In swiotlb_alloc_coherent, the branch of __get_free_pages. Directly >>> return vaddr on success, and pass vaddr to free_pages on failure. >>> 2. So, we can directly transparent pass vaddr from __dma_free to >>> swiotlb_free_coherent, keep consistent with swiotlb_alloc_coherent. >>> >>> This patch have no functional change, >> >> I don't think so. >> >>> but can obtain a bit performance improvement. >> >> Have you actually measured it? > I have not run any performance testing, but reduced a line of code. So I said "a bit". > >> >>> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c >>> index a6e757c..b2f2834 100644 >>> --- a/arch/arm64/mm/dma-mapping.c >>> +++ b/arch/arm64/mm/dma-mapping.c >>> @@ -187,8 +187,6 @@ static void __dma_free(struct device *dev, size_t size, >>> void *vaddr, dma_addr_t dma_handle, >>> struct dma_attrs *attrs) >>> { >>> - void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle)); >>> - >>> size = PAGE_ALIGN(size); >>> >>> if (!is_device_dma_coherent(dev)) { >>> @@ -196,7 +194,7 @@ static void __dma_free(struct device *dev, size_t size, >>> return; >>> vunmap(vaddr); >>> } >>> - __dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs); >>> + __dma_free_coherent(dev, size, vaddr, dma_handle, attrs); >>> } >> >> What happens when !is_device_dma_coherent(dev)? (hint: read two lines >> above __dma_free_coherent). Do you afraid "vaddr" maybe modified by these statement? First, it could not be __free_from_pool. Otherwise, the function vunmap(which after it) can not work well. Then, it count not be vunmap too, the parameter is defined as "const void *". In the call chain: __dma_free_coherent-->__dma_free_coherent-->swiotlb_free_coherent, only swiotlb_free_coherent finally use "vaddr". >> > The whole function of __dma_free as below: (nobody use swiotlb_addr except __dma_free_coherent) > static void __dma_free(struct device *dev, size_t size, > void *vaddr, dma_addr_t dma_handle, > struct dma_attrs *attrs) > { > void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle)); > > size = PAGE_ALIGN(size); > > if (!is_device_dma_coherent(dev)) { > if (__free_from_pool(vaddr, size)) > return; > vunmap(vaddr); > } > __dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs); > } >
On Thu, Mar 17, 2016 at 07:06:27PM +0800, Leizhen (ThunderTown) wrote: > On 2016/3/16 9:56, Leizhen (ThunderTown) wrote: > > On 2016/3/15 23:37, Catalin Marinas wrote: > >> On Tue, Mar 15, 2016 at 10:12:11AM +0800, Zhen Lei wrote: > >>> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c > >>> index a6e757c..b2f2834 100644 > >>> --- a/arch/arm64/mm/dma-mapping.c > >>> +++ b/arch/arm64/mm/dma-mapping.c > >>> @@ -187,8 +187,6 @@ static void __dma_free(struct device *dev, size_t size, > >>> void *vaddr, dma_addr_t dma_handle, > >>> struct dma_attrs *attrs) > >>> { > >>> - void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle)); > >>> - > >>> size = PAGE_ALIGN(size); > >>> > >>> if (!is_device_dma_coherent(dev)) { > >>> @@ -196,7 +194,7 @@ static void __dma_free(struct device *dev, size_t size, > >>> return; > >>> vunmap(vaddr); > >>> } > >>> - __dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs); > >>> + __dma_free_coherent(dev, size, vaddr, dma_handle, attrs); > >>> } > >> > >> What happens when !is_device_dma_coherent(dev)? (hint: read two lines > >> above __dma_free_coherent). > > Do you afraid "vaddr" maybe modified by these statement? > First, it could not be __free_from_pool. Otherwise, the function > vunmap(which after it) can not work well. Then, it count not be vunmap > too, the parameter is defined as "const void *". > > In the call chain: > __dma_free_coherent-->__dma_free_coherent-->swiotlb_free_coherent, > only swiotlb_free_coherent finally use "vaddr". Exactly. So you give swiotlb_free_coherent a vaddr which has been unmapped. It doesn't even matter whether it's still mapped since this address is passed further to free_pages() which performs a virt_to_page(). The latter is *only* valid on linear map addresses (and you would actually hit the VM_BUG_ON in free_pages; you can try running this with CONFIG_DEBUG_VM enabled and non-coherent DMA). For non-coherent DMA, the vaddr is not part of the linear mapping as it has been remapped by __dma_alloc() via dma_common_contiguous_remap(), hence for swiotlb freeing we need the actual linear map address (the original "ptr" in __dma_alloc()). We can generate it by a phys_to_virt(dma_to_phys(dma_handle)). -- Catalin
On 2016/3/17 19:59, Catalin Marinas wrote: > On Thu, Mar 17, 2016 at 07:06:27PM +0800, Leizhen (ThunderTown) wrote: >> On 2016/3/16 9:56, Leizhen (ThunderTown) wrote: >>> On 2016/3/15 23:37, Catalin Marinas wrote: >>>> On Tue, Mar 15, 2016 at 10:12:11AM +0800, Zhen Lei wrote: >>>>> diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c >>>>> index a6e757c..b2f2834 100644 >>>>> --- a/arch/arm64/mm/dma-mapping.c >>>>> +++ b/arch/arm64/mm/dma-mapping.c >>>>> @@ -187,8 +187,6 @@ static void __dma_free(struct device *dev, size_t size, >>>>> void *vaddr, dma_addr_t dma_handle, >>>>> struct dma_attrs *attrs) >>>>> { >>>>> - void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle)); >>>>> - >>>>> size = PAGE_ALIGN(size); >>>>> >>>>> if (!is_device_dma_coherent(dev)) { >>>>> @@ -196,7 +194,7 @@ static void __dma_free(struct device *dev, size_t size, >>>>> return; >>>>> vunmap(vaddr); >>>>> } >>>>> - __dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs); >>>>> + __dma_free_coherent(dev, size, vaddr, dma_handle, attrs); >>>>> } >>>> >>>> What happens when !is_device_dma_coherent(dev)? (hint: read two lines >>>> above __dma_free_coherent). >> >> Do you afraid "vaddr" maybe modified by these statement? >> First, it could not be __free_from_pool. Otherwise, the function >> vunmap(which after it) can not work well. Then, it count not be vunmap >> too, the parameter is defined as "const void *". >> >> In the call chain: >> __dma_free_coherent-->__dma_free_coherent-->swiotlb_free_coherent, >> only swiotlb_free_coherent finally use "vaddr". > > Exactly. So you give swiotlb_free_coherent a vaddr which has been > unmapped. It doesn't even matter whether it's still mapped since this > address is passed further to free_pages() which performs a > virt_to_page(). The latter is *only* valid on linear map addresses (and > you would actually hit the VM_BUG_ON in free_pages; you can try running > this with CONFIG_DEBUG_VM enabled and non-coherent DMA). > > For non-coherent DMA, the vaddr is not part of the linear mapping as it > has been remapped by __dma_alloc() via dma_common_contiguous_remap(), > hence for swiotlb freeing we need the actual linear map address (the > original "ptr" in __dma_alloc()). We can generate it by a > phys_to_virt(dma_to_phys(dma_handle)). > OK, I got it. So actually I should move the statement into branch "if (!is_device_dma_coherent(dev))", I will prepare v2.
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c index a6e757c..b2f2834 100644 --- a/arch/arm64/mm/dma-mapping.c +++ b/arch/arm64/mm/dma-mapping.c @@ -187,8 +187,6 @@ static void __dma_free(struct device *dev, size_t size, void *vaddr, dma_addr_t dma_handle, struct dma_attrs *attrs) { - void *swiotlb_addr = phys_to_virt(dma_to_phys(dev, dma_handle)); - size = PAGE_ALIGN(size); if (!is_device_dma_coherent(dev)) { @@ -196,7 +194,7 @@ static void __dma_free(struct device *dev, size_t size, return; vunmap(vaddr); } - __dma_free_coherent(dev, size, swiotlb_addr, dma_handle, attrs); + __dma_free_coherent(dev, size, vaddr, dma_handle, attrs); } static dma_addr_t __swiotlb_map_page(struct device *dev, struct page *page,
1. In swiotlb_alloc_coherent, the branch of __get_free_pages. Directly return vaddr on success, and pass vaddr to free_pages on failure. 2. So, we can directly transparent pass vaddr from __dma_free to swiotlb_free_coherent, keep consistent with swiotlb_alloc_coherent. This patch have no functional change, but can obtain a bit performance improvement. Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com> --- arch/arm64/mm/dma-mapping.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) -- 2.5.0