diff mbox series

[net-next,1/3] selftests: net: tcp_mmap: use madvise(MADV_DONTNEED)

Message ID 20200820171118.1822853-2-edumazet@google.com
State New
Headers show
Series [net-next,1/3] selftests: net: tcp_mmap: use madvise(MADV_DONTNEED) | expand

Commit Message

Eric Dumazet Aug. 20, 2020, 5:11 p.m. UTC
When TCP_ZEROCOPY_RECEIVE operation has been added,
I made the mistake of automatically un-mapping prior
content before mapping new pages.

This has the unfortunate effect of adding potentially long
MMU operations (like TLB flushes) while socket lock is held.

Using madvise(MADV_DONTNEED) right after pages has been used
has two benefits :

1) This releases pages sooner, allowing pages to be recycled
if they were part of a page pool in a NIC driver.

2) No more long unmap operations while preventing immediate
processing of incoming packets.

The cost of the added system call is small enough.

Arjun will submit a kernel patch allowing to opt out from
the unmap attempt in tcp_zerocopy_receive()

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Arjun Roy <arjunroy@google.com>
Cc: Soheil Hassas Yeganeh <soheil@google.com>
---
 tools/testing/selftests/net/tcp_mmap.c | 4 ++++
 1 file changed, 4 insertions(+)
diff mbox series

Patch

diff --git a/tools/testing/selftests/net/tcp_mmap.c b/tools/testing/selftests/net/tcp_mmap.c
index a61b7b3da5496285876b0e16b18a3060850b0803..59ec0b59f7b76ff75685bd96901d8237e0665b2b 100644
--- a/tools/testing/selftests/net/tcp_mmap.c
+++ b/tools/testing/selftests/net/tcp_mmap.c
@@ -179,6 +179,10 @@  void *child_thread(void *arg)
 				total_mmap += zc.length;
 				if (xflg)
 					hash_zone(addr, zc.length);
+				/* It is more efficient to unmap the pages right now,
+				 * instead of doing this in next TCP_ZEROCOPY_RECEIVE.
+				 */
+				madvise(addr, zc.length, MADV_DONTNEED);
 				total += zc.length;
 			}
 			if (zc.recv_skip_hint) {