Message ID | 20200907144435.43165-1-dust.li@linux.alibaba.com |
---|---|
State | New |
Headers | show |
Series | net/sock: don't drop udp packets if udp_mem[2] not reached | expand |
On Mon, Sep 07, 2020 at 07:18:48PM +0200, Paolo Abeni wrote: >Hi, > >On Mon, 2020-09-07 at 22:44 +0800, Dust Li wrote: >> We encoutered udp packets drop under a pretty low pressure >> with net.ipv4.udp_mem[0] set to a small value (4096). >> >> After some tracing and debugging, we found that for udp >> protocol, __sk_mem_raise_allocated() will possiblly drop >> packets if: >> udp_mem[0] < udp_prot.memory_allocated < udp_mem[2] >> >> That's because __sk_mem_raise_allocated() didn't handle >> the above condition for protocols like udp who doesn't >> have sk_has_memory_pressure() >> >> We can reproduce this with the following condition >> 1. udp_mem[0] is relateive small, >> 2. net.core.rmem_default/max > udp_mem[0] * 4K > >This looks like something that could/should be addressed at >configuration level ?!? Thanks a lot for the review ! Sorry, maybe I haven't make it clear enough The real problem is the scability with the sockets number. Since the udp_mem is for all UDP sockets, with the number of udp sockets grows, soon or later, udp_prot.memory_allocated will exceed udp_mem[0], and __sk_mem_raise_allocated() will cause the packets drop here. But the total udp memory allocated may still far under udp_mem[1] or udp_mem[2] > >udp_mem[0] should accomodate confortably at least a socket. Yeah, I agree udp_mem[0] should be large enough for at least a socket. Here I use 4096 just for simple and reproduce what we met before. I changed my test program a bit: - with 16 server sockets - with 1 client sending 3000 messages(size: 4096bytes) to each of those 8 server sockets - set net.core.rmem_default/max to (2*4096*4096) - and keep udp_mem unset, which by default on my 4GB VM is 'net.ipv4.udp_mem = 91944 122592 183888' https://github.com/dust-li/kernel-test/blob/master/exceed_udp_mem_min_drop/multi_sockets_test.sh Actually, with more udp sockets, I can always make it large enough to exceed udp_mem[0], and drop packets before udp_mem[1] and udp_mem[2]. > >Cheers, > >Paolo
diff --git a/net/core/sock.c b/net/core/sock.c index 6c5c6b18eff4..fed8211d8dbe 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -2648,6 +2648,12 @@ int __sk_mem_raise_allocated(struct sock *sk, int size, int amt, int kind) atomic_read(&sk->sk_rmem_alloc) + sk->sk_forward_alloc)) return 1; + } else { + /* for prots without memory_pressure callbacks, we should not + * drop until hard limit reached + */ + if (allocated <= sk_prot_mem_limits(sk, 2)) + return 1; } suppress_allocation:
We encoutered udp packets drop under a pretty low pressure with net.ipv4.udp_mem[0] set to a small value (4096). After some tracing and debugging, we found that for udp protocol, __sk_mem_raise_allocated() will possiblly drop packets if: udp_mem[0] < udp_prot.memory_allocated < udp_mem[2] That's because __sk_mem_raise_allocated() didn't handle the above condition for protocols like udp who doesn't have sk_has_memory_pressure() We can reproduce this with the following condition 1. udp_mem[0] is relateive small, 2. net.core.rmem_default/max > udp_mem[0] * 4K 3. The udp server receive slowly, causing the udp_prot->memory_allocated exceed udp_mem[0], but still under udp_mem[2] I wrote a test script to reproduce this: https://github.com/dust-li/kernel-test/blob/master/exceed_udp_mem_min_drop/exceed_udp_mem_min_drop.sh Obviously, we should not drop packets when udp_prot.memory_allocated just exceed udp_mem[0] but still under hard limit. For protocols with memory_pressure callbacks (like TCP), this is not a problem, because there is an extra check: ``` if (sk_has_memory_pressure(sk)) { u64 alloc; if (!sk_under_memory_pressure(sk)) return 1; alloc = sk_sockets_allocated_read_positive(sk); if (sk_prot_mem_limits(sk, 2) > alloc * sk_mem_pages(sk->sk_wmem_queued + atomic_read(&sk->sk_rmem_alloc) + sk->sk_forward_alloc)) return 1; } ``` But UDP didn't check this, so I add an extra check here to make sure UDP packets are not dropped until the hard limit is reached. Signed-off-by: Dust Li <dust.li@linux.alibaba.com> --- net/core/sock.c | 6 ++++++ 1 file changed, 6 insertions(+)