Message ID | 72d8fc8a6d35a74d267cca6c9eddb3ff7852868b.1616345643.git.pabeni@redhat.com |
---|---|
State | Superseded |
Headers | show |
Series | udp: GRO L4 improvements | expand |
On Sun, Mar 21, 2021 at 1:01 PM Paolo Abeni <pabeni@redhat.com> wrote: > > After the previous patch the stack can do L4 UDP aggregation > on top of an UDP tunnel. > > The current GRO complete code tries frag based aggregation first; > in the above scenario will generate corrupted frames. > > We need to try first UDP tunnel based aggregation, if the GRO > packet requires that. We can use time GRO 'encap_mark' field > to track the need GRO complete action. If encap_mark is set, > skip the frag_list aggregation. > > On tunnel encap GRO complete clear such field, so that an inner > frag_list GRO complete could take action. > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > --- > net/ipv4/udp_offload.c | 8 +++++++- > net/ipv6/udp_offload.c | 3 ++- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c > index 25134a3548e99..54e06b88af69a 100644 > --- a/net/ipv4/udp_offload.c > +++ b/net/ipv4/udp_offload.c > @@ -642,6 +642,11 @@ int udp_gro_complete(struct sk_buff *skb, int nhoff, > skb_shinfo(skb)->gso_type = uh->check ? SKB_GSO_UDP_TUNNEL_CSUM > : SKB_GSO_UDP_TUNNEL; > > + /* clear the encap mark, so that inner frag_list gro_complete > + * can take place > + */ > + NAPI_GRO_CB(skb)->encap_mark = 0; > + > /* Set encapsulation before calling into inner gro_complete() > * functions to make them set up the inner offsets. > */ > @@ -665,7 +670,8 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff) > const struct iphdr *iph = ip_hdr(skb); > struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); > > - if (NAPI_GRO_CB(skb)->is_flist) { > + /* do fraglist only if there is no outer UDP encap (or we already processed it) */ > + if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) { Sorry, I don't follow. I thought the point was to avoid fraglist if an outer udp tunnel header is present. But the above code clears the mark and allows entering the fraglist branch exactly when such a header is encountered? > uh->len = htons(skb->len - nhoff); > > skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4); > diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c > index faa823c242923..b3d9ed96e5ea5 100644 > --- a/net/ipv6/udp_offload.c > +++ b/net/ipv6/udp_offload.c > @@ -163,7 +163,8 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff) > const struct ipv6hdr *ipv6h = ipv6_hdr(skb); > struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); > > - if (NAPI_GRO_CB(skb)->is_flist) { > + /* do fraglist only if there is no outer UDP encap (or we already processed it) */ > + if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) { > uh->len = htons(skb->len - nhoff); > > skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4); > -- > 2.26.2 >
On Mon, 2021-03-22 at 09:30 -0400, Willem de Bruijn wrote: > On Sun, Mar 21, 2021 at 1:01 PM Paolo Abeni <pabeni@redhat.com> wrote: > > After the previous patch the stack can do L4 UDP aggregation > > on top of an UDP tunnel. > > > > The current GRO complete code tries frag based aggregation first; > > in the above scenario will generate corrupted frames. > > > > We need to try first UDP tunnel based aggregation, if the GRO > > packet requires that. We can use time GRO 'encap_mark' field > > to track the need GRO complete action. If encap_mark is set, > > skip the frag_list aggregation. > > > > On tunnel encap GRO complete clear such field, so that an inner > > frag_list GRO complete could take action. > > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > > --- > > net/ipv4/udp_offload.c | 8 +++++++- > > net/ipv6/udp_offload.c | 3 ++- > > 2 files changed, 9 insertions(+), 2 deletions(-) > > > > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c > > index 25134a3548e99..54e06b88af69a 100644 > > --- a/net/ipv4/udp_offload.c > > +++ b/net/ipv4/udp_offload.c > > @@ -642,6 +642,11 @@ int udp_gro_complete(struct sk_buff *skb, int nhoff, > > skb_shinfo(skb)->gso_type = uh->check ? SKB_GSO_UDP_TUNNEL_CSUM > > : SKB_GSO_UDP_TUNNEL; > > > > + /* clear the encap mark, so that inner frag_list gro_complete > > + * can take place > > + */ > > + NAPI_GRO_CB(skb)->encap_mark = 0; > > + > > /* Set encapsulation before calling into inner gro_complete() > > * functions to make them set up the inner offsets. > > */ > > @@ -665,7 +670,8 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff) > > const struct iphdr *iph = ip_hdr(skb); > > struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); > > > > - if (NAPI_GRO_CB(skb)->is_flist) { > > + /* do fraglist only if there is no outer UDP encap (or we already processed it) */ > > + if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) { > > Sorry, I don't follow. I thought the point was to avoid fraglist if an > outer udp tunnel header is present. But the above code clears the mark > and allows entering the fraglist branch exactly when such a header is > encountered? The relevant UDP packet has gone through: [l2/l3 GRO] -> udp_gro_receive -> udp_sk(sk)->gro_receive -> [some more GRO layers] -> udp_gro_receive (again) The first udp_gro_receive set NAPI_GRO_CB(skb)->encap_mark, the latter udp_gro_receive set NAPI_GRO_CB(skb)->is_flist. Then, at GRO complete time: [l2/l3 GRO] -> udp{4,6}_gro_complete -> udp_sk(sk)->gro_complete -> [more GRO layers] -> udp{4,6}_gro_complete (again). In the first udp{4,6}_gro_complete invocation 'encap_mark' is 1, so with this patch we do the 'udp_sk(sk)->gro_complete' path. In the second udp{4,6}_gro_complete invocation 'encap_mark' has been cleared (by udp_gro_complete), so we do the SKB_GSO_FRAGLIST completion. In case SKB_GSO_FRAGLIST with no UDP tunnel, 'encap_mark' is 0 and we do the SKB_GSO_FRAGLIST completion. Another alternative, possibly more readable, would be avoid clearing 'encap_mark' in udp_gro_complete() and replacing the above check with: if (NAPI_GRO_CB(skb)->is_flist && (!NAPI_GRO_CB(skb)->encap_mark || (NAPI_GRO_CB(skb)->encap_mark && skb->encapsulation))) { I opted otherwise to simplify the conditional expression. Please let me know if the above is somewhat more clear and if you have preferecens between the two options. Cheers, Paolo
On Mon, Mar 22, 2021 at 1:00 PM Paolo Abeni <pabeni@redhat.com> wrote: > > On Mon, 2021-03-22 at 09:30 -0400, Willem de Bruijn wrote: > > On Sun, Mar 21, 2021 at 1:01 PM Paolo Abeni <pabeni@redhat.com> wrote: > > > After the previous patch the stack can do L4 UDP aggregation > > > on top of an UDP tunnel. > > > > > > The current GRO complete code tries frag based aggregation first; > > > in the above scenario will generate corrupted frames. > > > > > > We need to try first UDP tunnel based aggregation, if the GRO > > > packet requires that. We can use time GRO 'encap_mark' field > > > to track the need GRO complete action. If encap_mark is set, > > > skip the frag_list aggregation. > > > > > > On tunnel encap GRO complete clear such field, so that an inner > > > frag_list GRO complete could take action. > > > > > > Signed-off-by: Paolo Abeni <pabeni@redhat.com> > > > --- > > > net/ipv4/udp_offload.c | 8 +++++++- > > > net/ipv6/udp_offload.c | 3 ++- > > > 2 files changed, 9 insertions(+), 2 deletions(-) > > > > > > diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c > > > index 25134a3548e99..54e06b88af69a 100644 > > > --- a/net/ipv4/udp_offload.c > > > +++ b/net/ipv4/udp_offload.c > > > @@ -642,6 +642,11 @@ int udp_gro_complete(struct sk_buff *skb, int nhoff, > > > skb_shinfo(skb)->gso_type = uh->check ? SKB_GSO_UDP_TUNNEL_CSUM > > > : SKB_GSO_UDP_TUNNEL; > > > > > > + /* clear the encap mark, so that inner frag_list gro_complete > > > + * can take place > > > + */ > > > + NAPI_GRO_CB(skb)->encap_mark = 0; > > > + > > > /* Set encapsulation before calling into inner gro_complete() > > > * functions to make them set up the inner offsets. > > > */ > > > @@ -665,7 +670,8 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff) > > > const struct iphdr *iph = ip_hdr(skb); > > > struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); > > > > > > - if (NAPI_GRO_CB(skb)->is_flist) { > > > + /* do fraglist only if there is no outer UDP encap (or we already processed it) */ > > > + if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) { > > > > Sorry, I don't follow. I thought the point was to avoid fraglist if an > > outer udp tunnel header is present. But the above code clears the mark > > and allows entering the fraglist branch exactly when such a header is > > encountered? > > The relevant UDP packet has gone through: > > [l2/l3 GRO] -> udp_gro_receive -> udp_sk(sk)->gro_receive -> [some > more GRO layers] -> udp_gro_receive (again) > > The first udp_gro_receive set NAPI_GRO_CB(skb)->encap_mark, the > latter udp_gro_receive set NAPI_GRO_CB(skb)->is_flist. > > Then, at GRO complete time: > > [l2/l3 GRO] -> udp{4,6}_gro_complete -> udp_sk(sk)->gro_complete -> > [more GRO layers] -> udp{4,6}_gro_complete (again). > > In the first udp{4,6}_gro_complete invocation 'encap_mark' is 1, so > with this patch we do the 'udp_sk(sk)->gro_complete' path. In the > second udp{4,6}_gro_complete invocation 'encap_mark' has been cleared > (by udp_gro_complete), so we do the SKB_GSO_FRAGLIST completion. > > In case SKB_GSO_FRAGLIST with no UDP tunnel, 'encap_mark' is 0 and we > do the SKB_GSO_FRAGLIST completion. > > Another alternative, possibly more readable, would be avoid clearing > 'encap_mark' in udp_gro_complete() and replacing the above check with: > > if (NAPI_GRO_CB(skb)->is_flist && > (!NAPI_GRO_CB(skb)->encap_mark || > (NAPI_GRO_CB(skb)->encap_mark && skb->encapsulation))) { > > I opted otherwise to simplify the conditional expression. > > Please let me know if the above is somewhat more clear and if you have > preferecens between the two options. Got it now, thanks. Yes, this patch makes sense. When the GRO layer has built a GSO packet with both inner UDP (GSO_UDP_L4) and UDP tunnel header (GSO_UDP_TUNNEL), then on completion udp{4,6}_gro_complete will be called twice. This function will enter its is_flist branch immediately, even though that is only correct on the second call, as GSO_FRAGLIST is only relevant for the inner packet. Skip this while encap_mark == 1, identifying processing of the outer tunnel header. Clear the field to enter the path on the next round, for the inner header. This is subject to only supporting a single layer of encap, but that is indeed the current state afaik.
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c index 25134a3548e99..54e06b88af69a 100644 --- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -642,6 +642,11 @@ int udp_gro_complete(struct sk_buff *skb, int nhoff, skb_shinfo(skb)->gso_type = uh->check ? SKB_GSO_UDP_TUNNEL_CSUM : SKB_GSO_UDP_TUNNEL; + /* clear the encap mark, so that inner frag_list gro_complete + * can take place + */ + NAPI_GRO_CB(skb)->encap_mark = 0; + /* Set encapsulation before calling into inner gro_complete() * functions to make them set up the inner offsets. */ @@ -665,7 +670,8 @@ INDIRECT_CALLABLE_SCOPE int udp4_gro_complete(struct sk_buff *skb, int nhoff) const struct iphdr *iph = ip_hdr(skb); struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); - if (NAPI_GRO_CB(skb)->is_flist) { + /* do fraglist only if there is no outer UDP encap (or we already processed it) */ + if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) { uh->len = htons(skb->len - nhoff); skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4); diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c index faa823c242923..b3d9ed96e5ea5 100644 --- a/net/ipv6/udp_offload.c +++ b/net/ipv6/udp_offload.c @@ -163,7 +163,8 @@ INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff) const struct ipv6hdr *ipv6h = ipv6_hdr(skb); struct udphdr *uh = (struct udphdr *)(skb->data + nhoff); - if (NAPI_GRO_CB(skb)->is_flist) { + /* do fraglist only if there is no outer UDP encap (or we already processed it) */ + if (NAPI_GRO_CB(skb)->is_flist && !NAPI_GRO_CB(skb)->encap_mark) { uh->len = htons(skb->len - nhoff); skb_shinfo(skb)->gso_type |= (SKB_GSO_FRAGLIST|SKB_GSO_UDP_L4);
After the previous patch the stack can do L4 UDP aggregation on top of an UDP tunnel. The current GRO complete code tries frag based aggregation first; in the above scenario will generate corrupted frames. We need to try first UDP tunnel based aggregation, if the GRO packet requires that. We can use time GRO 'encap_mark' field to track the need GRO complete action. If encap_mark is set, skip the frag_list aggregation. On tunnel encap GRO complete clear such field, so that an inner frag_list GRO complete could take action. Signed-off-by: Paolo Abeni <pabeni@redhat.com> --- net/ipv4/udp_offload.c | 8 +++++++- net/ipv6/udp_offload.c | 3 ++- 2 files changed, 9 insertions(+), 2 deletions(-)