Message ID | 3cb5a2e5-4e4c-728a-252d-4757b6c9612d@virtuozzo.com |
---|---|
State | New |
Series | [IPV6,1/1] ipv6: allocate enough headroom in ip6_finish_output2() |
On 7/7/21 8:04 AM, Vasily Averin wrote:
> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> index ff4f9eb..e5af740 100644
> --- a/net/ipv6/ip6_output.c
> +++ b/net/ipv6/ip6_output.c
> @@ -61,9 +61,24 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
>  	struct dst_entry *dst = skb_dst(skb);
>  	struct net_device *dev = dst->dev;
>  	const struct in6_addr *nexthop;
> +	unsigned int hh_len = LL_RESERVED_SPACE(dev);
>  	struct neighbour *neigh;
>  	int ret;
>
> +	/* Be paranoid, rather than too clever. */
> +	if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
> +		struct sk_buff *skb2;
> +
> +		skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));

why not use hh_len here?

> +		if (!skb2) {
> +			kfree_skb(skb);
> +			return -ENOMEM;
> +		}
> +		if (skb->sk)
> +			skb_set_owner_w(skb2, skb->sk);
> +		consume_skb(skb);
> +		skb = skb2;
> +	}
>  	if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr)) {
>  		struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
>
>
On Wed, 7 Jul 2021 08:45:13 -0600 David Ahern wrote:
> On 7/7/21 8:04 AM, Vasily Averin wrote:
> > diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
> > index ff4f9eb..e5af740 100644
> > --- a/net/ipv6/ip6_output.c
> > +++ b/net/ipv6/ip6_output.c
> > @@ -61,9 +61,24 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
> >  	struct dst_entry *dst = skb_dst(skb);
> >  	struct net_device *dev = dst->dev;
> >  	const struct in6_addr *nexthop;
> > +	unsigned int hh_len = LL_RESERVED_SPACE(dev);
> >  	struct neighbour *neigh;
> >  	int ret;
> >
> > +	/* Be paranoid, rather than too clever. */
> > +	if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
> > +		struct sk_buff *skb2;
> > +
> > +		skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));
>
> why not use hh_len here?

Is there a reason for the new skb? Why not pskb_expand_head()?

> > +		if (!skb2) {
> > +			kfree_skb(skb);
> > +			return -ENOMEM;
> > +		}
> > +		if (skb->sk)
> > +			skb_set_owner_w(skb2, skb->sk);
> > +		consume_skb(skb);
> > +		skb = skb2;
> > +	}
> >  	if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr)) {
> >  		struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
On 7/7/21 8:41 PM, Eric Dumazet wrote:
> On 7/7/21 6:42 PM, Jakub Kicinski wrote:
>> On Wed, 7 Jul 2021 08:45:13 -0600 David Ahern wrote:
>>> On 7/7/21 8:04 AM, Vasily Averin wrote:
>>>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>>>> index ff4f9eb..e5af740 100644
>>>> --- a/net/ipv6/ip6_output.c
>>>> +++ b/net/ipv6/ip6_output.c
>>>> @@ -61,9 +61,24 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
>>>>  	struct dst_entry *dst = skb_dst(skb);
>>>>  	struct net_device *dev = dst->dev;
>>>>  	const struct in6_addr *nexthop;
>>>> +	unsigned int hh_len = LL_RESERVED_SPACE(dev);
>>>>  	struct neighbour *neigh;
>>>>  	int ret;
>>>>
>>>> +	/* Be paranoid, rather than too clever. */
>>>> +	if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
>>>> +		struct sk_buff *skb2;
>>>> +
>>>> +		skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));
>>>
>>> why not use hh_len here?
>>
>> Is there a reason for the new skb? Why not pskb_expand_head()?
>
> pskb_expand_head() might crash, if skb is shared.
>
> We possibly can add a helper, factorizing all this,
> and eventually use pskb_expand_head() if safe.

Thank you for feedback, I'll do it in 2nd version.

Vasily Averin
On 7/7/21 8:30 PM, Jakub Kicinski wrote:
> On Wed, 7 Jul 2021 19:41:44 +0200 Eric Dumazet wrote:
>> On 7/7/21 6:42 PM, Jakub Kicinski wrote:
>>> On Wed, 7 Jul 2021 08:45:13 -0600 David Ahern wrote:
>>>> why not use hh_len here?
>>>
>>> Is there a reason for the new skb? Why not pskb_expand_head()?
>>
>>
>> pskb_expand_head() might crash, if skb is shared.
>>
>> We possibly can add a helper, factorizing all this,
>> and eventually use pskb_expand_head() if safe.
>
> Is there a strategically placed skb_share_check() somewhere further
> down? Otherwise there seems to be a lot of questionable skb_cow*()
> calls, also __skb_linearize() and skb_pad() are risky, no?
> Or is it that shared skbs are uncommon and syzbot doesn't hit them?
>

Some of us try hard to remove skb_get() occurrences,
but they tend to re-appear fast :/

Refs: commit a516993f0ac1694673412eb2d16a091eafa77d2a ("net: fix wrong skb_get() usage / crash in IGMP/MLD parsing code")
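The shared-skb concern raised in this exchange can be illustrated with a small userspace model. This is only a sketch, not kernel code and not the helper that was proposed: `struct toy_buf`, `toy_buf_new()`, `toy_buf_put()` and `toy_expand_head()` are hypothetical names, and `users` merely stands in for the skb reference count. It grows the buffer in place when the caller is the sole owner and falls back to a private copy when another holder exists, so the other holder never sees its layout change.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy buffer with a reference count; `users` plays the role of skb->users. */
struct toy_buf {
	unsigned char *head;	/* start of the allocation */
	size_t headroom;	/* free bytes in front of the data */
	size_t len;		/* bytes of packet data */
	int users;		/* number of holders */
};

static struct toy_buf *toy_buf_new(size_t headroom, size_t len)
{
	struct toy_buf *b = malloc(sizeof(*b));

	if (!b)
		return NULL;
	b->head = calloc(1, headroom + len + 1);
	if (!b->head) {
		free(b);
		return NULL;
	}
	b->headroom = headroom;
	b->len = len;
	b->users = 1;
	return b;
}

/* Drop one reference; free the buffer when the last holder is gone. */
static void toy_buf_put(struct toy_buf *b)
{
	if (b && --b->users == 0) {
		free(b->head);
		free(b);
	}
}

/*
 * Hypothetical helper: return a buffer with at least `need` bytes of
 * headroom. Consumes the caller's reference to `b`; returns NULL on
 * allocation failure.
 */
static struct toy_buf *toy_expand_head(struct toy_buf *b, size_t need)
{
	struct toy_buf *copy;
	unsigned char *head;

	if (b->headroom >= need)
		return b;

	if (b->users == 1) {
		/* Sole owner: replacing the storage in place is safe. */
		head = calloc(1, need + b->len + 1);
		if (!head) {
			toy_buf_put(b);
			return NULL;
		}
		memcpy(head + need, b->head + b->headroom, b->len);
		free(b->head);
		b->head = head;
		b->headroom = need;
		return b;
	}

	/* Shared: leave the original untouched and hand back a copy. */
	copy = toy_buf_new(need, b->len);
	if (copy)
		memcpy(copy->head + need, b->head + b->headroom, b->len);
	toy_buf_put(b);
	return copy;
}
```

In this model an unshared buffer comes back as the same object with more headroom, while a shared one comes back as a new object and the original keeps its old headroom for the remaining holder.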
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index ff4f9eb..e5af740 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -61,9 +61,24 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
 	struct dst_entry *dst = skb_dst(skb);
 	struct net_device *dev = dst->dev;
 	const struct in6_addr *nexthop;
+	unsigned int hh_len = LL_RESERVED_SPACE(dev);
 	struct neighbour *neigh;
 	int ret;
 
+	/* Be paranoid, rather than too clever. */
+	if (unlikely(skb_headroom(skb) < hh_len && dev->header_ops)) {
+		struct sk_buff *skb2;
+
+		skb2 = skb_realloc_headroom(skb, LL_RESERVED_SPACE(dev));
+		if (!skb2) {
+			kfree_skb(skb);
+			return -ENOMEM;
+		}
+		if (skb->sk)
+			skb_set_owner_w(skb2, skb->sk);
+		consume_skb(skb);
+		skb = skb2;
+	}
 	if (ipv6_addr_is_multicast(&ipv6_hdr(skb)->daddr)) {
 		struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
When the TEE target mirrors traffic to another interface, the sk_buff
may not have enough headroom to be processed correctly.
ip_finish_output2() detects this situation for ipv4 and allocates
a new skb with enough headroom. However, ipv6 lacks this logic in
ip6_finish_output2(), which leads to skb_under_panic:

 skbuff: skb_under_panic: text:ffffffffc0866ad4 len:96 put:24
 head:ffff97be85e31800 data:ffff97be85e317f8 tail:0x58 end:0xc0 dev:gre0
 ------------[ cut here ]------------
 kernel BUG at net/core/skbuff.c:110!
 invalid opcode: 0000 [#1] SMP PTI
 CPU: 2 PID: 393 Comm: kworker/2:2 Tainted: G OE 5.13.0 #13
 Hardware name: Virtuozzo KVM, BIOS 1.11.0-2.vz7.4 04/01/2014
 Workqueue: ipv6_addrconf addrconf_dad_work
 RIP: 0010:skb_panic+0x48/0x4a
 Call Trace:
  skb_push.cold.111+0x10/0x10
  ipgre_header+0x24/0xf0 [ip_gre]
  neigh_connected_output+0xae/0xf0
  ip6_finish_output2+0x1a8/0x5a0
  ip6_output+0x5c/0x110
  nf_dup_ipv6+0x158/0x1000 [nf_dup_ipv6]
  tee_tg6+0x2e/0x40 [xt_TEE]
  ip6t_do_table+0x294/0x470 [ip6_tables]
  nf_hook_slow+0x44/0xc0
  nf_hook.constprop.34+0x72/0xe0
  ndisc_send_skb+0x20d/0x2e0
  ndisc_send_ns+0xd1/0x210
  addrconf_dad_work+0x3c8/0x540
  process_one_work+0x1d1/0x370
  worker_thread+0x30/0x390
  kthread+0x116/0x130
  ret_from_fork+0x22/0x30

Signed-off-by: Vasily Averin <vvs@virtuozzo.com>
---
 net/ipv6/ip6_output.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)
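
For readers less familiar with skb headroom, the failure and the fix can be modelled in userspace. The following is a minimal sketch, not kernel code: `struct toy_skb` and every `toy_*` function are hypothetical stand-ins that only mimic the pointer arithmetic of `skb_headroom()`, `skb_push()` and `skb_realloc_headroom()`. The sizes used in the usage note are read off the trace above under the assumption that it prints the values after the failed push: `data` sits 8 bytes below `head` after a 24-byte push, so the packet would have arrived with 16 bytes of headroom and 72 bytes of data.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for struct sk_buff: only what is needed to model headroom. */
struct toy_skb {
	unsigned char *head;	/* start of the allocated buffer */
	unsigned char *data;	/* start of the packet data */
	size_t len;		/* bytes of packet data */
};

/* Free bytes in front of the data, in the spirit of skb_headroom(). */
static size_t toy_headroom(const struct toy_skb *skb)
{
	return (size_t)(skb->data - skb->head);
}

static struct toy_skb *toy_alloc(size_t headroom, size_t len)
{
	struct toy_skb *skb = malloc(sizeof(*skb));

	if (!skb)
		return NULL;
	skb->head = calloc(1, headroom + len + 1);
	if (!skb->head) {
		free(skb);
		return NULL;
	}
	skb->data = skb->head + headroom;
	skb->len = len;
	return skb;
}

static void toy_free(struct toy_skb *skb)
{
	if (skb) {
		free(skb->head);
		free(skb);
	}
}

/*
 * Prepend hdr_len bytes. Returns NULL where the kernel would call
 * skb_under_panic(), i.e. when data would move below head.
 */
static unsigned char *toy_push(struct toy_skb *skb, size_t hdr_len)
{
	if (toy_headroom(skb) < hdr_len)
		return NULL;
	skb->data -= hdr_len;
	skb->len += hdr_len;
	return skb->data;
}

/*
 * Model of the check added by the patch: if headroom is short, copy the
 * packet into a buffer with hh_len bytes of headroom and release the old
 * one. Returns NULL on allocation failure (the old buffer is freed too).
 */
static struct toy_skb *toy_ensure_headroom(struct toy_skb *skb, size_t hh_len)
{
	struct toy_skb *skb2;

	if (toy_headroom(skb) >= hh_len)
		return skb;

	skb2 = toy_alloc(hh_len, skb->len);
	if (skb2)
		memcpy(skb2->data, skb->data, skb->len);
	toy_free(skb);
	return skb2;
}
```

With `toy_alloc(16, 72)`, a direct `toy_push(skb, 24)` is rejected, whereas calling `toy_ensure_headroom(skb, 24)` first lets the same push succeed and leaves a 96-byte packet with no headroom left.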