From patchwork Mon Apr 20 14:25:05 2020
X-Patchwork-Submitter: Paolo Abeni
X-Patchwork-Id: 220956
From: Paolo Abeni
To: netdev@vger.kernel.org
Cc: Mat Martineau, Matthieu Baerts, Jakub Kicinski, Christoph Paasch,
 Florian Westphal
Subject: [PATCH net 2/3] mptcp: avoid flipping mp_capable field in syn_recv_sock()
Date: Mon, 20 Apr 2020 16:25:05 +0200
X-Mailing-List: netdev@vger.kernel.org

If multiple CPUs race on the same req_sock in syn_recv_sock(), flipping
the mp_capable field can leave the child socket in an inconsistent state.

When racing, the CPU that loses ownership of the req may still change the
mp_capable flag of the mptcp request socket while the CPU owning the
request is cloning the socket, leaving the child socket with 'is_mptcp'
set but no 'mp_capable' flag.

Such a socket keeps its 'conn' field cleared, leading to an oops in a
later mptcp callback.

Address the issue by tracking the fallback status in a local variable.

Fixes: 58b09919626b ("mptcp: create msk early")
Co-developed-by: Florian Westphal
Signed-off-by: Florian Westphal
Signed-off-by: Paolo Abeni
---
 net/mptcp/subflow.c | 46 +++++++++++++++++++++++++++++----------------
 1 file changed, 30 insertions(+), 16 deletions(-)

diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c
index 3a94f859347a..10090ca3d3e0 100644
--- a/net/mptcp/subflow.c
+++ b/net/mptcp/subflow.c
@@ -376,6 +376,17 @@ static void mptcp_force_close(struct sock *sk)
 	sk_common_release(sk);
 }
 
+static void subflow_ulp_fallback(struct sock *sk,
+				 struct mptcp_subflow_context *old_ctx)
+{
+	struct inet_connection_sock *icsk = inet_csk(sk);
+
+	mptcp_subflow_tcp_fallback(sk, old_ctx);
+	icsk->icsk_ulp_ops = NULL;
+	rcu_assign_pointer(icsk->icsk_ulp_data, NULL);
+	tcp_sk(sk)->is_mptcp = 0;
+}
+
 static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 					  struct sk_buff *skb,
 					  struct request_sock *req,
@@ -388,6 +399,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 	struct tcp_options_received opt_rx;
 	bool fallback_is_fatal = false;
 	struct sock *new_msk = NULL;
+	bool fallback = false;
 	struct sock *child;
 
 	pr_debug("listener=%p, req=%p, conn=%p", listener, req, listener->conn);
@@ -412,14 +424,14 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 			subflow_req->remote_key = opt_rx.mptcp.sndr_key;
 			subflow_req->remote_key_valid = 1;
 		} else {
-			subflow_req->mp_capable = 0;
+			fallback = true;
 			goto create_child;
 		}
 
 create_msk:
 		new_msk = mptcp_sk_clone(listener->conn, req);
 		if (!new_msk)
-			subflow_req->mp_capable = 0;
+			fallback = true;
 	} else if (subflow_req->mp_join) {
 		fallback_is_fatal = true;
 		opt_rx.mptcp.mp_join = 0;
@@ -438,12 +450,18 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 	if (child && *own_req) {
 		struct mptcp_subflow_context *ctx = mptcp_subflow_ctx(child);
 
-		/* we have null ctx on TCP fallback, which is fatal on
-		 * MPJ handshake
+		/* we need to fallback on ctx allocation failure and on pre-reqs
+		 * checking above. In the latter scenario we additionally need
+		 * to reset the context to non MPTCP status.
 		 */
-		if (!ctx) {
+		if (!ctx || fallback) {
 			if (fallback_is_fatal)
 				goto close_child;
+
+			if (ctx) {
+				subflow_ulp_fallback(child, ctx);
+				kfree_rcu(ctx, rcu);
+			}
 			goto out;
 		}
 
@@ -474,6 +492,13 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk,
 	/* dispose of the left over mptcp master, if any */
 	if (unlikely(new_msk))
 		mptcp_force_close(new_msk);
+
+	/* check for expected invariant - should never trigger, just help
+	 * catching earlier subtle bugs
+	 */
+	WARN_ON_ONCE(*own_req && child && tcp_sk(child)->is_mptcp &&
+		     (!mptcp_subflow_ctx(child) ||
+		      !mptcp_subflow_ctx(child)->conn));
 	return child;
 
 close_child:
@@ -1094,17 +1119,6 @@ static void subflow_ulp_release(struct sock *sk)
 	kfree_rcu(ctx, rcu);
 }
 
-static void subflow_ulp_fallback(struct sock *sk,
-				 struct mptcp_subflow_context *old_ctx)
-{
-	struct inet_connection_sock *icsk = inet_csk(sk);
-
-	mptcp_subflow_tcp_fallback(sk, old_ctx);
-	icsk->icsk_ulp_ops = NULL;
-	rcu_assign_pointer(icsk->icsk_ulp_data, NULL);
-	tcp_sk(sk)->is_mptcp = 0;
-}
-
 static void subflow_ulp_clone(const struct request_sock *req,
 			      struct sock *newsk,
 			      const gfp_t priority)
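
Illustrative note (not part of the patch): the idea in the commit message,
recording the fallback decision in a local variable and applying it only to
the child socket this CPU owns, can be sketched in plain userspace C. This
is a simplified model under stated assumptions, not the kernel code:
req_sketch, child_sketch, clone_child(), child_fallback() and
syn_recv_sketch() are hypothetical stand-ins, and key_valid is an assumed
placeholder for the MP_CAPABLE option checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the shared request socket state. */
struct req_sketch {
	bool mp_capable;	/* shared: another CPU may read it while cloning */
	bool key_valid;		/* assumed placeholder for the option checks */
};

/* Hypothetical stand-in for the child socket. */
struct child_sketch {
	bool is_mptcp;
	void *conn;		/* must be non-NULL whenever is_mptcp is set */
};

/* Clone step: the child state is derived from the shared request only. */
static void clone_child(const struct req_sketch *req,
			struct child_sketch *child, void *conn)
{
	child->is_mptcp = req->mp_capable;
	child->conn = req->mp_capable ? conn : NULL;
}

/* Reset an already cloned child to plain TCP. */
static void child_fallback(struct child_sketch *child)
{
	child->is_mptcp = false;
	child->conn = NULL;
}

/*
 * The request is const here, so the fallback decision cannot leak into
 * state that a racing CPU might be reading. The decision lives in a
 * local variable and is applied only to the child owned by this caller.
 */
static void syn_recv_sketch(const struct req_sketch *req,
			    struct child_sketch *child, void *conn,
			    bool own_req)
{
	bool fallback = false;

	if (req->mp_capable && !req->key_valid)
		fallback = true;

	clone_child(req, child, conn);
	if (own_req && fallback)
		child_fallback(child);
}
```

In this model the invariant checked by the WARN_ON_ONCE() in the patch
(is_mptcp set implies a non-NULL conn) holds for the owning caller on both
the fallback and the non-fallback path, and the shared mp_capable flag is
never modified.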