From patchwork Fri Dec 13 13:50:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 850316 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D874E1E0B9C for ; Fri, 13 Dec 2024 13:50:44 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734097846; cv=none; b=FevIB0qjGYGDjFi71y85e2MpL8Njscwf7Mo+y/3dd/vvINf1MEiz+bYex+yJ7yqA3m9ZrwMFPcDMzXkvle7tSgMhSLJG3V85xxBdGWAKQTkjfOe02SuVJlNHw8pfAdV/q8GQZNE1aOWZziSpDWA7xFDVn0n6dkCrGuFN1Vevn1E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734097846; c=relaxed/simple; bh=Gn32ove2Yu+uyaaNX7UGUwI/rCfUUP4QCCBh0KLUvxk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=HMNpMQkqmzgoHp+FocIIACQ1XksUv0isli+oFSTNOliWABx6GdW+iADKBgCzM8ORb8vjwMdZ4IBc2tafmc/jk2Dg04iCYGiB6PZ4i7HqCJAd7tRnJj4U4mC/qDLnds2PCD/F9VF1GDnpPFWlSjgTQ+ghAo+RhhukVTbyHixG1Pc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=Rgc7YuW9; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="Rgc7YuW9" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1734097844; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XyqR/WjDCwAE5X+NDEcLa5xYMEUn9QBuN0OBrTmThCY=; b=Rgc7YuW9JNB//oZMuTVw7KwzVTQrGrFlf9xWClLXtH5PrC6i37r51AuCNiUccM8+YojxA1 iZHWFr4uTngWfXMm3tVmDEqXaUT5J6pvAczXHANEGNkBKEXobgGWN+70yBuWdsJu93tP2Z ynrBp9dzQoLOkh5pDU7KAJzVH/YF3vg= Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-73--YEkMWSgNkChCQp0_JqhQQ-1; Fri, 13 Dec 2024 08:50:40 -0500 X-MC-Unique: -YEkMWSgNkChCQp0_JqhQQ-1 X-Mimecast-MFC-AGG-ID: -YEkMWSgNkChCQp0_JqhQQ Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 7662919560AF; Fri, 13 Dec 2024 13:50:37 +0000 (UTC) Received: from warthog.procyon.org.uk.com (unknown [10.42.28.48]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 1BF2F1956089; Fri, 13 Dec 2024 13:50:31 +0000 (UTC) From: David Howells To: Christian Brauner Cc: David Howells , Max Kellermann , Ilya Dryomov , Xiubo Li , Trond Myklebust , Jeff Layton , Matthew Wilcox , netfs@lists.linux.dev, linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs@lists.linux.dev, linux-erofs@lists.ozlabs.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Shyam Prasad N , Steve French , Paulo Alcantara Subject: [PATCH 02/10] netfs: Fix non-contiguous donation between completed reads Date: Fri, 13 Dec 2024 13:50:02 +0000 Message-ID: <20241213135013.2964079-3-dhowells@redhat.com> In-Reply-To: <20241213135013.2964079-1-dhowells@redhat.com> References: <20241213135013.2964079-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: ceph-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 When a read subrequest finishes, if it doesn't have sufficient coverage to complete the folio(s) covering either side of it, it will donate the excess coverage to the adjacent subrequests on either side, offloading responsibility for unlocking the folio(s) covered to them. Now, preference is given to donating down to a lower file offset over donating up because that check is done first - but there's no check that the lower subreq is actually contiguous, and so we can end up donating incorrectly. The scenario seen[1] is that an 8MiB readahead request spanning four 2MiB folios is split into eight 1MiB subreqs (numbered 1 through 8). These terminate in the order 1,6,2,5,3,7,4,8. What happens is: - 1 donates to 2 - 6 donates to 5 - 2 completes, unlocking the first folio (with 1). - 5 completes, unlocking the third folio (with 6). - 3 donates to 4 - 7 donates to 4 incorrectly - 4 completes, unlocking the second folio (with 3), but can't use the excess from 7. - 8 donates to 4, also incorrectly. Fix this by preventing downward donation if the subreqs are not contiguous (in the example above, 7 donates to 4 across the gap left by 5 and 6). Reported-by: Shyam Prasad N Closes: https://lore.kernel.org/r/CANT5p=qBwjBm-D8soFVVtswGEfmMtQXVW83=TNfUtvyHeFQZBA@mail.gmail.com/ Signed-off-by: David Howells cc: Steve French cc: Paulo Alcantara cc: Jeff Layton cc: linux-cifs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/526707.1733224486@warthog.procyon.org.uk/ [1] --- fs/netfs/read_collect.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c index 3cbb289535a8..b415e3972336 100644 --- a/fs/netfs/read_collect.c +++ b/fs/netfs/read_collect.c @@ -247,16 +247,17 @@ static bool netfs_consume_read_data(struct netfs_io_subrequest *subreq, bool was /* Deal with the trickiest case: that this subreq is in the middle of a * folio, not touching either edge, but finishes first. In such a - * case, we donate to the previous subreq, if there is one, so that the - * donation is only handled when that completes - and remove this - * subreq from the list. + * case, we donate to the previous subreq, if there is one and if it is + * contiguous, so that the donation is only handled when that completes + * - and remove this subreq from the list. * * If the previous subreq finished first, we will have acquired their * donation and should be able to unlock folios and/or donate nextwards. */ if (!subreq->consumed && !prev_donated && - !list_is_first(&subreq->rreq_link, &rreq->subrequests)) { + !list_is_first(&subreq->rreq_link, &rreq->subrequests) && + subreq->start == prev->start + prev->len) { prev = list_prev_entry(subreq, rreq_link); WRITE_ONCE(prev->next_donated, prev->next_donated + subreq->len); subreq->start += subreq->len; From patchwork Fri Dec 13 13:50:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 850315 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D27041E2009 for ; Fri, 13 Dec 2024 13:50:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734097858; cv=none; b=qG6KlIz5OeMImMwWW6QxJpPW2nAVSiD0EKrHhBeDb7y6i1wQ2cpVRKUiBxWMCyJjtTG99HSJpHu9oxPhKHQGt91Qq8ZLlYY/WPVd3xomu1hNbc/O9li+JLzSBudBGotJqgUUqZbRuQ9U1kvUl+2o6TgsfoyZbCVMINHh8vCDPiU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734097858; c=relaxed/simple; bh=ee8bCaSWFbPmnDIBL40JFQm2oyep1Gfed/yf6gabDzM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jEZiLkYMEDDHjq7tBjW13B/pA5JXdMaq/BgmMoGeT3fDUPis3fkjojtXLcfhBkqBAEwTpYQ+vyTo0PxZKSqsSmEvLoQ8FWcxEMLXjlc54wIQea0qEC7gM+9QZ7yKhUZ/Ton7SegiFnevmqS74jF+j1d1QfBurTcuXmhAj35cZ/U= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=MwVdUgwf; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="MwVdUgwf" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1734097855; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=5M1ENZFbO/mNMn248U1qss6F8q2hwKZC+45ijoUGNaQ=; b=MwVdUgwfQSQIi4jmyuKEgAO2ZchpnKwypFPV9WsITKHn5sFkT/AA0Tl3NzjgFy9lKEMJ3l 1l2cWgmr3Jw5hO0t9OmBzeEwq2o7kuuuky/ianG5BZlHTSov57JhquBrIjyngLMZi+DT15 sjkfVw0YkZNQDEu/Ql3ozUoHhZIgU3Q= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-492-komwBIjYNoKdH0VnGk57oA-1; Fri, 13 Dec 2024 08:50:52 -0500 X-MC-Unique: komwBIjYNoKdH0VnGk57oA-1 X-Mimecast-MFC-AGG-ID: komwBIjYNoKdH0VnGk57oA Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id D61E71956056; Fri, 13 Dec 2024 13:50:49 +0000 (UTC) Received: from warthog.procyon.org.uk.com (unknown [10.42.28.48]) by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id EC2561956052; Fri, 13 Dec 2024 13:50:44 +0000 (UTC) From: David Howells To: Christian Brauner Cc: David Howells , Max Kellermann , Ilya Dryomov , Xiubo Li , Trond Myklebust , Jeff Layton , Matthew Wilcox , netfs@lists.linux.dev, linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs@lists.linux.dev, linux-erofs@lists.ozlabs.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Anna Schumaker , Dave Wysochanski Subject: [PATCH 04/10] nfs: Fix oops in nfs_netfs_init_request() when copying to cache Date: Fri, 13 Dec 2024 13:50:04 +0000 Message-ID: <20241213135013.2964079-5-dhowells@redhat.com> In-Reply-To: <20241213135013.2964079-1-dhowells@redhat.com> References: <20241213135013.2964079-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: ceph-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 When netfslib wants to copy some data that has just been read on behalf of nfs, it creates a new write request and calls nfs_netfs_init_request() to initialise it, but with a NULL file pointer. This causes nfs_file_open_context() to oops - however, we don't actually need the nfs context as we're only going to write to the cache. Fix this by just returning if we aren't given a file pointer and emit a warning if the request was for something other than copy-to-cache. Further, fix nfs_netfs_free_request() so that it doesn't try to free the context if the pointer is NULL. Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Reported-by: Max Kellermann Closes: https://lore.kernel.org/r/CAKPOu+9DyMbKLhyJb7aMLDTb=Fh0T8Teb9sjuf_pze+XWT1VaQ@mail.gmail.com/ Signed-off-by: David Howells cc: Trond Myklebust cc: Anna Schumaker cc: Dave Wysochanski cc: Jeff Layton cc: linux-nfs@vger.kernel.org cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org --- fs/nfs/fscache.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/fs/nfs/fscache.c b/fs/nfs/fscache.c index 810269ee0a50..d49e4ce27999 100644 --- a/fs/nfs/fscache.c +++ b/fs/nfs/fscache.c @@ -263,6 +263,12 @@ int nfs_netfs_readahead(struct readahead_control *ractl) static atomic_t nfs_netfs_debug_id; static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *file) { + if (!file) { + if (WARN_ON_ONCE(rreq->origin != NETFS_PGPRIV2_COPY_TO_CACHE)) + return -EIO; + return 0; + } + rreq->netfs_priv = get_nfs_open_context(nfs_file_open_context(file)); rreq->debug_id = atomic_inc_return(&nfs_netfs_debug_id); /* [DEPRECATED] Use PG_private_2 to mark folio being written to the cache. */ @@ -274,7 +280,8 @@ static int nfs_netfs_init_request(struct netfs_io_request *rreq, struct file *fi static void nfs_netfs_free_request(struct netfs_io_request *rreq) { - put_nfs_open_context(rreq->netfs_priv); + if (rreq->netfs_priv) + put_nfs_open_context(rreq->netfs_priv); } static struct nfs_netfs_io_data *nfs_netfs_alloc(struct netfs_io_subrequest *sreq) From patchwork Fri Dec 13 13:50:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 850314 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6C7DD1E379F for ; Fri, 13 Dec 2024 13:51:12 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.133.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734097874; cv=none; b=RJ5JK9ZBLKN0gu0zhDVohAycfFL8E2LliIYIy4UX2HjZStQbviKpfQUiMSllAzQGcXDeUZQ5aEtIJGDm1/mSN+fQ41vnSQmvN4ycVv0EOaLvvf7NziemkDnTbNYsxlzGmy+wDWg+Yp57/1QhxsvttwL3CIs7D8s9zziWrQRJsxw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734097874; c=relaxed/simple; bh=uaeI9J7rEoRhi5ZksmwE/88jFR1W/T6sdc10RyCTsN8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=BlYQoKvbrTm8T7qOrSzCmeEau+kYHWNAw4opmHFGq3MxCb6sTt2enaxX6qynu4GeeXg4DPyI7eVldaUwLPxT52AeZbPe2ssExFwDwejIplvIPTt8cB5gGPfShnZ1h9eq/NOY++Hzyjj6xd8zJZdqsaxTImkdVVVpr5j3GiR3XGU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=R3sIVsAu; arc=none smtp.client-ip=170.10.133.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="R3sIVsAu" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1734097871; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dahqI5gLkGfD05TuY+PqQn3pxJDwXGZGfLY3MOFYgVE=; b=R3sIVsAugDExB77D/kRmr7Bn6ZLuwbpIOP+iHDMl0apxDCVIeS3cAcxplaZXouyIjh5CRz tWlr4/kJ7Zymn/9zxOTlWyLiuPR3v99EeexhSc8Go+LulBLGDYjfwHYomBjzaH1wTrt7vp r/9L1ElMIjRwebB6/FSh/E3UwfrOQKQ= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-692-NQm--YxLM0SK64NOpZx29g-1; Fri, 13 Dec 2024 08:51:06 -0500 X-MC-Unique: NQm--YxLM0SK64NOpZx29g-1 X-Mimecast-MFC-AGG-ID: NQm--YxLM0SK64NOpZx29g Received: from mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.40]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 2E4CD19560B5; Fri, 13 Dec 2024 13:51:02 +0000 (UTC) Received: from warthog.procyon.org.uk.com (unknown [10.42.28.48]) by mx-prod-int-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 058071955F40; Fri, 13 Dec 2024 13:50:56 +0000 (UTC) From: David Howells To: Christian Brauner Cc: David Howells , Max Kellermann , Ilya Dryomov , Xiubo Li , Trond Myklebust , Jeff Layton , Matthew Wilcox , netfs@lists.linux.dev, linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs@lists.linux.dev, linux-erofs@lists.ozlabs.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Zilin Guan , Akira Yokosawa Subject: [PATCH 06/10] netfs: Remove redundant use of smp_rmb() Date: Fri, 13 Dec 2024 13:50:06 +0000 Message-ID: <20241213135013.2964079-7-dhowells@redhat.com> In-Reply-To: <20241213135013.2964079-1-dhowells@redhat.com> References: <20241213135013.2964079-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: ceph-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.40 From: Zilin Guan The function netfs_unbuffered_write_iter_locked() in fs/netfs/direct_write.c contains an unnecessary smp_rmb() call after wait_on_bit(). Since wait_on_bit() already incorporates a memory barrier that ensures the flag update is visible before the function returns, the smp_rmb() provides no additional benefit and incurs unnecessary overhead. This patch removes the redundant barrier to simplify and optimize the code. Signed-off-by: Zilin Guan Signed-off-by: David Howells cc: Akira Yokosawa cc: Jeff Layton cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20241207021952.2978530-1-zilin@seu.edu.cn/ Reviewed-by: Akira Yokosawa --- fs/netfs/direct_write.c | 1 - 1 file changed, 1 deletion(-) diff --git a/fs/netfs/direct_write.c b/fs/netfs/direct_write.c index 88f2adfab75e..173e8b5e6a93 100644 --- a/fs/netfs/direct_write.c +++ b/fs/netfs/direct_write.c @@ -104,7 +104,6 @@ ssize_t netfs_unbuffered_write_iter_locked(struct kiocb *iocb, struct iov_iter * trace_netfs_rreq(wreq, netfs_rreq_trace_wait_ip); wait_on_bit(&wreq->flags, NETFS_RREQ_IN_PROGRESS, TASK_UNINTERRUPTIBLE); - smp_rmb(); /* Read error/transferred after RIP flag */ ret = wreq->error; if (ret == 0) { ret = wreq->transferred; From patchwork Fri Dec 13 13:50:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 850313 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 97A671E8837 for ; Fri, 13 Dec 2024 13:51:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734097884; cv=none; b=tyaVE9Hy+qDr0eXdb7GsQNEZzvnLvQWZ9lXMLpm9DNcWy/w4HCPUc1WELCakEfuLIQvbip+LBSr0GS0cDcF5ykNmdkqys+92pga6gTKsvaCfuwABlua5sMCHzP14Yttwln8wTUyRbJO1SqoUJhJuZSNnyU1sAFNyjDjCLPYTb/w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734097884; c=relaxed/simple; bh=JdOtE9prXCiIruov8qMCnPB6JaU+McaaScCdBsotj5s=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ILVEAS5tCYnXqpqyINk+gC6FtNkNV0V3zQu4znbUYcAWHnP6WrSCiW0CRktJjS4DfQ237lw/fBaVwHn1hCVtVi7Wt1BT/c/E+yPRMv1b22HgY7aSLCfnjLht/HI7v0rsgk6CKQTudvGSahEfI97P6T5S7UaC3IGlkSgy4KSxvhs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=D9b5Wy1w; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="D9b5Wy1w" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1734097881; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=KjPETwBK7bWpM5YaAhKCzBXjAnc9nY6CVWmIFTOZIoE=; b=D9b5Wy1wXnzm5iUPaUdwjON7lpF7JqjlkKhJ063nkr1uo/nVUpV6tlJQYiM0pp0hRGvjcC bE1+svwZBbJiK3brvz6WXrhEumbz4GjrV1ajw3bqbsJ1U5SR9zEvs18YOKEpL5O/A6JB7f DWD0X2pqTKklz5gEdBlnX15yw4rsqQg= Received: from mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-39-NT_RY7GvPRmiuXhKw1jMLw-1; Fri, 13 Dec 2024 08:51:18 -0500 X-MC-Unique: NT_RY7GvPRmiuXhKw1jMLw-1 X-Mimecast-MFC-AGG-ID: NT_RY7GvPRmiuXhKw1jMLw Received: from mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.4]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-04.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id F191719560BC; Fri, 13 Dec 2024 13:51:14 +0000 (UTC) Received: from warthog.procyon.org.uk.com (unknown [10.42.28.48]) by mx-prod-int-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id B296F30044C1; Fri, 13 Dec 2024 13:51:09 +0000 (UTC) From: David Howells To: Christian Brauner Cc: David Howells , Max Kellermann , Ilya Dryomov , Xiubo Li , Trond Myklebust , Jeff Layton , Matthew Wilcox , netfs@lists.linux.dev, linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs@lists.linux.dev, linux-erofs@lists.ozlabs.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Lizhi Xu , Dominique Martinet Subject: [PATCH 08/10] netfs: Work around recursion by abandoning retry if nothing read Date: Fri, 13 Dec 2024 13:50:08 +0000 Message-ID: <20241213135013.2964079-9-dhowells@redhat.com> In-Reply-To: <20241213135013.2964079-1-dhowells@redhat.com> References: <20241213135013.2964079-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: ceph-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.4.1 on 10.30.177.4 syzkaller reported recursion with a loop of three calls (netfs_rreq_assess, netfs_retry_reads and netfs_rreq_terminated) hitting the limit of the stack during an unbuffered or direct I/O read. There are a number of issues: (1) There is no limit on the number of retries. (2) A subrequest is supposed to be abandoned if it does not transfer anything (NETFS_SREQ_NO_PROGRESS), but that isn't checked under all circumstances. (3) The actual root cause, which is this: if (atomic_dec_and_test(&rreq->nr_outstanding)) netfs_rreq_terminated(rreq, ...); When we do a retry, we bump the rreq->nr_outstanding counter to prevent the final cleanup phase running before we've finished dispatching the retries. The problem is if we hit 0, we have to do the cleanup phase - but we're in the cleanup phase and end up repeating the retry cycle, hence the recursion. Work around the problem by limiting the number of retries. This is based on Lizhi Xu's patch[1], and makes the following changes: (1) Replace NETFS_SREQ_NO_PROGRESS with NETFS_SREQ_MADE_PROGRESS and make the filesystem set it if it managed to read or write at least one byte of data. Clear this bit before issuing a subrequest. (2) Add a ->retry_count member to the subrequest and increment it any time we do a retry. (3) Remove the NETFS_SREQ_RETRYING flag as it is superfluous with ->retry_count. If the latter is non-zero, we're doing a retry. (4) Abandon a subrequest if retry_count is non-zero and we made no progress. (5) Use ->retry_count in both the write-side and the read-size. [?] Question: Should I set a hard limit on retry_count in both read and write? Say it hits 50, we always abandon it. The problem is that these changes only mitigate the issue. As long as it made at least one byte of progress, the recursion is still an issue. This patch mitigates the problem, but does not fix the underlying cause. I have patches that will do that, but it's an intrusive fix that's currently pending for the next merge window. The oops generated by KASAN looks something like: BUG: TASK stack guard page was hit at ffffc9000482ff48 (stack is ffffc90004830000..ffffc90004838000) Oops: stack guard page: 0000 [#1] PREEMPT SMP KASAN NOPTI ... RIP: 0010:mark_lock+0x25/0xc60 kernel/locking/lockdep.c:4686 ... mark_usage kernel/locking/lockdep.c:4646 [inline] __lock_acquire+0x906/0x3ce0 kernel/locking/lockdep.c:5156 lock_acquire.part.0+0x11b/0x380 kernel/locking/lockdep.c:5825 local_lock_acquire include/linux/local_lock_internal.h:29 [inline] ___slab_alloc+0x123/0x1880 mm/slub.c:3695 __slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3908 __slab_alloc_node mm/slub.c:3961 [inline] slab_alloc_node mm/slub.c:4122 [inline] kmem_cache_alloc_noprof+0x2a7/0x2f0 mm/slub.c:4141 radix_tree_node_alloc.constprop.0+0x1e8/0x350 lib/radix-tree.c:253 idr_get_free+0x528/0xa40 lib/radix-tree.c:1506 idr_alloc_u32+0x191/0x2f0 lib/idr.c:46 idr_alloc+0xc1/0x130 lib/idr.c:87 p9_tag_alloc+0x394/0x870 net/9p/client.c:321 p9_client_prepare_req+0x19f/0x4d0 net/9p/client.c:644 p9_client_zc_rpc.constprop.0+0x105/0x880 net/9p/client.c:793 p9_client_read_once+0x443/0x820 net/9p/client.c:1570 p9_client_read+0x13f/0x1b0 net/9p/client.c:1534 v9fs_issue_read+0x115/0x310 fs/9p/vfs_addr.c:74 netfs_retry_read_subrequests fs/netfs/read_retry.c:60 [inline] netfs_retry_reads+0x153a/0x1d00 fs/netfs/read_retry.c:232 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 ... netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_retry_reads+0x155e/0x1d00 fs/netfs/read_retry.c:235 netfs_rreq_assess+0x5d3/0x870 fs/netfs/read_collect.c:371 netfs_rreq_terminated+0xe5/0x110 fs/netfs/read_collect.c:407 netfs_dispatch_unbuffered_reads fs/netfs/direct_read.c:103 [inline] netfs_unbuffered_read fs/netfs/direct_read.c:127 [inline] netfs_unbuffered_read_iter_locked+0x12f6/0x19b0 fs/netfs/direct_read.c:221 netfs_unbuffered_read_iter+0xc5/0x100 fs/netfs/direct_read.c:256 v9fs_file_read_iter+0xbf/0x100 fs/9p/vfs_file.c:361 do_iter_readv_writev+0x614/0x7f0 fs/read_write.c:832 vfs_readv+0x4cf/0x890 fs/read_write.c:1025 do_preadv fs/read_write.c:1142 [inline] __do_sys_preadv fs/read_write.c:1192 [inline] __se_sys_preadv fs/read_write.c:1187 [inline] __x64_sys_preadv+0x22d/0x310 fs/read_write.c:1187 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xcd/0x250 arch/x86/entry/common.c:83 Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Closes: https://syzkaller.appspot.com/bug?extid=1fc6f64c40a9d143cfb6 Signed-off-by: David Howells Suggested-by: Lizhi Xu cc: Dominique Martinet cc: Jeff Layton cc: v9fs@lists.linux.dev cc: netfs@lists.linux.dev cc: linux-fsdevel@vger.kernel.org Link: https://lore.kernel.org/r/20241108034020.3695718-1-lizhi.xu@windriver.com/ [1] --- fs/9p/vfs_addr.c | 6 +++++- fs/afs/write.c | 5 ++++- fs/netfs/read_collect.c | 15 +++++++++------ fs/netfs/read_retry.c | 6 ++++-- fs/netfs/write_collect.c | 5 ++--- fs/netfs/write_issue.c | 2 ++ fs/smb/client/cifssmb.c | 13 +++++++++---- fs/smb/client/smb2pdu.c | 9 ++++++--- include/linux/netfs.h | 6 +++--- 9 files changed, 44 insertions(+), 23 deletions(-) diff --git a/fs/9p/vfs_addr.c b/fs/9p/vfs_addr.c index 819c75233235..3bc9ce6c575e 100644 --- a/fs/9p/vfs_addr.c +++ b/fs/9p/vfs_addr.c @@ -57,6 +57,8 @@ static void v9fs_issue_write(struct netfs_io_subrequest *subreq) int err, len; len = p9_client_write(fid, subreq->start, &subreq->io_iter, &err); + if (len > 0) + __set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags); netfs_write_subrequest_terminated(subreq, len ?: err, false); } @@ -80,8 +82,10 @@ static void v9fs_issue_read(struct netfs_io_subrequest *subreq) if (pos + total >= i_size_read(rreq->inode)) __set_bit(NETFS_SREQ_HIT_EOF, &subreq->flags); - if (!err) + if (!err) { subreq->transferred += total; + __set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags); + } netfs_read_subreq_terminated(subreq, err, false); } diff --git a/fs/afs/write.c b/fs/afs/write.c index 34107b55f834..ccb6aa8027c5 100644 --- a/fs/afs/write.c +++ b/fs/afs/write.c @@ -122,7 +122,7 @@ static void afs_issue_write_worker(struct work_struct *work) if (subreq->debug_index == 3) return netfs_write_subrequest_terminated(subreq, -ENOANO, false); - if (!test_bit(NETFS_SREQ_RETRYING, &subreq->flags)) { + if (!subreq->retry_count) { set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags); return netfs_write_subrequest_terminated(subreq, -EAGAIN, false); } @@ -149,6 +149,9 @@ static void afs_issue_write_worker(struct work_struct *work) afs_wait_for_operation(op); ret = afs_put_operation(op); switch (ret) { + case 0: + __set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags); + break; case -EACCES: case -EPERM: case -ENOKEY: diff --git a/fs/netfs/read_collect.c b/fs/netfs/read_collect.c index 46ce3b7adf07..47ed3a5044e2 100644 --- a/fs/netfs/read_collect.c +++ b/fs/netfs/read_collect.c @@ -438,7 +438,7 @@ void netfs_read_subreq_progress(struct netfs_io_subrequest *subreq, rreq->origin == NETFS_READPAGE || rreq->origin == NETFS_READ_FOR_WRITE)) { netfs_consume_read_data(subreq, was_async); - __clear_bit(NETFS_SREQ_NO_PROGRESS, &subreq->flags); + __set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags); } } EXPORT_SYMBOL(netfs_read_subreq_progress); @@ -497,7 +497,7 @@ void netfs_read_subreq_terminated(struct netfs_io_subrequest *subreq, rreq->origin == NETFS_READPAGE || rreq->origin == NETFS_READ_FOR_WRITE)) { netfs_consume_read_data(subreq, was_async); - __clear_bit(NETFS_SREQ_NO_PROGRESS, &subreq->flags); + __set_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags); } rreq->transferred += subreq->transferred; } @@ -511,10 +511,13 @@ void netfs_read_subreq_terminated(struct netfs_io_subrequest *subreq, } else { trace_netfs_sreq(subreq, netfs_sreq_trace_short); if (subreq->transferred > subreq->consumed) { - __set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags); - __clear_bit(NETFS_SREQ_NO_PROGRESS, &subreq->flags); - set_bit(NETFS_RREQ_NEED_RETRY, &rreq->flags); - } else if (!__test_and_set_bit(NETFS_SREQ_NO_PROGRESS, &subreq->flags)) { + /* If we didn't read new data, abandon retry. */ + if (subreq->retry_count && + test_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags)) { + __set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags); + set_bit(NETFS_RREQ_NEED_RETRY, &rreq->flags); + } + } else if (test_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags)) { __set_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags); set_bit(NETFS_RREQ_NEED_RETRY, &rreq->flags); } else { diff --git a/fs/netfs/read_retry.c b/fs/netfs/read_retry.c index 0350592ea804..0e72e9226fc8 100644 --- a/fs/netfs/read_retry.c +++ b/fs/netfs/read_retry.c @@ -56,6 +56,8 @@ static void netfs_retry_read_subrequests(struct netfs_io_request *rreq) if (test_bit(NETFS_SREQ_FAILED, &subreq->flags)) break; if (__test_and_clear_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags)) { + __clear_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags); + subreq->retry_count++; netfs_reset_iter(subreq); netfs_reissue_read(rreq, subreq); } @@ -137,7 +139,8 @@ static void netfs_retry_read_subrequests(struct netfs_io_request *rreq) stream0->sreq_max_len = subreq->len; __clear_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags); - __set_bit(NETFS_SREQ_RETRYING, &subreq->flags); + __clear_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags); + subreq->retry_count++; spin_lock_bh(&rreq->lock); list_add_tail(&subreq->rreq_link, &rreq->subrequests); @@ -213,7 +216,6 @@ static void netfs_retry_read_subrequests(struct netfs_io_request *rreq) subreq->error = -ENOMEM; __clear_bit(NETFS_SREQ_FAILED, &subreq->flags); __clear_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags); - __clear_bit(NETFS_SREQ_RETRYING, &subreq->flags); } spin_lock_bh(&rreq->lock); list_splice_tail_init(&queue, &rreq->subrequests); diff --git a/fs/netfs/write_collect.c b/fs/netfs/write_collect.c index 82290c92ba7a..ca3a11ed9b54 100644 --- a/fs/netfs/write_collect.c +++ b/fs/netfs/write_collect.c @@ -179,7 +179,6 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, struct iov_iter source = subreq->io_iter; iov_iter_revert(&source, subreq->len - source.count); - __set_bit(NETFS_SREQ_RETRYING, &subreq->flags); netfs_get_subrequest(subreq, netfs_sreq_trace_get_resubmit); netfs_reissue_write(stream, subreq, &source); } @@ -234,7 +233,7 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, /* Renegotiate max_len (wsize) */ trace_netfs_sreq(subreq, netfs_sreq_trace_retry); __clear_bit(NETFS_SREQ_NEED_RETRY, &subreq->flags); - __set_bit(NETFS_SREQ_RETRYING, &subreq->flags); + subreq->retry_count++; stream->prepare_write(subreq); part = min(len, stream->sreq_max_len); @@ -279,7 +278,7 @@ static void netfs_retry_write_stream(struct netfs_io_request *wreq, subreq->start = start; subreq->debug_index = atomic_inc_return(&wreq->subreq_counter); subreq->stream_nr = to->stream_nr; - __set_bit(NETFS_SREQ_RETRYING, &subreq->flags); + subreq->retry_count = 1; trace_netfs_sreq_ref(wreq->debug_id, subreq->debug_index, refcount_read(&subreq->ref), diff --git a/fs/netfs/write_issue.c b/fs/netfs/write_issue.c index bf6d507578e5..ff0e82505a0b 100644 --- a/fs/netfs/write_issue.c +++ b/fs/netfs/write_issue.c @@ -244,6 +244,8 @@ void netfs_reissue_write(struct netfs_io_stream *stream, iov_iter_advance(source, size); iov_iter_truncate(&subreq->io_iter, size); + subreq->retry_count++; + __clear_bit(NETFS_SREQ_MADE_PROGRESS, &subreq->flags); __set_bit(NETFS_SREQ_IN_PROGRESS, &subreq->flags); netfs_do_issue_write(stream, subreq); } diff --git a/fs/smb/client/cifssmb.c b/fs/smb/client/cifssmb.c index bd42a419458e..6cb1e81993f8 100644 --- a/fs/smb/client/cifssmb.c +++ b/fs/smb/client/cifssmb.c @@ -1319,14 +1319,16 @@ cifs_readv_callback(struct mid_q_entry *mid) } if (rdata->result == -ENODATA) { - __set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags); rdata->result = 0; + __set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags); } else { size_t trans = rdata->subreq.transferred + rdata->got_bytes; if (trans < rdata->subreq.len && rdata->subreq.start + trans == ictx->remote_i_size) { - __set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags); rdata->result = 0; + __set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags); + } else if (rdata->got_bytes > 0) { + __set_bit(NETFS_SREQ_MADE_PROGRESS, &rdata->subreq.flags); } } @@ -1670,10 +1672,13 @@ cifs_writev_callback(struct mid_q_entry *mid) if (written > wdata->subreq.len) written &= 0xFFFF; - if (written < wdata->subreq.len) + if (written < wdata->subreq.len) { result = -ENOSPC; - else + } else { result = written; + if (written > 0) + __set_bit(NETFS_SREQ_MADE_PROGRESS, &wdata->subreq.flags); + } break; case MID_REQUEST_SUBMITTED: case MID_RETRY_NEEDED: diff --git a/fs/smb/client/smb2pdu.c b/fs/smb/client/smb2pdu.c index 010eae9d6c47..458b53d1f9cb 100644 --- a/fs/smb/client/smb2pdu.c +++ b/fs/smb/client/smb2pdu.c @@ -4615,6 +4615,7 @@ smb2_readv_callback(struct mid_q_entry *mid) __set_bit(NETFS_SREQ_HIT_EOF, &rdata->subreq.flags); rdata->result = 0; } + __set_bit(NETFS_SREQ_MADE_PROGRESS, &rdata->subreq.flags); } trace_smb3_rw_credits(rreq_debug_id, subreq_debug_index, rdata->credits.value, server->credits, server->in_flight, @@ -4840,10 +4841,12 @@ smb2_writev_callback(struct mid_q_entry *mid) if (written > wdata->subreq.len) written &= 0xFFFF; - if (written < wdata->subreq.len) + if (written < wdata->subreq.len) { wdata->result = -ENOSPC; - else + } else if (written > 0) { wdata->subreq.len = written; + __set_bit(NETFS_SREQ_MADE_PROGRESS, &wdata->subreq.flags); + } break; case MID_REQUEST_SUBMITTED: case MID_RETRY_NEEDED: @@ -5012,7 +5015,7 @@ smb2_async_writev(struct cifs_io_subrequest *wdata) } #endif - if (test_bit(NETFS_SREQ_RETRYING, &wdata->subreq.flags)) + if (wdata->subreq.retry_count > 0) smb2_set_replay(server, &rqst); cifs_dbg(FYI, "async write at %llu %u bytes iter=%zx\n", diff --git a/include/linux/netfs.h b/include/linux/netfs.h index 5eaceef41e6c..4083d77e3f39 100644 --- a/include/linux/netfs.h +++ b/include/linux/netfs.h @@ -185,6 +185,7 @@ struct netfs_io_subrequest { short error; /* 0 or error that occurred */ unsigned short debug_index; /* Index in list (for debugging output) */ unsigned int nr_segs; /* Number of segs in io_iter */ + u8 retry_count; /* The number of retries (0 on initial pass) */ enum netfs_io_source source; /* Where to read from/write to */ unsigned char stream_nr; /* I/O stream this belongs to */ unsigned char curr_folioq_slot; /* Folio currently being read */ @@ -194,14 +195,13 @@ struct netfs_io_subrequest { #define NETFS_SREQ_COPY_TO_CACHE 0 /* Set if should copy the data to the cache */ #define NETFS_SREQ_CLEAR_TAIL 1 /* Set if the rest of the read should be cleared */ #define NETFS_SREQ_SEEK_DATA_READ 3 /* Set if ->read() should SEEK_DATA first */ -#define NETFS_SREQ_NO_PROGRESS 4 /* Set if we didn't manage to read any data */ +#define NETFS_SREQ_MADE_PROGRESS 4 /* Set if we transferred at least some data */ #define NETFS_SREQ_ONDEMAND 5 /* Set if it's from on-demand read mode */ #define NETFS_SREQ_BOUNDARY 6 /* Set if ends on hard boundary (eg. ceph object) */ #define NETFS_SREQ_HIT_EOF 7 /* Set if short due to EOF */ #define NETFS_SREQ_IN_PROGRESS 8 /* Unlocked when the subrequest completes */ #define NETFS_SREQ_NEED_RETRY 9 /* Set if the filesystem requests a retry */ -#define NETFS_SREQ_RETRYING 10 /* Set if we're retrying */ -#define NETFS_SREQ_FAILED 11 /* Set if the subreq failed unretryably */ +#define NETFS_SREQ_FAILED 10 /* Set if the subreq failed unretryably */ }; enum netfs_io_origin { From patchwork Fri Dec 13 13:50:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 850312 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 78A8D1EE7AF for ; Fri, 13 Dec 2024 13:51:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734097897; cv=none; b=fZT7lge0g5KHF3TJ4W1I+m0gv955AYsvcjshS5eykXlAM9+MDl4+qk51nHIbXdKUELcYHjhZv7fKObHH09mqju1agbKY+zq1Nltmmr7YcKxLqS+WpdB/0Sz399NqwuVv4aUk4qKv3SCYqpGJW21jQ1pcCvcQJFw+yYb8FP9XAZA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734097897; c=relaxed/simple; bh=SrqCMAuMjlWi8sIixKN4Dar0oa3TrmkaDfBeGAAPycg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=aQg76iPjKV8YCyIsMY3DRQ+gpt7H5/FZdbr43TYa7IDRoJSnyQpYO+n/X3460oAfadqQU+mCUmJTIS/Y/0UWzJSFXqlS6XaFKu5aCpXAkBZ19bTvJf9oqS6QDGymngull3SGvfL4q9qeoMQeh/auIFc3F82au6Bht4DET96qE84= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=gFcncDkQ; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="gFcncDkQ" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1734097893; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=L1v5Uahl/uBNImxZaUKZS5taSNKrtxd0nK3bDl+BkV8=; b=gFcncDkQScyof+oZWSXMb8Uj0dNkyguo4wg2n5Zd76UMxbvtlZjSHs23Uio68rTlPl+KXk by5rmukTvE6soTdYUbACScuDCdeQSK/JEnxqwFtNIPECCRWDwycnCFLInVLgS0h6I+NPTn JAW1c2Cmkk3mLtmbZtKPqunVyrmD43c= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-473-iuxIXFenMlSlR7jWS7DojA-1; Fri, 13 Dec 2024 08:51:30 -0500 X-MC-Unique: iuxIXFenMlSlR7jWS7DojA-1 X-Mimecast-MFC-AGG-ID: iuxIXFenMlSlR7jWS7DojA Received: from mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.15]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id 059B41955F08; Fri, 13 Dec 2024 13:51:28 +0000 (UTC) Received: from warthog.procyon.org.uk.com (unknown [10.42.28.48]) by mx-prod-int-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id 056F31956086; Fri, 13 Dec 2024 13:51:22 +0000 (UTC) From: David Howells To: Christian Brauner Cc: David Howells , Max Kellermann , Ilya Dryomov , Xiubo Li , Trond Myklebust , Jeff Layton , Matthew Wilcox , netfs@lists.linux.dev, linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs@lists.linux.dev, linux-erofs@lists.ozlabs.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH 10/10] netfs: Fix the (non-)cancellation of copy when cache is temporarily disabled Date: Fri, 13 Dec 2024 13:50:10 +0000 Message-ID: <20241213135013.2964079-11-dhowells@redhat.com> In-Reply-To: <20241213135013.2964079-1-dhowells@redhat.com> References: <20241213135013.2964079-1-dhowells@redhat.com> Precedence: bulk X-Mailing-List: ceph-devel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.15 When the caching for a cookie is temporarily disabled (e.g. due to a DIO write on that file), future copying to the cache for that file is disabled until all fds open on that file are closed. However, if netfslib is using the deprecated PG_private_2 method (such as is currently used by ceph), and decides it wants to copy to the cache, netfs_advance_write() will just bail at the first check seeing that the cache stream is unavailable, and indicate that it dealt with all the content. This means that we have no subrequests to provide notifications to drive the state machine or even to pin the request and the request just gets discarded, leaving the folios with PG_private_2 set. Fix this by jumping directly to cancel the request if the cache is not available. That way, we don't remove mark3 from the folio_queue list and netfs_pgpriv2_cancel() will clean up the folios. This was found by running the generic/013 xfstest against ceph with an active cache and the "-o fsc" option passed to ceph. That would usually hang Fixes: ee4cdf7ba857 ("netfs: Speed up buffered reading") Reported-by: Max Kellermann Closes: https://lore.kernel.org/r/CAKPOu+_4m80thNy5_fvROoxBm689YtA0dZ-=gcmkzwYSY4syqw@mail.gmail.com/ Signed-off-by: David Howells cc: Jeff Layton cc: Ilya Dryomov cc: Xiubo Li cc: netfs@lists.linux.dev cc: ceph-devel@vger.kernel.org cc: linux-fsdevel@vger.kernel.org --- fs/netfs/read_pgpriv2.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/netfs/read_pgpriv2.c b/fs/netfs/read_pgpriv2.c index ba5af89d37fa..54d5004fec18 100644 --- a/fs/netfs/read_pgpriv2.c +++ b/fs/netfs/read_pgpriv2.c @@ -170,6 +170,10 @@ void netfs_pgpriv2_write_to_the_cache(struct netfs_io_request *rreq) trace_netfs_write(wreq, netfs_write_trace_copy_to_cache); netfs_stat(&netfs_n_wh_copy_to_cache); + if (!wreq->io_streams[1].avail) { + netfs_put_request(wreq, false, netfs_rreq_trace_put_return); + goto couldnt_start; + } for (;;) { error = netfs_pgpriv2_copy_folio(wreq, folio);