From patchwork Mon Jun 16 22:33:04 2025
X-Patchwork-Submitter: Bart Van Assche
X-Patchwork-Id: 897273
From: Bart Van Assche
To: Jens Axboe
Cc: linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
    Christoph Hellwig, Damien Le Moal, Bart Van Assche, Yu Kuai
Subject: [PATCH v18 04/12] blk-mq: Restore the zoned write order when requeuing
Date: Mon, 16 Jun 2025 15:33:04 -0700
Message-ID: <20250616223312.1607638-5-bvanassche@acm.org>
In-Reply-To: <20250616223312.1607638-1-bvanassche@acm.org>
References: <20250616223312.1607638-1-bvanassche@acm.org>
X-Mailing-List: linux-scsi@vger.kernel.org

Zoned writes may be requeued. This happens if a block driver returns
BLK_STS_RESOURCE, to handle SCSI unit attentions, or when the SCSI error
handler requeues requests after error handling has finished. Requests may be
requeued in a different order than the order in which they were submitted.
Restore the submission order when requests are requeued.

Add RQF_DONTPREP to RQF_NOMERGE_FLAGS because this patch may cause
RQF_DONTPREP requests to reach the code that checks whether a request can be
merged, and RQF_DONTPREP requests must not be merged.

Cc: Christoph Hellwig
Cc: Damien Le Moal
Cc: Yu Kuai
Signed-off-by: Bart Van Assche
---
 block/bfq-iosched.c    |  2 ++
 block/blk-mq.c         | 20 +++++++++++++++++++-
 block/blk-mq.h         |  2 ++
 block/kyber-iosched.c  |  2 ++
 block/mq-deadline.c    |  7 ++++++-
 include/linux/blk-mq.h | 13 ++++++++++++-
 6 files changed, 43 insertions(+), 3 deletions(-)
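
The ordering is restored by the new blk_mq_insert_ordered() helper, which
walks the target list and inserts the requeued request in front of the first
request with a higher start sector. A minimal userspace sketch of that
insertion technique, for illustration only (the fake_rq type, the
insert_ordered() helper and the sample sector numbers are invented here and
are not part of the patch):

/*
 * Sketch of the ordered-insertion idea: keep a pending list sorted by
 * start sector so that requeued zoned writes are dispatched in LBA order.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

struct fake_rq {
	uint64_t pos;		/* start sector, like blk_rq_pos() */
	struct fake_rq *next;
};

/* Insert rq in front of the first entry with a higher position; else append. */
static void insert_ordered(struct fake_rq **head, struct fake_rq *rq)
{
	struct fake_rq **link = head;

	while (*link && (*link)->pos <= rq->pos)
		link = &(*link)->next;

	rq->next = *link;
	*link = rq;
}

int main(void)
{
	/* Requeued out of order, e.g. after error handling: 48, 16, 32. */
	uint64_t requeued[] = { 48, 16, 32 };
	struct fake_rq *head = NULL, *rq;
	size_t i;

	for (i = 0; i < sizeof(requeued) / sizeof(requeued[0]); i++) {
		rq = calloc(1, sizeof(*rq));
		if (!rq)
			return 1;
		rq->pos = requeued[i];
		insert_ordered(&head, rq);
	}

	/* Prints "16 32 48": the submission order has been restored. */
	for (rq = head; rq; rq = rq->next)
		printf("%llu ", (unsigned long long)rq->pos);
	printf("\n");
	return 0;
}

Unlike this sketch, the kernel helper below also skips requests that belong
to a different request queue (the rq2->q == q check), since a single list may
hold requests for more than one queue.
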
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 0cb1e9873aab..1bd3afe5d779 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -6276,6 +6276,8 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 
 	if (flags & BLK_MQ_INSERT_AT_HEAD) {
 		list_add(&rq->queuelist, &bfqd->dispatch);
+	} else if (flags & BLK_MQ_INSERT_ORDERED) {
+		blk_mq_insert_ordered(rq, &bfqd->dispatch);
 	} else if (!bfqq) {
 		list_add_tail(&rq->queuelist, &bfqd->dispatch);
 	} else {
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 9dab1caf750a..1735b1ac0574 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1562,7 +1562,9 @@ static void blk_mq_requeue_work(struct work_struct *work)
 		 * already. Insert it into the hctx dispatch list to avoid
 		 * block layer merges for the request.
 		 */
-		if (rq->rq_flags & RQF_DONTPREP)
+		if (blk_rq_is_seq_zoned_write(rq))
+			blk_mq_insert_request(rq, BLK_MQ_INSERT_ORDERED);
+		else if (rq->rq_flags & RQF_DONTPREP)
 			blk_mq_request_bypass_insert(rq, 0);
 		else
 			blk_mq_insert_request(rq, BLK_MQ_INSERT_AT_HEAD);
@@ -2595,6 +2597,20 @@ static void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx,
 	blk_mq_run_hw_queue(hctx, run_queue_async);
 }
 
+void blk_mq_insert_ordered(struct request *rq, struct list_head *list)
+{
+	struct request_queue *q = rq->q;
+	struct request *rq2;
+
+	list_for_each_entry(rq2, list, queuelist)
+		if (rq2->q == q && blk_rq_pos(rq2) > blk_rq_pos(rq))
+			break;
+
+	/* Insert rq before rq2. If rq2 is the list head, append at the end. */
+	list_add_tail(&rq->queuelist, &rq2->queuelist);
+}
+EXPORT_SYMBOL_GPL(blk_mq_insert_ordered);
+
 static void blk_mq_insert_request(struct request *rq, blk_insert_t flags)
 {
 	struct request_queue *q = rq->q;
@@ -2649,6 +2665,8 @@ static void blk_mq_insert_request(struct request *rq, blk_insert_t flags)
 		spin_lock(&ctx->lock);
 		if (flags & BLK_MQ_INSERT_AT_HEAD)
 			list_add(&rq->queuelist, &ctx->rq_lists[hctx->type]);
+		else if (flags & BLK_MQ_INSERT_ORDERED)
+			blk_mq_insert_ordered(rq, &ctx->rq_lists[hctx->type]);
 		else
 			list_add_tail(&rq->queuelist,
 				      &ctx->rq_lists[hctx->type]);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 52e907547b55..75b657554f1f 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -40,8 +40,10 @@ enum {
 
 typedef unsigned int __bitwise blk_insert_t;
 #define BLK_MQ_INSERT_AT_HEAD	((__force blk_insert_t)0x01)
+#define BLK_MQ_INSERT_ORDERED	((__force blk_insert_t)0x02)
 
 void blk_mq_submit_bio(struct bio *bio);
+void blk_mq_insert_ordered(struct request *rq, struct list_head *list);
 int blk_mq_poll(struct request_queue *q, blk_qc_t cookie,
 		struct io_comp_batch *iob, unsigned int flags);
 void blk_mq_exit_queue(struct request_queue *q);
diff --git a/block/kyber-iosched.c b/block/kyber-iosched.c
index 4dba8405bd01..051c05ceafd7 100644
--- a/block/kyber-iosched.c
+++ b/block/kyber-iosched.c
@@ -603,6 +603,8 @@ static void kyber_insert_requests(struct blk_mq_hw_ctx *hctx,
 		trace_block_rq_insert(rq);
 		if (flags & BLK_MQ_INSERT_AT_HEAD)
 			list_move(&rq->queuelist, head);
+		else if (flags & BLK_MQ_INSERT_ORDERED)
+			blk_mq_insert_ordered(rq, head);
 		else
 			list_move_tail(&rq->queuelist, head);
 		sbitmap_set_bit(&khd->kcq_map[sched_domain],
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 2edf1cac06d5..110fef65b829 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -710,7 +710,12 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 		 * set expire time and add to fifo list
 		 */
 		rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
-		list_add_tail(&rq->queuelist, &per_prio->fifo_list[data_dir]);
+		if (flags & BLK_MQ_INSERT_ORDERED)
+			blk_mq_insert_ordered(rq,
+					      &per_prio->fifo_list[data_dir]);
+		else
+			list_add_tail(&rq->queuelist,
+				      &per_prio->fifo_list[data_dir]);
 	}
 }
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index de8c85a03bb7..6833d5a980ef 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -86,7 +86,7 @@ enum rqf_flags {
 
 /* flags that prevent us from merging requests: */
 #define RQF_NOMERGE_FLAGS \
-	(RQF_STARTED | RQF_FLUSH_SEQ | RQF_SPECIAL_PAYLOAD)
+	(RQF_STARTED | RQF_FLUSH_SEQ | RQF_DONTPREP | RQF_SPECIAL_PAYLOAD)
 
 enum mq_rq_state {
 	MQ_RQ_IDLE		= 0,
@@ -1189,4 +1189,15 @@ static inline int blk_rq_map_sg(struct request *rq, struct scatterlist *sglist)
 }
 void blk_dump_rq_flags(struct request *, char *);
 
+static inline bool blk_rq_is_seq_zoned_write(struct request *rq)
+{
+	switch (req_op(rq)) {
+	case REQ_OP_WRITE:
+	case REQ_OP_WRITE_ZEROES:
+		return bdev_zone_is_seq(rq->q->disk->part0, blk_rq_pos(rq));
+	default:
+		return false;
+	}
+}
+
 #endif /* BLK_MQ_H */