Message ID | 20231114211804.1449162-1-bvanassche@acm.org |
---|---|
Headers | show |
Series | Improve write performance for zoned UFS devices | expand |
As this keeps getting reposted: I still think it is a very bad idea to add this amount of complexity to the SCSI code, for a model that can't work for the general case and diverges from the established NVMe model. So I do not thing we should support this.
Hi Bart, On 11/15/2023 5:16 AM, Bart Van Assche wrote: > From the UFSHCI 4.0 specification, about the legacy (single queue) mode: > "The host controller always process transfer requests in-order according > to the order submitted to the list. In case of multiple commands with > single doorbell register ringing (batch mode), The dispatch order for > these transfer requests by host controller will base on their index in > the List. A transfer request with lower index value will be executed > before a transfer request with higher index value." > > From the UFSHCI 4.0 specification, about the MCQ mode: > "Command Submission > 1. Host SW writes an Entry to SQ > 2. Host SW updates SQ doorbell tail pointer > > Command Processing > 3. After fetching the Entry, Host Controller updates SQ doorbell head > pointer > 4. Host controller sends COMMAND UPIU to UFS device" > > In other words, for both legacy and MCQ mode, UFS controllers are > required to forward commands to the UFS device in the order these > commands have been received from the host. > > Notes: > - For legacy mode this is only correct if the host submits one > command at a time. The UFS driver does this. > - Also in legacy mode, the command order is not preserved if > auto-hibernation is enabled in the UFS controller. Hence, enable > zone write locking if auto-hibernation is enabled. > > This patch improves performance as follows on my test setup: > - With the mq-deadline scheduler: 2.5x more IOPS for small writes. > - When not using an I/O scheduler compared to using mq-deadline with > zone locking: 4x more IOPS for small writes. > > Reviewed-by: Bao D. Nguyen <quic_nguyenb@quicinc.com> > Reviewed-by: Can Guo <quic_cang@quicinc.com> > Cc: Martin K. Petersen <martin.petersen@oracle.com> > Cc: Avri Altman <avri.altman@wdc.com> > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > --- > drivers/ufs/core/ufshcd.c | 25 +++++++++++++++++++++++-- > 1 file changed, 23 insertions(+), 2 deletions(-) > > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c > index 732509289165..e78954cda3ae 100644 > --- a/drivers/ufs/core/ufshcd.c > +++ b/drivers/ufs/core/ufshcd.c > @@ -4421,6 +4421,20 @@ static int ufshcd_update_preserves_write_order(struct ufs_hba *hba, > return -EPERM; > } > } > + shost_for_each_device(sdev, hba->host) > + blk_freeze_queue_start(sdev->request_queue); > + shost_for_each_device(sdev, hba->host) { > + struct request_queue *q = sdev->request_queue; > + > + blk_mq_freeze_queue_wait(q); > + q->limits.driver_preserves_write_order = preserves_write_order; > + blk_queue_required_elevator_features(q, > + !preserves_write_order && blk_queue_is_zoned(q) ? > + ELEVATOR_F_ZBD_SEQ_WRITE : 0); > + if (q->disk) > + disk_set_zoned(q->disk, q->limits.zoned); > + blk_mq_unfreeze_queue(q); > + } > > return 0; > } > @@ -4463,7 +4477,8 @@ int ufshcd_auto_hibern8_update(struct ufs_hba *hba, u32 ahit) > > if (!is_mcq_enabled(hba) && !prev_state && new_state) { > /* > - * Auto-hibernation will be enabled for legacy UFSHCI mode. > + * Auto-hibernation will be enabled for legacy UFSHCI mode. Tell > + * the block layer that write requests may be reordered. > */ > ret = ufshcd_update_preserves_write_order(hba, false); > if (ret) > @@ -4479,7 +4494,8 @@ int ufshcd_auto_hibern8_update(struct ufs_hba *hba, u32 ahit) > } > if (!is_mcq_enabled(hba) && prev_state && !new_state) { > /* > - * Auto-hibernation has been disabled. > + * Auto-hibernation has been disabled. Tell the block layer that > + * the order of write requests is preserved. > */ > ret = ufshcd_update_preserves_write_order(hba, true); > WARN_ON_ONCE(ret); > @@ -5247,6 +5263,10 @@ static int ufshcd_slave_configure(struct scsi_device *sdev) > struct ufs_hba *hba = shost_priv(sdev->host); > struct request_queue *q = sdev->request_queue; > > + q->limits.driver_preserves_write_order = > + !ufshcd_is_auto_hibern8_supported(hba) || > + FIELD_GET(UFSHCI_AHIBERN8_TIMER_MASK, hba->ahit) == 0; > + I got some time testing these changes on SM8650 with MCQ enabled. I found that with these changes in place (with AH8 disabled). Even we can make sure UFS driver does not re-order requests in MCQ mode, the reorder is still happening while running FIO and can be seen from ftrace logs. I think it is related with below logic in blk-mq-sched.c, please correct me if I am wrong. static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx) { ... } else if (multi_hctxs) { /* * Requests from different hctx may be dequeued from some * schedulers, such as bfq and deadline. * * Sort the requests in the list according to their hctx, * dispatch batching requests from same hctx at a time. */ list_sort(NULL, &rq_list, sched_rq_cmp); ... } Thanks, Can Guo. > blk_queue_update_dma_pad(q, PRDT_DATA_BYTE_COUNT_PAD - 1); > > /* > @@ -9026,6 +9046,7 @@ static const struct scsi_host_template ufshcd_driver_template = { > .max_host_blocked = 1, > .track_queue_depth = 1, > .skip_settle_delay = 1, > + .needs_prepare_resubmit = 1, > .sdev_groups = ufshcd_driver_groups, > .rpm_autosuspend_delay = RPM_AUTOSUSPEND_DELAY_MS, > };
On Mon, Nov 27, 2023 at 11:35:48AM -0800, Bart Van Assche wrote:
> Here is some additional background information:
I know the background. I also know that JEDEC did all this aginst
better judgement and knowing the situation. We should not give them
their carrot after they haven't even been interested in engaging.
On 11/28/23 04:53, Christoph Hellwig wrote: > I know the background. I also know that JEDEC did all this aginst > better judgement and knowing the situation. We should not give them > their carrot after they haven't even been interested in engaging. That statement is overly negative. The JEDEC Zoned Storage for UFS standard has been published last week [1]. It can be downloaded by anyone for free after having created a JEDEC account, which is also free. As one can see in this standard, nothing excludes using a zone append command. Once T10 standardizes a zone append command, it can be implemented by UFS vendors. However, I do not know whether T10 plans to standardize a zone append command. Bart. [1] https://www.jedec.org/system/files/docs/JESD220-5.pdf
On 11/27/23 17:45, Can Guo wrote: > I got some time testing these changes on SM8650 with MCQ enabled. I > found that with these changes in place (with AH8 disabled). Even we > can make sure UFS driver does not re-order requests in MCQ mode, the > reorder is still happening while running FIO and can be seen from > ftrace logs. Hi Can, Thank you for having taken the time to run this test and also for having shared your findings. I have not yet had the chance to test this patch series myself on an MCQ test setup. I will try to locate such a test setup and test this patch series on an MCQ setup. Thanks, Bart.