From patchwork Tue Dec 17 18:29:35 2024
X-Patchwork-Submitter: Daniel Wagner
X-Patchwork-Id: 851630
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:35 +0100
Subject: [PATCH v4 1/9] lib/group_cpus: let group_cpu_evenly return number of groups
Message-Id: <20241217-isolcpus-io-queues-v4-1-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , Kashyap Desai , Sumit Saxena , Shivasharan S , Chandrakanth patil , "Martin K. Petersen" , Nilesh Javali , GR-QLogic-Storage-Upstream@marvell.com, Don Brace , "Michael S.
Tsirkin" , Jason Wang , Paolo Bonzini , Stefan Hajnoczi , =?utf-8?q?Eugenio_P=C3=A9rez?= , Xuan Zhuo , Andrew Morton , Thomas Gleixner Cc: Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , =?utf-8?q?Michal_Koutn=C3=BD?= , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , Sridhar Balaraman , "brookxu.cn" , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, Daniel Wagner X-Mailer: b4 0.14.2 group_cpu_evenly might allocated less groups then the requested: group_cpu_evenly __group_cpus_evenly alloc_nodes_groups # allocated total groups may be less than numgrps when # active total CPU number is less then numgrps In this case, the caller will do an out of bound access because the caller assumes the masks returned has numgrps. Return the number of groups created so the caller can limit the access range accordingly. Signed-off-by: Daniel Wagner --- block/blk-mq-cpumap.c | 7 ++++--- drivers/virtio/virtio_vdpa.c | 2 +- fs/fuse/virtio_fs.c | 7 ++++--- include/linux/group_cpus.h | 2 +- kernel/irq/affinity.c | 2 +- lib/group_cpus.c | 23 +++++++++++++---------- 6 files changed, 24 insertions(+), 19 deletions(-) diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index ad8d6a363f24ae11968b42f7bcfd6a719a0499b7..85c0a7073bd8bff5d34aad1729d45d89da4c4bd1 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -19,9 +19,10 @@ void blk_mq_map_queues(struct blk_mq_queue_map *qmap) { const struct cpumask *masks; - unsigned int queue, cpu; + unsigned int queue, cpu, nr_masks; - masks = group_cpus_evenly(qmap->nr_queues); + nr_masks = qmap->nr_queues; + masks = group_cpus_evenly(&nr_masks); if (!masks) { for_each_possible_cpu(cpu) qmap->mq_map[cpu] = qmap->queue_offset; @@ -29,7 +30,7 @@ void blk_mq_map_queues(struct blk_mq_queue_map *qmap) } for (queue = 0; queue < qmap->nr_queues; queue++) { - for_each_cpu(cpu, &masks[queue]) + for_each_cpu(cpu, &masks[queue % nr_masks]) qmap->mq_map[cpu] = qmap->queue_offset + queue; } kfree(masks); diff --git a/drivers/virtio/virtio_vdpa.c b/drivers/virtio/virtio_vdpa.c index 1f60c9d5cb1810a6f208c24bb2ac640d537391a0..c478cccf5fd68b9c9c01332046c24316573d97cd 100644 --- a/drivers/virtio/virtio_vdpa.c +++ b/drivers/virtio/virtio_vdpa.c @@ -330,7 +330,7 @@ create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd) for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) { unsigned int this_vecs = affd->set_size[i]; int j; - struct cpumask *result = group_cpus_evenly(this_vecs); + struct cpumask *result = group_cpus_evenly(&this_vecs); if (!result) { kfree(masks); diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c index 82afe78ec542358e2db6f4d955d521652ae363ec..5acd875f1e9c9840dd9d2f3245665c91230f57a8 100644 --- a/fs/fuse/virtio_fs.c +++ b/fs/fuse/virtio_fs.c @@ -862,7 +862,7 @@ static void virtio_fs_requests_done_work(struct work_struct *work) static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *fs) { const struct cpumask *mask, *masks; - unsigned int q, cpu; + unsigned int q, cpu, nr_masks; /* First attempt to map using existing transport layer affinities * e.g. 
PCIe MSI-X @@ -882,7 +882,8 @@ static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *f return; fallback: /* Attempt to map evenly in groups over the CPUs */ - masks = group_cpus_evenly(fs->num_request_queues); + nr_masks = fs->num_request_queues; + masks = group_cpus_evenly(&nr_masks); /* If even this fails we default to all CPUs use first request queue */ if (!masks) { for_each_possible_cpu(cpu) @@ -891,7 +892,7 @@ static void virtio_fs_map_queues(struct virtio_device *vdev, struct virtio_fs *f } for (q = 0; q < fs->num_request_queues; q++) { - for_each_cpu(cpu, &masks[q]) + for_each_cpu(cpu, &masks[q % nr_masks]) fs->mq_map[cpu] = q + VQ_REQUEST; } kfree(masks); diff --git a/include/linux/group_cpus.h b/include/linux/group_cpus.h index e42807ec61f6e8cf3787af7daa0d8686edfef0a3..8659534a3423e92746738ac57e713b7416e05271 100644 --- a/include/linux/group_cpus.h +++ b/include/linux/group_cpus.h @@ -9,6 +9,6 @@ #include #include -struct cpumask *group_cpus_evenly(unsigned int numgrps); +struct cpumask *group_cpus_evenly(unsigned int *numgrps); #endif diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c index 44a4eba80315cc098ecfa366ca1d88483641b12a..0188e133f1a508a623e33f08a0fca2e1f2cbf4e4 100644 --- a/kernel/irq/affinity.c +++ b/kernel/irq/affinity.c @@ -71,7 +71,7 @@ irq_create_affinity_masks(unsigned int nvecs, struct irq_affinity *affd) for (i = 0, usedvecs = 0; i < affd->nr_sets; i++) { unsigned int this_vecs = affd->set_size[i]; int j; - struct cpumask *result = group_cpus_evenly(this_vecs); + struct cpumask *result = group_cpus_evenly(&this_vecs); if (!result) { kfree(masks); diff --git a/lib/group_cpus.c b/lib/group_cpus.c index ee272c4cefcc13907ce9f211f479615d2e3c9154..73da83ca2c45347a3a443d42d4f16801a47effd5 100644 --- a/lib/group_cpus.c +++ b/lib/group_cpus.c @@ -334,7 +334,8 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps, * @numgrps: number of groups * * Return: cpumask array if successful, NULL otherwise. And each element - * includes CPUs assigned to this group + * includes CPUs assigned to this group. numgrps will be updated to the + * actuall allocated number of masks. 
* * Try to put close CPUs from viewpoint of CPU and NUMA locality into * same group, and run two-stage grouping: @@ -344,9 +345,9 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps, * We guarantee in the resulted grouping that all CPUs are covered, and * no same CPU is assigned to multiple groups */ -struct cpumask *group_cpus_evenly(unsigned int numgrps) +struct cpumask *group_cpus_evenly(unsigned int *numgrps) { - unsigned int curgrp = 0, nr_present = 0, nr_others = 0; + unsigned int curgrp = 0, nr_present = 0, nr_others = 0, nr_grps; cpumask_var_t *node_to_cpumask; cpumask_var_t nmsk, npresmsk; int ret = -ENOMEM; @@ -362,7 +363,8 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps) if (!node_to_cpumask) goto fail_npresmsk; - masks = kcalloc(numgrps, sizeof(*masks), GFP_KERNEL); + nr_grps = *numgrps; + masks = kcalloc(nr_grps, sizeof(*masks), GFP_KERNEL); if (!masks) goto fail_node_to_cpumask; @@ -383,7 +385,7 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps) cpumask_copy(npresmsk, data_race(cpu_present_mask)); /* grouping present CPUs first */ - ret = __group_cpus_evenly(curgrp, numgrps, node_to_cpumask, + ret = __group_cpus_evenly(curgrp, nr_grps, node_to_cpumask, npresmsk, nmsk, masks); if (ret < 0) goto fail_build_affinity; @@ -395,19 +397,19 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps) * group space, assign the non present CPUs to the already * allocated out groups. */ - if (nr_present >= numgrps) + if (nr_present >= nr_grps) curgrp = 0; else curgrp = nr_present; cpumask_andnot(npresmsk, cpu_possible_mask, npresmsk); - ret = __group_cpus_evenly(curgrp, numgrps, node_to_cpumask, + ret = __group_cpus_evenly(curgrp, nr_grps, node_to_cpumask, npresmsk, nmsk, masks); if (ret >= 0) nr_others = ret; fail_build_affinity: if (ret >= 0) - WARN_ON(nr_present + nr_others < numgrps); + WARN_ON(nr_present + nr_others < nr_grps); fail_node_to_cpumask: free_node_to_cpumask(node_to_cpumask); @@ -421,12 +423,13 @@ struct cpumask *group_cpus_evenly(unsigned int numgrps) kfree(masks); return NULL; } + *numgrps = nr_present + nr_others; return masks; } #else /* CONFIG_SMP */ -struct cpumask *group_cpus_evenly(unsigned int numgrps) +struct cpumask *group_cpus_evenly(unsigned int *numgrps) { - struct cpumask *masks = kcalloc(numgrps, sizeof(*masks), GFP_KERNEL); + struct cpumask *masks = kcalloc(*numgrps, sizeof(*masks), GFP_KERNEL); if (!masks) return NULL; From patchwork Tue Dec 17 18:29:37 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 851629 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EEF831F9AB4; Tue, 17 Dec 2024 18:29:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734460192; cv=none; b=Wi9uxDUzui52gSMT+y3/exMY2LOtstFR0FJ5fr/LM+MwhlqEBoBbZ4yvgCwKy6pmxME0OAHaWzAqUJm0A7NmkA1uoRyOBmaGe0gbBp0pqD2CJmfUwC+Eb12fzAF2RREexvSyZaHU4DiQjbOg7sBQdwA973Cbjix0sUtXjIzKZq4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734460192; c=relaxed/simple; bh=J8AQ9XfymIw2GabflSCTi0yfwCmZTc/3XvdEqwuuAXs=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:37 +0100
Subject: [PATCH v4 3/9] blk-mq: add number of queue calc helper
Message-Id: <20241217-isolcpus-io-queues-v4-3-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>

Multiqueue devices should only allocate queues for the housekeeping CPUs when isolcpus=managed_irq is set. This prevents the isolated CPUs from being disturbed by OS workload.

Add two helper variants that calculate the correct number of queues to use. Two variants are needed because some drivers derive their maximum number of queues from the possible CPU mask, while others use the online CPU mask.
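For illustration, here is a minimal sketch of what the new helpers report on a hypothetical 16-CPU machine booted with isolcpus=managed_irq,8-15 (CPUs 0-7 housekeeping, CPUs 8-15 isolated). The driver limit and the wrapper function are example values only; the real implementation is in the diff below.

/*
 * Hypothetical example, not part of the patch: a driver that supports
 * up to 32 hardware queues on a 16-CPU system booted with
 * isolcpus=managed_irq,8-15.
 */
static unsigned int example_driver_nr_queues(void)
{
	/*
	 * Only the 8 housekeeping CPUs (0-7) are counted, then the result
	 * is clamped against the driver limit: min_not_zero(8, 32) == 8.
	 * Without isolcpus the same call would return 16.
	 */
	return blk_mq_num_possible_queues(32);
}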
Reviewed-by: Christoph Hellwig Signed-off-by: Daniel Wagner --- block/blk-mq-cpumap.c | 45 +++++++++++++++++++++++++++++++++++++++++++++ include/linux/blk-mq.h | 2 ++ 2 files changed, 47 insertions(+) diff --git a/block/blk-mq-cpumap.c b/block/blk-mq-cpumap.c index 85c0a7073bd8bff5d34aad1729d45d89da4c4bd1..b3a863c2db3231624685ab54a1810b22af4111f4 100644 --- a/block/blk-mq-cpumap.c +++ b/block/blk-mq-cpumap.c @@ -12,10 +12,55 @@ #include #include #include +#include #include "blk.h" #include "blk-mq.h" +static unsigned int blk_mq_num_queues(const struct cpumask *mask, + unsigned int max_queues) +{ + unsigned int num; + + if (housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) + mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ); + + num = cpumask_weight(mask); + return min_not_zero(num, max_queues); +} + +/** + * blk_mq_num_possible_queues - Calc nr of queues for multiqueue devices + * @max_queues: The maximal number of queues the hardware/driver + * supports. If max_queues is 0, the argument is + * ignored. + * + * Calculate the number of queues which should be used for a multiqueue + * device based on the number of possible cpu. The helper is considering + * isolcpus settings. + */ +unsigned int blk_mq_num_possible_queues(unsigned int max_queues) +{ + return blk_mq_num_queues(cpu_possible_mask, max_queues); +} +EXPORT_SYMBOL_GPL(blk_mq_num_possible_queues); + +/** + * blk_mq_num_online_queues - Calc nr of queues for multiqueue devices + * @max_queues: The maximal number of queues the hardware/driver + * supports. If max_queues is 0, the argument is + * ignored. + * + * Calculate the number of queues which should be used for a multiqueue + * device based on the number of online cpus. The helper is considering + * isolcpus settings. + */ +unsigned int blk_mq_num_online_queues(unsigned int max_queues) +{ + return blk_mq_num_queues(cpu_online_mask, max_queues); +} +EXPORT_SYMBOL_GPL(blk_mq_num_online_queues); + void blk_mq_map_queues(struct blk_mq_queue_map *qmap) { const struct cpumask *masks; diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 769eab6247d4921e574e0828ab41a580a5a9f2fe..4f0f2ea64de2057750e88c2a3ff7d49e13a7bfc5 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -920,6 +920,8 @@ int blk_mq_freeze_queue_wait_timeout(struct request_queue *q, void blk_mq_unfreeze_queue_non_owner(struct request_queue *q); void blk_freeze_queue_start_non_owner(struct request_queue *q); +unsigned int blk_mq_num_possible_queues(unsigned int max_queues); +unsigned int blk_mq_num_online_queues(unsigned int max_queues); void blk_mq_map_queues(struct blk_mq_queue_map *qmap); void blk_mq_map_hw_queues(struct blk_mq_queue_map *qmap, struct device *dev, unsigned int offset); From patchwork Tue Dec 17 18:29:39 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 851628 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 966F11FA141; Tue, 17 Dec 2024 18:29:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734460196; cv=none; 
From: Daniel Wagner
Date: Tue, 17 Dec 2024 19:29:39 +0100
Subject: [PATCH v4 5/9] scsi: use block layer helpers to calculate num of queues
Message-Id: <20241217-isolcpus-io-queues-v4-5-5d355fbb1e14@kernel.org>
References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>
In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org>

Multiqueue devices should only allocate queues for the housekeeping CPUs when isolcpus=managed_irq is set. This prevents the isolated CPUs from being disturbed by OS workload.

Use the block layer helpers, which calculate the correct number of queues to use when isolcpus is in effect.
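The call-site conversion has the same shape in each driver touched below; a rough before/after sketch follows (the variable names here are illustrative, the real call sites are in the diff):

/* Before: all online CPUs are counted, including isolated ones. */
nr_vecs = min_t(unsigned int, hw_max_vecs, num_online_cpus());

/*
 * After: with isolcpus=managed_irq only housekeeping CPUs are counted,
 * and hw_max_vecs still caps the result (passing 0 means "no cap").
 */
nr_vecs = blk_mq_num_online_queues(hw_max_vecs);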
Signed-off-by: Daniel Wagner --- drivers/scsi/megaraid/megaraid_sas_base.c | 15 +++++++++------ drivers/scsi/qla2xxx/qla_isr.c | 10 +++++----- drivers/scsi/smartpqi/smartpqi_init.c | 5 ++--- 3 files changed, 16 insertions(+), 14 deletions(-) diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index 49abd7dd75a7b7c1ddcfac41acecbbcf7de8f5a4..59d385e5a917979ae2f61f5db2c3355b9cab7e08 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -5962,7 +5962,8 @@ megasas_alloc_irq_vectors(struct megasas_instance *instance) else instance->iopoll_q_count = 0; - num_msix_req = num_online_cpus() + instance->low_latency_index_start; + num_msix_req = blk_mq_num_online_queues(0) + + instance->low_latency_index_start; instance->msix_vectors = min(num_msix_req, instance->msix_vectors); @@ -5978,7 +5979,8 @@ megasas_alloc_irq_vectors(struct megasas_instance *instance) /* Disable Balanced IOPS mode and try realloc vectors */ instance->perf_mode = MR_LATENCY_PERF_MODE; instance->low_latency_index_start = 1; - num_msix_req = num_online_cpus() + instance->low_latency_index_start; + num_msix_req = blk_mq_num_online_queues(0) + + instance->low_latency_index_start; instance->msix_vectors = min(num_msix_req, instance->msix_vectors); @@ -6234,7 +6236,7 @@ static int megasas_init_fw(struct megasas_instance *instance) intr_coalescing = (scratch_pad_1 & MR_INTR_COALESCING_SUPPORT_OFFSET) ? true : false; if (intr_coalescing && - (num_online_cpus() >= MR_HIGH_IOPS_QUEUE_COUNT) && + (blk_mq_num_online_queues(0) >= MR_HIGH_IOPS_QUEUE_COUNT) && (instance->msix_vectors == MEGASAS_MAX_MSIX_QUEUES)) instance->perf_mode = MR_BALANCED_PERF_MODE; else @@ -6278,7 +6280,8 @@ static int megasas_init_fw(struct megasas_instance *instance) else instance->low_latency_index_start = 1; - num_msix_req = num_online_cpus() + instance->low_latency_index_start; + num_msix_req = blk_mq_num_online_queues(0) + + instance->low_latency_index_start; instance->msix_vectors = min(num_msix_req, instance->msix_vectors); @@ -6310,8 +6313,8 @@ static int megasas_init_fw(struct megasas_instance *instance) megasas_setup_reply_map(instance); dev_info(&instance->pdev->dev, - "current msix/online cpus\t: (%d/%d)\n", - instance->msix_vectors, (unsigned int)num_online_cpus()); + "current msix/max num queues\t: (%d/%u)\n", + instance->msix_vectors, blk_mq_num_online_queues(0)); dev_info(&instance->pdev->dev, "RDPQ mode\t: (%s)\n", instance->is_rdpq ? 
"enabled" : "disabled"); diff --git a/drivers/scsi/qla2xxx/qla_isr.c b/drivers/scsi/qla2xxx/qla_isr.c index fe98c76e9be32ff03a1960f366f0d700d1168383..c4c6b5c6658c0734f7ff68bcc31b33dde87296dd 100644 --- a/drivers/scsi/qla2xxx/qla_isr.c +++ b/drivers/scsi/qla2xxx/qla_isr.c @@ -4533,13 +4533,13 @@ qla24xx_enable_msix(struct qla_hw_data *ha, struct rsp_que *rsp) if (USER_CTRL_IRQ(ha) || !ha->mqiobase) { /* user wants to control IRQ setting for target mode */ ret = pci_alloc_irq_vectors(ha->pdev, min_vecs, - min((u16)ha->msix_count, (u16)(num_online_cpus() + min_vecs)), - PCI_IRQ_MSIX); + blk_mq_num_online_queues(ha->msix_count) + min_vecs, + PCI_IRQ_MSIX); } else ret = pci_alloc_irq_vectors_affinity(ha->pdev, min_vecs, - min((u16)ha->msix_count, (u16)(num_online_cpus() + min_vecs)), - PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, - &desc); + blk_mq_num_online_queues(ha->msix_count) + min_vecs, + PCI_IRQ_MSIX | PCI_IRQ_AFFINITY, + &desc); if (ret < 0) { ql_log(ql_log_fatal, vha, 0x00c7, diff --git a/drivers/scsi/smartpqi/smartpqi_init.c b/drivers/scsi/smartpqi/smartpqi_init.c index 04fb24d77e9b5c0137f26bc41f17191cc4c49728..7636c8d1c9f14a0d887c1d517c3664f0d0df7e6e 100644 --- a/drivers/scsi/smartpqi/smartpqi_init.c +++ b/drivers/scsi/smartpqi/smartpqi_init.c @@ -5278,15 +5278,14 @@ static void pqi_calculate_queue_resources(struct pqi_ctrl_info *ctrl_info) if (reset_devices) { num_queue_groups = 1; } else { - int num_cpus; int max_queue_groups; max_queue_groups = min(ctrl_info->max_inbound_queues / 2, ctrl_info->max_outbound_queues - 1); max_queue_groups = min(max_queue_groups, PQI_MAX_QUEUE_GROUPS); - num_cpus = num_online_cpus(); - num_queue_groups = min(num_cpus, ctrl_info->max_msix_vectors); + num_queue_groups = + blk_mq_num_online_queues(ctrl_info->max_msix_vectors); num_queue_groups = min(num_queue_groups, max_queue_groups); } From patchwork Tue Dec 17 18:29:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 851627 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D5DCD1FA832; Tue, 17 Dec 2024 18:30:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734460202; cv=none; b=YHqiksFtp70Gq082Y2TcBwmSj9D78g4l3A+/hHyVs38IPgmt6MaH5tbj82ixRPSkfXjqG0ZxcsB7fks5WVzeDkIbdt17Bp0u943SvzHFB7bF3I+rOvJquMy8d4cNFW962DNEGq6Qz2LmSKI4LvkuxDHVL63lUTq9+CywIR16FF0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734460202; c=relaxed/simple; bh=dzcQYVyMq7ub0Uj13RUnX1qE3Yh3C14qRsIHuipGrl4=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=QcMbf4txpbj9Dzd8vuO4H3aupL+pg6w5KJd9XdmYtrLN2WmzHvzpItFubGeYVPHE+LCpx4rf7LfvmV1UHBz1JJtKl61C7Yl/KXILwiM73Qu+VQ3usw6rRS3nL1eDI0pmZJsX09sSkT+WRq9TE22N2NwGyCV6TD7zGjxXHhHw62w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=nG8pG+od; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="nG8pG+od" Received: by smtp.kernel.org (Postfix) with ESMTPSA id C648CC4CED3; Tue, 17 Dec 2024 18:30:00 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734460201; bh=dzcQYVyMq7ub0Uj13RUnX1qE3Yh3C14qRsIHuipGrl4=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=nG8pG+odeN1F8AWBvZ1AHzhJUVjMLxbf0U4UGyEmyvPLYagYuGY9t7D6ZG8IiLogH S2Oo3if0xy+Q5mPj117g+Uf0Ip4jJ+G/rah17rpxaFZIvlI9CJ3BuTUfs0YKbntkrZ eiVD1NSO9mSUGcJofDrvcHxvEDD8EE0ndWVTjudgJGwtvkvbNRrR7wXsq8weYPYIEw VfV+68CSK2f0pSwNYoh1cPH0E5vWHCpefh8qNL5Okq3zMPBB+AoeQBh2QcftyjFvAE VMd/nuashhEo2taaGbXUnHk6Jjt5OFpdIOIklZcK19PziWJxsEPYNg4ozj6meap0uX 0M2cZFIskAI4g== From: Daniel Wagner Date: Tue, 17 Dec 2024 19:29:41 +0100 Subject: [PATCH v4 7/9] lib/group_cpus: honor housekeeping config when grouping CPUs Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241217-isolcpus-io-queues-v4-7-5d355fbb1e14@kernel.org> References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org> In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , Kashyap Desai , Sumit Saxena , Shivasharan S , Chandrakanth patil , "Martin K. Petersen" , Nilesh Javali , GR-QLogic-Storage-Upstream@marvell.com, Don Brace , "Michael S. Tsirkin" , Jason Wang , Paolo Bonzini , Stefan Hajnoczi , =?utf-8?q?Eugenio_P=C3=A9rez?= , Xuan Zhuo , Andrew Morton , Thomas Gleixner Cc: Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , =?utf-8?q?Michal_Koutn=C3=BD?= , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , Sridhar Balaraman , "brookxu.cn" , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, Daniel Wagner X-Mailer: b4 0.14.2 group_cpus_evenly distributes all present CPUs into groups. This ignores the isolcpus configuration and assigns isolated CPUs into the groups. Make group_cpus_evenly aware of isolcpus configuration and use the housekeeping CPU mask as base for distributing the available CPUs into groups. Reviewed-by: Christoph Hellwig Reviewed-by: Hannes Reinecke Reviewed-by: Sagi Grimberg Signed-off-by: Daniel Wagner --- lib/group_cpus.c | 77 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 75 insertions(+), 2 deletions(-) diff --git a/lib/group_cpus.c b/lib/group_cpus.c index 73da83ca2c45347a3a443d42d4f16801a47effd5..927e4ed634d0d9ca14235c977fc53d6f5f649396 100644 --- a/lib/group_cpus.c +++ b/lib/group_cpus.c @@ -8,6 +8,7 @@ #include #include #include +#include #ifdef CONFIG_SMP @@ -330,7 +331,7 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps, } /** - * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality + * group_possible_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality * @numgrps: number of groups * * Return: cpumask array if successful, NULL otherwise. 
And each element @@ -345,7 +346,7 @@ static int __group_cpus_evenly(unsigned int startgrp, unsigned int numgrps, * We guarantee in the resulted grouping that all CPUs are covered, and * no same CPU is assigned to multiple groups */ -struct cpumask *group_cpus_evenly(unsigned int *numgrps) +static struct cpumask *group_possible_cpus_evenly(unsigned int *numgrps) { unsigned int curgrp = 0, nr_present = 0, nr_others = 0, nr_grps; cpumask_var_t *node_to_cpumask; @@ -426,6 +427,78 @@ struct cpumask *group_cpus_evenly(unsigned int *numgrps) *numgrps = nr_present + nr_others; return masks; } + +/** + * group_mask_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality + * @numgrps: number of groups + * @cpu_mask: CPU to consider for the grouping + * + * Return: cpumask array if successful, NULL otherwise. And each element + * includes CPUs assigned to this group. + * + * Try to put close CPUs from viewpoint of CPU and NUMA locality into + * same group. Allocate present CPUs on these groups evenly. + */ +static struct cpumask *group_mask_cpus_evenly(unsigned int *numgrps, + const struct cpumask *cpu_mask) +{ + cpumask_var_t *node_to_cpumask; + cpumask_var_t nmsk; + unsigned int nr_grps; + int ret = -ENOMEM; + struct cpumask *masks = NULL; + + if (!zalloc_cpumask_var(&nmsk, GFP_KERNEL)) + return NULL; + + node_to_cpumask = alloc_node_to_cpumask(); + if (!node_to_cpumask) + goto fail_nmsk; + + nr_grps = *numgrps; + masks = kcalloc(nr_grps, sizeof(*masks), GFP_KERNEL); + if (!masks) + goto fail_node_to_cpumask; + + build_node_to_cpumask(node_to_cpumask); + + ret = __group_cpus_evenly(0, nr_grps, node_to_cpumask, cpu_mask, nmsk, + masks); + +fail_node_to_cpumask: + free_node_to_cpumask(node_to_cpumask); + +fail_nmsk: + free_cpumask_var(nmsk); + if (ret < 0) { + kfree(masks); + return NULL; + } + *numgrps = ret; + return masks; +} + +/** + * group_cpus_evenly - Group all CPUs evenly per NUMA/CPU locality + * @numgrps: number of groups + * + * Return: cpumask array if successful, NULL otherwise. + * + * group_possible_cpus_evently() is used for distributing the cpus on all + * possible cpus in absence of isolcpus command line argument. + * group_mask_cpu_evenly() is used when the isolcpus command line + * argument is used with managed_irq option. In this case only the + * housekeeping CPUs are considered. 
+ */ +struct cpumask *group_cpus_evenly(unsigned int *numgrps) +{ + if (housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) { + return group_mask_cpus_evenly(numgrps, + housekeeping_cpumask(HK_TYPE_MANAGED_IRQ)); + } + + return group_possible_cpus_evenly(numgrps); +} #else /* CONFIG_SMP */ struct cpumask *group_cpus_evenly(unsigned int *numgrps) { From patchwork Tue Dec 17 18:29:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Wagner X-Patchwork-Id: 851626 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ACAE21FAC53; Tue, 17 Dec 2024 18:30:06 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734460206; cv=none; b=KYk/yTpEXcpA8q37MdzzpRbXYIji01CuoABEy0omzoDgpemrSlK9n1sU/HqZG7pYwiks09Ailcu0zLek1gbYpam3ZkrnPGUj+9Oj+pvo9nLnsUqll01NYZIkEefbROlyvdlHXnSNA1tzyF1oMefmTjB6BKdFdOI5M+V4+tV4yyk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1734460206; c=relaxed/simple; bh=6NgFV7+SLJHkzCci3gNbm7cBYHWTyZIkNamwDIlbLeY=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=Bj6HiztUGL39yxOUV/zzs8LHw7kcroQFEP5yfrXmmVy+nwmnyZ/+4M9TpiJ3a0W0cY1QmPrtBQWlqpb43catXnn5MPco1XktayKkFMdmuL8F7UAqY1gQ9wU5l3IZaRqyDbKR39rKyGIlQ0qj3Fb5jSel4PlrHIpuO7/9xQSkVZk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=s6Kv27+R; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="s6Kv27+R" Received: by smtp.kernel.org (Postfix) with ESMTPSA id AEEDBC4CED7; Tue, 17 Dec 2024 18:30:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1734460206; bh=6NgFV7+SLJHkzCci3gNbm7cBYHWTyZIkNamwDIlbLeY=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=s6Kv27+RnXSPrl2jQCg0fkmNWjHVSiB/hAypqxFY66hSV7caMFNcFSK9tefnE2j9D 4VbA1/GbM9ZVKy8tWqx699sdcqv9NCX8Pbgz/cDonCQ1O4uhOltf/ng8vOftMg8qLI lmxKzV9+CUUzhx85La7dhEaQgYdecAI5Ogz0N/F6chL7REarIP7YADQY4s+x45YeFH vV5ptlSjGSqA+RYyXOk99DPYju/HT35KkViwL5WI2VvpjJK6uNHYY5ZLa1KLPSUkk7 xMfjlLKxelpJ/H5zZtseSfB/nl/OowQ7AE/vSLMPBeemE+TXvoPmt90w9y20mhLuQh bglrtOGwWgJRA== From: Daniel Wagner Date: Tue, 17 Dec 2024 19:29:43 +0100 Subject: [PATCH v4 9/9] blk-mq: issue warning when offlining hctx with online isolcpus Precedence: bulk X-Mailing-List: linux-scsi@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Message-Id: <20241217-isolcpus-io-queues-v4-9-5d355fbb1e14@kernel.org> References: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org> In-Reply-To: <20241217-isolcpus-io-queues-v4-0-5d355fbb1e14@kernel.org> To: Jens Axboe , Keith Busch , Christoph Hellwig , Sagi Grimberg , Kashyap Desai , Sumit Saxena , Shivasharan S , Chandrakanth patil , "Martin K. Petersen" , Nilesh Javali , GR-QLogic-Storage-Upstream@marvell.com, Don Brace , "Michael S. 
Tsirkin" , Jason Wang , Paolo Bonzini , Stefan Hajnoczi , =?utf-8?q?Eugenio_P=C3=A9rez?= , Xuan Zhuo , Andrew Morton , Thomas Gleixner Cc: Costa Shulyupin , Juri Lelli , Valentin Schneider , Waiman Long , Ming Lei , =?utf-8?q?Michal_Koutn=C3=BD?= , Frederic Weisbecker , Mel Gorman , Hannes Reinecke , Sridhar Balaraman , "brookxu.cn" , linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, linux-nvme@lists.infradead.org, megaraidlinux.pdl@broadcom.com, linux-scsi@vger.kernel.org, storagedev@microchip.com, virtualization@lists.linux.dev, Daniel Wagner X-Mailer: b4 0.14.2 When we offlining a hardware context which also serves isolcpus mapped to it, any IO issued by the isolcpus will stall as there is nothing which handles the interrupts etc. This configuration/setup is not supported at this point thus just issue a warning. Signed-off-by: Daniel Wagner --- block/blk-mq.c | 43 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index de15c0c76f874a2a863b05a23e0f3dba20cb6488..f9af0f5dd6aac8da855777acf2ffc61128f15a74 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -3619,6 +3619,45 @@ static bool blk_mq_hctx_has_requests(struct blk_mq_hw_ctx *hctx) return data.has_rq; } +static void blk_mq_hctx_check_isolcpus_online(struct blk_mq_hw_ctx *hctx, unsigned int cpu) +{ + const struct cpumask *hk_mask; + int i; + + if (!housekeeping_enabled(HK_TYPE_MANAGED_IRQ)) + return; + + hk_mask = housekeeping_cpumask(HK_TYPE_MANAGED_IRQ); + + for (i = 0; i < hctx->nr_ctx; i++) { + struct blk_mq_ctx *ctx = hctx->ctxs[i]; + + if (ctx->cpu == cpu) + continue; + + /* + * Check if this context has at least one online + * housekeeping CPU in this case the hardware context is + * usable. + */ + if (cpumask_test_cpu(ctx->cpu, hk_mask) && + cpu_online(ctx->cpu)) + break; + + /* + * The context doesn't have any online housekeeping CPUs + * but there might be an online isolated CPU mapped to + * it. + */ + if (cpu_is_offline(ctx->cpu)) + continue; + + pr_warn("%s: offlining hctx%d but there is still an online isolcpu CPU %d mapped to it, IO stalls expected\n", + hctx->queue->disk->disk_name, + hctx->queue_num, ctx->cpu); + } +} + static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx, unsigned int this_cpu) { @@ -3638,8 +3677,10 @@ static bool blk_mq_hctx_has_online_cpu(struct blk_mq_hw_ctx *hctx, continue; /* this hctx has at least one online CPU */ - if (this_cpu != cpu) + if (this_cpu != cpu) { + blk_mq_hctx_check_isolcpus_online(hctx, this_cpu); return true; + } } return false;