Message ID | 20220308002747.122682-9-michael.christie@oracle.com |
---|---|
State | Superseded |
Series | misc iscsi patches |
On 2022/3/8 8:27, Mike Christie wrote:
> The patch:
>
> commit 5923d64b7ab6 ("scsi: libiscsi: Drop taskqueuelock")
>
> added an extra task->state because for
>
> commit 6f8830f5bbab ("scsi: libiscsi: add lock around task lists to fix
> list corruption regression")
>
> we didn't know why we ended up with cmds on the list and thought it
> might have been a bad target sending a response while we were still
> sending the cmd. We were never able to get a target to send us a response
> early, because it turns out the bug was just a race in libiscsi/
> libiscsi_tcp where
>
> 1. iscsi_tcp_r2t_rsp queues a r2t to tcp_task->r2tqueue.
> 2. iscsi_tcp_task_xmit runs iscsi_tcp_get_curr_r2t and sees we have a r2t.
> It dequeues it and iscsi_tcp_task_xmit starts to process it.
> 3. iscsi_tcp_r2t_rsp runs iscsi_requeue_task and puts the task on the
> requeue list.
> 4. iscsi_tcp_task_xmit sends the data for r2t. This is the final chunk of
> data, so the cmd is done.
> 5. target sends the response.
> 6. On a different CPU from #3, iscsi_complete_task processes the response.
> Since there was no common lock for the list, the lists/tasks pointers are
> not fully in sync, so could end up with list corruption.
>
> Since it was just a race on our side, this patch removes the extra check
> and fixes up the comments.
>
> Reviewed-by: Lee Duncan <lduncan@suse.com>
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/libiscsi.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 0a0076144874..5c74ab92725f 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -567,16 +567,19 @@ static bool cleanup_queued_task(struct iscsi_task *task)
>  	struct iscsi_conn *conn = task->conn;
>  	bool early_complete = false;
> 
> -	/* Bad target might have completed task while it was still running */
> +	/*
> +	 * We might have raced where we handled a R2T early and got a response
> +	 * but have not yet taken the task off the requeue list, then a TMF or
> +	 * recovery happened and so we can still see it here.
> +	 */
>  	if (task->state == ISCSI_TASK_COMPLETED)
>  		early_complete = true;
> 
>  	if (!list_empty(&task->running)) {
>  		list_del_init(&task->running);
>  		/*
> -		 * If it's on a list but still running, this could be from
> -		 * a bad target sending a rsp early, cleanup from a TMF, or
> -		 * session recovery.
> +		 * If it's on a list but still running this could be cleanup
> +		 * from a TMF or session recovery.
>  		 */
>  		if (task->state == ISCSI_TASK_RUNNING ||
>  		    task->state == ISCSI_TASK_COMPLETED)
> @@ -1484,7 +1487,7 @@ static int iscsi_xmit_task(struct iscsi_conn *conn, struct iscsi_task *task,
>  	}
>  	/* regular RX path uses back_lock */
>  	spin_lock(&conn->session->back_lock);
> -	if (rc && task->state == ISCSI_TASK_RUNNING) {
> +	if (rc) {
>  		/*
>  		 * get an extra ref that is released next time we access it
>  		 * as conn->task above.

Reviewed-by: Wu Bo <wubo40@huawei.com>
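The numbered steps in the quoted commit message pin the bug on the requeue-list add in step 3 racing with the completion-path delete in step 6 on another CPU, with no common lock around the list. As a rough illustration only (a userspace sketch with simplified list helpers and one hand-picked interleaving of the stores; the real list_add_tail()/list_del_init() order their writes differently, and none of the code below is the driver's), here is how the list head and the task can end up disagreeing:

```c
/*
 * Userspace model of the "lists/tasks pointers are not fully in sync"
 * state from the commit message.  Deterministic: the stores from the two
 * CPUs are written out by hand in one possible interleaving.
 */
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

int main(void)
{
	struct list_head requeue_list = { &requeue_list, &requeue_list };
	struct list_head task_running = { &task_running, &task_running };

	/* CPU A (step 3): iscsi_requeue_task() starts adding the task to the
	 * requeue list and gets as far as setting up the node's pointers.   */
	task_running.prev = requeue_list.prev;
	task_running.next = &requeue_list;

	/* CPU B (step 6): the response arrives and the completion path runs
	 * its list_del_init(&task->running) to completion in the middle.    */
	task_running.prev->next = task_running.next;
	task_running.next->prev = task_running.prev;
	task_running.next = &task_running;
	task_running.prev = &task_running;

	/* CPU A resumes and finishes the add using its stale view. */
	requeue_list.prev->next = &task_running;
	requeue_list.prev = &task_running;

	/*
	 * Result: the requeue list head points at the task, but the task's
	 * own pointers say it is on no list.  list_empty(&task->running)
	 * reports true while the head still references the node, and the
	 * next add or delete on either structure corrupts things further.
	 */
	printf("head: next=%p prev=%p\n",
	       (void *)requeue_list.next, (void *)requeue_list.prev);
	printf("task (%p): next=%p prev=%p\n",
	       (void *)&task_running,
	       (void *)task_running.next, (void *)task_running.prev);
	return 0;
}
```

Traversing the requeue list from the head now loops on the task forever and never gets back to the head, which is the list corruption the commit message describes.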
On 3/7/22 16:27, Mike Christie wrote:
> The patch:
>
> commit 5923d64b7ab6 ("scsi: libiscsi: Drop taskqueuelock")
>
> added an extra task->state because for
>
> commit 6f8830f5bbab ("scsi: libiscsi: add lock around task lists to fix
> list corruption regression")
>
> we didn't know why we ended up with cmds on the list and thought it
> might have been a bad target sending a response while we were still
> sending the cmd. We were never able to get a target to send us a response
> early, because it turns out the bug was just a race in libiscsi/
> libiscsi_tcp where
>
> 1. iscsi_tcp_r2t_rsp queues a r2t to tcp_task->r2tqueue.
> 2. iscsi_tcp_task_xmit runs iscsi_tcp_get_curr_r2t and sees we have a r2t.
> It dequeues it and iscsi_tcp_task_xmit starts to process it.
> 3. iscsi_tcp_r2t_rsp runs iscsi_requeue_task and puts the task on the
> requeue list.
> 4. iscsi_tcp_task_xmit sends the data for r2t. This is the final chunk of
> data, so the cmd is done.
> 5. target sends the response.
> 6. On a different CPU from #3, iscsi_complete_task processes the response.
> Since there was no common lock for the list, the lists/tasks pointers are
> not fully in sync, so could end up with list corruption.
>
> Since it was just a race on our side, this patch removes the extra check
> and fixes up the comments.
>
> Reviewed-by: Lee Duncan <lduncan@suse.com>
> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> ---
>  drivers/scsi/libiscsi.c | 13 ++++++++-----
>  1 file changed, 8 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
> index 0a0076144874..5c74ab92725f 100644
> --- a/drivers/scsi/libiscsi.c
> +++ b/drivers/scsi/libiscsi.c
> @@ -567,16 +567,19 @@ static bool cleanup_queued_task(struct iscsi_task *task)
>  	struct iscsi_conn *conn = task->conn;
>  	bool early_complete = false;
> 
> -	/* Bad target might have completed task while it was still running */
> +	/*
> +	 * We might have raced where we handled a R2T early and got a response
> +	 * but have not yet taken the task off the requeue list, then a TMF or
> +	 * recovery happened and so we can still see it here.
> +	 */
>  	if (task->state == ISCSI_TASK_COMPLETED)
>  		early_complete = true;
> 
>  	if (!list_empty(&task->running)) {
>  		list_del_init(&task->running);
>  		/*
> -		 * If it's on a list but still running, this could be from
> -		 * a bad target sending a rsp early, cleanup from a TMF, or
> -		 * session recovery.
> +		 * If it's on a list but still running this could be cleanup
> +		 * from a TMF or session recovery.
>  		 */
>  		if (task->state == ISCSI_TASK_RUNNING ||
>  		    task->state == ISCSI_TASK_COMPLETED)
> @@ -1484,7 +1487,7 @@ static int iscsi_xmit_task(struct iscsi_conn *conn, struct iscsi_task *task,
>  	}
>  	/* regular RX path uses back_lock */
>  	spin_lock(&conn->session->back_lock);
> -	if (rc && task->state == ISCSI_TASK_RUNNING) {
> +	if (rc) {
>  		/*
>  		 * get an extra ref that is released next time we access it
>  		 * as conn->task above.

Reviewed-by: Lee Duncan <lduncan@suse.com>
diff --git a/drivers/scsi/libiscsi.c b/drivers/scsi/libiscsi.c
index 0a0076144874..5c74ab92725f 100644
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -567,16 +567,19 @@ static bool cleanup_queued_task(struct iscsi_task *task)
 	struct iscsi_conn *conn = task->conn;
 	bool early_complete = false;
 
-	/* Bad target might have completed task while it was still running */
+	/*
+	 * We might have raced where we handled a R2T early and got a response
+	 * but have not yet taken the task off the requeue list, then a TMF or
+	 * recovery happened and so we can still see it here.
+	 */
 	if (task->state == ISCSI_TASK_COMPLETED)
 		early_complete = true;
 
 	if (!list_empty(&task->running)) {
 		list_del_init(&task->running);
 		/*
-		 * If it's on a list but still running, this could be from
-		 * a bad target sending a rsp early, cleanup from a TMF, or
-		 * session recovery.
+		 * If it's on a list but still running this could be cleanup
+		 * from a TMF or session recovery.
 		 */
 		if (task->state == ISCSI_TASK_RUNNING ||
 		    task->state == ISCSI_TASK_COMPLETED)
@@ -1484,7 +1487,7 @@ static int iscsi_xmit_task(struct iscsi_conn *conn, struct iscsi_task *task,
 	}
 	/* regular RX path uses back_lock */
 	spin_lock(&conn->session->back_lock);
-	if (rc && task->state == ISCSI_TASK_RUNNING) {
+	if (rc) {
 		/*
 		 * get an extra ref that is released next time we access it
 		 * as conn->task above.
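For contrast with the unlocked interleaving sketched earlier, here is a minimal sketch of the same two list operations when both run under one lock. The pthread mutex stands in for the single per-session lock that serializes libiscsi's task-list updates; the helper and thread names are illustrative, not the driver's:

```c
/* Build with: cc -pthread locked_sketch.c */
#include <pthread.h>
#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

static void list_add_tail(struct list_head *node, struct list_head *head)
{
	node->prev = head->prev;
	node->next = head;
	head->prev->next = node;
	head->prev = node;
}

static void list_del_init(struct list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
	entry->next = entry;
	entry->prev = entry;
}

static struct list_head requeue_list = { &requeue_list, &requeue_list };
static struct list_head task_running = { &task_running, &task_running };
static pthread_mutex_t session_lock = PTHREAD_MUTEX_INITIALIZER;

/* Models the R2T path requeueing the task (step 3 above). */
static void *requeue_side(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&session_lock);
	list_add_tail(&task_running, &requeue_list);
	pthread_mutex_unlock(&session_lock);
	return NULL;
}

/* Models the completion path taking the task back off whatever list it
 * is on (list_del_init() on an already-detached node is a no-op).     */
static void *completion_side(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&session_lock);
	list_del_init(&task_running);
	pthread_mutex_unlock(&session_lock);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	pthread_create(&t1, NULL, requeue_side, NULL);
	pthread_create(&t2, NULL, completion_side, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	/*
	 * Whichever thread wins the lock, each operation sees a consistent
	 * list: the task ends up either cleanly on the requeue list (head
	 * and node point at each other) or cleanly off it (both self-looped).
	 * The mixed state from the unlocked sketch cannot occur.
	 */
	printf("head: next=%p prev=%p  task (%p): next=%p prev=%p\n",
	       (void *)requeue_list.next, (void *)requeue_list.prev,
	       (void *)&task_running,
	       (void *)task_running.next, (void *)task_running.prev);
	return 0;
}
```

With the list manipulation serialized like this, a completed task can still be found on a list only through the TMF/recovery window that the patch's new comment in cleanup_queued_task() describes, not because a target completed it early.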