From patchwork Tue Sep 22 16:03:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Hajnoczi X-Patchwork-Id: 273082 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, MIME_BASE64_TEXT,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7DC0EC4363D for ; Tue, 22 Sep 2020 16:09:19 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id B0DC122206 for ; Tue, 22 Sep 2020 16:09:18 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="SYfUkXh5" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org B0DC122206 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:58706 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kKkqy-0008EG-EJ for qemu-devel@archiver.kernel.org; Tue, 22 Sep 2020 12:09:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:46846) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kKkmZ-0003Fs-Qr for qemu-devel@nongnu.org; Tue, 22 Sep 2020 12:04:43 -0400 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:37438) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1kKkmW-0005ef-Cp for qemu-devel@nongnu.org; Tue, 22 Sep 2020 12:04:43 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1600790678; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=A4Pdyu/nzfDZOJq6MEf/FsawlBQezQlxsaBdgCLFLV0=; b=SYfUkXh5MOl+KWtjY3ha/gKhnAlV0AFGU+/BPYpZfsH9GlS/TihCT7BBby3jRLo2ncF/Xp BfhLyFPn3WdjfOnVxp2uBJQnq9nSXfagQ9+ZXkCj1RjQSx5Vvbp0OlhmQkt0mUo1aVNcen 5Gbe7OB+NUGs/LTWrwaROeRO2NPQrzw= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-320-A4qGfIS5OQyxepfL130WGg-1; Tue, 22 Sep 2020 12:04:35 -0400 X-MC-Unique: A4qGfIS5OQyxepfL130WGg-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id D883380046A; Tue, 22 Sep 2020 16:04:34 +0000 (UTC) Received: from localhost (ovpn-115-55.ams2.redhat.com [10.36.115.55]) by smtp.corp.redhat.com (Postfix) with ESMTP id 522E95C1A3; Tue, 22 Sep 2020 16:04:34 +0000 (UTC) From: Stefan Hajnoczi To: qemu-devel@nongnu.org Subject: [PATCH 09/11] util/vhost-user-server: rework vu_client_trip() coroutine lifecycle Date: Tue, 22 Sep 2020 17:03:59 +0100 Message-Id: <20200922160401.294055-10-stefanha@redhat.com> In-Reply-To: <20200922160401.294055-1-stefanha@redhat.com> References: <20200922160401.294055-1-stefanha@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=stefanha@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=63.128.21.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/22 02:07:04 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -17 X-Spam_score: -1.8 X-Spam_bar: - X-Spam_report: (-1.8 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-1.455, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MIME_BASE64_TEXT=1.741, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Kevin Wolf , qemu-block@nongnu.org, Markus Armbruster , Coiby Xu , mreitz@redhat.com, Stefan Hajnoczi Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" The vu_client_trip() coroutine is leaked during AioContext switching. It is also unsafe to destroy the vu_dev in panic_cb() since its callers still access it in some cases. Rework the lifecycle to solve these safety issues. Signed-off-by: Stefan Hajnoczi --- util/vhost-user-server.h | 29 ++-- block/export/vhost-user-blk-server.c | 9 +- util/vhost-user-server.c | 245 +++++++++++++++------------ 3 files changed, 155 insertions(+), 128 deletions(-) diff --git a/util/vhost-user-server.h b/util/vhost-user-server.h index 92177fc911..0da4c2cc4c 100644 --- a/util/vhost-user-server.h +++ b/util/vhost-user-server.h @@ -19,34 +19,36 @@ #include "qapi/error.h" #include "standard-headers/linux/virtio_blk.h" +/* A kick fd that we monitor on behalf of libvhost-user */ typedef struct VuFdWatch { VuDev *vu_dev; int fd; /*kick fd*/ void *pvt; vu_watch_cb cb; - bool processing; QTAILQ_ENTRY(VuFdWatch) next; } VuFdWatch; -typedef struct VuServer VuServer; - -struct VuServer { +/** + * VuServer: + * A vhost-user server instance with user-defined VuDevIface callbacks. + * Vhost-user device backends can be implemented using VuServer. VuDevIface + * callbacks and virtqueue kicks run in the given AioContext. + */ +typedef struct { QIONetListener *listener; + QEMUBH *restart_listener_bh; AioContext *ctx; int max_queues; const VuDevIface *vu_iface; + + /* Protected by ctx lock */ VuDev vu_dev; QIOChannel *ioc; /* The I/O channel with the client */ QIOChannelSocket *sioc; /* The underlying data channel with the client */ - /* IOChannel for fd provided via VHOST_USER_SET_SLAVE_REQ_FD */ - QIOChannel *ioc_slave; - QIOChannelSocket *sioc_slave; - Coroutine *co_trip; /* coroutine for processing VhostUserMsg */ QTAILQ_HEAD(, VuFdWatch) vu_fd_watches; - /* restart coroutine co_trip if AIOContext is changed */ - bool aio_context_changed; - bool processing_msg; -}; + + Coroutine *co_trip; /* coroutine for processing VhostUserMsg */ +} VuServer; bool vhost_user_server_start(VuServer *server, SocketAddress *unix_socket, @@ -57,6 +59,7 @@ bool vhost_user_server_start(VuServer *server, void vhost_user_server_stop(VuServer *server); -void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx); +void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx); +void vhost_user_server_detach_aio_context(VuServer *server); #endif /* VHOST_USER_SERVER_H */ diff --git a/block/export/vhost-user-blk-server.c b/block/export/vhost-user-blk-server.c index f2bfddbf26..b609a3e4d6 100644 --- a/block/export/vhost-user-blk-server.c +++ b/block/export/vhost-user-blk-server.c @@ -313,18 +313,13 @@ static const VuDevIface vu_block_iface = { static void blk_aio_attached(AioContext *ctx, void *opaque) { VuBlockDev *vub_dev = opaque; - aio_context_acquire(ctx); - vhost_user_server_set_aio_context(&vub_dev->vu_server, ctx); - aio_context_release(ctx); + vhost_user_server_attach_aio_context(&vub_dev->vu_server, ctx); } static void blk_aio_detach(void *opaque) { VuBlockDev *vub_dev = opaque; - AioContext *ctx = vub_dev->vu_server.ctx; - aio_context_acquire(ctx); - vhost_user_server_set_aio_context(&vub_dev->vu_server, NULL); - aio_context_release(ctx); + vhost_user_server_detach_aio_context(&vub_dev->vu_server); } static void diff --git a/util/vhost-user-server.c b/util/vhost-user-server.c index ec555abcb2..d8b8c08b5f 100644 --- a/util/vhost-user-server.c +++ b/util/vhost-user-server.c @@ -9,8 +9,50 @@ */ #include "qemu/osdep.h" #include "qemu/main-loop.h" +#include "block/aio-wait.h" #include "vhost-user-server.h" +/* + * Theory of operation: + * + * VuServer is started and stopped by vhost_user_server_start() and + * vhost_user_server_stop() from the main loop thread. Starting the server + * opens a vhost-user UNIX domain socket and listens for incoming connections. + * Only one connection is allowed at a time. + * + * The connection is handled by the vu_client_trip() coroutine in the + * VuServer->ctx AioContext. The coroutine consists of a vu_dispatch() loop + * where libvhost-user calls vu_message_read() to receive the next vhost-user + * protocol messages over the UNIX domain socket. + * + * When virtqueues are set up libvhost-user calls set_watch() to monitor kick + * fds. These fds are also handled in the VuServer->ctx AioContext. + * + * Both vu_client_trip() and kick fd monitoring can be stopped by shutting down + * the socket connection. Shutting down the socket connection causes + * vu_message_read() to fail since no more data can be received from the socket. + * After vu_dispatch() fails, vu_client_trip() calls vu_deinit() to stop + * libvhost-user before terminating the coroutine. vu_deinit() calls + * remove_watch() to stop monitoring kick fds and this stops virtqueue + * processing. + * + * When vu_client_trip() has finished cleaning up it schedules a BH in the main + * loop thread to accept the next client connection. + * + * When libvhost-user detects an error it calls panic_cb() and sets the + * dev->broken flag. Both vu_client_trip() and kick fd processing stop when + * the dev->broken flag is set. + * + * It is possible to switch AioContexts using + * vhost_user_server_detach_aio_context() and + * vhost_user_server_attach_aio_context(). They stop monitoring fds in the old + * AioContext and resume monitoring in the new AioContext. The vu_client_trip() + * coroutine remains in a yielded state during the switch. This is made + * possible by QIOChannel's support for spurious coroutine re-entry in + * qio_channel_yield(). The coroutine will restart I/O when re-entered from the + * new AioContext. + */ + static void vmsg_close_fds(VhostUserMsg *vmsg) { int i; @@ -27,68 +69,9 @@ static void vmsg_unblock_fds(VhostUserMsg *vmsg) } } -static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc, - gpointer opaque); - -static void close_client(VuServer *server) -{ - /* - * Before closing the client - * - * 1. Let vu_client_trip stop processing new vhost-user msg - * - * 2. remove kick_handler - * - * 3. wait for the kick handler to be finished - * - * 4. wait for the current vhost-user msg to be finished processing - */ - - QIOChannelSocket *sioc = server->sioc; - /* When this is set vu_client_trip will stop new processing vhost-user message */ - server->sioc = NULL; - - while (server->processing_msg) { - if (server->ioc->read_coroutine) { - server->ioc->read_coroutine = NULL; - qio_channel_set_aio_fd_handler(server->ioc, server->ioc->ctx, NULL, - NULL, server->ioc); - server->processing_msg = false; - } - } - - vu_deinit(&server->vu_dev); - - /* vu_deinit() should have called remove_watch() */ - assert(QTAILQ_EMPTY(&server->vu_fd_watches)); - - object_unref(OBJECT(sioc)); - object_unref(OBJECT(server->ioc)); -} - static void panic_cb(VuDev *vu_dev, const char *buf) { - VuServer *server = container_of(vu_dev, VuServer, vu_dev); - - /* avoid while loop in close_client */ - server->processing_msg = false; - - if (buf) { - error_report("vu_panic: %s", buf); - } - - if (server->sioc) { - close_client(server); - } - - /* - * Set the callback function for network listener so another - * vhost-user client can connect to this server - */ - qio_net_listener_set_client_func(server->listener, - vu_accept, - server, - NULL); + error_report("vu_panic: %s", buf); } static bool coroutine_fn @@ -185,28 +168,31 @@ fail: return false; } - -static void vu_client_start(VuServer *server); static coroutine_fn void vu_client_trip(void *opaque) { VuServer *server = opaque; + VuDev *vu_dev = &server->vu_dev; - while (!server->aio_context_changed && server->sioc) { - server->processing_msg = true; - vu_dispatch(&server->vu_dev); - server->processing_msg = false; + while (!vu_dev->broken && vu_dispatch(vu_dev)) { + /* Keep running */ } - if (server->aio_context_changed && server->sioc) { - server->aio_context_changed = false; - vu_client_start(server); - } -} + vu_deinit(vu_dev); + + /* vu_deinit() should have called remove_watch() */ + assert(QTAILQ_EMPTY(&server->vu_fd_watches)); + + object_unref(OBJECT(server->sioc)); + server->sioc = NULL; -static void vu_client_start(VuServer *server) -{ - server->co_trip = qemu_coroutine_create(vu_client_trip, server); - aio_co_enter(server->ctx, server->co_trip); + object_unref(OBJECT(server->ioc)); + server->ioc = NULL; + + server->co_trip = NULL; + if (server->restart_listener_bh) { + qemu_bh_schedule(server->restart_listener_bh); + } + aio_wait_kick(); } /* @@ -219,12 +205,18 @@ static void vu_client_start(VuServer *server) static void kick_handler(void *opaque) { VuFdWatch *vu_fd_watch = opaque; - vu_fd_watch->processing = true; - vu_fd_watch->cb(vu_fd_watch->vu_dev, 0, vu_fd_watch->pvt); - vu_fd_watch->processing = false; + VuDev *vu_dev = vu_fd_watch->vu_dev; + + vu_fd_watch->cb(vu_dev, 0, vu_fd_watch->pvt); + + /* Stop vu_client_trip() if an error occurred in vu_fd_watch->cb() */ + if (vu_dev->broken) { + VuServer *server = container_of(vu_dev, VuServer, vu_dev); + + qio_channel_shutdown(server->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); + } } - static VuFdWatch *find_vu_fd_watch(VuServer *server, int fd) { @@ -319,62 +311,95 @@ static void vu_accept(QIONetListener *listener, QIOChannelSocket *sioc, qio_channel_set_name(QIO_CHANNEL(sioc), "vhost-user client"); server->ioc = QIO_CHANNEL(sioc); object_ref(OBJECT(server->ioc)); - qio_channel_attach_aio_context(server->ioc, server->ctx); + + /* TODO vu_message_write() spins if non-blocking! */ qio_channel_set_blocking(server->ioc, false, NULL); - vu_client_start(server); + + server->co_trip = qemu_coroutine_create(vu_client_trip, server); + + aio_context_acquire(server->ctx); + vhost_user_server_attach_aio_context(server, server->ctx); + aio_context_release(server->ctx); } - void vhost_user_server_stop(VuServer *server) { + aio_context_acquire(server->ctx); + + qemu_bh_delete(server->restart_listener_bh); + server->restart_listener_bh = NULL; + if (server->sioc) { - close_client(server); + VuFdWatch *vu_fd_watch; + + QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) { + aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true, + NULL, NULL, NULL, vu_fd_watch); + } + + qio_channel_shutdown(server->ioc, QIO_CHANNEL_SHUTDOWN_BOTH, NULL); + + AIO_WAIT_WHILE(server->ctx, server->co_trip); } + aio_context_release(server->ctx); + if (server->listener) { qio_net_listener_disconnect(server->listener); object_unref(OBJECT(server->listener)); } +} + +/* + * Allow the next client to connect to the server. Called from a BH in the main + * loop. + */ +static void restart_listener_bh(void *opaque) +{ + VuServer *server = opaque; + qio_net_listener_set_client_func(server->listener, vu_accept, server, + NULL); } -void vhost_user_server_set_aio_context(VuServer *server, AioContext *ctx) +/* Called with ctx acquired */ +void vhost_user_server_attach_aio_context(VuServer *server, AioContext *ctx) { - VuFdWatch *vu_fd_watch, *next; - void *opaque = NULL; - IOHandler *io_read = NULL; - bool attach; + VuFdWatch *vu_fd_watch; - server->ctx = ctx ? ctx : qemu_get_aio_context(); + server->ctx = ctx; if (!server->sioc) { - /* not yet serving any client*/ return; } - if (ctx) { - qio_channel_attach_aio_context(server->ioc, ctx); - server->aio_context_changed = true; - io_read = kick_handler; - attach = true; - } else { + qio_channel_attach_aio_context(server->ioc, ctx); + + QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) { + aio_set_fd_handler(ctx, vu_fd_watch->fd, true, kick_handler, NULL, + NULL, vu_fd_watch); + } + + aio_co_schedule(ctx, server->co_trip); +} + +/* Called with server->ctx acquired */ +void vhost_user_server_detach_aio_context(VuServer *server) +{ + if (server->sioc) { + VuFdWatch *vu_fd_watch; + + QTAILQ_FOREACH(vu_fd_watch, &server->vu_fd_watches, next) { + aio_set_fd_handler(server->ctx, vu_fd_watch->fd, true, + NULL, NULL, NULL, vu_fd_watch); + } + qio_channel_detach_aio_context(server->ioc); - /* server->ioc->ctx keeps the old AioConext */ - ctx = server->ioc->ctx; - attach = false; } - QTAILQ_FOREACH_SAFE(vu_fd_watch, &server->vu_fd_watches, next, next) { - if (vu_fd_watch->cb) { - opaque = attach ? vu_fd_watch : NULL; - aio_set_fd_handler(ctx, vu_fd_watch->fd, true, - io_read, NULL, NULL, - opaque); - } - } + server->ctx = NULL; } - bool vhost_user_server_start(VuServer *server, SocketAddress *socket_addr, AioContext *ctx, @@ -382,6 +407,7 @@ bool vhost_user_server_start(VuServer *server, const VuDevIface *vu_iface, Error **errp) { + QEMUBH *bh; QIONetListener *listener = qio_net_listener_new(); if (qio_net_listener_open_sync(listener, socket_addr, 1, errp) < 0) { @@ -389,9 +415,12 @@ bool vhost_user_server_start(VuServer *server, return false; } + bh = qemu_bh_new(restart_listener_bh, server); + /* zero out unspecified fields */ *server = (VuServer) { .listener = listener, + .restart_listener_bh = bh, .vu_iface = vu_iface, .max_queues = max_queues, .ctx = ctx,