From patchwork Thu Sep 3 15:26:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Peter Xu X-Patchwork-Id: 274666 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED, HEADER_FROM_DIFFERENT_DOMAINS, INCLUDES_PATCH, MAILING_LIST_MULTI, SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D3CEC433E9 for ; Thu, 3 Sep 2020 15:29:54 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D38D12072A for ; Thu, 3 Sep 2020 15:29:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="RJSOZMOi" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D38D12072A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:41540 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1kDrBR-0004OD-1Y for qemu-devel@archiver.kernel.org; Thu, 03 Sep 2020 11:29:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:50844) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1kDr8e-0000IX-87 for qemu-devel@nongnu.org; Thu, 03 Sep 2020 11:27:00 -0400 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:58885) by eggs.gnu.org with esmtps 
(TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1kDr8c-0000E6-FB for qemu-devel@nongnu.org; Thu, 03 Sep 2020 11:26:59 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1599146817; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=yNN9HMxErQBgUM+loGmpJTxuhPYQQVhkbsp50pLpF+c=; b=RJSOZMOirC+FFmRFAsu7Z7Alfq/Z2+XKK9+IS7XodMvyWgxZdQMu5VS9FVkEUQUVb5jUA8 Knz9fla0PwX9J+5fmPsFAn2fmxJd106Mz93X0Yyrpf/HGUnCJncMp+LY+K46PooRlRgdnu aSjKv9ZCC7mnjetc42nrcS8ksHGg/YY= Received: from mail-qt1-f199.google.com (mail-qt1-f199.google.com [209.85.160.199]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-525-SA9GxHsdNJ2BjR_p84eRYg-1; Thu, 03 Sep 2020 11:26:55 -0400 X-MC-Unique: SA9GxHsdNJ2BjR_p84eRYg-1 Received: by mail-qt1-f199.google.com with SMTP id j35so2350750qtk.14 for ; Thu, 03 Sep 2020 08:26:55 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=yNN9HMxErQBgUM+loGmpJTxuhPYQQVhkbsp50pLpF+c=; b=F1UhqyLgotfvZTUvcE9+l6Ee9X4msTS/ts/1VmeK8DUqXVLqQ1+C1ylht5yzjYML72 ukyY5m3DZ/RNBjad/9w2bRydOygtLbaARVUF+v1HJaNvmiHGmuo9nwiao1Hz2uxxNUNC WZ+6+aKssDS6Jz+QyOGnX1QjS19G1aySPR6VoLbncyblv3ytwDapwK7yLZ1PChWeb+yL RQzDDxBWPeRRoGwU6ple3xyvChtQrebs5uE/qp0tIHnlEYVmkLpQ4PSxagCaeI3byP8W ZIjI+Pyk7U9NsfWAd/utprND16IpU3oP7nbJzvlewYycK7YpSETLkuRIwEwXHOpuQqo2 9OTA== X-Gm-Message-State: AOAM5300O3nQFDdR9ks6fcNMaSD4zpT0NcH/uzjg13gq1yfb3rJDP1R2 WDN3Py+g55b6SPGg1ds9vbELGfLeb+/ujPUcmUTAS7CBhLUJnKLLKUWu61S9Nz0wPdQZnV11JwR Xi//Yfh8hgJpMI14= X-Received: by 2002:a37:68c7:: with SMTP id d190mr3553533qkc.127.1599146813432; Thu, 03 Sep 2020 08:26:53 -0700 (PDT) X-Google-Smtp-Source: 
ABdhPJzhbtQYN6UmM0Ksh7FFhjMatqeLXu86Q7riw5gzhopdLkqndQyT0hc81xQA1N+MNil12P3LoA== X-Received: by 2002:a37:68c7:: with SMTP id d190mr3553514qkc.127.1599146813196; Thu, 03 Sep 2020 08:26:53 -0700 (PDT) Received: from xz-x1.redhat.com (bras-vprn-toroon474qw-lp130-11-70-53-122-15.dsl.bell.ca. [70.53.122.15]) by smtp.gmail.com with ESMTPSA id l38sm2319889qtl.58.2020.09.03.08.26.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Sep 2020 08:26:52 -0700 (PDT) From: Peter Xu To: qemu-devel@nongnu.org Subject: [PATCH 3/5] migration: Pass incoming state into qemu_ufd_copy_ioctl() Date: Thu, 3 Sep 2020 11:26:44 -0400 Message-Id: <20200903152646.93336-4-peterx@redhat.com> X-Mailer: git-send-email 2.26.2 In-Reply-To: <20200903152646.93336-1-peterx@redhat.com> References: <20200903152646.93336-1-peterx@redhat.com> MIME-Version: 1.0 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=peterx@redhat.com X-Mimecast-Spam-Score: 0.003 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=peterx@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-detected-operating-system: by eggs.gnu.org: First seen = 2020/09/03 01:47:17 X-ACL-Warn: Detected OS = Linux 2.2.x-3.x [generic] [fuzzy] X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Xiaohui Li , "Dr . 
David Alan Gilbert" , peterx@redhat.com, Juan Quintela Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" It'll be used in follow up patches to access more fields out of it. Meanwhile fetch the userfaultfd inside the function. Signed-off-by: Peter Xu Reviewed-by: Dr. David Alan Gilbert --- migration/postcopy-ram.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index 11a70441a6..d333c3fd0e 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -1128,10 +1128,12 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis) return 0; } -static int qemu_ufd_copy_ioctl(int userfault_fd, void *host_addr, +static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr, void *from_addr, uint64_t pagesize, RAMBlock *rb) { + int userfault_fd = mis->userfault_fd; int ret; + if (from_addr) { struct uffdio_copy copy_struct; copy_struct.dst = (uint64_t)(uintptr_t)host_addr; @@ -1185,7 +1187,7 @@ int postcopy_place_page(MigrationIncomingState *mis, void *host, void *from, * which would be slightly cheaper, but we'd have to be careful * of the order of updating our page state. */ - if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, from, pagesize, rb)) { + if (qemu_ufd_copy_ioctl(mis, host, from, pagesize, rb)) { int e = errno; error_report("%s: %s copy host: %p from: %p (size: %zd)", __func__, strerror(e), host, from, pagesize); @@ -1212,7 +1214,7 @@ int postcopy_place_page_zero(MigrationIncomingState *mis, void *host, * but it's not available for everything (e.g. 
hugetlbpages) */ if (qemu_ram_is_uf_zeroable(rb)) { - if (qemu_ufd_copy_ioctl(mis->userfault_fd, host, NULL, pagesize, rb)) { + if (qemu_ufd_copy_ioctl(mis, host, NULL, pagesize, rb)) { int e = errno; error_report("%s: %s zero host: %p", __func__, strerror(e), host);

From patchwork Thu Sep 3 15:26:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 274664
From: Peter Xu
To: qemu-devel@nongnu.org
Subject: [PATCH 4/5] migration: Maintain postcopy faulted addresses
Date: Thu, 3 Sep 2020 11:26:45 -0400
Message-Id: <20200903152646.93336-5-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20200903152646.93336-1-peterx@redhat.com>
References: <20200903152646.93336-1-peterx@redhat.com>
Cc: Xiaohui Li , "Dr . David Alan Gilbert" , peterx@redhat.com, Juan Quintela

Maintain a list of faulted addresses on the destination host that we're still waiting on. This is implemented using a GTree rather than a real list so that, even when plenty of vCPUs/threads are faulting, the lookup stays fast at O(log(N)) (we do that lookup after placing each page). It should bring a slight overhead, but ideally that shouldn't be a big problem simply because in most cases the requested page list will be short.

Actually we did similar things for postcopy blocktime measurements. This patch doesn't reuse that simply because:

(1) blocktime measurement covers vcpu threads only, but here we need to record all faulted addresses, including those from the main thread and external threads (e.g., DPDK via vhost-user).

(2) blocktime measurement requires UFFD_FEATURE_THREAD_ID, but here we don't want to add that extra dependency on the kernel version since it's not necessary. E.g., we don't need to know which thread faulted on which page, and we also don't care about multiple threads faulting on the same page. We only care about which addresses have faulted and are therefore waiting for a page to be copied from the source.

(3) blocktime measurement is not enabled by default. However we need this by default, especially for postcopy recovery.

Another thing to mention is that this patch introduces a new mutex to serialize the receivedmap and the page_requested tree; however, that serialization does not cover other procedures like UFFDIO_COPY.
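For readers following the description above, here is a minimal standalone sketch (not part of the patch) of the two small pieces of arithmetic the tree keys rely on: a three-way comparison of 64-bit addresses, which avoids the wrong sign that a plain subtraction could produce once narrowed to int, and rounding a faulting address down to the start of its page so that every fault inside one page maps to the same key. The names addr_cmp and page_align_down are hypothetical, and a power-of-two page size is assumed.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Three-way comparison of two addresses: returns -1, 0 or 1.
 * Each relational test yields 0 or 1, so the difference cannot overflow.
 */
static int addr_cmp(uint64_t a, uint64_t b)
{
    return (a > b) - (a < b);
}

/*
 * Round addr down to the start of its page.
 * Assumption: pagesize is a non-zero power of two.
 */
static uint64_t page_align_down(uint64_t addr, uint64_t pagesize)
{
    assert(pagesize != 0 && (pagesize & (pagesize - 1)) == 0);
    return addr & ~(pagesize - 1);
}
```

With 4 KiB pages, page_align_down(0x12345, 0x1000) yields 0x12000, so two faults at 0x12345 and 0x12ff0 would share one key.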
Signed-off-by: Peter Xu --- migration/migration.c | 41 +++++++++++++++++++++++++++++++++++++++- migration/migration.h | 19 ++++++++++++++++++- migration/postcopy-ram.c | 18 +++++++++++++++--- migration/trace-events | 2 ++ 4 files changed, 75 insertions(+), 5 deletions(-) diff --git a/migration/migration.c b/migration/migration.c index 6b43ffddbd..e943d96c1b 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -143,6 +143,13 @@ static int migration_maybe_pause(MigrationState *s, int new_state); static void migrate_fd_cancel(MigrationState *s); +static gint page_request_addr_cmp(gconstpointer ap, gconstpointer bp) +{ + uint64_t a = (uint64_t) ap, b = (uint64_t) bp; + + return (a > b) - (a < b); +} + void migration_object_init(void) { MachineState *ms = MACHINE(qdev_get_machine()); @@ -165,6 +172,8 @@ void migration_object_init(void) qemu_event_init(&current_incoming->main_thread_load_event, false); qemu_sem_init(&current_incoming->postcopy_pause_sem_dst, 0); qemu_sem_init(&current_incoming->postcopy_pause_sem_fault, 0); + qemu_mutex_init(&current_incoming->page_request_mutex); + current_incoming->page_requested = g_tree_new(page_request_addr_cmp); if (!migration_object_check(current_migration, &err)) { error_report_err(err); @@ -238,6 +247,11 @@ void migration_incoming_state_destroy(void) mis->postcopy_remote_fds = NULL; } + if (mis->page_requested) { + g_tree_destroy(mis->page_requested); + mis->page_requested = NULL; + } + qemu_event_reset(&mis->main_thread_load_event); if (mis->socket_address_list) { @@ -354,8 +368,33 @@ int migrate_send_rp_message_req_pages(MigrationIncomingState *mis, } int migrate_send_rp_req_pages(MigrationIncomingState *mis, - RAMBlock *rb, ram_addr_t start) + RAMBlock *rb, ram_addr_t start, uint64_t haddr) { + uint64_t aligned = haddr & (-qemu_target_page_size()); + bool received; + + qemu_mutex_lock(&mis->page_request_mutex); + received = ramblock_recv_bitmap_test_byte_offset(rb, start); + if (!received && !g_tree_lookup(mis->page_requested,
(gpointer) aligned)) { + /* + * The page has not been received, and it's not yet in the page request + * list. Queue it. Set the value of element to 1, so that things like + * g_tree_lookup() will return TRUE (1) when found. + */ + g_tree_insert(mis->page_requested, (gpointer) aligned, (gpointer) 1); + mis->page_requested_count++; + trace_postcopy_page_req_add(aligned, mis->page_requested_count); + } + qemu_mutex_unlock(&mis->page_request_mutex); + + /* + * If the page is there, skip sending the message. We don't even need the + * lock because as long as the page arrived, it'll be there forever. + */ + if (received) { + return 0; + } + return migrate_send_rp_message_req_pages(mis, rb, start); } diff --git a/migration/migration.h b/migration/migration.h index f552725305..81311dc154 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -103,6 +103,23 @@ struct MigrationIncomingState { /* List of listening socket addresses */ SocketAddressList *socket_address_list; + + /* A tree of pages that we requested to the source VM */ + GTree *page_requested; + /* For debugging purpose only, but would be nice to keep */ + int page_requested_count; + /* + * The mutex helps to maintain the requested pages that we sent to the + * source, IOW, to guarantee coherent between the page_requests tree and + * the per-ramblock receivedmap. Note! This does not guarantee consistency + * of the real page copy procedures (using UFFDIO_[ZERO]COPY). E.g., even + * if one bit in receivedmap is cleared, UFFDIO_COPY could have happened + * for that page already. This is intended so that the mutex won't + * serialize and blocked by slow operations like UFFDIO_* ioctls. However + * this should be enough to make sure the page_requested tree always + * contains valid information. 
+ */ + QemuMutex page_request_mutex; }; MigrationIncomingState *migration_incoming_get_current(void); @@ -329,7 +346,7 @@ void migrate_send_rp_shut(MigrationIncomingState *mis, void migrate_send_rp_pong(MigrationIncomingState *mis, uint32_t value); int migrate_send_rp_req_pages(MigrationIncomingState *mis, RAMBlock *rb, - ram_addr_t start); + ram_addr_t start, uint64_t haddr); int migrate_send_rp_message_req_pages(MigrationIncomingState *mis, RAMBlock *rb, ram_addr_t start); void migrate_send_rp_recv_bitmap(MigrationIncomingState *mis, diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c index d333c3fd0e..a30627e838 100644 --- a/migration/postcopy-ram.c +++ b/migration/postcopy-ram.c @@ -684,7 +684,7 @@ int postcopy_request_shared_page(struct PostCopyFD *pcfd, RAMBlock *rb, qemu_ram_get_idstr(rb), rb_offset); return postcopy_wake_shared(pcfd, client_addr, rb); } - migrate_send_rp_req_pages(mis, rb, aligned_rbo); + migrate_send_rp_req_pages(mis, rb, aligned_rbo, client_addr); return 0; } @@ -979,7 +979,8 @@ retry: * Send the request to the source - we want to request one * of our host page sizes (which is >= TPS) */ - ret = migrate_send_rp_req_pages(mis, rb, rb_offset); + ret = migrate_send_rp_req_pages(mis, rb, rb_offset, + msg.arg.pagefault.address); if (ret) { /* May be network failure, try to wait for recovery */ if (ret == -EIO && postcopy_pause_fault_thread(mis)) { @@ -1149,10 +1150,21 @@ static int qemu_ufd_copy_ioctl(MigrationIncomingState *mis, void *host_addr, ret = ioctl(userfault_fd, UFFDIO_ZEROPAGE, &zero_struct); } if (!ret) { + qemu_mutex_lock(&mis->page_request_mutex); ramblock_recv_bitmap_set_range(rb, host_addr, pagesize / qemu_target_page_size()); + /* + * If this page resolves a page fault for a previous recorded faulted + * address, take a special note to maintain the requested page list. 
+ */ + if (g_tree_lookup(mis->page_requested, (gconstpointer)host_addr)) { + g_tree_remove(mis->page_requested, (gconstpointer)host_addr); + mis->page_requested_count--; + trace_postcopy_page_req_del((uint64_t)host_addr, + mis->page_requested_count); + } + qemu_mutex_unlock(&mis->page_request_mutex); mark_postcopy_blocktime_end((uintptr_t)host_addr); - } return ret; } diff --git a/migration/trace-events b/migration/trace-events index 4ab0a503d2..b89ce02cb0 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -157,6 +157,7 @@ postcopy_pause_return_path(void) "" postcopy_pause_return_path_continued(void) "" postcopy_pause_continued(void) "" postcopy_start_set_run(void) "" +postcopy_page_req_add(uint64_t addr, int count) "new page req 0x%lx total %d" source_return_path_thread_bad_end(void) "" source_return_path_thread_end(void) "" source_return_path_thread_entry(void) "" @@ -267,6 +268,7 @@ postcopy_ram_incoming_cleanup_blocktime(uint64_t total) "total blocktime %" PRIu postcopy_request_shared_page(const char *sharer, const char *rb, uint64_t rb_offset) "for %s in %s offset 0x%"PRIx64 postcopy_request_shared_page_present(const char *sharer, const char *rb, uint64_t rb_offset) "%s already %s offset 0x%"PRIx64 postcopy_wake_shared(uint64_t client_addr, const char *rb) "at 0x%"PRIx64" in %s" +postcopy_page_req_del(uint64_t addr, int count) "resolved page req 0x%lx total %d" get_mem_fault_cpu_index(int cpu, uint32_t pid) "cpu: %d, pid: %u"

From patchwork Thu Sep 3 15:26:46 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Peter Xu
X-Patchwork-Id: 274663
From: Peter Xu
To: qemu-devel@nongnu.org
Subject: [PATCH 5/5] migration: Sync requested pages after postcopy recovery
Date: Thu, 3 Sep 2020 11:26:46 -0400
Message-Id: <20200903152646.93336-6-peterx@redhat.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20200903152646.93336-1-peterx@redhat.com>
References: <20200903152646.93336-1-peterx@redhat.com>
Cc: Xiaohui Li , "Dr . David Alan Gilbert" , peterx@redhat.com, Juan Quintela

We synchronize the requested pages right after a postcopy recovery happens. This helps to synchronize the prioritized pages on source so that the faulted threads can be served faster.
Reported-by: Xiaohui Li Signed-off-by: Peter Xu --- migration/savevm.c | 56 ++++++++++++++++++++++++++++++++++++++++++ migration/trace-events | 1 + 2 files changed, 57 insertions(+) diff --git a/migration/savevm.c b/migration/savevm.c index 304d98ff78..f998dd230d 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -2011,6 +2011,48 @@ static int loadvm_postcopy_handle_run(MigrationIncomingState *mis) return LOADVM_QUIT; } +/* We must be with page_request_mutex held */ +static gboolean postcopy_sync_page_req(gpointer key, gpointer value, + gpointer data) +{ + MigrationIncomingState *mis = data; + void *host_addr = (void *) key; + ram_addr_t rb_offset; + RAMBlock *rb; + int ret; + + rb = qemu_ram_block_from_host(host_addr, true, &rb_offset); + if (!rb) { + /* + * This should _never_ happen. However be nice for a migrating VM to + * not crash/assert. Post an error (note: intended to not use *_once + * because we do want to see all the illegal addresses; and this can + * never be triggered by the guest so we're safe) and move on next. 
+ */ + error_report("%s: illegal host addr %p", __func__, host_addr); + /* Try the next entry */ + return FALSE; + } + + ret = migrate_send_rp_message_req_pages(mis, rb, rb_offset); + if (ret) { + /* Refer to above comment - just try our best to continue */ + error_report("%s: send rp message failed for addr %p", + __func__, host_addr); + } + + trace_postcopy_page_req_sync((uint64_t)host_addr); + + return FALSE; +} + +static void migrate_send_rp_req_pages_pending(MigrationIncomingState *mis) +{ + qemu_mutex_lock(&mis->page_request_mutex); + g_tree_foreach(mis->page_requested, postcopy_sync_page_req, mis); + qemu_mutex_unlock(&mis->page_request_mutex); +} + static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis) { if (mis->state != MIGRATION_STATUS_POSTCOPY_RECOVER) { @@ -2033,6 +2075,20 @@ static int loadvm_postcopy_handle_resume(MigrationIncomingState *mis) /* Tell source that "we are ready" */ migrate_send_rp_resume_ack(mis, MIGRATION_RESUME_ACK_VALUE); + /* + * After a postcopy recovery, the source should have lost the postcopy + * queue, or potentially the requested pages could have been lost during + * the network down phase. Let's re-sync with the source VM by re-sending + * all the pending pages that we eagerly need, so these threads won't get + * blocked too long due to the recovery. + * + * Without this procedure, the faulted destination VM threads (waiting for + * page requests right before the postcopy is interrupted) can keep hanging + * until the pages are sent by the source during the background copying of + * pages, or another thread faulted on the same address accidentally. 
+ */ + migrate_send_rp_req_pages_pending(mis); + return 0; } diff --git a/migration/trace-events b/migration/trace-events index b89ce02cb0..54a6dd2761 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -49,6 +49,7 @@ vmstate_save(const char *idstr, const char *vmsd_name) "%s, %s" vmstate_load(const char *idstr, const char *vmsd_name) "%s, %s" postcopy_pause_incoming(void) "" postcopy_pause_incoming_continued(void) "" +postcopy_page_req_sync(uint64_t host_addr) "sync page req 0x%"PRIx64 # vmstate.c vmstate_load_field_error(const char *field, int ret) "field \"%s\" load failed, ret = %d"