From patchwork Thu Oct 23 19:14:34 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mark Rutland X-Patchwork-Id: 39395 Return-Path: X-Original-To: linaro@patches.linaro.org Delivered-To: linaro@patches.linaro.org Received: from mail-la0-f69.google.com (mail-la0-f69.google.com [209.85.215.69]) by ip-10-151-82-157.ec2.internal (Postfix) with ESMTPS id A649520341 for ; Thu, 23 Oct 2014 19:20:30 +0000 (UTC) Received: by mail-la0-f69.google.com with SMTP id q1sf947232lam.4 for ; Thu, 23 Oct 2014 12:20:29 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:delivered-to:date:from:to:subject:message-id :references:mime-version:in-reply-to:thread-topic:accept-language :user-agent:cc:precedence:list-id:list-unsubscribe:list-archive :list-post:list-help:list-subscribe:sender:errors-to :x-original-sender:x-original-authentication-results:mailing-list :content-disposition:content-language:content-type :content-transfer-encoding; bh=zwTCkGyHqO5rWAVMachWGWuHEOfz3hPp4XxUPKXGth4=; b=af+wpIS+QAgP9O409JWTDeHMGXs+H6kqW1klYabrSJ2BMCjJL84NSbxJP20nfpNDFo JZeutPmtjzat6osleRpAJn8H9wHSCuSiaguynk01ICnossQNkYIj6Z0SvNmADg8ZT7hy MUG3hb3IVwnEYUzDExI7PQ7QtpHmIh6LqjsDyGlOEI7QOzitoPty/6gnQcgpva0USfg4 CeYB1TVVBmfg50nF++P2I0yKDG3tDQUnxKhPgQMNiFlWQNzy6FHawwCvk0qpQ8zoFS52 FvNoUqlkhuKmSsBoEJKD0uGP7WyzqQSEs3sdSvSgNzATBwwngZk84NTD+TLCTZxK7RxO cg5g== X-Gm-Message-State: ALoCoQntPO4n8kKZBlYQMemF/O3sX5md7REEDfraxIOvlMwJfTl8JMsEav1llNKRk6ncWPralb/7 X-Received: by 10.152.3.168 with SMTP id d8mr1425415lad.0.1414092029376; Thu, 23 Oct 2014 12:20:29 -0700 (PDT) X-BeenThere: patchwork-forward@linaro.org Received: by 10.152.206.106 with SMTP id ln10ls335156lac.103.gmail; Thu, 23 Oct 2014 12:20:29 -0700 (PDT) X-Received: by 10.152.3.168 with SMTP id d8mr318461lad.34.1414092029002; Thu, 23 Oct 2014 12:20:29 -0700 (PDT) Received: from mail-la0-f47.google.com (mail-la0-f47.google.com. [209.85.215.47]) by mx.google.com with ESMTPS id rj3si4020352lbb.85.2014.10.23.12.20.28 for (version=TLSv1 cipher=ECDHE-RSA-RC4-SHA bits=128/128); Thu, 23 Oct 2014 12:20:28 -0700 (PDT) Received-SPF: pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.215.47 as permitted sender) client-ip=209.85.215.47; Received: by mail-la0-f47.google.com with SMTP id pv20so1420305lab.20 for ; Thu, 23 Oct 2014 12:20:28 -0700 (PDT) X-Received: by 10.152.87.98 with SMTP id w2mr183459laz.27.1414092028822; Thu, 23 Oct 2014 12:20:28 -0700 (PDT) X-Forwarded-To: patchwork-forward@linaro.org X-Forwarded-For: patch@linaro.org patchwork-forward@linaro.org Delivered-To: patch@linaro.org Received: by 10.112.84.229 with SMTP id c5csp311411lbz; Thu, 23 Oct 2014 12:20:27 -0700 (PDT) X-Received: by 10.66.139.106 with SMTP id qx10mr5824754pab.138.1414092026544; Thu, 23 Oct 2014 12:20:26 -0700 (PDT) Received: from bombadil.infradead.org (bombadil.infradead.org. [2001:1868:205::9]) by mx.google.com with ESMTPS id hn8si2263503pac.212.2014.10.23.12.20.25 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 23 Oct 2014 12:20:26 -0700 (PDT) Received-SPF: none (google.com: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org does not designate permitted sender hosts) client-ip=2001:1868:205::9; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1XhNqv-0000FJ-6Z; Thu, 23 Oct 2014 19:15:17 +0000 Received: from cam-admin0.cambridge.arm.com ([217.140.96.50]) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1XhNqr-0007Yo-Ox for linux-arm-kernel@lists.infradead.org; Thu, 23 Oct 2014 19:15:15 +0000 Received: from leverpostej (leverpostej.cambridge.arm.com [10.1.205.151]) by cam-admin0.cambridge.arm.com (8.12.6/8.12.6) with ESMTP id s9NJEgwo008947; Thu, 23 Oct 2014 20:14:42 +0100 (BST) Date: Thu, 23 Oct 2014 20:14:34 +0100 From: Mark Rutland To: "msalter@redhat.com" Subject: Re: [PATCH 06/10] arm64/efi: use UEFI memory map unconditionally if available Message-ID: <20141023191433.GB20121@leverpostej> References: <1413987713-30528-1-git-send-email-ard.biesheuvel@linaro.org> <1413987713-30528-7-git-send-email-ard.biesheuvel@linaro.org> <1413997616.2985.74.camel@deneb.redhat.com> <20141023155428.GA977@leverpostej> <1414081198.6829.12.camel@deneb.redhat.com> MIME-Version: 1.0 In-Reply-To: <1414081198.6829.12.camel@deneb.redhat.com> Thread-Topic: [PATCH 06/10] arm64/efi: use UEFI memory map unconditionally if available Accept-Language: en-GB, en-US User-Agent: Mutt/1.5.21 (2010-09-15) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20141023_121514_228180_039998D9 X-CRM114-Status: GOOD ( 34.72 ) X-Spam-Score: -6.4 (------) X-Spam-Report: SpamAssassin version 3.4.0 on bombadil.infradead.org summary: Content analysis details: (-6.4 points) pts rule name description ---- ---------------------- -------------------------------------------------- -5.0 RCVD_IN_DNSWL_HI RBL: Sender listed at http://www.dnswl.org/, high trust [217.140.96.50 listed in list.dnswl.org] -0.0 SPF_PASS SPF: sender matches SPF record -1.4 RP_MATCHES_RCVD Envelope sender domain matches handover relay domain Cc: "matt.fleming@intel.com" , "yi.li@linaro.org" , Ard Biesheuvel , Catalin Marinas , Will Deacon , "leif.lindholm@linaro.org" , "roy.franz@linaro.org" , "linux-efi@vger.kernel.org" , "dyoung@redhat.com" , "linux-arm-kernel@lists.infradead.org" X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: , List-Help: , List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patch=linaro.org@lists.infradead.org X-Removed-Original-Auth: Dkim didn't pass. X-Original-Sender: mark.rutland@arm.com X-Original-Authentication-Results: mx.google.com; spf=pass (google.com: domain of patch+caf_=patchwork-forward=linaro.org@linaro.org designates 209.85.215.47 as permitted sender) smtp.mail=patch+caf_=patchwork-forward=linaro.org@linaro.org Mailing-list: list patchwork-forward@linaro.org; contact patchwork-forward+owners@linaro.org X-Google-Group-Id: 836684582541 Content-Disposition: inline Content-Language: en-US [...] > > > It also looks like EFI_BOOT flag will be set even if uefi_init fails. > > > If uefi_init fails, we only need reserve_regions() for the purpose > > > of adding memblocks. Otherwise, we end up wasting a lot of memory. > > > > What memory are we wasting in that case? Surely the only items that we > > could choose to throw away if we failed uefi_init are > > EFI_ACPI_RECLAIM_MEMORY and EFI_MEMORY_RUNTIME? > > > > We might want to keep those around so we can kexec into a kernel where > > we can make use of them. Surely they shouldn't take up a significant > > proportion of the available memory? > > In addition, reserve_regions() also reserves BOOT_SERVICES (which gets > freed up later after set_virtual_address_map(). That won't happen if > uefi_init() fails. In any case, if uefi_init() fails, we won't be able > to find the ACPI tables kexec kernel won't be able to either. If you need the ACPI tables, the first kernel won't boot. However, if we fail to map the runtime services data and code, that doesn't mean the next kernel can't. We might have failed to map the runtime services due to an early_ioremap bug or similar, and that could be fixed in the kernel we're about to kexec to. If we're going to nuke runtime regions, we should nuke them in the memory map descriptors as a courtesy to the next kernel. Otherwise we could be more courteous and simply leave them be (which is my preference). > The BOOT_SERVICES stuff is the big chunk. There was some discussion of > not reserving that but no one has gotten around to writing the patch > to clean out the code that does it. I've just spent some time doing so; patch below. It boots happily on my Juno (with 4KiB pages at least), but I haven't given it much of a stress test. I wasn't sure if unusable memory could show up with EFI_MEMORY_WB attributes. If so, this patch still won't prevent that from being mapped as cacheable, but that should be easy to fix. Thanks, Mark. ---->8---- >From 581d5bac5b4a6f93e23737012b71c58d809af6bb Mon Sep 17 00:00:00 2001 From: Mark Rutland Date: Thu, 23 Oct 2014 19:41:55 +0100 Subject: [PATCH] arm64: efi: Simplify memory descriptor handling Currently discovering the memory map from UEFI is a multi-stage affair, where the boot services code and data are kept around for much longer than necessary. Additionally, potential overlap between memory and IO mappings at kernel page granularity is not handled. This patch attempts to solve both of these issues, with the addition and filtering of memory regions being handled in a single function. First, all normal memory (i.e. anything which can be mapped writeback cacheable) is added to memblock. Second, IO and reserved regions are filtered out of memblock at kernel page granularity, to prevent potential conflicting mappings. As some reserved regions must be present in the idmap these are reserved. Everything else that's not normal memory is removed, preventing the possibility of a cacheable mapping covering something it shouldn't. Signed-off-by: Mark Rutland Cc: Ard Biesheuvel Cc: Leif Lindholm --- arch/arm64/kernel/efi.c | 181 +++++++++++------------------------------------- 1 file changed, 42 insertions(+), 139 deletions(-) diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c index 03aaa99..8873c28 100644 --- a/arch/arm64/kernel/efi.c +++ b/arch/arm64/kernel/efi.c @@ -154,161 +154,66 @@ static __init int is_reserve_region(efi_memory_desc_t *md) return 0; } -static __init void reserve_regions(void) +static __init void efi_add_memory(void) { efi_memory_desc_t *md; - u64 paddr, npages, size; + u64 paddr, size; if (uefi_debug) - pr_info("Processing EFI memory map:\n"); + pr_info("EFI: adding normal memory:\n"); for_each_efi_memory_desc(&memmap, md) { + if (!is_normal_ram(md)) + continue; + paddr = md->phys_addr; - npages = md->num_pages; + size = md->num_pages << EFI_PAGE_SHIFT; if (uefi_debug) - pr_info(" 0x%012llx-0x%012llx [%s]", - paddr, paddr + (npages << EFI_PAGE_SHIFT) - 1, + pr_info(" 0x%012llx-0x%012llx [%s]\n", + paddr, size - 1, memory_type_name[md->type]); - memrange_efi_to_native(&paddr, &npages); - size = npages << PAGE_SHIFT; + early_init_dt_add_memory_arch(paddr, size); + } + + /* + * Regions describe in an EFI_MEMORY_DESCRIPTOR are only guaranteed to + * be 4KiB aligned but the kernel page size could be larger, and + * regions with different attributes could fall into the same kernel + * page. Thus we must remove any memory in the same kernel page as an + * IO region. + * + * Reserved regions of memory need to be in the idmap, so just reserve + * them. + */ + if (uefi_debug) + pr_info("EFI: correcting for overlapping regions:\n"); + + for_each_efi_memory_desc(&memmap, md) { + if (!is_reserve_region(md) && is_normal_ram(md)) + continue; + + paddr = md->phys_addr; + size = md->num_pages << EFI_PAGE_SHIFT; + + size = PAGE_ALIGN(size + (paddr & ~PAGE_MASK)); + paddr &= PAGE_MASK; - if (is_normal_ram(md)) - early_init_dt_add_memory_arch(paddr, size); + if (uefi_debug) + pr_info(" 0x%012llx-0x%012llx [%s]\n", + paddr, size - 1, + memory_type_name[md->type]); - if (is_reserve_region(md) || - md->type == EFI_BOOT_SERVICES_CODE || - md->type == EFI_BOOT_SERVICES_DATA) { + if (is_reserve_region(md)) memblock_reserve(paddr, size); - if (uefi_debug) - pr_cont("*"); - } - - if (uefi_debug) - pr_cont("\n"); + else + memblock_remove(paddr, size); } set_bit(EFI_MEMMAP, &efi.flags); } - -static u64 __init free_one_region(u64 start, u64 end) -{ - u64 size = end - start; - - if (uefi_debug) - pr_info(" EFI freeing: 0x%012llx-0x%012llx\n", start, end - 1); - - free_bootmem_late(start, size); - return size; -} - -static u64 __init free_region(u64 start, u64 end) -{ - u64 map_start, map_end, total = 0; - - if (end <= start) - return total; - - map_start = (u64)memmap.phys_map; - map_end = PAGE_ALIGN(map_start + (memmap.map_end - memmap.map)); - map_start &= PAGE_MASK; - - if (start < map_end && end > map_start) { - /* region overlaps UEFI memmap */ - if (start < map_start) - total += free_one_region(start, map_start); - - if (map_end < end) - total += free_one_region(map_end, end); - } else - total += free_one_region(start, end); - - return total; -} - -static void __init free_boot_services(void) -{ - u64 total_freed = 0; - u64 keep_end, free_start, free_end; - efi_memory_desc_t *md; - - /* - * If kernel uses larger pages than UEFI, we have to be careful - * not to inadvertantly free memory we want to keep if there is - * overlap at the kernel page size alignment. We do not want to - * free is_reserve_region() memory nor the UEFI memmap itself. - * - * The memory map is sorted, so we keep track of the end of - * any previous region we want to keep, remember any region - * we want to free and defer freeing it until we encounter - * the next region we want to keep. This way, before freeing - * it, we can clip it as needed to avoid freeing memory we - * want to keep for UEFI. - */ - - keep_end = 0; - free_start = 0; - - for_each_efi_memory_desc(&memmap, md) { - u64 paddr, npages, size; - - if (is_reserve_region(md)) { - /* - * We don't want to free any memory from this region. - */ - if (free_start) { - /* adjust free_end then free region */ - if (free_end > md->phys_addr) - free_end -= PAGE_SIZE; - total_freed += free_region(free_start, free_end); - free_start = 0; - } - keep_end = md->phys_addr + (md->num_pages << EFI_PAGE_SHIFT); - continue; - } - - if (md->type != EFI_BOOT_SERVICES_CODE && - md->type != EFI_BOOT_SERVICES_DATA) { - /* no need to free this region */ - continue; - } - - /* - * We want to free memory from this region. - */ - paddr = md->phys_addr; - npages = md->num_pages; - memrange_efi_to_native(&paddr, &npages); - size = npages << PAGE_SHIFT; - - if (free_start) { - if (paddr <= free_end) - free_end = paddr + size; - else { - total_freed += free_region(free_start, free_end); - free_start = paddr; - free_end = paddr + size; - } - } else { - free_start = paddr; - free_end = paddr + size; - } - if (free_start < keep_end) { - free_start += PAGE_SIZE; - if (free_start >= free_end) - free_start = 0; - } - } - if (free_start) - total_freed += free_region(free_start, free_end); - - if (total_freed) - pr_info("Freed 0x%llx bytes of EFI boot services memory", - total_freed); -} - void __init efi_init(void) { struct efi_fdt_params params; @@ -330,7 +235,7 @@ void __init efi_init(void) if (uefi_init() < 0) return; - reserve_regions(); + efi_add_memory(); } void __init efi_idmap_init(void) @@ -452,8 +357,6 @@ static int __init arm64_enter_virtual_mode(void) kfree(virtmap); - free_boot_services(); - if (status != EFI_SUCCESS) { pr_err("Failed to set EFI virtual address map! [%lx]\n", status);