Message ID | 20250226201839.2374631-1-pjones@redhat.com |
---|---|
State | Accepted |
Commit | 2b90e7ace79774a3540ce569e000388f8d22c9e0 |
Headers | show |
Series | efi: don't map the entire mokvar table to determine its size | expand |
On Wed, 26 Feb 2025 at 21:18, Peter Jones <pjones@redhat.com> wrote: > > Currently when validating the mokvar table, we (re)map the entire table > on each iteration of the loop, adding space as we discover new entries. > If the table grows over a certain size, this fails due to limitations of > early_memmap(), and we get a failure and traceback: > > ------------[ cut here ]------------ > WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:139 __early_ioremap+0xef/0x220 > Modules linked in: > CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.12.15-200.fc41.x86_64 #1 > Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20250221-6.copr8698600 02/21/2025 > RIP: 0010:__early_ioremap+0xef/0x220 > Code: e5 00 f0 ff ff 48 81 e5 00 f0 ff ff 4c 89 6c 24 08 41 81 e4 ff 0f 00 00 4c 29 ed 48 89 e8 48 c1 e8 0c 41 89 c7 83 f8 40 76 04 <0f> 0b eb 82 45 6b ee c0 41 81 c5 ff 05 00 00 45 85 ff 74 36 83 3d > RSP: 0000:ffffffff96803dd8 EFLAGS: 00010002 ORIG_RAX: 0000000000000000 > RAX: 0000000000000041 RBX: 0000000000000001 RCX: ffffffff97768250 > RDX: 8000000000000163 RSI: 0000000000000001 RDI: 000000007c4c3000 > RBP: 0000000000041000 R08: ffffffffff201630 R09: 0000000000000030 > R10: 000000007c4c3000 R11: ffffffff96803e20 R12: 0000000000000000 > R13: 000000007c4c3000 R14: 0000000000000001 R15: 0000000000000041 > FS: 0000000000000000(0000) GS:ffffffff97291000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: ffff9f1d8000040e CR3: 00000000653a4000 CR4: 00000000000000f0 > Call Trace: > <TASK> > ? __early_ioremap+0xef/0x220 > ? __warn.cold+0x93/0xfa > ? __early_ioremap+0xef/0x220 > ? report_bug+0xff/0x140 > ? early_fixup_exception+0x5d/0xb0 > ? early_idt_handler_common+0x2f/0x3a > ? __early_ioremap+0xef/0x220 > ? efi_mokvar_table_init+0xce/0x1d0 > ? setup_arch+0x864/0xc10 > ? start_kernel+0x6b/0xa10 > ? x86_64_start_reservations+0x24/0x30 > ? x86_64_start_kernel+0xed/0xf0 > ? common_startup_64+0x13e/0x141 > </TASK> > ---[ end trace 0000000000000000 ]--- > mokvar: Failed to map EFI MOKvar config table pa=0x7c4c3000, size=265187. > > Mapping the entire structure isn't actually necessary, as we don't ever > need more than one entry header mapped at once. > > This patch changes efi_mokvar_table_init() to only map each entry > header, not the entire table, when determining the table size. Since > we're not mapping any data past the variable name, it also changes the > code to enforce that each variable name is NUL terminated, rather than > attempting to verify it in place. > > Signed-off-by: Peter Jones <pjones@redhat.com> > --- > drivers/firmware/efi/mokvar-table.c | 41 +++++++++-------------------- > 1 file changed, 13 insertions(+), 28 deletions(-) > > diff --git a/drivers/firmware/efi/mokvar-table.c b/drivers/firmware/efi/mokvar-table.c > index 5ed0602c2f7..66eb83a0f12 100644 > --- a/drivers/firmware/efi/mokvar-table.c > +++ b/drivers/firmware/efi/mokvar-table.c > @@ -103,7 +103,6 @@ void __init efi_mokvar_table_init(void) > void *va = NULL; > unsigned long cur_offset = 0; > unsigned long offset_limit; > - unsigned long map_size = 0; > unsigned long map_size_needed = 0; > unsigned long size; > struct efi_mokvar_table_entry *mokvar_entry; > @@ -134,48 +133,34 @@ void __init efi_mokvar_table_init(void) > */ > err = -EINVAL; > while (cur_offset + sizeof(*mokvar_entry) <= offset_limit) { > - mokvar_entry = va + cur_offset; > - map_size_needed = cur_offset + sizeof(*mokvar_entry); > - if (map_size_needed > map_size) { > - if (va) > - early_memunmap(va, map_size); > - /* > - * Map a little more than the fixed size entry > - * header, anticipating some data. It's safe to > - * do so as long as we stay within current memory > - * descriptor. > - */ > - map_size = min(map_size_needed + 2*EFI_PAGE_SIZE, > - offset_limit); > - va = early_memremap(efi.mokvar_table, map_size); > - if (!va) { > - pr_err("Failed to map EFI MOKvar config table pa=0x%lx, size=%lu.\n", > - efi.mokvar_table, map_size); > - return; > - } > - mokvar_entry = va + cur_offset; > + if (va) > + early_memunmap(va, sizeof(*mokvar_entry)); > + va = early_memremap(efi.mokvar_table + cur_offset, sizeof(*mokvar_entry)); > + if (!va) { > + pr_err("Failed to map EFI MOKvar config table pa=0x%lx, size=%zu.\n", > + efi.mokvar_table + cur_offset, sizeof(*mokvar_entry)); > + return; > } > + mokvar_entry = va; > > /* Check for last sentinel entry */ > if (mokvar_entry->name[0] == '\0') { > if (mokvar_entry->data_size != 0) > break; > err = 0; > + map_size_needed = cur_offset + sizeof(*mokvar_entry); > break; > } > > - /* Sanity check that the name is null terminated */ > - size = strnlen(mokvar_entry->name, > - sizeof(mokvar_entry->name)); > - if (size >= sizeof(mokvar_entry->name)) > - break; > + /* Enforce that the name is null terminated */ > + mokvar_entry->name[sizeof(mokvar_entry->name)-1] = '\0'; > > /* Advance to the next entry */ > - cur_offset = map_size_needed + mokvar_entry->data_size; > + cur_offset += sizeof(*mokvar_entry) + mokvar_entry->data_size; > } > > if (va) > - early_memunmap(va, map_size); > + early_memunmap(va, sizeof(*mokvar_entry)); > if (err) { > pr_err("EFI MOKvar config table is not valid\n"); > return; Thanks for the fix. Should we add something like the below to avoid mapping the same page over and over again? Or is this premature optimization? --- a/drivers/firmware/efi/mokvar-table.c +++ b/drivers/firmware/efi/mokvar-table.c @@ -99,13 +99,13 @@ */ void __init efi_mokvar_table_init(void) { + struct efi_mokvar_table_entry __aligned(1) *mokvar_entry, *next_entry; efi_memory_desc_t md; void *va = NULL; unsigned long cur_offset = 0; unsigned long offset_limit; unsigned long map_size_needed = 0; unsigned long size; - struct efi_mokvar_table_entry *mokvar_entry; int err; if (!efi_enabled(EFI_MEMMAP)) @@ -142,7 +142,7 @@ return; } mokvar_entry = va; - +next: /* Check for last sentinel entry */ if (mokvar_entry->name[0] == '\0') { if (mokvar_entry->data_size != 0) @@ -156,7 +156,19 @@ void __init efi_mokvar_table_init(void) mokvar_entry->name[sizeof(mokvar_entry->name) - 1] = '\0'; /* Advance to the next entry */ - cur_offset += sizeof(*mokvar_entry) + mokvar_entry->data_size; + size = sizeof(*mokvar_entry) + mokvar_entry->data_size; + cur_offset += size; + + /* + * Don't bother remapping if the current entry header and the + * next one end on the same page. + */ + next_entry = (void *)((unsigned long)mokvar_entry + size); + if (((((unsigned long)(mokvar_entry + 1) - 1) ^ + ((unsigned long)(next_entry + 1) - 1)) & PAGE_MASK) == 0) { + mokvar_entry = next_entry; + goto next; + } } if (va)
On Thu, Feb 27, 2025 at 08:50:08AM +0100, Ard Biesheuvel wrote: > > Should we add something like the below to avoid mapping the same page > over and over again? Or is this premature optimization? > I can't honestly say I'm sure either way, but I'm leaning towards thinking it's probably worthwhile. On my development tree the number of these we wind up doing in the maximal case is 31, and but in a typical case it's more like 20, with a series that looks something like the below list of sizes and relative addresses from the first entry. (I generated this with ls and awk, so it's not quite exact but it's fairly representative.) I've marked which ones could be eliminated. map 264 at 0x0 unmap 264 at 0x0 <-- gone map 264 at 0x146 <-- gone unmap 264 at 0x146 <-- gone map 264 at 0x2a2 <-- gone unmap 264 at 0x2a2 <-- gone map 264 at 0x43e <-- gone unmap 264 at 0x43e <-- gone map 264 at 0x548 <-- gone unmap 264 at 0x548 <-- gone map 264 at 0x660 <-- gone unmap 264 at 0x660 <-- gone map 264 at 0x84d <-- gone unmap 264 at 0x84d map 264 at 0x191f unmap 264 at 0x191f <-- gone map 264 at 0x1a73 <-- gone unmap 264 at 0x1a73 <-- gone map 264 at 0x1b7c <-- gone unmap 264 at 0x1b7c <-- gone map 264 at 0x1cd0 <-- gone unmap 264 at 0x1cd0 map 264 at 0x21a8 unmap 264 at 0x21a8 <-- gone map 264 at 0x22c2 <-- gone unmap 264 at 0x22c2 <-- gone map 264 at 0x23cb <-- gone unmap 264 at 0x23cb <-- gone map 264 at 0x24d4 <-- gone unmap 264 at 0x24d4 <-- gone map 264 at 0x263c <-- gone unmap 264 at 0x263c <-- gone map 264 at 0x2746 <-- gone unmap 264 at 0x2746 map 264 at 0x4043 unmap 264 at 0x4043 map 264 at 0x86f7 unmap 264 at 0x86f7 So going from 19 map/unmap pairs to 5. Seems like it can't hurt, but it's a small number either way. Anyway, I tried your patch and it works for me: Tested-By: Peter Jones <pjones@redhat.com>
On Thu, 27 Feb 2025 at 18:13, Peter Jones <pjones@redhat.com> wrote: > > On Thu, Feb 27, 2025 at 08:50:08AM +0100, Ard Biesheuvel wrote: > > > > Should we add something like the below to avoid mapping the same page > > over and over again? Or is this premature optimization? > > > > I can't honestly say I'm sure either way, but I'm leaning towards > thinking it's probably worthwhile. On my development tree the number of > these we wind up doing in the maximal case is 31, and but in a typical > case it's more like 20, with a series that looks something like the > below list of sizes and relative addresses from the first entry. (I > generated this with ls and awk, so it's not quite exact but it's > fairly representative.) I've marked which ones could be eliminated. > > map 264 at 0x0 > unmap 264 at 0x0 <-- gone > map 264 at 0x146 <-- gone > unmap 264 at 0x146 <-- gone > map 264 at 0x2a2 <-- gone > unmap 264 at 0x2a2 <-- gone > map 264 at 0x43e <-- gone > unmap 264 at 0x43e <-- gone > map 264 at 0x548 <-- gone > unmap 264 at 0x548 <-- gone > map 264 at 0x660 <-- gone > unmap 264 at 0x660 <-- gone > map 264 at 0x84d <-- gone > unmap 264 at 0x84d > map 264 at 0x191f > unmap 264 at 0x191f <-- gone > map 264 at 0x1a73 <-- gone > unmap 264 at 0x1a73 <-- gone > map 264 at 0x1b7c <-- gone > unmap 264 at 0x1b7c <-- gone > map 264 at 0x1cd0 <-- gone > unmap 264 at 0x1cd0 > map 264 at 0x21a8 > unmap 264 at 0x21a8 <-- gone > map 264 at 0x22c2 <-- gone > unmap 264 at 0x22c2 <-- gone > map 264 at 0x23cb <-- gone > unmap 264 at 0x23cb <-- gone > map 264 at 0x24d4 <-- gone > unmap 264 at 0x24d4 <-- gone > map 264 at 0x263c <-- gone > unmap 264 at 0x263c <-- gone > map 264 at 0x2746 <-- gone > unmap 264 at 0x2746 > map 264 at 0x4043 > unmap 264 at 0x4043 > map 264 at 0x86f7 > unmap 264 at 0x86f7 > > So going from 19 map/unmap pairs to 5. Seems like it can't hurt, but > it's a small number either way. > So this would go from 19 to 3 on a 16k pages kernel. So I'm leaning to applying it as well. BTW these results confirm my suspicion that these headers may appear misaligned, hence the __aligned(1) > Anyway, I tried your patch and it works for me: > > Tested-By: Peter Jones <pjones@redhat.com> > Thanks. I'll apply it as a separate patch, and only tag your patch cc:stable
diff --git a/drivers/firmware/efi/mokvar-table.c b/drivers/firmware/efi/mokvar-table.c index 5ed0602c2f7..66eb83a0f12 100644 --- a/drivers/firmware/efi/mokvar-table.c +++ b/drivers/firmware/efi/mokvar-table.c @@ -103,7 +103,6 @@ void __init efi_mokvar_table_init(void) void *va = NULL; unsigned long cur_offset = 0; unsigned long offset_limit; - unsigned long map_size = 0; unsigned long map_size_needed = 0; unsigned long size; struct efi_mokvar_table_entry *mokvar_entry; @@ -134,48 +133,34 @@ void __init efi_mokvar_table_init(void) */ err = -EINVAL; while (cur_offset + sizeof(*mokvar_entry) <= offset_limit) { - mokvar_entry = va + cur_offset; - map_size_needed = cur_offset + sizeof(*mokvar_entry); - if (map_size_needed > map_size) { - if (va) - early_memunmap(va, map_size); - /* - * Map a little more than the fixed size entry - * header, anticipating some data. It's safe to - * do so as long as we stay within current memory - * descriptor. - */ - map_size = min(map_size_needed + 2*EFI_PAGE_SIZE, - offset_limit); - va = early_memremap(efi.mokvar_table, map_size); - if (!va) { - pr_err("Failed to map EFI MOKvar config table pa=0x%lx, size=%lu.\n", - efi.mokvar_table, map_size); - return; - } - mokvar_entry = va + cur_offset; + if (va) + early_memunmap(va, sizeof(*mokvar_entry)); + va = early_memremap(efi.mokvar_table + cur_offset, sizeof(*mokvar_entry)); + if (!va) { + pr_err("Failed to map EFI MOKvar config table pa=0x%lx, size=%zu.\n", + efi.mokvar_table + cur_offset, sizeof(*mokvar_entry)); + return; } + mokvar_entry = va; /* Check for last sentinel entry */ if (mokvar_entry->name[0] == '\0') { if (mokvar_entry->data_size != 0) break; err = 0; + map_size_needed = cur_offset + sizeof(*mokvar_entry); break; } - /* Sanity check that the name is null terminated */ - size = strnlen(mokvar_entry->name, - sizeof(mokvar_entry->name)); - if (size >= sizeof(mokvar_entry->name)) - break; + /* Enforce that the name is null terminated */ + mokvar_entry->name[sizeof(mokvar_entry->name)-1] = '\0'; /* Advance to the next entry */ - cur_offset = map_size_needed + mokvar_entry->data_size; + cur_offset += sizeof(*mokvar_entry) + mokvar_entry->data_size; } if (va) - early_memunmap(va, map_size); + early_memunmap(va, sizeof(*mokvar_entry)); if (err) { pr_err("EFI MOKvar config table is not valid\n"); return;
Currently when validating the mokvar table, we (re)map the entire table on each iteration of the loop, adding space as we discover new entries. If the table grows over a certain size, this fails due to limitations of early_memmap(), and we get a failure and traceback: ------------[ cut here ]------------ WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:139 __early_ioremap+0xef/0x220 Modules linked in: CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.12.15-200.fc41.x86_64 #1 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS edk2-20250221-6.copr8698600 02/21/2025 RIP: 0010:__early_ioremap+0xef/0x220 Code: e5 00 f0 ff ff 48 81 e5 00 f0 ff ff 4c 89 6c 24 08 41 81 e4 ff 0f 00 00 4c 29 ed 48 89 e8 48 c1 e8 0c 41 89 c7 83 f8 40 76 04 <0f> 0b eb 82 45 6b ee c0 41 81 c5 ff 05 00 00 45 85 ff 74 36 83 3d RSP: 0000:ffffffff96803dd8 EFLAGS: 00010002 ORIG_RAX: 0000000000000000 RAX: 0000000000000041 RBX: 0000000000000001 RCX: ffffffff97768250 RDX: 8000000000000163 RSI: 0000000000000001 RDI: 000000007c4c3000 RBP: 0000000000041000 R08: ffffffffff201630 R09: 0000000000000030 R10: 000000007c4c3000 R11: ffffffff96803e20 R12: 0000000000000000 R13: 000000007c4c3000 R14: 0000000000000001 R15: 0000000000000041 FS: 0000000000000000(0000) GS:ffffffff97291000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffff9f1d8000040e CR3: 00000000653a4000 CR4: 00000000000000f0 Call Trace: <TASK> ? __early_ioremap+0xef/0x220 ? __warn.cold+0x93/0xfa ? __early_ioremap+0xef/0x220 ? report_bug+0xff/0x140 ? early_fixup_exception+0x5d/0xb0 ? early_idt_handler_common+0x2f/0x3a ? __early_ioremap+0xef/0x220 ? efi_mokvar_table_init+0xce/0x1d0 ? setup_arch+0x864/0xc10 ? start_kernel+0x6b/0xa10 ? x86_64_start_reservations+0x24/0x30 ? x86_64_start_kernel+0xed/0xf0 ? common_startup_64+0x13e/0x141 </TASK> ---[ end trace 0000000000000000 ]--- mokvar: Failed to map EFI MOKvar config table pa=0x7c4c3000, size=265187. Mapping the entire structure isn't actually necessary, as we don't ever need more than one entry header mapped at once. This patch changes efi_mokvar_table_init() to only map each entry header, not the entire table, when determining the table size. Since we're not mapping any data past the variable name, it also changes the code to enforce that each variable name is NUL terminated, rather than attempting to verify it in place. Signed-off-by: Peter Jones <pjones@redhat.com> --- drivers/firmware/efi/mokvar-table.c | 41 +++++++++-------------------- 1 file changed, 13 insertions(+), 28 deletions(-)