mbox series

[0/2] UEFI panic notification mechanism

Message ID 20220520195028.1347426-1-gpiccoli@igalia.com
Headers show
Series UEFI panic notification mechanism | expand

Message

Guilherme G. Piccoli May 20, 2022, 7:50 p.m. UTC
Hi folks, this patch set is about a mechanism to notify the UEFI
firmware about a panic event in the kernel, so the firmware is
enabled to take some action. The specific use case for us is to
show an alternative splash screen if a panic happened.
The base for the idea is to explore the panic notifiers mechanism,
that allows a callback to execute in the panic path.

Patch 1 is kind of a "clean-up" in a way; just taking a helper out
of efibc and adding it on generic efi.h header, so we have common
code used by 3 modules (efibc, efi-pstore and efi-panic).

Patch 2 is the efi-panic module itself; it is *strongly* based on
efibc, and for that I'd like to hereby thank to the authors.
The efibc code is very clear!


Some design decisions to be discussed:

(1) The variable name and value - I called it PanicWarn, and the
data it holds is just a byte, set by default to 0xFF (though users
can change that via module parameter). I have no attachment to
these, we can improve naming and the size of the data for example,
suggestions are appreciated!


(2) The 3 modules (efibc, efi-pstore and efi-panic) share a lot of
ideas or code; both efi-{pstore,panic} deal with panic events, and
both efi{bc,-panic} rely on notifiers and share code.
Should we unify some of them or anything like that? It seemed to me
the proper approach would be a simple and small standalone module,
but I'm open for suggestions.


(3) I've noticed a behavior that also happens in efi-pstore, I'm not
sure if that's something to care about. In the efi-panic module, after
a fresh boot, we check if PanicWarn is there an if so, we print a
message and _delete_ that variable from the NVRAM. But...the variable
is still present in sysfs, in the list created by efivars. Same happens
with efi-pstore, if we delete the dumps generated from /sys/fs/pstore.

In my case, I've used the __efivar_entry_delete() variant, so it doesn't
delete from any list, only from the firmware area. Should this be handled?
Or we just don't care with the empty variable reference in the sysfs tree?


I appreciate feedbacks and suggestions, thanks in advance for the attention!
Cheers,


Guilherme


Guilherme G. Piccoli (2):
  efi: Add a generic helper to convert strings to unicode
  efi-panic: Introduce the UEFI panic notification module

 drivers/firmware/efi/Kconfig      | 10 ++++
 drivers/firmware/efi/Makefile     |  1 +
 drivers/firmware/efi/efi-panic.c  | 97 +++++++++++++++++++++++++++++++
 drivers/firmware/efi/efi-pstore.c |  5 +-
 drivers/firmware/efi/efibc.c      | 16 ++---
 include/linux/efi.h               | 16 +++++
 6 files changed, 130 insertions(+), 15 deletions(-)
 create mode 100644 drivers/firmware/efi/efi-panic.c

Comments

Guilherme G. Piccoli June 2, 2022, 5:39 p.m. UTC | #1
Hi Ard, apologies for annoying!

Just a friendly ping asking if you have any opinions about these patches.

Thanks in advance! Cheers,


Guilherme
Ard Biesheuvel June 28, 2022, 7:49 a.m. UTC | #2
On Thu, 2 Jun 2022 at 19:40, Guilherme G. Piccoli <gpiccoli@igalia.com> wrote:
>
> Hi Ard, apologies for annoying!
>

No worries, just very busy :-)

> Just a friendly ping asking if you have any opinions about these patches.
>

Honestly, I'm not sure I see the value of this. You want to 'notify
the UEFI firmware' but the firmware doesn't really care about these
variables, only your bespoke tooling does. EFI pstore captures far
more data, so if you just wipe /sys/fs/pstore after each clean boot,
you already have all the data you need, no?

Also, I'm in the process of removing the public efivar_entry_xxx() API
- please look at the efi/next tree for a peek. This is related to your
point 3), i.e., the efivar layer is a disaster in terms of consistency
between different observers of the EFI variable store. Switching to
efivar_set_variable() [the new API] should fix that.
Guilherme G. Piccoli June 28, 2022, 6:10 p.m. UTC | #3
On 28/06/2022 09:49, Ard Biesheuvel wrote:
> On Thu, 2 Jun 2022 at 19:40, Guilherme G. Piccoli <gpiccoli@igalia.com> wrote:
>>
>> Hi Ard, apologies for annoying!
>>
> 
> No worries, just very busy :-)
> 
>> Just a friendly ping asking if you have any opinions about these patches.
>>
> 
> Honestly, I'm not sure I see the value of this. You want to 'notify
> the UEFI firmware' but the firmware doesn't really care about these
> variables, only your bespoke tooling does. EFI pstore captures far
> more data, so if you just wipe /sys/fs/pstore after each clean boot,
> you already have all the data you need, no?
> 
> Also, I'm in the process of removing the public efivar_entry_xxx() API
> - please look at the efi/next tree for a peek. This is related to your
> point 3), i.e., the efivar layer is a disaster in terms of consistency
> between different observers of the EFI variable store. Switching to
> efivar_set_variable() [the new API] should fix that.

Hi Ard, thanks a lot for your review! Lemme split my points in some
bullets below:

a) What about patch 1, is it good as-is?


b) Cool about efivar_set_variable(), could fix that in V2.


c) Now, to the most relevant thing, the value of this. I might be
mistaken, but is there any known way to let UEFI know a panic happened?
For now, maybe only my UEFI fw might use that (and the usage is very
nice, showing a panic logo), but this opens a myriad of potential uses.
We can think in RAM preserving mechanism conditional to panic scenarios,
special resets for panic vs. cold boot, and even maybe a firmware vmcore
capturing.

If there is any other way to let UEFI know about a panic, let me know
and that will likely be more than enough for me. Otherwise, do you think
such small code would be a big burden to carry, considering the
cost/benefit for my use case?

Thanks in advance for your analysis.
Cheers,


Guilherme