Message ID | 20250514193406.3998101-1-superm1@kernel.org |
---|---|
Series | Improvements to S5 power consumption |
On 5/16/2025 9:58 AM, Rafael J. Wysocki wrote:
> On Wed, May 14, 2025 at 9:34 PM Mario Limonciello <superm1@kernel.org> wrote:
>>
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> When the system is powered off the kernel will call device_shutdown()
>> which will issue callbacks into PCI core to wake up a device and call
>> it's shutdown() callback.  This will leave devices in ACPI D0 which can
>> cause some devices to misbehave with spurious wakeups and also leave some
>> devices on which will consume power needlessly.
>>
>> The issue won't happen if the device is in D3 before system shutdown, so
>> putting device to low power state before shutdown solves the issue.
>>
>> ACPI Spec 6.5, "7.4.2.5 System _S4 State" says "Devices states are
>> compatible with the current Power Resource states. In other words, all
>> devices are in the D3 state when the system state is S4."
>>
>> The following "7.4.2.6 System _S5 State (Soft Off)" states "The S5
>> state is similar to the S4 state except that OSPM does not save any
>> context." so it's safe to assume devices should be at D3 for S5.
>>
>> To accomplish this, modify the PM core to call all the device hibernate
>> callbacks when turning off the system when the kernel is compiled with
>> hibernate support. If compiled without hibernate support or hibernate fails
>> fall back into the previous shutdown flow.
>>
>> Cc: AceLan Kao <acelan.kao@canonical.com>
>> Cc: Kai-Heng Feng <kaihengf@nvidia.com>
>> Cc: Mark Pearson <mpearson-lenovo@squebb.ca>
>> Cc: Merthan Karakaş <m3rthn.k@gmail.com>
>> Tested-by: Denis Benato <benato.denis96@gmail.com>
>> Link: https://lore.kernel.org/linux-pci/20231213182656.6165-1-mario.limonciello@amd.com/
>> Link: https://lore.kernel.org/linux-pci/20250506041934.1409302-1-superm1@kernel.org/
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>> v2:
>>  * Handle failures to hibernate (fall back to shutdown)
>>  * Don't use dedicated events
>>  * Only allow under CONFIG_HIBERNATE_CALLBACKS
>> ---
>>  kernel/reboot.c | 12 ++++++++++++
>>  1 file changed, 12 insertions(+)
>>
>> diff --git a/kernel/reboot.c b/kernel/reboot.c
>> index ec087827c85cd..52f5e6e36a6f8 100644
>> --- a/kernel/reboot.c
>> +++ b/kernel/reboot.c
>> @@ -13,6 +13,7 @@
>>  #include <linux/kexec.h>
>>  #include <linux/kmod.h>
>>  #include <linux/kmsg_dump.h>
>> +#include <linux/pm.h>
>>  #include <linux/reboot.h>
>>  #include <linux/suspend.h>
>>  #include <linux/syscalls.h>
>> @@ -305,6 +306,17 @@ static void kernel_shutdown_prepare(enum system_states state)
>>  		(state == SYSTEM_HALT) ? SYS_HALT : SYS_POWER_OFF, NULL);
>>  	system_state = state;
>>  	usermodehelper_disable();
>> +#ifdef CONFIG_HIBERNATE_CALLBACKS
>> +	if (dpm_suspend_start(PMSG_HIBERNATE))
>> +		goto resume_devices;
>
> A failure of one device may trigger a cascade of failures when trying
> to resume devices and it is not even necessary to resume the ones that
> have been powered off successfully.

Right it "shouldn't" be necessary, but I wanted to make sure that we had
a clean (expected) slate going into device_shutdown().

Otherwise drivers might not have been prepared to go right from
poweroff() to shutdown() callbacks.

>
> IMV this should just ignore errors during the processing of devices,
> so maybe introduce PMSG_POWEROFF for it?

Hmm - I guess it depends upon the failures that occurred.  I'll start
plumbing a new message and see how it looks.

I don't "think" we can safely call dpm_suspend_end() if
dpm_suspend_start() failed though.

>
> It should also ignore wakeup events that occur while devices are powered off.
>
>> +	if (dpm_suspend_end(PMSG_HIBERNATE))
>> +		goto resume_devices;
>> +	return;
>> +
>> +resume_devices:
>> +	pr_emerg("Failed to power off devices, using shutdown instead.\n");
>> +	dpm_resume_end(PMSG_RESTORE);
>
> Unfortunately, PMSG_RESTORE is not the right resume action for
> PMSG_HIBERNATE because it may not power-up things (some drivers assume
> that the restore kernel will power-up devices and so they don't do it
> in "restore" callbacks).
>
> I do realize that hibernation uses it to reverse PMSG_HIBERNATE, but
> it should not do that either.  That may be fixed later, though.
>
>> +#endif
>>  	device_shutdown();
>>  }
>>  /**
>> --
>
> I'd prefer to get back to this series after the 6.16 merge window
> starts.  It is sort of last minute for 6.16 and it is far from ready
> IMV.

Sure, I'll get a start on your feedback above and submit a fixed up
version after the merge window.
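
For readers following the control flow under discussion, the v2 hunk can be modeled in userspace roughly as below. This is only a sketch: the stubs reuse the kernel names dpm_suspend_start(), dpm_suspend_end(), dpm_resume_end() and device_shutdown(), but they take no arguments and do nothing except record the call order. The names trace, fail_start, fail_end and shutdown_prepare_model() are hypothetical and exist only for this illustration.

```c
/*
 * Userspace model only. The dpm_*() and device_shutdown() stubs are
 * stand-ins for the kernel functions of the same name; they just log
 * that they were called.
 */
#include <stdbool.h>
#include <string.h>

static char trace[128];            /* call order, space separated */
static bool fail_start, fail_end;  /* hypothetical fault-injection knobs */

static void note(const char *s)
{
	strcat(trace, s);
	strcat(trace, " ");
}

static int dpm_suspend_start(void) { note("suspend_start"); return fail_start ? -1 : 0; }
static int dpm_suspend_end(void)   { note("suspend_end");   return fail_end ? -1 : 0; }
static void dpm_resume_end(void)   { note("resume_end"); }
static void device_shutdown(void)  { note("device_shutdown"); }

/* Mirrors the branch structure of the v2 hunk in kernel_shutdown_prepare(). */
static void shutdown_prepare_model(void)
{
	trace[0] = '\0';

	if (dpm_suspend_start())
		goto resume_devices;
	if (dpm_suspend_end())
		goto resume_devices;
	return;                    /* devices powered off, skip shutdown() */

resume_devices:
	dpm_resume_end();          /* unwind, then use the old flow */
	device_shutdown();
}
```

With both knobs clear, the trace stops after suspend_end. Setting either knob adds resume_end followed by device_shutdown, which is the fallback path the review comments question.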
On Fri, May 16, 2025 at 9:33 PM Mario Limonciello <superm1@kernel.org> wrote:
>
> On 5/16/2025 9:58 AM, Rafael J. Wysocki wrote:
> > On Wed, May 14, 2025 at 9:34 PM Mario Limonciello <superm1@kernel.org> wrote:
> >>
> >> From: Mario Limonciello <mario.limonciello@amd.com>
> >>
> >> When the system is powered off the kernel will call device_shutdown()
> >> which will issue callbacks into PCI core to wake up a device and call
> >> it's shutdown() callback.  This will leave devices in ACPI D0 which can
> >> cause some devices to misbehave with spurious wakeups and also leave some
> >> devices on which will consume power needlessly.
> >>
> >> The issue won't happen if the device is in D3 before system shutdown, so
> >> putting device to low power state before shutdown solves the issue.
> >>
> >> ACPI Spec 6.5, "7.4.2.5 System _S4 State" says "Devices states are
> >> compatible with the current Power Resource states. In other words, all
> >> devices are in the D3 state when the system state is S4."
> >>
> >> The following "7.4.2.6 System _S5 State (Soft Off)" states "The S5
> >> state is similar to the S4 state except that OSPM does not save any
> >> context." so it's safe to assume devices should be at D3 for S5.
> >>
> >> To accomplish this, modify the PM core to call all the device hibernate
> >> callbacks when turning off the system when the kernel is compiled with
> >> hibernate support. If compiled without hibernate support or hibernate fails
> >> fall back into the previous shutdown flow.
> >>
> >> Cc: AceLan Kao <acelan.kao@canonical.com>
> >> Cc: Kai-Heng Feng <kaihengf@nvidia.com>
> >> Cc: Mark Pearson <mpearson-lenovo@squebb.ca>
> >> Cc: Merthan Karakaş <m3rthn.k@gmail.com>
> >> Tested-by: Denis Benato <benato.denis96@gmail.com>
> >> Link: https://lore.kernel.org/linux-pci/20231213182656.6165-1-mario.limonciello@amd.com/
> >> Link: https://lore.kernel.org/linux-pci/20250506041934.1409302-1-superm1@kernel.org/
> >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> >> ---
> >> v2:
> >>  * Handle failures to hibernate (fall back to shutdown)
> >>  * Don't use dedicated events
> >>  * Only allow under CONFIG_HIBERNATE_CALLBACKS
> >> ---
> >>  kernel/reboot.c | 12 ++++++++++++
> >>  1 file changed, 12 insertions(+)
> >>
> >> diff --git a/kernel/reboot.c b/kernel/reboot.c
> >> index ec087827c85cd..52f5e6e36a6f8 100644
> >> --- a/kernel/reboot.c
> >> +++ b/kernel/reboot.c
> >> @@ -13,6 +13,7 @@
> >>  #include <linux/kexec.h>
> >>  #include <linux/kmod.h>
> >>  #include <linux/kmsg_dump.h>
> >> +#include <linux/pm.h>
> >>  #include <linux/reboot.h>
> >>  #include <linux/suspend.h>
> >>  #include <linux/syscalls.h>
> >> @@ -305,6 +306,17 @@ static void kernel_shutdown_prepare(enum system_states state)
> >>  		(state == SYSTEM_HALT) ? SYS_HALT : SYS_POWER_OFF, NULL);
> >>  	system_state = state;
> >>  	usermodehelper_disable();
> >> +#ifdef CONFIG_HIBERNATE_CALLBACKS
> >> +	if (dpm_suspend_start(PMSG_HIBERNATE))
> >> +		goto resume_devices;
> >
> > A failure of one device may trigger a cascade of failures when trying
> > to resume devices and it is not even necessary to resume the ones that
> > have been powered off successfully.
>
> Right it "shouldn't" be necessary, but I wanted to make sure that we had
> a clean (expected) slate going into device_shutdown().
>
> Otherwise drivers might not have been prepared to go right from
> poweroff() to shutdown() callbacks.
>
> >
> > IMV this should just ignore errors during the processing of devices,
> > so maybe introduce PMSG_POWEROFF for it?
>
> Hmm - I guess it depends upon the failures that occurred.  I'll start
> plumbing a new message and see how it looks.
>
> I don't "think" we can safely call dpm_suspend_end() if
> dpm_suspend_start() failed though.

Nothing is safe at this point when dpm_suspend_start() fails, so why
not just continue.  Hopefully, it'll get to the point when power can
be turned off and then it won't matter too much anyway.
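
One possible reading of the suggestion to ignore errors and keep going is sketched below as a userspace model. Everything in it is an assumption made for illustration: struct model_dev and poweroff_all_ignore_errors() are hypothetical names, and the thread does not say what a PMSG_POWEROFF implementation would actually look like.

```c
/*
 * Userspace model only: walk a device list, attempt to power off each
 * entry, and continue past failures instead of unwinding.
 */
#include <stddef.h>
#include <stdio.h>

/* Hypothetical device record for this model; not a kernel structure. */
struct model_dev {
	const char *name;
	int poweroff_ret;  /* simulated callback result, 0 on success */
	int powered_off;   /* set to 1 once the device is considered off */
};

/* Returns the number of devices whose poweroff attempt failed. */
static size_t poweroff_all_ignore_errors(struct model_dev *devs, size_t n)
{
	size_t failures = 0;

	for (size_t i = 0; i < n; i++) {
		if (devs[i].poweroff_ret) {
			/* Log and move on; nothing is resumed. */
			fprintf(stderr, "%s: poweroff failed (%d), continuing\n",
				devs[i].name, devs[i].poweroff_ret);
			failures++;
			continue;
		}
		devs[i].powered_off = 1;
	}
	return failures;
}
```

The point of the model is that a single failing device leaves the others powered off rather than triggering a resume of everything, which is the cascade the review warns about.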
From: Mario Limonciello <mario.limonciello@amd.com>

A variety of issues both in function and in power consumption have been
raised as a result of devices not being put into a low power state when
the system is powered off.

There have been some localized changes[1] to PCI core to help these issues,
but they have had various downsides.

This series instead tries to use the S4 flow when the system is being
powered off.  This lines up the behavior with what other operating systems
do as well.  If for some reason that fails or is not supported, unwind and
do the previous S5 flow that will wake all devices and run their shutdown()
callbacks.

Previous submissions [1]:
Link: https://lore.kernel.org/linux-pm/CAJZ5v0hrKEJa8Ad7iiAvQ3d_0ysVhzZcXSYc5kkL=6vtseF+bg@mail.gmail.com/T/#m91e4eae868a7405ae579e89b135085f4906225d2
Link: https://lore.kernel.org/linux-pci/20231213182656.6165-1-mario.limonciello@amd.com/
Link: https://lore.kernel.org/linux-pci/20250506041934.1409302-1-superm1@kernel.org/

Mario Limonciello (3):
  PM: Use hibernate flows for system power off
  PCI: Put PCIe ports with downstream devices into D3 at hibernate
  drm/amd: Avoid evicting resources at S5

 drivers/gpu/drm/amd/amdgpu/amdgpu_device.c |  4 +++
 drivers/pci/pci-driver.c                   | 39 ++++++++++++++++++++--
 kernel/reboot.c                            | 12 +++++++
 3 files changed, 53 insertions(+), 2 deletions(-)