[v1] kexec_core: Add and update comments regarding the KEXEC_JUMP flow

Message ID 4968818.GXAFRqVoOG@rjwysocki.net
State New

Commit Message

Rafael J. Wysocki Dec. 16, 2024, 1:39 p.m. UTC
From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The KEXEC_JUMP flow is analogous to hibernation flows occurring before
and after creating an image and before and after jumping from the
restore kernel to the image one, which is why it uses the same device
callbacks as those hibernation flows.

Add comments explaining that to the code in question and update an
existing comment in it which appears a bit out of context.

No functional changes.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

The kexec_jump code has a few problems AFAICS.

First off, it should use lock_system_sleep() or "interesting" things may
happen when it is run in parallel to a system-wide PM transition.

Second, it looks like it should use pm_sleep_disable/enable_secondary_cpus()
instead of the "raw" suspend_disable/enable_secondary_cpus() because running
it with unpaused cpuidle is kind of a slippery slope.

Moreover, it wouldn't hurt to somehow call acpi_pm_freeze() somewhere during
it to prevent background platform activity from interfering with the "resume"
part of it.

It also might be useful to unify it with the analogous hibernation flows more
directly, but that would require some rearrangements of the latter.

I'm going to send patches along these lines at one point in the future
unless I'm told that this is a bad idea.
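
As a rough illustration of the first two points only, the intended ordering
could be sketched as below. This is a user-space mock, not kernel code and not
a proposed patch: every function is a stub that merely records its name, the
signatures are simplified, and kexec_jump_sketch(), step(), reset_log() and
fail_at are hypothetical names introduced for the example.

```c
/*
 * User-space mock, NOT kernel code: each stub only appends its name to a
 * log so that the call ordering and the unwinding path can be inspected.
 */
#include <string.h>

static char call_log[512];
static const char *fail_at;	/* hypothetical: name of the stub that fails */

/* Record a call and report failure if this is the step chosen to fail. */
static int step(const char *name)
{
	strcat(call_log, name);
	strcat(call_log, " ");
	return (fail_at && strcmp(fail_at, name) == 0) ? -1 : 0;
}

/* Clear the log and select which step (if any) should fail. */
static void reset_log(const char *fail)
{
	call_log[0] = '\0';
	fail_at = fail;
}

/* Stubs named after the kernel helpers mentioned above; simplified. */
static void lock_system_sleep(void)   { step("lock_system_sleep"); }
static void unlock_system_sleep(void) { step("unlock_system_sleep"); }
static int pm_sleep_disable_secondary_cpus(void)
{
	/* In the kernel, this helper pauses cpuidle before offlining CPUs. */
	return step("pm_sleep_disable_secondary_cpus");
}
static void pm_sleep_enable_secondary_cpus(void)
{
	step("pm_sleep_enable_secondary_cpus");
}
static void machine_kexec(void) { step("machine_kexec"); }

/* Hypothetical reduced flow: only the steps touched by points 1 and 2. */
static int kexec_jump_sketch(void)
{
	int error;

	lock_system_sleep();	/* exclude concurrent system-wide PM transitions */

	error = pm_sleep_disable_secondary_cpus();
	if (error)
		goto Enable_cpus;

	machine_kexec();	/* jump to the other kernel and come back */

 Enable_cpus:
	pm_sleep_enable_secondary_cpus();
	unlock_system_sleep();
	return error;
}
```

Calling reset_log(NULL) followed by kexec_jump_sketch() logs the full sequence,
while reset_log("pm_sleep_disable_secondary_cpus") exercises the error path, in
which the jump is skipped but the CPUs are re-enabled and the lock is released.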

Thanks!

---
 kernel/kexec_core.c |   23 +++++++++++++++++------
 1 file changed, 17 insertions(+), 6 deletions(-)

Patch

--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1001,6 +1001,12 @@ 
 
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
+		/*
+		 * This flow is analogous to hibernation flows that occur before
+		 * creating an image and before jumping from the restore kernel
+		 * to the image one, so it uses the same device callbacks
+		 * as those two flows.
+		 */
 		pm_prepare_console();
 		error = freeze_processes();
 		if (error) {
@@ -1011,12 +1017,10 @@ 
 		error = dpm_suspend_start(PMSG_FREEZE);
 		if (error)
 			goto Resume_console;
-		/* At this point, dpm_suspend_start() has been called,
-		 * but *not* dpm_suspend_end(). We *must* call
-		 * dpm_suspend_end() now.  Otherwise, drivers for
-		 * some devices (e.g. interrupt controllers) become
-		 * desynchronized with the actual state of the
-		 * hardware at resume time, and evil weirdness ensues.
+		/*
+		 * dpm_suspend_end() must be called after dpm_suspend_start()
+		 * to complete the transition, like in the hibernation flows
+		 * mentioned above.
 		 */
 		error = dpm_suspend_end(PMSG_FREEZE);
 		if (error)
@@ -1052,6 +1056,13 @@ 
 
 #ifdef CONFIG_KEXEC_JUMP
 	if (kexec_image->preserve_context) {
+		/*
+		 * This flow is analogous to hibernation flows that occur after
+		 * creating an image and after the image kernel has got control
+		 * back, and in case the devices have been reset or otherwise
+		 * manipulated in the meantime, it uses the device callbacks
+		 * used by the latter.
+		 */
 		syscore_resume();
  Enable_irqs:
 		local_irq_enable();