Message ID | 20240405083410.4896-1-anna-maria@linutronix.de
---|---
State | Accepted
Commit | 3c89a068bfd0698a5478f4cf39493595ef757d5e
Series | PM: s2idle: Make sure CPUs will wakeup directly on resume
On Mon, 8 Apr 2024 at 09:02, Anna-Maria Behnsen <anna-maria@linutronix.de> wrote:
>
> s2idle works like a regular suspend with freezing processes and freezing
> devices. All CPUs except the control CPU go into idle. Once this is
> completed the control CPU kicks all other CPUs out of idle, so that they
> reenter the idle loop and then enter s2idle state. The control CPU then
> issues an swait() on the suspend state and therefore enters the idle loop
> as well.
>
> Due to being kicked out of idle, the other CPUs leave their NOHZ states,
> which means the tick is active and the corresponding hrtimer is programmed
> to the next jiffie.
>
> On entering s2idle the CPUs shut down their local clockevent device to
> prevent wakeups. The last CPU which enters s2idle shuts down its local
> clockevent and freezes timekeeping.
>
> On resume, one of the CPUs receives the wakeup interrupt, unfreezes
> timekeeping and its local clockevent and starts the resume process. At that
> point all other CPUs are still in s2idle with their clockevents switched
> off. They only resume when they are kicked by another CPU or after resuming
> devices and then receiving a device interrupt.
>
> That means there is no guarantee that all CPUs will wakeup directly on
> resume. As a consequence there is no guarantee that timers which are queued
> on those CPUs and should expire directly after resume, are handled. Also
> timer list timers which are remotely queued to one of those CPUs after
> resume will not result in a reprogramming IPI as the tick is
> active. Queueing a hrtimer will also not result in a reprogramming IPI
> because the first hrtimer event is already in the past.
>
> The recent introduction of the timer pull model (7ee988770326 ("timers:
> Implement the hierarchical pull model")) amplifies this problem, if the
> current migrator is one of the non woken up CPUs. When a non pinned timer
> list timer is queued and the queuing CPU goes idle, it relies on the still
> suspended migrator CPU to expire the timer which will happen by chance.
>
> The problem exists since commit 8d89835b0467 ("PM: suspend: Do not pause
> cpuidle in the suspend-to-idle path"). There the cpuidle_pause() call which
> in turn invoked a wakeup for all idle CPUs was moved to a later point in
> the resume process. This might not be reached or reached very late because
> it waits on a timer of a still suspended CPU.
>
> Address this by kicking all CPUs out of idle after the control CPU returns
> from swait() so that they resume their timers and restore consistent system
> state.
>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218641
> Fixes: 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path")
> Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Mario Limonciello <mario.limonciello@amd.com>
> Cc: stable@kernel.org

Thanks for the detailed commit message! Please add:

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Kind regards
Uffe

> ---
> v2: Fix typos in commit message
> ---
>  kernel/power/suspend.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> --- a/kernel/power/suspend.c
> +++ b/kernel/power/suspend.c
> @@ -106,6 +106,12 @@ static void s2idle_enter(void)
>  	swait_event_exclusive(s2idle_wait_head,
>  			      s2idle_state == S2IDLE_STATE_WAKE);
>
> +	/*
> +	 * Kick all CPUs to ensure that they resume their timers and restore
> +	 * consistent system state.
> +	 */
> +	wake_up_all_idle_cpus();
> +
>  	cpus_read_unlock();
>
>  	raw_spin_lock_irq(&s2idle_lock);
On Mon, Apr 8, 2024 at 2:43 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Mon, 8 Apr 2024 at 09:02, Anna-Maria Behnsen
> <anna-maria@linutronix.de> wrote:
> >
> > s2idle works like a regular suspend with freezing processes and freezing
> > devices. All CPUs except the control CPU go into idle. Once this is
> > completed the control CPU kicks all other CPUs out of idle, so that they
> > reenter the idle loop and then enter s2idle state. The control CPU then
> > issues an swait() on the suspend state and therefore enters the idle loop
> > as well.
> >
> > Due to being kicked out of idle, the other CPUs leave their NOHZ states,
> > which means the tick is active and the corresponding hrtimer is programmed
> > to the next jiffie.
> >
> > On entering s2idle the CPUs shut down their local clockevent device to
> > prevent wakeups. The last CPU which enters s2idle shuts down its local
> > clockevent and freezes timekeeping.
> >
> > On resume, one of the CPUs receives the wakeup interrupt, unfreezes
> > timekeeping and its local clockevent and starts the resume process. At that
> > point all other CPUs are still in s2idle with their clockevents switched
> > off. They only resume when they are kicked by another CPU or after resuming
> > devices and then receiving a device interrupt.
> >
> > That means there is no guarantee that all CPUs will wakeup directly on
> > resume. As a consequence there is no guarantee that timers which are queued
> > on those CPUs and should expire directly after resume, are handled. Also
> > timer list timers which are remotely queued to one of those CPUs after
> > resume will not result in a reprogramming IPI as the tick is
> > active. Queueing a hrtimer will also not result in a reprogramming IPI
> > because the first hrtimer event is already in the past.
> >
> > The recent introduction of the timer pull model (7ee988770326 ("timers:
> > Implement the hierarchical pull model")) amplifies this problem, if the
> > current migrator is one of the non woken up CPUs. When a non pinned timer
> > list timer is queued and the queuing CPU goes idle, it relies on the still
> > suspended migrator CPU to expire the timer which will happen by chance.
> >
> > The problem exists since commit 8d89835b0467 ("PM: suspend: Do not pause
> > cpuidle in the suspend-to-idle path"). There the cpuidle_pause() call which
> > in turn invoked a wakeup for all idle CPUs was moved to a later point in
> > the resume process. This might not be reached or reached very late because
> > it waits on a timer of a still suspended CPU.
> >
> > Address this by kicking all CPUs out of idle after the control CPU returns
> > from swait() so that they resume their timers and restore consistent system
> > state.
> >
> > Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218641
> > Fixes: 8d89835b0467 ("PM: suspend: Do not pause cpuidle in the suspend-to-idle path")
> > Signed-off-by: Anna-Maria Behnsen <anna-maria@linutronix.de>
> > Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> > Tested-by: Mario Limonciello <mario.limonciello@amd.com>
> > Cc: stable@kernel.org
>
> Thanks for the detailed commit message! Please add:
>
> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

Applied as 6.9-rc material, many thanks to everyone involved!
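[Editorial note] For readers without the source tree handy: the helper the fix adds, wake_up_all_idle_cpus(), lives in kernel/smp.c and pokes every online CPU that is currently running its idle task, which forces the CPUs parked in s2idle back through the idle-loop exit path where their tick devices get reprogrammed. The code below is a simplified sketch of that helper, not the verbatim upstream body, which may differ between kernel versions:

/*
 * Simplified sketch of wake_up_all_idle_cpus() (kernel/smp.c) -- not verbatim.
 * wake_up_if_idle() is the scheduler-side helper that sends an IPI only if
 * the target CPU is currently running its idle task.
 */
#include <linux/cpumask.h>
#include <linux/preempt.h>
#include <linux/smp.h>

void wake_up_all_idle_cpus(void)
{
	int cpu;

	for_each_possible_cpu(cpu) {
		preempt_disable();
		/* Skip the calling CPU and CPUs that are offline. */
		if (cpu != smp_processor_id() && cpu_online(cpu))
			wake_up_if_idle(cpu);
		preempt_enable();
	}
}

Because the woken CPUs fall out of s2idle and run through the idle-loop exit path, their local clockevents and ticks are restored, which is exactly the "resume their timers and restore consistent system state" behaviour the commit message asks for.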
diff --git a/kernel/power/suspend.c b/kernel/power/suspend.c
index e3ae93bbcb9b..09f8397bae15 100644
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -106,6 +106,12 @@ static void s2idle_enter(void)
 	swait_event_exclusive(s2idle_wait_head,
 			      s2idle_state == S2IDLE_STATE_WAKE);
 
+	/*
+	 * Kick all CPUs to ensure that they resume their timers and restore
+	 * consistent system state.
+	 */
+	wake_up_all_idle_cpus();
+
 	cpus_read_unlock();
 
 	raw_spin_lock_irq(&s2idle_lock);
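[Editorial note] With the hunk applied, the relevant part of s2idle_enter() reads roughly as below. Only the lines visible in the diff are reproduced verbatim; the elided parts, marked with comments, are paraphrased from the commit message rather than copied from the source file:

/*
 * kernel/power/suspend.c, excerpt of s2idle_enter() with the fix applied.
 * Code outside the hunk above is elided/paraphrased, not verbatim.
 */
static void s2idle_enter(void)
{
	/* ... enter path: take cpus_read_lock(), kick the other CPUs into the
	 * idle loop, then the control CPU parks below until a wakeup event
	 * flips s2idle_state to S2IDLE_STATE_WAKE ... */

	swait_event_exclusive(s2idle_wait_head,
			      s2idle_state == S2IDLE_STATE_WAKE);

	/*
	 * Kick all CPUs to ensure that they resume their timers and restore
	 * consistent system state.
	 */
	wake_up_all_idle_cpus();

	cpus_read_unlock();

	raw_spin_lock_irq(&s2idle_lock);
	/* ... reset the s2idle state and return to the regular resume flow ... */
}

The key ordering is that the wakeup kick happens on the control CPU immediately after it returns from swait_event_exclusive(), i.e. before any later resume code that might wait on a timer owned by a still-suspended CPU.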