Message ID | 1363868494-5503-1-git-send-email-daniel.lezcano@linaro.org |
---|---|
State | New |
On Thursday 21 March 2013 05:51 PM, Daniel Lezcano wrote:
> When a cpu enters a deep idle state, the local timers are stopped and
> the time framework falls back to the timer device used as a broadcast
> timer.
>
> The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
> when the idle state stops the local timer.
>
> Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
> drivers. If the flag is set, the cpuidle core code takes care of the
> notification on behalf of the driver to avoid pointless code duplication.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
> Cc: Rajendra Nayak <rnayak@ti.com>
> Cc: Sascha Hauer <kernel@pengutronix.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> ---
>  drivers/cpuidle/cpuidle.c | 9 +++++++++
>  include/linux/cpuidle.h   | 1 +
>  2 files changed, 10 insertions(+)
>
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index eba6929..c500370 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -8,6 +8,7 @@
>   * This code is licenced under the GPL.
>   */
>
> +#include <linux/clockchips.h>
>  #include <linux/kernel.h>
>  #include <linux/mutex.h>
>  #include <linux/sched.h>
> @@ -146,12 +147,20 @@ int cpuidle_idle_call(void)
>
>  	trace_cpu_idle_rcuidle(next_state, dev->cpu);
>
> +	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
> +		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
> +				   &dev->cpu);
> +
Seems like good clean-up from drivers.
Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

How about taking care of cpu_pm_notifiers() as well with some
additional flag for CPU and cluster power state. That can help
to reduce and consolidate the code. What you say ?

Regards,
Santosh

On 03/21/2013 01:59 PM, Santosh Shilimkar wrote:
> On Thursday 21 March 2013 05:51 PM, Daniel Lezcano wrote:

[...]

> Seems like good clean-up from drivers.
> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
>
> How about taking care of cpu_pm_notifiers() as well with some
> additional flag for CPU and cluster power state. That can help
> to reduce and consolidate the code. What you say ?

Do you mean add a flag for different level of idle (idle, suspend, power
off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
drivers as a common framework ?

On 03/21/2013 01:59 PM, Santosh Shilimkar wrote:

[ ... ]

> Seems like good clean-up from drivers.
> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>

... btw, thanks for reviewing the patches :)

  -- Daniel

On Thursday 21 March 2013 07:22 PM, Daniel Lezcano wrote:
> On 03/21/2013 01:59 PM, Santosh Shilimkar wrote:
>> On Thursday 21 March 2013 05:51 PM, Daniel Lezcano wrote:

[...]

>> Seems like good clean-up from drivers.
>> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
>>
>> How about taking care of cpu_pm_notifiers() as well with some
>> additional flag for CPU and cluster power state. That can help
>> to reduce and consolidate the code. What you say ?
>
> Do you mean add a flag for different level of idle (idle, suspend, power
> off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
> drivers as a common framework ?
>
I mean only for CPUidle considering C-state already has the information
about CPU and cluster power state. For suspend, we by-default run the notifiers
so nothing needs to be done there.

You may not even need a framework. Just like we know in a C-state, timer
stops, same lines, we can say CPU state is going to be say off and hence
cpu_pm_enter() notifier needs to be called. And same way for cluster.

I still haven't given complete thought but thought crossed my mind
after looking at your patches.

Regards,
Santosh

On 03/21/2013 03:04 PM, Santosh Shilimkar wrote:
> On Thursday 21 March 2013 07:22 PM, Daniel Lezcano wrote:
>> On 03/21/2013 01:59 PM, Santosh Shilimkar wrote:
>>> On Thursday 21 March 2013 05:51 PM, Daniel Lezcano wrote:

[...]

>>> Seems like good clean-up from drivers.
>>> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
>>>
>>> How about taking care of cpu_pm_notifiers() as well with some
>>> additional flag for CPU and cluster power state. That can help
>>> to reduce and consolidate the code. What you say ?
>>
>> Do you mean add a flag for different level of idle (idle, suspend, power
>> off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
>> drivers as a common framework ?
>>
> I mean only for CPUidle considering C-state already has the information
> about CPU and cluster power state. For suspend, we by-default run the notifiers
> so nothing needs to be done there.
>
> You may not even need a framework. Just like we know in a C-state, timer
> stops, same lines, we can say CPU state is going to be say off and hence
> cpu_pm_enter() notifier needs to be called. And same way for cluster.
>
> I still haven't given complete thought but thought crossed my mind
> after looking at your patches.

Right, that could be interesting.

I see may be one issue with this approach: when we enter an idle state
with power off, some checking are done before cpu_pm_enter and that
could lead to abort the current idle routine.

By moving this to the cpuidle framework, we will invoke always
cpu_pm_enter/exit even if the idle enter routine failed.

But, IMO, the idea is good.

For example in cpuidle34xx:

static int __omap3_enter_idle(struct cpuidle_device *dev,
				struct cpuidle_driver *drv,
				int index)
{
	struct omap3_idle_statedata *cx = &omap3_idle_data[index];

	local_fiq_disable();

	if (omap_irq_pending() || need_resched())
		goto return_sleep_time;

	...

	if (cx->mpu_state == PWRDM_POWER_OFF)
		cpu_pm_enter();

	...
}

The same for omap4 and tegra2/3.

With your knowledge of omap, do you think it is possible to move
cpu_pm_enter before entering the idle routine ?

Thanks
  -- Daniel

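Editorial sketch: the concern raised above (notifiers firing even when the idle routine bails out early) can be illustrated with a small user-space model. Everything here is an assumption made for illustration: the `mock_*` functions only count calls and are not the kernel's cpu_pm_enter()/cpu_pm_exit(), and `idle_notify_in_core()` is a hypothetical core-side wrapper, not code from the patch.

```c
/* User-space model only; not kernel code. */
#include <assert.h>
#include <stdbool.h>

static int pm_enter_calls; /* how often the mock "enter" notifier ran */
static int pm_exit_calls;  /* how often the mock "exit" notifier ran */

static void mock_cpu_pm_enter(void) { pm_enter_calls++; }
static void mock_cpu_pm_exit(void)  { pm_exit_calls++; }

/*
 * Driver routine loosely modelled on the quoted __omap3_enter_idle():
 * it may abort before reaching the low power entry.
 * Returns true if the low power state was actually entered.
 */
static bool driver_enter(bool irq_pending, bool power_off, bool notify_in_driver)
{
	if (irq_pending)
		return false; /* early abort, like "goto return_sleep_time" */

	if (power_off && notify_in_driver)
		mock_cpu_pm_enter();

	/* ... the low power entry would happen here ... */

	if (power_off && notify_in_driver)
		mock_cpu_pm_exit();

	return true;
}

/* Current layout: the driver decides when to notify. */
static bool idle_notify_in_driver(bool irq_pending, bool power_off)
{
	return driver_enter(irq_pending, power_off, true);
}

/* Hypothetical layout: the core wraps the driver call unconditionally. */
static bool idle_notify_in_core(bool irq_pending, bool power_off)
{
	bool entered;

	if (power_off)
		mock_cpu_pm_enter();

	entered = driver_enter(irq_pending, power_off, false);

	if (power_off)
		mock_cpu_pm_exit();

	/* Enter and exit notifications must stay balanced. */
	assert(pm_enter_calls == pm_exit_calls);
	return entered;
}
```

Calling both variants with a pending interrupt shows the difference: the driver-side variant never touches the notifiers, while the core-side variant runs one enter/exit pair for an idle attempt that was aborted.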
On Thursday 21 March 2013 08:11 PM, Daniel Lezcano wrote:
> On 03/21/2013 03:04 PM, Santosh Shilimkar wrote:
>> On Thursday 21 March 2013 07:22 PM, Daniel Lezcano wrote:
>>> On 03/21/2013 01:59 PM, Santosh Shilimkar wrote:
>>>> On Thursday 21 March 2013 05:51 PM, Daniel Lezcano wrote:

[...]

>>>> Seems like good clean-up from drivers.
>>>> Acked-by: Santosh Shilimkar <santosh.shilimkar@ti.com>
>>>>
>>>> How about taking care of cpu_pm_notifiers() as well with some
>>>> additional flag for CPU and cluster power state. That can help
>>>> to reduce and consolidate the code. What you say ?
>>>
>>> Do you mean add a flag for different level of idle (idle, suspend, power
>>> off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
>>> drivers as a common framework ?
>>>
>> I mean only for CPUidle considering C-state already has the information
>> about CPU and cluster power state. For suspend, we by-default run the notifiers
>> so nothing needs to be done there.
>>
>> You may not even need a framework. Just like we know in a C-state, timer
>> stops, same lines, we can say CPU state is going to be say off and hence
>> cpu_pm_enter() notifier needs to be called. And same way for cluster.
>>
>> I still haven't given complete thought but thought crossed my mind
>> after looking at your patches.
>
> Right, that could be interesting.
>
> I see may be one issue with this approach: when we enter an idle state
> with power off, some checking are done before cpu_pm_enter and that
> could lead to abort the current idle routine.
>
I see your point.

> By moving this to the cpuidle framework, we will invoke always
> cpu_pm_enter/exit even if the idle enter routine failed.
>
If we at all decide to go on this path, we can always get around the
issues. The key is these notifiers have to be run very close to the
low power entry/exit since the saved context for CPU/CPU cluster
at wrong points would lead to many issues.

> But, IMO, the idea is good.
>
> For example in cpuidle34xx:
>
> static int __omap3_enter_idle(struct cpuidle_device *dev,
> 				struct cpuidle_driver *drv,
> 				int index)
> {
> 	struct omap3_idle_statedata *cx = &omap3_idle_data[index];
>
> 	local_fiq_disable();
>
> 	if (omap_irq_pending() || need_resched())
> 		goto return_sleep_time;
>
> 	...
>
> 	if (cx->mpu_state == PWRDM_POWER_OFF)
> 		cpu_pm_enter();
>
> 	...
> }
>
> The same for omap4 and tegra2/3.
>
> With your knowledge of omap, do you think it is possible to move
> cpu_pm_enter before entering the idle routine ?
>
I will get back on this topic after some experiments most likely
by next week.

Regards,
Santosh

On Thu, Mar 21, 2013 at 02:52:21PM +0000, Santosh Shilimkar wrote:

[...]

> >>>> How about taking care of cpu_pm_notifiers() as well with some
> >>>> additional flag for CPU and cluster power state. That can help
> >>>> to reduce and consolidate the code. What you say ?
> >>>
> >>> Do you mean add a flag for different level of idle (idle, suspend, power
> >>> off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
> >>> drivers as a common framework ?
> >>>
> >> I mean only for CPUidle considering C-state already has the information
> >> about CPU and cluster power state. For suspend, we by-default run the notifiers
> >> so nothing needs to be done there.
> >>
> >> You may not even need a framework. Just like we know in a C-state, timer
> >> stops, same lines, we can say CPU state is going to be say off and hence
> >> cpu_pm_enter() notifier needs to be called. And same way for cluster.
> >>
> >> I still haven't given complete thought but thought crossed my mind
> >> after looking at your patches.
> >
> > Right, that could be interesting.
> >
> > I see may be one issue with this approach: when we enter an idle state
> > with power off, some checking are done before cpu_pm_enter and that
> > could lead to abort the current idle routine.
> >
> I see your point.
>
> > By moving this to the cpuidle framework, we will invoke always
> > cpu_pm_enter/exit even if the idle enter routine failed.
> >
> If we at all decide to go on this path, we can always get around the
> issues. The key is these notifiers have to be run very close to the
> low power entry/exit since the saved context for CPU/CPU cluster
> at wrong points would lead to many issues.
>
> > But, IMO, the idea is good.
> >
> > For example in cpuidle34xx:
> >
> > static int __omap3_enter_idle(struct cpuidle_device *dev,
> > 				struct cpuidle_driver *drv,
> > 				int index)
> > {
> > 	struct omap3_idle_statedata *cx = &omap3_idle_data[index];
> >
> > 	local_fiq_disable();
> >
> > 	if (omap_irq_pending() || need_resched())
> > 		goto return_sleep_time;
> >
> > 	...
> >
> > 	if (cx->mpu_state == PWRDM_POWER_OFF)
> > 		cpu_pm_enter();
> >
> > 	...
> > }
> >
> > The same for omap4 and tegra2/3.
> >
> > With your knowledge of omap, do you think it is possible to move
> > cpu_pm_enter before entering the idle routine ?
> >
> I will get back on this topic after some experiments most likely
> by next week.

Looks like the way to go, we could enhance the notifiers to be able to
save/restore specific subsystems with the C-state flags defining which
ones.

Notifiers actions should not be disruptive so the only drawback in
executing those before entering the idle routine could possibly be
a waste of cycles in case the C-state entry fails but certainly something
to verify on all platforms using them.

Lorenzo

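Editorial sketch: one possible reading of "C-state flags defining which subsystems to save/restore", written as a user-space model. All names below (`SKETCH_FLAG_*`, `struct pm_listener`, `notify_selected()`) are hypothetical and appear nowhere in the patch or in the kernel; the real cpu_pm notifier chain has no such per-subsystem selection in this thread.

```c
/* User-space model only; not kernel code. */
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-state flags naming the context a state would lose. */
#define SKETCH_FLAG_SAVE_VFP   0x1u
#define SKETCH_FLAG_SAVE_DEBUG 0x2u
#define SKETCH_FLAG_SAVE_GIC   0x4u
#define SKETCH_FLAG_ALL        0x7u

/* Hypothetical listener: one subsystem that can save its context. */
struct pm_listener {
	unsigned int subsystem; /* the SKETCH_FLAG_* bit this listener serves */
	int saves;              /* how many times its save hook was run */
};

static struct pm_listener listeners[] = {
	{ SKETCH_FLAG_SAVE_VFP,   0 },
	{ SKETCH_FLAG_SAVE_DEBUG, 0 },
	{ SKETCH_FLAG_SAVE_GIC,   0 },
};

/*
 * Run only the listeners selected by the idle state's flags and
 * return how many were called.
 */
static int notify_selected(unsigned int state_flags)
{
	int called = 0;
	size_t i;

	/* Reject flag bits this sketch does not know about. */
	assert((state_flags & ~SKETCH_FLAG_ALL) == 0);

	for (i = 0; i < sizeof(listeners) / sizeof(listeners[0]); i++) {
		if (listeners[i].subsystem & state_flags) {
			listeners[i].saves++;
			called++;
		}
	}
	return called;
}
```

In this model a state that only loses VFP context would call `notify_selected(SKETCH_FLAG_SAVE_VFP)` and skip the other listeners, which is where the saved cycles would come from if C-state entry fails.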
On Thursday, March 21, 2013 01:21:31 PM Daniel Lezcano wrote:
> When a cpu enters a deep idle state, the local timers are stopped and
> the time framework falls back to the timer device used as a broadcast
> timer.
>
> The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
> when the idle state stops the local timer.
>
> Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
> drivers. If the flag is set, the cpuidle core code takes care of the
> notification on behalf of the driver to avoid pointless code duplication.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@linaro.org>
> Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: Len Brown <lenb@kernel.org>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Santosh Shilimkar <santosh.shilimkar@ti.com>
> Cc: Rajendra Nayak <rnayak@ti.com>
> Cc: Sascha Hauer <kernel@pengutronix.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>

All patches in the series applied to linux-pm.git/bleeding-edge and will be
moved to linux-next after build testing.

Thanks,
Rafael

> ---
>  drivers/cpuidle/cpuidle.c | 9 +++++++++
>  include/linux/cpuidle.h   | 1 +
>  2 files changed, 10 insertions(+)

[...]

Daniel Lezcano <daniel.lezcano@linaro.org> writes:

> When a cpu enters a deep idle state, the local timers are stopped and
> the time framework falls back to the timer device used as a broadcast
> timer.
>
> The different cpuidle drivers are calling clockevents_notify ENTER/EXIT
> when the idle state stops the local timer.
>
> Add a new flag CPUIDLE_FLAG_TIMER_STOP which can be set by the cpuidle
> drivers. If the flag is set, the cpuidle core code takes care of the
> notification on behalf of the driver to avoid pointless code duplication.

Nice cleanup.

Reviewed-by: Kevin Hilman <khilman@linaro.org>

Lorenzo, Daniel,

On Thursday 21 March 2013 08:55 PM, Lorenzo Pieralisi wrote:
> On Thu, Mar 21, 2013 at 02:52:21PM +0000, Santosh Shilimkar wrote:
>
> [...]
>
>>>>>> How about taking care of cpu_pm_notifiers() as well with some
>>>>>> additional flag for CPU and cluster power state. That can help
>>>>>> to reduce and consolidate the code. What you say ?
>>>>>
>>>>> Do you mean add a flag for different level of idle (idle, suspend, power
>>>>> off) and then use the cpu_pm_enter/cpu_cluster_pm_enter in all the
>>>>> drivers as a common framework ?
>>>>>
>>>> I mean only for CPUidle considering C-state already has the information
>>>> about CPU and cluster power state. For suspend, we by-default run the notifiers
>>>> so nothing needs to be done there.
>>>>
>>>> You may not even need a framework. Just like we know in a C-state, timer
>>>> stops, same lines, we can say CPU state is going to be say off and hence
>>>> cpu_pm_enter() notifier needs to be called. And same way for cluster.
>>>>
>>>> I still haven't given complete thought but thought crossed my mind
>>>> after looking at your patches.
>>>
>>> Right, that could be interesting.
>>>
>>> I see may be one issue with this approach: when we enter an idle state
>>> with power off, some checking are done before cpu_pm_enter and that
>>> could lead to abort the current idle routine.
>>>
>> I see your point.
>>
>>> By moving this to the cpuidle framework, we will invoke always
>>> cpu_pm_enter/exit even if the idle enter routine failed.
>>>
>> If we at all decide to go on this path, we can always get around the
>> issues. The key is these notifiers have to be run very close to the
>> low power entry/exit since the saved context for CPU/CPU cluster
>> at wrong points would lead to many issues.
>>
>>> But, IMO, the idea is good.
>>>
>>> For example in cpuidle34xx:
>>>
>>> static int __omap3_enter_idle(struct cpuidle_device *dev,
>>> 				struct cpuidle_driver *drv,
>>> 				int index)
>>> {
>>> 	struct omap3_idle_statedata *cx = &omap3_idle_data[index];
>>>
>>> 	local_fiq_disable();
>>>
>>> 	if (omap_irq_pending() || need_resched())
>>> 		goto return_sleep_time;
>>>
>>> 	...
>>>
>>> 	if (cx->mpu_state == PWRDM_POWER_OFF)
>>> 		cpu_pm_enter();
>>>
>>> 	...
>>> }
>>>
>>> The same for omap4 and tegra2/3.
>>>
>>> With your knowledge of omap, do you think it is possible to move
>>> cpu_pm_enter before entering the idle routine ?
>>>
>> I will get back on this topic after some experiments most likely
>> by next week.
>
> Looks like the way to go, we could enhance the notifiers to be able to
> save/restore specific subsystems with the C-state flags defining which
> ones.
>
> Notifiers actions should not be disruptive so the only drawback in
> executing those before entering the idle routine could possibly be
> a waste of cycles in case the C-state entry fails but certainly something
> to verify on all platforms using them.
>
After spending few mins on the idea, I made few observations.

1. Since the cpu_pm_enter() involves in saving the context CPU co-processors
   like VFP, debug subsystem, it should be called after irq, preemption is
   disabled and we are on its way into idle entry. It can be called from
   generic idle code but we just need to take care of above.

2. cpu_cluster_pm_enter() actually is bit tricky since its use is not
   consistent with couple idle entry and say smp_idle entry. Clusters
   notifiers are called when the idle drivers finds that all the CPUs in
   that cluster are at sleep and current CPU can take down the cluster
   with it. This notifier as well needs to be called late since the after
   the notifier call, there shouldn't be any irq activity which might lead
   to incorrect context save for say irq controller.

3. Same goes with exit notifiers which has to be called before irq,
   preemption is enabled back again.

4. Couple idle overall needs special considerations to manage these
   notifiers from generic code.

Overall it seems to be doable with the direction you already started
for timer broad-cast notifiers. You can at least add this one in your
queue of works :-)

I do not have much cycles left because of other pile of work to carry
out the changes but you can surely count me on reviewing and testing the
patches.

Regards,
Santosh

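Editorial sketch: the ordering constraints in observations 1 to 3 can be written down as a user-space model that records an event trace. This is only an illustration under stated assumptions: the event names are labels, not kernel calls, `NR_CPUS_IN_CLUSTER` is a hypothetical cluster size, and running the cluster exit step before the per-CPU exit step is this sketch's assumption about the intended nesting rather than something stated in the thread.

```c
/* User-space model only; not kernel code. */
#include <assert.h>
#include <string.h>

#define NR_CPUS_IN_CLUSTER 2 /* hypothetical cluster size */

static char trace[256]; /* ordered, ';'-separated log of events */

static void log_event(const char *ev)
{
	strncat(trace, ev, sizeof(trace) - strlen(trace) - 1);
	strncat(trace, ";", sizeof(trace) - strlen(trace) - 1);
}

/*
 * Model one CPU entering and leaving a power-off idle state.
 * cpus_already_idle: how many other CPUs of the cluster are asleep.
 */
static void idle_enter_power_off(int cpus_already_idle)
{
	int last_man;

	assert(cpus_already_idle >= 0 &&
	       cpus_already_idle < NR_CPUS_IN_CLUSTER);
	last_man = (cpus_already_idle + 1 == NR_CPUS_IN_CLUSTER);

	log_event("irq_off");          /* observation 1: irqs off first */
	log_event("cpu_pm_enter");     /* then save per-CPU context */
	if (last_man)
		log_event("cluster_pm_enter"); /* observation 2: last CPU only, late */

	log_event("low_power");

	if (last_man)
		log_event("cluster_pm_exit");
	log_event("cpu_pm_exit");      /* observation 3: restore ... */
	log_event("irq_on");           /* ... before irqs are enabled again */
}
```

With one other CPU already idle, the trace contains the cluster steps nested inside the per-CPU steps; with no other CPU idle, the cluster steps are skipped entirely.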
diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index eba6929..c500370 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -8,6 +8,7 @@
  * This code is licenced under the GPL.
  */
 
+#include <linux/clockchips.h>
 #include <linux/kernel.h>
 #include <linux/mutex.h>
 #include <linux/sched.h>
@@ -146,12 +147,20 @@ int cpuidle_idle_call(void)
 
 	trace_cpu_idle_rcuidle(next_state, dev->cpu);
 
+	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
+		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER,
+				   &dev->cpu);
+
 	if (cpuidle_state_is_coupled(dev, drv, next_state))
 		entered_state = cpuidle_enter_state_coupled(dev, drv,
 							    next_state);
 	else
 		entered_state = cpuidle_enter_state(dev, drv, next_state);
 
+	if (drv->states[next_state].flags & CPUIDLE_FLAG_TIMER_STOP)
+		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_EXIT,
+				   &dev->cpu);
+
 	trace_cpu_idle_rcuidle(PWR_EVENT_EXIT, dev->cpu);
 
 	/* give the governor an opportunity to reflect on the outcome */
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 480c14d..a837b33 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -57,6 +57,7 @@ struct cpuidle_state {
 /* Idle State Flags */
 #define CPUIDLE_FLAG_TIME_VALID	(0x01) /* is residency time measurable? */
 #define CPUIDLE_FLAG_COUPLED	(0x02) /* state applies to multiple cpus */
+#define CPUIDLE_FLAG_TIMER_STOP	(0x04) /* timer is stopped on this state */
 
 #define CPUIDLE_DRIVER_FLAGS_MASK (0xFFFF0000)
 
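
Editorial sketch: a user-space model of what the patch does, for readers who want to see the division of labour without a kernel tree. The flag values are the ones from the cpuidle.h hunk; everything prefixed `model_` is a hypothetical stand-in (in particular `model_clockevents_notify()` only counts calls and is not the kernel's clockevents_notify()), and coupled states are left out.

```c
/* User-space model of the core-side notification; not kernel code. */
#include <assert.h>
#include <stddef.h>

/* Flag values as defined in the patch's include/linux/cpuidle.h hunk. */
#define CPUIDLE_FLAG_TIME_VALID 0x01
#define CPUIDLE_FLAG_COUPLED    0x02
#define CPUIDLE_FLAG_TIMER_STOP 0x04

enum broadcast_reason { BROADCAST_ENTER, BROADCAST_EXIT };

/* Hypothetical, cut-down stand-in for struct cpuidle_state. */
struct model_state {
	const char *name;
	unsigned int flags;
	int (*enter)(int index);
};

/* Counters standing in for the effect of clockevents_notify(). */
static int broadcast_enter_calls;
static int broadcast_exit_calls;

static void model_clockevents_notify(enum broadcast_reason reason)
{
	if (reason == BROADCAST_ENTER)
		broadcast_enter_calls++;
	else
		broadcast_exit_calls++;
}

/* Driver callback: no timer bookkeeping is left in here. */
static int model_enter(int index)
{
	return index; /* report the state that was entered */
}

/* The driver only declares which states stop the local timer. */
static struct model_state model_states[] = {
	{ "shallow", CPUIDLE_FLAG_TIME_VALID, model_enter },
	{ "deep", CPUIDLE_FLAG_TIME_VALID | CPUIDLE_FLAG_TIMER_STOP, model_enter },
};

/* Core side: bracket the driver callback when the flag is set. */
static int model_idle_call(int next_state)
{
	const size_t nr = sizeof(model_states) / sizeof(model_states[0]);
	struct model_state *s;
	int entered;

	assert(next_state >= 0 && (size_t)next_state < nr);
	s = &model_states[next_state];

	if (s->flags & CPUIDLE_FLAG_TIMER_STOP)
		model_clockevents_notify(BROADCAST_ENTER);

	entered = s->enter(next_state);

	if (s->flags & CPUIDLE_FLAG_TIMER_STOP)
		model_clockevents_notify(BROADCAST_EXIT);

	return entered;
}
```

Entering the "shallow" state leaves both counters untouched, while entering the "deep" state produces exactly one enter and one exit notification, which mirrors the two hunks added to cpuidle_idle_call().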