Message ID | 1488469507-32463-2-git-send-email-patrick.bellasi@arm.com |
---|---|
State | New |
Series | cpufreq: schedutil: fixes for flags updates |
On 06-Mar 09:29, Steven Rostedt wrote:
> On Fri, 3 Mar 2017 09:11:25 +0530
> Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> > On 02-03-17, 15:45, Patrick Bellasi wrote:
> > > diff --git a/include/linux/sched.h b/include/linux/sched.h
> > > index e2ed46d..739b29d 100644
> > > --- a/include/linux/sched.h
> > > +++ b/include/linux/sched.h
> > > @@ -3653,6 +3653,7 @@ static inline unsigned long rlimit_max(unsigned int limit)
> > >  #define SCHED_CPUFREQ_RT	(1U << 0)
> > >  #define SCHED_CPUFREQ_DL	(1U << 1)
> > >  #define SCHED_CPUFREQ_IOWAIT	(1U << 2)
> > > +#define SCHED_CPUFREQ_IDLE	(1U << 3)
> > >
> > >  #define SCHED_CPUFREQ_RT_DL	(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)
> > >
> > > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > > index fd46593..084a98b 100644
> > > --- a/kernel/sched/cpufreq_schedutil.c
> > > +++ b/kernel/sched/cpufreq_schedutil.c
> > > @@ -281,6 +281,12 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
> > >
> > >  	raw_spin_lock(&sg_policy->update_lock);
> > >
> > > +	/* CPU is entering IDLE, reset flags without triggering an update */
> > > +	if (flags & SCHED_CPUFREQ_IDLE) {
> >
> > Will "flags == SCHED_CPUFREQ_IDLE" generate better assembly ?
> >
>
> Even if it does, a bit check and an equal check are pretty negligible
> in difference wrt execution time. I would choose whatever is the most
> readable to humans.
>
>	flags == SCHED_CPUFREQ_IDLE
>
> will tell me (as a reviewer) that we expect no other flag to be set.
>
>	flags & SCHED_CPUFREQ_IDLE
>
> will tell me that we only care about the IDLE flag.
>
> Whichever is the more meaningful is what should be used.

Agree on the approach: whenever it is not silly, code should be written
to be easy for other humans to understand.

Here the intent is "whatever flags you set, if the IDLE one is set" we
assume we are entering idle. Thus, to me the current version is easier
to understand, without being overkill in its semantics.
Cheers Patrick

-- 
#include <best/regards.h>

Patrick Bellasi
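The difference between the two checks discussed above can be illustrated with a minimal user-space sketch. This is not kernel code: the flag values are copied from the patch, while the helpers `idle_bit_set()` and `idle_only()` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag values as defined in include/linux/sched.h by the patch. */
#define SCHED_CPUFREQ_RT	(1U << 0)
#define SCHED_CPUFREQ_DL	(1U << 1)
#define SCHED_CPUFREQ_IOWAIT	(1U << 2)
#define SCHED_CPUFREQ_IDLE	(1U << 3)

/* The bit test is only meaningful if IDLE does not overlap another flag. */
static_assert((SCHED_CPUFREQ_IDLE &
	       (SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL | SCHED_CPUFREQ_IOWAIT)) == 0,
	      "IDLE must be a distinct bit");

/*
 * Hypothetical helper: mirrors "flags & SCHED_CPUFREQ_IDLE".
 * True whenever the IDLE bit is set, whatever other flags are set.
 */
static bool idle_bit_set(unsigned int flags)
{
	return (flags & SCHED_CPUFREQ_IDLE) != 0;
}

/*
 * Hypothetical helper: mirrors "flags == SCHED_CPUFREQ_IDLE".
 * True only when IDLE is the sole flag set.
 */
static bool idle_only(unsigned int flags)
{
	return flags == SCHED_CPUFREQ_IDLE;
}
```

The two helpers agree when IDLE is passed alone and disagree as soon as a caller combines it with another flag, for example `SCHED_CPUFREQ_IDLE | SCHED_CPUFREQ_IOWAIT`. In that case only the bit test still treats the update as an idle entry, which is the intent stated in the reply.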
diff --git a/include/linux/sched.h b/include/linux/sched.h
index e2ed46d..739b29d 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -3653,6 +3653,7 @@ static inline unsigned long rlimit_max(unsigned int limit)
 #define SCHED_CPUFREQ_RT	(1U << 0)
 #define SCHED_CPUFREQ_DL	(1U << 1)
 #define SCHED_CPUFREQ_IOWAIT	(1U << 2)
+#define SCHED_CPUFREQ_IDLE	(1U << 3)
 
 #define SCHED_CPUFREQ_RT_DL	(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)
 
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index fd46593..084a98b 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -281,6 +281,12 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 
 	raw_spin_lock(&sg_policy->update_lock);
 
+	/* CPU is entering IDLE, reset flags without triggering an update */
+	if (flags & SCHED_CPUFREQ_IDLE) {
+		sg_cpu->flags = 0;
+		goto done;
+	}
+
 	sg_cpu->util = util;
 	sg_cpu->max = max;
 	sg_cpu->flags = flags;
@@ -293,6 +299,7 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 		sugov_update_commit(sg_policy, time, next_f);
 	}
 
+done:
 	raw_spin_unlock(&sg_policy->update_lock);
 }
 
diff --git a/kernel/sched/idle_task.c b/kernel/sched/idle_task.c
index 0c00172..a844c91 100644
--- a/kernel/sched/idle_task.c
+++ b/kernel/sched/idle_task.c
@@ -29,6 +29,10 @@ pick_next_task_idle(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
 	put_prev_task(rq, prev);
 	update_idle_core(rq);
 	schedstat_inc(rq->sched_goidle);
+
+	/* kick cpufreq (see the comment in kernel/sched/sched.h). */
+	cpufreq_update_this_cpu(rq, SCHED_CPUFREQ_IDLE);
+
 	return rq->idle;
 }
Currently, sg_cpu's flags are set to the value defined by the last call
of cpufreq_update_util()/cpufreq_update_this_cpu(); for the RT/DL
classes this corresponds to the SCHED_CPUFREQ_{RT/DL} flags always
being set.

When multiple CPUs share the same frequency domain, it might happen that
a CPU which executed an RT task right before entering IDLE keeps one of
the SCHED_CPUFREQ_RT_DL flags set, permanently, until it exits IDLE.

Thus, in sugov_next_freq_shared(), where utilisation and flags are
aggregated across all the CPUs of a frequency domain, all the CPUs of
that domain will always run at the maximum OPP until another event
happens on the idle CPU and eventually clears the SCHED_CPUFREQ_{RT/DL}
flag.

Such behaviour can harm the energy efficiency of systems where RT
workloads are infrequent and other CPUs in the same frequency domain
are running small utilisation workloads, which is quite a common
scenario in mobile embedded systems.

This patch proposes a solution which is aligned with the current
principle of updating the flags each time a scheduling event happens.
The scheduling of the idle_task on a CPU is considered one such
meaningful event. That's why, when the idle_task is selected for
execution, we poke the schedutil policy to reset the flags for that CPU.

Moreover, no frequency transition is triggered at that point, which is
fair in case the RT workload comes back in the future, but it allows
other CPUs in the same frequency domain to scale down the frequency
should that be needed.

Signed-off-by: Patrick Bellasi <patrick.bellasi@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Cc: Viresh Kumar <viresh.kumar@linaro.org>
Cc: linux-kernel@vger.kernel.org
Cc: linux-pm@vger.kernel.org
---
 include/linux/sched.h            | 1 +
 kernel/sched/cpufreq_schedutil.c | 7 +++++++
 kernel/sched/idle_task.c         | 4 ++++
 3 files changed, 12 insertions(+)

-- 
2.7.4
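
The effect described in the changelog can be reproduced with a small user-space model. This is only a sketch under stated assumptions, not the schedutil implementation: `struct model_cpu`, `model_update()` and `model_next_freq()` are hypothetical names, the frequency is scaled linearly with the highest util/max ratio, and details of the real governor (rate limiting, stale-CPU handling, frequency margins) are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Flag values as defined in include/linux/sched.h by the patch. */
#define SCHED_CPUFREQ_RT	(1U << 0)
#define SCHED_CPUFREQ_DL	(1U << 1)
#define SCHED_CPUFREQ_IDLE	(1U << 3)
#define SCHED_CPUFREQ_RT_DL	(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)

/* Hypothetical, simplified stand-in for the per-CPU governor state. */
struct model_cpu {
	unsigned long util;
	unsigned long max;
	unsigned int flags;
};

/*
 * Record one update for a CPU. As in the patched sugov_update_shared(),
 * an IDLE update only clears the flags and leaves util/max untouched.
 */
static void model_update(struct model_cpu *cpu, unsigned long util,
			 unsigned long max, unsigned int flags)
{
	if (flags & SCHED_CPUFREQ_IDLE) {
		cpu->flags = 0;
		return;
	}

	cpu->util = util;
	cpu->max = max;
	cpu->flags = flags;
}

/*
 * Aggregate all CPUs of one frequency domain (simplified model).
 * Any CPU with an RT/DL flag forces max_freq; otherwise the frequency
 * follows the CPU with the highest util/max ratio.
 */
static unsigned long model_next_freq(const struct model_cpu *cpus,
				     size_t ncpus, unsigned long max_freq)
{
	unsigned long util = 0, max = 1;
	size_t i;

	assert(cpus != NULL || ncpus == 0);

	for (i = 0; i < ncpus; i++) {
		if (cpus[i].flags & SCHED_CPUFREQ_RT_DL)
			return max_freq;
		if (cpus[i].max == 0)
			continue;	/* no capacity recorded yet */
		/* Compare ratios without dividing: u1/m1 > u2/m2. */
		if (cpus[i].util * max > util * cpus[i].max) {
			util = cpus[i].util;
			max = cpus[i].max;
		}
	}

	return max_freq * util / max;
}
```

In this model, a two-CPU domain where CPU0 last ran an RT task and CPU1 runs at a utilisation of 256 out of 1024 stays at `max_freq` for as long as CPU0 keeps its RT flag. Once CPU0 reports `SCHED_CPUFREQ_IDLE`, its flags are cleared and the same domain is free to drop to a quarter of `max_freq`.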