Message ID | 20250131162439.3843071-3-beata.michalska@arm.com |
---|---|
State | New |
Series | Add support for AArch64 AMUv1-based average freq |
On 31-01-25, 16:24, Beata Michalska wrote:
> Currently the CPUFreq core exposes two sysfs attributes that can be used
> to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
> and scaling_cur_freq. Both provide slightly different view on the
> subject and they do come with their own drawbacks.
>
> cpuinfo_cur_freq provides higher precision though at a cost of being
> rather expensive. Moreover, the information retrieved via this attribute
> is somewhat short lived as frequency can change at any point of time
> making it difficult to reason from.
>
> scaling_cur_freq, on the other hand, tends to be less accurate but then
> the actual level of precision (and source of information) varies between
> architectures making it a bit ambiguous.
>
> The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
> distinct interface, exposing an average frequency of a given CPU(s), as
> reported by the hardware, over a time frame spanning no more than a few
> milliseconds. As it requires appropriate hardware support, this
> interface is optional.
>
> Note that under the hood, the new attribute relies on the information
> provided by arch_freq_get_on_cpu, which, up to this point, has been
> feeding data for scaling_cur_freq attribute, being the source of
> ambiguity when it comes to interpretation. This has been amended by
> restoring the intended behavior for scaling_cur_freq, with a new
> dedicated config option to maintain status quo for those, who may need
> it.
>
> CC: Jonathan Corbet <corbet@lwn.net>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Ingo Molnar <mingo@redhat.com>
> CC: Borislav Petkov <bp@alien8.de>
> CC: Dave Hansen <dave.hansen@linux.intel.com>
> CC: H. Peter Anvin <hpa@zytor.com>
> CC: Phil Auld <pauld@redhat.com>
> CC: x86@kernel.org
> CC: linux-doc@vger.kernel.org
>
> Signed-off-by: Beata Michalska <beata.michalska@arm.com>
> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
> Reviewed-by: Sumit Gupta <sumitg@nvidia.com>

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
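For anyone who wants to try the attribute described in the commit message from user space, here is a minimal sketch. It assumes the attribute shows up at the usual per-CPU cpufreq location (/sys/devices/system/cpu/cpuN/cpufreq/cpuinfo_avg_freq); the helper name read_freq_khz() is made up for illustration and is not part of the patch.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * read_freq_khz() - hypothetical helper, not part of the patch.
 *
 * Reads one unsigned decimal value (kHz) from a cpufreq sysfs attribute.
 * Returns 0 on success or a negative errno value:
 *   -ENOENT  the attribute is not present (no hardware support),
 *   -EAGAIN  the read itself failed with EAGAIN (per the patch, an
 *            idle CPU on ARM),
 *   -EINVAL  the content could not be parsed as a number.
 */
static int read_freq_khz(const char *path, unsigned long *khz)
{
	char buf[32];
	char *end;
	FILE *f = fopen(path, "r");

	if (!f)
		return -errno;

	errno = 0;
	if (!fgets(buf, sizeof(buf), f)) {
		/* Capture errno before fclose() can overwrite it. */
		int err = errno ? -errno : -EIO;

		fclose(f);
		return err;
	}
	fclose(f);

	errno = 0;
	*khz = strtoul(buf, &end, 10);
	if (errno || end == buf)
		return -EINVAL;

	return 0;
}
```

A caller would pass the sysfs path for the CPU of interest and treat -ENOENT as "not supported on this system" and -EAGAIN as "retry later" rather than as hard failures.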
On Fri, Jan 31, 2025 at 5:25 PM Beata Michalska <beata.michalska@arm.com> wrote:
>
> Currently the CPUFreq core exposes two sysfs attributes that can be used
> to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
> and scaling_cur_freq. Both provide slightly different view on the
> subject and they do come with their own drawbacks.
>
> cpuinfo_cur_freq provides higher precision though at a cost of being
> rather expensive. Moreover, the information retrieved via this attribute
> is somewhat short lived as frequency can change at any point of time
> making it difficult to reason from.
>
> scaling_cur_freq, on the other hand, tends to be less accurate but then
> the actual level of precision (and source of information) varies between
> architectures making it a bit ambiguous.
>
> The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
> distinct interface, exposing an average frequency of a given CPU(s), as
> reported by the hardware, over a time frame spanning no more than a few
> milliseconds. As it requires appropriate hardware support, this
> interface is optional.
>
> Note that under the hood, the new attribute relies on the information
> provided by arch_freq_get_on_cpu, which, up to this point, has been
> feeding data for scaling_cur_freq attribute, being the source of
> ambiguity when it comes to interpretation. This has been amended by
> restoring the intended behavior for scaling_cur_freq, with a new
> dedicated config option to maintain status quo for those, who may need
> it.

In case anyone is waiting for my input here

Acked-by: Rafael J. Wysocki <rafael@kernel.org>

for this and the previous patch and please feel free to route them
both through ARM64.

Thanks!

> CC: Jonathan Corbet <corbet@lwn.net>
> CC: Thomas Gleixner <tglx@linutronix.de>
> CC: Ingo Molnar <mingo@redhat.com>
> CC: Borislav Petkov <bp@alien8.de>
> CC: Dave Hansen <dave.hansen@linux.intel.com>
> CC: H. Peter Anvin <hpa@zytor.com>
> CC: Phil Auld <pauld@redhat.com>
> CC: x86@kernel.org
> CC: linux-doc@vger.kernel.org
>
> Signed-off-by: Beata Michalska <beata.michalska@arm.com>
> Reviewed-by: Prasanna Kumar T S M <ptsm@linux.microsoft.com>
> Reviewed-by: Sumit Gupta <sumitg@nvidia.com>
> ---
>  Documentation/admin-guide/pm/cpufreq.rst | 17 +++++++++++++-
>  drivers/cpufreq/Kconfig.x86              | 12 ++++++++++
>  drivers/cpufreq/cpufreq.c                | 30 +++++++++++++++++++++++-
>  3 files changed, 57 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
> index a21369eba034..3950583f2b15 100644
> --- a/Documentation/admin-guide/pm/cpufreq.rst
> +++ b/Documentation/admin-guide/pm/cpufreq.rst
> @@ -248,6 +248,20 @@ are the following:
>  	If that frequency cannot be determined, this attribute should not
>  	be present.
>
> +``cpuinfo_avg_freq``
> +	An average frequency (in KHz) of all CPUs belonging to a given policy,
> +	derived from a hardware provided feedback and reported on a time frame
> +	spanning at most few milliseconds.
> +
> +	This is expected to be based on the frequency the hardware actually runs
> +	at and, as such, might require specialised hardware support (such as AMU
> +	extension on ARM). If one cannot be determined, this attribute should
> +	not be present.
> +
> +	Note, that failed attempt to retrieve current frequency for a given
> +	CPU(s) will result in an appropriate error, i.e: EAGAIN for CPU that
> +	remains idle (raised on ARM).
> +
>  ``cpuinfo_max_freq``
>  	Maximum possible operating frequency the CPUs belonging to this policy
>  	can run at (in kHz).
> @@ -293,7 +307,8 @@ are the following:
>  	Some architectures (e.g. ``x86``) may attempt to provide information
>  	more precisely reflecting the current CPU frequency through this
>  	attribute, but that still may not be the exact current CPU frequency as
> -	seen by the hardware at the moment.
> +	seen by the hardware at the moment. This behavior though, is only
> +	available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option.
>
>  ``scaling_driver``
>  	The scaling driver currently in use.
> diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
> index 97c2d4f15d76..2c5c228408bf 100644
> --- a/drivers/cpufreq/Kconfig.x86
> +++ b/drivers/cpufreq/Kconfig.x86
> @@ -340,3 +340,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK
>  	  option lets the probing code bypass some of those checks if the
>  	  parameter "relaxed_check=1" is passed to the module.
>
> +config CPUFREQ_ARCH_CUR_FREQ
> +	default y
> +	bool "Current frequency derived from HW provided feedback"
> +	help
> +	  This determines whether the scaling_cur_freq sysfs attribute returns
> +	  the last requested frequency or a more precise value based on hardware
> +	  provided feedback (as architected counters).
> +	  Given that a more precise frequency can now be provided via the
> +	  cpuinfo_avg_freq attribute, by enabling this option,
> +	  scaling_cur_freq maintains the provision of a counter based frequency,
> +	  for compatibility reasons.
> +
> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
> index 96b013ea177c..a2f31fbb1774 100644
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -734,12 +734,20 @@ __weak int arch_freq_get_on_cpu(int cpu)
>  	return -EOPNOTSUPP;
>  }
>
> +static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy *policy)
> +{
> +	return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP;
> +}
> +
>  static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
>  {
>  	ssize_t ret;
>  	int freq;
>
> -	freq = arch_freq_get_on_cpu(policy->cpu);
> +	freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)
> +		? arch_freq_get_on_cpu(policy->cpu)
> +		: 0;
> +
>  	if (freq > 0)
>  		ret = sysfs_emit(buf, "%u\n", freq);
>  	else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
> @@ -784,6 +792,19 @@ static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
>  	return sysfs_emit(buf, "<unknown>\n");
>  }
>
> +/*
> + * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware
> + */
> +static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
> +				     char *buf)
> +{
> +	int avg_freq = arch_freq_get_on_cpu(policy->cpu);
> +
> +	if (avg_freq > 0)
> +		return sysfs_emit(buf, "%u\n", avg_freq);
> +	return avg_freq != 0 ? avg_freq : -EINVAL;
> +}
> +
>  /*
>   * show_scaling_governor - show the current policy for the specified CPU
>   */
> @@ -946,6 +967,7 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
>  }
>
>  cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
> +cpufreq_freq_attr_ro(cpuinfo_avg_freq);
>  cpufreq_freq_attr_ro(cpuinfo_min_freq);
>  cpufreq_freq_attr_ro(cpuinfo_max_freq);
>  cpufreq_freq_attr_ro(cpuinfo_transition_latency);
> @@ -1073,6 +1095,12 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
>  			return ret;
>  	}
>
> +	if (cpufreq_avg_freq_supported(policy)) {
> +		ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr);
> +		if (ret)
> +			return ret;
> +	}
> +
>  	ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
>  	if (ret)
>  		return ret;
> --
> 2.25.1
>
>
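As a reading aid for the show_scaling_cur_freq() hunk, the selection it introduces can be modelled in user space roughly as below. This is only a sketch: the enum and the function name are hypothetical, the boolean stands in for IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ), and arch_ret stands in for the value returned by arch_freq_get_on_cpu().

```c
#include <stdbool.h>

/* Hypothetical names, used only to model the hunk in user space. */
enum cur_freq_source {
	CUR_FREQ_FROM_ARCH,	/* hardware feedback via arch_freq_get_on_cpu() */
	CUR_FREQ_FROM_CORE,	/* remaining branches of show_scaling_cur_freq() */
};

/*
 * With the option disabled, freq is forced to 0, so the "freq > 0"
 * branch is never taken and the arch hook is not consulted at all.
 * With the option enabled, a non-positive result from the arch hook
 * (including an error) also falls through to the other branches.
 */
static enum cur_freq_source pick_cur_freq_source(bool arch_cur_freq_enabled,
						 int arch_ret)
{
	int freq = arch_cur_freq_enabled ? arch_ret : 0;

	return freq > 0 ? CUR_FREQ_FROM_ARCH : CUR_FREQ_FROM_CORE;
}
```

The point of the model is that errors from the arch hook are never reported through scaling_cur_freq; they only change which source the value comes from.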
Hi Catalin,

On Monday 17 Feb 2025 at 14:57:53 (+0000), Catalin Marinas wrote:
> On Mon, Feb 17, 2025 at 12:52:44PM +0100, Rafael J. Wysocki wrote:
> > On Fri, Jan 31, 2025 at 5:25 PM Beata Michalska <beata.michalska@arm.com> wrote:
> > >
> > > Currently the CPUFreq core exposes two sysfs attributes that can be used
> > > to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
> > > and scaling_cur_freq. Both provide slightly different view on the
> > > subject and they do come with their own drawbacks.
> > >
> > > cpuinfo_cur_freq provides higher precision though at a cost of being
> > > rather expensive. Moreover, the information retrieved via this attribute
> > > is somewhat short lived as frequency can change at any point of time
> > > making it difficult to reason from.
> > >
> > > scaling_cur_freq, on the other hand, tends to be less accurate but then
> > > the actual level of precision (and source of information) varies between
> > > architectures making it a bit ambiguous.
> > >
> > > The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
> > > distinct interface, exposing an average frequency of a given CPU(s), as
> > > reported by the hardware, over a time frame spanning no more than a few
> > > milliseconds. As it requires appropriate hardware support, this
> > > interface is optional.
> > >
> > > Note that under the hood, the new attribute relies on the information
> > > provided by arch_freq_get_on_cpu, which, up to this point, has been
> > > feeding data for scaling_cur_freq attribute, being the source of
> > > ambiguity when it comes to interpretation. This has been amended by
> > > restoring the intended behavior for scaling_cur_freq, with a new
> > > dedicated config option to maintain status quo for those, who may need
> > > it.
> >
> > In case anyone is waiting for my input here
> >
> > Acked-by: Rafael J. Wysocki <rafael@kernel.org>
> >
> > for this and the previous patch and please feel free to route them
> > both through ARM64.
>
> Thanks Rafael. I indeed plan to take them through the arm64 tree.

Just a mention that this set depends on the patch that Beata linked at
[6]. That patch applies cleanly on next-20250217 and it still
builds/boots/works as expected.

Thanks,
Ionela.

>
> --
> Catalin
On Mon, Feb 17, 2025 at 03:07:24PM +0000, Ionela Voinescu wrote:
> Hi Catalin,
>
> On Monday 17 Feb 2025 at 14:57:53 (+0000), Catalin Marinas wrote:
> > On Mon, Feb 17, 2025 at 12:52:44PM +0100, Rafael J. Wysocki wrote:
> > > On Fri, Jan 31, 2025 at 5:25 PM Beata Michalska <beata.michalska@arm.com> wrote:
> > > >
> > > > Currently the CPUFreq core exposes two sysfs attributes that can be used
> > > > to query current frequency of a given CPU(s): namely cpuinfo_cur_freq
> > > > and scaling_cur_freq. Both provide slightly different view on the
> > > > subject and they do come with their own drawbacks.
> > > >
> > > > cpuinfo_cur_freq provides higher precision though at a cost of being
> > > > rather expensive. Moreover, the information retrieved via this attribute
> > > > is somewhat short lived as frequency can change at any point of time
> > > > making it difficult to reason from.
> > > >
> > > > scaling_cur_freq, on the other hand, tends to be less accurate but then
> > > > the actual level of precision (and source of information) varies between
> > > > architectures making it a bit ambiguous.
> > > >
> > > > The new attribute, cpuinfo_avg_freq, is intended to provide more stable,
> > > > distinct interface, exposing an average frequency of a given CPU(s), as
> > > > reported by the hardware, over a time frame spanning no more than a few
> > > > milliseconds. As it requires appropriate hardware support, this
> > > > interface is optional.
> > > >
> > > > Note that under the hood, the new attribute relies on the information
> > > > provided by arch_freq_get_on_cpu, which, up to this point, has been
> > > > feeding data for scaling_cur_freq attribute, being the source of
> > > > ambiguity when it comes to interpretation. This has been amended by
> > > > restoring the intended behavior for scaling_cur_freq, with a new
> > > > dedicated config option to maintain status quo for those, who may need
> > > > it.
> > >
> > > In case anyone is waiting for my input here
> > >
> > > Acked-by: Rafael J. Wysocki <rafael@kernel.org>
> > >
> > > for this and the previous patch and please feel free to route them
> > > both through ARM64.
> >
> > Thanks Rafael. I indeed plan to take them through the arm64 tree.
>
> Just a mention that this set depends on the patch that Beata linked at
> [6]. That patch applies cleanly on next-20250217 and it still
> builds/boots/works as expected.
>

Ah I see it is indeed dependent. Just responded on the other thread before
reading this. So it is better if Catalin picks up [6] as well. Sorry for the
confusion.
diff --git a/Documentation/admin-guide/pm/cpufreq.rst b/Documentation/admin-guide/pm/cpufreq.rst
index a21369eba034..3950583f2b15 100644
--- a/Documentation/admin-guide/pm/cpufreq.rst
+++ b/Documentation/admin-guide/pm/cpufreq.rst
@@ -248,6 +248,20 @@ are the following:
 	If that frequency cannot be determined, this attribute should not
 	be present.
 
+``cpuinfo_avg_freq``
+	An average frequency (in KHz) of all CPUs belonging to a given policy,
+	derived from a hardware provided feedback and reported on a time frame
+	spanning at most few milliseconds.
+
+	This is expected to be based on the frequency the hardware actually runs
+	at and, as such, might require specialised hardware support (such as AMU
+	extension on ARM). If one cannot be determined, this attribute should
+	not be present.
+
+	Note, that failed attempt to retrieve current frequency for a given
+	CPU(s) will result in an appropriate error, i.e: EAGAIN for CPU that
+	remains idle (raised on ARM).
+
 ``cpuinfo_max_freq``
 	Maximum possible operating frequency the CPUs belonging to this policy
 	can run at (in kHz).
@@ -293,7 +307,8 @@ are the following:
 	Some architectures (e.g. ``x86``) may attempt to provide information
 	more precisely reflecting the current CPU frequency through this
 	attribute, but that still may not be the exact current CPU frequency as
-	seen by the hardware at the moment.
+	seen by the hardware at the moment. This behavior though, is only
+	available via c:macro:``CPUFREQ_ARCH_CUR_FREQ`` option.
 
 ``scaling_driver``
 	The scaling driver currently in use.
diff --git a/drivers/cpufreq/Kconfig.x86 b/drivers/cpufreq/Kconfig.x86
index 97c2d4f15d76..2c5c228408bf 100644
--- a/drivers/cpufreq/Kconfig.x86
+++ b/drivers/cpufreq/Kconfig.x86
@@ -340,3 +340,15 @@ config X86_SPEEDSTEP_RELAXED_CAP_CHECK
 	  option lets the probing code bypass some of those checks if the
 	  parameter "relaxed_check=1" is passed to the module.
 
+config CPUFREQ_ARCH_CUR_FREQ
+	default y
+	bool "Current frequency derived from HW provided feedback"
+	help
+	  This determines whether the scaling_cur_freq sysfs attribute returns
+	  the last requested frequency or a more precise value based on hardware
+	  provided feedback (as architected counters).
+	  Given that a more precise frequency can now be provided via the
+	  cpuinfo_avg_freq attribute, by enabling this option,
+	  scaling_cur_freq maintains the provision of a counter based frequency,
+	  for compatibility reasons.
+
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 96b013ea177c..a2f31fbb1774 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -734,12 +734,20 @@ __weak int arch_freq_get_on_cpu(int cpu)
 	return -EOPNOTSUPP;
 }
 
+static inline bool cpufreq_avg_freq_supported(struct cpufreq_policy *policy)
+{
+	return arch_freq_get_on_cpu(policy->cpu) != -EOPNOTSUPP;
+}
+
 static ssize_t show_scaling_cur_freq(struct cpufreq_policy *policy, char *buf)
 {
 	ssize_t ret;
 	int freq;
 
-	freq = arch_freq_get_on_cpu(policy->cpu);
+	freq = IS_ENABLED(CONFIG_CPUFREQ_ARCH_CUR_FREQ)
+		? arch_freq_get_on_cpu(policy->cpu)
+		: 0;
+
 	if (freq > 0)
 		ret = sysfs_emit(buf, "%u\n", freq);
 	else if (cpufreq_driver->setpolicy && cpufreq_driver->get)
@@ -784,6 +792,19 @@ static ssize_t show_cpuinfo_cur_freq(struct cpufreq_policy *policy,
 	return sysfs_emit(buf, "<unknown>\n");
 }
 
+/*
+ * show_cpuinfo_avg_freq - average CPU frequency as detected by hardware
+ */
+static ssize_t show_cpuinfo_avg_freq(struct cpufreq_policy *policy,
+				     char *buf)
+{
+	int avg_freq = arch_freq_get_on_cpu(policy->cpu);
+
+	if (avg_freq > 0)
+		return sysfs_emit(buf, "%u\n", avg_freq);
+	return avg_freq != 0 ? avg_freq : -EINVAL;
+}
+
 /*
  * show_scaling_governor - show the current policy for the specified CPU
  */
@@ -946,6 +967,7 @@ static ssize_t show_bios_limit(struct cpufreq_policy *policy, char *buf)
 }
 
 cpufreq_freq_attr_ro_perm(cpuinfo_cur_freq, 0400);
+cpufreq_freq_attr_ro(cpuinfo_avg_freq);
 cpufreq_freq_attr_ro(cpuinfo_min_freq);
 cpufreq_freq_attr_ro(cpuinfo_max_freq);
 cpufreq_freq_attr_ro(cpuinfo_transition_latency);
@@ -1073,6 +1095,12 @@ static int cpufreq_add_dev_interface(struct cpufreq_policy *policy)
 			return ret;
 	}
 
+	if (cpufreq_avg_freq_supported(policy)) {
+		ret = sysfs_create_file(&policy->kobj, &cpuinfo_avg_freq.attr);
+		if (ret)
+			return ret;
+	}
+
 	ret = sysfs_create_file(&policy->kobj, &scaling_cur_freq.attr);
 	if (ret)
 		return ret;
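To make the return-value handling in show_cpuinfo_avg_freq() easier to follow, here is a small user-space model of it. It is a sketch under assumptions: model_show_avg_freq() is a made-up name, snprintf() stands in for sysfs_emit(), and arch_ret stands in for whatever arch_freq_get_on_cpu() returned.

```c
#include <errno.h>
#include <stdio.h>

/*
 * model_show_avg_freq() - hypothetical user-space model, not kernel code.
 *
 *   arch_ret > 0   the frequency is formatted into buf and the number
 *                  of bytes written is returned,
 *   arch_ret == 0  no usable value, reported as -EINVAL,
 *   arch_ret < 0   the error from the arch hook is passed through
 *                  unchanged (e.g. -EAGAIN).
 */
static long model_show_avg_freq(int arch_ret, char *buf, size_t len)
{
	if (arch_ret > 0)
		return snprintf(buf, len, "%u\n", (unsigned int)arch_ret);

	return arch_ret != 0 ? arch_ret : -EINVAL;
}
```

Note that -EOPNOTSUPP should not normally reach this path, since cpufreq_add_dev_interface() only creates the attribute when cpufreq_avg_freq_supported() reports that the arch hook is implemented.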