Message ID | 20220406220809.22555-1-lukasz.luba@arm.com |
---|---|
Series | Introduce Cpufreq Active Stats |
On 06-04-22, 23:08, Lukasz Luba wrote:
> Hi all,
>
> This is the 3rd version of patch set which tries to address issues which are
> due to missing proper information about CPU performance in time.
>
> The issue description:
> 1. "Cpufreq statistics cover the time when CPUs are in idle states, so they
>    are not suitable for certain purposes, like thermal control." Rafael [2]
> 2. Thermal governor Intelligent Power Allocation (IPA) has to estimate power,
>    for the last period, e.g. 100ms, for each CPU in the Cluster, to grant new
>    power and set max possible frequency. Currently in some cases it gets big
>    error, when the frequency of CPU changed in the middle. It is due to the
>    fact that IPA reads the current frequency for the CPU, not aware of all
>    other frequencies which were actively (not in idle) used in the last 100ms.
>
> This code focuses on tracking the events of idle entry/exit for each CPU
> and combine them with the frequency tracked statistics inside internal
> statistics arrays (per-CPU). In the old cpufreq stats we have one shared
> statistics array for the policy (all CPUs) and not take into account
> periods when each CPU was in idle.
>
> Sometimes the IPA error between old estimation signal and reality is quite
> big (>50%).

It would have been useful to show how the stats hierarchy looks in userspace
now.
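To make the error described in point 2 of the cover letter concrete, here is a minimal Python sketch. It is not code from the patch set: the power table, frequencies, residency values and function names are all made up for illustration. It only contrasts an estimate that assumes the whole busy time ran at the frequency seen at the end of the period with one that weights each frequency by its active (non-idle) residency.

```python
# Hypothetical power (mW) of one CPU running flat out at each frequency (kHz).
# These numbers are illustrative only, not taken from any real platform.
POWER_MW = {500_000: 100.0, 1_000_000: 300.0, 1_500_000: 700.0}

PERIOD_NS = 100_000_000  # the 100 ms period mentioned in the cover letter


def estimate_from_current_freq(current_freq_khz, busy_ns, period_ns=PERIOD_NS):
    """Assume all busy time in the period ran at the frequency read at its end."""
    return POWER_MW[current_freq_khz] * busy_ns / period_ns


def estimate_from_active_residency(active_ns_by_freq, period_ns=PERIOD_NS):
    """Weight each frequency's power by the time the CPU actually ran at it."""
    weighted = sum(POWER_MW[f] * ns for f, ns in active_ns_by_freq.items())
    return weighted / period_ns


# The CPU ran 40 ms at 1.5 GHz, then dropped to 500 MHz and ran 10 ms more;
# it was idle for the remaining 50 ms of the period.
active = {1_500_000: 40_000_000, 500_000: 10_000_000}
busy_ns = sum(active.values())

old = estimate_from_current_freq(500_000, busy_ns)
new = estimate_from_active_residency(active)
print(f"current-frequency estimate: {old:.1f} mW")
print(f"active-residency estimate:  {new:.1f} mW")
```

With these made-up numbers the estimate based only on the last frequency is several times lower than the residency-weighted one, which is the kind of gap a mid-period frequency change can produce.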
On 4/26/22 04:11, Viresh Kumar wrote:
> On 06-04-22, 23:08, Lukasz Luba wrote:
>> Hi all,
>>
>> This is the 3rd version of patch set which tries to address issues which are
>> due to missing proper information about CPU performance in time.
>>
>> The issue description:
>> 1. "Cpufreq statistics cover the time when CPUs are in idle states, so they
>>    are not suitable for certain purposes, like thermal control." Rafael [2]
>> 2. Thermal governor Intelligent Power Allocation (IPA) has to estimate power,
>>    for the last period, e.g. 100ms, for each CPU in the Cluster, to grant new
>>    power and set max possible frequency. Currently in some cases it gets big
>>    error, when the frequency of CPU changed in the middle. It is due to the
>>    fact that IPA reads the current frequency for the CPU, not aware of all
>>    other frequencies which were actively (not in idle) used in the last 100ms.
>>
>> This code focuses on tracking the events of idle entry/exit for each CPU
>> and combine them with the frequency tracked statistics inside internal
>> statistics arrays (per-CPU). In the old cpufreq stats we have one shared
>> statistics array for the policy (all CPUs) and not take into account
>> periods when each CPU was in idle.
>>
>> Sometimes the IPA error between old estimation signal and reality is quite
>> big (>50%).
>
> It would have been useful to show how the stats hierarchy looks in userspace
> now.
>

I haven't modified your current cpufreq stats; they are still counting
total time (idle + running) for the given frequency. I think this is
still useful for some userspace tools. These new proposed stats don't
have such a sysfs interface to read them. I don't know if userspace would
be interested in this information (the running-only time). IIRC Android
uses bpf mechanisms to get this information to userspace.
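The difference between the two kinds of statistics (total time versus running-only time per frequency) can be sketched for a single CPU as follows. This is an illustrative Python model, not the kernel implementation: the event format, the `residency` function and the timeline are assumptions made up for the example.

```python
from collections import defaultdict


def residency(events, end_ns):
    """Return (total, active) residency in ns per frequency for one CPU.

    `events` is a time-ordered list of (timestamp_ns, kind, value) tuples
    (a hypothetical format): kind "freq" carries the new frequency in kHz,
    kind "idle" carries True on idle entry and False on idle exit.
    """
    total = defaultdict(int)   # idle + running time, like the existing stats
    active = defaultdict(int)  # running-only time, like the proposed stats
    freq, idle, last = None, False, None

    def account(until_ns):
        # Charge the interval [last, until_ns) to the frequency in effect.
        if freq is not None:
            total[freq] += until_ns - last
            if not idle:
                active[freq] += until_ns - last

    for ts, kind, value in events:
        account(ts)
        if kind == "freq":
            freq = value
        elif kind == "idle":
            idle = value
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
        last = ts

    if last is not None:
        account(end_ns)  # close the final interval
    return dict(total), dict(active)


events = [
    (0, "freq", 1_000_000),          # start at 1 GHz
    (30_000_000, "idle", True),      # idle entry after 30 ms
    (70_000_000, "idle", False),     # idle exit at 70 ms
    (80_000_000, "freq", 500_000),   # frequency change at 80 ms
]
total, active = residency(events, end_ns=100_000_000)
print("total :", total)
print("active:", active)
```

In this timeline the CPU spends 80 ms at 1 GHz according to the total-time view, but only 40 ms of that is counted in the running-only view.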
On 26-04-22, 08:46, Lukasz Luba wrote:
> I haven't modified your current cpufreq stats; they are still counting
> total time (idle + running) for the given frequency. I think this is
> still useful for some userspace tools. These new proposed stats don't
> have such a sysfs interface to read them. I don't know if userspace would
> be interested in this information (the running-only time). IIRC Android
> uses bpf mechanisms to get this information to userspace.

I saw some debugfs bits there, aren't you exposing any data via it? I
am just asking, not suggesting :)
On 4/26/22 08:54, Viresh Kumar wrote:
> On 26-04-22, 08:46, Lukasz Luba wrote:
>> I haven't modified your current cpufreq stats; they are still counting
>> total time (idle + running) for the given frequency. I think this is
>> still useful for some userspace tools. These new proposed stats don't
>> have such a sysfs interface to read them. I don't know if userspace would
>> be interested in this information (the running-only time). IIRC Android
>> uses bpf mechanisms to get this information to userspace.
>
> I saw some debugfs bits there, aren't you exposing any data via it? I
> am just asking, not suggesting :)
>

:) but I didn't dare to make it sysfs. I don't know if anything in
user-space would be interested (apart from my test scripts).
On 26-04-22, 08:59, Lukasz Luba wrote:
> :) but I didn't dare to make it sysfs. I don't know if anything in
> user-space would be interested (apart from my test scripts).

Sure, I was talking about the hierarchy in debugfs only. It will be useful if
you can show how it looks and what data is exposed.
On 4/26/22 09:02, Viresh Kumar wrote:
> On 26-04-22, 08:59, Lukasz Luba wrote:
>> :) but I didn't dare to make it sysfs. I don't know if anything in
>> user-space would be interested (apart from my test scripts).
>
> Sure, I was talking about the hierarchy in debugfs only. It will be useful if
> you can show how it looks and what data is exposed.
>

I've created a new way of sharing such things. Please check
the rendered notebook at [1]. You can find the raw output of that debugfs
at cell 9, or in cell 11 as a dictionary. The residency is in ns.
You can also find a diff of two snapshots for all cpus at cell 16.
We randomly use Little cpus: 0,3,4,5.
At the bottom you can find plots for all cpus, showing their active residency
at frequencies. Cpu1 and cpu2 are big; cpu2 has been hotplugged out, so
there is an empty plot (which is good).

BTW, if you are interested in a comparison of different input power
estimation mechanisms, you can find them here [2]. There are 4
different power signals. One is real, from Juno power/energy meters;
the rest are SW estimations of avg power for the 100ms period.
As you can see in the cell 25 plot, the new proposal in this patch set
is better than the two previous ones used in mainline.
The last plot shows the real power signal and the new avg signal.
The plot is interactive and supports 'Box Zoom' on the right
(scroll to the right to see that toolbox).

Regards,
Lukasz

[1] https://nbviewer.org/github/lukaszluba-arm/lisa/blob/public_tests/ipa_input_power-debugfs.ipynb
[2] https://nbviewer.org/github/lukaszluba-arm/lisa/blob/public_tests/ipa_input_power.ipynb
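For readers who do not open the notebook, the idea of diffing two residency snapshots can be sketched in a few lines of Python. The dictionary layout ({cpu: {frequency in kHz: residency in ns}}), the function name and the sample values below are assumptions for illustration; they are not the actual debugfs output or the notebook's code.

```python
def diff_snapshots(before, after):
    """Per-CPU, per-frequency active residency (ns) gained between two snapshots.

    Both arguments use a hypothetical layout: {cpu: {freq_khz: residency_ns}}.
    Entries missing from `before` are treated as zero.
    """
    delta = {}
    for cpu, freqs in after.items():
        prev = before.get(cpu, {})
        delta[cpu] = {f: ns - prev.get(f, 0) for f, ns in freqs.items()}
    return delta


# Made-up sample data for two Little CPUs.
snap_a = {
    0: {450_000: 1_000_000, 800_000: 5_000_000},
    3: {450_000: 2_000_000, 800_000: 0},
}
snap_b = {
    0: {450_000: 1_500_000, 800_000: 9_000_000},
    3: {450_000: 2_000_000, 800_000: 3_000_000},
}

delta = diff_snapshots(snap_a, snap_b)
for cpu, freqs in sorted(delta.items()):
    print(f"cpu{cpu}: {freqs}")
```

A CPU that was hotplugged out for the whole interval would show zero deltas in such a diff, which corresponds to the empty plot mentioned above.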