Message ID | 2931539.RsFqoHxarq@kreacher |
---|---|
State | New |
Series | [RFC/RFT] cpufreq: intel_pstate: Accept passive mode with HWP enabled |
On 2020.06.30 11:41 Doug Smythies wrote:
>
> Hi Srinivas,
>
> O.K. let's try this again, starting a new thread, with address list similar to a few weeks ago.
> I believe I have untangled my multiple issues, such that this e-mail should be only about
> the single issue of HWP capable processors incorrectly deciding to lower the CPU frequency
> under some conditions. Also, my previous assertion as to the issue was indeed incorrect.
>
> I now:
> . never use x86_energy_perf_policy.
> . For HWP disabled: never change from active to passive or vice versa, but rather do it via boot.
> . after boot always check and reset the various power limit log bits that are set.
> . never compile the kernel (well, until after any tests), which will set those bits again.
> . never run prime95 high heat torture test, which will set those bits again.
> . Note that the tests done for this e-mail never ever set those bits again.
> . Invented an entirely new way to manifest, demonstrate, and exploit the issue (also mentioned June 6th).
> . All tests were repeated on another HWP capable computer, so an i5-9600K and an i5-6200U.
>
> New method (old was periodic workflow):
>
> Long busy, short gap, busy but taking loop time samples so as to estimate CPU frequency.
> I am calling it an inverse impulse response test.
>
> Assertion:
>
> If the short sleep is somehow simultaneous with some sort of 5.0 millisecond (200 Hertz)
> periodic event (either in HWP itself, or via the driver, I am unable to determine which,
> but think it is inside the black box that is HWP),

I have been attempting to characterise the "black box" that is HWP.
In terms of system response versus EPP, I only observe the HWP loop time as the
response variable.

0 <= EPP <= 1 : My test cannot measure loop time.
2 <= EPP <= 39 : HWP servo loop time 2 milliseconds
40 <= EPP <= 55 : HWP servo loop time 3 milliseconds
56 <= EPP <= 79 : HWP servo loop time 4 milliseconds
80 <= EPP <= 133 : HWP servo loop time 5 milliseconds
134 <= EPP <= 143 : HWP servo loop time 6 milliseconds
144 <= EPP <= 154 : HWP servo loop time 7 milliseconds
155 <= EPP <= 175 : HWP servo loop time 8 milliseconds
176 <= EPP <= 255 : HWP servo loop time 9 milliseconds

If there are other system response differences within
those groups, I haven't been able to detect them,
but would be grateful for any further insight.

Otherwise, in future, I do not see a need to test anything
other than 9 values of EPP, one from each group.

> then there is a possibility that the
> CPU frequency will drop significantly and will take an excessive amount of time to recover.
> Frequency step ups are exactly on 5.0 millisecond boundaries +/- the short gap time.
>
> . The probability is somewhat inconsistent and a function of whatever else the computer is doing.
> . The time to recover is a function of EPP, and if EPP is low enough my test never fails.
> . These tests were all done with default settings.
> . The "5.0" mSec is only for those default settings, it actually depends on EPP.
> . Crude step boundaries, mSec: EPP=32, 2; EPP=64, 4; EPP=128, 5.00; EPP=196, 9

Now fully understood, as listed above.

> . High level: i5-9600K: 2453 tests, 60 failures, 2.45% fail rate. (HWP - powersave)
> . High level: i5-6200U: 4134 tests, 128 failures, 3.1% fail rate. (HWP - powersave)
> . Low level (capture waveforms): i5-9600K: 1842 captured failure waveforms. See graph.
> . Low level (capture waveforms): i5-6200U: 458 captured failure waveforms. See graph.
> . Verify acpi-cpufreq/ondemand works fine: i5-9600K: 8975 tests. 0 failures.
> . Verify acpi-cpufreq/ondemand works fine: i5-6200U: 8575 tests. 0 failures.

The tests were all done using the teo idle governor.
While the menu governor does not fail for this particular test, it fails
in other scenarios.

I have yet to find a failure scenario when idle state 2 is disabled.
I have captured and analyzed about 400 megabytes of trace data,
and have not been able to isolate an exact correlation.

>
> The short gap was 842 uSeconds for all these tests, and for no particular reason.
>
> While I have not re-done the bounds investigation, I have no reason to doubt
> my previous work, re-stated below:
>
> > Gap definition:
> > lower limit not known, but < 747 uSeconds.
> > Upper limit is between 952 and 955 uSeconds (there will be some overhead uncertainties).

The only new information I have is that the upper bound is bigger.

> > Must be preceded by busy time spanning a couple of HWP sampling boundaries
> > or jiffy boundaries or something (I don't actually know how HWP does stuff).
>
> Rather than point to graphs, which nobody seems to look at, they are attached,
> and so might get stripped for some of you.
>
> ... Doug
>
> Addendum: Some of the MSRs you have requested in the past:
>
> i5-9600K (HWP - powersave after test):
>
> root@s18:/home/doug# /home/doug/c/msr-decoder
> 8.) 0x198: IA32_PERF_STATUS : CPU 0-5 : 8 : 8 : 8 : 8 : 8 : 8 :
> B.) 0x770: IA32_PM_ENABLE: 1 : HWP enable
> 1.) 0x19C: IA32_THERM_STATUS: 88480000
> 2.) 0x1AA: MSR_MISC_PWR_MGMT: 401CC0 EIST enabled Coordination enabled OOB Bit 8 reset OOB Bit 18 reset
> 3.) 0x1B1: IA32_PACKAGE_THERM_STATUS: 88460000
> 4.) 0x64F: MSR_CORE_PERF_LIMIT_REASONS: 0
> A.) 0x1FC: MSR_POWER_CTL: 3C005D : C1E disable : EEO disable : RHO disable
> 5.) 0x771: IA32_HWP_CAPABILITIES (performance): 108252E : high 46 : guaranteed 37 : efficient 8 : lowest 1
> 6.) 0x774: IA32_HWP_REQUEST: CPU 0-5 :
> raw: 80002E08 : 80002E08 : 80002E08 : 80002E08 : 80002E08 : 80002E08 :
> min: 8 : 8 : 8 : 8 : 8 : 8 :
> max: 46 : 46 : 46 : 46 : 46 : 46 :
> des: 0 : 0 : 0 : 0 : 0 : 0 :
> epp: 128 : 128 : 128 : 128 : 128 : 128 :
> act: 0 : 0 : 0 : 0 : 0 : 0 :
> 7.) 0x777: IA32_HWP_STATUS: 0 : high 0 : guaranteed 0 : efficient 0 : lowest 0
>
> i5-9600K (no HWP - acpi-cpufreq/ondemand after test):
>
> root@s18:/home/doug/c# /home/doug/c/msr-decoder
> 8.) 0x198: IA32_PERF_STATUS : CPU 0-5 : 8 : 8 : 8 : 8 : 8 : 8 :
> B.) 0x770: IA32_PM_ENABLE: 0 : HWP disable
> 9.) 0x199: IA32_PERF_CTL : CPU 0-5 : 8 : 8 : 8 : 8 : 8 : 8 :
> C.) 0x1B0: IA32_ENERGY_PERF_BIAS: CPU 0-5 : 6 : 6 : 6 : 6 : 6 : 6 :
> 1.) 0x19C: IA32_THERM_STATUS: 88480000
> 2.) 0x1AA: MSR_MISC_PWR_MGMT: 401CC0 EIST enabled Coordination enabled OOB Bit 8 reset OOB Bit 18 reset
> 3.) 0x1B1: IA32_PACKAGE_THERM_STATUS: 88460000
> 4.) 0x64F: MSR_CORE_PERF_LIMIT_REASONS: 0
> A.) 0x1FC: MSR_POWER_CTL: 3C005D : C1E disable : EEO disable : RHO disable
>
> i5-6200U (HWP - powersave after test):
>
> 8.) 0x198: IA32_PERF_STATUS : CPU 0-3 : 19 : 19 : 19 : 19 :
> B.) 0x770: IA32_PM_ENABLE: 1 : HWP enable
> 1.) 0x19C: IA32_THERM_STATUS: 88430000
> 2.) 0x1AA: MSR_MISC_PWR_MGMT: 4018C0 EIST enabled Coordination enabled OOB Bit 8 reset OOB Bit 18 reset
> 3.) 0x1B1: IA32_PACKAGE_THERM_STATUS: 88420000
> 4.) 0x64F: MSR_CORE_PERF_LIMIT_REASONS: 0
> A.) 0x1FC: MSR_POWER_CTL: 24005D : C1E disable : EEO enable : RHO enable
> 5.) 0x771: IA32_HWP_CAPABILITIES (performance): 105171C : high 28 : guaranteed 23 : efficient 5 : lowest 1
> 6.) 0x774: IA32_HWP_REQUEST: CPU 0-3 :
> raw: 80001B04 : 80001B04 : 80001B04 : 80001B04 :
> min: 4 : 4 : 4 : 4 :
> max: 27 : 27 : 27 : 27 :
> des: 0 : 0 : 0 : 0 :
> epp: 128 : 128 : 128 : 128 :
> act: 0 : 0 : 0 : 0 :
> 7.) 0x777: IA32_HWP_STATUS: 4 : high 4 : guaranteed 0 : efficient 0 : lowest 0
>
> i5-6200U (no HWP - acpi-cpufreq/ondemand after test):
>
> 8.) 0x198: IA32_PERF_STATUS : CPU 0-3 : 23 : 23 : 23 : 23 :
> B.) 0x770: IA32_PM_ENABLE: 0 : HWP disable
> 9.) 0x199: IA32_PERF_CTL : CPU 0-3 : 11 : 5 : 5 : 5 :
> C.) 0x1B0: IA32_ENERGY_PERF_BIAS: CPU 0-3 : 6 : 6 : 6 : 6 :
> 1.) 0x19C: IA32_THERM_STATUS: 88440000
> 2.) 0x1AA: MSR_MISC_PWR_MGMT: 4018C0 EIST enabled Coordination enabled OOB Bit 8 reset OOB Bit 18 reset
> 3.) 0x1B1: IA32_PACKAGE_THERM_STATUS: 88430000
> 4.) 0x64F: MSR_CORE_PERF_LIMIT_REASONS: 0
> A.) 0x1FC: MSR_POWER_CTL: 24005D : C1E disable : EEO enable : RHO enable
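For readers who want to check the decoded fields in the MSR dumps above against the raw register values, here is a minimal sketch. It assumes the byte-wise field layout of IA32_HWP_CAPABILITIES and IA32_HWP_REQUEST as documented in the Intel SDM. The function names are hypothetical and are not taken from the msr-decoder program used in the report, whose source is not shown in this thread.

```python
def decode_hwp_capabilities(raw):
    """Split a raw IA32_HWP_CAPABILITIES (0x771) value into its byte fields."""
    return {
        "highest": raw & 0xFF,             # bits 7:0
        "guaranteed": (raw >> 8) & 0xFF,   # bits 15:8
        "efficient": (raw >> 16) & 0xFF,   # bits 23:16
        "lowest": (raw >> 24) & 0xFF,      # bits 31:24
    }


def decode_hwp_request(raw):
    """Split a raw IA32_HWP_REQUEST (0x774) value into its fields."""
    return {
        "min": raw & 0xFF,                        # bits 7:0
        "max": (raw >> 8) & 0xFF,                 # bits 15:8
        "desired": (raw >> 16) & 0xFF,            # bits 23:16
        "epp": (raw >> 24) & 0xFF,                # bits 31:24
        "activity_window": (raw >> 32) & 0x3FF,   # bits 41:32
    }


if __name__ == "__main__":
    # Raw values copied from the i5-9600K dump above.
    print(decode_hwp_capabilities(0x108252E))
    print(decode_hwp_request(0x80002E08))
```

With the raw values from the dumps, the decoded fields agree with the listed ones: EPP is 128 (the default) on both machines, and the i5-6200U request carries max 27 against a reported highest performance of 28.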
On Wed, 2020-07-08 at 07:41 -0700, Doug Smythies wrote:
> On 2020.06.30 11:41 Doug Smythies wrote:

...

> Otherwise, in future, I do not see a need to test anything
> other than 9 values of EPP, one from each group.
>
Thanks Doug,
I think they are enough. But there is no guarantee that every CPU model
will have same results as the power curve will be different.

Thanks,
Srinivas
On 2020.07.08 07:54 srinivas pandruvada wrote:
> On Wed, 2020-07-08 at 07:41 -0700, Doug Smythies wrote:
> > On 2020.06.30 11:41 Doug Smythies wrote:

...

> > > If the short sleep is somehow simultaneous with some sort of 5.0 millisecond (200 Hertz)
> > > periodic event (either in HWP itself, or via the driver, I am unable to determine which,
> > > but think it is inside the black box that is HWP),
> >
> > I have been attempting to characterise the "black box" that is HWP.
> > In terms of system response versus EPP, I only observe the HWP loop time as the
> > response variable.
> >
> > 0 <= EPP <= 1 : My test cannot measure loop time.
> > 2 <= EPP <= 39 : HWP servo loop time 2 milliseconds
> > 40 <= EPP <= 55 : HWP servo loop time 3 milliseconds
> > 56 <= EPP <= 79 : HWP servo loop time 4 milliseconds
> > 80 <= EPP <= 133 : HWP servo loop time 5 milliseconds
> > 134 <= EPP <= 143 : HWP servo loop time 6 milliseconds
> > 144 <= EPP <= 154 : HWP servo loop time 7 milliseconds
> > 155 <= EPP <= 175 : HWP servo loop time 8 milliseconds
> > 176 <= EPP <= 255 : HWP servo loop time 9 milliseconds
> >
> > If there are other system response differences within
> > those groups, I haven't been able to detect them,
> > but would be grateful for any further insight.
> >
> > Otherwise, in future, I do not see a need to test anything
> > other than 9 values of EPP, one from each group.
> >
> Thanks Doug,
> I think they are enough. But there is no guarantee that every CPU model
> will have same results as the power curve will be different.

Yes, of course the response curve is different between CPU models.
However, the basic loop times seem to be the same,
although I admit to having limited data from other CPU models.

... Doug
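The EPP groups quoted above can be expressed as a small lookup, which may help when choosing one EPP value per group for testing. This is only a sketch of the table as measured on the i5-9600K and i5-6200U; as noted in the reply, other CPU models may behave differently. The function name `hwp_loop_time_ms` is hypothetical.

```python
import bisect

# Inclusive upper EPP bound of each group from the table above, and the
# HWP servo loop time (milliseconds) observed for that group.
# None marks the 0..1 group, where the test could not measure a loop time.
_EPP_UPPER_BOUNDS = [1, 39, 55, 79, 133, 143, 154, 175, 255]
_LOOP_TIME_MS = [None, 2, 3, 4, 5, 6, 7, 8, 9]


def hwp_loop_time_ms(epp):
    """Return the observed HWP servo loop time for an EPP value (0..255)."""
    if not 0 <= epp <= 255:
        raise ValueError("EPP must be in the range 0..255")
    # bisect_left finds the first group whose upper bound is >= epp.
    return _LOOP_TIME_MS[bisect.bisect_left(_EPP_UPPER_BOUNDS, epp)]


if __name__ == "__main__":
    for epp in (32, 64, 128, 196):
        print(epp, hwp_loop_time_ms(epp))
```

The four values printed correspond to the "crude step boundaries" in the original report (EPP=32, 64, 128 and 196), and the lookup gives the same 2, 4, 5 and 9 millisecond figures for them.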
Hi Srinivas, or anybody at Intel,

Any chance of you looking into this issue.
I first raised it over 2 months ago.

On 2020.07.08 07:41 Doug Smythies wrote:
> On 2020.06.30 11:41 Doug Smythies wrote:

...

> > I believe I have untangled my multiple issues, such that this e-mail should be only about
> > the single issue of HWP capable processors incorrectly deciding to lower the CPU frequency
> > under some conditions.

...
On Sun, 2020-08-02 at 07:36 -0700, Doug Smythies wrote:
> Hi Srinivas, or anybody at Intel,
>
> Any chance of you looking into this issue.
> I first raised it over 2 months ago.

Hi Doug,
Unfortunately, didn't reach to this yet.

Thanks,
Srinivas
On 2020.08.02 Srinivas Pandruvada wrote:
> On Sun, 2020-08-02 at 07:36 -0700, Doug Smythies wrote:
> > Hi Srinivas, or anybody at Intel,
> >
> > Any chance of you looking into this issue.
> > I first raised it over 2 months ago.
>
> Hi Doug,
>
> Unfortunately, didn't reach to this yet.

O.K., I created a bug report:
https://bugzilla.kernel.org/show_bug.cgi?id=210741

>
> Thanks,
> Srinivas
>
>
> >
> > On 2020.07.08 07:41 Doug Smythies wrote:
> > > On 2020.06.30 11:41 Doug Smythies wrote:
> > > > Hi Srinivas,
> > > >
> > > > O.K. let's try this again, starting a new thread, with address list similar to a few weeks ago.
> > > > I believe I have untangled my multiple issues, such that this e-mail should be only about
> > > > the single issue of HWP capable processors incorrectly deciding to lower the CPU frequency
> > > > under some conditions. Also, my previous assertion as to the issue was indeed incorrect.
> > > >
> > > > I now:
> > > > . never use x86_energy_perf_policy.
> > > > . For HWP disabled: never change from active to passive or via versa, but rather do it via boot.
> > > > . after boot always check and reset the various power limit log bits that are set.
> > > > . never compile the kernel (well, until after any tests), which will set those bits again.
> > > > . never run prime95 high heat torture test, which will set those bits again.
> > > > . Note that the tests done for this e-mail never ever set those bits again.
> > > > . Invented an entirely new way to manifest, demonstrate, and exploit the issue (also mentioned June
> > > > 6th).
> > > > . All tests were repeated on another HWP capable computer, so a i5-9600K and a i5-6200U.
> > > >
> > > > New method (old was periodic workflow):
> > > >
> > > > Long busy, short gap, busy but taking loop time samples so as to estimate CPU frequency.
> > > > I am calling it an inverse impulse response test.
Index: linux-pm/drivers/cpufreq/intel_pstate.c
===================================================================
--- linux-pm.orig/drivers/cpufreq/intel_pstate.c
+++ linux-pm/drivers/cpufreq/intel_pstate.c
@@ -36,6 +36,7 @@
 #define INTEL_PSTATE_SAMPLING_INTERVAL	(10 * NSEC_PER_MSEC)
 
 #define INTEL_CPUFREQ_TRANSITION_LATENCY	20000
+#define INTEL_CPUFREQ_TRANSITION_DELAY_HWP	5000
 #define INTEL_CPUFREQ_TRANSITION_DELAY		500
 
 #ifdef CONFIG_ACPI
@@ -2175,7 +2176,10 @@ static int intel_pstate_verify_policy(st
 
 static void intel_cpufreq_stop_cpu(struct cpufreq_policy *policy)
 {
-	intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]);
+	if (hwp_active)
+		intel_pstate_hwp_force_min_perf(policy->cpu);
+	else
+		intel_pstate_set_min_pstate(all_cpu_data[policy->cpu]);
 }
 
 static void intel_pstate_stop_cpu(struct cpufreq_policy *policy)
@@ -2183,12 +2187,10 @@ static void intel_pstate_stop_cpu(struct
 	pr_debug("CPU %d exiting\n", policy->cpu);
 
 	intel_pstate_clear_update_util_hook(policy->cpu);
-	if (hwp_active) {
+	if (hwp_active)
 		intel_pstate_hwp_save_state(policy);
-		intel_pstate_hwp_force_min_perf(policy->cpu);
-	} else {
-		intel_cpufreq_stop_cpu(policy);
-	}
+
+	intel_cpufreq_stop_cpu(policy);
 }
 
 static int intel_pstate_cpu_exit(struct cpufreq_policy *policy)
@@ -2318,13 +2320,58 @@ static void intel_cpufreq_trace(struct c
 		fp_toint(cpu->iowait_boost * 100));
 }
 
+static void intel_cpufreq_update_hwp_request(struct cpudata *cpu, u32 min_perf)
+{
+	u64 value, prev;
+
+	rdmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, &prev);
+	value = prev;
+
+	value &= ~HWP_MIN_PERF(~0L);
+	value |= HWP_MIN_PERF(min_perf);
+
+	/*
+	 * The entire MSR needs to be updated in order to update the HWP min
+	 * field in it, so opportunistically update the max too if needed.
+	 */
+	value &= ~HWP_MAX_PERF(~0L);
+	value |= HWP_MAX_PERF(cpu->max_perf_ratio);
+
+	if (value != prev)
+		wrmsrl_on_cpu(cpu->cpu, MSR_HWP_REQUEST, value);
+}
+
+/**
+ * intel_cpufreq_adjust_hwp - Adjust the HWP request register.
+ * @cpu: Target CPU.
+ * @target_pstate: P-state corresponding to the target frequency.
+ *
+ * Set the HWP minimum performance limit to 75% of @target_pstate taking the
+ * global min and max policy limits into account.
+ *
+ * The purpose of this is to avoid situations in which the kernel and the HWP
+ * algorithm work against each other by giving a hint about the expectations of
+ * the former to the latter.
+ */
+static void intel_cpufreq_adjust_hwp(struct cpudata *cpu, u32 target_pstate)
+{
+	u32 min_perf;
+
+	min_perf = max_t(u32, (3 * target_pstate) / 4, cpu->min_perf_ratio);
+	min_perf = min_t(u32, min_perf, cpu->max_perf_ratio);
+	if (min_perf != cpu->pstate.current_pstate) {
+		cpu->pstate.current_pstate = min_perf;
+		intel_cpufreq_update_hwp_request(cpu, min_perf);
+	}
+}
+
 static int intel_cpufreq_target(struct cpufreq_policy *policy,
 				unsigned int target_freq,
 				unsigned int relation)
 {
 	struct cpudata *cpu = all_cpu_data[policy->cpu];
+	int target_pstate, old_pstate = cpu->pstate.current_pstate;
 	struct cpufreq_freqs freqs;
-	int target_pstate, old_pstate;
 
 	update_turbo_state();
 
@@ -2332,26 +2379,33 @@ static int intel_cpufreq_target(struct c
 	freqs.new = target_freq;
 
 	cpufreq_freq_transition_begin(policy, &freqs);
+
 	switch (relation) {
 	case CPUFREQ_RELATION_L:
-		target_pstate = DIV_ROUND_UP(freqs.new, cpu->pstate.scaling);
+		target_pstate = DIV_ROUND_UP(target_freq, cpu->pstate.scaling);
 		break;
 	case CPUFREQ_RELATION_H:
-		target_pstate = freqs.new / cpu->pstate.scaling;
+		target_pstate = target_freq / cpu->pstate.scaling;
 		break;
 	default:
-		target_pstate = DIV_ROUND_CLOSEST(freqs.new, cpu->pstate.scaling);
+		target_pstate = DIV_ROUND_CLOSEST(target_freq, cpu->pstate.scaling);
 		break;
 	}
-	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
-	old_pstate = cpu->pstate.current_pstate;
-	if (target_pstate != cpu->pstate.current_pstate) {
-		cpu->pstate.current_pstate = target_pstate;
-		wrmsrl_on_cpu(policy->cpu, MSR_IA32_PERF_CTL,
-			      pstate_funcs.get_val(cpu, target_pstate));
+
+	if (hwp_active) {
+		intel_cpufreq_adjust_hwp(cpu, target_pstate);
+	} else {
+		target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
+		if (target_pstate != old_pstate) {
+			cpu->pstate.current_pstate = target_pstate;
+			wrmsrl_on_cpu(cpu->cpu, MSR_IA32_PERF_CTL,
+				      pstate_funcs.get_val(cpu, target_pstate));
+		}
 	}
-	freqs.new = target_pstate * cpu->pstate.scaling;
 	intel_cpufreq_trace(cpu, INTEL_PSTATE_TRACE_TARGET, old_pstate);
+
+	freqs.new = target_pstate * cpu->pstate.scaling;
+
 	cpufreq_freq_transition_end(policy, &freqs, false);
 
 	return 0;
@@ -2361,14 +2415,19 @@ static unsigned int intel_cpufreq_fast_s
 					      unsigned int target_freq)
 {
 	struct cpudata *cpu = all_cpu_data[policy->cpu];
-	int target_pstate, old_pstate;
+	int target_pstate, old_pstate = cpu->pstate.current_pstate;
 
 	update_turbo_state();
 
 	target_pstate = DIV_ROUND_UP(target_freq, cpu->pstate.scaling);
-	target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
-	old_pstate = cpu->pstate.current_pstate;
-	intel_pstate_update_pstate(cpu, target_pstate);
+
+	if (hwp_active) {
+		intel_cpufreq_adjust_hwp(cpu, target_pstate);
+	} else {
+		target_pstate = intel_pstate_prepare_request(cpu, target_pstate);
+		intel_pstate_update_pstate(cpu, target_pstate);
+	}
+
 	intel_cpufreq_trace(cpu, INTEL_PSTATE_TRACE_FAST_SWITCH, old_pstate);
 	return target_pstate * cpu->pstate.scaling;
 }
@@ -2389,7 +2448,6 @@ static int intel_cpufreq_cpu_init(struct
 		return ret;
 
 	policy->cpuinfo.transition_latency = INTEL_CPUFREQ_TRANSITION_LATENCY;
-	policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY;
 	/* This reflects the intel_pstate_get_cpu_pstates() setting. */
 	policy->cur = policy->cpuinfo.min_freq;
 
@@ -2401,10 +2459,13 @@ static int intel_cpufreq_cpu_init(struct
 
 	cpu = all_cpu_data[policy->cpu];
 
-	if (hwp_active)
+	if (hwp_active) {
 		intel_pstate_get_hwp_max(policy->cpu, &turbo_max, &max_state);
-	else
+		policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY_HWP;
+	} else {
 		turbo_max = cpu->pstate.turbo_pstate;
+		policy->transition_delay_us = INTEL_CPUFREQ_TRANSITION_DELAY;
+	}
 
 	min_freq = DIV_ROUND_UP(turbo_max * global.min_perf_pct, 100);
 	min_freq *= cpu->pstate.scaling;
@@ -2505,9 +2566,6 @@ static int intel_pstate_register_driver(
 
 static int intel_pstate_unregister_driver(void)
 {
-	if (hwp_active)
-		return -EBUSY;
-
 	cpufreq_unregister_driver(intel_pstate_driver);
 	intel_pstate_driver_cleanup();
 
@@ -2815,12 +2873,11 @@ static int __init intel_pstate_setup(cha
 	if (!str)
 		return -EINVAL;
 
-	if (!strcmp(str, "disable")) {
+	if (!strcmp(str, "disable"))
 		no_load = 1;
-	} else if (!strcmp(str, "passive")) {
+	else if (!strcmp(str, "passive"))
 		default_driver = &intel_cpufreq;
-		no_hwp = 1;
-	}
+
 	if (!strcmp(str, "no_hwp")) {
 		pr_info("HWP disabled\n");
 		no_hwp = 1;