Message ID | 20221221155203.11347-1-ptyadav@amazon.de |
---|---|
Headers | show |
Series | intel_pstate: fix turbo not being used after a processor is rebooted | expand |
On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: > When a processor is brought offline and online again, it is unable to > use Turbo mode because the _PSS table does not contain the whole > turbo > frequency range, but only +1 MHz above the max non-turbo frequency. > This > causes problems when ACPI processor driver tries to set frequency > constraints. See patch 2 for more details. > Are you using some _PPC constraint to force to limit frequency? I did a offline/online with PPC=0 with no HWP, I can get to full turbo range. [ 121.237752] smpboot: CPU 1 is now offline [ 125.734886] x86: Booting SMP configuration: [ 125.734892] smpboot: Booting Node 0 Processor 1 APIC 0x2 [ 125.741007] intel_pstate: CPU 1 going online [ 125.741692] intel_pstate: CPU1 - ACPI _PSS perf data [ 125.741698] intel_pstate: *P0: 2301 MHz, 28000 mW, 0x2a00 [ 125.741703] intel_pstate: P1: 2300 MHz, 28000 mW, 0x1700 [ 125.741705] intel_pstate: P2: 2200 MHz, 26297 mW, 0x1600 [ 125.741707] intel_pstate: P3: 2000 MHz, 23263 mW, 0x1400 [ 125.741710] intel_pstate: P4: 1900 MHz, 21924 mW, 0x1300 [ 125.741712] intel_pstate: P5: 1800 MHz, 20612 mW, 0x1200 [ 125.741714] intel_pstate: P6: 1600 MHz, 17812 mW, 0x1000 [ 125.741716] intel_pstate: P7: 1500 MHz, 16581 mW, 0xf00 [ 125.741718] intel_pstate: P8: 1300 MHz, 13946 mW, 0xd00 [ 125.741720] intel_pstate: P9: 1200 MHz, 12796 mW, 0xc00 [ 125.741722] intel_pstate: P10: 1100 MHz, 11426 mW, 0xb00 [ 125.741724] intel_pstate: P11: 900 MHz, 9250 mW, 0x900 [ 125.741726] intel_pstate: P12: 800 MHz, 7965 mW, 0x800 [ 125.741729] intel_pstate: P13: 700 MHz, 6940 mW, 0x700 [ 125.741731] intel_pstate: P14: 500 MHz, 4738 mW, 0x500 [ 125.741733] intel_pstate: P15: 400 MHz, 3787 mW, 0x400 [ 125.741735] intel_pstate: _PPC limits will be enforced [ 125.741740] intel_pstate: policy->max > max non turbo frequency [ 125.741742] intel_pstate: cpu:1 min_policy_perf:4 max_policy_perf:42 [ 125.741745] intel_pstate: cpu:1 global_min:4 global_max:42 [ 125.741747] intel_pstate: cpu:1 max_perf_ratio:42 min_perf_ratio:4 [ 125.742243] intel_pstate: policy->max > max non turbo frequency [ 125.742247] intel_pstate: cpu:1 min_policy_perf:4 max_policy_perf:42 [ 125.742251] intel_pstate: cpu:1 global_min:4 global_max:42 [ 125.742255] intel_pstate: cpu:1 max_perf_ratio:42 min_perf_ratio:4 It is not clear how to get to this non turbo situation. Thanks, Srinivas > Pratyush Yadav (2): > acpi: processor: allow fixing up the frequency for a performance > state > cpufreq: intel_pstate: use acpi perflib to update turbo frequency > > drivers/acpi/processor_perflib.c | 40 > ++++++++++++++++++++++++++++++++ > drivers/cpufreq/intel_pstate.c | 5 ++-- > include/acpi/processor.h | 2 ++ > 3 files changed, 45 insertions(+), 2 deletions(-) > > -- > 2.38.1 >
Hi Srinivas, On Wed, Dec 21 2022, srinivas pandruvada wrote: > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: >> When a processor is brought offline and online again, it is unable to >> use Turbo mode because the _PSS table does not contain the whole >> turbo >> frequency range, but only +1 MHz above the max non-turbo frequency. >> This >> causes problems when ACPI processor driver tries to set frequency >> constraints. See patch 2 for more details. >> > Are you using some _PPC constraint to force to limit frequency? > I did a offline/online with PPC=0 with no HWP, I can get to full turbo > range. > > [ 121.237752] smpboot: CPU 1 is now offline > [ 125.734886] x86: Booting SMP configuration: > [ 125.734892] smpboot: Booting Node 0 Processor 1 APIC 0x2 > [ 125.741007] intel_pstate: CPU 1 going online > [ 125.741692] intel_pstate: CPU1 - ACPI _PSS perf data > [ 125.741698] intel_pstate: *P0: 2301 MHz, 28000 mW, 0x2a00 > [ 125.741703] intel_pstate: P1: 2300 MHz, 28000 mW, 0x1700 > [ 125.741705] intel_pstate: P2: 2200 MHz, 26297 mW, 0x1600 > [ 125.741707] intel_pstate: P3: 2000 MHz, 23263 mW, 0x1400 > [ 125.741710] intel_pstate: P4: 1900 MHz, 21924 mW, 0x1300 > [ 125.741712] intel_pstate: P5: 1800 MHz, 20612 mW, 0x1200 > [ 125.741714] intel_pstate: P6: 1600 MHz, 17812 mW, 0x1000 > [ 125.741716] intel_pstate: P7: 1500 MHz, 16581 mW, 0xf00 > [ 125.741718] intel_pstate: P8: 1300 MHz, 13946 mW, 0xd00 > [ 125.741720] intel_pstate: P9: 1200 MHz, 12796 mW, 0xc00 > [ 125.741722] intel_pstate: P10: 1100 MHz, 11426 mW, 0xb00 > [ 125.741724] intel_pstate: P11: 900 MHz, 9250 mW, 0x900 > [ 125.741726] intel_pstate: P12: 800 MHz, 7965 mW, 0x800 > [ 125.741729] intel_pstate: P13: 700 MHz, 6940 mW, 0x700 > [ 125.741731] intel_pstate: P14: 500 MHz, 4738 mW, 0x500 > [ 125.741733] intel_pstate: P15: 400 MHz, 3787 mW, 0x400 > [ 125.741735] intel_pstate: _PPC limits will be enforced > [ 125.741740] intel_pstate: policy->max > max non turbo frequency > [ 125.741742] intel_pstate: cpu:1 min_policy_perf:4 max_policy_perf:42 > [ 125.741745] intel_pstate: cpu:1 global_min:4 global_max:42 > [ 125.741747] intel_pstate: cpu:1 max_perf_ratio:42 min_perf_ratio:4 > [ 125.742243] intel_pstate: policy->max > max non turbo frequency > [ 125.742247] intel_pstate: cpu:1 min_policy_perf:4 max_policy_perf:42 > [ 125.742251] intel_pstate: cpu:1 global_min:4 global_max:42 > [ 125.742255] intel_pstate: cpu:1 max_perf_ratio:42 min_perf_ratio:4 > > > It is not clear how to get to this non turbo situation. Look at the scaling_max_freq before and after rebooting the CPU. Before you do it, it should be the max turbo frequency (say 2500 MHz). After rebooting the CPU, it should now be 2301 MHz. So the kernel will now not ask for anything above 2301 MHz, so you will never get to 2500 MHz. Another interesting thing I observed is that if I reboot only 1 CPU, its scaling_max_freq goes down to 2301, but it still keeps working at 2500 MHz. This might be something to do with how turbo works, I don't understand that very well. But if you reboot say 20 CPUs, then you see the frequency drop. I use the below steps to reproduce this bug on my system, which has 40 CPUs with a base frequency of 2500 MHz and turbo frequency of 3300 MHz: $ grep 'cpu MHz' /proc/cpuinfo cpu MHz : 3300.000 cpu MHz : 1199.652 cpu MHz : 3300.000 cpu MHz : 3300.000 cpu MHz : 3300.000 cpu MHz : 3300.000 cpu MHz : 3300.000 cpu MHz : 3300.000 cpu MHz : 3300.000 cpu MHz : 3300.000 [ repeat 30 times ] $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq | sort -n | uniq -c 40 3300000 $ for i in `seq 1 20`; do echo 0 | sudo tee /sys/devices/system/cpu/cpu$i/online; done [...] $ for i in `seq 1 20`; do echo 1 | sudo tee /sys/devices/system/cpu/cpu$i/online; done [...] $ grep 'cpu MHz' /proc/cpuinfo cpu MHz : 3300.000 cpu MHz : 2500.000 cpu MHz : 2500.000 cpu MHz : 2500.000 cpu MHz : 2500.000 cpu MHz : 2500.000 [ repeat 15 times ] cpu MHz : 3300.000 cpu MHz : 3300.000 [ repeat 17 times ] $ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_max_freq | sort -n | uniq -c 20 2501000 20 3300000 > > Thanks, > Srinivas > >> Pratyush Yadav (2): >> acpi: processor: allow fixing up the frequency for a performance >> state >> cpufreq: intel_pstate: use acpi perflib to update turbo frequency >> >> drivers/acpi/processor_perflib.c | 40 >> ++++++++++++++++++++++++++++++++ >> drivers/cpufreq/intel_pstate.c | 5 ++-- >> include/acpi/processor.h | 2 ++ >> 3 files changed, 45 insertions(+), 2 deletions(-) >> >> -- >> 2.38.1 >> >
On Thu, Dec 22 2022, Rafael J. Wysocki wrote: > On Wed, Dec 21, 2022 at 4:52 PM Pratyush Yadav <ptyadav@amazon.de> wrote: >> >> In some cases the ACPI table can have an incorrect frequency populated >> for a performance state. For example, in Intel platforms, the Turbo >> frequency is just listed as +1 MHz above the max non-turbo frequency. > > Which is a known convention based on compatibility with some older OSes. Interesting. I did not know that. > >> The frequency can actually go much higher based on various factors like >> temperature, voltage, etc. > > It can. > >> Allow drivers like intel_pstate to fix up performance state frequencies >> with the actual maximum value. > > Why do you want to do that? To be able to use my processors at the full frequency they are capable of. See [0] for more details. [0] https://lore.kernel.org/linux-pm/mafs0k02jd8oh.fsf_-_@dev-dsk-ptyadav-1c-37607b33.eu-west-1.amazon.com/ > [...]
Hi Pratyush, On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: > > Hi Srinivas, > > On Wed, Dec 21 2022, srinivas pandruvada wrote: > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: > > > When a processor is brought offline and online again, it is > > > unable to > > > use Turbo mode because the _PSS table does not contain the whole > > > turbo > > > frequency range, but only +1 MHz above the max non-turbo > > > frequency. > > > This > > > causes problems when ACPI processor driver tries to set frequency > > > constraints. See patch 2 for more details. > > > I can reproduce on a Broadwell server platform. But not on a client system with acpi_ppc usage. Need to check what change broke this. Thanks, Srinivas > > > > Thanks, > > Srinivas > > > > > Pratyush Yadav (2): > > > acpi: processor: allow fixing up the frequency for a > > > performance > > > state > > > cpufreq: intel_pstate: use acpi perflib to update turbo > > > frequency > > > > > > drivers/acpi/processor_perflib.c | 40 > > > ++++++++++++++++++++++++++++++++ > > > drivers/cpufreq/intel_pstate.c | 5 ++-- > > > include/acpi/processor.h | 2 ++ > > > 3 files changed, 45 insertions(+), 2 deletions(-) > > > > > > -- > > > 2.38.1 > > > > > >
On Fri, 2022-12-23 at 10:10 -0800, srinivas pandruvada wrote: > Hi Pratyush, > > On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: > > > > Hi Srinivas, > > > > On Wed, Dec 21 2022, srinivas pandruvada wrote: > > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: > > > > When a processor is brought offline and online again, it is > > > > unable to > > > > use Turbo mode because the _PSS table does not contain the whole > > > > turbo > > > > frequency range, but only +1 MHz above the max non-turbo > > > > frequency. > > > > This > > > > causes problems when ACPI processor driver tries to set frequency > > > > constraints. See patch 2 for more details. > > > > > I can reproduce on a Broadwell server platform. But not on a client > system with acpi_ppc usage. > > Need to check what change broke this. When PPC limits enforcement changed to PM QOS, this broke. Previously acpi_processor_get_platform_limit() was not enforcing any limits. It was just setting variable. So any update done after acpi_register_performance_state() call to pr->performance- >states[ppc].core_frequency, was effective. We don't really need to call ret = freq_qos_update_request(&pr->perflib_req, pr->performance->states[ppc].core_frequency * 1000); if the PPC is not changed. When PPC is changed, this gets called again, so then we can call the above function to update cpufreq limit. The below change fixed for me. diff --git a/drivers/acpi/processor_perflib.c b/drivers/acpi/processor_perflib.c index 757a98f6d7a2..c6ced89c00dd 100644 --- a/drivers/acpi/processor_perflib.c +++ b/drivers/acpi/processor_perflib.c @@ -75,6 +75,11 @@ static int acpi_processor_get_platform_limit(struct acpi_processor *pr) pr_debug("CPU %d: _PPC is %d - frequency %s limited\n", pr->id, (int)ppc, ppc ? "" : "not"); + if (ppc == pr->performance_platform_limit) { + pr_debug("CPU %d: _PPC is %d - frequency not changed\n", pr->id, ppc); + return 0; + } + pr->performance_platform_limit = (int)ppc; if (ppc >= pr->performance->state_count || Thanks, Srinivas > > Thanks, > Srinivas > > > > > > > Thanks, > > > Srinivas > > > > > > > Pratyush Yadav (2): > > > > acpi: processor: allow fixing up the frequency for a > > > > performance > > > > state > > > > cpufreq: intel_pstate: use acpi perflib to update turbo > > > > frequency > > > > > > > > drivers/acpi/processor_perflib.c | 40 > > > > ++++++++++++++++++++++++++++++++ > > > > drivers/cpufreq/intel_pstate.c | 5 ++-- > > > > include/acpi/processor.h | 2 ++ > > > > 3 files changed, 45 insertions(+), 2 deletions(-) > > > > > > > > -- > > > > 2.38.1 > > > > > > > > > >
Hi Srinivas, On Sat, Dec 24 2022, srinivas pandruvada wrote: > On Fri, 2022-12-23 at 10:10 -0800, srinivas pandruvada wrote: >> Hi Pratyush, >> >> On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: >> > >> > Hi Srinivas, >> > >> > On Wed, Dec 21 2022, srinivas pandruvada wrote: >> > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: >> > > > When a processor is brought offline and online again, it is >> > > > unable to >> > > > use Turbo mode because the _PSS table does not contain the whole >> > > > turbo >> > > > frequency range, but only +1 MHz above the max non-turbo >> > > > frequency. >> > > > This >> > > > causes problems when ACPI processor driver tries to set frequency >> > > > constraints. See patch 2 for more details. >> > > > >> I can reproduce on a Broadwell server platform. But not on a client >> system with acpi_ppc usage. >> >> Need to check what change broke this. > > When PPC limits enforcement changed to PM QOS, this broke. Previously > acpi_processor_get_platform_limit() was not enforcing any limits. It > was just setting variable. So any update done after > acpi_register_performance_state() call to pr->performance- >>states[ppc].core_frequency, was effective. > > We don't really need to call > ret = freq_qos_update_request(&pr->perflib_req, > pr->performance->states[ppc].core_frequency * > 1000); > > if the PPC is not changed. When PPC is changed, this gets called again, > so then we can call the above function to update cpufreq limit. > > The below change fixed for me. Right. Should I re-roll my patches with your diff below then? Or do you think my patches should be good to merge as-is? > > diff --git a/drivers/acpi/processor_perflib.c > b/drivers/acpi/processor_perflib.c > index 757a98f6d7a2..c6ced89c00dd 100644 > --- a/drivers/acpi/processor_perflib.c > +++ b/drivers/acpi/processor_perflib.c > @@ -75,6 +75,11 @@ static int acpi_processor_get_platform_limit(struct > acpi_processor *pr) > pr_debug("CPU %d: _PPC is %d - frequency %s limited\n", pr->id, > (int)ppc, ppc ? "" : "not"); > > + if (ppc == pr->performance_platform_limit) { > + pr_debug("CPU %d: _PPC is %d - frequency not > changed\n", pr->id, ppc); > + return 0; > + } > + > pr->performance_platform_limit = (int)ppc; > > if (ppc >= pr->performance->state_count || >
On Tue, Dec 27, 2022 at 4:38 PM Pratyush Yadav <ptyadav@amazon.de> wrote: > > Hi Srinivas, > > On Sat, Dec 24 2022, srinivas pandruvada wrote: > > > On Fri, 2022-12-23 at 10:10 -0800, srinivas pandruvada wrote: > >> Hi Pratyush, > >> > >> On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: > >> > > >> > Hi Srinivas, > >> > > >> > On Wed, Dec 21 2022, srinivas pandruvada wrote: > >> > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: > >> > > > When a processor is brought offline and online again, it is > >> > > > unable to > >> > > > use Turbo mode because the _PSS table does not contain the whole > >> > > > turbo > >> > > > frequency range, but only +1 MHz above the max non-turbo > >> > > > frequency. > >> > > > This > >> > > > causes problems when ACPI processor driver tries to set frequency > >> > > > constraints. See patch 2 for more details. > >> > > > > >> I can reproduce on a Broadwell server platform. But not on a client > >> system with acpi_ppc usage. > >> > >> Need to check what change broke this. > > > > When PPC limits enforcement changed to PM QOS, this broke. Previously > > acpi_processor_get_platform_limit() was not enforcing any limits. It > > was just setting variable. So any update done after > > acpi_register_performance_state() call to pr->performance- > >>states[ppc].core_frequency, was effective. > > > > We don't really need to call > > ret = freq_qos_update_request(&pr->perflib_req, > > pr->performance->states[ppc].core_frequency * > > 1000); > > > > if the PPC is not changed. When PPC is changed, this gets called again, > > so then we can call the above function to update cpufreq limit. > > > > The below change fixed for me. > > Right. Should I re-roll my patches with your diff below then? Or do you > think my patches should be good to merge as-is? No, they are not good to merge.
On Wed, Dec 21, 2022 at 4:52 PM Pratyush Yadav <ptyadav@amazon.de> wrote: > > When a processor is brought offline and online again, it is unable to > use Turbo mode because the _PSS table does not contain the whole turbo > frequency range, but only +1 MHz above the max non-turbo frequency. That's because of the way P-state limits in the turbo range are handled by the given processor. Some of them restrict the P-state even if the limit is located within the turbo range and some of them don't (that is, requesting any P-state in the turbo range gives the processor a license to use the whole of it). > This causes problems when ACPI processor driver tries to set frequency > constraints. The problem is that acpi_processor_get_platform_limit() sets the limit to the frequency for all of the _PSS states including the last special one and it should update the QoS to "no limit" in that case.
On Tue, 2022-12-27 at 16:38 +0100, Pratyush Yadav wrote: > Hi Srinivas, > > On Sat, Dec 24 2022, srinivas pandruvada wrote: > > > On Fri, 2022-12-23 at 10:10 -0800, srinivas pandruvada wrote: > > > Hi Pratyush, > > > > > > On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: > > > > > > > > Hi Srinivas, > > > > > > > > On Wed, Dec 21 2022, srinivas pandruvada wrote: > > > > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: > > > > > > When a processor is brought offline and online again, it is > > > > > > unable to > > > > > > use Turbo mode because the _PSS table does not contain the > > > > > > whole > > > > > > turbo > > > > > > frequency range, but only +1 MHz above the max non-turbo > > > > > > frequency. > > > > > > This > > > > > > causes problems when ACPI processor driver tries to set > > > > > > frequency > > > > > > constraints. See patch 2 for more details. > > > > > > > > > I can reproduce on a Broadwell server platform. But not on a > > > client > > > system with acpi_ppc usage. > > > > > > Need to check what change broke this. > > > > When PPC limits enforcement changed to PM QOS, this broke. > > Previously > > acpi_processor_get_platform_limit() was not enforcing any limits. > > It > > was just setting variable. So any update done after > > acpi_register_performance_state() call to pr->performance- > > > states[ppc].core_frequency, was effective. > > > > We don't really need to call > > ret = freq_qos_update_request(&pr->perflib_req, > > pr->performance->states[ppc].core_frequency > > * > > 1000); > > > > if the PPC is not changed. When PPC is changed, this gets called > > again, > > so then we can call the above function to update cpufreq limit. > > > > The below change fixed for me. > > Right. I think, this is the only change you require to fix this. In addition set pr->performance_platform_limit = 0 in acpi_processor_unregister_performance(). Thanks, Srinivas > Should I re-roll my patches with your diff below then? Or do you > think my patches should be good to merge as-is? > > > > > diff --git a/drivers/acpi/processor_perflib.c > > b/drivers/acpi/processor_perflib.c > > index 757a98f6d7a2..c6ced89c00dd 100644 > > --- a/drivers/acpi/processor_perflib.c > > +++ b/drivers/acpi/processor_perflib.c > > @@ -75,6 +75,11 @@ static int > > acpi_processor_get_platform_limit(struct > > acpi_processor *pr) > > pr_debug("CPU %d: _PPC is %d - frequency %s limited\n", pr- > > >id, > > (int)ppc, ppc ? "" : "not"); > > > > + if (ppc == pr->performance_platform_limit) { > > + pr_debug("CPU %d: _PPC is %d - frequency not > > changed\n", pr->id, ppc); > > + return 0; > > + } > > + > > pr->performance_platform_limit = (int)ppc; > > > > if (ppc >= pr->performance->state_count || > > >
On Tue, Dec 27, 2022 at 5:40 PM srinivas pandruvada <srinivas.pandruvada@linux.intel.com> wrote: > > On Tue, 2022-12-27 at 16:38 +0100, Pratyush Yadav wrote: > > Hi Srinivas, > > > > On Sat, Dec 24 2022, srinivas pandruvada wrote: > > > > > On Fri, 2022-12-23 at 10:10 -0800, srinivas pandruvada wrote: > > > > Hi Pratyush, > > > > > > > > On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: > > > > > > > > > > Hi Srinivas, > > > > > > > > > > On Wed, Dec 21 2022, srinivas pandruvada wrote: > > > > > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: > > > > > > > When a processor is brought offline and online again, it is > > > > > > > unable to > > > > > > > use Turbo mode because the _PSS table does not contain the > > > > > > > whole > > > > > > > turbo > > > > > > > frequency range, but only +1 MHz above the max non-turbo > > > > > > > frequency. > > > > > > > This > > > > > > > causes problems when ACPI processor driver tries to set > > > > > > > frequency > > > > > > > constraints. See patch 2 for more details. > > > > > > > > > > > I can reproduce on a Broadwell server platform. But not on a > > > > client > > > > system with acpi_ppc usage. > > > > > > > > Need to check what change broke this. > > > > > > When PPC limits enforcement changed to PM QOS, this broke. > > > Previously > > > acpi_processor_get_platform_limit() was not enforcing any limits. > > > It > > > was just setting variable. So any update done after > > > acpi_register_performance_state() call to pr->performance- > > > > states[ppc].core_frequency, was effective. > > > > > > We don't really need to call > > > ret = freq_qos_update_request(&pr->perflib_req, > > > pr->performance->states[ppc].core_frequency > > > * > > > 1000); > > > > > > if the PPC is not changed. When PPC is changed, this gets called > > > again, > > > so then we can call the above function to update cpufreq limit. > > > > > > The below change fixed for me. > > > > Right. > I think, this is the only change you require to fix this. In addition > set pr->performance_platform_limit = 0 in > acpi_processor_unregister_performance(). Not really, because if the limit is set to a lower frequency and then reset to _PSS[0], it needs to be set back to "no limit". I'll send a patch for that in a while.
On Tuesday, December 27, 2022 6:02:50 PM CET Rafael J. Wysocki wrote: > On Tue, Dec 27, 2022 at 5:40 PM srinivas pandruvada > <srinivas.pandruvada@linux.intel.com> wrote: > > > > On Tue, 2022-12-27 at 16:38 +0100, Pratyush Yadav wrote: > > > Hi Srinivas, > > > > > > On Sat, Dec 24 2022, srinivas pandruvada wrote: > > > > > > > On Fri, 2022-12-23 at 10:10 -0800, srinivas pandruvada wrote: > > > > > Hi Pratyush, > > > > > > > > > > On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: > > > > > > > > > > > > Hi Srinivas, > > > > > > > > > > > > On Wed, Dec 21 2022, srinivas pandruvada wrote: > > > > > > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: > > > > > > > > When a processor is brought offline and online again, it is > > > > > > > > unable to > > > > > > > > use Turbo mode because the _PSS table does not contain the > > > > > > > > whole > > > > > > > > turbo > > > > > > > > frequency range, but only +1 MHz above the max non-turbo > > > > > > > > frequency. > > > > > > > > This > > > > > > > > causes problems when ACPI processor driver tries to set > > > > > > > > frequency > > > > > > > > constraints. See patch 2 for more details. > > > > > > > > > > > > > I can reproduce on a Broadwell server platform. But not on a > > > > > client > > > > > system with acpi_ppc usage. > > > > > > > > > > Need to check what change broke this. > > > > > > > > When PPC limits enforcement changed to PM QOS, this broke. > > > > Previously > > > > acpi_processor_get_platform_limit() was not enforcing any limits. > > > > It > > > > was just setting variable. So any update done after > > > > acpi_register_performance_state() call to pr->performance- > > > > > states[ppc].core_frequency, was effective. > > > > > > > > We don't really need to call > > > > ret = freq_qos_update_request(&pr->perflib_req, > > > > pr->performance->states[ppc].core_frequency > > > > * > > > > 1000); > > > > > > > > if the PPC is not changed. When PPC is changed, this gets called > > > > again, > > > > so then we can call the above function to update cpufreq limit. > > > > > > > > The below change fixed for me. > > > > > > Right. > > I think, this is the only change you require to fix this. In addition > > set pr->performance_platform_limit = 0 in > > acpi_processor_unregister_performance(). > > Not really, because if the limit is set to a lower frequency and then > reset to _PSS[0], it needs to be set back to "no limit". > > I'll send a patch for that in a while. This has not been tested beyond compilation, so please be careful. --- drivers/acpi/processor_perflib.c | 27 +++++++++++++++++++++------ 1 file changed, 21 insertions(+), 6 deletions(-) Index: linux-pm/drivers/acpi/processor_perflib.c =================================================================== --- linux-pm.orig/drivers/acpi/processor_perflib.c +++ linux-pm/drivers/acpi/processor_perflib.c @@ -53,6 +53,8 @@ static int acpi_processor_get_platform_l { acpi_status status = 0; unsigned long long ppc = 0; + s32 qos_value; + int index; int ret; if (!pr) @@ -72,17 +74,30 @@ static int acpi_processor_get_platform_l } } + index = ppc; + + if (pr->performance_platform_limit == index || + ppc >= pr->performance->state_count) + return 0; + pr_debug("CPU %d: _PPC is %d - frequency %s limited\n", pr->id, - (int)ppc, ppc ? "" : "not"); + index, index ? "is" : "is not"); - pr->performance_platform_limit = (int)ppc; + pr->performance_platform_limit = index; - if (ppc >= pr->performance->state_count || - unlikely(!freq_qos_request_active(&pr->perflib_req))) + if (unlikely(!freq_qos_request_active(&pr->perflib_req))) return 0; - ret = freq_qos_update_request(&pr->perflib_req, - pr->performance->states[ppc].core_frequency * 1000); + /* + * If _PPC returns 0, it means that all of the available states can be + * used ("no limit"). + */ + if (index == 0) + qos_value = FREQ_QOS_MAX_DEFAULT_VALUE; + else + qos_value = pr->performance->states[index].core_frequency * 1000; + + ret = freq_qos_update_request(&pr->perflib_req, qos_value); if (ret < 0) { pr_warn("Failed to update perflib freq constraint: CPU%d (%d)\n", pr->id, ret);
On Tue, 2022-12-27 at 18:02 +0100, Rafael J. Wysocki wrote: > On Tue, Dec 27, 2022 at 5:40 PM srinivas pandruvada > <srinivas.pandruvada@linux.intel.com> wrote: > > > > On Tue, 2022-12-27 at 16:38 +0100, Pratyush Yadav wrote: > > > Hi Srinivas, > > > > > > On Sat, Dec 24 2022, srinivas pandruvada wrote: > > > > > > > On Fri, 2022-12-23 at 10:10 -0800, srinivas pandruvada wrote: > > > > > Hi Pratyush, > > > > > > > > > > On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: > > > > > > > > > > > > Hi Srinivas, > > > > > > > > > > > > On Wed, Dec 21 2022, srinivas pandruvada wrote: > > > > > > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: > > > > > > > > When a processor is brought offline and online again, > > > > > > > > it is > > > > > > > > unable to > > > > > > > > use Turbo mode because the _PSS table does not contain > > > > > > > > the > > > > > > > > whole > > > > > > > > turbo > > > > > > > > frequency range, but only +1 MHz above the max non- > > > > > > > > turbo > > > > > > > > frequency. > > > > > > > > This > > > > > > > > causes problems when ACPI processor driver tries to set > > > > > > > > frequency > > > > > > > > constraints. See patch 2 for more details. > > > > > > > > > > > > > I can reproduce on a Broadwell server platform. But not on a > > > > > client > > > > > system with acpi_ppc usage. > > > > > > > > > > Need to check what change broke this. > > > > > > > > When PPC limits enforcement changed to PM QOS, this broke. > > > > Previously > > > > acpi_processor_get_platform_limit() was not enforcing any > > > > limits. > > > > It > > > > was just setting variable. So any update done after > > > > acpi_register_performance_state() call to pr->performance- > > > > > states[ppc].core_frequency, was effective. > > > > > > > > We don't really need to call > > > > ret = freq_qos_update_request(&pr->perflib_req, > > > > pr->performance- > > > > >states[ppc].core_frequency > > > > * > > > > 1000); > > > > > > > > if the PPC is not changed. When PPC is changed, this gets > > > > called > > > > again, > > > > so then we can call the above function to update cpufreq limit. > > > > > > > > The below change fixed for me. > > > > > > Right. > > I think, this is the only change you require to fix this. In > > addition > > set pr->performance_platform_limit = 0 in > > acpi_processor_unregister_performance(). > > Not really, because if the limit is set to a lower frequency and then > reset to _PSS[0], it needs to be set back to "no limit". > If PPC becomes 0 again after ppc > 0 during dynamic PPC change, pr- >performance_platform_limit will not match current PPC, so will set to PPC 0 performance ( which is already patched by driver after return from acpi_register_performance_state()). But fine, you can always set freq qos to FREQ_QOS_MAX_DEFAULT_VALUE for PPC 0 as you are doing in your patch. Thanks, Srinivas > I'll send a patch for that in a while.
On Tue, Dec 27, 2022 at 7:07 PM srinivas pandruvada <srinivas.pandruvada@linux.intel.com> wrote: > > On Tue, 2022-12-27 at 18:02 +0100, Rafael J. Wysocki wrote: > > On Tue, Dec 27, 2022 at 5:40 PM srinivas pandruvada > > <srinivas.pandruvada@linux.intel.com> wrote: > > > > > > On Tue, 2022-12-27 at 16:38 +0100, Pratyush Yadav wrote: > > > > Hi Srinivas, > > > > > > > > On Sat, Dec 24 2022, srinivas pandruvada wrote: > > > > > > > > > On Fri, 2022-12-23 at 10:10 -0800, srinivas pandruvada wrote: > > > > > > Hi Pratyush, > > > > > > > > > > > > On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: > > > > > > > > > > > > > > Hi Srinivas, > > > > > > > > > > > > > > On Wed, Dec 21 2022, srinivas pandruvada wrote: > > > > > > > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav wrote: > > > > > > > > > When a processor is brought offline and online again, > > > > > > > > > it is > > > > > > > > > unable to > > > > > > > > > use Turbo mode because the _PSS table does not contain > > > > > > > > > the > > > > > > > > > whole > > > > > > > > > turbo > > > > > > > > > frequency range, but only +1 MHz above the max non- > > > > > > > > > turbo > > > > > > > > > frequency. > > > > > > > > > This > > > > > > > > > causes problems when ACPI processor driver tries to set > > > > > > > > > frequency > > > > > > > > > constraints. See patch 2 for more details. > > > > > > > > > > > > > > > I can reproduce on a Broadwell server platform. But not on a > > > > > > client > > > > > > system with acpi_ppc usage. > > > > > > > > > > > > Need to check what change broke this. > > > > > > > > > > When PPC limits enforcement changed to PM QOS, this broke. > > > > > Previously > > > > > acpi_processor_get_platform_limit() was not enforcing any > > > > > limits. > > > > > It > > > > > was just setting variable. So any update done after > > > > > acpi_register_performance_state() call to pr->performance- > > > > > > states[ppc].core_frequency, was effective. > > > > > > > > > > We don't really need to call > > > > > ret = freq_qos_update_request(&pr->perflib_req, > > > > > pr->performance- > > > > > >states[ppc].core_frequency > > > > > * > > > > > 1000); > > > > > > > > > > if the PPC is not changed. When PPC is changed, this gets > > > > > called > > > > > again, > > > > > so then we can call the above function to update cpufreq limit. > > > > > > > > > > The below change fixed for me. > > > > > > > > Right. > > > I think, this is the only change you require to fix this. In > > > addition > > > set pr->performance_platform_limit = 0 in > > > acpi_processor_unregister_performance(). > > > > Not really, because if the limit is set to a lower frequency and then > > reset to _PSS[0], it needs to be set back to "no limit". > > > > If PPC becomes 0 again after ppc > 0 during dynamic PPC change, pr- > >performance_platform_limit will not match current PPC, so will set to > PPC 0 performance ( which is already patched by driver after return > from acpi_register_performance_state()). I see. > But fine, you can always set freq qos to FREQ_QOS_MAX_DEFAULT_VALUE for > PPC 0 as you are doing in your patch. I think that using the "no limit" value to represent the "no limit" condition makes sense. Also, I'm wondering if the patching of states[0].core_frequency will still be necessary after this change.
On Tue, 2022-12-27 at 19:47 +0100, Rafael J. Wysocki wrote: > On Tue, Dec 27, 2022 at 7:07 PM srinivas pandruvada > <srinivas.pandruvada@linux.intel.com> wrote: > > > > On Tue, 2022-12-27 at 18:02 +0100, Rafael J. Wysocki wrote: > > > On Tue, Dec 27, 2022 at 5:40 PM srinivas pandruvada > > > <srinivas.pandruvada@linux.intel.com> wrote: > > > > > > > > On Tue, 2022-12-27 at 16:38 +0100, Pratyush Yadav wrote: > > > > > Hi Srinivas, > > > > > > > > > > On Sat, Dec 24 2022, srinivas pandruvada wrote: > > > > > > > > > > > On Fri, 2022-12-23 at 10:10 -0800, srinivas pandruvada > > > > > > wrote: > > > > > > > Hi Pratyush, > > > > > > > > > > > > > > On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: > > > > > > > > > > > > > > > > Hi Srinivas, > > > > > > > > > > > > > > > > On Wed, Dec 21 2022, srinivas pandruvada wrote: > > > > > > > > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav > > > > > > > > > wrote: > > > > > > > > > > When a processor is brought offline and online > > > > > > > > > > again, > > > > > > > > > > it is > > > > > > > > > > unable to > > > > > > > > > > use Turbo mode because the _PSS table does not > > > > > > > > > > contain > > > > > > > > > > the > > > > > > > > > > whole > > > > > > > > > > turbo > > > > > > > > > > frequency range, but only +1 MHz above the max non- > > > > > > > > > > turbo > > > > > > > > > > frequency. > > > > > > > > > > This > > > > > > > > > > causes problems when ACPI processor driver tries to > > > > > > > > > > set > > > > > > > > > > frequency > > > > > > > > > > constraints. See patch 2 for more details. > > > > > > > > > > > > > > > > > I can reproduce on a Broadwell server platform. But not > > > > > > > on a > > > > > > > client > > > > > > > system with acpi_ppc usage. > > > > > > > > > > > > > > Need to check what change broke this. > > > > > > > > > > > > When PPC limits enforcement changed to PM QOS, this broke. > > > > > > Previously > > > > > > acpi_processor_get_platform_limit() was not enforcing any > > > > > > limits. > > > > > > It > > > > > > was just setting variable. So any update done after > > > > > > acpi_register_performance_state() call to pr->performance- > > > > > > > states[ppc].core_frequency, was effective. > > > > > > > > > > > > We don't really need to call > > > > > > ret = freq_qos_update_request(&pr->perflib_req, > > > > > > pr->performance- > > > > > > > states[ppc].core_frequency > > > > > > * > > > > > > 1000); > > > > > > > > > > > > if the PPC is not changed. When PPC is changed, this gets > > > > > > called > > > > > > again, > > > > > > so then we can call the above function to update cpufreq > > > > > > limit. > > > > > > > > > > > > The below change fixed for me. > > > > > > > > > > Right. > > > > I think, this is the only change you require to fix this. In > > > > addition > > > > set pr->performance_platform_limit = 0 in > > > > acpi_processor_unregister_performance(). > > > > > > Not really, because if the limit is set to a lower frequency and > > > then > > > reset to _PSS[0], it needs to be set back to "no limit". > > > > > > > If PPC becomes 0 again after ppc > 0 during dynamic PPC change, pr- > > > performance_platform_limit will not match current PPC, so will > > > set to > > PPC 0 performance ( which is already patched by driver after return > > from acpi_register_performance_state()). > > I see. > > > But fine, you can always set freq qos to FREQ_QOS_MAX_DEFAULT_VALUE > > for > > PPC 0 as you are doing in your patch. > > I think that using the "no limit" value to represent the "no limit" > condition makes sense. Agree. > > Also, I'm wondering if the patching of states[0].core_frequency will > still be necessary after this change. I don't think so. We can remove the patching.
On Tue, Dec 27, 2022 at 7:49 PM srinivas pandruvada <srinivas.pandruvada@linux.intel.com> wrote: > > On Tue, 2022-12-27 at 19:47 +0100, Rafael J. Wysocki wrote: > > On Tue, Dec 27, 2022 at 7:07 PM srinivas pandruvada > > <srinivas.pandruvada@linux.intel.com> wrote: > > > > > > On Tue, 2022-12-27 at 18:02 +0100, Rafael J. Wysocki wrote: > > > > On Tue, Dec 27, 2022 at 5:40 PM srinivas pandruvada > > > > <srinivas.pandruvada@linux.intel.com> wrote: > > > > > > > > > > On Tue, 2022-12-27 at 16:38 +0100, Pratyush Yadav wrote: > > > > > > Hi Srinivas, > > > > > > > > > > > > On Sat, Dec 24 2022, srinivas pandruvada wrote: > > > > > > > > > > > > > On Fri, 2022-12-23 at 10:10 -0800, srinivas pandruvada > > > > > > > wrote: > > > > > > > > Hi Pratyush, > > > > > > > > > > > > > > > > On Thu, 2022-12-22 at 11:39 +0100, Pratyush Yadav wrote: > > > > > > > > > > > > > > > > > > Hi Srinivas, > > > > > > > > > > > > > > > > > > On Wed, Dec 21 2022, srinivas pandruvada wrote: > > > > > > > > > > On Wed, 2022-12-21 at 16:52 +0100, Pratyush Yadav > > > > > > > > > > wrote: > > > > > > > > > > > When a processor is brought offline and online > > > > > > > > > > > again, > > > > > > > > > > > it is > > > > > > > > > > > unable to > > > > > > > > > > > use Turbo mode because the _PSS table does not > > > > > > > > > > > contain > > > > > > > > > > > the > > > > > > > > > > > whole > > > > > > > > > > > turbo > > > > > > > > > > > frequency range, but only +1 MHz above the max non- > > > > > > > > > > > turbo > > > > > > > > > > > frequency. > > > > > > > > > > > This > > > > > > > > > > > causes problems when ACPI processor driver tries to > > > > > > > > > > > set > > > > > > > > > > > frequency > > > > > > > > > > > constraints. See patch 2 for more details. > > > > > > > > > > > > > > > > > > > I can reproduce on a Broadwell server platform. But not > > > > > > > > on a > > > > > > > > client > > > > > > > > system with acpi_ppc usage. > > > > > > > > > > > > > > > > Need to check what change broke this. > > > > > > > > > > > > > > When PPC limits enforcement changed to PM QOS, this broke. > > > > > > > Previously > > > > > > > acpi_processor_get_platform_limit() was not enforcing any > > > > > > > limits. > > > > > > > It > > > > > > > was just setting variable. So any update done after > > > > > > > acpi_register_performance_state() call to pr->performance- > > > > > > > > states[ppc].core_frequency, was effective. > > > > > > > > > > > > > > We don't really need to call > > > > > > > ret = freq_qos_update_request(&pr->perflib_req, > > > > > > > pr->performance- > > > > > > > > states[ppc].core_frequency > > > > > > > * > > > > > > > 1000); > > > > > > > > > > > > > > if the PPC is not changed. When PPC is changed, this gets > > > > > > > called > > > > > > > again, > > > > > > > so then we can call the above function to update cpufreq > > > > > > > limit. > > > > > > > > > > > > > > The below change fixed for me. > > > > > > > > > > > > Right. > > > > > I think, this is the only change you require to fix this. In > > > > > addition > > > > > set pr->performance_platform_limit = 0 in > > > > > acpi_processor_unregister_performance(). > > > > > > > > Not really, because if the limit is set to a lower frequency and > > > > then > > > > reset to _PSS[0], it needs to be set back to "no limit". > > > > > > > > > > If PPC becomes 0 again after ppc > 0 during dynamic PPC change, pr- > > > > performance_platform_limit will not match current PPC, so will > > > > set to > > > PPC 0 performance ( which is already patched by driver after return > > > from acpi_register_performance_state()). > > > > I see. > > > > > But fine, you can always set freq qos to FREQ_QOS_MAX_DEFAULT_VALUE > > > for > > > PPC 0 as you are doing in your patch. > > > > I think that using the "no limit" value to represent the "no limit" > > condition makes sense. > Agree. > > > > > Also, I'm wondering if the patching of states[0].core_frequency will > > still be necessary after this change. > > I don't think so. We can remove the patching. OK, let me cut the patches.