[v3,00/18] amd-pstate cleanups

Message ID 20250217220707.1468365-1-superm1@kernel.org

Message

Mario Limonciello Feb. 17, 2025, 10:06 p.m. UTC
From: Mario Limonciello <mario.limonciello@amd.com>

This series overhauls locking and drops many unnecessarily cached
variables.

Debugging messages are also dropped in favor of more ftracing.

This series is based on the superm1/linux.git bleeding-edge branch.

v2->v3:
 * Mostly pick up tags
 * Add new patch 1/18 that fixes a kernel Bugzilla (KBZ) issue (see the patch for details)
 * Fixup for min_freq issue in dropping cached values patch
 * Fixup for unit tests to only run on online CPUs

Mario Limonciello (18):
  cpufreq/amd-pstate: Invalidate cppc_req_cached during suspend
  cpufreq/amd-pstate: Show a warning when a CPU fails to setup
  cpufreq/amd-pstate: Drop min and max cached frequencies
  cpufreq/amd-pstate: Move perf values into a union
  cpufreq/amd-pstate: Overhaul locking
  cpufreq/amd-pstate: Drop `cppc_cap1_cached`
  cpufreq/amd-pstate-ut: Use _free macro to free put policy
  cpufreq/amd-pstate-ut: Allow lowest nonlinear and lowest to be the
    same
  cpufreq/amd-pstate-ut: Drop SUCCESS and FAIL enums
  cpufreq/amd-pstate-ut: Run on all of the correct CPUs
  cpufreq/amd-pstate-ut: Adjust variable scope for
    amd_pstate_ut_check_freq()
  cpufreq/amd-pstate: Replace all AMD_CPPC_* macros with masks
  cpufreq/amd-pstate: Cache CPPC request in shared mem case too
  cpufreq/amd-pstate: Move all EPP tracing into *_update_perf and
    *_set_epp functions
  cpufreq/amd-pstate: Update cppc_req_cached for shared mem EPP writes
  cpufreq/amd-pstate: Drop debug statements for policy setting
  cpufreq/amd-pstate: Rework CPPC enabling
  cpufreq/amd-pstate: Stop caching EPP

 arch/x86/include/asm/msr-index.h   |  20 +-
 arch/x86/kernel/acpi/cppc.c        |   4 +-
 drivers/cpufreq/amd-pstate-trace.h |  13 +-
 drivers/cpufreq/amd-pstate-ut.c    | 209 +++++-----
 drivers/cpufreq/amd-pstate.c       | 592 ++++++++++++++---------------
 drivers/cpufreq/amd-pstate.h       |  61 +--
 6 files changed, 428 insertions(+), 471 deletions(-)

Comments

Gautham R. Shenoy Feb. 19, 2025, 5:25 a.m. UTC | #1
On Mon, Feb 17, 2025 at 04:06:52PM -0600, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
> 
> Use the perf_to_freq helpers to calculate this on the fly.
> 
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

LGTM.

Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
Dhananjay Ugwekar Feb. 19, 2025, 6:12 a.m. UTC | #2
On 2/18/2025 3:36 AM, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
> 
> During resume it's possible the firmware didn't restore the CPPC request MSR
> but the kernel thinks the values line up. This leads to incorrect performance
> after resume from suspend.
> 
> To fix the issue invalidate the cached value at suspend. During resume use
> the saved values programmed as cached limits.
> 
> Reported-by: Miroslav Pavleski <miroslav@pavleski.net>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>  drivers/cpufreq/amd-pstate.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
> index f425fb7ec77d7..12fb63169a24c 100644
> --- a/drivers/cpufreq/amd-pstate.c
> +++ b/drivers/cpufreq/amd-pstate.c
> @@ -1611,7 +1611,7 @@ static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
>  					  max_perf, policy->boost_enabled);
>  	}
>  
> -	return amd_pstate_update_perf(cpudata, 0, 0, max_perf, cpudata->epp_cached, false);
> +	return amd_pstate_epp_update_limit(policy);

Can we also add the check "if (policy->min != cpudata->min_limit_freq || policy->max != cpudata->max_limit_freq)"
in "amd_pstate_epp_update_limit()" before calling "amd_pstate_update_min_max_limit()". I think it would help in 
avoiding some unnecessary calls to the update_min_max_limit() function.
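
For illustration, the suggested guard would look like this (a sketch only,
mirroring the helper names already used in this series):

	/* skip recomputing the limits when the policy is unchanged */
	if (policy->min != cpudata->min_limit_freq ||
	    policy->max != cpudata->max_limit_freq)
		amd_pstate_update_min_max_limit(policy);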

Patch looks good to me otherwise.

Thanks,
Dhananjay

>  }
>  
>  static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
> @@ -1660,6 +1660,9 @@ static int amd_pstate_epp_suspend(struct cpufreq_policy *policy)
>  	if (cppc_state != AMD_PSTATE_ACTIVE)
>  		return 0;
>  
> +	/* invalidate to ensure it's rewritten during resume */
> +	cpudata->cppc_req_cached = 0;
> +
>  	/* set this flag to avoid setting core offline*/
>  	cpudata->suspended = true;
>
Dhananjay Ugwekar Feb. 19, 2025, 6:37 a.m. UTC | #3
On 2/19/2025 11:42 AM, Dhananjay Ugwekar wrote:
> On 2/18/2025 3:36 AM, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> During resume it's possible the firmware didn't restore the CPPC request MSR
>> but the kernel thinks the values line up. This leads to incorrect performance
>> after resume from suspend.
>>
>> To fix the issue invalidate the cached value at suspend. During resume use
>> the saved values programmed as cached limits.
>>
>> Reported-by: Miroslav Pavleski <miroslav@pavleski.net>
>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>>  drivers/cpufreq/amd-pstate.c | 5 ++++-
>>  1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index f425fb7ec77d7..12fb63169a24c 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -1611,7 +1611,7 @@ static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
>>  					  max_perf, policy->boost_enabled);
>>  	}
>>  
>> -	return amd_pstate_update_perf(cpudata, 0, 0, max_perf, cpudata->epp_cached, false);
>> +	return amd_pstate_epp_update_limit(policy);
> 
> Can we also add the check "if (policy->min != cpudata->min_limit_freq || policy->max != cpudata->max_limit_freq)"
> in "amd_pstate_epp_update_limit()" before calling "amd_pstate_update_min_max_limit()". I think it would help in 
> avoiding some unnecessary calls to the update_min_max_limit() function.

You can ignore this, I see that you have handled it in the 3rd patch.

Reviewed-by: Dhananjay Ugwekar <dhananjay.ugwekar@amd.com>

> 
> Patch looks good to me otherwise.
> 
> Thanks,
> Dhananjay
> 
>>  }
>>  
>>  static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
>> @@ -1660,6 +1660,9 @@ static int amd_pstate_epp_suspend(struct cpufreq_policy *policy)
>>  	if (cppc_state != AMD_PSTATE_ACTIVE)
>>  		return 0;
>>  
>> +	/* invalidate to ensure it's rewritten during resume */
>> +	cpudata->cppc_req_cached = 0;
>> +
>>  	/* set this flag to avoid setting core offline*/
>>  	cpudata->suspended = true;
>>  
>
Mario Limonciello Feb. 19, 2025, 5:21 p.m. UTC | #4
On 2/18/2025 23:24, Gautham R. Shenoy wrote:
> Hello Mario,
> 
> 
> On Mon, Feb 17, 2025 at 04:06:50PM -0600, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> During resume it's possible the firmware didn't restore the CPPC request MSR
>> but the kernel thinks the values line up. This leads to incorrect performance
>> after resume from suspend.
>>
>> To fix the issue invalidate the cached value at suspend. During resume use
>> the saved values programmed as cached limits.
>>
>> Reported-by: Miroslav Pavleski <miroslav@pavleski.net>
>> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217931
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>>   drivers/cpufreq/amd-pstate.c | 5 ++++-
>>   1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index f425fb7ec77d7..12fb63169a24c 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -1611,7 +1611,7 @@ static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
>>   					  max_perf, policy->boost_enabled);
>>   	}
> 
> You can also remove the tracing code from amd_pstate_epp_reenable(), i.e,
> 
> 	if (trace_amd_pstate_epp_perf_enabled()) {
> 		trace_amd_pstate_epp_perf(cpudata->cpu, cpudata->highest_perf,
> 					  cpudata->epp_cached,
> 					  FIELD_GET(AMD_CPPC_MIN_PERF_MASK, cpudata->cppc_req_cached),
> 					  max_perf, policy->boost_enabled);
> 	}
> 
> Since amd_pstate_epp_update_limit() also has the tracing code.

Yeah; the tracing code gets updated later in the series.
My plan is for this commit to be a minimal fix that goes to 6.14; the 
rest will land in 6.15.

> 
> The patch looks good to me otherwise.
> 
> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
> 

Thanks!
Mario Limonciello Feb. 19, 2025, 5:29 p.m. UTC | #5
On 2/19/2025 02:00, Dhananjay Ugwekar wrote:
> On 2/18/2025 3:36 AM, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> Use the perf_to_freq helpers to calculate this on the fly.
> 
> Can we call out the below change (in the -1550,7 +1525,8 code chunk) in
> the commit message, or split it out to different patch
> 
> * Adding the if check in amd_pstate_epp_update_limit()
> 
Yeah, will add to the commit message.

>>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> --
> We need one more hyphen here I think, i.e. "---"; otherwise the version info
> shows up in the commit message.
Ack
> 
>> v3:
>>   * Fix calc error for min_freq
>> v2:
>>   * Keep cached limits
>> ---
>>   drivers/cpufreq/amd-pstate-ut.c | 14 +++----
>>   drivers/cpufreq/amd-pstate.c    | 70 +++++++++++----------------------
>>   drivers/cpufreq/amd-pstate.h    |  9 +----
>>   3 files changed, 32 insertions(+), 61 deletions(-)
>>
>> diff --git a/drivers/cpufreq/amd-pstate-ut.c b/drivers/cpufreq/amd-pstate-ut.c
>> index 3a0a380c3590c..445278cf40b61 100644
>> --- a/drivers/cpufreq/amd-pstate-ut.c
>> +++ b/drivers/cpufreq/amd-pstate-ut.c
>> @@ -214,14 +214,14 @@ static void amd_pstate_ut_check_freq(u32 index)
>>   			break;
>>   		cpudata = policy->driver_data;
>>   
>> -		if (!((cpudata->max_freq >= cpudata->nominal_freq) &&
>> +		if (!((policy->cpuinfo.max_freq >= cpudata->nominal_freq) &&
>>   			(cpudata->nominal_freq > cpudata->lowest_nonlinear_freq) &&
>> -			(cpudata->lowest_nonlinear_freq > cpudata->min_freq) &&
>> -			(cpudata->min_freq > 0))) {
>> +			(cpudata->lowest_nonlinear_freq > policy->cpuinfo.min_freq) &&
>> +			(policy->cpuinfo.min_freq > 0))) {
>>   			amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
>>   			pr_err("%s cpu%d max=%d >= nominal=%d > lowest_nonlinear=%d > min=%d > 0, the formula is incorrect!\n",
>> -				__func__, cpu, cpudata->max_freq, cpudata->nominal_freq,
>> -				cpudata->lowest_nonlinear_freq, cpudata->min_freq);
>> +				__func__, cpu, policy->cpuinfo.max_freq, cpudata->nominal_freq,
>> +				cpudata->lowest_nonlinear_freq, policy->cpuinfo.min_freq);
>>   			goto skip_test;
>>   		}
>>   
>> @@ -233,13 +233,13 @@ static void amd_pstate_ut_check_freq(u32 index)
>>   		}
>>   
>>   		if (cpudata->boost_supported) {
>> -			if ((policy->max == cpudata->max_freq) ||
>> +			if ((policy->max == policy->cpuinfo.max_freq) ||
>>   					(policy->max == cpudata->nominal_freq))
>>   				amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_PASS;
>>   			else {
>>   				amd_pstate_ut_cases[index].result = AMD_PSTATE_UT_RESULT_FAIL;
>>   				pr_err("%s cpu%d policy_max=%d should be equal cpu_max=%d or cpu_nominal=%d !\n",
>> -					__func__, cpu, policy->max, cpudata->max_freq,
>> +					__func__, cpu, policy->max, policy->cpuinfo.max_freq,
>>   					cpudata->nominal_freq);
>>   				goto skip_test;
>>   			}
>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>> index 87c605348a3dc..a7c41f915b46e 100644
>> --- a/drivers/cpufreq/amd-pstate.c
>> +++ b/drivers/cpufreq/amd-pstate.c
>> @@ -717,7 +717,7 @@ static int amd_pstate_cpu_boost_update(struct cpufreq_policy *policy, bool on)
>>   	int ret = 0;
>>   
>>   	nominal_freq = READ_ONCE(cpudata->nominal_freq);
>> -	max_freq = READ_ONCE(cpudata->max_freq);
>> +	max_freq = perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf));
>>   
>>   	if (on)
>>   		policy->cpuinfo.max_freq = max_freq;
>> @@ -901,35 +901,26 @@ static u32 amd_pstate_get_transition_latency(unsigned int cpu)
>>   static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
>>   {
>>   	int ret;
>> -	u32 min_freq, max_freq;
>> -	u32 nominal_freq, lowest_nonlinear_freq;
>> +	u32 min_freq, nominal_freq, lowest_nonlinear_freq;
>>   	struct cppc_perf_caps cppc_perf;
>>   
>>   	ret = cppc_get_perf_caps(cpudata->cpu, &cppc_perf);
>>   	if (ret)
>>   		return ret;
>>   
>> -	if (quirks && quirks->lowest_freq)
>> -		min_freq = quirks->lowest_freq;
>> -	else
>> -		min_freq = cppc_perf.lowest_freq;
>> -
>>   	if (quirks && quirks->nominal_freq)
>>   		nominal_freq = quirks->nominal_freq;
>>   	else
>>   		nominal_freq = cppc_perf.nominal_freq;
>>   
>> -	min_freq *= 1000;
>>   	nominal_freq *= 1000;
>> -
>>   	WRITE_ONCE(cpudata->nominal_freq, nominal_freq);
>> -	WRITE_ONCE(cpudata->min_freq, min_freq);
>> -
>> -	max_freq = perf_to_freq(cpudata, cpudata->highest_perf);
>> -	lowest_nonlinear_freq = perf_to_freq(cpudata, cpudata->lowest_nonlinear_perf);
>>   
>> -	WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
>> -	WRITE_ONCE(cpudata->max_freq, max_freq);
>> +	if (quirks && quirks->lowest_freq) {
> 
> We can avoid the "{" for a single-line if statement, to keep checkpatch happy

This is one of the reasons that I don't like to treat checkpatch as 
gospel.  I added it specifically to avoid back and forth for the new 
patch.  In this case I plan to keep it for that reason.
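
For reference, the checkpatch-clean form of that hunk would be (identical
behavior, braces dropped from the single-statement branches):

	if (quirks && quirks->lowest_freq)
		min_freq = quirks->lowest_freq;
	else
		min_freq = cppc_perf.lowest_freq;
	min_freq *= 1000;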

> 
>> +		min_freq = quirks->lowest_freq;
>> +	} else
>> +		min_freq = cppc_perf.lowest_freq;
>> +	min_freq *= 1000;
> 
> I see that this min_freq part of the code is unchanged, just moved a few
> lines below. If the move is unintended, can we avoid it so that the diff
> stays minimal?

Sure, I'll see if I can re-order it a bit to keep it the same.

> 
>>   
>>   	/**
>>   	 * Below values need to be initialized correctly, otherwise driver will fail to load
>> @@ -937,12 +928,15 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
>>   	 * lowest_nonlinear_freq is a value between [min_freq, nominal_freq]
>>   	 * Check _CPC in ACPI table objects if any values are incorrect
>>   	 */
>> -	if (min_freq <= 0 || max_freq <= 0 || nominal_freq <= 0 || min_freq > max_freq) {
> 
> Shouldn't we retain these sanity checks for min_freq and max_freq?

Now that we're using the helpers, isn't it impossible to end up with 
negative values?
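
For context, the helper in question looks roughly like this (signature
inferred from the call sites above; the exact scaling math is paraphrased):

	static u32 perf_to_freq(struct amd_cpudata *cpudata, u8 perf)
	{
		/* scale a perf level to kHz against the nominal point */
		return DIV_ROUND_UP_ULL((u64)cpudata->nominal_freq * perf,
					cpudata->nominal_perf);
	}

Everything is unsigned, so the old "< 0" checks on the cached int values no
longer have an equivalent failure mode.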

> 
> Thanks,
> Dhananjay
> 
>> -		pr_err("min_freq(%d) or max_freq(%d) or nominal_freq(%d) value is incorrect\n",
>> -			min_freq, max_freq, nominal_freq);
>> +	if (nominal_freq <= 0) {
>> +		pr_err("nominal_freq(%d) value is incorrect\n",
>> +			nominal_freq);
>>   		return -EINVAL;
>>   	}
>>   
>> +	lowest_nonlinear_freq = perf_to_freq(cpudata, cpudata->lowest_nonlinear_perf);
>> +	WRITE_ONCE(cpudata->lowest_nonlinear_freq, lowest_nonlinear_freq);
>> +
>>   	if (lowest_nonlinear_freq <= min_freq || lowest_nonlinear_freq > nominal_freq) {
>>   		pr_err("lowest_nonlinear_freq(%d) value is out of range [min_freq(%d), nominal_freq(%d)]\n",
>>   			lowest_nonlinear_freq, min_freq, nominal_freq);
>> @@ -954,9 +948,9 @@ static int amd_pstate_init_freq(struct amd_cpudata *cpudata)
>>   
>>   static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>   {
>> -	int min_freq, max_freq, ret;
>> -	struct device *dev;
>>   	struct amd_cpudata *cpudata;
>> +	struct device *dev;
>> +	int ret;
>>   
>>   	/*
>>   	 * Resetting PERF_CTL_MSR will put the CPU in P0 frequency,
>> @@ -987,17 +981,11 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>   	if (ret)
>>   		goto free_cpudata1;
>>   
>> -	min_freq = READ_ONCE(cpudata->min_freq);
>> -	max_freq = READ_ONCE(cpudata->max_freq);
>> -
>>   	policy->cpuinfo.transition_latency = amd_pstate_get_transition_latency(policy->cpu);
>>   	policy->transition_delay_us = amd_pstate_get_transition_delay_us(policy->cpu);
>>   
>> -	policy->min = min_freq;
>> -	policy->max = max_freq;
>> -
>> -	policy->cpuinfo.min_freq = min_freq;
>> -	policy->cpuinfo.max_freq = max_freq;
>> +	policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
>> +	policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
>>   
>>   	policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
>>   
>> @@ -1021,9 +1009,6 @@ static int amd_pstate_cpu_init(struct cpufreq_policy *policy)
>>   		goto free_cpudata2;
>>   	}
>>   
>> -	cpudata->max_limit_freq = max_freq;
>> -	cpudata->min_limit_freq = min_freq;
>> -
>>   	policy->driver_data = cpudata;
>>   
>>   	if (!current_pstate_driver->adjust_perf)
>> @@ -1081,14 +1066,10 @@ static int amd_pstate_cpu_suspend(struct cpufreq_policy *policy)
>>   static ssize_t show_amd_pstate_max_freq(struct cpufreq_policy *policy,
>>   					char *buf)
>>   {
>> -	int max_freq;
>>   	struct amd_cpudata *cpudata = policy->driver_data;
>>   
>> -	max_freq = READ_ONCE(cpudata->max_freq);
>> -	if (max_freq < 0)
>> -		return max_freq;
>>   
>> -	return sysfs_emit(buf, "%u\n", max_freq);
>> +	return sysfs_emit(buf, "%u\n", perf_to_freq(cpudata, READ_ONCE(cpudata->highest_perf)));
>>   }
>>   
>>   static ssize_t show_amd_pstate_lowest_nonlinear_freq(struct cpufreq_policy *policy,
>> @@ -1446,10 +1427,10 @@ static bool amd_pstate_acpi_pm_profile_undefined(void)
>>   
>>   static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>>   {
>> -	int min_freq, max_freq, ret;
>>   	struct amd_cpudata *cpudata;
>>   	struct device *dev;
>>   	u64 value;
>> +	int ret;
>>   
>>   	/*
>>   	 * Resetting PERF_CTL_MSR will put the CPU in P0 frequency,
>> @@ -1480,19 +1461,13 @@ static int amd_pstate_epp_cpu_init(struct cpufreq_policy *policy)
>>   	if (ret)
>>   		goto free_cpudata1;
>>   
>> -	min_freq = READ_ONCE(cpudata->min_freq);
>> -	max_freq = READ_ONCE(cpudata->max_freq);
>> -
>> -	policy->cpuinfo.min_freq = min_freq;
>> -	policy->cpuinfo.max_freq = max_freq;
>> +	policy->cpuinfo.min_freq = policy->min = perf_to_freq(cpudata, cpudata->lowest_perf);
>> +	policy->cpuinfo.max_freq = policy->max = perf_to_freq(cpudata, cpudata->highest_perf);
>>   	/* It will be updated by governor */
>>   	policy->cur = policy->cpuinfo.min_freq;
>>   
>>   	policy->driver_data = cpudata;
>>   
>> -	policy->min = policy->cpuinfo.min_freq;
>> -	policy->max = policy->cpuinfo.max_freq;
>> -
>>   	policy->boost_enabled = READ_ONCE(cpudata->boost_supported);
>>   
>>   	/*
>> @@ -1550,7 +1525,8 @@ static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
>>   	struct amd_cpudata *cpudata = policy->driver_data;
>>   	u8 epp;
>>   
>> -	amd_pstate_update_min_max_limit(policy);
>> +	if (policy->min != cpudata->min_limit_freq || policy->max != cpudata->max_limit_freq)
>> +		amd_pstate_update_min_max_limit(policy);
>>   
>>   	if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>>   		epp = 0;
>> diff --git a/drivers/cpufreq/amd-pstate.h b/drivers/cpufreq/amd-pstate.h
>> index 19d405c6d805e..0149933692458 100644
>> --- a/drivers/cpufreq/amd-pstate.h
>> +++ b/drivers/cpufreq/amd-pstate.h
>> @@ -46,8 +46,6 @@ struct amd_aperf_mperf {
>>    * @max_limit_perf: Cached value of the performance corresponding to policy->max
>>    * @min_limit_freq: Cached value of policy->min (in khz)
>>    * @max_limit_freq: Cached value of policy->max (in khz)
>> - * @max_freq: the frequency (in khz) that mapped to highest_perf
>> - * @min_freq: the frequency (in khz) that mapped to lowest_perf
>>    * @nominal_freq: the frequency (in khz) that mapped to nominal_perf
>>    * @lowest_nonlinear_freq: the frequency (in khz) that mapped to lowest_nonlinear_perf
>>    * @cur: Difference of Aperf/Mperf/tsc count between last and current sample
>> @@ -77,11 +75,8 @@ struct amd_cpudata {
>>   	u8	prefcore_ranking;
>>   	u8	min_limit_perf;
>>   	u8	max_limit_perf;
>> -	u32     min_limit_freq;
>> -	u32     max_limit_freq;
>> -
>> -	u32	max_freq;
>> -	u32	min_freq;
>> +	u32	min_limit_freq;
>> +	u32	max_limit_freq;
>>   	u32	nominal_freq;
>>   	u32	lowest_nonlinear_freq;
>>   
>
Mario Limonciello Feb. 19, 2025, 6:05 p.m. UTC | #6
On 2/19/2025 09:25, Gautham R. Shenoy wrote:
> On Mon, Feb 17, 2025 at 04:07:06PM -0600, Mario Limonciello wrote:
>> From: Mario Limonciello <mario.limonciello@amd.com>
>>
>> The CPPC enable register is configured as "write once".  That is
>> any future writes don't actually do anything.
>>
>> Because of this, all the cleanup paths that currently exist for
>> CPPC disable are non-effective.
>>
>> Rework CPPC enable to only enable after all the CAP registers have
>> been read to avoid enabling CPPC on CPUs with invalid _CPC or
>> unpopulated MSRs.
>>
>> As the register is write once, remove all cleanup paths as well.
>>
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>> v3:
>>   * Fixup for suspend/resume issue
>> ---
> [..snip..]
> 
>>   
>> -static int shmem_cppc_enable(bool enable)
>> +static int shmem_cppc_enable(struct cpufreq_policy *policy)
>>   {
>> -	int cpu, ret = 0;
>> +	struct amd_cpudata *cpudata = policy->driver_data;
>>   	struct cppc_perf_ctrls perf_ctrls;
>> +	int ret;
>>   
>> -	if (enable == cppc_enabled)
>> -		return 0;
>> +	ret = cppc_set_enable(cpudata->cpu, 1);
>> +	if (ret)
>> +		return ret;
>>   
>> -	for_each_present_cpu(cpu) {
>> -		ret = cppc_set_enable(cpu, enable);
>> +	/* Enable autonomous mode for EPP */
>> +	if (cppc_state == AMD_PSTATE_ACTIVE) {
>> +		/* Set desired perf as zero to allow EPP firmware control */
>> +		perf_ctrls.desired_perf = 0;
>> +		ret = cppc_set_perf(cpudata->cpu, &perf_ctrls);
>>   		if (ret)
>>   			return ret;
> 
> We don't need the if condition here. There is nothing following this
> inside the if block and the function return "ret" soon after coming
> out of this if block.
> 
> 
>> -
>> -		/* Enable autonomous mode for EPP */
>> -		if (cppc_state == AMD_PSTATE_ACTIVE) {
>> -			/* Set desired perf as zero to allow EPP firmware control */
>> -			perf_ctrls.desired_perf = 0;
>> -			ret = cppc_set_perf(cpu, &perf_ctrls);
>> -			if (ret)
>> -				return ret;
>> -		}
>>   	}
>>   
>> -	cppc_enabled = enable;
>>   	return ret;
>>   }
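
In other words, with the redundant inner check dropped, the tail of
shmem_cppc_enable() would collapse to (a sketch of the suggestion above):

	/* Enable autonomous mode for EPP */
	if (cppc_state == AMD_PSTATE_ACTIVE) {
		/* Set desired perf as zero to allow EPP firmware control */
		perf_ctrls.desired_perf = 0;
		ret = cppc_set_perf(cpudata->cpu, &perf_ctrls);
	}

	return ret;
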
>>   
>>   DEFINE_STATIC_CALL(amd_pstate_cppc_enable, msr_cppc_enable);
>>   
>> -static inline int amd_pstate_cppc_enable(bool enable)
> [..snip..]
> 
>>   
>> -static int amd_pstate_epp_reenable(struct cpufreq_policy *policy)
>> +static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
>>   {
>>   	struct amd_cpudata *cpudata = policy->driver_data;
>>   	union perf_cached perf = READ_ONCE(cpudata->perf);
>>   	int ret;
>>   
>> -	ret = amd_pstate_cppc_enable(true);
>> -	if (ret)
>> -		pr_err("failed to enable amd pstate during resume, return %d\n", ret);
>> -
>> -
>> -	return amd_pstate_epp_update_limit(policy);
>> -}
>> +	pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
>>   
>> -static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
>> -{
>> -	struct amd_cpudata *cpudata = policy->driver_data;
>> -	int ret;
>> +	ret = amd_pstate_cppc_enable(policy);
>> +	if (ret)
>> +		return ret;
>>   
>> -	pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);
>>   
>> -	ret = amd_pstate_epp_reenable(policy);
>> +	ret = amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
> 
> Previously, when a CPU came online, the callpath would be
> amd_pstate_epp_cpu_online(policy)
> --> amd_pstate_epp_reenable(policy)
>       --> amd_pstate_epp_update_limit(policy)
> 
> which reevaluates the min_perf_limit and max_perf_limit based on
> policy->min and policy->max and then calls
> 
>        amd_pstate_update_perf(policy, min_limit_perf, 0, max_limit_perf, epp, false)
> 
> With this patch, we call
> 
>        amd_pstate_update_perf(policy, 0, 0, perf.highest_perf, cpudata->epp_cached, false);
> 
> which would set CPPC.min_perf to 0.
> 
> I guess this should be ok since cpufreq_online() would eventually call
> amd_pstate_verify() and amd_pstate_epp_set_policy() which should
> re-initialize the the min_limit_perf and max_limit_perf. Though I
> haven't verified if the behaviour changes with this patch when the CPU
> is offlined and brought back online.

I'll double check by removing the amd_pstate_update_perf() call from 
amd_pstate_epp_cpu_online().  I think it will be clearer to let it get 
set from the amd_pstate_epp_set_policy() call.
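
A sketch of that direction (hypothetical until the double-checking above is
done; it assumes cpufreq_online() re-applies the limits via
amd_pstate_verify()/amd_pstate_epp_set_policy() as described earlier):

	static int amd_pstate_epp_cpu_online(struct cpufreq_policy *policy)
	{
		struct amd_cpudata *cpudata = policy->driver_data;

		pr_debug("AMD CPU Core %d going online\n", cpudata->cpu);

		/* limits and EPP are restored by the set_policy path */
		return amd_pstate_cppc_enable(policy);
	}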