Message ID | 20231123121433.12089-1-sumitg@nvidia.com |
---|---|
State | New |
Headers | show |
Series | [v7] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241 | expand |
On 23/11/23 19:30, Sudeep Holla wrote: > External email: Use caution opening links or attachments > > > On Thu, Nov 23, 2023 at 05:44:33PM +0530, Sumit Gupta wrote: >> From: Srikar Srimath Tirumala <srikars@nvidia.com> >> >> Current implementation of processor_thermal performs software throttling >> in fixed steps of "20%" which can be too coarse for some platforms. >> We observed some performance gain after reducing the throttle percentage. >> Change the CPUFREQ thermal reduction percentage and maximum thermal steps >> to be configurable. Also, update the default values of both for Nvidia >> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%" >> and accordingly the maximum number of thermal steps are increased as they >> are derived from the reduction percentage. >> >> Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com> >> Co-developed-by: Sumit Gupta <sumitg@nvidia.com> >> Signed-off-by: Sumit Gupta <sumitg@nvidia.com> >> --- >> >> Sending this patch separately as the other patch in the series is >> applied by Rafael in v6[1]. Revision history before this version is >> in the cover letter of v6[1]. >> >> Please review and provide ACK if looks fine. >> > > For arm64 specific changes(a minor nit below though), > > Acked-by: Sudeep Holla <sudeep.holla@arm.com> > > > [...] > Thank you. >> diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c >> new file mode 100644 >> index 000000000000..d524f2cd6044 >> --- /dev/null >> +++ b/drivers/acpi/arm64/thermal_cpufreq.c >> @@ -0,0 +1,20 @@ >> +// SPDX-License-Identifier: GPL-2.0-only >> +#include <linux/acpi.h> >> + >> +#include "../internal.h" >> + >> +#define SMCCC_SOC_ID_T241 0x036b0241 >> + > > [nit] We really need to find better place to define this globally and not > locally at each usage site like this. We already have it in GICv3 driver. > But that can come as a cleanup later if it causes issue for merging this > change. > > -- > Regards, > Sudeep Sure, will check and send a separate patch later on top. Best Regards, Sumit Gupta
>>> From: Srikar Srimath Tirumala <srikars@nvidia.com> >>> >>> Current implementation of processor_thermal performs software throttling >>> in fixed steps of "20%" which can be too coarse for some platforms. >>> We observed some performance gain after reducing the throttle >>> percentage. >>> Change the CPUFREQ thermal reduction percentage and maximum thermal >>> steps >>> to be configurable. Also, update the default values of both for Nvidia >>> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to >>> "5%" >>> and accordingly the maximum number of thermal steps are increased as >>> they >>> are derived from the reduction percentage. >>> >>> Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com> >>> Co-developed-by: Sumit Gupta <sumitg@nvidia.com> >>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com> >>> --- >>> >>> Sending this patch separately as the other patch in the series is >>> applied by Rafael in v6[1]. Revision history before this version is >>> in the cover letter of v6[1]. >>> >>> Please review and provide ACK if looks fine. >>> >> >> For arm64 specific changes(a minor nit below though), >> >> Acked-by: Sudeep Holla <sudeep.holla@arm.com> >> >> >> [...] >> >>> diff --git a/drivers/acpi/arm64/thermal_cpufreq.c >>> b/drivers/acpi/arm64/thermal_cpufreq.c >>> new file mode 100644 >>> index 000000000000..d524f2cd6044 >>> --- /dev/null >>> +++ b/drivers/acpi/arm64/thermal_cpufreq.c >>> @@ -0,0 +1,20 @@ >>> +// SPDX-License-Identifier: GPL-2.0-only >>> +#include <linux/acpi.h> >>> + >>> +#include "../internal.h" >>> + >>> +#define SMCCC_SOC_ID_T241 0x036b0241 >>> + >> >> [nit] We really need to find better place to define this globally and not >> locally at each usage site like this. We already have it in GICv3 driver. >> But that can come as a cleanup later if it causes issue for merging this >> change. > > Agreed. > > Looks good to me as well. > > Acked-by: Hanjun Guo <guohanjun@huawei.com> > Hi Rafael, If the change seems fine, could you please apply this patch? Thanks, Sumit
On Thu, Nov 23, 2023 at 3:32 PM Hanjun Guo <guohanjun@huawei.com> wrote: > > On 2023/11/23 22:00, Sudeep Holla wrote: > > On Thu, Nov 23, 2023 at 05:44:33PM +0530, Sumit Gupta wrote: > >> From: Srikar Srimath Tirumala <srikars@nvidia.com> > >> > >> Current implementation of processor_thermal performs software throttling > >> in fixed steps of "20%" which can be too coarse for some platforms. > >> We observed some performance gain after reducing the throttle percentage. > >> Change the CPUFREQ thermal reduction percentage and maximum thermal steps > >> to be configurable. Also, update the default values of both for Nvidia > >> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%" > >> and accordingly the maximum number of thermal steps are increased as they > >> are derived from the reduction percentage. > >> > >> Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com> > >> Co-developed-by: Sumit Gupta <sumitg@nvidia.com> > >> Signed-off-by: Sumit Gupta <sumitg@nvidia.com> > >> --- > >> > >> Sending this patch separately as the other patch in the series is > >> applied by Rafael in v6[1]. Revision history before this version is > >> in the cover letter of v6[1]. > >> > >> Please review and provide ACK if looks fine. > >> > > > > For arm64 specific changes(a minor nit below though), > > > > Acked-by: Sudeep Holla <sudeep.holla@arm.com> Applied as 6.8 material, thanks!
diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile index 143debc1ba4a..726944648c9b 100644 --- a/drivers/acpi/arm64/Makefile +++ b/drivers/acpi/arm64/Makefile @@ -5,3 +5,4 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o obj-$(CONFIG_ACPI_APMT) += apmt.o obj-$(CONFIG_ARM_AMBA) += amba.o obj-y += dma.o init.o +obj-y += thermal_cpufreq.o diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c new file mode 100644 index 000000000000..d524f2cd6044 --- /dev/null +++ b/drivers/acpi/arm64/thermal_cpufreq.c @@ -0,0 +1,20 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include <linux/acpi.h> + +#include "../internal.h" + +#define SMCCC_SOC_ID_T241 0x036b0241 + +int acpi_arch_thermal_cpufreq_pctg(void) +{ + s32 soc_id = arm_smccc_get_soc_id_version(); + + /* + * Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) and + * reduce the CPUFREQ Thermal reduction percentage to 5%. + */ + if (soc_id == SMCCC_SOC_ID_T241) + return 5; + + return 0; +} diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h index 8cd2d15dec11..bcef1ce1aed9 100644 --- a/drivers/acpi/internal.h +++ b/drivers/acpi/internal.h @@ -90,6 +90,15 @@ int acpi_passive_trip_temp(struct acpi_device *adev, int *ret_temp); int acpi_hot_trip_temp(struct acpi_device *adev, int *ret_temp); int acpi_critical_trip_temp(struct acpi_device *adev, int *ret_temp); +#ifdef CONFIG_ARM64 +int acpi_arch_thermal_cpufreq_pctg(void); +#else +static inline int acpi_arch_thermal_cpufreq_pctg(void) +{ + return 0; +} +#endif + /* -------------------------------------------------------------------------- Device Node Initialization / Removal -------------------------------------------------------------------------- */ diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c index b7c6287eccca..1219adb11ab9 100644 --- a/drivers/acpi/processor_thermal.c +++ b/drivers/acpi/processor_thermal.c @@ -17,6 +17,8 @@ #include <acpi/processor.h> #include <linux/uaccess.h> +#include "internal.h" + #ifdef CONFIG_CPU_FREQ /* If a passive cooling situation is detected, primarily CPUfreq is used, as it @@ -26,12 +28,21 @@ */ #define CPUFREQ_THERMAL_MIN_STEP 0 -#define CPUFREQ_THERMAL_MAX_STEP 3 -static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg); +static int cpufreq_thermal_max_step __read_mostly = 3; + +/* + * Minimum throttle percentage for processor_thermal cooling device. + * The processor_thermal driver uses it to calculate the percentage amount by + * which cpu frequency must be reduced for each cooling state. This is also used + * to calculate the maximum number of throttling steps or cooling states. + */ +static int cpufreq_thermal_reduction_pctg __read_mostly = 20; -#define reduction_pctg(cpu) \ - per_cpu(cpufreq_thermal_reduction_pctg, phys_package_first_cpu(cpu)) +static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_step); + +#define reduction_step(cpu) \ + per_cpu(cpufreq_thermal_reduction_step, phys_package_first_cpu(cpu)) /* * Emulate "per package data" using per cpu data (which should really be @@ -71,7 +82,7 @@ static int cpufreq_get_max_state(unsigned int cpu) if (!cpu_has_cpufreq(cpu)) return 0; - return CPUFREQ_THERMAL_MAX_STEP; + return cpufreq_thermal_max_step; } static int cpufreq_get_cur_state(unsigned int cpu) @@ -79,7 +90,7 @@ static int cpufreq_get_cur_state(unsigned int cpu) if (!cpu_has_cpufreq(cpu)) return 0; - return reduction_pctg(cpu); + return reduction_step(cpu); } static int cpufreq_set_cur_state(unsigned int cpu, int state) @@ -92,7 +103,7 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state) if (!cpu_has_cpufreq(cpu)) return 0; - reduction_pctg(cpu) = state; + reduction_step(cpu) = state; /* * Update all the CPUs in the same package because they all @@ -113,7 +124,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state) if (!policy) return -EINVAL; - max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100; + max_freq = (policy->cpuinfo.max_freq * + (100 - reduction_step(i) * cpufreq_thermal_reduction_pctg)) / 100; cpufreq_cpu_put(policy); @@ -126,10 +138,29 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state) return 0; } +static void acpi_thermal_cpufreq_config(void) +{ + int cpufreq_pctg = acpi_arch_thermal_cpufreq_pctg(); + + if (!cpufreq_pctg) + return; + + cpufreq_thermal_reduction_pctg = cpufreq_pctg; + + /* + * Derive the MAX_STEP from minimum throttle percentage so that the reduction + * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that + * the CPU performance doesn't become 0. + */ + cpufreq_thermal_max_step = (100 / cpufreq_pctg) - 2; +} + void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy) { unsigned int cpu; + acpi_thermal_cpufreq_config(); + for_each_cpu(cpu, policy->related_cpus) { struct acpi_processor *pr = per_cpu(processors, cpu); int ret; @@ -190,7 +221,7 @@ static int acpi_processor_max_state(struct acpi_processor *pr) /* * There exists four states according to - * cpufreq_thermal_reduction_pctg. 0, 1, 2, 3 + * cpufreq_thermal_reduction_step. 0, 1, 2, 3 */ max_state += cpufreq_get_max_state(pr->id); if (pr->flags.throttling)