Message ID | 041258d65883df964890249a24d2a4788c419304.1547078153.git.amit.kucheria@linaro.org |
---|---|
State | New |
Headers | show |
Series | None | expand |
Quoting Amit Kucheria (2019-01-09 16:00:55) > 75 degrees is too aggressive for throttling the CPU. After speaking to > Qualcomm engineers, increase it to 95 degrees. > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > --- > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- Is the plan that these are some defaults that would be adjusted by board variants? Just curious why we have anything in here and don't punt it all to each board dts file.
On Wed, Jan 09, 2019 at 05:15:33PM -0800, Matthias Kaehlcke wrote: > Hi Amit, > > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote: > > 75 degrees is too aggressive for throttling the CPU. After speaking to > > Qualcomm engineers, increase it to 95 degrees. > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > --- > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- > > 1 file changed, 8 insertions(+), 8 deletions(-) > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > index c27cbd3bcb0a..29e823b0caf4 100644 > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > @@ -1692,7 +1692,7 @@ > > > > trips { > > cpu_alert0: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1713,7 +1713,7 @@ > > > > trips { > > cpu_alert1: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1734,7 +1734,7 @@ > > > > trips { > > cpu_alert2: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1755,7 +1755,7 @@ > > > > trips { > > cpu_alert3: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1776,7 +1776,7 @@ > > > > trips { > > cpu_alert4: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1797,7 +1797,7 @@ > > > > trips { > > cpu_alert5: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1818,7 +1818,7 @@ > > > > trips { > > cpu_alert6: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > @@ -1839,7 +1839,7 @@ > > > > trips { > > cpu_alert7: trip0 { > > - temperature = <75000>; > > + temperature = <95000>; > > hysteresis = <2000>; > > type = "passive"; > > }; > > The change itself looks good to me, however I wonder if it would be > worth to eliminate redundancy and merge the current 8 thermal zones > into 2, one for the Silver and one for the Gold cluster (as done by > http://crrev.com/c/1381752). There is a single cooling device for > each cluster, so it's not clear to me if there is any gain from having > a separate thermal zone for each CPU. If it is important to monitor > the temperatures of the individual cores this can still be done by > configuring the thermal zone of the cluster with multiple thermal > sensors. I see your idea is to have a cooling device per CPU ("arm64: dts: sdm845: wireup the thermal trip points to cpufreq" / https://lore.kernel.org/patchwork/patch/1030742/), however that doesn't work as intended. Only two cpufreq 'devices' are created, one for CPU0 and one for CPU4. In consequence cpufreq->ready() only runs for these cores and only two cooling devices are registered. Since the cores of a cluster all run at the same frequency I also doubt if having multiple cooling devices would bring any benefits. Cheers Matthias
On Fri, Jan 11, 2019 at 01:15:09AM +0530, Amit Kucheria wrote: > On Thu, Jan 10, 2019 at 7:45 AM Matthias Kaehlcke <mka@chromium.org> wrote: > > > > On Wed, Jan 09, 2019 at 05:15:33PM -0800, Matthias Kaehlcke wrote: > > > Hi Amit, > > > > > > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote: > > > > 75 degrees is too aggressive for throttling the CPU. After speaking to > > > > Qualcomm engineers, increase it to 95 degrees. > > > > > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > > > --- > > > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- > > > > 1 file changed, 8 insertions(+), 8 deletions(-) > > > > > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > > index c27cbd3bcb0a..29e823b0caf4 100644 > > > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > > @@ -1692,7 +1692,7 @@ > > > > > > > > trips { > > > > cpu_alert0: trip0 { > > > > - temperature = <75000>; > > > > + temperature = <95000>; > > > > hysteresis = <2000>; > > > > type = "passive"; > > > > }; > > > > @@ -1713,7 +1713,7 @@ > > > > > > > > trips { > > > > cpu_alert1: trip0 { > > > > - temperature = <75000>; > > > > + temperature = <95000>; > > > > hysteresis = <2000>; > > > > type = "passive"; > > > > }; > > > > @@ -1734,7 +1734,7 @@ > > > > > > > > trips { > > > > cpu_alert2: trip0 { > > > > - temperature = <75000>; > > > > + temperature = <95000>; > > > > hysteresis = <2000>; > > > > type = "passive"; > > > > }; > > > > @@ -1755,7 +1755,7 @@ > > > > > > > > trips { > > > > cpu_alert3: trip0 { > > > > - temperature = <75000>; > > > > + temperature = <95000>; > > > > hysteresis = <2000>; > > > > type = "passive"; > > > > }; > > > > @@ -1776,7 +1776,7 @@ > > > > > > > > trips { > > > > cpu_alert4: trip0 { > > > > - temperature = <75000>; > > > > + temperature = <95000>; > > > > hysteresis = <2000>; > > > > type = "passive"; > > > > }; > > > > @@ -1797,7 +1797,7 @@ > > > > > > > > trips { > > > > cpu_alert5: trip0 { > > > > - temperature = <75000>; > > > > + temperature = <95000>; > > > > hysteresis = <2000>; > > > > type = "passive"; > > > > }; > > > > @@ -1818,7 +1818,7 @@ > > > > > > > > trips { > > > > cpu_alert6: trip0 { > > > > - temperature = <75000>; > > > > + temperature = <95000>; > > > > hysteresis = <2000>; > > > > type = "passive"; > > > > }; > > > > @@ -1839,7 +1839,7 @@ > > > > > > > > trips { > > > > cpu_alert7: trip0 { > > > > - temperature = <75000>; > > > > + temperature = <95000>; > > > > hysteresis = <2000>; > > > > type = "passive"; > > > > }; > > > > > > The change itself looks good to me, however I wonder if it would be > > > worth to eliminate redundancy and merge the current 8 thermal zones > > > into 2, one for the Silver and one for the Gold cluster (as done by > > > http://crrev.com/c/1381752). There is a single cooling device for > > > each cluster, so it's not clear to me if there is any gain from having > > > a separate thermal zone for each CPU. If it is important to monitor > > > the temperatures of the individual cores this can still be done by > > > configuring the thermal zone of the cluster with multiple thermal > > > sensors. > > > > I see your idea is to have a cooling device per CPU ("arm64: dts: > > sdm845: wireup the thermal trip points to cpufreq" / > > https://lore.kernel.org/patchwork/patch/1030742/), however that > > doesn't work as intended. Only two cpufreq 'devices' are created, > > one for CPU0 and one for CPU4. In consequence cpufreq->ready() only > > runs for these cores and only two cooling devices are > > registered. Since the cores of a cluster all run at the same > > frequency I also doubt if having multiple cooling devices would > > bring any benefits. > > I actually only intended for two cooling devices - one for each > frequency domain. I'll clarify it better in the patch description. Viresh helped me understand that we currently need to add cooling device entries for all CPUs to the DT, even though at most one will be active per freq domain at any time (I wonder if this could be changed though). Independently of the cooling devices, is there any value in having a thermal zone for each CPU instead of having only one per freq domain? Thanks Matthias
On Thu, Jan 10, 2019 at 5:59 AM Stephen Boyd <swboyd@chromium.org> wrote: > > Quoting Amit Kucheria (2019-01-09 16:00:55) > > 75 degrees is too aggressive for throttling the CPU. After speaking to > > Qualcomm engineers, increase it to 95 degrees. > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > --- > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- > > Is the plan that these are some defaults that would be adjusted by board > variants? Just curious why we have anything in here and don't punt it > all to each board dts file. These values are meant to be safe values for silicon operation as characterised[1] by the HW team. So I'd consider them a more than just default values. It also gives future boards a safe starting point. IMHO it would be better to have the characterized values merged upstream and if really required, those values can be tweaked in the board-specific DTS. [1] Caveat: The characterisation probably focused on mobile devices and might change depending on form-factor, active cooling and heat dissipation design but those differences should only impact the skin temperature, not the operation of the SoC itself.
On 10-01-19, 12:00, Matthias Kaehlcke wrote: > Viresh helped me understand that we currently need to add cooling > device entries for all CPUs to the DT, even though at most one will be > active per freq domain at any time (I wonder if this could be changed > though). Actually we were only adding cooling-cells in CPU0 until now and I fixed that, so that is going to stay :) The idea is that the hardware should be described properly and not partially. Even if all the CPUs are part of the same freq-domain, they are all capable of being a cooling device here and the DT should describe that. Kernel will ofcourse create a single cooling device. -- viresh
On Fri, Jan 11, 2019 at 03:54:23PM +0530, Amit Kucheria wrote: > On Thu, Jan 10, 2019 at 6:45 AM Matthias Kaehlcke <mka@chromium.org> wrote: > > > > Hi Amit, > > > > On Thu, Jan 10, 2019 at 05:30:55AM +0530, Amit Kucheria wrote: > > > 75 degrees is too aggressive for throttling the CPU. After speaking to > > > Qualcomm engineers, increase it to 95 degrees. > > > > > > Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> > > > --- > > > arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- > > > 1 file changed, 8 insertions(+), 8 deletions(-) > > > > > > diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > index c27cbd3bcb0a..29e823b0caf4 100644 > > > --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi > > > @@ -1692,7 +1692,7 @@ > > > > > > trips { > > > cpu_alert0: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1713,7 +1713,7 @@ > > > > > > trips { > > > cpu_alert1: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1734,7 +1734,7 @@ > > > > > > trips { > > > cpu_alert2: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1755,7 +1755,7 @@ > > > > > > trips { > > > cpu_alert3: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1776,7 +1776,7 @@ > > > > > > trips { > > > cpu_alert4: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1797,7 +1797,7 @@ > > > > > > trips { > > > cpu_alert5: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1818,7 +1818,7 @@ > > > > > > trips { > > > cpu_alert6: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > @@ -1839,7 +1839,7 @@ > > > > > > trips { > > > cpu_alert7: trip0 { > > > - temperature = <75000>; > > > + temperature = <95000>; > > > hysteresis = <2000>; > > > type = "passive"; > > > }; > > > > The change itself looks good to me, however I wonder if it would be > > worth to eliminate redundancy and merge the current 8 thermal zones > > into 2, one for the Silver and one for the Gold cluster (as done by > > http://crrev.com/c/1381752). There is a single cooling device for > > each cluster, so it's not clear to me if there is any gain from having > > a separate thermal zone for each CPU. If it is important to monitor > > the temperatures of the individual cores this can still be done by > > configuring the thermal zone of the cluster with multiple thermal > > sensors. > > Reducing the number of thermal zones to 2 (by grouping 4 sensors per > zone) is not possible due a limitation of the thermal framework[1]. It > is something that we want to address. Previous attempts to fix this > were rejected for various reasons. Eduardo was going to share a way to > have more flexible mapping between sensors and zones after discussions > at LPC. I wasn't aware of this limitation, thanks for the clarification! With this I understand that for now we indeed need the 8 thermal zones with all the redundant information :( > <nag> Eduardo, do you have anything we can review? </nag> :-) > > Having said that, we'll need some aggregation functions when we add > multiple sensors to a zone (e.g. max, mean) to reflect the zone. This > will lose information about hotspots and prevent things like idle > injection on a particular CPU that is causing most of the heat in the > aggregated zone. So IMHO, it might be useful to have information about > the hotspots (i.e TZ per sensor) and aggregated values (ambient > temperature) that can be fed to the thermal policy. Ok, it seems for now we need the 8 thermal zones in any case, when support for multiple sensors becomes available we can evaluate whether it's worth to change that or not. Cheers Matthias
diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi index c27cbd3bcb0a..29e823b0caf4 100644 --- a/arch/arm64/boot/dts/qcom/sdm845.dtsi +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi @@ -1692,7 +1692,7 @@ trips { cpu_alert0: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1713,7 +1713,7 @@ trips { cpu_alert1: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1734,7 +1734,7 @@ trips { cpu_alert2: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1755,7 +1755,7 @@ trips { cpu_alert3: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1776,7 +1776,7 @@ trips { cpu_alert4: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1797,7 +1797,7 @@ trips { cpu_alert5: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1818,7 +1818,7 @@ trips { cpu_alert6: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; }; @@ -1839,7 +1839,7 @@ trips { cpu_alert7: trip0 { - temperature = <75000>; + temperature = <95000>; hysteresis = <2000>; type = "passive"; };
75 degrees is too aggressive for throttling the CPU. After speaking to Qualcomm engineers, increase it to 95 degrees. Signed-off-by: Amit Kucheria <amit.kucheria@linaro.org> --- arch/arm64/boot/dts/qcom/sdm845.dtsi | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) -- 2.17.1