Message ID | 20201104234427.26477-1-digetx@gmail.com |
---|---|
Series | Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs |
On Thu, Nov 05, 2020 at 02:43:57AM +0300, Dmitry Osipenko wrote:
> Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs, which reduces
> power consumption and heating of the Tegra chips. Tegra SoC has multiple
> hardware units which belong to a core power domain of the SoC and share
> the core voltage. The voltage must be selected in accordance to a minimum
> requirement of every core hardware unit.
[...]

Just looked briefly through the series - it looks like there is a lot of
code duplication in *_init_opp_table() functions. Could this be made
more generic / data-driven?

Best Regards
Michał Mirosław
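To make the constraint in the quoted cover letter concrete: all units in the core domain share one rail, so the rail has to sit at the highest of the per-unit minimums. The following user-space C sketch illustrates only that aggregation. The `core_unit` structure, its fields, the helper names and the microvolt numbers are hypothetical and are not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one hardware unit in the shared core domain. */
struct core_unit {
	const char *name;
	bool clk_enabled;	/* clock enable state of the unit */
	int min_uV_active;	/* minimum voltage at the current clock rate */
	int min_uV_gated;	/* minimum voltage while the clock is gated */
};

/* The unit's own minimum requirement in its current state. */
int unit_required_uV(const struct core_unit *u)
{
	return u->clk_enabled ? u->min_uV_active : u->min_uV_gated;
}

/*
 * The shared rail must satisfy every unit at once, so the target is the
 * maximum of the individual minimums.
 */
int core_required_uV(const struct core_unit *units, size_t n)
{
	int uV = 0;

	for (size_t i = 0; i < n; i++) {
		int need = unit_required_uV(&units[i]);

		if (need > uV)
			uV = need;
	}

	return uV;
}

/* Example with made-up numbers: the gated unit does not raise the rail. */
int example_core_uV(void)
{
	const struct core_unit units[] = {
		{ "host1x", true,  1000000, 950000 },
		{ "gr3d",   false, 1200000, 950000 },
		{ "sdhci",  true,  1100000, 950000 },
	};

	return core_required_uV(units, sizeof(units) / sizeof(units[0]));
}
```

In this model the gated `gr3d` entry contributes only its idle requirement, so the busiest enabled unit decides the rail voltage.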
+ Viresh

On Thu, 5 Nov 2020 at 00:44, Dmitry Osipenko <digetx@gmail.com> wrote:
>
> Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs, which reduces
> power consumption and heating of the Tegra chips. Tegra SoC has multiple
> hardware units which belong to a core power domain of the SoC and share
> the core voltage. The voltage must be selected in accordance to a minimum
> requirement of every core hardware unit.
>
> The minimum core voltage requirement depends on:
>
> 1. Clock enable state of a hardware unit.
> 2. Clock frequency.
> 3. Unit's internal idling/active state.
>
> This series is tested on Acer A500 (T20), AC100 (T20), Nexus 7 (T30) and
> Ouya (T30) devices. I also added voltage scaling to the Ventana (T20) and
> Cardhu (T30) boards which are tested by NVIDIA's CI farm. Tegra30 is now up
> to 5C cooler on Nexus 7 and stays cool on Ouya (instead of becoming burning
> hot) while system is idling. It should be possible to improve this further
> by implementing a more advanced power management features for the kernel
> drivers.
>
> The DVFS support is opt-in for all boards, meaning that older DTBs will
> continue to work like they did it before this series. It should be possible
> to easily add the core voltage scaling support for Tegra114+ SoCs based on
> this grounding work later on, if anyone will want to implement it.
>
> WARNING(!) This series is made on top of the memory interconnect patches
>            which are currently under review [1]. The Tegra EMC driver
>            and devicetree-related patches need to be applied on top of
>            the ICC series.
>
> [1] https://patchwork.ozlabs.org/project/linux-tegra/list/?series=212196
>
> Dmitry Osipenko (30):
>   dt-bindings: host1x: Document OPP and voltage regulator properties
>   dt-bindings: mmc: tegra: Document OPP and voltage regulator properties
>   dt-bindings: pwm: tegra: Document OPP and voltage regulator properties
>   media: dt: bindings: tegra-vde: Document OPP and voltage regulator
>     properties
>   dt-binding: usb: ci-hdrc-usb2: Document OPP and voltage regulator
>     properties
>   dt-bindings: usb: tegra-ehci: Document OPP and voltage regulator
>     properties
>   soc/tegra: Add sync state API
>   soc/tegra: regulators: Support Tegra SoC device sync state API
>   soc/tegra: regulators: Fix lockup when voltage-spread is out of range
>   regulator: Allow skipping disabled regulators in
>     regulator_check_consumers()
>   drm/tegra: dc: Support OPP and SoC core voltage scaling
>   drm/tegra: gr2d: Correct swapped device-tree compatibles
>   drm/tegra: gr2d: Support OPP and SoC core voltage scaling
>   drm/tegra: gr3d: Support OPP and SoC core voltage scaling
>   drm/tegra: hdmi: Support OPP and SoC core voltage scaling
>   gpu: host1x: Support OPP and SoC core voltage scaling
>   mmc: sdhci-tegra: Support OPP and core voltage scaling
>   pwm: tegra: Support OPP and core voltage scaling
>   media: staging: tegra-vde: Support OPP and SoC core voltage scaling
>   usb: chipidea: tegra: Support OPP and SoC core voltage scaling
>   usb: host: ehci-tegra: Support OPP and SoC core voltage scaling
>   memory: tegra20-emc: Support Tegra SoC device state syncing
>   memory: tegra30-emc: Support Tegra SoC device state syncing
>   ARM: tegra: Add OPP tables for Tegra20 peripheral devices
>   ARM: tegra: Add OPP tables for Tegra30 peripheral devices
>   ARM: tegra: ventana: Add voltage supplies to DVFS-capable devices
>   ARM: tegra: paz00: Add voltage supplies to DVFS-capable devices
>   ARM: tegra: acer-a500: Add voltage supplies to DVFS-capable devices
>   ARM: tegra: cardhu-a04: Add voltage supplies to DVFS-capable devices
>   ARM: tegra: nexus7: Add voltage supplies to DVFS-capable devices
>
>  .../display/tegra/nvidia,tegra20-host1x.txt   |  56 +++
>  .../bindings/media/nvidia,tegra-vde.txt       |  12 +
>  .../bindings/mmc/nvidia,tegra20-sdhci.txt     |  12 +
>  .../bindings/pwm/nvidia,tegra20-pwm.txt       |  13 +
>  .../devicetree/bindings/usb/ci-hdrc-usb2.txt  |   4 +
>  .../bindings/usb/nvidia,tegra20-ehci.txt      |   2 +
>  .../boot/dts/tegra20-acer-a500-picasso.dts    |  30 +-
>  arch/arm/boot/dts/tegra20-paz00.dts           |  40 +-
>  .../arm/boot/dts/tegra20-peripherals-opp.dtsi | 386 ++++++++++++++++
>  arch/arm/boot/dts/tegra20-ventana.dts         |  65 ++-
>  arch/arm/boot/dts/tegra20.dtsi                |  14 +
>  .../tegra30-asus-nexus7-grouper-common.dtsi   |  23 +
>  arch/arm/boot/dts/tegra30-cardhu-a04.dts      |  44 ++
>  .../arm/boot/dts/tegra30-peripherals-opp.dtsi | 415 ++++++++++++++++++
>  arch/arm/boot/dts/tegra30.dtsi                |  13 +
>  drivers/gpu/drm/tegra/Kconfig                 |   1 +
>  drivers/gpu/drm/tegra/dc.c                    | 138 +++++-
>  drivers/gpu/drm/tegra/dc.h                    |   5 +
>  drivers/gpu/drm/tegra/gr2d.c                  | 140 +++++-
>  drivers/gpu/drm/tegra/gr3d.c                  | 136 ++++++
>  drivers/gpu/drm/tegra/hdmi.c                  |  63 ++-
>  drivers/gpu/host1x/Kconfig                    |   1 +
>  drivers/gpu/host1x/dev.c                      |  87 ++++
>  drivers/memory/tegra/tegra20-emc.c            |   8 +-
>  drivers/memory/tegra/tegra30-emc.c            |   8 +-
>  drivers/mmc/host/Kconfig                      |   1 +
>  drivers/mmc/host/sdhci-tegra.c                |  70 ++-
>  drivers/pwm/Kconfig                           |   1 +
>  drivers/pwm/pwm-tegra.c                       |  84 +++-
>  drivers/regulator/core.c                      |  12 +-
>  .../soc/samsung/exynos-regulator-coupler.c    |   2 +-
>  drivers/soc/tegra/common.c                    | 152 ++++++-
>  drivers/soc/tegra/regulators-tegra20.c        |  25 +-
>  drivers/soc/tegra/regulators-tegra30.c        |  30 +-
>  drivers/staging/media/tegra-vde/Kconfig       |   1 +
>  drivers/staging/media/tegra-vde/vde.c         | 127 ++++++
>  drivers/staging/media/tegra-vde/vde.h         |   1 +
>  drivers/usb/chipidea/Kconfig                  |   1 +
>  drivers/usb/chipidea/ci_hdrc_tegra.c          |  79 ++++
>  drivers/usb/host/Kconfig                      |   1 +
>  drivers/usb/host/ehci-tegra.c                 |  79 ++++
>  include/linux/regulator/coupler.h             |   6 +-
>  include/soc/tegra/common.h                    |  22 +
>  43 files changed, 2360 insertions(+), 50 deletions(-)
>
> --
> 2.27.0
>

I need some more time to review this, but just a quick check found a
few potential issues...

The "core-supply", that you specify as a regulator for each
controller's device node, is not the way we describe power domains.
Instead, it seems like you should register a power-domain provider
(with the help of genpd) and implement the ->set_performance_state()
callback for it. Each device node should then be hooked up to this
power-domain, rather than to a "core-supply". For DT bindings, please
have a look at Documentation/devicetree/bindings/power/power-domain.yaml
and Documentation/devicetree/bindings/power/power_domain.txt.

In regards to the "sync state" problem (preventing to change
performance states until all consumers have been attached), this can
then be managed by the genpd provider driver instead.

Kind regards
Uffe
On Thu, Nov 5, 2020 at 5:15 AM Dmitry Osipenko <digetx@gmail.com> wrote:
> diff --git a/drivers/mmc/host/sdhci-tegra.c b/drivers/mmc/host/sdhci-tegra.c
> +static void sdhci_tegra_deinit_opp_table(void *data)
> +{
> +	struct device *dev = data;
> +	struct opp_table *opp_table;
> +
> +	opp_table = dev_pm_opp_get_opp_table(dev);

So you need to get an OPP table to put one :)
You need to save the pointer returned by dev_pm_opp_set_regulators() instead.

> +	dev_pm_opp_of_remove_table(dev);
> +	dev_pm_opp_put_regulators(opp_table);
> +	dev_pm_opp_put_opp_table(opp_table);
> +}
> +
> +static int devm_sdhci_tegra_init_opp_table(struct device *dev)
> +{
> +	struct opp_table *opp_table;
> +	const char *rname = "core";
> +	int err;
> +
> +	/* voltage scaling is optional */
> +	if (device_property_present(dev, "core-supply"))
> +		opp_table = dev_pm_opp_set_regulators(dev, &rname, 1);
> +	else
> +		opp_table = dev_pm_opp_get_opp_table(dev);

Nice. I didn't think that someone will end up abusing this API and so made it
available for all, but someone just did that. I will fix that in the OPP core.

Any idea why you are doing what you are doing here ?

> +
> +	if (IS_ERR(opp_table))
> +		return dev_err_probe(dev, PTR_ERR(opp_table),
> +				     "failed to prepare OPP table\n");
> +
> +	/*
> +	 * OPP table presence is optional and we want the set_rate() of OPP
> +	 * API to work similarly to clk_set_rate() if table is missing in a
> +	 * device-tree. The add_table() errors out if OPP is missing in DT.
> +	 */
> +	if (device_property_present(dev, "operating-points-v2")) {
> +		err = dev_pm_opp_of_add_table(dev);
> +		if (err) {
> +			dev_err(dev, "failed to add OPP table: %d\n", err);
> +			goto put_table;
> +		}
> +	}
> +
> +	err = devm_add_action(dev, sdhci_tegra_deinit_opp_table, dev);
> +	if (err)
> +		goto remove_table;
> +
> +	return 0;
> +
> +remove_table:
> +	dev_pm_opp_of_remove_table(dev);
> +put_table:
> +	dev_pm_opp_put_regulators(opp_table);
> +
> +	return err;
> +}
> +
>  static int sdhci_tegra_probe(struct platform_device *pdev)
>  {
>  	const struct of_device_id *match;
> @@ -1621,6 +1681,10 @@ static int sdhci_tegra_probe(struct platform_device *pdev)
>  		goto err_power_req;
>  	}
>
> +	rc = devm_sdhci_tegra_init_opp_table(&pdev->dev);
> +	if (rc)
> +		goto err_parse_dt;
> +
>  	/*
>  	 * Tegra210 has a separate SDMMC_LEGACY_TM clock used for host
>  	 * timeout clock and SW can choose TMCLK or SDCLK for hardware
> --
> 2.27.0
On 05-11-20, 10:45, Ulf Hansson wrote:
> + Viresh

Thanks Ulf. I found a bug in OPP core because you cc'd me here :)

> On Thu, 5 Nov 2020 at 00:44, Dmitry Osipenko <digetx@gmail.com> wrote:
> I need some more time to review this, but just a quick check found a
> few potential issues...
>
> The "core-supply", that you specify as a regulator for each
> controller's device node, is not the way we describe power domains.

Maybe I misunderstood your comment here, but there are two ways of
scaling the voltage of a device depending on if it is a regulator (and
can be modeled as one in the kernel) or a power domain.

In case of Qcom earlier (when we added the performance-state stuff),
the eventual hardware was out of kernel's control and we didn't wanted
(allowed) to model it as a virtual regulator just to pass the votes to
the RPM. And so we did what we did.

But if the hardware (where the voltage is required to be changed) is
indeed a regulator and is modeled as one, then what Dmitry has done
looks okay. i.e. add a supply in the device's node and microvolt
property in the DT entries.
On Thu, 5 Nov 2020 at 11:06, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 05-11-20, 10:45, Ulf Hansson wrote:
> > + Viresh
>
> Thanks Ulf. I found a bug in OPP core because you cc'd me here :)

Happy to help. :-)

>
> > On Thu, 5 Nov 2020 at 00:44, Dmitry Osipenko <digetx@gmail.com> wrote:
> > I need some more time to review this, but just a quick check found a
> > few potential issues...
> >
> > The "core-supply", that you specify as a regulator for each
> > controller's device node, is not the way we describe power domains.
>
> Maybe I misunderstood your comment here, but there are two ways of
> scaling the voltage of a device depending on if it is a regulator (and
> can be modeled as one in the kernel) or a power domain.

I am not objecting about scaling the voltage through a regulator,
that's fine to me. However, encoding a power domain as a regulator
(even if it may seem like a regulator) isn't. Well, unless Mark Brown
has changed his mind about this.

In this case, it seems like the regulator supply belongs in the
description of the power domain provider.

>
> In case of Qcom earlier (when we added the performance-state stuff),
> the eventual hardware was out of kernel's control and we didn't wanted
> (allowed) to model it as a virtual regulator just to pass the votes to
> the RPM. And so we did what we did.
>
> But if the hardware (where the voltage is required to be changed) is
> indeed a regulator and is modeled as one, then what Dmitry has done
> looks okay. i.e. add a supply in the device's node and microvolt
> property in the DT entries.

I guess I haven't paid enough attention how power domain regulators
are being described then. I was under the impression that the CPUfreq
case was a bit specific - and we had legacy bindings to stick with.

Can you point me to some other existing examples of where power domain
regulators are specified as a regulator in each device's node?

Kind regards
Uffe
On 05-11-20, 11:34, Ulf Hansson wrote:
> I am not objecting about scaling the voltage through a regulator,
> that's fine to me. However, encoding a power domain as a regulator
> (even if it may seem like a regulator) isn't. Well, unless Mark Brown
> has changed his mind about this.
>
> In this case, it seems like the regulator supply belongs in the
> description of the power domain provider.

Okay, I wasn't sure if it is a power domain or a regulator here. Btw,
how do we identify if it is a power domain or a regulator ?

> > In case of Qcom earlier (when we added the performance-state stuff),
> > the eventual hardware was out of kernel's control and we didn't wanted
> > (allowed) to model it as a virtual regulator just to pass the votes to
> > the RPM. And so we did what we did.
> >
> > But if the hardware (where the voltage is required to be changed) is
> > indeed a regulator and is modeled as one, then what Dmitry has done
> > looks okay. i.e. add a supply in the device's node and microvolt
> > property in the DT entries.
>
> I guess I haven't paid enough attention how power domain regulators
> are being described then. I was under the impression that the CPUfreq
> case was a bit specific - and we had legacy bindings to stick with.
>
> Can you point me to some other existing examples of where power domain
> regulators are specified as a regulator in each device's node?

No, I thought it is a regulator here and not a power domain.
On Thu, 5 Nov 2020 at 11:40, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 05-11-20, 11:34, Ulf Hansson wrote:
> > I am not objecting about scaling the voltage through a regulator,
> > that's fine to me. However, encoding a power domain as a regulator
> > (even if it may seem like a regulator) isn't. Well, unless Mark Brown
> > has changed his mind about this.
> >
> > In this case, it seems like the regulator supply belongs in the
> > description of the power domain provider.
>
> Okay, I wasn't sure if it is a power domain or a regulator here. Btw,
> how do we identify if it is a power domain or a regulator ?

Good question. It's not a crystal clear line in between them, I think.

A power domain to me, means that some part of a silicon (a group of
controllers or just a single piece, for example) needs some kind of
resource (typically a power rail) to be enabled to be functional, to
start with.

If there are operating points involved, that's also a
clear indication to me, that it's not a regular regulator.

Maybe we should try to specify this more exactly in some
documentation, somewhere.

>
> > > In case of Qcom earlier (when we added the performance-state stuff),
> > > the eventual hardware was out of kernel's control and we didn't wanted
> > > (allowed) to model it as a virtual regulator just to pass the votes to
> > > the RPM. And so we did what we did.
> > >
> > > But if the hardware (where the voltage is required to be changed) is
> > > indeed a regulator and is modeled as one, then what Dmitry has done
> > > looks okay. i.e. add a supply in the device's node and microvolt
> > > property in the DT entries.
> >
> > I guess I haven't paid enough attention how power domain regulators
> > are being described then. I was under the impression that the CPUfreq
> > case was a bit specific - and we had legacy bindings to stick with.
> >
> > Can you point me to some other existing examples of where power domain
> > regulators are specified as a regulator in each device's node?
>
> No, I thought it is a regulator here and not a power domain.

Okay, thanks!

Kind regards
Uffe
On 05-11-20, 11:56, Ulf Hansson wrote:
> On Thu, 5 Nov 2020 at 11:40, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > Btw, how do we identify if it is a power domain or a regulator ?

To be honest, I was a bit afraid and embarrassed to ask this question,
and was hoping people to make fun of me in return :)

> Good question. It's not a crystal clear line in between them, I think.

And I was relieved after reading this :)

> A power domain to me, means that some part of a silicon (a group of
> controllers or just a single piece, for example) needs some kind of
> resource (typically a power rail) to be enabled to be functional, to
> start with.

Isn't this what a part of regulator does as well ? i.e.
enabling/disabling of the regulator or power to a group of
controllers.

Over that the regulator does voltage/current scaling as well, which
normally the power domains don't do (though we did that in
performance-state case).

> If there are operating points involved, that's also a
> clear indication to me, that it's not a regular regulator.

Is there any example of that? I hope by OPP you meant both freq and
voltage here. I am not sure if I know of a case where a power domain
handles both of them.

> Maybe we should try to specify this more exactly in some
> documentation, somewhere.

I think yes, it is very much required. And in absence of that I think,
many (or most) of the platforms that also need to scale the voltage
would have modeled their hardware as a regulator and not a PM domain.

What I always thought was:

- Module that can just enable/disable power to a block of SoC is a
  power domain.

- Module that can enable/disable as well as scale voltage is a
  regulator.

And so I thought that this patchset has done the right thing. This
changed a bit with the qcom stuff where the IP to be configured was in
control of RPM and not Linux and so we couldn't add it as a regulator.
If it was controlled by Linux, it would have been a regulator in
kernel for sure :)
On Thu, 5 Nov 2020 at 12:13, Viresh Kumar <viresh.kumar@linaro.org> wrote:
>
> On 05-11-20, 11:56, Ulf Hansson wrote:
> > On Thu, 5 Nov 2020 at 11:40, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > Btw, how do we identify if it is a power domain or a regulator ?
>
> To be honest, I was a bit afraid and embarrassed to ask this question,
> and was hoping people to make fun of me in return :)
>
> > Good question. It's not a crystal clear line in between them, I think.
>
> And I was relieved after reading this :)
>
> > A power domain to me, means that some part of a silicon (a group of
> > controllers or just a single piece, for example) needs some kind of
> > resource (typically a power rail) to be enabled to be functional, to
> > start with.
>
> Isn't this what a part of regulator does as well ? i.e.
> enabling/disabling of the regulator or power to a group of
> controllers.

It could, but it shouldn't.

>
> Over that the regulator does voltage/current scaling as well, which
> normally the power domains don't do (though we did that in
> performance-state case).
>
> > If there are operating points involved, that's also a
> > clear indication to me, that it's not a regular regulator.
>
> Is there any example of that? I hope by OPP you meant both freq and
> voltage here. I am not sure if I know of a case where a power domain
> handles both of them.

It may be both voltage and frequency - but in some cases only voltage.
05.11.2020 04:45, Michał Mirosław wrote:
> On Thu, Nov 05, 2020 at 02:43:57AM +0300, Dmitry Osipenko wrote:
>> Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs, which reduces
>> power consumption and heating of the Tegra chips. Tegra SoC has multiple
>> hardware units which belong to a core power domain of the SoC and share
>> the core voltage. The voltage must be selected in accordance to a minimum
>> requirement of every core hardware unit.
> [...]
>
> Just looked briefly through the series - it looks like there is a lot of
> code duplication in *_init_opp_table() functions. Could this be made
> more generic / data-driven?

Indeed, it should be possible to add a common helper. I had a quick
thought about doing it too, but then decided to defer for the starter
since there were some differences among the needs of the drivers. I'll
take a closer look for the v2, thanks!
05.11.2020 12:58, Viresh Kumar wrote:
>> +static void sdhci_tegra_deinit_opp_table(void *data)
>> +{
>> +	struct device *dev = data;
>> +	struct opp_table *opp_table;
>> +
>> +	opp_table = dev_pm_opp_get_opp_table(dev);
> So you need to get an OPP table to put one :)
> You need to save the pointer returned by dev_pm_opp_set_regulators() instead.

This is intentional because why do we need to save the pointer if we're
not using it and we know that we could get this pointer using OPP API?
This is exactly the same what I did for the CPUFreq driver [1] :)

[1]
https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/cpufreq/tegra20-cpufreq.c#L97

>> +	dev_pm_opp_of_remove_table(dev);
>> +	dev_pm_opp_put_regulators(opp_table);
>> +	dev_pm_opp_put_opp_table(opp_table);
>> +}
>> +
>> +static int devm_sdhci_tegra_init_opp_table(struct device *dev)
>> +{
>> +	struct opp_table *opp_table;
>> +	const char *rname = "core";
>> +	int err;
>> +
>> +	/* voltage scaling is optional */
>> +	if (device_property_present(dev, "core-supply"))
>> +		opp_table = dev_pm_opp_set_regulators(dev, &rname, 1);
>> +	else
>
>> +		opp_table = dev_pm_opp_get_opp_table(dev);
> Nice. I didn't think that someone will end up abusing this API and so made it
> available for all, but someone just did that. I will fix that in the OPP core.

The dev_pm_opp_put_regulators() handles the case where regulator is
missing by acting as dev_pm_opp_get_opp_table(), but the
dev_pm_opp_set_regulators() doesn't do it. Hence I don't think this is
an abuse, but the OPP API drawback.

> Any idea why you are doing what you are doing here ?

Two reasons:

1. Voltage regulator is optional, but dev_pm_opp_set_regulators()
doesn't support optional regulators.

2. We need to balance the opp_table refcount in order to use OPP API
without polluting code with if(have_regulator), hence the
dev_pm_opp_get_opp_table() is needed for taking the opp_table reference
to have the same refcount as in the case of the dev_pm_opp_set_regulators().

I guess we could make dev_pm_opp_set_regulators(dev, count) to accept
regulators count=0 and then act as dev_pm_opp_get_opp_table(dev), will
it be acceptable?
05.11.2020 12:45, Ulf Hansson wrote:
...
> I need some more time to review this, but just a quick check found a
> few potential issues...

Thank you for starting the review! I'm pretty sure it will take a couple
revisions until all the questions will be resolved :)

> The "core-supply", that you specify as a regulator for each
> controller's device node, is not the way we describe power domains.
> Instead, it seems like you should register a power-domain provider
> (with the help of genpd) and implement the ->set_performance_state()
> callback for it. Each device node should then be hooked up to this
> power-domain, rather than to a "core-supply". For DT bindings, please
> have a look at Documentation/devicetree/bindings/power/power-domain.yaml
> and Documentation/devicetree/bindings/power/power_domain.txt.
>
> In regards to the "sync state" problem (preventing to change
> performance states until all consumers have been attached), this can
> then be managed by the genpd provider driver instead.

I'll need to take a closer look at GENPD, thank you for the suggestion.

Sounds like a software GENPD driver which manages clocks and voltages
could be a good idea, but it also could be an unnecessary
over-engineering. Let's see..
On Thu, Nov 05, 2020 at 02:44:18AM +0300, Dmitry Osipenko wrote:
> Add initial OPP and SoC core voltage scaling support to the Tegra EHCI
> driver. This is required for enabling system-wide DVFS on older Tegra
> SoCs.
>
> Tested-by: Peter Geis <pgwipeout@gmail.com>
> Tested-by: Nicolas Chauvet <kwizart@gmail.com>
> Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
> ---

I'm no expert on OPP stuff, but some of what you have done here looks
peculiar.

> diff --git a/drivers/usb/host/ehci-tegra.c b/drivers/usb/host/ehci-tegra.c
> index 869d9c4de5fc..0976577f54b4 100644
> --- a/drivers/usb/host/ehci-tegra.c
> +++ b/drivers/usb/host/ehci-tegra.c
> @@ -17,6 +17,7 @@
>  #include <linux/of_device.h>
>  #include <linux/of_gpio.h>
>  #include <linux/platform_device.h>
> +#include <linux/pm_opp.h>
>  #include <linux/pm_runtime.h>
>  #include <linux/reset.h>
>  #include <linux/slab.h>
> @@ -364,6 +365,79 @@ static void tegra_ehci_unmap_urb_for_dma(struct usb_hcd *hcd, struct urb *urb)
>  	free_dma_aligned_buffer(urb);
>  }
>
> +static void tegra_ehci_deinit_opp_table(void *data)
> +{
> +	struct device *dev = data;
> +	struct opp_table *opp_table;
> +
> +	opp_table = dev_pm_opp_get_opp_table(dev);
> +	dev_pm_opp_of_remove_table(dev);
> +	dev_pm_opp_put_regulators(opp_table);
> +	dev_pm_opp_put_opp_table(opp_table);
> +}
> +
> +static int devm_tegra_ehci_init_opp_table(struct device *dev)
> +{
> +	unsigned long rate = ULONG_MAX;
> +	struct opp_table *opp_table;
> +	const char *rname = "core";
> +	struct dev_pm_opp *opp;
> +	int err;
> +
> +	/* legacy device-trees don't have OPP table */
> +	if (!device_property_present(dev, "operating-points-v2"))
> +		return 0;
> +
> +	/* voltage scaling is optional */
> +	if (device_property_present(dev, "core-supply"))
> +		opp_table = dev_pm_opp_set_regulators(dev, &rname, 1);
> +	else
> +		opp_table = dev_pm_opp_get_opp_table(dev);
> +
> +	if (IS_ERR(opp_table))
> +		return dev_err_probe(dev, PTR_ERR(opp_table),
> +				     "failed to prepare OPP table\n");
> +
> +	err = dev_pm_opp_of_add_table(dev);
> +	if (err) {
> +		dev_err(dev, "failed to add OPP table: %d\n", err);
> +		goto put_table;
> +	}
> +
> +	/* find suitable OPP for the maximum clock rate */
> +	opp = dev_pm_opp_find_freq_floor(dev, &rate);
> +	err = PTR_ERR_OR_ZERO(opp);
> +	if (err) {
> +		dev_err(dev, "failed to get OPP: %d\n", err);
> +		goto remove_table;
> +	}
> +
> +	dev_pm_opp_put(opp);
> +
> +	/*
> +	 * First dummy rate-set initializes voltage vote by setting voltage
> +	 * in accordance to the clock rate.
> +	 */
> +	err = dev_pm_opp_set_rate(dev, rate);
> +	if (err) {
> +		dev_err(dev, "failed to initialize OPP clock: %d\n", err);
> +		goto remove_table;
> +	}
> +
> +	err = devm_add_action(dev, tegra_ehci_deinit_opp_table, dev);
> +	if (err)
> +		goto remove_table;
> +
> +	return 0;
> +
> +remove_table:
> +	dev_pm_opp_of_remove_table(dev);
> +put_table:
> +	dev_pm_opp_put_regulators(opp_table);

Do you really want to use the same error unwinding for opp_table values
obtained from dev_pm_opp_set_regulators() as from
dev_pm_opp_get_opp_table()?

> +
> +	return err;
> +}
> +
>  static const struct tegra_ehci_soc_config tegra30_soc_config = {
>  	.has_hostpc = true,
>  };
> @@ -431,6 +505,11 @@ static int tegra_ehci_probe(struct platform_device *pdev)
>  		goto cleanup_hcd_create;
>  	}
>
> +	err = devm_tegra_ehci_init_opp_table(&pdev->dev);
> +	if (err)
> +		return dev_err_probe(&pdev->dev, err,
> +				     "Failed to initialize OPP\n");

Why log a second error message? Just return err.

Alan Stern
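The `dev_pm_opp_find_freq_floor(dev, &rate)` call with `rate = ULONG_MAX` in the patch above relies on floor semantics: the lookup returns the highest OPP whose frequency does not exceed the requested one and writes that frequency back through the pointer. A minimal user-space sketch of such a lookup follows, assuming a plain array sorted by ascending frequency. The `toy_opp` type and `toy_find_freq_floor()` are hypothetical stand-ins and not the kernel implementation; in particular, the sketch returns NULL where the kernel API returns an error pointer.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical operating point: a frequency and the voltage it needs. */
struct toy_opp {
	unsigned long freq_hz;
	unsigned long uV;
};

/*
 * Find the highest entry with freq_hz <= *freq in a table sorted by
 * ascending frequency. On success *freq is updated to the matched
 * frequency. Returns NULL (simplification) if every entry is above *freq.
 */
const struct toy_opp *toy_find_freq_floor(const struct toy_opp *tbl,
					  size_t n, unsigned long *freq)
{
	const struct toy_opp *best = NULL;

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].freq_hz > *freq)
			break;
		best = &tbl[i];
	}

	if (best)
		*freq = best->freq_hz;

	return best;
}

/* Made-up table used to show the ULONG_MAX idiom. */
const struct toy_opp toy_table[] = {
	{ 100000000UL, 1000000UL },
	{ 200000000UL, 1100000UL },
	{ 480000000UL, 1200000UL },
};
```

Passing `ULONG_MAX` therefore selects the last (fastest) entry, which is how the quoted code picks the OPP for the maximum clock rate before the first dummy rate-set.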
05.11.2020 19:07, Alan Stern wrote:
> Do you really want to use the same error unwinding for opp_table values
> obtained from dev_pm_opp_set_regulators() as from
> dev_pm_opp_get_opp_table()?

They both are pointing at the same opp_table, which is refcounted. The
dev_pm_opp_set_regulators() is dev_pm_opp_get_opp_table() + it sets
regulator for the table.

https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/opp/core.c#L1756
05.11.2020 19:07, Alan Stern wrote:
>> +	err = devm_tegra_ehci_init_opp_table(&pdev->dev);
>> +	if (err)
>> +		return dev_err_probe(&pdev->dev, err,
>> +				     "Failed to initialize OPP\n");
> Why log a second error message? Just return err.

Indeed, thanks.
On 05-11-20, 17:18, Dmitry Osipenko wrote:
> 05.11.2020 12:58, Viresh Kumar wrote:
> >> +static void sdhci_tegra_deinit_opp_table(void *data)
> >> +{
> >> +	struct device *dev = data;
> >> +	struct opp_table *opp_table;
> >> +
> >> +	opp_table = dev_pm_opp_get_opp_table(dev);
> > So you need to get an OPP table to put one :)
> > You need to save the pointer returned by dev_pm_opp_set_regulators() instead.
>
> This is intentional because why do we need to save the pointer if we're
> not using it and we know that we could get this pointer using OPP API?

Because it is highly inefficient and it doesn't follow the rules set
by the OPP core. Hypothetically speaking, the OPP core is free to
allocate the OPP table structure as much as it wants, and if you don't
use the value returned back to you earlier (think of it as a cookie
assigned to your driver), then it will eventually lead to memory leak.

> This is exactly the same what I did for the CPUFreq driver [1] :)

I will strongly suggest you to save the pointer here and do the same
in the cpufreq driver as well.

> >> +static int devm_sdhci_tegra_init_opp_table(struct device *dev)
> >> +{
> >> +	struct opp_table *opp_table;
> >> +	const char *rname = "core";
> >> +	int err;
> >> +
> >> +	/* voltage scaling is optional */
> >> +	if (device_property_present(dev, "core-supply"))
> >> +		opp_table = dev_pm_opp_set_regulators(dev, &rname, 1);
> >> +	else
> >
> >> +		opp_table = dev_pm_opp_get_opp_table(dev);

To make it further clear, this will end up allocating an OPP table for
you, which it shouldn't have.

> > Nice. I didn't think that someone will end up abusing this API and so made it
> > available for all, but someone just did that. I will fix that in the OPP core.

To be fair, I allowed the cpufreq-dt driver to abuse it too, which I
am going to fix shortly.

> The dev_pm_opp_put_regulators() handles the case where regulator is
> missing by acting as dev_pm_opp_get_opp_table(), but the
> dev_pm_opp_set_regulators() doesn't do it. Hence I don't think this is
> an abuse, but the OPP API drawback.

I am not sure what you meant here. Normally you are required to call
dev_pm_opp_put_regulators() only if you have called
dev_pm_opp_set_regulators() earlier. And the refcount stays in
balance.

> > Any idea why you are doing what you are doing here ?
>
> Two reasons:
>
> 1. Voltage regulator is optional, but dev_pm_opp_set_regulators()
> doesn't support optional regulators.
>
> 2. We need to balance the opp_table refcount in order to use OPP API
> without polluting code with if(have_regulator), hence the
> dev_pm_opp_get_opp_table() is needed for taking the opp_table reference
> to have the same refcount as in the case of the dev_pm_opp_set_regulators().

I am going to send a patchset shortly after which this call to
dev_pm_opp_get_opp_table() will fail, if it is called before adding
the OPP table.

> I guess we could make dev_pm_opp_set_regulators(dev, count) to accept
> regulators count=0 and then act as dev_pm_opp_get_opp_table(dev), will
> it be acceptable?

Setting regulators for count as 0 doesn't sound good to me.

But, I understand that you don't want to have that if (have_regulator)
check, and it is a fair request. What I will instead do is, allow all
dev_pm_opp_put*() API to start accepting a NULL pointer for the OPP
table and fail silently. And so you won't be required to have this
unwanted check. But you will be required to save the pointer returned
back by dev_pm_opp_set_regulators(), which is the right thing to do
anyways.
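The arrangement proposed here can be pictured with a small user-space model: the driver keeps the pointer it was handed (or NULL when no regulator is described), and the put helper quietly ignores NULL, so the teardown path needs no `if (have_regulator)` branch. Everything below (`toy_table`, `toy_set_regulators()`, `toy_put_regulators()`, `toy_driver`) is a hypothetical stand-in written for illustration and is not the OPP core's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted table standing in for struct opp_table. */
struct toy_table {
	int refcount;
	bool has_regulator;
};

/* Takes a reference and attaches the regulator; returns the "cookie". */
struct toy_table *toy_set_regulators(struct toy_table *t)
{
	t->refcount++;
	t->has_regulator = true;
	return t;
}

/* NULL-tolerant put: a driver without a regulator can call it blindly. */
void toy_put_regulators(struct toy_table *t)
{
	if (!t)
		return;

	t->has_regulator = false;
	t->refcount--;
}

/* Hypothetical driver state: it stores the pointer it was given. */
struct toy_driver {
	struct toy_table *opp_table;
};

void toy_probe(struct toy_driver *drv, struct toy_table *t, bool have_supply)
{
	/* voltage scaling is optional: keep NULL when there is no supply */
	drv->opp_table = have_supply ? toy_set_regulators(t) : NULL;
}

void toy_remove(struct toy_driver *drv)
{
	/* same call on both paths, no have_regulator check needed */
	toy_put_regulators(drv->opp_table);
	drv->opp_table = NULL;
}
```

In this model the reference count returns to its starting value on both paths, and the driver never has to look the table up again in order to release it.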
06.11.2020 09:15, Viresh Kumar wrote:
> Setting regulators for count as 0 doesn't sound good to me.
>
> But, I understand that you don't want to have that if (have_regulator)
> check, and it is a fair request. What I will instead do is, allow all
> dev_pm_opp_put*() API to start accepting a NULL pointer for the OPP
> table and fail silently. And so you won't be required to have this
> unwanted check. But you will be required to save the pointer returned
> back by dev_pm_opp_set_regulators(), which is the right thing to do
> anyways.

Perhaps even a better variant could be to add a devm versions of the OPP
API functions, then drivers won't need to care about storing the
opp_table pointer if it's unused by drivers.
On Fri, Nov 6, 2020 at 9:18 PM Dmitry Osipenko <digetx@gmail.com> wrote:
>
> 06.11.2020 09:15, Viresh Kumar wrote:
> > Setting regulators for count as 0 doesn't sound good to me.
> >
> > But, I understand that you don't want to have that if (have_regulator)
> > check, and it is a fair request. What I will instead do is, allow all
> > dev_pm_opp_put*() API to start accepting a NULL pointer for the OPP
> > table and fail silently. And so you won't be required to have this
> > unwanted check. But you will be required to save the pointer returned
> > back by dev_pm_opp_set_regulators(), which is the right thing to do
> > anyways.
>
> Perhaps even a better variant could be to add a devm versions of the OPP
> API functions, then drivers won't need to care about storing the
> opp_table pointer if it's unused by drivers.

I think so. The consumer may not be so concerned about the status of
these OPP tables.
If the driver needs to manage the release, it needs to add a pointer
to its driver-global structure.

Maybe it's worth having these devm interfaces for opp.

Yangtao
05.11.2020 18:22, Dmitry Osipenko пишет: > 05.11.2020 12:45, Ulf Hansson пишет: > ... >> I need some more time to review this, but just a quick check found a >> few potential issues... > > Thank you for starting the review! I'm pretty sure it will take a couple > revisions until all the questions will be resolved :) > >> The "core-supply", that you specify as a regulator for each >> controller's device node, is not the way we describe power domains. >> Instead, it seems like you should register a power-domain provider >> (with the help of genpd) and implement the ->set_performance_state() >> callback for it. Each device node should then be hooked up to this >> power-domain, rather than to a "core-supply". For DT bindings, please >> have a look at Documentation/devicetree/bindings/power/power-domain.yaml >> and Documentation/devicetree/bindings/power/power_domain.txt. >> >> In regards to the "sync state" problem (preventing to change >> performance states until all consumers have been attached), this can >> then be managed by the genpd provider driver instead. > > I'll need to take a closer look at GENPD, thank you for the suggestion. > > Sounds like a software GENPD driver which manages clocks and voltages > could be a good idea, but it also could be an unnecessary > over-engineering. Let's see.. > Hello Ulf and all, I took a detailed look at the GENPD and tried to implement it. Here is what was found: 1. GENPD framework doesn't aggregate performance requests from the attached devices. This means that if deviceA requests performance state 10 and then deviceB requests state 3, then framework will set domain's state to 3 instead of 10. https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/base/power/domain.c#L376 2. GENPD framework has a sync() callback in the genpd.domain structure, but this callback isn't allowed to be used by the GENPD implementation. The GENPD framework always overrides that callback for its own needs. 
Hence GENPD doesn't allow solving the bootstrapping
state-synchronization problem in a nice way.

https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/base/power/domain.c#L2606

3. Tegra doesn't have a dedicated hardware power-controller for the core
domain, instead there is only an external voltage regulator. Hence we
will need to create a phony device-tree node for the virtual power
domain, which is probably a wrong thing to do.

===

Perhaps it should be possible to create some hacks to work around
bullets 2 and 3 in order to achieve what we need for DVFS on Tegra, but
bullet 1 isn't solvable without changing how the GENPD core works.

Altogether, the GENPD in its current form is a wrong abstraction for a
system-wide DVFS in a case where multiple devices share a power domain and
this domain is a voltage regulator. The regulator framework is the
correct abstraction in this case for today.
09.11.2020 07:43, Viresh Kumar wrote:
> On 08-11-20, 15:19, Dmitry Osipenko wrote:
>> I took a detailed look at the GENPD and tried to implement it. Here is
>> what was found:
>>
>> 1. GENPD framework doesn't aggregate performance requests from the
>> attached devices. This means that if deviceA requests performance state
>> 10 and then deviceB requests state 3, then framework will set domain's
>> state to 3 instead of 10.
>
> It does. Look at _genpd_reeval_performance_state().
>

Thanks, I probably had a bug in the quick prototype and then overlooked
that function.
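For readers following the deviceA/deviceB example above, here is a minimal userspace sketch of what "aggregating" the votes means: the domain ends up at the highest state any attached device has asked for. This is not the kernel's _genpd_reeval_performance_state(); the struct perf_vote type and aggregate_perf_state() are hypothetical names made up for illustration, and subdomain votes are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model: one vote per attached device. */
struct perf_vote {
	const char *dev_name;
	unsigned int state;	/* requested performance state, 0 = no request */
};

/* Return the highest requested state among all votes (0 if there are none). */
unsigned int aggregate_perf_state(const struct perf_vote *votes, size_t count)
{
	unsigned int max_state = 0;
	size_t i;

	assert(votes != NULL || count == 0);

	for (i = 0; i < count; i++) {
		if (votes[i].state > max_state)
			max_state = votes[i].state;
	}

	return max_state;
}
```

With votes of 10 (deviceA) and 3 (deviceB), this model keeps the domain at 10, regardless of the order in which the requests arrive.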
09.11.2020 08:00, Viresh Kumar пишет: > On 06-11-20, 21:41, Frank Lee wrote: >> On Fri, Nov 6, 2020 at 9:18 PM Dmitry Osipenko <digetx@gmail.com> wrote: >>> >>> 06.11.2020 09:15, Viresh Kumar пишет: >>>> Setting regulators for count as 0 doesn't sound good to me. >>>> >>>> But, I understand that you don't want to have that if (have_regulator) >>>> check, and it is a fair request. What I will instead do is, allow all >>>> dev_pm_opp_put*() API to start accepting a NULL pointer for the OPP >>>> table and fail silently. And so you won't be required to have this >>>> unwanted check. But you will be required to save the pointer returned >>>> back by dev_pm_opp_set_regulators(), which is the right thing to do >>>> anyways. >>> >>> Perhaps even a better variant could be to add a devm versions of the OPP >>> API functions, then drivers won't need to care about storing the >>> opp_table pointer if it's unused by drivers. >> >> I think so. The consumer may not be so concerned about the status of >> these OPP tables. >> If the driver needs to manage the release, it needs to add a pointer >> to their driver global structure. >> >> Maybe it's worth having these devm interfaces for opp. > > Sure if there are enough users of this, I am all for it. I was fine > with the patches you sent, just that there were not a lot of users of > it and so I pushed them back. If we find that we have more users of it > now, we can surely get that back. > There was already attempt to add the devm? Could you please give me a link to the patches? I already prepared a patch which adds the devm helpers. It helps to keep code cleaner and readable.
On 09-11-20, 08:08, Dmitry Osipenko wrote: > 09.11.2020 08:00, Viresh Kumar пишет: > > On 06-11-20, 21:41, Frank Lee wrote: > >> On Fri, Nov 6, 2020 at 9:18 PM Dmitry Osipenko <digetx@gmail.com> wrote: > >>> > >>> 06.11.2020 09:15, Viresh Kumar пишет: > >>>> Setting regulators for count as 0 doesn't sound good to me. > >>>> > >>>> But, I understand that you don't want to have that if (have_regulator) > >>>> check, and it is a fair request. What I will instead do is, allow all > >>>> dev_pm_opp_put*() API to start accepting a NULL pointer for the OPP > >>>> table and fail silently. And so you won't be required to have this > >>>> unwanted check. But you will be required to save the pointer returned > >>>> back by dev_pm_opp_set_regulators(), which is the right thing to do > >>>> anyways. > >>> > >>> Perhaps even a better variant could be to add a devm versions of the OPP > >>> API functions, then drivers won't need to care about storing the > >>> opp_table pointer if it's unused by drivers. > >> > >> I think so. The consumer may not be so concerned about the status of > >> these OPP tables. > >> If the driver needs to manage the release, it needs to add a pointer > >> to their driver global structure. > >> > >> Maybe it's worth having these devm interfaces for opp. > > > > Sure if there are enough users of this, I am all for it. I was fine > > with the patches you sent, just that there were not a lot of users of > > it and so I pushed them back. If we find that we have more users of it > > now, we can surely get that back. > > > > There was already attempt to add the devm? Could you please give me a > link to the patches? > > I already prepared a patch which adds the devm helpers. It helps to keep > code cleaner and readable. https://lore.kernel.org/lkml/20201012135517.19468-1-frank@allwinnertech.com/
09.11.2020 08:10, Viresh Kumar пишет: > On 09-11-20, 08:08, Dmitry Osipenko wrote: >> 09.11.2020 08:00, Viresh Kumar пишет: >>> On 06-11-20, 21:41, Frank Lee wrote: >>>> On Fri, Nov 6, 2020 at 9:18 PM Dmitry Osipenko <digetx@gmail.com> wrote: >>>>> >>>>> 06.11.2020 09:15, Viresh Kumar пишет: >>>>>> Setting regulators for count as 0 doesn't sound good to me. >>>>>> >>>>>> But, I understand that you don't want to have that if (have_regulator) >>>>>> check, and it is a fair request. What I will instead do is, allow all >>>>>> dev_pm_opp_put*() API to start accepting a NULL pointer for the OPP >>>>>> table and fail silently. And so you won't be required to have this >>>>>> unwanted check. But you will be required to save the pointer returned >>>>>> back by dev_pm_opp_set_regulators(), which is the right thing to do >>>>>> anyways. >>>>> >>>>> Perhaps even a better variant could be to add a devm versions of the OPP >>>>> API functions, then drivers won't need to care about storing the >>>>> opp_table pointer if it's unused by drivers. >>>> >>>> I think so. The consumer may not be so concerned about the status of >>>> these OPP tables. >>>> If the driver needs to manage the release, it needs to add a pointer >>>> to their driver global structure. >>>> >>>> Maybe it's worth having these devm interfaces for opp. >>> >>> Sure if there are enough users of this, I am all for it. I was fine >>> with the patches you sent, just that there were not a lot of users of >>> it and so I pushed them back. If we find that we have more users of it >>> now, we can surely get that back. >>> >> >> There was already attempt to add the devm? Could you please give me a >> link to the patches? >> >> I already prepared a patch which adds the devm helpers. It helps to keep >> code cleaner and readable. 
>
> https://lore.kernel.org/lkml/20201012135517.19468-1-frank@allwinnertech.com/
>

Thanks, I made it in a different way by simply adding helpers to the
pm_opp.h which use devm_add_action_or_reset(). This doesn't require to
add new kernel symbols.

static inline int devm_pm_opp_of_add_table(struct device *dev)
{
	int err;

	err = dev_pm_opp_of_add_table(dev);
	if (err)
		return err;

	err = devm_add_action_or_reset(dev, (void*)dev_pm_opp_remove_table,
				       dev);
	if (err)
		return err;

	return 0;
}
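The helper above relies on the "or reset" behaviour: if the cleanup action cannot be registered, it is run immediately, so the caller never has to undo the preceding step by hand. The following is a small userspace model of that contract, not the kernel implementation; struct fake_dev, fake_add_action_or_reset(), fake_release_all() and count_cleanup() are hypothetical names, and the fixed-size array stands in for the kernel's devres list.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_ACTIONS 8

typedef void (*action_fn)(void *data);

/* Hypothetical stand-in for a device carrying a list of cleanup actions. */
struct fake_dev {
	action_fn actions[MAX_ACTIONS];
	void *data[MAX_ACTIONS];
	size_t count;
};

/*
 * Register a cleanup action. If registration fails, run the action right
 * away and return the error, so the resource is released either way.
 */
int fake_add_action_or_reset(struct fake_dev *dev, action_fn action, void *data)
{
	assert(dev != NULL && action != NULL);

	if (dev->count == MAX_ACTIONS) {
		action(data);
		return -ENOMEM;
	}

	dev->actions[dev->count] = action;
	dev->data[dev->count] = data;
	dev->count++;

	return 0;
}

/* Model of driver detach: run the registered actions in reverse order. */
void fake_release_all(struct fake_dev *dev)
{
	assert(dev != NULL);

	while (dev->count > 0) {
		dev->count--;
		dev->actions[dev->count](dev->data[dev->count]);
	}
}

/* Example cleanup action: count how many times a release happened. */
void count_cleanup(void *data)
{
	(*(int *)data)++;
}
```

In this model a driver's probe path only checks the return value; it needs no error label for the step whose cleanup was handed over.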
09.11.2020 08:35, Viresh Kumar wrote:
> On 09-11-20, 08:19, Dmitry Osipenko wrote:
>> Thanks, I made it in a different way by simply adding helpers to the
>> pm_opp.h which use devm_add_action_or_reset(). This doesn't require to
>> add new kernel symbols.
>
> I will prefer to add it in core.c itself, and yes
> devm_add_action_or_reset() looks better. But I am still not sure for
> which helpers do we need the devm_*() variants, as this is only useful
> for non-CPU devices. But if we have users that we can add right now,
> why not.

All current non-CPU drivers (devfreq, mmc, memory, etc) can benefit from it.

For Tegra drivers we need these variants:

devm_pm_opp_set_supported_hw()
devm_pm_opp_set_regulators() [if we won't use GENPD]
devm_pm_opp_set_clkname()
devm_pm_opp_of_add_table()
On 09-11-20, 08:44, Dmitry Osipenko wrote:
> 09.11.2020 08:35, Viresh Kumar wrote:
> > On 09-11-20, 08:19, Dmitry Osipenko wrote:
> >> Thanks, I made it in a different way by simply adding helpers to the
> >> pm_opp.h which use devm_add_action_or_reset(). This doesn't require to
> >> add new kernel symbols.
> >
> > I will prefer to add it in core.c itself, and yes
> > devm_add_action_or_reset() looks better. But I am still not sure for
> > which helpers do we need the devm_*() variants, as this is only useful
> > for non-CPU devices. But if we have users that we can add right now,
> > why not.
>
> All current non-CPU drivers (devfreq, mmc, memory, etc) can benefit from it.
>
> For Tegra drivers we need these variants:
>
> devm_pm_opp_set_supported_hw()
> devm_pm_opp_set_regulators() [if we won't use GENPD]
> devm_pm_opp_set_clkname()
> devm_pm_opp_of_add_table()

I looked earlier through the code that is already merged and didn't find
many places where the devm_* variants could be used, maybe I missed some
of them.

Frank, would you like to refresh your series based on suggestions from
Dmitry and make other drivers adapt to the new APIs ?

--
viresh
On Thu, 05 Nov 2020 02:43:59 +0300, Dmitry Osipenko wrote: > Document new DVFS OPP table and voltage regulator properties of the > SDHCI controller. > > Signed-off-by: Dmitry Osipenko <digetx@gmail.com> > --- > .../devicetree/bindings/mmc/nvidia,tegra20-sdhci.txt | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > Reviewed-by: Rob Herring <robh@kernel.org>
On Thu, 05 Nov 2020 02:44:03 +0300, Dmitry Osipenko wrote: > Document new DVFS OPP table and voltage regulator properties of the > Tegra EHCI controller. > > Signed-off-by: Dmitry Osipenko <digetx@gmail.com> > --- > Documentation/devicetree/bindings/usb/nvidia,tegra20-ehci.txt | 2 ++ > 1 file changed, 2 insertions(+) > Reviewed-by: Rob Herring <robh@kernel.org>
10.11.2020 23:50, Thierry Reding wrote:
> On Thu, Nov 05, 2020 at 02:44:15AM +0300, Dmitry Osipenko wrote:
> [...]
>> +static void tegra_pwm_deinit_opp_table(void *data)
>> +{
>> +	struct device *dev = data;
>> +	struct opp_table *opp_table;
>> +
>> +	opp_table = dev_pm_opp_get_opp_table(dev);
>> +	dev_pm_opp_of_remove_table(dev);
>> +	dev_pm_opp_put_regulators(opp_table);
>> +	dev_pm_opp_put_opp_table(opp_table);
>> +}
>> +
>> +static int devm_tegra_pwm_init_opp_table(struct device *dev)
>> +{
>> +	struct opp_table *opp_table;
>> +	const char *rname = "core";
>> +	int err;
>> +
>> +	/* voltage scaling is optional */
>> +	if (device_property_present(dev, "core-supply"))
>> +		opp_table = dev_pm_opp_set_regulators(dev, &rname, 1);
>> +	else
>> +		opp_table = dev_pm_opp_get_opp_table(dev);
>> +
>> +	if (IS_ERR(opp_table))
>> +		return dev_err_probe(dev, PTR_ERR(opp_table),
>> +				     "failed to prepare OPP table\n");
>> +
>> +	/*
>> +	 * OPP table presence is optional and we want the set_rate() of OPP
>> +	 * API to work similarly to clk_set_rate() if table is missing in a
>> +	 * device-tree. The add_table() errors out if OPP is missing in DT.
>> +	 */
>> +	if (device_property_present(dev, "operating-points-v2")) {
>> +		err = dev_pm_opp_of_add_table(dev);
>> +		if (err) {
>> +			dev_err(dev, "failed to add OPP table: %d\n", err);
>> +			goto put_table;
>> +		}
>> +	}
>> +
>> +	err = devm_add_action(dev, tegra_pwm_deinit_opp_table, dev);
>> +	if (err)
>> +		goto remove_table;
>> +
>> +	return 0;
>> +
>> +remove_table:
>> +	dev_pm_opp_of_remove_table(dev);
>> +put_table:
>> +	dev_pm_opp_put_regulators(opp_table);
>> +
>> +	return err;
>> +}
>
> These two functions seem to be heavily boilerplate across all these
> drivers. Have you considered splitting these out into separate helpers?

The helper is already prepared for v2.
On Sun, 8 Nov 2020 at 13:19, Dmitry Osipenko <digetx@gmail.com> wrote: > > 05.11.2020 18:22, Dmitry Osipenko пишет: > > 05.11.2020 12:45, Ulf Hansson пишет: > > ... > >> I need some more time to review this, but just a quick check found a > >> few potential issues... > > > > Thank you for starting the review! I'm pretty sure it will take a couple > > revisions until all the questions will be resolved :) > > > >> The "core-supply", that you specify as a regulator for each > >> controller's device node, is not the way we describe power domains. > >> Instead, it seems like you should register a power-domain provider > >> (with the help of genpd) and implement the ->set_performance_state() > >> callback for it. Each device node should then be hooked up to this > >> power-domain, rather than to a "core-supply". For DT bindings, please > >> have a look at Documentation/devicetree/bindings/power/power-domain.yaml > >> and Documentation/devicetree/bindings/power/power_domain.txt. > >> > >> In regards to the "sync state" problem (preventing to change > >> performance states until all consumers have been attached), this can > >> then be managed by the genpd provider driver instead. > > > > I'll need to take a closer look at GENPD, thank you for the suggestion. > > > > Sounds like a software GENPD driver which manages clocks and voltages > > could be a good idea, but it also could be an unnecessary > > over-engineering. Let's see.. > > > > Hello Ulf and all, > > I took a detailed look at the GENPD and tried to implement it. Here is > what was found: > > 1. GENPD framework doesn't aggregate performance requests from the > attached devices. This means that if deviceA requests performance state > 10 and then deviceB requests state 3, then framework will set domain's > state to 3 instead of 10. > > https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/base/power/domain.c#L376 As Viresh also stated, genpd does aggregate the votes. 
It even performs aggregation hierarchy (a genpd is allowed to have parent(s) to model a topology). > > 2. GENPD framework has a sync() callback in the genpd.domain structure, > but this callback isn't allowed to be used by the GENPD implementation. > The GENPD framework always overrides that callback for its own needs. > Hence GENPD doesn't allow to solve the bootstrapping > state-synchronization problem in a nice way. > > https://elixir.bootlin.com/linux/v5.10-rc2/source/drivers/base/power/domain.c#L2606 That ->sync() callback isn't the callback you are looking for, it's a PM domain specific callback - and has other purposes. To solve the problem you refer to, your genpd provider driver (a platform driver) should assign its ->sync_state() callback. The ->sync_state() callback will be invoked, when all consumer devices have been attached (and probed) to their corresponding provider. You may have a look at drivers/cpuidle/cpuidle-psci-domain.c, to see an example of how this works. If there is anything unclear, just tell me and I will try to help. > > 3. Tegra doesn't have a dedicated hardware power-controller for the core > domain, instead there is only an external voltage regulator. Hence we > will need to create a phony device-tree node for the virtual power > domain, which is probably a wrong thing to do. No, this is absolutely the correct thing to do. This isn't a virtual power domain, it's a real power domain. You only happen to model the control of it as a regulator, as it fits nicely with that for *this* SoC. Don't get me wrong, that's fine as long as the supply is specified only in the power-domain provider node. On another SoC, you might have a different FW interface for the power domain provider that doesn't fit well with the regulator. When that happens, all you need to do is to implement a new power domain provider and potentially re-define the power domain topology. 
More importantly, you don't need to re-invent yet another slew of device specific bindings - for each SoC. > > === > > Perhaps it should be possible to create some hacks to work around > bullets 2 and 3 in order to achieve what we need for DVFS on Tegra, but > bullet 1 isn't solvable without changing how the GENPD core works. > > Altogether, the GENPD in its current form is a wrong abstraction for a > system-wide DVFS in a case where multiple devices share power domain and > this domain is a voltage regulator. The regulator framework is the > correct abstraction in this case for today. Well, I admit it's a bit complex. But it solves the problem in a nicely abstracted way that should work for everybody, at least in my opinion. Although, let's not exclude that there are pieces missing in genpd or the opp layer, as this DVFS feature is rather new - but then we should just extend/fix it. Kind regards Uffe
On Thu, 5 Nov 2020 at 00:44, Dmitry Osipenko <digetx@gmail.com> wrote: > > Document new DVFS OPP table and voltage regulator properties of the > Host1x bus and devices sitting on the bus. > > Signed-off-by: Dmitry Osipenko <digetx@gmail.com> > --- > .../display/tegra/nvidia,tegra20-host1x.txt | 56 +++++++++++++++++++ > 1 file changed, 56 insertions(+) > > diff --git a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt > index 34d993338453..0593c8df70bb 100644 > --- a/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt > +++ b/Documentation/devicetree/bindings/display/tegra/nvidia,tegra20-host1x.txt > @@ -20,6 +20,18 @@ Required properties: > - reset-names: Must include the following entries: > - host1x > > +Optional properties: > +- operating-points-v2: See ../bindings/opp/opp.txt for details. > +- core-supply: Phandle of voltage regulator of the SoC "core" power domain. > + > +For each opp entry in 'operating-points-v2' table of host1x and its modules: > +- opp-supported-hw: One bitfield indicating: > + On Tegra20: SoC process ID mask > + On Tegra30+: SoC speedo ID mask > + > + A bitwise AND is performed against the value and if any bit > + matches, the OPP gets enabled. > + > Each host1x client module having to perform DMA through the Memory Controller > should have the interconnect endpoints set to the Memory Client and External > Memory respectively. > @@ -45,6 +57,8 @@ of the following host1x client modules: > - interconnect-names: Must include name of the interconnect path for each > interconnect entry. Consult TRM documentation for information about > available memory clients, see MEMORY CONTROLLER section. > + - core-supply: Phandle of voltage regulator of the SoC "core" power domain. > + - operating-points-v2: See ../bindings/opp/opp.txt for details. > As discussed in the thread for the cover-letter. 
We already have DT bindings for power-domains (providers and consumers). Please use them instead of adding SoC specific bindings to each peripheral device. [...] Kind regards Uffe
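The quoted binding text describes opp-supported-hw as a bitfield that is ANDed against a hardware ID value, with the OPP enabled when any bit matches. A minimal sketch of that check follows, under the assumption that the SoC process or speedo ID is encoded as a single bit at position hw_id; the function name opp_supported_by_hw() and that encoding are illustrative assumptions, not taken from the series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether an OPP entry applies to this chip.
 *
 * hw_id:            process ID (Tegra20) or speedo ID (Tegra30+), assumed
 *                   here to select one bit of the mask
 * opp_supported_hw: the opp-supported-hw bitfield from the OPP entry
 */
bool opp_supported_by_hw(uint32_t hw_id, uint32_t opp_supported_hw)
{
	uint32_t hw_bit;

	assert(hw_id < 32);

	hw_bit = UINT32_C(1) << hw_id;

	/* The OPP is enabled if the bitwise AND leaves any bit set. */
	return (opp_supported_hw & hw_bit) != 0;
}
```

For example, an entry with mask 0x0006 would be enabled for IDs 1 and 2 and skipped for ID 0 under this encoding.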
12.11.2020 23:43, Thierry Reding wrote:
>> The difference in comparison to using voltage regulator directly is
>> minimal, basically the core-supply phandle is replaced with
>> a power-domain phandle in a device tree.
> These new power-domain handles would have to be added to devices that
> potentially already have a power-domain handle, right? Isn't that going
> to cause issues? I vaguely recall that we already have multiple power
> domains for the XUSB controller and we have to jump through extra hoops
> to make that work.

I modeled the core PD as a parent of the PMC sub-domains, which
presumably is a correct way to represent the domains topology.

https://gist.github.com/digetx/dfd92c7f7e0aa6cef20403c4298088d7

>> The only thing which makes me feel a bit uncomfortable is that there is
>> no real hardware node for the power domain node in a device-tree.
> Could we anchor the new power domain at the PMC for example? That would
> allow us to avoid the "virtual" node.

I had a thought about using PMC for the core domain, but I'm not sure
whether it will be an entirely correct hardware description. Although,
it will be nice to have it this way.

This is what Tegra TRM says about PMC:

"The Power Management Controller (PMC) block interacts with an external
or Power Manager Unit (PMU). The PMC mostly controls the entry and exit
of the system from different sleep modes. It provides power-gating
controllers for SOC and CPU power-islands and also provides scratch
storage to save some of the context during sleep modes (when CPU and/or
SOC power rails are off). Additionally, PMC interacts with the external
Power Manager Unit (PMU)."

The core voltage regulator is a part of the PMU.

Not all core SoC devices are behind PMC, IIUC.

> On the other hand, if we were to
> use a regulator, we'd be adding a node for that, right? So isn't this
> effectively going to be the same node if we use a power domain? Both
> software constructs are using the same voltage regulator, so they should
> be able to be described by the same device tree node, shouldn't they?

I'm not exactly sure what you mean by "use a regulator" and "we'd
be adding a node for that", could you please clarify? This v1 approach
uses a core-supply phandle (i.e. regulator is used), it doesn't require
extra nodes.
13.11.2020 17:45, Ulf Hansson wrote:
> On Thu, 12 Nov 2020 at 23:14, Dmitry Osipenko <digetx@gmail.com> wrote:
>>
>> 12.11.2020 23:43, Thierry Reding wrote:
>>>> The difference in comparison to using voltage regulator directly is
>>>> minimal, basically the core-supply phandle is replaced with
>>>> a power-domain phandle in a device tree.
>>> These new power-domain handles would have to be added to devices that
>>> potentially already have a power-domain handle, right? Isn't that going
>>> to cause issues? I vaguely recall that we already have multiple power
>>> domains for the XUSB controller and we have to jump through extra hoops
>>> to make that work.
>>
>> I modeled the core PD as a parent of the PMC sub-domains, which
>> presumably is a correct way to represent the domains topology.
>>
>> https://gist.github.com/digetx/dfd92c7f7e0aa6cef20403c4298088d7
>
> That could make sense, it seems.
>
> Anyway, this made me realize that
> dev_pm_genpd_set_performance_state(dev) returns -EINVAL, in case the
> device's genpd doesn't have the ->set_performance_state() assigned.
> This may not be correct. Instead we should likely consider an empty
> callback as okay and continue to walk the topology upwards to the
> parent domain, etc.
>
> Just wanted to point this out. I intend to post a patch as soon as I
> can for this.

Thank you, I was also going to make the same change, but haven't gotten
around to it so far. Please feel free to CC me on the patch.
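The behaviour Ulf proposes, treating a missing ->set_performance_state() as acceptable and continuing up to the parent domain, can be sketched in userspace as below. This is only a model of the idea, not genpd code: struct fake_domain, fake_store_state() and fake_set_performance_state() are hypothetical, each domain has a single parent, and the requested state is passed upward unchanged, whereas a real implementation would also have to aggregate votes and may translate states between levels.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a power domain in a parent/child topology. */
struct fake_domain {
	struct fake_domain *parent;
	/* May be NULL: this domain has nothing to program itself. */
	int (*set_state)(struct fake_domain *pd, unsigned int state);
	unsigned int state;
};

/* Example callback: remember the state the domain was asked to enter. */
int fake_store_state(struct fake_domain *pd, unsigned int state)
{
	pd->state = state;
	return 0;
}

/*
 * Walk from the given domain towards the root. Domains without a callback
 * are skipped instead of being treated as an error.
 */
int fake_set_performance_state(struct fake_domain *pd, unsigned int state)
{
	assert(pd != NULL);

	for (; pd != NULL; pd = pd->parent) {
		int err;

		if (pd->set_state == NULL)
			continue;

		err = pd->set_state(pd, state);
		if (err)
			return err;
	}

	return 0;
}
```

In the topology discussed here, a PMC sub-domain without a callback would simply forward the request to its parent core domain, which owns the voltage control.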
13.11.2020 19:35, Thierry Reding пишет: > On Fri, Nov 13, 2020 at 01:14:45AM +0300, Dmitry Osipenko wrote: >> 12.11.2020 23:43, Thierry Reding пишет: >>>> The difference in comparison to using voltage regulator directly is >>>> minimal, basically the core-supply phandle is replaced is replaced with >>>> a power-domain phandle in a device tree. >>> These new power-domain handles would have to be added to devices that >>> potentially already have a power-domain handle, right? Isn't that going >>> to cause issues? I vaguely recall that we already have multiple power >>> domains for the XUSB controller and we have to jump through extra hoops >>> to make that work. >> >> I modeled the core PD as a parent of the PMC sub-domains, which >> presumably is a correct way to represent the domains topology. >> >> https://gist.github.com/digetx/dfd92c7f7e0aa6cef20403c4298088d7 >> >>>> The only thing which makes me feel a bit uncomfortable is that there is >>>> no real hardware node for the power domain node in a device-tree. >>> Could we anchor the new power domain at the PMC for example? That would >>> allow us to avoid the "virtual" node. >> >> I had a thought about using PMC for the core domain, but not sure >> whether it will be an entirely correct hardware description. Although, >> it will be nice to have it this way. >> >> This is what Tegra TRM says about PMC: >> >> "The Power Management Controller (PMC) block interacts with an external >> or Power Manager Unit (PMU). The PMC mostly controls the entry and exit >> of the system from different sleep modes. It provides power-gating >> controllers for SOC and CPU power-islands and also provides scratch >> storage to save some of the context during sleep modes (when CPU and/or >> SOC power rails are off). Additionally, PMC interacts with the external >> Power Manager Unit (PMU)." >> >> The core voltage regulator is a part of the PMU. >> >> Not all core SoC devices are behind PMC, IIUC. 
>
> There are usually some SoC devices that are always-on. Things like the
> RTC for example, can never be power-gated, as far as I recall. On newer
> chips there are usually many more blocks that can't be powergated at
> all.

The RTC is actually a special power domain on Tegra, it's not a part of
the CORE domain, they are separate from each other. We need to know
which blocks belong to a power domain and what the power topology of
these blocks is. I think we already have this knowledge, so it shouldn't
be a problem.

>>> On the other hand, if we were to
>>> use a regulator, we'd be adding a node for that, right? So isn't this
>>> effectively going to be the same node if we use a power domain? Both
>>> software constructs are using the same voltage regulator, so they should
>>> be able to be described by the same device tree node, shouldn't they?
>>
>> I'm not exactly sure what you're meaning by "use a regulator" and "we'd
>> be adding a node for that", could you please clarify? This v1 approach
>> uses a core-supply phandle (i.e. regulator is used), it doesn't require
>> extra nodes.
>
> What I meant to say was that the actual supply voltage is generated by
> some device (typically one of the SD outputs of the PMIC). Whether we
> model this as a power domain or a regulator doesn't really matter,
> right? So I'm wondering if the device that generates the voltage should
> be the power domain provider, just like it is the provider of the
> regulator if this was modelled as a regulator.

Technically this could be done and it shouldn't be difficult to add
GENPD support to the regulator framework, but I think this would be an
inaccurate hardware description.

It isn't correct to describe internal SoC parts as directly connected
to an external voltage regulator. The core voltage regulator is
connected to one of several power rails of the Tegra chip. There is no
good way to describe this hardware in terms of voltage regulators, which
is why this v1 series added a core-supply to each SoC component of each
board's DT individually.

It's actually one of the benefits of using a separate DT node for the
power-domain, which describes the "Tegra Core" part of the Tegra SoC,
and thus, it all stays within tegra.dtsi. This means that the PD
explicitly belongs to the SoC internals, as opposed to describing the PD
like it's an external/off-chip component.

Initially I didn't much like that there is no hardware address to back
up the power domain node in a DT, but actually there is no address for
the power rail either. Hence it should be better to describe the
hardware by keeping the PD internal to the SoC.

Note that potentially the PD may require knowledge about specifics of a
particular SoC, while an external regulator doesn't belong to a SoC.

Also, I guess technically there could be multiple external regulators
which power a single SoC rail.
01.12.2020 16:57, Mark Brown wrote:
> On Thu, 5 Nov 2020 02:43:57 +0300, Dmitry Osipenko wrote:
>> Introduce core voltage scaling for NVIDIA Tegra20/30 SoCs, which reduces
>> power consumption and heating of the Tegra chips. Tegra SoC has multiple
>> hardware units which belong to a core power domain of the SoC and share
>> the core voltage. The voltage must be selected in accordance to a minimum
>> requirement of every core hardware unit.
>>
>> The minimum core voltage requirement depends on:
>>
>> [...]
>
> Applied to
>
>    https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git for-next
>
> Thanks!
>
> [1/1] regulator: Allow skipping disabled regulators in regulator_check_consumers()
>       (no commit info)
>
> All being well this means that it will be integrated into the linux-next
> tree (usually sometime in the next 24 hours) and sent to Linus during
> the next merge window (or sooner if it is a bug fix), however if
> problems are discovered then the patch may be dropped or reverted.
>
> You may get further e-mails resulting from automated or manual testing
> and review of the tree, please engage with people reporting problems and
> send followup patches addressing any issues that are reported if needed.
>
> If any updates are required or you are submitting further changes they
> should be sent as incremental updates against current git, existing
> patches will not be replaced.
>
> Please add any relevant lists and maintainers to the CCs when replying
> to this mail.

Hello Mark,

Could you please hold off on this patch? It won't be needed in a v2, which
will use power domains.

Also, I'm not sure whether the "sound" tree is suitable for any of the
patches in this series.
On Tue, Dec 01, 2020 at 05:17:20PM +0300, Dmitry Osipenko wrote:
> 01.12.2020 16:57, Mark Brown wrote:
> > [1/1] regulator: Allow skipping disabled regulators in regulator_check_consumers()
> >       (no commit info)
> Could you please hold on this patch? It won't be needed in a v2, which
> will use power domains.
> Also, I'm not sure whether the "sound" tree is suitable for any of the
> patches in this series.

It didn't actually get applied (note the "no commit info") - it looks
like b4's matching code got confused and decided to generate mails for
anything that I've ever downloaded and not posted.
01.12.2020 17:34, Mark Brown wrote:
> On Tue, Dec 01, 2020 at 05:17:20PM +0300, Dmitry Osipenko wrote:
>> 01.12.2020 16:57, Mark Brown wrote:
>
>>> [1/1] regulator: Allow skipping disabled regulators in regulator_check_consumers()
>>>       (no commit info)
>
>> Could you please hold on this patch? It won't be needed in a v2, which
>> will use power domains.
>
>> Also, I'm not sure whether the "sound" tree is suitable for any of the
>> patches in this series.
>
> It didn't actually get applied (note the "no commit info") - it looks
> like b4's matching code got confused and decided to generate mails for
> anything that I've ever downloaded and not posted.
>

Alright, thank you for the clarification.