Message ID | 20240226065113.1690534-1-nick.hu@sifive.com
---|---
State | New
Series | cpuidle: riscv-sbi: Add cluster_pm_enter()/exit()
Hello: This patch was applied to riscv/linux.git (for-next) by Palmer Dabbelt <palmer@rivosinc.com>: On Mon, 26 Feb 2024 14:51:13 +0800 you wrote: > When the cpus in the same cluster are all in the idle state, the kernel > might put the cluster into a deeper low power state. Call the > cluster_pm_enter() before entering the low power state and call the > cluster_pm_exit() after the cluster woken up. > > Signed-off-by: Nick Hu <nick.hu@sifive.com> > > [...] Here is the summary with links: - cpuidle: riscv-sbi: Add cluster_pm_enter()/exit() https://git.kernel.org/riscv/c/3fd665f2854d You are awesome, thank you!
On Mon, 26 Feb 2024 at 07:51, Nick Hu <nick.hu@sifive.com> wrote: > > When the cpus in the same cluster are all in the idle state, the kernel > might put the cluster into a deeper low power state. Call the > cluster_pm_enter() before entering the low power state and call the > cluster_pm_exit() after the cluster woken up. > > Signed-off-by: Nick Hu <nick.hu@sifive.com> I was not cced this patch, but noticed that this patch got queued up recently. Sorry for not noticing earlier. If not too late, can you please drop/revert it? We should really move away from the CPU cluster notifiers. See more information below. > --- > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++-- > 1 file changed, 22 insertions(+), 2 deletions(-) > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c > index e8094fc92491..298dc76a00cf 100644 > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > { > struct genpd_power_state *state = &pd->states[pd->state_idx]; > u32 *pd_state; > + int ret; > > if (!state->data) > return 0; > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > if (!sbi_cpuidle_pd_allow_domain_state) > return -EBUSY; > > + ret = cpu_cluster_pm_enter(); > + if (ret) > + return ret; Rather than using the CPU cluster notifiers, consumers of the genpd can register themselves to receive genpd on/off notifiers. In other words, none of this should be needed, right? [...] Kind regards Uffe
Hi Ulf On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <nick.hu@sifive.com> wrote: > > > > When the cpus in the same cluster are all in the idle state, the kernel > > might put the cluster into a deeper low power state. Call the > > cluster_pm_enter() before entering the low power state and call the > > cluster_pm_exit() after the cluster woken up. > > > > Signed-off-by: Nick Hu <nick.hu@sifive.com> > > I was not cced this patch, but noticed that this patch got queued up > recently. Sorry for not noticing earlier. > > If not too late, can you please drop/revert it? We should really move > away from the CPU cluster notifiers. See more information below. > > > --- > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++-- > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c > > index e8094fc92491..298dc76a00cf 100644 > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > { > > struct genpd_power_state *state = &pd->states[pd->state_idx]; > > u32 *pd_state; > > + int ret; > > > > if (!state->data) > > return 0; > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > if (!sbi_cpuidle_pd_allow_domain_state) > > return -EBUSY; > > > > + ret = cpu_cluster_pm_enter(); > > + if (ret) > > + return ret; > > Rather than using the CPU cluster notifiers, consumers of the genpd > can register themselves to receive genpd on/off notifiers. > > In other words, none of this should be needed, right? > Thanks for the feedback! Maybe I miss something, I'm wondering about a case like below: If we have a shared L2 cache controller inside the cpu cluster power domain and we add this controller to be a consumer of the power domain, Shouldn't the genpd invoke the domain idle only after the shared L2 cache controller is suspended? Is there a way that we can put the L2 cache down while all cpus in the same cluster are idle? > [...] > > Kind regards > Uffe
On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <nick.hu@sifive.com> wrote: > > Hi Ulf > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > When the cpus in the same cluster are all in the idle state, the kernel > > > might put the cluster into a deeper low power state. Call the > > > cluster_pm_enter() before entering the low power state and call the > > > cluster_pm_exit() after the cluster woken up. > > > > > > Signed-off-by: Nick Hu <nick.hu@sifive.com> > > > > I was not cced this patch, but noticed that this patch got queued up > > recently. Sorry for not noticing earlier. > > > > If not too late, can you please drop/revert it? We should really move > > away from the CPU cluster notifiers. See more information below. > > > > > --- > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++-- > > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > index e8094fc92491..298dc76a00cf 100644 > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > { > > > struct genpd_power_state *state = &pd->states[pd->state_idx]; > > > u32 *pd_state; > > > + int ret; > > > > > > if (!state->data) > > > return 0; > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > if (!sbi_cpuidle_pd_allow_domain_state) > > > return -EBUSY; > > > > > > + ret = cpu_cluster_pm_enter(); > > > + if (ret) > > > + return ret; > > > > Rather than using the CPU cluster notifiers, consumers of the genpd > > can register themselves to receive genpd on/off notifiers. > > > > In other words, none of this should be needed, right? > > > Thanks for the feedback! > Maybe I miss something, I'm wondering about a case like below: > If we have a shared L2 cache controller inside the cpu cluster power > domain and we add this controller to be a consumer of the power > domain, Shouldn't the genpd invoke the domain idle only after the > shared L2 cache controller is suspended? > Is there a way that we can put the L2 cache down while all cpus in the > same cluster are idle? > > [...] Sorry, I made some mistake in my second question. Update the question here: Is there a way that we can save the L2 cache states while all cpus in the same cluster are idle and the cluster could be powered down? > > > > Kind regards > > Uffe
On Mon, 29 Apr 2024 at 18:26, Nick Hu <nick.hu@sifive.com> wrote: > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <nick.hu@sifive.com> wrote: > > > > Hi Ulf > > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > When the cpus in the same cluster are all in the idle state, the kernel > > > > might put the cluster into a deeper low power state. Call the > > > > cluster_pm_enter() before entering the low power state and call the > > > > cluster_pm_exit() after the cluster woken up. > > > > > > > > Signed-off-by: Nick Hu <nick.hu@sifive.com> > > > > > > I was not cced this patch, but noticed that this patch got queued up > > > recently. Sorry for not noticing earlier. > > > > > > If not too late, can you please drop/revert it? We should really move > > > away from the CPU cluster notifiers. See more information below. > > > > > > > --- > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++-- > > > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > index e8094fc92491..298dc76a00cf 100644 > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > { > > > > struct genpd_power_state *state = &pd->states[pd->state_idx]; > > > > u32 *pd_state; > > > > + int ret; > > > > > > > > if (!state->data) > > > > return 0; > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > if (!sbi_cpuidle_pd_allow_domain_state) > > > > return -EBUSY; > > > > > > > > + ret = cpu_cluster_pm_enter(); > > > > + if (ret) > > > > + return ret; > > > > > > Rather than using the CPU cluster notifiers, consumers of the genpd > > > can register themselves to receive genpd on/off notifiers. > > > > > > In other words, none of this should be needed, right? > > > > > Thanks for the feedback! > > Maybe I miss something, I'm wondering about a case like below: > > If we have a shared L2 cache controller inside the cpu cluster power > > domain and we add this controller to be a consumer of the power > > domain, Shouldn't the genpd invoke the domain idle only after the > > shared L2 cache controller is suspended? > > Is there a way that we can put the L2 cache down while all cpus in the > > same cluster are idle? > > > [...] > Sorry, I made some mistake in my second question. > Update the question here: > Is there a way that we can save the L2 cache states while all cpus in the > same cluster are idle and the cluster could be powered down? If the L2 cache is a consumer of the cluster, the consumer driver for the L2 cache should register for genpd on/off notifiers. The device representing the L2 cache needs to be enabled for runtime PM, to be taken into account correctly by the cluster genpd. In this case, the device should most likely remain runtime suspended, but instead rely on the genpd on/off notifiers to understand when save/restore of the cache states should be done. Kind regards Uffe
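[Editor's note: to make the approach Ulf describes above concrete, here is a minimal, hypothetical sketch of a consumer driver for a shared L2 cache controller that stays runtime-suspended and registers for genpd on/off notifiers. The `l2_save_state()`/`l2_restore_state()` helpers and the probe function are placeholders, not an existing API, and the usual platform_driver/DT-match boilerplate is omitted.]

```c
#include <linux/notifier.h>
#include <linux/platform_device.h>
#include <linux/pm_domain.h>
#include <linux/pm_runtime.h>

static void l2_save_state(void)    { /* save cache controller state (placeholder) */ }
static void l2_restore_state(void) { /* restore cache controller state (placeholder) */ }

static int l2_genpd_notifier(struct notifier_block *nb, unsigned long action,
			     void *data)
{
	switch (action) {
	case GENPD_NOTIFY_PRE_OFF:	/* cluster domain is about to power off */
		l2_save_state();
		break;
	case GENPD_NOTIFY_ON:		/* cluster domain has powered back on */
		l2_restore_state();
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block l2_nb = {
	.notifier_call = l2_genpd_notifier,
};

static int l2_probe(struct platform_device *pdev)
{
	/*
	 * The power-domains = <&CLUSTER_PD> property attaches this device to
	 * the cluster genpd at probe time. Enable runtime PM but stay
	 * runtime-suspended; the notifiers tell us when to save/restore.
	 */
	pm_runtime_enable(&pdev->dev);

	/* Receive GENPD_NOTIFY_* events for the cluster power domain. */
	return dev_pm_genpd_add_notifier(&pdev->dev, &l2_nb);
}
```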
Hi Ulf, Thank you for your valuable suggestion. I sincerely apologize for the delay in responding to your message. We have diligently worked on experimenting with the suggestion you provided. As per your recommendation, we have incorporated the "power-domains=<> property" into the consumer's node, resulting in modifications to the DTS as illustrated below: cpus { ... domain-idle-states { CLUSTER_SLEEP:cluster-sleep { compatible = "domain-idle-state"; ... } } power-domains { ... ... CLUSTER_PD: clusterpd { domain-idle-states = <&CLUSTER_SLEEP>; }; } } soc { deviceA@xxx{ ... power-domains = <&CLUSTER_PD>; ... } } However, this adjustment has led to an issue where the probe for 'deviceA' is deferred by 'device_links_check_suppliers()' within 'really_probe()'. In an attempt to mitigate this issue, we experimented with a workaround by adding the attribute "status="disabled"" to the 'CLUSTER_PD' node. This action aimed to prevent the creation of a device link between 'deviceA' and 'CLUSTER_PD'. Nevertheless, we remain uncertain about the appropriateness of this solution. Do you have suggestions on how to effectively address this issue? Regards, Nick On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > On Mon, 29 Apr 2024 at 18:26, Nick Hu <nick.hu@sifive.com> wrote: > > > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > Hi Ulf > > > > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > When the cpus in the same cluster are all in the idle state, the kernel > > > > > might put the cluster into a deeper low power state. Call the > > > > > cluster_pm_enter() before entering the low power state and call the > > > > > cluster_pm_exit() after the cluster woken up. > > > > > > > > > > Signed-off-by: Nick Hu <nick.hu@sifive.com> > > > > > > > > I was not cced this patch, but noticed that this patch got queued up > > > > recently. Sorry for not noticing earlier. > > > > > > > > If not too late, can you please drop/revert it? We should really move > > > > away from the CPU cluster notifiers. See more information below. > > > > > > > > > --- > > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++-- > > > > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > index e8094fc92491..298dc76a00cf 100644 > > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > { > > > > > struct genpd_power_state *state = &pd->states[pd->state_idx]; > > > > > u32 *pd_state; > > > > > + int ret; > > > > > > > > > > if (!state->data) > > > > > return 0; > > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > if (!sbi_cpuidle_pd_allow_domain_state) > > > > > return -EBUSY; > > > > > > > > > > + ret = cpu_cluster_pm_enter(); > > > > > + if (ret) > > > > > + return ret; > > > > > > > > Rather than using the CPU cluster notifiers, consumers of the genpd > > > > can register themselves to receive genpd on/off notifiers. > > > > > > > > In other words, none of this should be needed, right? > > > > > > > Thanks for the feedback! 
> > > Maybe I miss something, I'm wondering about a case like below: > > > If we have a shared L2 cache controller inside the cpu cluster power > > > domain and we add this controller to be a consumer of the power > > > domain, Shouldn't the genpd invoke the domain idle only after the > > > shared L2 cache controller is suspended? > > > Is there a way that we can put the L2 cache down while all cpus in the > > > same cluster are idle? > > > > [...] > > Sorry, I made some mistake in my second question. > > Update the question here: > > Is there a way that we can save the L2 cache states while all cpus in the > > same cluster are idle and the cluster could be powered down? > > If the L2 cache is a consumer of the cluster, the consumer driver for > the L2 cache should register for genpd on/off notifiers. > > The device representing the L2 cache needs to be enabled for runtime > PM, to be taken into account correctly by the cluster genpd. In this > case, the device should most likely remain runtime suspended, but > instead rely on the genpd on/off notifiers to understand when > save/restore of the cache states should be done. > > Kind regards > Uffe
Hi Nick, On Tue, May 14, 2024 at 3:20 PM Nick Hu <nick.hu@sifive.com> wrote: > > Hi Ulf, > > Thank you for your valuable suggestion. > I sincerely apologize for the delay in responding to your message. We > have diligently worked on experimenting with the suggestion you > provided. > > As per your recommendation, we have incorporated the "power-domains=<> > property" into the consumer's node, resulting in modifications to the > DTS as illustrated below: > > cpus { > ... > domain-idle-states { > CLUSTER_SLEEP:cluster-sleep { > compatible = "domain-idle-state"; > ... > } > } > power-domains { > ... > ... > CLUSTER_PD: clusterpd { > domain-idle-states = <&CLUSTER_SLEEP>; > }; > } > } > soc { > deviceA@xxx{ > ... > power-domains = <&CLUSTER_PD>; > ... > } > } > > However, this adjustment has led to an issue where the probe for > 'deviceA' is deferred by 'device_links_check_suppliers()' within > 'really_probe()'. In an attempt to mitigate this issue, we > experimented with a workaround by adding the attribute > "status="disabled"" to the 'CLUSTER_PD' node. This action aimed to > prevent the creation of a device link between 'deviceA' and > 'CLUSTER_PD'. Nevertheless, we remain uncertain about the > appropriateness of this solution. > > Do you have suggestions on how to effectively address this issue? I totally missed this email since I was not CC'ed sorry about that. Please use get_maintainers.pl when sending patches. The genpd_add_provider() (called by of_genpd_add_provider_simple()) does mark the power-domain DT node as initialized (fwnode_dev_initialized()) so after the cpuidle-riscv-sbi driver is probed the 'deviceA' dependency is resolved and 'deviceA' should be probed unless there are other unmet dependencies. Try adding "#define DEBUG" before all includes in drivers/core/base.c and add "loglevel=8" in kernel parameters, this will print producer-consumer linkage of all devices. Marking the power-domain DT node as "disabled" is certainly not the right way. Regards, Anup > > Regards, > Nick > > On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > On Mon, 29 Apr 2024 at 18:26, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > Hi Ulf > > > > > > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > When the cpus in the same cluster are all in the idle state, the kernel > > > > > > might put the cluster into a deeper low power state. Call the > > > > > > cluster_pm_enter() before entering the low power state and call the > > > > > > cluster_pm_exit() after the cluster woken up. > > > > > > > > > > > > Signed-off-by: Nick Hu <nick.hu@sifive.com> > > > > > > > > > > I was not cced this patch, but noticed that this patch got queued up > > > > > recently. Sorry for not noticing earlier. > > > > > > > > > > If not too late, can you please drop/revert it? We should really move > > > > > away from the CPU cluster notifiers. See more information below. 
> > > > > > > > > > > --- > > > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++-- > > > > > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > index e8094fc92491..298dc76a00cf 100644 > > > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > > { > > > > > > struct genpd_power_state *state = &pd->states[pd->state_idx]; > > > > > > u32 *pd_state; > > > > > > + int ret; > > > > > > > > > > > > if (!state->data) > > > > > > return 0; > > > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > > if (!sbi_cpuidle_pd_allow_domain_state) > > > > > > return -EBUSY; > > > > > > > > > > > > + ret = cpu_cluster_pm_enter(); > > > > > > + if (ret) > > > > > > + return ret; > > > > > > > > > > Rather than using the CPU cluster notifiers, consumers of the genpd > > > > > can register themselves to receive genpd on/off notifiers. > > > > > > > > > > In other words, none of this should be needed, right? > > > > > > > > > Thanks for the feedback! > > > > Maybe I miss something, I'm wondering about a case like below: > > > > If we have a shared L2 cache controller inside the cpu cluster power > > > > domain and we add this controller to be a consumer of the power > > > > domain, Shouldn't the genpd invoke the domain idle only after the > > > > shared L2 cache controller is suspended? > > > > Is there a way that we can put the L2 cache down while all cpus in the > > > > same cluster are idle? > > > > > [...] > > > Sorry, I made some mistake in my second question. > > > Update the question here: > > > Is there a way that we can save the L2 cache states while all cpus in the > > > same cluster are idle and the cluster could be powered down? > > > > If the L2 cache is a consumer of the cluster, the consumer driver for > > the L2 cache should register for genpd on/off notifiers. > > > > The device representing the L2 cache needs to be enabled for runtime > > PM, to be taken into account correctly by the cluster genpd. In this > > case, the device should most likely remain runtime suspended, but > > instead rely on the genpd on/off notifiers to understand when > > save/restore of the cache states should be done. > > > > Kind regards > > Uffe
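[Editor's note: for context on Anup's point that registering the genpd provider marks the power-domain DT node as initialized, below is a rough, illustrative sketch of how a provider might be registered for one power-domain node. Function and variable names are hypothetical; this is not the actual cpuidle-riscv-sbi code.]

```c
#include <linux/of.h>
#include <linux/pm_domain.h>
#include <linux/slab.h>
#include <linux/string.h>

static int my_pd_init(struct device_node *pd_node)
{
	struct generic_pm_domain *pd;
	int ret;

	pd = kzalloc(sizeof(*pd), GFP_KERNEL);
	if (!pd)
		return -ENOMEM;

	pd->name = kasprintf(GFP_KERNEL, "%pOFn", pd_node);

	/* Register the domain itself, initially powered on. */
	ret = pm_genpd_init(pd, NULL, false);
	if (ret)
		goto free;

	/*
	 * Registering the OF provider is what marks the DT node as
	 * initialized (fwnode_dev_initialized()), so consumers that
	 * reference it via power-domains = <...> stop deferring probe.
	 */
	ret = of_genpd_add_provider_simple(pd_node, pd);
	if (ret)
		goto remove;

	return 0;

remove:
	pm_genpd_remove(pd);
free:
	kfree(pd->name);
	kfree(pd);
	return ret;
}
```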
On Tue, May 14, 2024 at 7:53 PM Anup Patel <anup@brainfault.org> wrote: > > Hi Nick, > > On Tue, May 14, 2024 at 3:20 PM Nick Hu <nick.hu@sifive.com> wrote: > > > > Hi Ulf, > > > > Thank you for your valuable suggestion. > > I sincerely apologize for the delay in responding to your message. We > > have diligently worked on experimenting with the suggestion you > > provided. > > > > As per your recommendation, we have incorporated the "power-domains=<> > > property" into the consumer's node, resulting in modifications to the > > DTS as illustrated below: > > > > cpus { > > ... > > domain-idle-states { > > CLUSTER_SLEEP:cluster-sleep { > > compatible = "domain-idle-state"; > > ... > > } > > } > > power-domains { > > ... > > ... > > CLUSTER_PD: clusterpd { > > domain-idle-states = <&CLUSTER_SLEEP>; > > }; > > } > > } > > soc { > > deviceA@xxx{ > > ... > > power-domains = <&CLUSTER_PD>; > > ... > > } > > } > > > > However, this adjustment has led to an issue where the probe for > > 'deviceA' is deferred by 'device_links_check_suppliers()' within > > 'really_probe()'. In an attempt to mitigate this issue, we > > experimented with a workaround by adding the attribute > > "status="disabled"" to the 'CLUSTER_PD' node. This action aimed to > > prevent the creation of a device link between 'deviceA' and > > 'CLUSTER_PD'. Nevertheless, we remain uncertain about the > > appropriateness of this solution. > > > > Do you have suggestions on how to effectively address this issue? > > I totally missed this email since I was not CC'ed sorry about that. Please > use get_maintainers.pl when sending patches. I stand corrected. This patch had landed in the "spam" folder. I don't know why. Regards, Anup > > The genpd_add_provider() (called by of_genpd_add_provider_simple()) > does mark the power-domain DT node as initialized (fwnode_dev_initialized()) > so after the cpuidle-riscv-sbi driver is probed the 'deviceA' dependency is > resolved and 'deviceA' should be probed unless there are other unmet > dependencies. > > Try adding "#define DEBUG" before all includes in drivers/core/base.c > and add "loglevel=8" in kernel parameters, this will print producer-consumer > linkage of all devices. > > Marking the power-domain DT node as "disabled" is certainly not the > right way. > > Regards, > Anup > > > > > Regards, > > Nick > > > > On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > On Mon, 29 Apr 2024 at 18:26, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > Hi Ulf > > > > > > > > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > > > > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > > > When the cpus in the same cluster are all in the idle state, the kernel > > > > > > > might put the cluster into a deeper low power state. Call the > > > > > > > cluster_pm_enter() before entering the low power state and call the > > > > > > > cluster_pm_exit() after the cluster woken up. > > > > > > > > > > > > > > Signed-off-by: Nick Hu <nick.hu@sifive.com> > > > > > > > > > > > > I was not cced this patch, but noticed that this patch got queued up > > > > > > recently. Sorry for not noticing earlier. > > > > > > > > > > > > If not too late, can you please drop/revert it? We should really move > > > > > > away from the CPU cluster notifiers. See more information below. 
> > > > > > > > > > > > > --- > > > > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++-- > > > > > > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > index e8094fc92491..298dc76a00cf 100644 > > > > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > > > { > > > > > > > struct genpd_power_state *state = &pd->states[pd->state_idx]; > > > > > > > u32 *pd_state; > > > > > > > + int ret; > > > > > > > > > > > > > > if (!state->data) > > > > > > > return 0; > > > > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > > > if (!sbi_cpuidle_pd_allow_domain_state) > > > > > > > return -EBUSY; > > > > > > > > > > > > > > + ret = cpu_cluster_pm_enter(); > > > > > > > + if (ret) > > > > > > > + return ret; > > > > > > > > > > > > Rather than using the CPU cluster notifiers, consumers of the genpd > > > > > > can register themselves to receive genpd on/off notifiers. > > > > > > > > > > > > In other words, none of this should be needed, right? > > > > > > > > > > > Thanks for the feedback! > > > > > Maybe I miss something, I'm wondering about a case like below: > > > > > If we have a shared L2 cache controller inside the cpu cluster power > > > > > domain and we add this controller to be a consumer of the power > > > > > domain, Shouldn't the genpd invoke the domain idle only after the > > > > > shared L2 cache controller is suspended? > > > > > Is there a way that we can put the L2 cache down while all cpus in the > > > > > same cluster are idle? > > > > > > [...] > > > > Sorry, I made some mistake in my second question. > > > > Update the question here: > > > > Is there a way that we can save the L2 cache states while all cpus in the > > > > same cluster are idle and the cluster could be powered down? > > > > > > If the L2 cache is a consumer of the cluster, the consumer driver for > > > the L2 cache should register for genpd on/off notifiers. > > > > > > The device representing the L2 cache needs to be enabled for runtime > > > PM, to be taken into account correctly by the cluster genpd. In this > > > case, the device should most likely remain runtime suspended, but > > > instead rely on the genpd on/off notifiers to understand when > > > save/restore of the cache states should be done. > > > > > > Kind regards > > > Uffe
Hi Nick, On Wed, May 15, 2024 at 5:45 PM Nick Hu <nick.hu@sifive.com> wrote: > > Hi Anup, > > Thank you for your guidance. > After enabling the debug message, we found a way to solve the problem > by the following steps: > 1. Add a compatible string in 'power-domains' node otherwise it won't > be the supplier of the consumers. (See of_link_to_phandle()) Hmm, requiring a compatible string is odd. Where should we document this compatible string ? > 2. Move the 'power-domains' node outside the 'cpus' node otherwise it > won't be added to the device hierarchy by device_add(). > 3. Update the cpuidle-riscv-sbi driver to get the pds_node from > '/power-domains'. By adding a compatible string and moving the "power-domains" node outside, you are simply forcing the OF framework to populate devices. How about manually creating platform_device for each power-domain DT node using of_platform_device_create() in sbi_pd_init() ? > > So the DTS will be like: > cpus { > ... > domain-idle-states { > CLUSTER_SLEEP:cluster-sleep { > compatible = "domain-idle-state"; > ... > } > } > } > power-domains { > compatible = "riscv,sbi-power-domains" > ... > ... > CLUSTER_PD: clusterpd { > domain-idle-states = <&CLUSTER_SLEEP>; > }; > } > soc { > deviceA@xxx{ > ... > power-domains = <&CLUSTER_PD>; > ... > } > } Regards, Anup > > Regards, > Nick > > On Tue, May 14, 2024 at 10:54 PM Anup Patel <anup@brainfault.org> wrote: > > > > On Tue, May 14, 2024 at 7:53 PM Anup Patel <anup@brainfault.org> wrote: > > > > > > Hi Nick, > > > > > > On Tue, May 14, 2024 at 3:20 PM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > Hi Ulf, > > > > > > > > Thank you for your valuable suggestion. > > > > I sincerely apologize for the delay in responding to your message. We > > > > have diligently worked on experimenting with the suggestion you > > > > provided. > > > > > > > > As per your recommendation, we have incorporated the "power-domains=<> > > > > property" into the consumer's node, resulting in modifications to the > > > > DTS as illustrated below: > > > > > > > > cpus { > > > > ... > > > > domain-idle-states { > > > > CLUSTER_SLEEP:cluster-sleep { > > > > compatible = "domain-idle-state"; > > > > ... > > > > } > > > > } > > > > power-domains { > > > > ... > > > > ... > > > > CLUSTER_PD: clusterpd { > > > > domain-idle-states = <&CLUSTER_SLEEP>; > > > > }; > > > > } > > > > } > > > > soc { > > > > deviceA@xxx{ > > > > ... > > > > power-domains = <&CLUSTER_PD>; > > > > ... > > > > } > > > > } > > > > > > > > However, this adjustment has led to an issue where the probe for > > > > 'deviceA' is deferred by 'device_links_check_suppliers()' within > > > > 'really_probe()'. In an attempt to mitigate this issue, we > > > > experimented with a workaround by adding the attribute > > > > "status="disabled"" to the 'CLUSTER_PD' node. This action aimed to > > > > prevent the creation of a device link between 'deviceA' and > > > > 'CLUSTER_PD'. Nevertheless, we remain uncertain about the > > > > appropriateness of this solution. > > > > > > > > Do you have suggestions on how to effectively address this issue? > > > > > > I totally missed this email since I was not CC'ed sorry about that. Please > > > use get_maintainers.pl when sending patches. > > > > I stand corrected. This patch had landed in the "spam" folder. I don't know why. 
> > > > Regards, > > Anup > > > > > > > > The genpd_add_provider() (called by of_genpd_add_provider_simple()) > > > does mark the power-domain DT node as initialized (fwnode_dev_initialized()) > > > so after the cpuidle-riscv-sbi driver is probed the 'deviceA' dependency is > > > resolved and 'deviceA' should be probed unless there are other unmet > > > dependencies. > > > > > > Try adding "#define DEBUG" before all includes in drivers/core/base.c > > > and add "loglevel=8" in kernel parameters, this will print producer-consumer > > > linkage of all devices. > > > > > > Marking the power-domain DT node as "disabled" is certainly not the > > > right way. > > > > > > Regards, > > > Anup > > > > > > > > > > > Regards, > > > > Nick > > > > > > > > On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > > > > > On Mon, 29 Apr 2024 at 18:26, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > > > Hi Ulf > > > > > > > > > > > > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > > > > > > > > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > > > > > > > When the cpus in the same cluster are all in the idle state, the kernel > > > > > > > > > might put the cluster into a deeper low power state. Call the > > > > > > > > > cluster_pm_enter() before entering the low power state and call the > > > > > > > > > cluster_pm_exit() after the cluster woken up. > > > > > > > > > > > > > > > > > > Signed-off-by: Nick Hu <nick.hu@sifive.com> > > > > > > > > > > > > > > > > I was not cced this patch, but noticed that this patch got queued up > > > > > > > > recently. Sorry for not noticing earlier. > > > > > > > > > > > > > > > > If not too late, can you please drop/revert it? We should really move > > > > > > > > away from the CPU cluster notifiers. See more information below. > > > > > > > > > > > > > > > > > --- > > > > > > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++-- > > > > > > > > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > > > index e8094fc92491..298dc76a00cf 100644 > > > > > > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > > > > > { > > > > > > > > > struct genpd_power_state *state = &pd->states[pd->state_idx]; > > > > > > > > > u32 *pd_state; > > > > > > > > > + int ret; > > > > > > > > > > > > > > > > > > if (!state->data) > > > > > > > > > return 0; > > > > > > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > > > > > if (!sbi_cpuidle_pd_allow_domain_state) > > > > > > > > > return -EBUSY; > > > > > > > > > > > > > > > > > > + ret = cpu_cluster_pm_enter(); > > > > > > > > > + if (ret) > > > > > > > > > + return ret; > > > > > > > > > > > > > > > > Rather than using the CPU cluster notifiers, consumers of the genpd > > > > > > > > can register themselves to receive genpd on/off notifiers. > > > > > > > > > > > > > > > > In other words, none of this should be needed, right? > > > > > > > > > > > > > > > Thanks for the feedback! 
> > > > > > > Maybe I miss something, I'm wondering about a case like below: > > > > > > > If we have a shared L2 cache controller inside the cpu cluster power > > > > > > > domain and we add this controller to be a consumer of the power > > > > > > > domain, Shouldn't the genpd invoke the domain idle only after the > > > > > > > shared L2 cache controller is suspended? > > > > > > > Is there a way that we can put the L2 cache down while all cpus in the > > > > > > > same cluster are idle? > > > > > > > > [...] > > > > > > Sorry, I made some mistake in my second question. > > > > > > Update the question here: > > > > > > Is there a way that we can save the L2 cache states while all cpus in the > > > > > > same cluster are idle and the cluster could be powered down? > > > > > > > > > > If the L2 cache is a consumer of the cluster, the consumer driver for > > > > > the L2 cache should register for genpd on/off notifiers. > > > > > > > > > > The device representing the L2 cache needs to be enabled for runtime > > > > > PM, to be taken into account correctly by the cluster genpd. In this > > > > > case, the device should most likely remain runtime suspended, but > > > > > instead rely on the genpd on/off notifiers to understand when > > > > > save/restore of the cache states should be done. > > > > > > > > > > Kind regards > > > > > Uffe
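[Editor's note: a hedged sketch of the alternative Anup suggests above, creating a platform_device for each power-domain sub-node with of_platform_device_create() instead of relocating the node. The helper name and error handling are illustrative only, not the actual sbi_pd_init() change.]

```c
#include <linux/of.h>
#include <linux/of_platform.h>
#include <linux/printk.h>

static void create_pd_devices(struct device_node *pds_node, struct device *parent)
{
	struct device_node *np;

	/* Populate a device for each power-domain sub-node so device links
	 * to consumers can be formed without moving the node out of /cpus. */
	for_each_child_of_node(pds_node, np) {
		if (!of_platform_device_create(np, NULL, parent))
			pr_warn("failed to create device for %pOF\n", np);
	}
}
```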
On Thu, May 16, 2024 at 9:40 AM Nick Hu <nick.hu@sifive.com> wrote: > > Hi Anup > > On Wed, May 15, 2024 at 9:46 PM Anup Patel <anup@brainfault.org> wrote: > > > > Hi Nick, > > > > On Wed, May 15, 2024 at 5:45 PM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > Hi Anup, > > > > > > Thank you for your guidance. > > > After enabling the debug message, we found a way to solve the problem > > > by the following steps: > > > 1. Add a compatible string in 'power-domains' node otherwise it won't > > > be the supplier of the consumers. (See of_link_to_phandle()) > > > > Hmm, requiring a compatible string is odd. Where should we document > > this compatible string ? > > > Sorry, this is my fault. I didn't include some updates in > of_link_to_phandle(). This led some misunderstandings here. > You are right, we don't need it. > The supplier will be linked to the CLUSTER_PD node. > > > > 2. Move the 'power-domains' node outside the 'cpus' node otherwise it > > > won't be added to the device hierarchy by device_add(). > > > 3. Update the cpuidle-riscv-sbi driver to get the pds_node from > > > '/power-domains'. > > > > By adding a compatible string and moving the "power-domains" node > > outside, you are simply forcing the OF framework to populate devices. > > > > How about manually creating platform_device for each power-domain > > DT node using of_platform_device_create() in sbi_pd_init() ? > > > Thanks for the suggestion! We have test the solution and it could work. > We was wondering if it's feasible for us to relocate the > 'power-domains' node outside of the /cpus? The CLUSTER_PD might > encompass not only the CPUs but also other components within the > cluster. The cpuidle-riscv-sbi driver expects "power-domains" DT node under "/cpus" DT node because this driver only deals with power domains related to CPU cluster or CPU cache-hierarchy. It does make sense to define L2/L3 power domains under "/cpus/power-domain" since these are related to CPUs. Moving the CPU "power-domains" DT node directly under "/" or somewhere else would mean that it covers system-wide power domains which is not true. I suggest we continue using "/cpus/power-domains" DT node only for power domains related to CPU clusters or CPU cache-hierarchy. For system wide power domains of SoC devices, we can either: 1) Use device power domains through the SBI MPXY extension via different driver 2) Use a platform specific driver > > We also look at cpuidle_psci_domain driver and it seems Arm doesn't > create the devices for each subnode of psci domain. > Is there any reason that they don't need it? Existing ARM DTS files under arch/arm64/boot/dts, use device power domains through SCMI (or platform specific mechanism) which are already populated as devices by Linux DD framework. Regards, Anup > > > > > > > So the DTS will be like: > > > cpus { > > > ... > > > domain-idle-states { > > > CLUSTER_SLEEP:cluster-sleep { > > > compatible = "domain-idle-state"; > > > ... > > > } > > > } > > > } > > > power-domains { > > > compatible = "riscv,sbi-power-domains" > > > ... > > > ... > > > CLUSTER_PD: clusterpd { > > > domain-idle-states = <&CLUSTER_SLEEP>; > > > }; > > > } > > > soc { > > > deviceA@xxx{ > > > ... > > > power-domains = <&CLUSTER_PD>; > > > ... 
> > > } > > > } > > > > Regards, > > Anup > > > > > > > > Regards, > > > Nick > > > > > > On Tue, May 14, 2024 at 10:54 PM Anup Patel <anup@brainfault.org> wrote: > > > > > > > > On Tue, May 14, 2024 at 7:53 PM Anup Patel <anup@brainfault.org> wrote: > > > > > > > > > > Hi Nick, > > > > > > > > > > On Tue, May 14, 2024 at 3:20 PM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > Hi Ulf, > > > > > > > > > > > > Thank you for your valuable suggestion. > > > > > > I sincerely apologize for the delay in responding to your message We > > > > > > have diligently worked on experimenting with the suggestion you > > > > > > provided. > > > > > > > > > > > > As per your recommendation, we have incorporated the "power-domains=<> > > > > > > property" into the consumer's node, resulting in modifications to the > > > > > > DTS as illustrated below: > > > > > > > > > > > > cpus { > > > > > > ... > > > > > > domain-idle-states { > > > > > > CLUSTER_SLEEP:cluster-sleep { > > > > > > compatible = "domain-idle-state"; > > > > > > ... > > > > > > } > > > > > > } > > > > > > power-domains { > > > > > > ... > > > > > > ... > > > > > > CLUSTER_PD: clusterpd { > > > > > > domain-idle-states = <&CLUSTER_SLEEP>; > > > > > > }; > > > > > > } > > > > > > } > > > > > > soc { > > > > > > deviceA@xxx{ > > > > > > ... > > > > > > power-domains = <&CLUSTER_PD>; > > > > > > ... > > > > > > } > > > > > > } > > > > > > > > > > > > However, this adjustment has led to an issue where the probe for > > > > > > 'deviceA' is deferred by 'device_links_check_suppliers()' within > > > > > > 'really_probe()'. In an attempt to mitigate this issue, we > > > > > > experimented with a workaround by adding the attribute > > > > > > "status="disabled"" to the 'CLUSTER_PD' node. This action aimed to > > > > > > prevent the creation of a device link between 'deviceA' and > > > > > > 'CLUSTER_PD'. Nevertheless, we remain uncertain about the > > > > > > appropriateness of this solution. > > > > > > > > > > > > Do you have suggestions on how to effectively address this issue? > > > > > > > > > > I totally missed this email since I was not CC'ed sorry about that. Please > > > > > use get_maintainers.pl when sending patches. > > > > > > > > I stand corrected. This patch had landed in the "spam" folder. I don't know why. > > > > > > > > Regards, > > > > Anup > > > > > > > > > > > > > > The genpd_add_provider() (called by of_genpd_add_provider_simple()) > > > > > does mark the power-domain DT node as initialized (fwnode_dev_initialized()) > > > > > so after the cpuidle-riscv-sbi driver is probed the 'deviceA' dependency is > > > > > resolved and 'deviceA' should be probed unless there are other unmet > > > > > dependencies. > > > > > > > > > > Try adding "#define DEBUG" before all includes in drivers/core/basec > > > > > and add "loglevel=8" in kernel parameters, this will print producer-consumer > > > > > linkage of all devices. > > > > > > > > > > Marking the power-domain DT node as "disabled" is certainly not the > > > > > right way. 
> > > > > > > > > > Regards, > > > > > Anup > > > > > > > > > > > > > > > > > Regards, > > > > > > Nick > > > > > > > > > > > > On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > > > > > > > > > On Mon, 29 Apr 2024 at 18:26, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > > > > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > > > > > > > Hi Ulf > > > > > > > > > > > > > > > > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > > > > > > > > > > > > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > > > > > > > > > > > When the cpus in the same cluster are all in the idle state, the kernel > > > > > > > > > > > might put the cluster into a deeper low power state. Call the > > > > > > > > > > > cluster_pm_enter() before entering the low power state and call the > > > > > > > > > > > cluster_pm_exit() after the cluster woken up. > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Nick Hu <nick.hu@sifive.com> > > > > > > > > > > > > > > > > > > > > I was not cced this patch, but noticed that this patch got queued up > > > > > > > > > > recently. Sorry for not noticing earlier. > > > > > > > > > > > > > > > > > > > > If not too late, can you please drop/revert it? We should really move > > > > > > > > > > away from the CPU cluster notifiers. See more information below. > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++-- > > > > > > > > > > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > > > > > index e8094fc92491..298dc76a00cf 100644 > > > > > > > > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > > > > > > > { > > > > > > > > > > > struct genpd_power_state *state = &pd->states[pd->state_idx]; > > > > > > > > > > > u32 *pd_state; > > > > > > > > > > > + int ret; > > > > > > > > > > > > > > > > > > > > > > if (!state->data) > > > > > > > > > > > return 0; > > > > > > > > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > > > > > > > if (!sbi_cpuidle_pd_allow_domain_state) > > > > > > > > > > > return -EBUSY; > > > > > > > > > > > > > > > > > > > > > > + ret = cpu_cluster_pm_enter(); > > > > > > > > > > > + if (ret) > > > > > > > > > > > + return ret; > > > > > > > > > > > > > > > > > > > > Rather than using the CPU cluster notifiers, consumers of the genpd > > > > > > > > > > can register themselves to receive genpd on/off notifiers. > > > > > > > > > > > > > > > > > > > > In other words, none of this should be needed, right? > > > > > > > > > > > > > > > > > > > Thanks for the feedback! > > > > > > > > > Maybe I miss something, I'm wondering about a case like below: > > > > > > > > > If we have a shared L2 cache controller inside the cpu cluster power > > > > > > > > > domain and we add this controller to be a consumer of the power > > > > > > > > > domain, Shouldn't the genpd invoke the domain idle only after the > > > > > > > > > shared L2 cache controller is suspended? 
> > > > > > > > > Is there a way that we can put the L2 cache down while all cpus in the > > > > > > > > > same cluster are idle? > > > > > > > > > > [...] > > > > > > > > Sorry, I made some mistake in my second question. > > > > > > > > Update the question here: > > > > > > > > Is there a way that we can save the L2 cache states while all cpus in the > > > > > > > > same cluster are idle and the cluster could be powered down? > > > > > > > > > > > > > > If the L2 cache is a consumer of the cluster, the consumer driver for > > > > > > > the L2 cache should register for genpd on/off notifiers. > > > > > > > > > > > > > > The device representing the L2 cache needs to be enabled for runtime > > > > > > > PM, to be taken into account correctly by the cluster genpd. In this > > > > > > > case, the device should most likely remain runtime suspended, but > > > > > > > instead rely on the genpd on/off notifiers to understand when > > > > > > > save/restore of the cache states should be done. > > > > > > > > > > > > > > Kind regards > > > > > > > Uffe >
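[Editor's note: to illustrate the layout Anup describes, a small sketch of how a driver could locate the CPU-cluster power-domains node under /cpus; the exact lookup in cpuidle-riscv-sbi may differ.]

```c
#include <linux/of.h>

static struct device_node *find_cpu_pd_node(void)
{
	struct device_node *cpus, *pds;

	cpus = of_find_node_by_path("/cpus");
	if (!cpus)
		return NULL;

	/* Cluster/cache power domains are expected under /cpus/power-domains. */
	pds = of_get_child_by_name(cpus, "power-domains");
	of_node_put(cpus);
	return pds;
}
```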
Hi Anup, On Fri, May 17, 2024 at 12:39 PM Anup Patel <apatel@ventanamicro.com> wrote: > > On Thu, May 16, 2024 at 9:40 AM Nick Hu <nick.hu@sifive.com> wrote: > > > > Hi Anup > > > > On Wed, May 15, 2024 at 9:46 PM Anup Patel <anup@brainfault.org> wrote: > > > > > > Hi Nick, > > > > > > On Wed, May 15, 2024 at 5:45 PM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > Hi Anup, > > > > > > > > Thank you for your guidance. > > > > After enabling the debug message, we found a way to solve the problem > > > > by the following steps: > > > > 1. Add a compatible string in 'power-domains' node otherwise it won't > > > > be the supplier of the consumers. (See of_link_to_phandle()) > > > > > > Hmm, requiring a compatible string is odd. Where should we document > > > this compatible string ? > > > > > Sorry, this is my fault. I didn't include some updates in > > of_link_to_phandle(). This led some misunderstandings here. > > You are right, we don't need it. > > The supplier will be linked to the CLUSTER_PD node. > > > > > > 2. Move the 'power-domains' node outside the 'cpus' node otherwise it > > > > won't be added to the device hierarchy by device_add(). > > > > 3. Update the cpuidle-riscv-sbi driver to get the pds_node from > > > > '/power-domains'. > > > > > > By adding a compatible string and moving the "power-domains" node > > > outside, you are simply forcing the OF framework to populate devices. > > > > > > How about manually creating platform_device for each power-domain > > > DT node using of_platform_device_create() in sbi_pd_init() ? > > > > > Thanks for the suggestion! We have test the solution and it could work. > > We was wondering if it's feasible for us to relocate the > > 'power-domains' node outside of the /cpus? The CLUSTER_PD might > > encompass not only the CPUs but also other components within the > > cluster. > > The cpuidle-riscv-sbi driver expects "power-domains" DT node > under "/cpus" DT node because this driver only deals with power > domains related to CPU cluster or CPU cache-hierarchy. It does > make sense to define L2/L3 power domains under > "/cpus/power-domain" since these are related to CPUs. > > Moving the CPU "power-domains" DT node directly under "/" or > somewhere else would mean that it covers system-wide power > domains which is not true. > > I suggest we continue using "/cpus/power-domains" DT node > only for power domains related to CPU clusters or CPU > cache-hierarchy. > > For system wide power domains of SoC devices, we can either: > 1) Use device power domains through the SBI MPXY extension > via different driver > 2) Use a platform specific driver > Thank you for your valuable feedback. We will continue to use the "/cpus/power-domains". > > > > We also look at cpuidle_psci_domain driver and it seems Arm doesn't > > create the devices for each subnode of psci domain. > > Is there any reason that they don't need it? > > Existing ARM DTS files under arch/arm64/boot/dts, use device > power domains through SCMI (or platform specific mechanism) > which are already populated as devices by Linux DD framework. > > Regards, > Anup > Thank you for the explanation! Regards, Nick > > > > > > > > > > So the DTS will be like: > > > > cpus { > > > > ... > > > > domain-idle-states { > > > > CLUSTER_SLEEP:cluster-sleep { > > > > compatible = "domain-idle-state"; > > > > ... > > > > } > > > > } > > > > } > > > > power-domains { > > > > compatible = "riscv,sbi-power-domains" > > > > ... > > > > ... 
> > > > CLUSTER_PD: clusterpd { > > > > domain-idle-states = <&CLUSTER_SLEEP>; > > > > }; > > > > } > > > > soc { > > > > deviceA@xxx{ > > > > ... > > > > power-domains = <&CLUSTER_PD>; > > > > ... > > > > } > > > > } > > > > > > Regards, > > > Anup > > > > > > > > > > > Regards, > > > > Nick > > > > > > > > On Tue, May 14, 2024 at 10:54 PM Anup Patel <anup@brainfault.org> wrote: > > > > > > > > > > On Tue, May 14, 2024 at 7:53 PM Anup Patel <anup@brainfault.org> wrote: > > > > > > > > > > > > Hi Nick, > > > > > > > > > > > > On Tue, May 14, 2024 at 3:20 PM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > > > Hi Ulf, > > > > > > > > > > > > > > Thank you for your valuable suggestion. > > > > > > > I sincerely apologize for the delay in responding to your message We > > > > > > > have diligently worked on experimenting with the suggestion you > > > > > > > provided. > > > > > > > > > > > > > > As per your recommendation, we have incorporated the "power-domains=<> > > > > > > > property" into the consumer's node, resulting in modifications to the > > > > > > > DTS as illustrated below: > > > > > > > > > > > > > > cpus { > > > > > > > ... > > > > > > > domain-idle-states { > > > > > > > CLUSTER_SLEEP:cluster-sleep { > > > > > > > compatible = "domain-idle-state"; > > > > > > > ... > > > > > > > } > > > > > > > } > > > > > > > power-domains { > > > > > > > ... > > > > > > > ... > > > > > > > CLUSTER_PD: clusterpd { > > > > > > > domain-idle-states = <&CLUSTER_SLEEP>; > > > > > > > }; > > > > > > > } > > > > > > > } > > > > > > > soc { > > > > > > > deviceA@xxx{ > > > > > > > ... > > > > > > > power-domains = <&CLUSTER_PD>; > > > > > > > ... > > > > > > > } > > > > > > > } > > > > > > > > > > > > > > However, this adjustment has led to an issue where the probe for > > > > > > > 'deviceA' is deferred by 'device_links_check_suppliers()' within > > > > > > > 'really_probe()'. In an attempt to mitigate this issue, we > > > > > > > experimented with a workaround by adding the attribute > > > > > > > "status="disabled"" to the 'CLUSTER_PD' node. This action aimed to > > > > > > > prevent the creation of a device link between 'deviceA' and > > > > > > > 'CLUSTER_PD'. Nevertheless, we remain uncertain about the > > > > > > > appropriateness of this solution. > > > > > > > > > > > > > > Do you have suggestions on how to effectively address this issue? > > > > > > > > > > > > I totally missed this email since I was not CC'ed sorry about that. Please > > > > > > use get_maintainers.pl when sending patches. > > > > > > > > > > I stand corrected. This patch had landed in the "spam" folder. I don't know why. > > > > > > > > > > Regards, > > > > > Anup > > > > > > > > > > > > > > > > > The genpd_add_provider() (called by of_genpd_add_provider_simple()) > > > > > > does mark the power-domain DT node as initialized (fwnode_dev_initialized()) > > > > > > so after the cpuidle-riscv-sbi driver is probed the 'deviceA' dependency is > > > > > > resolved and 'deviceA' should be probed unless there are other unmet > > > > > > dependencies. > > > > > > > > > > > > Try adding "#define DEBUG" before all includes in drivers/core/basec > > > > > > and add "loglevel=8" in kernel parameters, this will print producer-consumer > > > > > > linkage of all devices. > > > > > > > > > > > > Marking the power-domain DT node as "disabled" is certainly not the > > > > > > right way. 
> > > > > > > > > > > > Regards, > > > > > > Anup > > > > > > > > > > > > > > > > > > > > Regards, > > > > > > > Nick > > > > > > > > > > > > > > On Tue, Apr 30, 2024 at 4:13 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > > > > > > > > > > > On Mon, 29 Apr 2024 at 18:26, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > > > > > > > On Tue, Apr 30, 2024 at 12:22 AM Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > > > > > > > > > Hi Ulf > > > > > > > > > > > > > > > > > > > > On Mon, Apr 29, 2024 at 10:32 PM Ulf Hansson <ulf.hansson@linaro.org> wrote: > > > > > > > > > > > > > > > > > > > > > > On Mon, 26 Feb 2024 at 07:51, Nick Hu <nick.hu@sifive.com> wrote: > > > > > > > > > > > > > > > > > > > > > > > > When the cpus in the same cluster are all in the idle state, the kernel > > > > > > > > > > > > might put the cluster into a deeper low power state. Call the > > > > > > > > > > > > cluster_pm_enter() before entering the low power state and call the > > > > > > > > > > > > cluster_pm_exit() after the cluster woken up. > > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Nick Hu <nick.hu@sifive.com> > > > > > > > > > > > > > > > > > > > > > > I was not cced this patch, but noticed that this patch got queued up > > > > > > > > > > > recently. Sorry for not noticing earlier. > > > > > > > > > > > > > > > > > > > > > > If not too late, can you please drop/revert it? We should really move > > > > > > > > > > > away from the CPU cluster notifiers. See more information below. > > > > > > > > > > > > > > > > > > > > > > > --- > > > > > > > > > > > > drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++-- > > > > > > > > > > > > 1 file changed, 22 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > > > > > > > > > > > diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > > > > > > index e8094fc92491..298dc76a00cf 100644 > > > > > > > > > > > > --- a/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > > > > > > +++ b/drivers/cpuidle/cpuidle-riscv-sbi.c > > > > > > > > > > > > @@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > > > > > > > > { > > > > > > > > > > > > struct genpd_power_state *state = &pd->states[pd->state_idx]; > > > > > > > > > > > > u32 *pd_state; > > > > > > > > > > > > + int ret; > > > > > > > > > > > > > > > > > > > > > > > > if (!state->data) > > > > > > > > > > > > return 0; > > > > > > > > > > > > @@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd) > > > > > > > > > > > > if (!sbi_cpuidle_pd_allow_domain_state) > > > > > > > > > > > > return -EBUSY; > > > > > > > > > > > > > > > > > > > > > > > > + ret = cpu_cluster_pm_enter(); > > > > > > > > > > > > + if (ret) > > > > > > > > > > > > + return ret; > > > > > > > > > > > > > > > > > > > > > > Rather than using the CPU cluster notifiers, consumers of the genpd > > > > > > > > > > > can register themselves to receive genpd on/off notifiers. > > > > > > > > > > > > > > > > > > > > > > In other words, none of this should be needed, right? > > > > > > > > > > > > > > > > > > > > > Thanks for the feedback! 
> > > > > > > > > > Maybe I miss something, I'm wondering about a case like below: > > > > > > > > > > If we have a shared L2 cache controller inside the cpu cluster power > > > > > > > > > > domain and we add this controller to be a consumer of the power > > > > > > > > > > domain, Shouldn't the genpd invoke the domain idle only after the > > > > > > > > > > shared L2 cache controller is suspended? > > > > > > > > > > Is there a way that we can put the L2 cache down while all cpus in the > > > > > > > > > > same cluster are idle? > > > > > > > > > > > [...] > > > > > > > > > Sorry, I made some mistake in my second question. > > > > > > > > > Update the question here: > > > > > > > > > Is there a way that we can save the L2 cache states while all cpus in the > > > > > > > > > same cluster are idle and the cluster could be powered down? > > > > > > > > > > > > > > > > If the L2 cache is a consumer of the cluster, the consumer driver for > > > > > > > > the L2 cache should register for genpd on/off notifiers. > > > > > > > > > > > > > > > > The device representing the L2 cache needs to be enabled for runtime > > > > > > > > PM, to be taken into account correctly by the cluster genpd. In this > > > > > > > > case, the device should most likely remain runtime suspended, but > > > > > > > > instead rely on the genpd on/off notifiers to understand when > > > > > > > > save/restore of the cache states should be done. > > > > > > > > > > > > > > > > Kind regards > > > > > > > > Uffe > >
On Fri, 17 May 2024 at 06:39, Anup Patel <apatel@ventanamicro.com> wrote:
>
> On Thu, May 16, 2024 at 9:40 AM Nick Hu <nick.hu@sifive.com> wrote:
> >
> > Hi Anup
> >
> > On Wed, May 15, 2024 at 9:46 PM Anup Patel <anup@brainfault.org> wrote:
> > >
> > > Hi Nick,
> > >
> > > On Wed, May 15, 2024 at 5:45 PM Nick Hu <nick.hu@sifive.com> wrote:
> > > >
> > > > Hi Anup,
> > > >
> > > > Thank you for your guidance.
> > > > After enabling the debug message, we found a way to solve the problem
> > > > by the following steps:
> > > > 1. Add a compatible string in the 'power-domains' node, otherwise it won't
> > > > be the supplier of the consumers. (See of_link_to_phandle())
> > >
> > > Hmm, requiring a compatible string is odd. Where should we document
> > > this compatible string ?
> > >
> > Sorry, this is my fault. I didn't include some updates in
> > of_link_to_phandle(). This led to some misunderstandings here.
> > You are right, we don't need it.
> > The supplier will be linked to the CLUSTER_PD node.
> >
> > > > 2. Move the 'power-domains' node outside the 'cpus' node, otherwise it
> > > > won't be added to the device hierarchy by device_add().
> > > > 3. Update the cpuidle-riscv-sbi driver to get the pds_node from
> > > > '/power-domains'.
> > >
> > > By adding a compatible string and moving the "power-domains" node
> > > outside, you are simply forcing the OF framework to populate devices.
> > >
> > > How about manually creating a platform_device for each power-domain
> > > DT node using of_platform_device_create() in sbi_pd_init() ?
> > >
> > Thanks for the suggestion! We have tested the solution and it could work.
> > We were wondering if it's feasible for us to relocate the
> > 'power-domains' node outside of /cpus? The CLUSTER_PD might
> > encompass not only the CPUs but also other components within the
> > cluster.
>
> The cpuidle-riscv-sbi driver expects "power-domains" DT node
> under "/cpus" DT node because this driver only deals with power
> domains related to CPU cluster or CPU cache-hierarchy. It does
> make sense to define L2/L3 power domains under
> "/cpus/power-domain" since these are related to CPUs.
>
> Moving the CPU "power-domains" DT node directly under "/" or
> somewhere else would mean that it covers system-wide power
> domains which is not true.

I understand your point, but I am not convinced that the power-domains
need to belong to the "cpus" node. Ideally, the power-domain describes
the power-rail and the interface to manage the CPUs; this can surely
be described outside the "cpus" node - even if there are only CPUs
that are using it.

Moreover, moving forward, one should not be surprised if it turns out
that a platform has other devices than the CPUs sharing the same
power-rail as the CPU cluster. At least, we have that for arm/psci
[1].

> I suggest we continue using "/cpus/power-domains" DT node
> only for power domains related to CPU clusters or CPU
> cache-hierarchy.
>
> For system wide power domains of SoC devices, we can either:
> 1) Use device power domains through the SBI MPXY extension
>    via a different driver
> 2) Use a platform specific driver
>
> > We also looked at the cpuidle_psci_domain driver and it seems Arm doesn't
> > create the devices for each subnode of the psci domain.
> > Is there any reason that they don't need it?

We don't need it for arm as we have a separate node for PSCI and its
power-domains [2]. Moreover, we have a separate driver that manages
the power-domain (cpuidle-psci-domain).

[...]
[1] arch/arm64/boot/dts/qcom/sc7280.dtsi (search for "CLUSTER_PD")
[2] Documentation/devicetree/bindings/arm/psci.yaml

Kind regards
Uffe
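As a rough illustration of the of_platform_device_create() idea Anup raises in
the exchange above: the genpd provider itself could populate a platform_device
for each power-domain child node, so the nodes become suppliers in the device
hierarchy without relocating them in the DT. The sketch below is only that, a
sketch; the function name and calling context are made up and not code from
the driver, while for_each_child_of_node() and of_platform_device_create() are
the standard OF helpers.

#include <linux/of.h>
#include <linux/of_platform.h>
#include <linux/platform_device.h>

/*
 * Illustrative only: create a platform_device for each child of the
 * "power-domains" node so the domain nodes can act as suppliers in the
 * device hierarchy, without moving the node out of /cpus.
 */
static int sbi_pd_populate_devices(struct device_node *pds_node)
{
	struct device_node *np;
	struct platform_device *pdev;

	for_each_child_of_node(pds_node, np) {
		pdev = of_platform_device_create(np, NULL, NULL);
		if (!pdev) {
			of_node_put(np);
			return -ENODEV;
		}
	}

	return 0;
}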
On Fri, May 24, 2024 at 4:11 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> On Fri, 17 May 2024 at 06:39, Anup Patel <apatel@ventanamicro.com> wrote:
> >
> > On Thu, May 16, 2024 at 9:40 AM Nick Hu <nick.hu@sifive.com> wrote:
> > >
> > > [...]
> > >
> > > Thanks for the suggestion! We have tested the solution and it could work.
> > > We were wondering if it's feasible for us to relocate the
> > > 'power-domains' node outside of /cpus? The CLUSTER_PD might
> > > encompass not only the CPUs but also other components within the
> > > cluster.
> >
> > The cpuidle-riscv-sbi driver expects "power-domains" DT node
> > under "/cpus" DT node because this driver only deals with power
> > domains related to CPU cluster or CPU cache-hierarchy. It does
> > make sense to define L2/L3 power domains under
> > "/cpus/power-domain" since these are related to CPUs.
> >
> > Moving the CPU "power-domains" DT node directly under "/" or
> > somewhere else would mean that it covers system-wide power
> > domains which is not true.
>
> I understand your point, but I am not convinced that the power-domains
> need to belong to the "cpus" node. Ideally, the power-domain describes
> the power-rail and the interface to manage the CPUs; this can surely
> be described outside the "cpus" node - even if there are only CPUs
> that are using it.
>
> Moreover, moving forward, one should not be surprised if it turns out
> that a platform has other devices than the CPUs sharing the same
> power-rail as the CPU cluster. At least, we have that for arm/psci
> [1].

For non-CPU power domains, we are working on a messaging specification
(RPMI) [1]. The supervisor software might have direct access to an RPMI
transport, or it can send RPMI messages via the SBI MPXY extension [2].

If power-rails on a platform are shared between CPUs and devices, then
the platform can:

1) Use SBI HSM for CPUs and use RPMI for devices. The DT bindings for
   device power-domains based on RPMI are still work-in-progress.
   If there are multiple supervisor domains (aka system-level partitions)
   created by the SBI implementation or some partitioning hypervisor, then
   the RPMI messages can be arbitrated by the SBI implementation using the
   SBI MPXY extension. The SBI MPXY extension also allows sharing the same
   RPMI transport between machine-mode (firmware) and supervisor-mode.

2) Use its own platform-specific power-domain driver for both CPUs and
   devices (basically, don't use the SBI HSM and RPMI). In this case,
   there won't be any controlled access (or arbitration) of power rails
   across supervisor domains.

> >
> > I suggest we continue using "/cpus/power-domains" DT node
> > only for power domains related to CPU clusters or CPU
> > cache-hierarchy.
> >
> > For system wide power domains of SoC devices, we can either:
> > 1) Use device power domains through the SBI MPXY extension
> >    via a different driver
> > 2) Use a platform specific driver
> >
> > > We also looked at the cpuidle_psci_domain driver and it seems Arm doesn't
> > > create the devices for each subnode of the psci domain.
> > > Is there any reason that they don't need it?
>
> We don't need it for arm as we have a separate node for PSCI and its
> power-domains [2]. Moreover, we have a separate driver that manages
> the power-domain (cpuidle-psci-domain).

Unlike the ARM world, we don't have any DT node for SBI in the RISC-V
world because the SBI is always there. Due to this, the SBI HSM CPU idle
driver (this driver) currently looks for the CPU "power-domains" node
under the "/cpus" DT node, because the SBI HSM extension only deals with
CPU states.

>
> [...]
>
> [1] arch/arm64/boot/dts/qcom/sc7280.dtsi (search for "CLUSTER_PD")
> [2] Documentation/devicetree/bindings/arm/psci.yaml
>
> Kind regards
> Uffe

[1] https://github.com/riscv-non-isa/riscv-rpmi
[2] https://docs.google.com/document/d/1Ivej3u6uQgVdJHnjrbqgUwE1Juy75d4uYCjWrdNjeAg/edit?usp=sharing

Best regards,
Anup
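For orientation only: the "looks for the CPU power-domains node under /cpus"
behaviour mentioned above amounts to a fixed-path DT lookup along the lines of
the sketch below (illustrative, not quoted from the driver source), which is
why relocating the node would also require a driver change.

#include <linux/of.h>

/*
 * Illustrative sketch: a fixed-path lookup like this is what ties the
 * SBI HSM cpuidle driver to a "power-domains" node living under /cpus.
 */
static struct device_node *sbi_cpuidle_get_pds_node(void)
{
	return of_find_node_by_path("/cpus/power-domains");
}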
diff --git a/drivers/cpuidle/cpuidle-riscv-sbi.c b/drivers/cpuidle/cpuidle-riscv-sbi.c
index e8094fc92491..298dc76a00cf 100644
--- a/drivers/cpuidle/cpuidle-riscv-sbi.c
+++ b/drivers/cpuidle/cpuidle-riscv-sbi.c
@@ -394,6 +394,7 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
 {
 	struct genpd_power_state *state = &pd->states[pd->state_idx];
 	u32 *pd_state;
+	int ret;
 
 	if (!state->data)
 		return 0;
@@ -401,6 +402,10 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
 	if (!sbi_cpuidle_pd_allow_domain_state)
 		return -EBUSY;
 
+	ret = cpu_cluster_pm_enter();
+	if (ret)
+		return ret;
+
 	/* OSI mode is enabled, set the corresponding domain state. */
 	pd_state = state->data;
 	sbi_set_domain_state(*pd_state);
@@ -408,6 +413,19 @@ static int sbi_cpuidle_pd_power_off(struct generic_pm_domain *pd)
 	return 0;
 }
 
+static int sbi_cpuidle_pd_power_on(struct generic_pm_domain *pd)
+{
+	struct genpd_power_state *state = &pd->states[pd->state_idx];
+
+	if (!state->data)
+		return 0;
+
+	if (!sbi_cpuidle_pd_allow_domain_state)
+		return -EBUSY;
+
+	return cpu_cluster_pm_exit();
+}
+
 struct sbi_pd_provider {
 	struct list_head link;
 	struct device_node *node;
@@ -433,10 +451,12 @@ static int sbi_pd_init(struct device_node *np)
 	pd->flags |= GENPD_FLAG_IRQ_SAFE | GENPD_FLAG_CPU_DOMAIN;
 
 	/* Allow power off when OSI is available. */
-	if (sbi_cpuidle_use_osi)
+	if (sbi_cpuidle_use_osi) {
 		pd->power_off = sbi_cpuidle_pd_power_off;
-	else
+		pd->power_on = sbi_cpuidle_pd_power_on;
+	} else {
 		pd->flags |= GENPD_FLAG_ALWAYS_ON;
+	}
 
 	/* Use governor for CPU PM domains if it has some states to manage. */
 	pd_gov = pd->states ? &pm_domain_cpu_gov : NULL;
When the cpus in the same cluster are all in the idle state, the kernel
might put the cluster into a deeper low power state. Call the
cluster_pm_enter() before entering the low power state and call the
cluster_pm_exit() after the cluster woken up.

Signed-off-by: Nick Hu <nick.hu@sifive.com>
---
 drivers/cpuidle/cpuidle-riscv-sbi.c | 24 ++++++++++++++++++++++--
 1 file changed, 22 insertions(+), 2 deletions(-)
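For context on what the added cpu_cluster_pm_enter()/exit() calls do: they
walk the cpu_pm notifier chain with CPU_CLUSTER_PM_ENTER/EXIT events, so any
registered listener is called around the cluster power-off. A minimal,
hypothetical listener is sketched below; this is exactly the mechanism Ulf
proposes replacing with genpd on/off notifiers in the discussion above.

#include <linux/cpu_pm.h>
#include <linux/init.h>
#include <linux/notifier.h>

/* Hypothetical consumer of the CPU cluster PM events raised by the patch. */
static int my_cluster_pm_notifier(struct notifier_block *nb,
				  unsigned long cmd, void *v)
{
	switch (cmd) {
	case CPU_CLUSTER_PM_ENTER:
		/* Cluster is about to power down: save shared state here. */
		break;
	case CPU_CLUSTER_PM_ENTER_FAILED:
	case CPU_CLUSTER_PM_EXIT:
		/* Cluster stayed up or woke up: restore shared state here. */
		break;
	}
	return NOTIFY_OK;
}

static struct notifier_block my_cluster_pm_nb = {
	.notifier_call = my_cluster_pm_notifier,
};

static int __init my_cluster_pm_init(void)
{
	return cpu_pm_register_notifier(&my_cluster_pm_nb);
}
core_initcall(my_cluster_pm_init);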