diff mbox series

[v6,5/6] arm64: dts: qcom: sm8450: Add opp table support to PCIe

Message ID 20240112-opp_support-v6-5-77bbf7d0cc37@quicinc.com
State Superseded
Series PCI: qcom: Add support for OPP

Commit Message

Krishna Chaitanya Chundru Jan. 12, 2024, 2:22 p.m. UTC
PCIe needs to choose the appropriate performance state of the RPMH power
domain and the interconnect bandwidth based on the PCIe gen speed.

Add the OPP table support to specify RPMH performance states and
interconnect peak bandwidth.

Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
---
 arch/arm64/boot/dts/qcom/sm8450.dtsi | 74 ++++++++++++++++++++++++++++++++++++
 1 file changed, 74 insertions(+)

Comments

Manivannan Sadhasivam Jan. 29, 2024, 4:04 p.m. UTC | #1
On Fri, Jan 12, 2024 at 07:52:04PM +0530, Krishna chaitanya chundru wrote:
> PCIe needs to choose the appropriate performance state of RPMH power
> domain and interconnect bandwidth based up on the PCIe gen speed.
> 
> Add the OPP table support to specify RPMH performance states and
> interconnect peak bandwidth.
> 
> Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
> ---
>  arch/arm64/boot/dts/qcom/sm8450.dtsi | 74 ++++++++++++++++++++++++++++++++++++
>  1 file changed, 74 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> index 6b1d2e0d9d14..eab85ecaeff0 100644
> --- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
> +++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> @@ -1827,7 +1827,32 @@ pcie0: pcie@1c00000 {
>  			pinctrl-names = "default";
>  			pinctrl-0 = <&pcie0_default_state>;
>  
> +			operating-points-v2 = <&pcie0_opp_table>;
> +
>  			status = "disabled";
> +
> +			pcie0_opp_table: opp-table {
> +				compatible = "operating-points-v2";
> +
> +				opp-2500000 {
> +					opp-hz = /bits/ 64 <2500000>;
> +					required-opps = <&rpmhpd_opp_low_svs>;
> +					opp-peak-kBps = <250000 250000>;

This is a question for Viresh: We already have macros in the driver to derive
the bandwidth based on link speed. So if OPP core exposes a callback to allow
the consumers to set the bw on its own, we can get rid of this entry.

Similar to config_clks()/config_regulators(). Is that feasible?

- Mani
Viresh Kumar Jan. 30, 2024, 6:11 a.m. UTC | #2
On 29-01-24, 21:34, Manivannan Sadhasivam wrote:
> On Fri, Jan 12, 2024 at 07:52:04PM +0530, Krishna chaitanya chundru wrote:
> > PCIe needs to choose the appropriate performance state of RPMH power
> > domain and interconnect bandwidth based up on the PCIe gen speed.
> > 
> > Add the OPP table support to specify RPMH performance states and
> > interconnect peak bandwidth.
> > 
> > Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
> > ---
> >  arch/arm64/boot/dts/qcom/sm8450.dtsi | 74 ++++++++++++++++++++++++++++++++++++
> >  1 file changed, 74 insertions(+)
> > 
> > diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > index 6b1d2e0d9d14..eab85ecaeff0 100644
> > --- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > +++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > @@ -1827,7 +1827,32 @@ pcie0: pcie@1c00000 {
> >  			pinctrl-names = "default";
> >  			pinctrl-0 = <&pcie0_default_state>;
> >  
> > +			operating-points-v2 = <&pcie0_opp_table>;
> > +
> >  			status = "disabled";
> > +
> > +			pcie0_opp_table: opp-table {
> > +				compatible = "operating-points-v2";
> > +
> > +				opp-2500000 {
> > +					opp-hz = /bits/ 64 <2500000>;
> > +					required-opps = <&rpmhpd_opp_low_svs>;
> > +					opp-peak-kBps = <250000 250000>;
> 
> This is a question for Viresh: We already have macros in the driver to derive
> the bandwidth based on link speed. So if OPP core exposes a callback to allow
> the consumers to set the bw on its own, we can get rid of this entry.
> 
> Similar to config_clks()/config_regulators(). Is that feasible?

I don't have any issues with a new callback for bw. But, AFAIU, the DT
is required to represent the hardware irrespective of what any OS
would do with it. So DT should ideally have these values here, right?

Also, the driver has already moved away from using those macros now
and depends on the OPP core to do the right thing. It only uses the
macro for the cases where the DT OPP table isn't available. And as
said by a few others as well already, the driver really should try to
add OPPs dynamically in that case to avoid multiple code paths and
stick to a single OPP based solution.
Manivannan Sadhasivam Jan. 30, 2024, 7:14 a.m. UTC | #3
On Tue, Jan 30, 2024 at 11:41:11AM +0530, Viresh Kumar wrote:
> On 29-01-24, 21:34, Manivannan Sadhasivam wrote:
> > On Fri, Jan 12, 2024 at 07:52:04PM +0530, Krishna chaitanya chundru wrote:
> > > PCIe needs to choose the appropriate performance state of RPMH power
> > > domain and interconnect bandwidth based up on the PCIe gen speed.
> > > 
> > > Add the OPP table support to specify RPMH performance states and
> > > interconnect peak bandwidth.
> > > 
> > > Signed-off-by: Krishna chaitanya chundru <quic_krichai@quicinc.com>
> > > ---
> > >  arch/arm64/boot/dts/qcom/sm8450.dtsi | 74 ++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 74 insertions(+)
> > > 
> > > diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > > index 6b1d2e0d9d14..eab85ecaeff0 100644
> > > --- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > > +++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
> > > @@ -1827,7 +1827,32 @@ pcie0: pcie@1c00000 {
> > >  			pinctrl-names = "default";
> > >  			pinctrl-0 = <&pcie0_default_state>;
> > >  
> > > +			operating-points-v2 = <&pcie0_opp_table>;
> > > +
> > >  			status = "disabled";
> > > +
> > > +			pcie0_opp_table: opp-table {
> > > +				compatible = "operating-points-v2";
> > > +
> > > +				opp-2500000 {
> > > +					opp-hz = /bits/ 64 <2500000>;
> > > +					required-opps = <&rpmhpd_opp_low_svs>;
> > > +					opp-peak-kBps = <250000 250000>;
> > 
> > This is a question for Viresh: We already have macros in the driver to derive
> > the bandwidth based on link speed. So if OPP core exposes a callback to allow
> > the consumers to set the bw on its own, we can get rid of this entry.
> > 
> > Similar to config_clks()/config_regulators(). Is that feasible?
> 
> I don't have any issues with a new callback for bw. But, AFAIU, the DT
> is required to represent the hardware irrespective of what any OS
> would do with it. So DT should ideally have these values here, right ?
> 

Not necessarily, because right now the bandwidth values of all the peripherals
are encoded within the drivers. Only OPP has the requirement to define the
values in DT.

> Also, the driver has already moved away from using those macros now
> and depend on the OPP core to do the right thing. It only uses the
> macro for the cases where the DT OPP table isn't available. And as
> said by few others as well already, the driver really should try to
> add OPPs dynamically in that case to avoid multiple code paths and
> stick to a single OPP based solution.
> 

Still, I prefer to use OPP for bandwidth control because both the voltage and
bandwidth values need to be updated at the same time. My only point here is, if
OPP exposes a callback for bw, then we can keep the DT behavior consistent.

- Mani

> -- 
> viresh
Viresh Kumar Jan. 30, 2024, 9:55 a.m. UTC | #4
On 30-01-24, 15:18, Manivannan Sadhasivam wrote:
> So you are saying that the ICC core itself should get the bw values from DT
> instead of hardcoding in the driver? If so, I'd like to get the opinion from
> Georgi/Bjorn.

Not really. Neither the drivers nor the ICC core needs to do anything, I
guess. Since the values are coming via the OPP, we must just use it to
hide all these details.

Why is the ICC core required to get into this here? ICC core should
be ready to get the information from DT (may or may not via the OPP
core), or from driver.
Viresh Kumar Jan. 31, 2024, 10 a.m. UTC | #5
On 31-01-24, 14:16, Manivannan Sadhasivam wrote:
> Most of the hits are from CPU nodes... For some reasons, peripheral drivers are
> sticking to hardcoded values.

I guess the reason for this is that the OPP core wasn't used for non-CPU devices
until recently. And we are in a transition phase where a few of the drivers will
migrate to using it and so will have DT-based bw values.
Konrad Dybcio Feb. 9, 2024, 9:14 p.m. UTC | #6
On 2.02.2024 08:33, Viresh Kumar wrote:
> On 01-02-24, 15:45, Konrad Dybcio wrote:
>> I'm lukewarm on this.
>>
>> A *lot* of hardware has more complex requirements than "x MBps at y MHz",
>> especially when performance counters come into the picture for dynamic
>> bw management.
>>
>> OPP tables can't really handle this properly.
> 
> There was a similar concern for voltages earlier on and we added the capability
> of adjusting the voltage for OPPs in the OPP core. Maybe something similar can
> be done here ?
> 
I really don't think it's fitting.. At any moment the device may require any
bandwidth value between 0 and MAX_BW_PER_LINK_GEN * LINK_WIDTH..

Konrad
Krishna Chaitanya Chundru Feb. 19, 2024, 7:02 a.m. UTC | #7
On 2/10/2024 2:44 AM, Konrad Dybcio wrote:
> On 2.02.2024 08:33, Viresh Kumar wrote:
>> On 01-02-24, 15:45, Konrad Dybcio wrote:
>>> I'm lukewarm on this.
>>>
>>> A *lot* of hardware has more complex requirements than "x MBps at y MHz",
>>> especially when performance counters come into the picture for dynamic
>>> bw management.
>>>
>>> OPP tables can't really handle this properly.
>>
>> There was a similar concern for voltages earlier on and we added the capability
>> of adjusting the voltage for OPPs in the OPP core. Maybe something similar can
>> be done here ?
>>
> I really don't think it's fitting.. At any moment the device may require any
> bandwidth value between 0 and MAX_BW_PER_LINK_GEN * LINK_WIDTH..
> 
> Konrad
Viresh & Konrad, can you both come to a conclusion on this?

- Krishna Chaitanya.
Viresh Kumar Feb. 19, 2024, 10:28 a.m. UTC | #8
On 09-02-24, 22:14, Konrad Dybcio wrote:
> On 2.02.2024 08:33, Viresh Kumar wrote:
> > On 01-02-24, 15:45, Konrad Dybcio wrote:
> >> I'm lukewarm on this.
> >>
> >> A *lot* of hardware has more complex requirements than "x MBps at y MHz",
> >> especially when performance counters come into the picture for dynamic
> >> bw management.
> >>
> >> OPP tables can't really handle this properly.
> > 
> > There was a similar concern for voltages earlier on and we added the capability
> > of adjusting the voltage for OPPs in the OPP core. Maybe something similar can
> > be done here ?
> > 
> I really don't think it's fitting.. At any moment the device may require any
> bandwidth value between 0 and MAX_BW_PER_LINK_GEN * LINK_WIDTH..

Okay, I leave it up to you guys to decide on how you want to do it. I still
believe getting the information via DT is the right thing, but maybe I still
don't understand the problem fully.

Thanks.
Manivannan Sadhasivam Feb. 19, 2024, 12:38 p.m. UTC | #9
On Mon, Feb 19, 2024 at 03:58:34PM +0530, Viresh Kumar wrote:
> On 09-02-24, 22:14, Konrad Dybcio wrote:
> > On 2.02.2024 08:33, Viresh Kumar wrote:
> > > On 01-02-24, 15:45, Konrad Dybcio wrote:
> > >> I'm lukewarm on this.
> > >>
> > >> A *lot* of hardware has more complex requirements than "x MBps at y MHz",
> > >> especially when performance counters come into the picture for dynamic
> > >> bw management.
> > >>
> > >> OPP tables can't really handle this properly.
> > > 
> > > There was a similar concern for voltages earlier on and we added the capability
> > > of adjusting the voltage for OPPs in the OPP core. Maybe something similar can
> > > be done here ?
> > > 
> > I really don't think it's fitting.. At any moment the device may require any
> > bandwidth value between 0 and MAX_BW_PER_LINK_GEN * LINK_WIDTH..
> 
> Okay, I leave it up to you guys to decide on how you want to do it. I still
> believe getting the information via DT is the right thing, but maybe I still
> don't understand the problem fully.
> 

I argued for a different issue, but what Konrad pointed out is not a valid
concern to me. The driver may only require _fixed_ bandwidth between 0 and
(MAX_BW_PER_LINK_GEN * LINK_WIDTH) and DT can pass those bandwidth values.

Chaitanya pointed out that this may end up with long entries in DT once the PCIe
Gen versions start to increase (current Qcom platforms support up to Gen 4 only).
But that shouldn't be a real concern if we look at what DT has to provide.

- Mani

Patch

diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
index 6b1d2e0d9d14..eab85ecaeff0 100644
--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
@@ -1827,7 +1827,32 @@  pcie0: pcie@1c00000 {
 			pinctrl-names = "default";
 			pinctrl-0 = <&pcie0_default_state>;
 
+			operating-points-v2 = <&pcie0_opp_table>;
+
 			status = "disabled";
+
+			pcie0_opp_table: opp-table {
+				compatible = "operating-points-v2";
+
+				opp-2500000 {
+					opp-hz = /bits/ 64 <2500000>;
+					required-opps = <&rpmhpd_opp_low_svs>;
+					opp-peak-kBps = <250000 250000>;
+				};
+
+				opp-5000000 {
+					opp-hz = /bits/ 64 <5000000>;
+					required-opps = <&rpmhpd_opp_low_svs>;
+					opp-peak-kBps = <500000 250000>;
+				};
+
+				opp-8000000 {
+					opp-hz = /bits/ 64 <8000000>;
+					required-opps = <&rpmhpd_opp_nom>;
+					opp-peak-kBps = <984500 250000>;
+				};
+			};
+
 		};
 
 		pcie0_phy: phy@1c06000 {
@@ -1938,7 +1963,56 @@  pcie1: pcie@1c08000 {
 			pinctrl-names = "default";
 			pinctrl-0 = <&pcie1_default_state>;
 
+			operating-points-v2 = <&pcie1_opp_table>;
+
 			status = "disabled";
+
+			pcie1_opp_table: opp-table {
+				compatible = "operating-points-v2";
+
+				/* GEN 1x1 */
+				opp-2500000 {
+					opp-hz = /bits/ 64 <2500000>;
+					required-opps = <&rpmhpd_opp_low_svs>;
+					opp-peak-kBps = <250000 250000>;
+				};
+
+				/* GEN 1x2 GEN 2x1 */
+				opp-5000000 {
+					opp-hz = /bits/ 64 <5000000>;
+					required-opps = <&rpmhpd_opp_low_svs>;
+					opp-peak-kBps = <500000 250000>;
+				};
+
+				/* GEN 2x2 */
+				opp-10000000 {
+					opp-hz = /bits/ 64 <10000000>;
+					required-opps = <&rpmhpd_opp_low_svs>;
+					opp-peak-kBps = <1000000 250000>;
+				};
+
+				/* GEN 3x1 */
+				opp-8000000 {
+					opp-hz = /bits/ 64 <8000000>;
+					required-opps = <&rpmhpd_opp_nom>;
+					opp-peak-kBps = <984500 250000>;
+				};
+
+				/* GEN 3x2 GEN 4x1 */
+				opp-16000000 {
+					opp-hz = /bits/ 64 <16000000>;
+					required-opps = <&rpmhpd_opp_nom>;
+					opp-peak-kBps = <1969000 250000>;
+				};
+
+				/* GEN 4x2 */
+				opp-32000000 {
+					opp-hz = /bits/ 64 <32000000>;
+					required-opps = <&rpmhpd_opp_nom>;
+					opp-peak-kBps = <3938000 250000>;
+				};
+			};
+
 		};
 
 		pcie1_phy: phy@1c0e000 {
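
The first opp-peak-kBps cell in each pcie1 entry above appears to follow the "bandwidth per link gen times link width" relation mentioned in the discussion. The sketch below is only a cross-check of that arithmetic, not driver code: the per-lane figures are read off the x1 entries of the patch, and the names PER_LANE_KBPS, peak_kbps and PCIE1_TABLE are made up for illustration.

```python
# Per-lane peak throughput in kBps for each PCIe generation, read off the
# single-lane (x1) entries of the pcie1 OPP table in the patch.
PER_LANE_KBPS = {1: 250_000, 2: 500_000, 3: 984_500, 4: 1_969_000}


def peak_kbps(gen: int, width: int) -> int:
    """Hypothetical helper: first opp-peak-kBps cell for a gen/width pair."""
    return PER_LANE_KBPS[gen] * width


# (gen, width) -> first opp-peak-kBps cell, copied from the pcie1 table.
PCIE1_TABLE = {
    (1, 1): 250_000,
    (1, 2): 500_000,
    (2, 1): 500_000,
    (2, 2): 1_000_000,
    (3, 1): 984_500,
    (3, 2): 1_969_000,
    (4, 1): 1_969_000,
    (4, 2): 3_938_000,
}

# Collect every table entry that does not equal per-lane rate times width.
mismatches = [key for key, value in PCIE1_TABLE.items() if peak_kbps(*key) != value]

for (gen, width), expected in PCIE1_TABLE.items():
    print(f"Gen {gen} x{width}: {peak_kbps(gen, width)} kBps (table: {expected})")
```

Under these assumptions, gen/width pairs that share an OPP entry (Gen 1 x2 with Gen 2 x1, Gen 3 x2 with Gen 4 x1) also share the same computed value. The second cell is 250000 in every entry; the thread does not say which interconnect path it belongs to, so it is left out of the sketch.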