Message ID | 20241113-topic-sm8x50-gpu-bw-vote-v1-0-3b8d39737a9b@linaro.org
---|---
Series | drm/msm: adreno: add support for DDR bandwidth scaling via GMU
On 13-11-24, 16:48, Neil Armstrong wrote:
> Add and implement the dev_pm_opp_get_bandwidth() to retrieve
> the OPP's bandwidth in the same was as the dev_pm_opp_get_voltage()

way

> helper.
>
> Retrieving bandwidth is required in the case of the Adreno GPU
> where the GPU Management Unit can handle the Bandwidth scaling.
>
> The helper can get the peak or everage bandwidth for any of

average

> the interconnect path.
>
> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> ---
>  drivers/opp/core.c     | 25 +++++++++++++++++++++++++
>  include/linux/pm_opp.h |  7 +++++++
>  2 files changed, 32 insertions(+)
>
> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
> index 494f8860220d97fc690ebab5ed3b7f5f04f22d73..19fb82033de26b74e9604c33b9781689df2fe80a 100644
> --- a/drivers/opp/core.c
> +++ b/drivers/opp/core.c
> @@ -106,6 +106,31 @@ static bool assert_single_clk(struct opp_table *opp_table)
>  	return !WARN_ON(opp_table->clk_count > 1);
>  }
>  
> +/**
> + * dev_pm_opp_get_bandwidth() - Gets the peak bandwidth corresponding to an opp

s/peak bandwidth/bandwidth/

> + * @opp:	opp for which voltage has to be returned for
> + * @peak:	select peak or average bandwidth
> + * @index:	bandwidth index
> + *
> + * Return: peak bandwidth in kBps, else return 0

s/peak bandwidth/bandwidth/

> + */
> +unsigned long dev_pm_opp_get_bandwidth(struct dev_pm_opp *opp, bool peak, int index)
> +{
> +	if (IS_ERR_OR_NULL(opp)) {
> +		pr_err("%s: Invalid parameters\n", __func__);
> +		return 0;
> +	}
> +
> +	if (index > opp->opp_table->path_count)
> +		return 0;
> +
> +	if (!opp->bandwidth)
> +		return 0;
> +
> +	return peak ? opp->bandwidth[index].peak : opp->bandwidth[index].avg;
> +}
> +EXPORT_SYMBOL_GPL(dev_pm_opp_get_bandwidth);

All other bandwidth APIs are named as _bw, maybe do same here too ?
Hi,

On 14/11/2024 05:10, Viresh Kumar wrote:
> On 13-11-24, 16:48, Neil Armstrong wrote:
>> Add and implement the dev_pm_opp_get_bandwidth() to retrieve
>> the OPP's bandwidth in the same was as the dev_pm_opp_get_voltage()
>
> way
>
>> helper.
>>
>> Retrieving bandwidth is required in the case of the Adreno GPU
>> where the GPU Management Unit can handle the Bandwidth scaling.
>>
>> The helper can get the peak or everage bandwidth for any of
>
> average

Aww, good catch, thanks

>
>> the interconnect path.
>>
>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>> ---
>>  drivers/opp/core.c     | 25 +++++++++++++++++++++++++
>>  include/linux/pm_opp.h |  7 +++++++
>>  2 files changed, 32 insertions(+)
>>
>> diff --git a/drivers/opp/core.c b/drivers/opp/core.c
>> index 494f8860220d97fc690ebab5ed3b7f5f04f22d73..19fb82033de26b74e9604c33b9781689df2fe80a 100644
>> --- a/drivers/opp/core.c
>> +++ b/drivers/opp/core.c
>> @@ -106,6 +106,31 @@ static bool assert_single_clk(struct opp_table *opp_table)
>>  	return !WARN_ON(opp_table->clk_count > 1);
>>  }
>>  
>> +/**
>> + * dev_pm_opp_get_bandwidth() - Gets the peak bandwidth corresponding to an opp
>
> s/peak bandwidth/bandwidth/

Ack

>
>> + * @opp:	opp for which voltage has to be returned for
>> + * @peak:	select peak or average bandwidth
>> + * @index:	bandwidth index
>> + *
>> + * Return: peak bandwidth in kBps, else return 0
>
> s/peak bandwidth/bandwidth/

Ack

>
>> + */
>> +unsigned long dev_pm_opp_get_bandwidth(struct dev_pm_opp *opp, bool peak, int index)
>> +{
>> +	if (IS_ERR_OR_NULL(opp)) {
>> +		pr_err("%s: Invalid parameters\n", __func__);
>> +		return 0;
>> +	}
>> +
>> +	if (index > opp->opp_table->path_count)
>> +		return 0;
>> +
>> +	if (!opp->bandwidth)
>> +		return 0;
>> +
>> +	return peak ? opp->bandwidth[index].peak : opp->bandwidth[index].avg;
>> +}
>> +EXPORT_SYMBOL_GPL(dev_pm_opp_get_bandwidth);
>
> All other bandwidth APIs are named as _bw, maybe do same here too ?

Sure, I wasn't sure about that, will switch to _bw.

Neil
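[Editor's note] The helper under review boils down to a bounds-checked array lookup. Below is a minimal userspace sketch of that logic, with hypothetical stub types standing in for the kernel structs. Note that the sketch uses a `>=` comparison, since valid path indices run from 0 to `path_count - 1`, whereas the patch as posted checks `index > path_count`, which would allow reading one entry past the bandwidth array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's opaque structs, only to
 * illustrate the lookup logic of the proposed helper. */
struct bw_entry { unsigned long avg, peak; };

struct opp_table_stub { int path_count; };

struct dev_pm_opp_stub {
	struct opp_table_stub *opp_table;
	struct bw_entry *bandwidth;	/* one entry per interconnect path */
};

/* Mirrors dev_pm_opp_get_bandwidth(): return the peak or average
 * bandwidth (kBps) of one interconnect path, or 0 on any error. */
static unsigned long get_bandwidth(struct dev_pm_opp_stub *opp, int peak, int index)
{
	if (!opp || !opp->bandwidth)
		return 0;

	/* paths are indexed 0..path_count-1, hence >= and not > */
	if (index < 0 || index >= opp->opp_table->path_count)
		return 0;

	return peak ? opp->bandwidth[index].peak : opp->bandwidth[index].avg;
}
```

A caller would typically pass the index of the interconnect path it cares about (for the GPU DDR path, index 0 in the simplest case) and pick peak or average depending on the vote it is building.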
On 15/11/2024 08:07, Dmitry Baryshkov wrote:
> On Wed, Nov 13, 2024 at 04:48:28PM +0100, Neil Armstrong wrote:
>> The Adreno GMU Management Unit (GNU) can also scale the DDR Bandwidth
>> along the Frequency and Power Domain level, but by default we leave the
>> OPP core vote for the interconnect ddr path.
>>
>> While scaling via the interconnect path was sufficient, newer GPUs
>> like the A750 requires specific vote paremeters and bandwidth to
>> achieve full functionality.
>>
>> Add a new Quirk enabling DDR Bandwidth vote via GMU.
>
> Please describe, why this is defined as a quirk rather than a proper
> platform-level property. From my experience with 6xx and 7xx, all the
> platforms need to send some kind of BW data to the GMU.

Well APRIV, CACHED_COHERENT & PREEMPTION are HW features, why this can't be part of this ?

Perhaps the "quirks" bitfield should be features instead ?

>
>>
>> Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
>> ---
>>  drivers/gpu/drm/msm/adreno/adreno_gpu.h | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/drivers/gpu/drm/msm/adreno/adreno_gpu.h b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
>> index e71f420f8b3a8e6cfc52dd1c4d5a63ef3704a07f..20b6b7f49473d42751cd4fb4fc82849be42cb807 100644
>> --- a/drivers/gpu/drm/msm/adreno/adreno_gpu.h
>> +++ b/drivers/gpu/drm/msm/adreno/adreno_gpu.h
>> @@ -57,6 +57,7 @@ enum adreno_family {
>>  #define ADRENO_QUIRK_HAS_HW_APRIV		BIT(3)
>>  #define ADRENO_QUIRK_HAS_CACHED_COHERENT	BIT(4)
>>  #define ADRENO_QUIRK_PREEMPTION			BIT(5)
>> +#define ADRENO_QUIRK_GMU_BW_VOTE		BIT(6)
>>  
>>  /* Helper for formating the chip_id in the way that userspace tools like
>>   * crashdec expect.
>>
>> -- 
>> 2.34.1
>>
On Fri, 15 Nov 2024 at 11:21, Neil Armstrong <neil.armstrong@linaro.org> wrote:
>
> On 15/11/2024 08:07, Dmitry Baryshkov wrote:
> > On Wed, Nov 13, 2024 at 04:48:28PM +0100, Neil Armstrong wrote:
> >> The Adreno GMU Management Unit (GNU) can also scale the DDR Bandwidth
> >> along the Frequency and Power Domain level, but by default we leave the
> >> OPP core vote for the interconnect ddr path.
> >>
> >> While scaling via the interconnect path was sufficient, newer GPUs
> >> like the A750 requires specific vote paremeters and bandwidth to
> >> achieve full functionality.
> >>
> >> Add a new Quirk enabling DDR Bandwidth vote via GMU.
> >
> > Please describe, why this is defined as a quirk rather than a proper
> > platform-level property. From my experience with 6xx and 7xx, all the
> > platforms need to send some kind of BW data to the GMU.
>
> Well APRIV, CACHED_COHERENT & PREEMPTION are HW features, why this can't be part of this ?
>
> Perhaps the "quirks" bitfield should be features instead ?

Sounds like that.
On Fri, Nov 15, 2024 at 10:11:09AM +0100, Neil Armstrong wrote:
> On 15/11/2024 08:24, Dmitry Baryshkov wrote:
> > On Wed, Nov 13, 2024 at 04:48:30PM +0100, Neil Armstrong wrote:
> > > The Adreno GPU Management Unit (GMU) can also scale the ddr
> > > bandwidth along the frequency and power domain level, but for
> > > now we statically fill the bw_table with values from the
> > > downstream driver.
> > >
> > > Only the first entry is used, which is a disable vote, so we
> > > currently rely on scaling via the linux interconnect paths.
> > >
> > > Let's dynamically generate the bw_table with the vote values
> > > previously calculated from the OPPs.
> >
> > Nice to see this being worked upon. I hope the code can is generic
> > enough so that we can use it from other adreno_foo_build_bw_table()
> > functions.
>
> I would hope so, but I don't have the HW to properly test it on those
> platforms.

Welcome to the club^W Lab.

> > > Those entried will then be used by the GMU when passing the
> > > appropriate bandwidth level when voting for a gpu frequency.
> > >
> > > Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
> > > ---
> > >  drivers/gpu/drm/msm/adreno/a6xx_hfi.c | 48 +++++++++++++++++++++++++++--------
> > >  1 file changed, 37 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> > > index cb8844ed46b29c4569d05eb7a24f7b27e173190f..9a89ba95843e7805d78f0e5ddbe328677b6431dd 100644
> > > --- a/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> > > +++ b/drivers/gpu/drm/msm/adreno/a6xx_hfi.c
> > > @@ -596,22 +596,48 @@ static void a730_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
> > >  	msg->cnoc_cmds_data[1][0] = 0x60000001;
> > >  }
> > >  
> > > -static void a740_build_bw_table(struct a6xx_hfi_msg_bw_table *msg)
> > > +static void a740_generate_bw_table(struct adreno_gpu *adreno_gpu, struct a6xx_gmu *gmu,
> > > +				   struct a6xx_hfi_msg_bw_table *msg)
> > >  {
> > > -	msg->bw_level_num = 1;
> > > +	const struct a6xx_info *info = adreno_gpu->info->a6xx;
> > > +	unsigned int i, j;
> > >  
> > > -	msg->ddr_cmds_num = 3;
> > >  	msg->ddr_wait_bitmask = 0x7;
> > >  
> > > -	msg->ddr_cmds_addrs[0] = cmd_db_read_addr("SH0");
> > > -	msg->ddr_cmds_addrs[1] = cmd_db_read_addr("MC0");
> > > -	msg->ddr_cmds_addrs[2] = cmd_db_read_addr("ACV");
> > > +	for (i = 0; i < 3; i++) {
> > > +		if (!info->bcm[i].name)
> > > +			break;
> > > +		msg->ddr_cmds_addrs[i] = cmd_db_read_addr(info->bcm[i].name);
> > > +	}
> > > +	msg->ddr_cmds_num = i;
> > >  
> > > -	msg->ddr_cmds_data[0][0] = 0x40000000;
> > > -	msg->ddr_cmds_data[0][1] = 0x40000000;
> > > -	msg->ddr_cmds_data[0][2] = 0x40000000;
> > > +	for (i = 0; i < gmu->nr_gpu_bws; ++i)
> > > +		for (j = 0; j < msg->ddr_cmds_num; j++)
> > > +			msg->ddr_cmds_data[i][j] = gmu->gpu_bw_votes[i][j];
> > > +	msg->bw_level_num = gmu->nr_gpu_bws;
> > > +}
> > > +
> > > +static void a740_build_bw_table(struct adreno_gpu *adreno_gpu, struct a6xx_gmu *gmu,
> > > +				struct a6xx_hfi_msg_bw_table *msg)
> > > +{
> > > +	if ((adreno_gpu->info->quirks & ADRENO_QUIRK_GMU_BW_VOTE) && gmu->nr_gpu_bws) {
> > > +		a740_generate_bw_table(adreno_gpu, gmu, msg);
> > > +	} else {
> >
> > Why do we need a fallback code here?
>
> Because at this particular commit, it would generate an invalid table, I
> should probably remove the fallback at the end

Or move this to a generic code that generates a table if there is no bw
data (like there is none for older platforms with the current DTs).

>
> >
> > > +		msg->bw_level_num = 1;
> > >  
> > > -	/* TODO: add a proper dvfs table */
> > > +		msg->ddr_cmds_num = 3;
> > > +		msg->ddr_wait_bitmask = 0x7;
> > > +
> > > +		msg->ddr_cmds_addrs[0] = cmd_db_read_addr("SH0");
> > > +		msg->ddr_cmds_addrs[1] = cmd_db_read_addr("MC0");
> > > +		msg->ddr_cmds_addrs[2] = cmd_db_read_addr("ACV");
> > > +
> > > +		msg->ddr_cmds_data[0][0] = 0x40000000;
> > > +		msg->ddr_cmds_data[0][1] = 0x40000000;
> > > +		msg->ddr_cmds_data[0][2] = 0x40000000;
> > > +
> > > +		/* TODO: add a proper dvfs table */
> >
> > I think TODO is unapplicable anymore.
> >
> > > +	}
> > >  
> > >  	msg->cnoc_cmds_num = 1;
> > >  	msg->cnoc_wait_bitmask = 0x1;
> > > @@ -691,7 +717,7 @@ static int a6xx_hfi_send_bw_table(struct a6xx_gmu *gmu)
> > >  	else if (adreno_is_a730(adreno_gpu))
> > >  		a730_build_bw_table(msg);
> > >  	else if (adreno_is_a740_family(adreno_gpu))
> > > -		a740_build_bw_table(msg);
> > > +		a740_build_bw_table(adreno_gpu, gmu, msg);
> > >  	else
> > >  		a6xx_build_bw_table(msg);
> > >
> > > -- 
> > > 2.34.1
> > >
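[Editor's note] The generation path discussed above is essentially a pair of copy loops: count the named BCMs, then copy one row of pre-computed votes per bandwidth level. A standalone userspace sketch of the same logic, with hypothetical stub types in place of the kernel structs:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_BCMS	3
#define MAX_LEVELS	16

/* Stub types mirroring the shape of the structs in the diff above
 * (names hypothetical, for illustration only). */
struct bcm_stub { const char *name; };

struct msg_stub {
	unsigned int bw_level_num;
	unsigned int ddr_cmds_num;
	uint32_t ddr_cmds_data[MAX_LEVELS][MAX_BCMS];
};

/* Count the BCMs that have a name, then copy one row of votes per
 * bandwidth level, exactly like a740_generate_bw_table() does. */
static void generate_bw_table(const struct bcm_stub bcm[MAX_BCMS],
			      unsigned int nr_bws,
			      uint32_t votes[][MAX_BCMS],
			      struct msg_stub *msg)
{
	unsigned int i, j;

	for (i = 0; i < MAX_BCMS; i++)
		if (!bcm[i].name)
			break;
	msg->ddr_cmds_num = i;

	for (i = 0; i < nr_bws; i++)
		for (j = 0; j < msg->ddr_cmds_num; j++)
			msg->ddr_cmds_data[i][j] = votes[i][j];
	msg->bw_level_num = nr_bws;
}
```

One consequence of deriving `ddr_cmds_num` from the first unnamed BCM is that a platform describing only two BCMs simply gets a two-column table, which is what makes the function reusable across GPU generations.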
On Fri, Nov 15, 2024 at 6:18 AM Dmitry Baryshkov
<dmitry.baryshkov@linaro.org> wrote:
>
> On Fri, 15 Nov 2024 at 11:21, Neil Armstrong <neil.armstrong@linaro.org> wrote:
> >
> > On 15/11/2024 08:07, Dmitry Baryshkov wrote:
> > > On Wed, Nov 13, 2024 at 04:48:28PM +0100, Neil Armstrong wrote:
> > >> The Adreno GMU Management Unit (GNU) can also scale the DDR Bandwidth
> > >> along the Frequency and Power Domain level, but by default we leave the
> > >> OPP core vote for the interconnect ddr path.
> > >>
> > >> While scaling via the interconnect path was sufficient, newer GPUs
> > >> like the A750 requires specific vote paremeters and bandwidth to
> > >> achieve full functionality.
> > >>
> > >> Add a new Quirk enabling DDR Bandwidth vote via GMU.
> > >
> > > Please describe, why this is defined as a quirk rather than a proper
> > > platform-level property. From my experience with 6xx and 7xx, all the
> > > platforms need to send some kind of BW data to the GMU.
> >
> > Well APRIV, CACHED_COHERENT & PREEMPTION are HW features, why this can't be part of this ?
> >
> > Perhaps the "quirks" bitfield should be features instead ?
>
> Sounds like that.

But LMLOADKILL_DISABLE and TWO_PASS_USE_WFI are quirks.. so it is kind
of a mix of quirks and features. So meh

BR,
-R

>
> --
> With best wishes
> Dmitry
On 15/11/2024 16:10, Rob Clark wrote:
> On Fri, Nov 15, 2024 at 6:18 AM Dmitry Baryshkov
> <dmitry.baryshkov@linaro.org> wrote:
>>
>> On Fri, 15 Nov 2024 at 11:21, Neil Armstrong <neil.armstrong@linaro.org> wrote:
>>>
>>> On 15/11/2024 08:07, Dmitry Baryshkov wrote:
>>>> On Wed, Nov 13, 2024 at 04:48:28PM +0100, Neil Armstrong wrote:
>>>>> The Adreno GMU Management Unit (GNU) can also scale the DDR Bandwidth
>>>>> along the Frequency and Power Domain level, but by default we leave the
>>>>> OPP core vote for the interconnect ddr path.
>>>>>
>>>>> While scaling via the interconnect path was sufficient, newer GPUs
>>>>> like the A750 requires specific vote paremeters and bandwidth to
>>>>> achieve full functionality.
>>>>>
>>>>> Add a new Quirk enabling DDR Bandwidth vote via GMU.
>>>>
>>>> Please describe, why this is defined as a quirk rather than a proper
>>>> platform-level property. From my experience with 6xx and 7xx, all the
>>>> platforms need to send some kind of BW data to the GMU.
>>>
>>> Well APRIV, CACHED_COHERENT & PREEMPTION are HW features, why this can't be part of this ?
>>>
>>> Perhaps the "quirks" bitfield should be features instead ?
>>
>> Sounds like that.
>
> But LMLOADKILL_DISABLE and TWO_PASS_USE_WFI are quirks.. so it is kind
> of a mix of quirks and features. So meh

Well I can do a split and move the features into a clean .features bitfield, would it be ok ?

Neil

>
> BR,
> -R
>
>>
>> --
>> With best wishes
>> Dmitry
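[Editor's note] The split Neil proposes could look roughly like the sketch below: true workarounds stay in `quirks`, hardware capabilities move to a separate `features` bitfield. All names and bit positions here are hypothetical; the actual layout would follow whatever the series settles on.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical split of the mixed "quirks" bitfield discussed above. */

/* Workarounds for HW bugs stay as quirks. */
#define ADRENO_QUIRK_TWO_PASS_USE_WFI	(1u << 0)
#define ADRENO_QUIRK_LMLOADKILL_DISABLE	(1u << 1)

/* Optional HW capabilities become features. */
#define ADRENO_FEAT_HAS_HW_APRIV	(1u << 0)
#define ADRENO_FEAT_HAS_CACHED_COHERENT	(1u << 1)
#define ADRENO_FEAT_PREEMPTION		(1u << 2)
#define ADRENO_FEAT_GMU_BW_VOTE		(1u << 3)

struct adreno_info_stub {
	uint32_t quirks;	/* bugs to work around */
	uint32_t features;	/* capabilities to opt in to */
};

static int adreno_has_gmu_bw_vote(const struct adreno_info_stub *info)
{
	return !!(info->features & ADRENO_FEAT_GMU_BW_VOTE);
}
```

The point of the split is purely semantic: call-sites read `info->features & ADRENO_FEAT_...` for capabilities, so a reader no longer has to guess whether a "quirk" describes broken or extra hardware.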
The Adreno GPU Management Unit (GMU) can also vote for DDR bandwidth
along with the frequency and power domain level, but by default we let
the OPP core scale the interconnect ddr path.

While scaling via the interconnect path was sufficient, newer GPUs like
the A750 require specific vote parameters and bandwidth to achieve full
functionality.

In order to get the vote values to be used by the GMU, we need to parse
all the possible OPP bandwidths and create a vote value to be sent to
the appropriate Bus Control Modules (BCMs) declared in the GPU info
struct. The added dev_pm_opp_get_bandwidth() helper is used for this.

The vote array will then be used to dynamically generate the GMU
bw_table sent during the GMU power-up. Those entries will then be used
by passing the appropriate bandwidth level when voting for a GPU
frequency.

This makes sure all resources are equally voted for the same OPP:
whatever decision is taken by the GMU, all resource votes stay
synchronized.

Tested on the SM8650 and SM8550 platforms.

Any feedback is welcome.
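[Editor's note] The cover letter does not spell out how a "vote value" is derived from an OPP bandwidth. As a rough illustration only: BCM votes in the RPMh interconnect framework are the requested bandwidth quantized by the BCM's per-node aggregation unit and clamped to a 14-bit vote field. The sketch below is an assumption modelled on that scheme, not the actual GMU code; the real unit values come from cmd-db on hardware.

```c
#include <assert.h>
#include <stdint.h>

/* 14-bit vote field, as used by BCM TCS commands in the RPMh
 * interconnect driver (assumed to apply here as well). */
#define BCM_VOTE_MASK	0x3fffu

/* Quantize a bandwidth request (kBps) against a BCM's aggregation
 * unit (kBps), rounding up so we never vote below the request. */
static uint32_t bcm_vote(uint64_t bw_kbps, uint32_t unit_kbps)
{
	uint64_t vote;

	if (!unit_kbps)
		return 0;

	vote = (bw_kbps + unit_kbps - 1) / unit_kbps;
	if (vote > BCM_VOTE_MASK)
		vote = BCM_VOTE_MASK;

	return (uint32_t)vote;
}
```

Under this model, one such vote would be computed per BCM per OPP (using the peak bandwidth returned by dev_pm_opp_get_bandwidth()), and the resulting rows form the bw_table handed to the GMU at power-up.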
Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org>
---
Neil Armstrong (8):
      opp: core: implement dev_pm_opp_get_bandwidth
      drm/msm: adreno: add GMU_BW_VOTE quirk
      drm/msm: adreno: add plumbing to generate bandwidth vote table for GMU
      drm/msm: adreno: dynamically generate GMU bw table
      drm/msm: adreno: find bandwidth index of OPP and set it along freq index
      drm/msm: adreno: enable GMU bandwidth for A740 and A750
      arm64: qcom: dts: sm8550: add interconnect and opp-peak-kBps for GPU
      arm64: qcom: dts: sm8650: add interconnect and opp-peak-kBps for GPU

 arch/arm64/boot/dts/qcom/sm8550.dtsi      |  11 ++
 arch/arm64/boot/dts/qcom/sm8650.dtsi      |  14 +++
 drivers/gpu/drm/msm/adreno/a6xx_catalog.c |  26 ++++-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.c     | 180 +++++++++++++++++++++++++++++-
 drivers/gpu/drm/msm/adreno/a6xx_gmu.h     |  14 ++-
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h     |   1 +
 drivers/gpu/drm/msm/adreno/a6xx_hfi.c     |  54 ++++++---
 drivers/gpu/drm/msm/adreno/adreno_gpu.h   |   1 +
 drivers/opp/core.c                        |  25 +++++
 include/linux/pm_opp.h                    |   7 ++
 10 files changed, 314 insertions(+), 19 deletions(-)
---
base-commit: 86313a9cd152330c634b25d826a281c6a002eb77
change-id: 20241113-topic-sm8x50-gpu-bw-vote-f5e022fe7a47

Best regards,