[PATCHv5,0/6] System Cache support for GPU and required SMMU support

Message ID cover.1600754909.git.saiprakash.ranjan@codeaurora.org

Message

Sai Prakash Ranjan Sept. 22, 2020, 6:18 a.m. UTC
Some hardware variants contain a system cache, also called the last level
cache (LLC). This cache is typically a large block which is shared
by multiple clients on the SoC. The GPU uses the system cache to cache
both the GPU data buffers (like textures) as well as the SMMU pagetables.
This improves render performance and lowers power consumption by
reducing the bus traffic to the system memory.

The system cache architecture allows the cache to be split into slices
which can then be used by multiple SoC clients. This patch series is an
effort to enable and use two of those slices preallocated for the GPU,
one for the GPU data buffers and another for the GPU SMMU hardware
pagetables.

Patches 1-4 add system cache support in the SMMU and GPU drivers.
Patches 5 and 6 are minor cleanups for arm-smmu impl.

The series is based on top of https://gitlab.freedesktop.org/drm/msm/-/tree/msm-next-pgtables

Changes in v5:
 * Drop cleanup of blank lines since it was intentional (Robin)
 * Rebase again on top of msm-next-pgtables as it moves pretty fast

Changes in v4:
 * Drop IOMMU_SYS_CACHE prot flag
 * Rebase on top of https://gitlab.freedesktop.org/drm/msm/-/tree/msm-next-pgtables

Changes in v3:
 * Fix domain attribute setting to before iommu_attach_device()
 * Fix a few code style and checkpatch warnings
 * Rebase on top of Jordan's latest split pagetables and per-instance
   pagetables support

Changes in v2:
 * Addressed review comments and rebased on top of Jordan's split
   pagetables series

Sai Prakash Ranjan (4):
  iommu/io-pgtable-arm: Add support to use system cache
  iommu/arm-smmu: Add domain attribute for system cache
  iommu: arm-smmu-impl: Use table to list QCOM implementations
  iommu: arm-smmu-impl: Add a space before open parenthesis

Sharat Masetty (2):
  drm/msm: rearrange the gpu_rmw() function
  drm/msm/a6xx: Add support for using system cache(LLC)

 drivers/gpu/drm/msm/adreno/a6xx_gpu.c      | 83 ++++++++++++++++++++++
 drivers/gpu/drm/msm/adreno/a6xx_gpu.h      |  4 ++
 drivers/gpu/drm/msm/adreno/adreno_gpu.c    | 17 +++++
 drivers/gpu/drm/msm/msm_drv.c              |  8 +++
 drivers/gpu/drm/msm/msm_drv.h              |  1 +
 drivers/gpu/drm/msm/msm_gpu.h              |  5 +-
 drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 14 ++--
 drivers/iommu/arm/arm-smmu/arm-smmu.c      | 17 +++++
 drivers/iommu/arm/arm-smmu/arm-smmu.h      |  1 +
 drivers/iommu/io-pgtable-arm.c             |  7 +-
 include/linux/io-pgtable.h                 |  4 ++
 include/linux/iommu.h                      |  1 +
 12 files changed, 152 insertions(+), 10 deletions(-)


base-commit: 115b1aca7a2a9c0649b1f5f6cffee6873c7efd89

Comments

Robin Murphy Sept. 23, 2020, 3:24 p.m. UTC | #1
On 2020-09-22 07:18, Sai Prakash Ranjan wrote:
> Use table and of_match_node() to match qcom implementation
> instead of multiple of_device_is_compatible() calls for each
> QCOM SMMU implementation.
> 
> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
> ---
>   drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 12 ++++++++----
>   1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> index d199b4bff15d..ce78295cfa78 100644
> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
> @@ -9,6 +9,13 @@
>   
>   #include "arm-smmu.h"
>   
> +static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
> +	{ .compatible = "qcom,sc7180-smmu-500" },
> +	{ .compatible = "qcom,sdm845-smmu-500" },
> +	{ .compatible = "qcom,sm8150-smmu-500" },
> +	{ .compatible = "qcom,sm8250-smmu-500" },
> +	{ }
> +};

Can you push the table itself into arm-smmu-qcom? That way you'll be 
free to add new SoCs willy-nilly without any possibility of conflicting 
with anything else.

Bonus points if you can fold in the Adreno variant and keep everything 
together ;)

Robin.

>   static int arm_smmu_gr0_ns(int offset)
>   {
> @@ -217,10 +224,7 @@ struct arm_smmu_device *arm_smmu_impl_init(struct arm_smmu_device *smmu)
>   	if (of_device_is_compatible(np, "nvidia,tegra194-smmu"))
>   		return nvidia_smmu_impl_init(smmu);
>   
> -	if (of_device_is_compatible(np, "qcom,sdm845-smmu-500") ||
> -	    of_device_is_compatible(np, "qcom,sc7180-smmu-500") ||
> -	    of_device_is_compatible(np, "qcom,sm8150-smmu-500") ||
> -	    of_device_is_compatible(np, "qcom,sm8250-smmu-500"))
> +	if (of_match_node(qcom_smmu_impl_of_match, np))
>   		return qcom_smmu_impl_init(smmu);
>   
>   	if (of_device_is_compatible(smmu->dev->of_node, "qcom,adreno-smmu"))
>
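
For readers less familiar with the pattern in this patch, here is a minimal userspace sketch of sentinel-terminated table matching, the idea behind replacing a chain of of_device_is_compatible() calls with one of_match_node() lookup. It is not kernel code: `struct compat_id`, `qcom_smmu_match` and `match_compat()` are hypothetical stand-ins, and the model assumes a node carries a single compatible string that is compared exactly, which is simpler than what the real helper does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the kernel's struct of_device_id. */
struct compat_id {
	const char *compatible;	/* NULL marks the end of the table */
};

/* Same entries as the table in the patch, closed by an empty sentinel. */
static const struct compat_id qcom_smmu_match[] = {
	{ .compatible = "qcom,sc7180-smmu-500" },
	{ .compatible = "qcom,sdm845-smmu-500" },
	{ .compatible = "qcom,sm8150-smmu-500" },
	{ .compatible = "qcom,sm8250-smmu-500" },
	{ NULL }
};

/*
 * Walk the table until the sentinel and return the first entry whose
 * compatible string equals the one given, or NULL if nothing matches.
 */
static const struct compat_id *match_compat(const struct compat_id *table,
					    const char *compatible)
{
	for (; table->compatible; table++) {
		if (strcmp(table->compatible, compatible) == 0)
			return table;
	}
	return NULL;
}
```

With this shape, supporting another SoC means adding one table entry; the lookup code and its caller stay unchanged.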
Sai Prakash Ranjan Sept. 28, 2020, 12:28 p.m. UTC | #2
On 2020-09-23 20:54, Robin Murphy wrote:
> On 2020-09-22 07:18, Sai Prakash Ranjan wrote:
>> Use table and of_match_node() to match qcom implementation
>> instead of multiple of_device_is_compatible() calls for each
>> QCOM SMMU implementation.
>> 
>> Signed-off-by: Sai Prakash Ranjan <saiprakash.ranjan@codeaurora.org>
>> ---
>>   drivers/iommu/arm/arm-smmu/arm-smmu-impl.c | 12 ++++++++----
>>   1 file changed, 8 insertions(+), 4 deletions(-)
>> 
>> diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
>> index d199b4bff15d..ce78295cfa78 100644
>> --- a/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
>> +++ b/drivers/iommu/arm/arm-smmu/arm-smmu-impl.c
>> @@ -9,6 +9,13 @@
>>
>>   #include "arm-smmu.h"
>>
>> +static const struct of_device_id __maybe_unused qcom_smmu_impl_of_match[] = {
>> +	{ .compatible = "qcom,sc7180-smmu-500" },
>> +	{ .compatible = "qcom,sdm845-smmu-500" },
>> +	{ .compatible = "qcom,sm8150-smmu-500" },
>> +	{ .compatible = "qcom,sm8250-smmu-500" },
>> +	{ }
>> +};
> 
> Can you push the table itself into arm-smmu-qcom? That way you'll be
> free to add new SoCs willy-nilly without any possibility of
> conflicting with anything else.
> 
> Bonus points if you can fold in the Adreno variant and keep everything
> together ;)
> 

Sure I can get bonus points :)

Thanks,
Sai
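
One possible shape for the suggestion above (keep the table private to the QCOM code and fold in the Adreno variant) is sketched below as a userspace model. It is an illustration only, not the code that was eventually posted: `struct smmu_impl`, `qcom_impl`, `adreno_impl`, `qcom_match` and `qcom_pick_impl()` are hypothetical names, and the per-entry `data` pointer mimics the `.data` field of the kernel's struct of_device_id. The model also assumes each node is identified by one compatible string, whereas a real device tree node may list several.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a set of per-variant implementation hooks. */
struct smmu_impl {
	const char *name;
};

static const struct smmu_impl qcom_impl   = { .name = "qcom" };
static const struct smmu_impl adreno_impl = { .name = "qcom-adreno" };

/* Simplified model of a match entry: compatible string plus match data. */
struct compat_id {
	const char *compatible;
	const void *data;	/* which implementation this entry selects */
};

/* One table, private to this file, covers both QCOM variants. */
static const struct compat_id qcom_match[] = {
	{ .compatible = "qcom,adreno-smmu",     .data = &adreno_impl },
	{ .compatible = "qcom,sc7180-smmu-500", .data = &qcom_impl },
	{ .compatible = "qcom,sdm845-smmu-500", .data = &qcom_impl },
	{ .compatible = "qcom,sm8150-smmu-500", .data = &qcom_impl },
	{ .compatible = "qcom,sm8250-smmu-500", .data = &qcom_impl },
	{ NULL, NULL }
};

/*
 * Return the implementation selected by the compatible string, or NULL
 * when the device is not a QCOM SMMU, so a generic caller can fall
 * through to its default handling.
 */
static const struct smmu_impl *qcom_pick_impl(const char *compatible)
{
	const struct compat_id *id;

	for (id = qcom_match; id->compatible; id++) {
		if (strcmp(id->compatible, compatible) == 0)
			return id->data;
	}
	return NULL;
}
```

In this arrangement the generic code only needs to call one entry point, and new SoCs are added in the QCOM file alone, which is the conflict-avoidance benefit mentioned in the review.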