mbox series

[v5,00/25] Add perf support to the rockchip-dfi driver

Message ID 20230524083153.2046084-1-s.hauer@pengutronix.de
Headers show
Series Add perf support to the rockchip-dfi driver | expand

Message

Sascha Hauer May 24, 2023, 8:31 a.m. UTC
This is v5 of the series adding perf support to the rockchip DFI driver.

A lot has changed in the perf driver since v4. First of all the review
feedback from Robin and Jonathan has been integrated. The perf driver
now not only supports monitoring the total DDR utilization, but also the
individual channels. I also reworked the way the raw 32bit counter
values are summed up to 64bit perf values, so hopefully the code is
easier to follow now.

lockdep found out that that locking in the perf driver was broken, so I
reworked that as well. None of the perf hooks allows locking with
mutexes or spinlocks, so in perf it's not possible to enable the DFI
controller when needed. Instead I now unconditionally enable the DFI
controller during probe when perf is enabled.

Furthermore the hrtimer I use for reading out the hardware counter
values before they overflow race with perf. Now a seqlock is used to
prevent that.

The RK3588 device tree changes for the DFI were not part of v4. As
Vincent Legoll showed interest in testing this series the necessary
device tree changes are now part of this series.

Changes since v4:
- Add device tree changes for RK3588
- Use seqlock to protect perf counter values from hrtimer
- Unconditionally enable DFI when perf is enabled
- Bring back changes to dts/binding patches that were lost in v4

Changes since v3:
- Add RK3588 support

Changes since v2:
- Fix broken reference to binding
- Add Reviewed-by from Rob

Changes since v1:
- Fix example to actually match the binding and fix the warnings resulted thereof
- Make addition of rockchip,rk3568-dfi an extra patch

Sascha Hauer (25):
  PM / devfreq: rockchip-dfi: Make pmu regmap mandatory
  PM / devfreq: rockchip-dfi: Embed desc into private data struct
  PM / devfreq: rockchip-dfi: use consistent name for private data
    struct
  PM / devfreq: rockchip-dfi: Add SoC specific init function
  PM / devfreq: rockchip-dfi: dfi store raw values in counter struct
  PM / devfreq: rockchip-dfi: Use free running counter
  PM / devfreq: rockchip-dfi: introduce channel mask
  PM / devfreq: rk3399_dmc,dfi: generalize DDRTYPE defines
  PM / devfreq: rockchip-dfi: Clean up DDR type register defines
  PM / devfreq: rockchip-dfi: Add RK3568 support
  PM / devfreq: rockchip-dfi: Handle LPDDR2 correctly
  PM / devfreq: rockchip-dfi: Handle LPDDR4X
  PM / devfreq: rockchip-dfi: Pass private data struct to internal
    functions
  PM / devfreq: rockchip-dfi: Prepare for multiple users
  PM / devfreq: rockchip-dfi: give variable a better name
  PM / devfreq: rockchip-dfi: Add perf support
  PM / devfreq: rockchip-dfi: make register stride SoC specific
  PM / devfreq: rockchip-dfi: account for multiple DDRMON_CTRL registers
  PM / devfreq: rockchip-dfi: add support for RK3588
  dt-bindings: devfreq: event: convert Rockchip DFI binding to yaml
  dt-bindings: devfreq: event: rockchip,dfi: Add rk3568 support
  dt-bindings: devfreq: event: rockchip,dfi: Add rk3588 support
  arm64: dts: rockchip: rk3399: Enable DFI
  arm64: dts: rockchip: rk356x: Add DFI
  arm64: dts: rockchip: rk3588s: Add DFI

 .../bindings/devfreq/event/rockchip,dfi.yaml  |  84 ++
 .../bindings/devfreq/event/rockchip-dfi.txt   |  18 -
 .../rockchip,rk3399-dmc.yaml                  |   2 +-
 arch/arm64/boot/dts/rockchip/rk3399.dtsi      |   1 -
 arch/arm64/boot/dts/rockchip/rk356x.dtsi      |   7 +
 arch/arm64/boot/dts/rockchip/rk3588s.dtsi     |  16 +
 drivers/devfreq/event/rockchip-dfi.c          | 796 +++++++++++++++---
 drivers/devfreq/rk3399_dmc.c                  |  10 +-
 include/soc/rockchip/rk3399_grf.h             |   9 +-
 include/soc/rockchip/rk3568_grf.h             |  13 +
 include/soc/rockchip/rk3588_grf.h             |  18 +
 include/soc/rockchip/rockchip_grf.h           |  18 +
 12 files changed, 854 insertions(+), 138 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml
 delete mode 100644 Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
 create mode 100644 include/soc/rockchip/rk3568_grf.h
 create mode 100644 include/soc/rockchip/rk3588_grf.h
 create mode 100644 include/soc/rockchip/rockchip_grf.h

Comments

Rob Herring June 8, 2023, 8:07 p.m. UTC | #1
On Wed, May 24, 2023 at 10:31:50AM +0200, Sascha Hauer wrote:
> This adds rockchip,rk3588-dfi to the list of compatibles. Unlike ealier
> SoCs the rk3588 has four interrupts (one for each channel) instead of
> only one, so increase the number of allowed interrupts to four and also
> add interrupt-names.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
> 
> Notes:
>     Changes since v4:
>     - new patch
> 
>  .../bindings/devfreq/event/rockchip,dfi.yaml       | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml b/Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml
> index e8b64494ee8bd..4e647a9560560 100644
> --- a/Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml
> +++ b/Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml
> @@ -14,6 +14,7 @@ properties:
>      enum:
>        - rockchip,rk3399-dfi
>        - rockchip,rk3568-dfi
> +      - rockchip,rk3588-dfi
>  
>    clocks:
>      maxItems: 1
> @@ -23,7 +24,18 @@ properties:
>        - const: pclk_ddr_mon
>  
>    interrupts:
> -    maxItems: 1
> +    minItems: 1
> +    maxItems: 4
> +
> +  interrupt-names:
> +    oneOf:
> +      - items:
> +          - const: ch0
> +      - items:
> +          - const: ch0
> +          - const: ch1
> +          - const: ch2
> +          - const: ch3

Names that are just an index are generally pointless.
Sebastian Reichel June 13, 2023, 4:19 p.m. UTC | #2
Hi,

On Wed, May 24, 2023 at 10:31:53AM +0200, Sascha Hauer wrote:
> The DFI unit can be used to measure DRAM utilization using perf. Add the
> node to the device tree. The DFI needs a rockchip,pmu phandle to the pmu
> containing registers for SDRAM configuration details. This is added in
> this patch as well.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---
> 
> Notes:
>     Changes since v4:
>     - new patch
> 
>  arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
> 
> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> index 657c019d27fa9..4a445d8704c8f 100644
> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> @@ -388,6 +388,11 @@ scmi_shmem: sram@0 {
>  		};
>  	};
>  
> +	pmu1grf: syscon@fd58a000 {
> +		compatible = "rockchip,rk3588-pmugrf", "syscon", "simple-mfd";
                      ^^^^^^^^^^^^^^^^^^^^^^

Must be added in Documentation/devicetree/bindings/soc/rockchip/grf.yaml

Otherwise the patch is

Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>

-- Sebastian

> +		reg = <0x0 0xfd58a000 0x0 0x10000>;
> +	};
> +
>  	sys_grf: syscon@fd58c000 {
>  		compatible = "rockchip,rk3588-sys-grf", "syscon";
>  		reg = <0x0 0xfd58c000 0x0 0x1000>;
> @@ -1112,6 +1117,17 @@ qos_vop_m1: qos@fdf82200 {
>  		reg = <0x0 0xfdf82200 0x0 0x20>;
>  	};
>  
> +	dfi: dfi@fe060000 {
> +		reg = <0x00 0xfe060000 0x00 0x10000>;
> +		compatible = "rockchip,rk3588-dfi";
> +		interrupts = <GIC_SPI 28 IRQ_TYPE_LEVEL_HIGH 0>,
> +			     <GIC_SPI 38 IRQ_TYPE_LEVEL_HIGH 0>,
> +			     <GIC_SPI 48 IRQ_TYPE_LEVEL_HIGH 0>,
> +			     <GIC_SPI 58 IRQ_TYPE_LEVEL_HIGH 0>;
> +		interrupt-names = "ch0", "ch1", "ch2", "ch3";
> +		rockchip,pmu = <&pmu1grf>;
> +	};
> +
>  	gmac1: ethernet@fe1c0000 {
>  		compatible = "rockchip,rk3588-gmac", "snps,dwmac-4.20a";
>  		reg = <0x0 0xfe1c0000 0x0 0x10000>;
> -- 
> 2.39.2
>
Sebastian Reichel June 13, 2023, 5:17 p.m. UTC | #3
Hi,

On Wed, May 24, 2023 at 10:31:45AM +0200, Sascha Hauer wrote:
> The currently supported RK3399 has a stride of 20 between the channel
> specific registers. Upcoming RK3588 has a different stride, so put
> the stride into driver data to make it configurable.
> While at it convert decimal 20 to hex 0x14 for consistency with RK3588
> which has a register stride 0x4000 and we want to write that in hex
> as well.
> 
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---

Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>

-- Sebastian

>  drivers/devfreq/event/rockchip-dfi.c | 11 +++++++----
>  1 file changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/devfreq/event/rockchip-dfi.c b/drivers/devfreq/event/rockchip-dfi.c
> index 88145688e3d9c..a872550a7caf5 100644
> --- a/drivers/devfreq/event/rockchip-dfi.c
> +++ b/drivers/devfreq/event/rockchip-dfi.c
> @@ -112,6 +112,7 @@ struct rockchip_dfi {
>  	int active_events;
>  	int burst_len;
>  	int buswidth[DMC_MAX_CHANNELS];
> +	int ddrmon_stride;
>  };
>  
>  static int rockchip_dfi_enable(struct rockchip_dfi *dfi)
> @@ -189,13 +190,13 @@ static void rockchip_dfi_read_counters(struct rockchip_dfi *dfi, struct dmc_coun
>  		if (!(dfi->channel_mask & BIT(i)))
>  			continue;
>  		c->c[i].read_access = readl_relaxed(dfi_regs +
> -				DDRMON_CH0_RD_NUM + i * 20);
> +				DDRMON_CH0_RD_NUM + i * dfi->ddrmon_stride);
>  		c->c[i].write_access = readl_relaxed(dfi_regs +
> -				DDRMON_CH0_WR_NUM + i * 20);
> +				DDRMON_CH0_WR_NUM + i * dfi->ddrmon_stride);
>  		c->c[i].access = readl_relaxed(dfi_regs +
> -				DDRMON_CH0_DFI_ACCESS_NUM + i * 20);
> +				DDRMON_CH0_DFI_ACCESS_NUM + i * dfi->ddrmon_stride);
>  		c->c[i].clock_cycles = readl_relaxed(dfi_regs +
> -				DDRMON_CH0_COUNT_NUM + i * 20);
> +				DDRMON_CH0_COUNT_NUM + i * dfi->ddrmon_stride);
>  	}
>  }
>  
> @@ -661,6 +662,8 @@ static int rk3399_dfi_init(struct rockchip_dfi *dfi)
>  	dfi->buswidth[0] = FIELD_GET(RK3399_PMUGRF_OS_REG2_BW_CH0, val) == 0 ? 4 : 2;
>  	dfi->buswidth[1] = FIELD_GET(RK3399_PMUGRF_OS_REG2_BW_CH1, val) == 0 ? 4 : 2;
>  
> +	dfi->ddrmon_stride = 0x14;
> +
>  	return 0;
>  };
>  
> -- 
> 2.39.2
>
Sebastian Reichel June 13, 2023, 11:39 p.m. UTC | #4
Hi,

On Wed, May 24, 2023 at 10:31:48AM +0200, Sascha Hauer wrote:
> Convert the Rockchip DFI binding to yaml.
> 
> Reviewed-by: Rob Herring <robh@kernel.org>
> Signed-off-by: Sascha Hauer <s.hauer@pengutronix.de>
> ---

Reviewed-by: Sebastian Reichel <sebastian.reichel@collabora.com>

-- Sebastian

> 
> Notes:
>     Changes since v4:
>     
>     - Revert to state of v3 (changes were lost in v4)
> 
>  .../bindings/devfreq/event/rockchip,dfi.yaml  | 61 +++++++++++++++++++
>  .../bindings/devfreq/event/rockchip-dfi.txt   | 18 ------
>  .../rockchip,rk3399-dmc.yaml                  |  2 +-
>  3 files changed, 62 insertions(+), 19 deletions(-)
>  create mode 100644 Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml
>  delete mode 100644 Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
> 
> diff --git a/Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml b/Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml
> new file mode 100644
> index 0000000000000..7a82f6ae0701e
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml
> @@ -0,0 +1,61 @@
> +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/devfreq/event/rockchip,dfi.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Rockchip DFI
> +
> +maintainers:
> +  - Sascha Hauer <s.hauer@pengutronix.de>
> +
> +properties:
> +  compatible:
> +    enum:
> +      - rockchip,rk3399-dfi
> +
> +  clocks:
> +    maxItems: 1
> +
> +  clock-names:
> +    items:
> +      - const: pclk_ddr_mon
> +
> +  interrupts:
> +    maxItems: 1
> +
> +  reg:
> +    maxItems: 1
> +
> +  rockchip,pmu:
> +    $ref: /schemas/types.yaml#/definitions/phandle
> +    description:
> +      Phandle to the syscon managing the "PMU general register files".
> +
> +required:
> +  - compatible
> +  - clocks
> +  - clock-names
> +  - interrupts
> +  - reg
> +
> +additionalProperties: false
> +
> +examples:
> +  - |
> +    #include <dt-bindings/interrupt-controller/arm-gic.h>
> +    #include <dt-bindings/clock/rk3308-cru.h>
> +
> +    bus {
> +      #address-cells = <2>;
> +      #size-cells = <2>;
> +
> +      dfi: dfi@ff630000 {
> +        compatible = "rockchip,rk3399-dfi";
> +        reg = <0x00 0xff630000 0x00 0x4000>;
> +        interrupts = <GIC_SPI 131 IRQ_TYPE_LEVEL_HIGH 0>;
> +        rockchip,pmu = <&pmugrf>;
> +        clocks = <&cru PCLK_DDR_MON>;
> +        clock-names = "pclk_ddr_mon";
> +      };
> +    };
> diff --git a/Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt b/Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
> deleted file mode 100644
> index 148191b0fc158..0000000000000
> --- a/Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt
> +++ /dev/null
> @@ -1,18 +0,0 @@
> -
> -* Rockchip rk3399 DFI device
> -
> -Required properties:
> -- compatible: Must be "rockchip,rk3399-dfi".
> -- reg: physical base address of each DFI and length of memory mapped region
> -- rockchip,pmu: phandle to the syscon managing the "pmu general register files"
> -- clocks: phandles for clock specified in "clock-names" property
> -- clock-names : the name of clock used by the DFI, must be "pclk_ddr_mon";
> -
> -Example:
> -	dfi: dfi@ff630000 {
> -		compatible = "rockchip,rk3399-dfi";
> -		reg = <0x00 0xff630000 0x00 0x4000>;
> -		rockchip,pmu = <&pmugrf>;
> -		clocks = <&cru PCLK_DDR_MON>;
> -		clock-names = "pclk_ddr_mon";
> -	};
> diff --git a/Documentation/devicetree/bindings/memory-controllers/rockchip,rk3399-dmc.yaml b/Documentation/devicetree/bindings/memory-controllers/rockchip,rk3399-dmc.yaml
> index fb4920397d08e..aba8649aaeb10 100644
> --- a/Documentation/devicetree/bindings/memory-controllers/rockchip,rk3399-dmc.yaml
> +++ b/Documentation/devicetree/bindings/memory-controllers/rockchip,rk3399-dmc.yaml
> @@ -18,7 +18,7 @@ properties:
>      $ref: /schemas/types.yaml#/definitions/phandle
>      description:
>        Node to get DDR loading. Refer to
> -      Documentation/devicetree/bindings/devfreq/event/rockchip-dfi.txt.
> +      Documentation/devicetree/bindings/devfreq/event/rockchip,dfi.yaml.
>  
>    clocks:
>      maxItems: 1
> -- 
> 2.39.2
>
Sebastian Reichel June 14, 2023, 3:27 p.m. UTC | #5
Hi,

On Wed, Jun 14, 2023 at 03:40:34PM +0200, Sebastian Reichel wrote:
> I tested the series on RK3588 EVB1. The read/write byts looks
> sensible. Sometimes cycles reads unrealistic values, though:
> 
> 18446744070475110400      rockchip_ddr/cycles/

I have seen this going off a few times with and without memory
pressure. If it's way off, it always seems to follow the same
pattern: The upper 32 bits are 0xffffffff instead of 0x00000000
with the lower 32 bits containing sensible data.

-- Sebastian
Vincent Legoll June 14, 2023, 7:51 p.m. UTC | #6
Hi,

On Wed, Jun 14, 2023 at 3:27 PM Sebastian Reichel
<sebastian.reichel@collabora.com> wrote:
> On Wed, Jun 14, 2023 at 03:40:34PM +0200, Sebastian Reichel wrote:
> > I tested the series on RK3588 EVB1. The read/write byts looks
> > sensible. Sometimes cycles reads unrealistic values, though:
> >
> > 18446744070475110400      rockchip_ddr/cycles/
>
> I have seen this going off a few times with and without memory
> pressure. If it's way off, it always seems to follow the same
> pattern: The upper 32 bits are 0xffffffff instead of 0x00000000
> with the lower 32 bits containing sensible data.

How often is that happening ?

I have been running this in a loop with varying sleep duration for
the last few hours without hitting it, with and without memtester.

REPEATS=1000
MAX_DURATION=100

OUT="/tmp/perf-dfi-rk3588-${REPEATS}x${MAX_DURATION}s.txt"
echo -n > $OUT

for i in $(seq $REPEATS) ; do
  DURATION=$(shuf -i "0-${MAX_DURATION}" -n 1)
  echo -n "${DURATION} : " >> $OUT
  perf stat -a -e rockchip_ddr/cycles/ sleep $DURATION 2>&1 | awk
'/rockchip_ddr/ {printf("%X\n", int($1))}' >> $OUT
done

BTW, it's on a musl-libc arm64 void linux userspace.
Sebastian Reichel June 14, 2023, 10:18 p.m. UTC | #7
Hi,

On Wed, Jun 14, 2023 at 07:51:21PM +0000, Vincent Legoll wrote:
> On Wed, Jun 14, 2023 at 3:27 PM Sebastian Reichel
> <sebastian.reichel@collabora.com> wrote:
> > On Wed, Jun 14, 2023 at 03:40:34PM +0200, Sebastian Reichel wrote:
> > > I tested the series on RK3588 EVB1. The read/write byts looks
> > > sensible. Sometimes cycles reads unrealistic values, though:
> > >
> > > 18446744070475110400      rockchip_ddr/cycles/
> >
> > I have seen this going off a few times with and without memory
> > pressure. If it's way off, it always seems to follow the same
> > pattern: The upper 32 bits are 0xffffffff instead of 0x00000000
> > with the lower 32 bits containing sensible data.
> 
> How often is that happening ?

I saw it multiple times (4 or 5) within a few tries (I guess around
20). I could see it with and without applying load on the memory.
Tests have been running globally for a second using 'sleep 1' (just
like the example from Sascha Hauer in the perf patch)

> BTW, it's on a musl-libc arm64 void linux userspace.

In my case it's Debian unstable.

-- Sebastian