mbox series

[v2,00/12] PolarFire SoC reset controller & clock cleanups

Message ID 20220704121558.2088698-1-conor.dooley@microchip.com
Headers show
Series PolarFire SoC reset controller & clock cleanups | expand

Message

Conor Dooley July 4, 2022, 12:15 p.m. UTC
Hey all,
I know I have not sat on the RFC I sent about the aux. bus parts
for too long, but figured I'd just send the whole thing anyway to all
lists etc.

Kinda two things happening in this series, but I sent it together to
ensure the second part would apply correctly.

The first is the reset controller that I promised after discovering the
issue triggered by CONFIG_PM & the phy not coming up correctly. I have
now removed all the messing with resets from clock enable/disable
functions & now use the aux bus to set up a reset controller driver.
Since I needed something to test it, I hooked up the reset for the
Cadence MACB on PolarFire SoC. This has been split into a second series
for v2:
https://lore.kernel.org/all/20220704114511.1892332-1-conor.dooley@microchip.com/

The second part adds rate control for the MSS PLL clock, followed by
some simplifications to the driver & conversions of some custom structs
to the corresponding structs in the framework.

Thanks,
Conor.

Changes since v1:
- split off the net patches
- clk: actually pass the spinlock to the converted dividers & gates
- reset: added a spinlock around RMW access to registers
- reset: switched to BIT(i) macros
- reset: used local copies of some variables as pointed out by Philipp
- reset: dropped the success printout

Conor Dooley (12):
  dt-bindings: clk: microchip: mpfs: add reset controller support
  clk: microchip: mpfs: add reset controller
  reset: add polarfire soc reset support
  MAINTAINERS: add polarfire soc reset controller
  riscv: dts: microchip: add mpfs specific macb reset support
  clk: microchip: mpfs: add module_authors entries
  clk: microchip: mpfs: add MSS pll's set & round rate
  clk: microchip: mpfs: move id & offset out of clock structs
  clk: microchip: mpfs: simplify control reg access
  clk: microchip: mpfs: delete 2 line mpfs_clk_register_foo()
  clk: microchip: mpfs: convert cfg_clk to clk_divider
  clk: microchip: mpfs: convert periph_clk to clk_gate

 .../bindings/clock/microchip,mpfs.yaml        |  17 +-
 MAINTAINERS                                   |   1 +
 arch/riscv/boot/dts/microchip/mpfs.dtsi       |   7 +-
 drivers/clk/microchip/Kconfig                 |   1 +
 drivers/clk/microchip/clk-mpfs.c              | 379 +++++++++---------
 drivers/reset/Kconfig                         |   7 +
 drivers/reset/Makefile                        |   2 +-
 drivers/reset/reset-mpfs.c                    | 157 ++++++++
 include/soc/microchip/mpfs.h                  |   8 +
 9 files changed, 386 insertions(+), 193 deletions(-)
 create mode 100644 drivers/reset/reset-mpfs.c


base-commit: b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3

Comments

Daire.McNamara@microchip.com Aug. 10, 2022, 11:50 a.m. UTC | #1
Reviewed-by: Daire McNamara <daire.mcnamara@microchip.com>

On Mon, 2022-07-04 at 13:15 +0100, Conor Dooley wrote:
> Hey all,
> I know I have not sat on the RFC I sent about the aux. bus parts
> for too long, but figured I'd just send the whole thing anyway to all
> lists etc.
> 
> Kinda two things happening in this series, but I sent it together to
> ensure the second part would apply correctly.
> 
> The first is the reset controller that I promised after discovering
> the
> issue triggered by CONFIG_PM & the phy not coming up correctly. I
> have
> now removed all the messing with resets from clock enable/disable
> functions & now use the aux bus to set up a reset controller driver.
> Since I needed something to test it, I hooked up the reset for the
> Cadence MACB on PolarFire SoC. This has been split into a second
> series
> for v2:
> https://lore.kernel.org/all/20220704114511.1892332-1-conor.dooley@microchip.com/
> 
> The second part adds rate control for the MSS PLL clock, followed by
> some simplifications to the driver & conversions of some custom
> structs
> to the corresponding structs in the framework.
> 
> Thanks,
> Conor.
> 
> Changes since v1:
> - split off the net patches
> - clk: actually pass the spinlock to the converted dividers & gates
> - reset: added a spinlock around RMW access to registers
> - reset: switched to BIT(i) macros
> - reset: used local copies of some variables as pointed out by
> Philipp
> - reset: dropped the success printout
> 
> Conor Dooley (12):
>   dt-bindings: clk: microchip: mpfs: add reset controller support
>   clk: microchip: mpfs: add reset controller
>   reset: add polarfire soc reset support
>   MAINTAINERS: add polarfire soc reset controller
>   riscv: dts: microchip: add mpfs specific macb reset support
>   clk: microchip: mpfs: add module_authors entries
>   clk: microchip: mpfs: add MSS pll's set & round rate
>   clk: microchip: mpfs: move id & offset out of clock structs
>   clk: microchip: mpfs: simplify control reg access
>   clk: microchip: mpfs: delete 2 line mpfs_clk_register_foo()
>   clk: microchip: mpfs: convert cfg_clk to clk_divider
>   clk: microchip: mpfs: convert periph_clk to clk_gate
> 
>  .../bindings/clock/microchip,mpfs.yaml        |  17 +-
>  MAINTAINERS                                   |   1 +
>  arch/riscv/boot/dts/microchip/mpfs.dtsi       |   7 +-
>  drivers/clk/microchip/Kconfig                 |   1 +
>  drivers/clk/microchip/clk-mpfs.c              | 379 +++++++++-------
> --
>  drivers/reset/Kconfig                         |   7 +
>  drivers/reset/Makefile                        |   2 +-
>  drivers/reset/reset-mpfs.c                    | 157 ++++++++
>  include/soc/microchip/mpfs.h                  |   8 +
>  9 files changed, 386 insertions(+), 193 deletions(-)
>  create mode 100644 drivers/reset/reset-mpfs.c
> 
> 
> base-commit: b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3
Conor Dooley Aug. 10, 2022, 7:20 p.m. UTC | #2
On 10/08/2022 19:56, Nathan Chancellor wrote:
> Hi Conor,
> 
> On Tue, Aug 09, 2022 at 11:05:32PM +0000, Conor.Dooley@microchip.com wrote:
>> +CC clang people :)
>>
>> Got an odd one here and would appreciate some pointers for where to
>> look. This code when built with gcc boots fine, for example with:
>> riscv64-unknown-linux-gnu-gcc (g5964b5cd727) 11.1.0
>> The same code but build with clang build it fails to boot but prior to
>> that applying this patchset it boots fine. Specifically it is the patch
>> "clk: microchip: mpfs: move id & offset out of clock structs"
>>
>> I applied this patchset on top of tonight's master (15205c2829ca) but
>> I've been seeing the same problem for a few weeks on -next too. I tried
>> the following 2 versions of clang/llvm:
>> ClangBuiltLinux clang version 15.0.0 (5b0788fef86ed7008a11f6ee19b9d86d42b6fcfa), LLD 15.0.0
>> ClangBuiltLinux clang version 15.0.0 (bab8af8ea062f6332b5c5d13ae688bb8900f244a), LLD 15.0.0
> 
> Good to know that it reproduces with fairly recent versions of LLVM :)
> 
>> It's probably something silly that I've overlooked but I am not au
>> fait with these sort of things unfortunately, but hey - at least I'll
>> learn something then.
> 
> I took a quick glance at the patch you mentioned above and I don't
> immediately see anything as problematic...

Yeah, I couldn't see any low hanging fruit either.

> I was going to see if I could
> reproduce this locally in QEMU since I do see there is a machine
> 'microchip-icicle-kit' but I am not having much success getting the
> machine past SBI. Does this reproduce in QEMU or are you working with
> the real hardware? If QEMU, do you happen to have a working invocation
> handy?

Yeah... So there was a QEMU incantation that worked at some point in
the past (ie when someone wrote the QEMU port) but most peripherals
are not implemented and current versions of our openSBI implementation
requires more than one of the unimplemented peripherals. I was trying to
get it working lately in the evenings based on some patches that were a
year old but no joy :/

I'm running on the real hardware, I'll give the older combo of qemu
"bios" etc a go again over the weekend & try to get it working. In the
meantime, any suggestions?
Thanks Nathan,
Conor.

> 
>> The boot log is fairly short so here ya go:
>>
>> [    0.000000] Linux version 5.19.0-13253-g374b508ee318 (conor@spud) (ClangBuiltLinux clang version 15.0.0 (git@g
>> ithub.com:llvm/llvm-project.git 5b0788fef86ed7008a11f6ee19b9d86d42b6fcfa), LLD 15.0.0) #1 SMP Tue Aug 9 22:42:10 IST 
>> 2022
>> [    0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x80200000
>> [    0.000000] Machine model: Microchip PolarFire-SoC Icicle Kit
>> [    0.000000] earlycon: ns16550a0 at MMIO32 0x0000000020100000 (options '115200n8')
>> [    0.000000] printk: bootconsole [ns16550a0] enabled
>> [    0.000000] efi: UEFI not found.
>> [    0.000000] Zone ranges:
>> [    0.000000]   DMA32    [mem 0x0000000080200000-0x00000000ffffffff]
>> [    0.000000]   Normal   [mem 0x0000000100000000-0x000000103fffffff]
>> [    0.000000] Movable zone start for each node
>> [    0.000000] Early memory node ranges
>> [    0.000000]   node   0: [mem 0x0000000080200000-0x00000000adffffff]
>> [    0.000000]   node   0: [mem 0x0000001000000000-0x000000103fffffff]
>> [    0.000000] Initmem setup node 0 [mem 0x0000000080200000-0x000000103fffffff]
>> [    0.000000] On node 0, zone Normal: 16064512 pages in unavailable ranges
>> [    0.000000] SBI specification v0.3 detected
>> [    0.000000] SBI implementation ID=0x1 Version=0x9
>> [    0.000000] SBI TIME extension detected
>> [    0.000000] SBI IPI extension detected
>> [    0.000000] SBI RFENCE extension detected
>> [    0.000000] SBI HSM extension detected
>> [    0.000000] CPU with hartid=0 is not available
>> [    0.000000] CPU with hartid=0 is not available
>> [    0.000000] riscv: base ISA extensions acdfim
>> [    0.000000] riscv: ELF capabilities acdfim
>> [    0.000000] percpu: Embedded 18 pages/cpu s34168 r8192 d31368 u73728
>> [    0.000000] pcpu-alloc: s34168 r8192 d31368 u73728 alloc=18*4096
>> [    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
>> [    0.000000] CPU node for /cpus/cpu@0 exist but the possible cpu range is :0-3
>> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 224263
>> [    0.000000] Kernel command line: root=/dev/nfs ip=dhcp debug nfsroot=192.168.2.5:/stuff/nfs_share earlycon
>> [    0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
>> [    0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
>> [    0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
>> [    0.000000] software IO TLB: area num 4.
>> [    0.000000] software IO TLB: mapped [mem 0x00000000aa000000-0x00000000ae000000] (64MB)
>> [    0.000000] Virtual kernel memory layout:
>> [    0.000000]       fixmap : 0xffffffc6fee00000 - 0xffffffc6ff000000   (2048 kB)
>> [    0.000000]       pci io : 0xffffffc6ff000000 - 0xffffffc700000000   (  16 MB)
>> [    0.000000]      vmemmap : 0xffffffc700000000 - 0xffffffc800000000   (4096 MB)
>> [    0.000000]      vmalloc : 0xffffffc800000000 - 0xffffffd800000000   (  64 GB)
>> [    0.000000]       lowmem : 0xffffffd800000000 - 0xffffffe7bfe00000   (  62 GB)
>> [    0.000000]       kernel : 0xffffffff80000000 - 0xffffffffffffffff   (2047 MB)
>> [    0.000000] Memory: 803032K/1800192K available (7485K kernel code, 2858K rwdata, 4096K rodata, 2181K init, 394K bss, 997160K reserved, 0K cma-reserved)
>> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
>> [    0.000000] rcu: Hierarchical RCU implementation.
>> [    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=4.
>> [    0.000000] rcu:     RCU debug extended QS entry/exit.
>> [    0.000000]  Tracing variant of Tasks RCU enabled.
>> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
>> [    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
>> [    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
>> [    0.000000] CPU with hartid=0 is not available
>> [    0.000000] riscv-intc: unable to find hart id for /cpus/cpu@0/interrupt-controller
>> [    0.000000] riscv-intc: 64 local interrupts mapped
>> [    0.000000] plic: interrupt-controller@c000000: mapped 186 interrupts with 4 handlers for 9 contexts.
>> [    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
>> [    0.000000] riscv_timer_init_dt: Registering clocksource cpuid [0] hartid [4]
>> [    0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x1d854df40, max_idle_ns: 3526361616960 ns
>> [    0.000003] sched_clock: 64 bits at 1000kHz, resolution 1000ns, wraps every 2199023255500ns
>> [    0.009713] Console: colour dummy device 80x25
>> [    0.014676] printk: console [tty0] enabled
>> [    0.019217] printk: bootconsole [ns16550a0] disabled
>>
>> FWIW this is right about when the clock driver gets loaded
>> as you might imagine.
>>
>> My config is here, but it's been reproduced on a few different
>> defconfigs:
>> https://raw.githubusercontent.com/ConchuOD/polarfire-soc-buildroot-sdk/dev/conf/lowmem/defconfig
>>
>> On 04/07/2022 13:15, Conor Dooley wrote:
>>> Hey all,
>>> I know I have not sat on the RFC I sent about the aux. bus parts
>>> for too long, but figured I'd just send the whole thing anyway to all
>>> lists etc.
>>>
>>> Kinda two things happening in this series, but I sent it together to
>>> ensure the second part would apply correctly.
>>>
>>> The first is the reset controller that I promised after discovering the
>>> issue triggered by CONFIG_PM & the phy not coming up correctly. I have
>>> now removed all the messing with resets from clock enable/disable
>>> functions & now use the aux bus to set up a reset controller driver.
>>> Since I needed something to test it, I hooked up the reset for the
>>> Cadence MACB on PolarFire SoC. This has been split into a second series
>>> for v2:
>>> https://lore.kernel.org/all/20220704114511.1892332-1-conor.dooley@microchip.com/
>>>
>>> The second part adds rate control for the MSS PLL clock, followed by
>>> some simplifications to the driver & conversions of some custom structs
>>> to the corresponding structs in the framework.
>>>
>>> Thanks,
>>> Conor.
>>>
>>> Changes since v1:
>>> - split off the net patches
>>> - clk: actually pass the spinlock to the converted dividers & gates
>>> - reset: added a spinlock around RMW access to registers
>>> - reset: switched to BIT(i) macros
>>> - reset: used local copies of some variables as pointed out by Philipp
>>> - reset: dropped the success printout
>>>
>>> Conor Dooley (12):
>>>   dt-bindings: clk: microchip: mpfs: add reset controller support
>>>   clk: microchip: mpfs: add reset controller
>>>   reset: add polarfire soc reset support
>>>   MAINTAINERS: add polarfire soc reset controller
>>>   riscv: dts: microchip: add mpfs specific macb reset support
>>>   clk: microchip: mpfs: add module_authors entries
>>>   clk: microchip: mpfs: add MSS pll's set & round rate
>>>   clk: microchip: mpfs: move id & offset out of clock structs
>>>   clk: microchip: mpfs: simplify control reg access
>>>   clk: microchip: mpfs: delete 2 line mpfs_clk_register_foo()
>>>   clk: microchip: mpfs: convert cfg_clk to clk_divider
>>>   clk: microchip: mpfs: convert periph_clk to clk_gate
>>>
>>>  .../bindings/clock/microchip,mpfs.yaml        |  17 +-
>>>  MAINTAINERS                                   |   1 +
>>>  arch/riscv/boot/dts/microchip/mpfs.dtsi       |   7 +-
>>>  drivers/clk/microchip/Kconfig                 |   1 +
>>>  drivers/clk/microchip/clk-mpfs.c              | 379 +++++++++---------
>>>  drivers/reset/Kconfig                         |   7 +
>>>  drivers/reset/Makefile                        |   2 +-
>>>  drivers/reset/reset-mpfs.c                    | 157 ++++++++
>>>  include/soc/microchip/mpfs.h                  |   8 +
>>>  9 files changed, 386 insertions(+), 193 deletions(-)
>>>  create mode 100644 drivers/reset/reset-mpfs.c
>>>
>>>
>>> base-commit: b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3
Nathan Chancellor Aug. 10, 2022, 7:32 p.m. UTC | #3
On Wed, Aug 10, 2022 at 07:20:24PM +0000, Conor.Dooley@microchip.com wrote:
> On 10/08/2022 19:56, Nathan Chancellor wrote:
> > Hi Conor,
> > 
> > On Tue, Aug 09, 2022 at 11:05:32PM +0000, Conor.Dooley@microchip.com wrote:
> >> +CC clang people :)
> >>
> >> Got an odd one here and would appreciate some pointers for where to
> >> look. This code when built with gcc boots fine, for example with:
> >> riscv64-unknown-linux-gnu-gcc (g5964b5cd727) 11.1.0
> >> The same code but build with clang build it fails to boot but prior to
> >> that applying this patchset it boots fine. Specifically it is the patch
> >> "clk: microchip: mpfs: move id & offset out of clock structs"
> >>
> >> I applied this patchset on top of tonight's master (15205c2829ca) but
> >> I've been seeing the same problem for a few weeks on -next too. I tried
> >> the following 2 versions of clang/llvm:
> >> ClangBuiltLinux clang version 15.0.0 (5b0788fef86ed7008a11f6ee19b9d86d42b6fcfa), LLD 15.0.0
> >> ClangBuiltLinux clang version 15.0.0 (bab8af8ea062f6332b5c5d13ae688bb8900f244a), LLD 15.0.0
> > 
> > Good to know that it reproduces with fairly recent versions of LLVM :)
> > 
> >> It's probably something silly that I've overlooked but I am not au
> >> fait with these sort of things unfortunately, but hey - at least I'll
> >> learn something then.
> > 
> > I took a quick glance at the patch you mentioned above and I don't
> > immediately see anything as problematic...
> 
> Yeah, I couldn't see any low hanging fruit either.
> 
> > I was going to see if I could
> > reproduce this locally in QEMU since I do see there is a machine
> > 'microchip-icicle-kit' but I am not having much success getting the
> > machine past SBI. Does this reproduce in QEMU or are you working with
> > the real hardware? If QEMU, do you happen to have a working invocation
> > handy?
> 
> Yeah... So there was a QEMU incantation that worked at some point in
> the past (ie when someone wrote the QEMU port) but most peripherals
> are not implemented and current versions of our openSBI implementation
> requires more than one of the unimplemented peripherals. I was trying to
> get it working lately in the evenings based on some patches that were a
> year old but no joy :/

Heh, I guess that would explain why it wasn't working for me :)

> I'm running on the real hardware, I'll give the older combo of qemu
> "bios" etc a go again over the weekend & try to get it working. In the
> meantime, any suggestions?

Are you building with 'LLVM=1' or just 'CC=clang'? If 'LLVM=1', I would
try breaking it apart into the individual options (LD=ld.lld,
OBJCOPY=llvm-objcopy) and see if dropping one of those makes a
difference. We have had subtle differences between the GNU and LLVM
tools before and it is much easier to look into that difference if we
know it happens in only one tool.

Otherwise, I am not sure I have any immediate ideas other than looking
at the disassembly and trying to see if something is going wrong. Is
the object file being modified in any other way (I don't think there is
something like objtool for RISC-V but I could be wrong)?

Cheers,
Nathan

> >> The boot log is fairly short so here ya go:
> >>
> >> [    0.000000] Linux version 5.19.0-13253-g374b508ee318 (conor@spud) (ClangBuiltLinux clang version 15.0.0 (git@g
> >> ithub.com:llvm/llvm-project.git 5b0788fef86ed7008a11f6ee19b9d86d42b6fcfa), LLD 15.0.0) #1 SMP Tue Aug 9 22:42:10 IST 
> >> 2022
> >> [    0.000000] OF: fdt: Ignoring memory range 0x80000000 - 0x80200000
> >> [    0.000000] Machine model: Microchip PolarFire-SoC Icicle Kit
> >> [    0.000000] earlycon: ns16550a0 at MMIO32 0x0000000020100000 (options '115200n8')
> >> [    0.000000] printk: bootconsole [ns16550a0] enabled
> >> [    0.000000] efi: UEFI not found.
> >> [    0.000000] Zone ranges:
> >> [    0.000000]   DMA32    [mem 0x0000000080200000-0x00000000ffffffff]
> >> [    0.000000]   Normal   [mem 0x0000000100000000-0x000000103fffffff]
> >> [    0.000000] Movable zone start for each node
> >> [    0.000000] Early memory node ranges
> >> [    0.000000]   node   0: [mem 0x0000000080200000-0x00000000adffffff]
> >> [    0.000000]   node   0: [mem 0x0000001000000000-0x000000103fffffff]
> >> [    0.000000] Initmem setup node 0 [mem 0x0000000080200000-0x000000103fffffff]
> >> [    0.000000] On node 0, zone Normal: 16064512 pages in unavailable ranges
> >> [    0.000000] SBI specification v0.3 detected
> >> [    0.000000] SBI implementation ID=0x1 Version=0x9
> >> [    0.000000] SBI TIME extension detected
> >> [    0.000000] SBI IPI extension detected
> >> [    0.000000] SBI RFENCE extension detected
> >> [    0.000000] SBI HSM extension detected
> >> [    0.000000] CPU with hartid=0 is not available
> >> [    0.000000] CPU with hartid=0 is not available
> >> [    0.000000] riscv: base ISA extensions acdfim
> >> [    0.000000] riscv: ELF capabilities acdfim
> >> [    0.000000] percpu: Embedded 18 pages/cpu s34168 r8192 d31368 u73728
> >> [    0.000000] pcpu-alloc: s34168 r8192 d31368 u73728 alloc=18*4096
> >> [    0.000000] pcpu-alloc: [0] 0 [0] 1 [0] 2 [0] 3
> >> [    0.000000] CPU node for /cpus/cpu@0 exist but the possible cpu range is :0-3
> >> [    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 224263
> >> [    0.000000] Kernel command line: root=/dev/nfs ip=dhcp debug nfsroot=192.168.2.5:/stuff/nfs_share earlycon
> >> [    0.000000] Dentry cache hash table entries: 131072 (order: 8, 1048576 bytes, linear)
> >> [    0.000000] Inode-cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
> >> [    0.000000] mem auto-init: stack:all(zero), heap alloc:off, heap free:off
> >> [    0.000000] software IO TLB: area num 4.
> >> [    0.000000] software IO TLB: mapped [mem 0x00000000aa000000-0x00000000ae000000] (64MB)
> >> [    0.000000] Virtual kernel memory layout:
> >> [    0.000000]       fixmap : 0xffffffc6fee00000 - 0xffffffc6ff000000   (2048 kB)
> >> [    0.000000]       pci io : 0xffffffc6ff000000 - 0xffffffc700000000   (  16 MB)
> >> [    0.000000]      vmemmap : 0xffffffc700000000 - 0xffffffc800000000   (4096 MB)
> >> [    0.000000]      vmalloc : 0xffffffc800000000 - 0xffffffd800000000   (  64 GB)
> >> [    0.000000]       lowmem : 0xffffffd800000000 - 0xffffffe7bfe00000   (  62 GB)
> >> [    0.000000]       kernel : 0xffffffff80000000 - 0xffffffffffffffff   (2047 MB)
> >> [    0.000000] Memory: 803032K/1800192K available (7485K kernel code, 2858K rwdata, 4096K rodata, 2181K init, 394K bss, 997160K reserved, 0K cma-reserved)
> >> [    0.000000] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
> >> [    0.000000] rcu: Hierarchical RCU implementation.
> >> [    0.000000] rcu:     RCU restricting CPUs from NR_CPUS=8 to nr_cpu_ids=4.
> >> [    0.000000] rcu:     RCU debug extended QS entry/exit.
> >> [    0.000000]  Tracing variant of Tasks RCU enabled.
> >> [    0.000000] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
> >> [    0.000000] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=4
> >> [    0.000000] NR_IRQS: 64, nr_irqs: 64, preallocated irqs: 0
> >> [    0.000000] CPU with hartid=0 is not available
> >> [    0.000000] riscv-intc: unable to find hart id for /cpus/cpu@0/interrupt-controller
> >> [    0.000000] riscv-intc: 64 local interrupts mapped
> >> [    0.000000] plic: interrupt-controller@c000000: mapped 186 interrupts with 4 handlers for 9 contexts.
> >> [    0.000000] rcu: srcu_init: Setting srcu_struct sizes based on contention.
> >> [    0.000000] riscv_timer_init_dt: Registering clocksource cpuid [0] hartid [4]
> >> [    0.000000] clocksource: riscv_clocksource: mask: 0xffffffffffffffff max_cycles: 0x1d854df40, max_idle_ns: 3526361616960 ns
> >> [    0.000003] sched_clock: 64 bits at 1000kHz, resolution 1000ns, wraps every 2199023255500ns
> >> [    0.009713] Console: colour dummy device 80x25
> >> [    0.014676] printk: console [tty0] enabled
> >> [    0.019217] printk: bootconsole [ns16550a0] disabled
> >>
> >> FWIW this is right about when the clock driver gets loaded
> >> as you might imagine.
> >>
> >> My config is here, but it's been reproduced on a few different
> >> defconfigs:
> >> https://raw.githubusercontent.com/ConchuOD/polarfire-soc-buildroot-sdk/dev/conf/lowmem/defconfig
> >>
> >> On 04/07/2022 13:15, Conor Dooley wrote:
> >>> Hey all,
> >>> I know I have not sat on the RFC I sent about the aux. bus parts
> >>> for too long, but figured I'd just send the whole thing anyway to all
> >>> lists etc.
> >>>
> >>> Kinda two things happening in this series, but I sent it together to
> >>> ensure the second part would apply correctly.
> >>>
> >>> The first is the reset controller that I promised after discovering the
> >>> issue triggered by CONFIG_PM & the phy not coming up correctly. I have
> >>> now removed all the messing with resets from clock enable/disable
> >>> functions & now use the aux bus to set up a reset controller driver.
> >>> Since I needed something to test it, I hooked up the reset for the
> >>> Cadence MACB on PolarFire SoC. This has been split into a second series
> >>> for v2:
> >>> https://lore.kernel.org/all/20220704114511.1892332-1-conor.dooley@microchip.com/
> >>>
> >>> The second part adds rate control for the MSS PLL clock, followed by
> >>> some simplifications to the driver & conversions of some custom structs
> >>> to the corresponding structs in the framework.
> >>>
> >>> Thanks,
> >>> Conor.
> >>>
> >>> Changes since v1:
> >>> - split off the net patches
> >>> - clk: actually pass the spinlock to the converted dividers & gates
> >>> - reset: added a spinlock around RMW access to registers
> >>> - reset: switched to BIT(i) macros
> >>> - reset: used local copies of some variables as pointed out by Philipp
> >>> - reset: dropped the success printout
> >>>
> >>> Conor Dooley (12):
> >>>   dt-bindings: clk: microchip: mpfs: add reset controller support
> >>>   clk: microchip: mpfs: add reset controller
> >>>   reset: add polarfire soc reset support
> >>>   MAINTAINERS: add polarfire soc reset controller
> >>>   riscv: dts: microchip: add mpfs specific macb reset support
> >>>   clk: microchip: mpfs: add module_authors entries
> >>>   clk: microchip: mpfs: add MSS pll's set & round rate
> >>>   clk: microchip: mpfs: move id & offset out of clock structs
> >>>   clk: microchip: mpfs: simplify control reg access
> >>>   clk: microchip: mpfs: delete 2 line mpfs_clk_register_foo()
> >>>   clk: microchip: mpfs: convert cfg_clk to clk_divider
> >>>   clk: microchip: mpfs: convert periph_clk to clk_gate
> >>>
> >>>  .../bindings/clock/microchip,mpfs.yaml        |  17 +-
> >>>  MAINTAINERS                                   |   1 +
> >>>  arch/riscv/boot/dts/microchip/mpfs.dtsi       |   7 +-
> >>>  drivers/clk/microchip/Kconfig                 |   1 +
> >>>  drivers/clk/microchip/clk-mpfs.c              | 379 +++++++++---------
> >>>  drivers/reset/Kconfig                         |   7 +
> >>>  drivers/reset/Makefile                        |   2 +-
> >>>  drivers/reset/reset-mpfs.c                    | 157 ++++++++
> >>>  include/soc/microchip/mpfs.h                  |   8 +
> >>>  9 files changed, 386 insertions(+), 193 deletions(-)
> >>>  create mode 100644 drivers/reset/reset-mpfs.c
> >>>
> >>>
> >>> base-commit: b13baccc3850ca8b8cccbf8ed9912dbaa0fdf7f3
Conor Dooley Aug. 11, 2022, 1:13 p.m. UTC | #4
Hey Nathan,

On 10/08/2022 20:43, Conor Dooley - M52691 wrote:
> On 10/08/2022 20:32, Nathan Chancellor wrote:
>> On Wed, Aug 10, 2022 at 07:20:24PM +0000, Conor.Dooley@microchip.com wrote:
>>> On 10/08/2022 19:56, Nathan Chancellor wrote:
>>>> Hi Conor,
>>>>
>>>> On Tue, Aug 09, 2022 at 11:05:32PM +0000, Conor.Dooley@microchip.com wrote:
>>>>> +CC clang people :)
>>>>>
>>>>> Got an odd one here and would appreciate some pointers for where to
>>>>> look. This code when built with gcc boots fine, for example with:
>>>>> riscv64-unknown-linux-gnu-gcc (g5964b5cd727) 11.1.0
>>>>> The same code but build with clang build it fails to boot but prior to
>>>>> that applying this patchset it boots fine. Specifically it is the patch
>>>>> "clk: microchip: mpfs: move id & offset out of clock structs"
>>>>>
>>>>> I applied this patchset on top of tonight's master (15205c2829ca) but
>>>>> I've been seeing the same problem for a few weeks on -next too. I tried
>>>>> the following 2 versions of clang/llvm:
>>>>> ClangBuiltLinux clang version 15.0.0 (5b0788fef86ed7008a11f6ee19b9d86d42b6fcfa), LLD 15.0.0
>>>>> ClangBuiltLinux clang version 15.0.0 (bab8af8ea062f6332b5c5d13ae688bb8900f244a), LLD 15.0.0
>>>>
>>>> Good to know that it reproduces with fairly recent versions of LLVM :)
>>>>
>>>>> It's probably something silly that I've overlooked but I am not au
>>>>> fait with these sort of things unfortunately, but hey - at least I'll
>>>>> learn something then.
>>>>
>>>> I took a quick glance at the patch you mentioned above and I don't
>>>> immediately see anything as problematic...
>>>
>>> Yeah, I couldn't see any low hanging fruit either.
>>>
>>>> I was going to see if I could
>>>> reproduce this locally in QEMU since I do see there is a machine
>>>> 'microchip-icicle-kit' but I am not having much success getting the
>>>> machine past SBI. Does this reproduce in QEMU or are you working with
>>>> the real hardware? If QEMU, do you happen to have a working invocation
>>>> handy?
>>>
>>> Yeah... So there was a QEMU incantation that worked at some point in
>>> the past (ie when someone wrote the QEMU port) but most peripherals
>>> are not implemented and current versions of our openSBI implementation
>>> requires more than one of the unimplemented peripherals. I was trying to
>>> get it working lately in the evenings based on some patches that were a
>>> year old but no joy :/
>>
>> Heh, I guess that would explain why it wasn't working for me :)
>>
>>> I'm running on the real hardware, I'll give the older combo of qemu
>>> "bios" etc a go again over the weekend & try to get it working. In the
>>> meantime, any suggestions?
>>
>> Are you building with 'LLVM=1' or just 'CC=clang'? If 'LLVM=1', I would
>> try breaking it apart into the individual options (LD=ld.lld,
>> OBJCOPY=llvm-objcopy) and see if dropping one of those makes a
>> difference. We have had subtle differences between the GNU and LLVM
>> tools before and it is much easier to look into that difference if we
>> know it happens in only one tool.
> 
> LLVM=1.
> 
>>
>> Otherwise, I am not sure I have any immediate ideas other than looking
>> at the disassembly and trying to see if something is going wrong. Is
>> the object file being modified in any other way (I don't think there is
>> something like objtool for RISC-V but I could be wrong)?
> 
> I'll give the options a go so, I'll LYK how I get on.

So I managed to wrangle QEMU into repro-ing. booting with bootloaders
etc isn't going to work (nor will the config with gcc actually boot
properly) but it gets far enough to reproduce the problem.
You've got to jump right to the kernel for which the magic incantation
is:

$(QEMU)/qemu-system-riscv64 -M microchip-icicle-kit \
	-m 2G -smp 5 \
	-kernel $(wrkdir)/vmlinux.bin \
	-dtb $(wrkdir)/riscvpc.dtb \
	-display none -serial null \
	-serial stdio

(serial0 is disabled in the dt)

With gcc there'll be a bunch of warnings like:
clk_ahb: Zero divisor and CLK_DIVIDER_ALLOW_ZERO not set
That's "fine", not sure if it's the lack of bootloaders or the
emulation but 0 isn't a value the hardware will see. With the defconfig
I provided it'll fail to boot fairly late on because of missing musb
emulation.

Doesn't really matter since thats long enough to get past the switch
out of earlycon which is where the clang built kernel dies.

Didn't get a chance to look at disassembly etc today, but as I said
last night it reproduces with GNU binutils.

Thanks,
Conor.

On another note, brought up our QEMU port's state today so fixing
it is now on the good ole, ever expanding todo list :)
Conor Dooley Aug. 14, 2022, 11:41 a.m. UTC | #5
On 11/08/2022 14:13, Conor Dooley wrote:
> Hey Nathan,
> 
> On 10/08/2022 20:43, Conor Dooley - M52691 wrote:
>> On 10/08/2022 20:32, Nathan Chancellor wrote:
>>> On Wed, Aug 10, 2022 at 07:20:24PM +0000, Conor.Dooley@microchip.com wrote:
>>>> On 10/08/2022 19:56, Nathan Chancellor wrote:
>>>>> Hi Conor,
>>>>>
>>>>> On Tue, Aug 09, 2022 at 11:05:32PM +0000, Conor.Dooley@microchip.com wrote:
>>>>>> +CC clang people :)
>>>>>>
>>>>>> Got an odd one here and would appreciate some pointers for where to
>>>>>> look. This code when built with gcc boots fine, for example with:
>>>>>> riscv64-unknown-linux-gnu-gcc (g5964b5cd727) 11.1.0
>>>>>> The same code but build with clang build it fails to boot but prior to
>>>>>> that applying this patchset it boots fine. Specifically it is the patch
>>>>>> "clk: microchip: mpfs: move id & offset out of clock structs"
>>>>>>
>>>>>> I applied this patchset on top of tonight's master (15205c2829ca) but
>>>>>> I've been seeing the same problem for a few weeks on -next too. I tried
>>>>>> the following 2 versions of clang/llvm:
>>>>>> ClangBuiltLinux clang version 15.0.0 (5b0788fef86ed7008a11f6ee19b9d86d42b6fcfa), LLD 15.0.0
>>>>>> ClangBuiltLinux clang version 15.0.0 (bab8af8ea062f6332b5c5d13ae688bb8900f244a), LLD 15.0.0
>>>>>
>>>>> Good to know that it reproduces with fairly recent versions of LLVM :)
>>>>>
>>>>>> It's probably something silly that I've overlooked but I am not au
>>>>>> fait with these sort of things unfortunately, but hey - at least I'll
>>>>>> learn something then.
>>>>>
>>>>> I took a quick glance at the patch you mentioned above and I don't
>>>>> immediately see anything as problematic...
>>>>
>>>> Yeah, I couldn't see any low hanging fruit either.
>>>>
>>>>> I was going to see if I could
>>>>> reproduce this locally in QEMU since I do see there is a machine
>>>>> 'microchip-icicle-kit' but I am not having much success getting the
>>>>> machine past SBI. Does this reproduce in QEMU or are you working with
>>>>> the real hardware? If QEMU, do you happen to have a working invocation
>>>>> handy?
>>>>
>>>> Yeah... So there was a QEMU incantation that worked at some point in
>>>> the past (ie when someone wrote the QEMU port) but most peripherals
>>>> are not implemented and current versions of our openSBI implementation
>>>> requires more than one of the unimplemented peripherals. I was trying to
>>>> get it working lately in the evenings based on some patches that were a
>>>> year old but no joy :/
>>>
>>> Heh, I guess that would explain why it wasn't working for me :)
>>>
>>>> I'm running on the real hardware, I'll give the older combo of qemu
>>>> "bios" etc a go again over the weekend & try to get it working. In the
>>>> meantime, any suggestions?
>>>
>>> Are you building with 'LLVM=1' or just 'CC=clang'? If 'LLVM=1', I would
>>> try breaking it apart into the individual options (LD=ld.lld,
>>> OBJCOPY=llvm-objcopy) and see if dropping one of those makes a
>>> difference. We have had subtle differences between the GNU and LLVM
>>> tools before and it is much easier to look into that difference if we
>>> know it happens in only one tool.
>>
>> LLVM=1.
>>
>>>
>>> Otherwise, I am not sure I have any immediate ideas other than looking
>>> at the disassembly and trying to see if something is going wrong. Is
>>> the object file being modified in any other way (I don't think there is
>>> something like objtool for RISC-V but I could be wrong)?
>>
>> I'll give the options a go so, I'll LYK how I get on.
> 
> So I managed to wrangle QEMU into repro-ing. booting with bootloaders
> etc isn't going to work (nor will the config with gcc actually boot
> properly) but it gets far enough to reproduce the problem.
> You've got to jump right to the kernel for which the magic incantation
> is:
> 
> $(QEMU)/qemu-system-riscv64 -M microchip-icicle-kit \
>     -m 2G -smp 5 \
>     -kernel $(wrkdir)/vmlinux.bin \
>     -dtb $(wrkdir)/riscvpc.dtb \
>     -display none -serial null \
>     -serial stdio
> 
> (serial0 is disabled in the dt)
> 
> With gcc there'll be a bunch of warnings like:
> clk_ahb: Zero divisor and CLK_DIVIDER_ALLOW_ZERO not set
> That's "fine", not sure if it's the lack of bootloaders or the
> emulation but 0 isn't a value the hardware will see. With the defconfig
> I provided it'll fail to boot fairly late on because of missing musb
> emulation.

FWIW, I posted a QEMU patch to fix the missing peripherals, so a direct
kernel boot works now for GCC:
https://lore.kernel.org/qemu-devel/20220813135127.2971754-1-mail@conchuod.ie

(btw, I am on libera as conchuod in #riscv if you ever wanna ping me about
something, usually still about for "sane" NA working hours too)

> 
> Doesn't really matter since thats long enough to get past the switch
> out of earlycon which is where the clang built kernel dies.
> 
> Didn't get a chance to look at disassembly etc today, but as I said
> last night it reproduces with GNU binutils.
> 
> Thanks,
> Conor.
> 
> On another note, brought up our QEMU port's state today so fixing
> it is now on the good ole, ever expanding todo list :)