[RFC,v1,0/3] support soc_device_match to return -EPROBE_DEFER

Message ID 20210419042722.27554-1-alice.guo@oss.nxp.com

Alice Guo (OSS) April 19, 2021, 4:27 a.m. UTC
From: Alice Guo <alice.guo@nxp.com>

The patch "soc: imx8m: change to use platform driver" changes soc-imx8m.c to use
a module platform driver and the NVMEM APIs to read the ocotp register. The
reason is that reading the ocotp register directly causes the kexec kernel to
hang, because the kernel disables unused clks after it boots up. That patch
makes the SoC driver ready later than before, so devices that depend on
soc_device_match() for initialization are affected, resulting in a kernel boot
error.

The CAAM driver is one of the affected drivers. It uses soc_device_match() to
find the first matching entry of caam_imx_soc_table. If none of them match, the
next instruction is executed without any processing, because the CAAM driver is
used not only on i.MX and LS, but also on PPC and Vybrid. We hope that
soc_device_match() could support returning -EPROBE_DEFER (or some other error
code, e.g. -ENODEV, but not NULL) in the case of "no SoC device registered" to
the SoC bus. We tried it and updated all the code that uses soc_device_match()
throughout the tree.

Alice Guo (3):
  drivers: soc: add support for soc_device_match returning -EPROBE_DEFER
  caam: add defer probe when the caam driver cannot identify SoC
  driver: update all the code that use soc_device_match

 drivers/base/soc.c                            |  5 +++++
 drivers/bus/ti-sysc.c                         |  2 +-
 drivers/clk/renesas/r8a7795-cpg-mssr.c        |  4 +++-
 drivers/clk/renesas/rcar-gen2-cpg.c           |  2 +-
 drivers/clk/renesas/rcar-gen3-cpg.c           |  2 +-
 drivers/crypto/caam/ctrl.c                    |  3 +++
 drivers/dma/fsl-dpaa2-qdma/dpaa2-qdma.c       |  7 ++++++-
 drivers/dma/ti/k3-psil.c                      |  3 +++
 drivers/dma/ti/k3-udma.c                      |  2 +-
 drivers/gpu/drm/bridge/nwl-dsi.c              |  2 +-
 drivers/gpu/drm/meson/meson_drv.c             |  4 +++-
 drivers/gpu/drm/omapdrm/dss/dispc.c           |  2 +-
 drivers/gpu/drm/omapdrm/dss/dpi.c             |  4 +++-
 drivers/gpu/drm/omapdrm/dss/dsi.c             |  3 +++
 drivers/gpu/drm/omapdrm/dss/dss.c             |  3 +++
 drivers/gpu/drm/omapdrm/dss/hdmi4_core.c      |  3 +++
 drivers/gpu/drm/omapdrm/dss/venc.c            |  4 +++-
 drivers/gpu/drm/omapdrm/omap_drv.c            |  3 +++
 drivers/gpu/drm/rcar-du/rcar_du_crtc.c        |  4 +++-
 drivers/gpu/drm/rcar-du/rcar_lvds.c           |  2 +-
 drivers/gpu/drm/tidss/tidss_dispc.c           |  4 +++-
 drivers/iommu/ipmmu-vmsa.c                    |  7 +++++--
 drivers/media/platform/rcar-vin/rcar-core.c   |  2 +-
 drivers/media/platform/rcar-vin/rcar-csi2.c   |  2 +-
 drivers/media/platform/vsp1/vsp1_uif.c        |  4 +++-
 drivers/mmc/host/renesas_sdhi_core.c          |  2 +-
 drivers/mmc/host/renesas_sdhi_internal_dmac.c |  2 +-
 drivers/mmc/host/sdhci-of-esdhc.c             | 21 ++++++++++++++-----
 drivers/mmc/host/sdhci-omap.c                 |  2 +-
 drivers/mmc/host/sdhci_am654.c                |  2 +-
 drivers/net/ethernet/renesas/ravb_main.c      |  4 +++-
 drivers/net/ethernet/ti/am65-cpsw-nuss.c      |  2 +-
 drivers/net/ethernet/ti/cpsw.c                |  2 +-
 drivers/net/ethernet/ti/cpsw_new.c            |  2 +-
 drivers/phy/ti/phy-omap-usb2.c                |  4 +++-
 drivers/pinctrl/renesas/core.c                |  2 +-
 drivers/pinctrl/renesas/pfc-r8a7790.c         |  5 ++++-
 drivers/pinctrl/renesas/pfc-r8a7794.c         |  5 ++++-
 drivers/soc/fsl/dpio/dpio-driver.c            | 13 ++++++++----
 drivers/soc/renesas/r8a774c0-sysc.c           |  5 ++++-
 drivers/soc/renesas/r8a7795-sysc.c            |  2 +-
 drivers/soc/renesas/r8a77990-sysc.c           |  5 ++++-
 drivers/soc/ti/k3-ringacc.c                   |  2 +-
 drivers/staging/mt7621-pci/pci-mt7621.c       |  2 +-
 drivers/thermal/rcar_gen3_thermal.c           |  4 +++-
 drivers/thermal/ti-soc-thermal/ti-bandgap.c   | 10 +++++++--
 drivers/usb/gadget/udc/renesas_usb3.c         |  2 +-
 drivers/usb/host/ehci-platform.c              |  4 +++-
 drivers/usb/host/xhci-rcar.c                  |  2 +-
 drivers/watchdog/renesas_wdt.c                |  2 +-
 50 files changed, 139 insertions(+), 52 deletions(-)

Comments

Dominique MARTINET April 19, 2021, 4:49 a.m. UTC | #1
First comment overall for the whole series:
Since it is the solution I had suggested when I reported the problem[1]
I have no qualms about the approach, comments for individual patches
follow.

[1] http://lore.kernel.org/r/YGGZJjAxA1IO+/VU@atmark-techno.com


Alice Guo (OSS) wrote on Mon, Apr 19, 2021 at 12:27:20PM +0800:
> From: Alice Guo <alice.guo@nxp.com>
> 
> In i.MX8M boards, the registration of SoC device is later than caam
> driver which needs it. Caam driver needs soc_device_match to provide
> -EPROBE_DEFER when no SoC device is registered and no
> early_soc_dev_attr.

This patch should be last in the set: you can't have soc_device_match
return an error before its callers handle it.

> Signed-off-by: Alice Guo <alice.guo@nxp.com>

As the one who reported the problem I would have appreciated being
at least added to Cc... I only happened to notice you posted this by
chance.

There is also not a single Fixes tag -- I believe this commit should
have Fixes: 7d981405d0fd ("soc: imx8m: change to use platform driver")
but I'm not sure how such tags should be handled in case of multiple
patches fixing something.

Dominique MARTINET April 19, 2021, 5:02 a.m. UTC | #2
Alice Guo (OSS) wrote on Mon, Apr 19, 2021 at 12:27:22PM +0800:
> From: Alice Guo <alice.guo@nxp.com>
> 
> Update all the code that use soc_device_match

A single patch might be difficult to accept for all components, as each
maintainer will probably want to have a say on their subsystem.

I would suggest splitting these for a non-RFC version, as this will really
need to be case-by-case handling.

> because add support for soc_device_match returning -EPROBE_DEFER.

(English does not parse here for me)

I've only commented on a couple of places in the code itself, but this
doesn't seem to add much support for errors, it just sweeps the problem
under the rug.

> Signed-off-by: Alice Guo <alice.guo@nxp.com>
> ---
> 
> diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
> index 5fae60f8c135..00c59aa217c1 100644
> --- a/drivers/bus/ti-sysc.c
> +++ b/drivers/bus/ti-sysc.c
> @@ -2909,7 +2909,7 @@ static int sysc_init_soc(struct sysc *ddata)
>  	}
>  
>  	match = soc_device_match(sysc_soc_feat_match);
> -	if (!match)
> +	if (!match || IS_ERR(match))
>  		return 0;

This function handles errors, I would recommend returning the error as
is if soc_device_match returned one so the probe can be retried later.
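
A minimal sketch of what forwarding the error could look like, written as a
standalone userspace model rather than the actual ti-sysc code. ERR_PTR(),
PTR_ERR() and IS_ERR() are re-implemented here only so the example builds
outside the kernel; soc_device_match_stub(), init_soc_sketch() and the
stub_state switch are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-ins for the kernel's error-pointer helpers. */
#define EPROBE_DEFER 517	/* kernel-internal errno value */
#define MAX_ERRNO 4095

static void *ERR_PTR(intptr_t error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)error;
}

static int PTR_ERR(const void *ptr)
{
	return (int)(intptr_t)ptr;
}

static int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct soc_device_attribute {
	const char *soc_id;
	const void *data;
};

/* Hypothetical stub: lets the caller choose what the lookup returns. */
enum stub_state { STUB_NOT_READY, STUB_NO_MATCH, STUB_MATCH };
static enum stub_state stub_state;
static const struct soc_device_attribute stub_entry = { "example-soc", NULL };

static const struct soc_device_attribute *soc_device_match_stub(void)
{
	switch (stub_state) {
	case STUB_NOT_READY:
		return ERR_PTR(-EPROBE_DEFER);	/* no SoC device registered yet */
	case STUB_NO_MATCH:
		return NULL;			/* registered, but no table entry */
	default:
		return &stub_entry;
	}
}

static int quirk_applied;

/* Three outcomes: propagate the error, skip quirks, or apply quirks. */
static int init_soc_sketch(void)
{
	const struct soc_device_attribute *match = soc_device_match_stub();

	quirk_applied = 0;
	if (IS_ERR(match))
		return PTR_ERR(match);	/* caller can retry the probe later */
	if (!match)
		return 0;		/* genuinely no quirk for this SoC */

	quirk_applied = 1;
	return 0;
}
```

The point of keeping the two checks separate is that "not ready yet" and
"no match" stop being folded into the same return 0.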

>  
>  	if (match->data)
> diff --git a/drivers/clk/renesas/r8a7795-cpg-mssr.c b/drivers/clk/renesas/r8a7795-cpg-mssr.c
> index c32d2c678046..90a18336a4c3 100644
> --- a/drivers/clk/renesas/r8a7795-cpg-mssr.c
> +++ b/drivers/clk/renesas/r8a7795-cpg-mssr.c
> @@ -439,6 +439,7 @@ static const unsigned int r8a7795es2_mod_nullify[] __initconst = {
>  
>  static int __init r8a7795_cpg_mssr_init(struct device *dev)
>  {
> +	const struct soc_device_attribute *match;
>  	const struct rcar_gen3_cpg_pll_config *cpg_pll_config;
>  	u32 cpg_mode;
>  	int error;
> @@ -453,7 +454,8 @@ static int __init r8a7795_cpg_mssr_init(struct device *dev)
>  		return -EINVAL;
>  	}
>  
> -	if (soc_device_match(r8a7795es1)) {
> +	match = soc_device_match(r8a7795es1);
> +	if (!IS_ERR(match) && match) {

Same, return the error.
Assuming an error means no match will just lead to hard to debug
problems because the driver potentially assumed the wrong device when
it's just not ready yet.

>  		cpg_core_nullify_range(r8a7795_core_clks,
>  				       ARRAY_SIZE(r8a7795_core_clks),
>  				       R8A7795_CLK_S0D2, R8A7795_CLK_S0D12);
> [...]
> diff --git a/drivers/iommu/ipmmu-vmsa.c b/drivers/iommu/ipmmu-vmsa.c
> index eaaec0a55cc6..13a06b613379 100644
> --- a/drivers/iommu/ipmmu-vmsa.c
> +++ b/drivers/iommu/ipmmu-vmsa.c
> @@ -757,17 +757,20 @@ static const char * const devices_allowlist[] = {
>  
>  static bool ipmmu_device_is_allowed(struct device *dev)
>  {
> +	const struct soc_device_attribute *match1, *match2;
>  	unsigned int i;
>  
>  	/*
>  	 * R-Car Gen3 and RZ/G2 use the allow list to opt-in devices.
>  	 * For Other SoCs, this returns true anyway.
>  	 */
> -	if (!soc_device_match(soc_needs_opt_in))
> +	match1 = soc_device_match(soc_needs_opt_in);
> +	if (!IS_ERR(match1) && !match1)

I'm not sure what you intended to do, but !match1 already means there is
no error so the original code is identical.

In this case ipmmu_device_is_allowed does not allow errors so this is
one of the "difficult" drivers that require slightly more thinking.
It is only called in ipmmu_of_xlate which does return errors properly,
so in this case the most straightforward approach would be to make
ipmmu_device_is_allowed return an int and forward errors as well.
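
A rough userspace sketch of that shape, not the real ipmmu-vmsa code: the
helper returns 1 (allowed), 0 (not allowed) or a negative errno, and the
caller forwards the negative value. The error-pointer helpers are
re-implemented so the example builds outside the kernel; opt_in_result,
allowlist_sketch, device_is_allowed_sketch() and of_xlate_sketch() are
hypothetical names, and returning -ENODEV for a disallowed device is an
assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifndef EPROBE_DEFER
#define EPROBE_DEFER 517	/* kernel-internal errno value */
#endif
#define MAX_ERRNO 4095

static void *ERR_PTR(intptr_t error)
{
	assert(error < 0 && error >= -MAX_ERRNO);
	return (void *)error;
}

static int PTR_ERR(const void *ptr)
{
	return (int)(intptr_t)ptr;
}

static int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

struct soc_device_attribute {
	const char *soc_id;
};

/* Hypothetical stand-in for the result of soc_device_match(soc_needs_opt_in). */
static const struct soc_device_attribute *opt_in_result;
static const struct soc_device_attribute opt_in_soc_sketch = { "opt-in-soc" };

/* Hypothetical allow list; the real one names actual devices. */
static const char * const allowlist_sketch[] = { "dev-a", "dev-b" };

/* Returns 1 if allowed, 0 if not, or a negative errno if the SoC is unknown. */
static int device_is_allowed_sketch(const char *name)
{
	const struct soc_device_attribute *match = opt_in_result;
	size_t i;

	if (IS_ERR(match))
		return PTR_ERR(match);
	if (!match)
		return 1;	/* SoC does not need opt-in: everything allowed */

	for (i = 0; i < sizeof(allowlist_sketch) / sizeof(allowlist_sketch[0]); i++)
		if (!strcmp(name, allowlist_sketch[i]))
			return 1;
	return 0;
}

/* Caller that already returns an int, so it can pass the error along. */
static int of_xlate_sketch(const char *name)
{
	int allowed = device_is_allowed_sketch(name);

	if (allowed < 0)
		return allowed;
	return allowed ? 0 : -ENODEV;
}
```

Since IS_ERR(NULL) is false, a NULL result still takes the "no opt-in
needed" branch exactly as before; only the error-pointer case is new.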



...
This is going to need quite some more work to be acceptable, in my
opinion, but I think it should be possible.

Thanks,

Alice Guo (OSS) April 19, 2021, 6:46 a.m. UTC | #3
> -----Original Message-----
> From: Leon Romanovsky <leon@kernel.org>
> Sent: April 19, 2021 13:02
> To: Alice Guo (OSS) <alice.guo@oss.nxp.com>
> Cc: gregkh@linuxfoundation.org; rafael@kernel.org; Horia Geanta
> <horia.geanta@nxp.com>; Aymen Sghaier <aymen.sghaier@nxp.com>;
> herbert@gondor.apana.org.au; davem@davemloft.net; tony@atomide.com;
> geert+renesas@glider.be; mturquette@baylibre.com; sboyd@kernel.org;
> vkoul@kernel.org; peter.ujfalusi@gmail.com; a.hajda@samsung.com;
> narmstrong@baylibre.com; robert.foss@linaro.org; airlied@linux.ie;
> daniel@ffwll.ch; khilman@baylibre.com; tomba@kernel.org; jyri.sarha@iki.fi;
> joro@8bytes.org; will@kernel.org; mchehab@kernel.org;
> ulf.hansson@linaro.org; adrian.hunter@intel.com; kishon@ti.com;
> kuba@kernel.org; linus.walleij@linaro.org; Roy Pledge <roy.pledge@nxp.com>;
> Leo Li <leoyang.li@nxp.com>; ssantosh@kernel.org; matthias.bgg@gmail.com;
> edubezval@gmail.com; j-keerthy@ti.com; balbi@kernel.org;
> linux@prisktech.co.nz; stern@rowland.harvard.edu; wim@linux-watchdog.org;
> linux@roeck-us.net; linux-kernel@vger.kernel.org; linux-crypto@vger.kernel.org;
> linux-omap@vger.kernel.org; linux-renesas-soc@vger.kernel.org;
> linux-clk@vger.kernel.org; dmaengine@vger.kernel.org;
> dri-devel@lists.freedesktop.org; linux-amlogic@lists.infradead.org;
> linux-arm-kernel@lists.infradead.org; iommu@lists.linux-foundation.org;
> linux-media@vger.kernel.org; linux-mmc@vger.kernel.org;
> netdev@vger.kernel.org; linux-phy@lists.infradead.org;
> linux-gpio@vger.kernel.org; linuxppc-dev@lists.ozlabs.org;
> linux-staging@lists.linux.dev; linux-mediatek@lists.infradead.org;
> linux-pm@vger.kernel.org; linux-usb@vger.kernel.org;
> linux-watchdog@vger.kernel.org
> Subject: Re: [RFC v1 PATCH 3/3] driver: update all the code that use
> soc_device_match
>
> On Mon, Apr 19, 2021 at 12:27:22PM +0800, Alice Guo (OSS) wrote:
> > From: Alice Guo <alice.guo@nxp.com>
> >
> > Update all the code that use soc_device_match because add support for
> > soc_device_match returning -EPROBE_DEFER.
> >
> > Signed-off-by: Alice Guo <alice.guo@nxp.com>
> > ---
> >  drivers/bus/ti-sysc.c                         |  2 +-
> >  drivers/clk/renesas/r8a7795-cpg-mssr.c        |  4 +++-
> >  drivers/clk/renesas/rcar-gen2-cpg.c           |  2 +-
> >  drivers/clk/renesas/rcar-gen3-cpg.c           |  2 +-
> >  drivers/dma/fsl-dpaa2-qdma/dpaa2-qdma.c       |  7 ++++++-
> >  drivers/dma/ti/k3-psil.c                      |  3 +++
> >  drivers/dma/ti/k3-udma.c                      |  2 +-
> >  drivers/gpu/drm/bridge/nwl-dsi.c              |  2 +-
> >  drivers/gpu/drm/meson/meson_drv.c             |  4 +++-
> >  drivers/gpu/drm/omapdrm/dss/dispc.c           |  2 +-
> >  drivers/gpu/drm/omapdrm/dss/dpi.c             |  4 +++-
> >  drivers/gpu/drm/omapdrm/dss/dsi.c             |  3 +++
> >  drivers/gpu/drm/omapdrm/dss/dss.c             |  3 +++
> >  drivers/gpu/drm/omapdrm/dss/hdmi4_core.c      |  3 +++
> >  drivers/gpu/drm/omapdrm/dss/venc.c            |  4 +++-
> >  drivers/gpu/drm/omapdrm/omap_drv.c            |  3 +++
> >  drivers/gpu/drm/rcar-du/rcar_du_crtc.c        |  4 +++-
> >  drivers/gpu/drm/rcar-du/rcar_lvds.c           |  2 +-
> >  drivers/gpu/drm/tidss/tidss_dispc.c           |  4 +++-
> >  drivers/iommu/ipmmu-vmsa.c                    |  7 +++++--
> >  drivers/media/platform/rcar-vin/rcar-core.c   |  2 +-
> >  drivers/media/platform/rcar-vin/rcar-csi2.c   |  2 +-
> >  drivers/media/platform/vsp1/vsp1_uif.c        |  4 +++-
> >  drivers/mmc/host/renesas_sdhi_core.c          |  2 +-
> >  drivers/mmc/host/renesas_sdhi_internal_dmac.c |  2 +-
> >  drivers/mmc/host/sdhci-of-esdhc.c             | 21 ++++++++++++++-----
> >  drivers/mmc/host/sdhci-omap.c                 |  2 +-
> >  drivers/mmc/host/sdhci_am654.c                |  2 +-
> >  drivers/net/ethernet/renesas/ravb_main.c      |  4 +++-
> >  drivers/net/ethernet/ti/am65-cpsw-nuss.c      |  2 +-
> >  drivers/net/ethernet/ti/cpsw.c                |  2 +-
> >  drivers/net/ethernet/ti/cpsw_new.c            |  2 +-
> >  drivers/phy/ti/phy-omap-usb2.c                |  4 +++-
> >  drivers/pinctrl/renesas/core.c                |  2 +-
> >  drivers/pinctrl/renesas/pfc-r8a7790.c         |  5 ++++-
> >  drivers/pinctrl/renesas/pfc-r8a7794.c         |  5 ++++-
> >  drivers/soc/fsl/dpio/dpio-driver.c            | 13 ++++++++----
> >  drivers/soc/renesas/r8a774c0-sysc.c           |  5 ++++-
> >  drivers/soc/renesas/r8a7795-sysc.c            |  2 +-
> >  drivers/soc/renesas/r8a77990-sysc.c           |  5 ++++-
> >  drivers/soc/ti/k3-ringacc.c                   |  2 +-
> >  drivers/staging/mt7621-pci/pci-mt7621.c       |  2 +-
> >  drivers/thermal/rcar_gen3_thermal.c           |  4 +++-
> >  drivers/thermal/ti-soc-thermal/ti-bandgap.c   | 10 +++++++--
> >  drivers/usb/gadget/udc/renesas_usb3.c         |  2 +-
> >  drivers/usb/host/ehci-platform.c              |  4 +++-
> >  drivers/usb/host/xhci-rcar.c                  |  2 +-
> >  drivers/watchdog/renesas_wdt.c                |  2 +-
> >  48 files changed, 131 insertions(+), 52 deletions(-)
> >
> > diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c
> > index 5fae60f8c135..00c59aa217c1 100644
> > --- a/drivers/bus/ti-sysc.c
> > +++ b/drivers/bus/ti-sysc.c
> > @@ -2909,7 +2909,7 @@ static int sysc_init_soc(struct sysc *ddata)
> >  	}
> >
> >  	match = soc_device_match(sysc_soc_feat_match);
> > -	if (!match)
> > +	if (!match || IS_ERR(match))
> >  		return 0;
> >
> >  	if (match->data)
> > diff --git a/drivers/clk/renesas/r8a7795-cpg-mssr.c b/drivers/clk/renesas/r8a7795-cpg-mssr.c
> > index c32d2c678046..90a18336a4c3 100644
> > --- a/drivers/clk/renesas/r8a7795-cpg-mssr.c
> > +++ b/drivers/clk/renesas/r8a7795-cpg-mssr.c
> > @@ -439,6 +439,7 @@ static const unsigned int r8a7795es2_mod_nullify[] __initconst = {
> >
> >  static int __init r8a7795_cpg_mssr_init(struct device *dev)
> >  {
> > +	const struct soc_device_attribute *match;
> >  	const struct rcar_gen3_cpg_pll_config *cpg_pll_config;
> >  	u32 cpg_mode;
> >  	int error;
> > @@ -453,7 +454,8 @@ static int __init r8a7795_cpg_mssr_init(struct device *dev)
> >  		return -EINVAL;
> >  	}
> >
> > -	if (soc_device_match(r8a7795es1)) {
> > +	match = soc_device_match(r8a7795es1);
> > +	if (!IS_ERR(match) && match) {
>
> "if (!IS_ERR_OR_NULL(match))" in all places.


Thank you for your suggestion. I will modify them.

Best Regards,
Alice Guo

> Thanks

Dominique MARTINET April 19, 2021, 9:33 a.m. UTC | #4
Geert Uytterhoeven wrote on Mon, Apr 19, 2021 at 11:03:24AM +0200:
> > This is going to need quite some more work to be acceptable, in my
> > opinion, but I think it should be possible.
> 
> In general, this is very hard to do, IMHO. Some drivers may be used on
> multiple platforms, some of them registering an SoC device, some of
> them not registering an SoC device.  So there is no way to know the
> difference between "SoC device not registered, intentionally", and
> "SoC device not yet registered".

Hm, good point, I was probably a bit too optimistic if there are devices
which don't register any soc yet have drivers which want one; I don't
see how to tell the difference indeed... And that does mean we can't
just defer all the time.

> soc_device_match() should only be used as a last resort, to identify
> systems that cannot be identified otherwise.  Typically this is used for
> quirks, which should only be enabled on a very specific subset of
> systems.  IMHO such systems should make sure soc_device_match()
> is available early, by registering their SoC device early.

I definitely agree there, my suggestion to defer was only because I know
of no other way to influence the ordering of drivers loading reliably
and gave up on soc being init'd early.

In this particular case the problem is that since 7d981405d0fd ("soc:
imx8m: change to use platform driver") the soc probe tries to use the
nvmem driver for ocotp fuses for imx8m devices, which isn't ready yet.
So soc loading gets pushed back to the end of the list because it gets
deferred, and other drivers relying on soc_device_match get confused
because they wrongly think a device doesn't match a quirk when it
actually does.

If there is a way to ensure the nvmem driver gets loaded before the soc,
that would also solve the problem nicely, and avoid the need to mess
with all the ~50 drivers which use it.


Is there a way to control in what order drivers get loaded? Something in
the dtb perhaps?


Thanks,

Guenter Roeck April 19, 2021, 1:36 p.m. UTC | #5
On 4/18/21 9:27 PM, Alice Guo (OSS) wrote:
> From: Alice Guo <alice.guo@nxp.com>
> 
> Update all the code that use soc_device_match because add support for
> soc_device_match returning -EPROBE_DEFER.
> 
> Signed-off-by: Alice Guo <alice.guo@nxp.com>
> ---
[ ... ]
>  drivers/watchdog/renesas_wdt.c                |  2 +-
>  48 files changed, 131 insertions(+), 52 deletions(-)
> 
[ ... ]
> diff --git a/drivers/watchdog/renesas_wdt.c b/drivers/watchdog/renesas_wdt.c
> index 5791198960e6..fdc534dc4024 100644
> --- a/drivers/watchdog/renesas_wdt.c
> +++ b/drivers/watchdog/renesas_wdt.c
> @@ -197,7 +197,7 @@ static bool rwdt_blacklisted(struct device *dev)
>  	const struct soc_device_attribute *attr;
>  
>  	attr = soc_device_match(rwdt_quirks_match);
> -	if (attr && setup_max_cpus > (uintptr_t)attr->data) {
> +	if (!IS_ERR(attr) && attr && setup_max_cpus > (uintptr_t)attr->data) {

This is wrong. We cannot make the decision below without having access
to attr. The function may wrongly return false if soc_device_match()
returns an error.

Guenter

>  		dev_info(dev, "Watchdog blacklisted on %s %s\n", attr->soc_id,
>  			 attr->revision);
>  		return true;
>

Dominique MARTINET April 19, 2021, 11:42 p.m. UTC | #6
Arnd Bergmann wrote on Mon, Apr 19, 2021 at 02:16:36PM +0200:
> In some cases, you can use the device_link infrastructure to deal
> with dependencies between devices. Not sure if this would help
> in your case, but have a look at device_link_add() etc in drivers/base/core.c

I'll need to actually try to convince myself but if creating the link
forces driver registration then it should be workable.

> > In this particular case the problem is that since 7d981405d0fd ("soc:
> > imx8m: change to use platform driver") the soc probe tries to use the
> > nvmem driver for ocotp fuses for imx8m devices, which isn't ready yet.
> > So soc loading gets pushed back to the end of the list because it gets
> > defered and other drivers relying on soc_device_match get confused
> > because they wrongly think a device doesn't match a quirk when it
> > actually does.
> >
> > If there is a way to ensure the nvmem driver gets loaded before the soc,
> > that would also solve the problem nicely, and avoid the need to mess
> > with all the ~50 drivers which use it.
> >
> > Is there a way to control in what order drivers get loaded? Something in
> > the dtb perhaps?
> 
> For built-in drivers, load order depends on the initcall level and
> link order (how things are listed in the Makefile hierarchy).
> 
> For loadable modules, this is up to user space in the end.
> 
> Which of the drivers in this scenario are loadable modules?

All the drivers involved in my case are built-in (nvmem, soc and final
soc_device_match consumer e.g. caam_jr that crashes the kernel if soc is
not identified properly).

I frankly don't like the idea of moving nvmem/ above soc/ in
drivers/Makefile as a "solution" to this (especially as there is one
that seems to care about what soc they run on...), so I'll have a look
at links first, hopefully that will work out.


Thanks,

Arnd Bergmann April 20, 2021, 9:10 a.m. UTC | #7
On Tue, Apr 20, 2021 at 1:44 AM Dominique MARTINET
<dominique.martinet@atmark-techno.com> wrote:
> Arnd Bergmann wrote on Mon, Apr 19, 2021 at 02:16:36PM +0200:
> > For built-in drivers, load order depends on the initcall level and
> > link order (how things are listed in the Makefile hierarchy).
> >
> > For loadable modules, this is up to user space in the end.
> >
> > Which of the drivers in this scenario are loadable modules?
>
> All the drivers involved in my case are built-in (nvmem, soc and final
> soc_device_match consumer e.g. caam_jr that crashes the kernel if soc is
> not identified properly).

Ok, in that case you may have a chance to just adapt the initcall
levels. This is somewhat fragile if someone else already relies
on a particular order, but it's an easy one-line change to change
a driver e.g. from module_init() or device_initcall() to arch_initcall().
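
As a toy userspace model of that ordering rule (not kernel code): built-in
initcalls run by level first and by link order within a level, so moving one
entry to an earlier level makes it run ahead of everything at device level.
The "provider" and "consumer" entries used to exercise it are hypothetical
placeholders, not a statement about which driver should be moved; the struct
and function names are made up for this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Numeric levels mirror the kernel's initcall levels (core = 1 ... late = 7). */
enum initcall_level {
	LEVEL_CORE = 1,
	LEVEL_POSTCORE = 2,
	LEVEL_ARCH = 3,		/* arch_initcall() */
	LEVEL_SUBSYS = 4,
	LEVEL_FS = 5,
	LEVEL_DEVICE = 6,	/* device_initcall(), built-in module_init() */
	LEVEL_LATE = 7
};

struct initcall_sketch {
	const char *name;
	enum initcall_level level;
	int link_order;		/* lower value is linked, and so run, first */
};

/* Order by level, then by link order inside the same level. */
static int initcall_cmp(const void *a, const void *b)
{
	const struct initcall_sketch *x = a, *y = b;

	if (x->level != y->level)
		return (x->level > y->level) - (x->level < y->level);
	return (x->link_order > y->link_order) - (x->link_order < y->link_order);
}

static void order_initcalls(struct initcall_sketch *calls, size_t n)
{
	assert(calls != NULL || n == 0);
	qsort(calls, n, sizeof(*calls), initcall_cmp);
}

/* Returns 1 if "first" appears before "second" in the ordered list. */
static int runs_before(const struct initcall_sketch *calls, size_t n,
		       const char *first, const char *second)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (!strcmp(calls[i].name, first))
			return 1;
		if (!strcmp(calls[i].name, second))
			return 0;
	}
	return 0;
}
```

With both entries at LEVEL_DEVICE the result depends only on link_order;
changing one entry's level is the "one-line change" being modelled.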

> I frankly don't like the idea of moving nvmem/ above soc/ in
> drivers/Makefile as a "solution" to this (especially as there is one
> that seems to care about what soc they run on...), so I'll have a look
> at links first, hopefully that will work out.

Right, that would be way more fragile.

I think the main problem in this case is the caam driver that really
should not look into the particular SoC type or even machine
compatible string. This is something we can do as a last resort
for compatibility with busted devicetree files, but it appears that
this driver does it as the primary method for identifying different
hardware revisions. I would suggest fixing the binding so that
each SoC that includes one of these devices has a soc specific
compatible string associated with the device that the driver can
use as the primary way of identifying the device.

We probably need to keep the old logic around for old dtb files,
but there can at least be a comment next to that table that
discourages people from adding more entries there.

      Arnd

Dan Carpenter April 20, 2021, 11:21 a.m. UTC | #8
On Mon, Apr 19, 2021 at 10:20:13AM +0200, Geert Uytterhoeven wrote:
> Hi Alice,
> 
> CC Arnd (soc_device_match() author)
> 
> On Mon, Apr 19, 2021 at 6:28 AM Alice Guo (OSS) <alice.guo@oss.nxp.com> wrote:
> > From: Alice Guo <alice.guo@nxp.com>
> >
> > In i.MX8M boards, the registration of SoC device is later than caam
> > driver which needs it. Caam driver needs soc_device_match to provide
> > -EPROBE_DEFER when no SoC device is registered and no
> > early_soc_dev_attr.
> 
> I'm wondering if this is really a good idea: soc_device_match() is a
> last-resort low-level check, and IMHO should be made available early on,
> so there is no need for -EPROBE_DEFER.
> 
> >
> > Signed-off-by: Alice Guo <alice.guo@nxp.com>
> 
> Thanks for your patch!
> 
> > --- a/drivers/base/soc.c
> > +++ b/drivers/base/soc.c
> > @@ -110,6 +110,7 @@ static void soc_release(struct device *dev)
> >  }
> >
> >  static struct soc_device_attribute *early_soc_dev_attr;
> > +static bool soc_dev_attr_init_done = false;
> 
> Do you need this variable?
> 
> >
> >  struct soc_device *soc_device_register(struct soc_device_attribute *soc_dev_attr)
> >  {
> > @@ -157,6 +158,7 @@ struct soc_device *soc_device_register(struct soc_device_attribute *soc_dev_attr
> >                 return ERR_PTR(ret);
> >         }
> >
> > +       soc_dev_attr_init_done = true;
> >         return soc_dev;
> >
> >  out3:
> > @@ -246,6 +248,9 @@ const struct soc_device_attribute *soc_device_match(
> >         if (!matches)
> >                 return NULL;
> >
> > +       if (!soc_dev_attr_init_done && !early_soc_dev_attr)
> 
> if (!soc_bus_type.p && !early_soc_dev_attr)

There is one place checking this already.  We could wrap it in a helper
function:

static bool device_init_done(void)
{
	return soc_bus_type.p ? true : false;
}

regards,
dan carpenter