Message ID | 20230127001141.407071-1-saravanak@google.com |
---|---|
Series | fw_devlink improvements |
On Thu, Jan 26, 2023 at 04:11:28PM -0800, Saravana Kannan wrote:
> When a device X is bound successfully to a driver, if it has a child
> firmware node Y that doesn't have a struct device created by then, we
> delete fwnode links where the child firmware node Y is the supplier. We
> did this to avoid blocking the consumers of the child firmware node Y
> from deferring probe indefinitely.
>
> While that a step in the right direction, it's better to make the
> consumers of the child firmware node Y to be consumers of the device X
> because device X is probably implementing whatever functionality is
> represented by child firmware node Y. By doing this, we capture the
> device dependencies more accurately and ensure better
> probe/suspend/resume ordering.

...

>  static unsigned int defer_sync_state_count = 1;
>  static DEFINE_MUTEX(fwnode_link_lock);
>  static bool fw_devlink_is_permissive(void);
> +static void __fw_devlink_link_to_consumers(struct device *dev);
>  static bool fw_devlink_drv_reg_done;
>  static bool fw_devlink_best_effort;

I'm wondering if may avoid adding more forward declarations...

Perhaps it's a sign that devlink code should be split to its own
module?

...

> -int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> +static int __fwnode_link_add(struct fwnode_handle *con,
> +                             struct fwnode_handle *sup)

I believe we tolerate a bit longer lines, so you may still have it on a single
line.

...

> +int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup)
> +{

> +        int ret = 0;

Redundant assignment.

> +        mutex_lock(&fwnode_link_lock);
> +        ret = __fwnode_link_add(con, sup);
> +        mutex_unlock(&fwnode_link_lock);
>          return ret;
>  }

...

>          if (dev->fwnode && dev->fwnode->dev == dev) {

You may have above something like

        fwnode = dev_fwnode(dev);
        if (fwnode && fwnode->dev == dev) {

>                  struct fwnode_handle *child;
>                  fwnode_links_purge_suppliers(dev->fwnode);
> +                mutex_lock(&fwnode_link_lock);
>                  fwnode_for_each_available_child_node(dev->fwnode, child)
> -                        fw_devlink_purge_absent_suppliers(child);
> +                        __fw_devlink_pickup_dangling_consumers(child,
> +                                                               dev->fwnode);

                        __fw_devlink_pickup_dangling_consumers(child, fwnode);

> +                __fw_devlink_link_to_consumers(dev);
> +                mutex_unlock(&fwnode_link_lock);
>          }
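The locked-wrapper shape discussed in this review (a public function that takes the lock and defers to a helper that expects the lock to be held) can be sketched in userspace as below. This is only an illustration under stated assumptions: struct node, link_lock, link_add_locked() and link_add() are hypothetical stand-ins, and a pthread mutex replaces the kernel mutex. It also shows the point about the redundant assignment: ret needs no initializer because it is always assigned before use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a firmware node; not the kernel structure. */
struct node {
    int nr_suppliers;
};

/* Hypothetical lock protecting all link bookkeeping. */
static pthread_mutex_t link_lock = PTHREAD_MUTEX_INITIALIZER;

/* Helper that does the real work. The caller must hold link_lock. */
static int link_add_locked(struct node *con, struct node *sup)
{
    if (con == NULL || sup == NULL)
        return -1;
    con->nr_suppliers++;
    return 0;
}

/* Public entry point: take the lock, call the helper, drop the lock. */
int link_add(struct node *con, struct node *sup)
{
    int ret; /* no initializer needed: assigned unconditionally below */

    pthread_mutex_lock(&link_lock);
    ret = link_add_locked(con, sup);
    pthread_mutex_unlock(&link_lock);
    return ret;
}
```

Splitting the function this way lets code that already holds the lock call the helper directly, which appears to be what the reworked bind path in the patch relies on.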
On Thu, Jan 26, 2023 at 04:11:31PM -0800, Saravana Kannan wrote:
> Registering an irqdomain sets the flag for the fwnode. But having the
> flag set when a device is added is interpreted by fw_devlink to mean the
> device has already been initialized and will never probe. This prevents
> fw_devlink from creating device links with the gpio_device as a
> supplier. So, clear the flag before adding the device.

...

> +        /*
> +         * If fwnode doesn't belong to another device, it's safe to clear its
> +         * initialized flag.
> +         */
> +        if (!gdev->dev.fwnode->dev)
> +                fwnode_dev_initialized(gdev->dev.fwnode, false);

Do not dereference fwnode in struct device. Use dev_fwnode() for that.

        struct fwnode_handle *fwnode = dev_fwnode(&gdev->dev);

        if (!fwnode->dev)
                fwnode_dev_initialized(fwnode, false);

+ Blank line.

>          ret = gcdev_register(gdev, gpio_devt);
>          if (ret)
>                  return ret;
On Fri, Jan 27, 2023 at 11:29:43AM +0200, Andy Shevchenko wrote:
> On Thu, Jan 26, 2023 at 04:11:32PM -0800, Saravana Kannan wrote:

...

> >                                 DL_FLAG_AUTOREMOVE_SUPPLIER | \
> >                                 DL_FLAG_AUTOPROBE_CONSUMER | \
> >                                 DL_FLAG_SYNC_STATE_ONLY | \
> > -                               DL_FLAG_INFERRED)
> > +                               DL_FLAG_INFERRED | \
> > +                               DL_FLAG_CYCLE)
>
> You can make less churn by squeezing the new one above the last one.

Or even define a mask with all necessary bits in the header and use it.
On Thu, Jan 26, 2023 at 04:11:33PM -0800, Saravana Kannan wrote:
> To improve detection and handling of dependency cycles, we need to be
> able to mark fwnode links as being part of cycles. fwnode links marked
> as being part of a cycle should not block their consumers from probing.

...

> +        list_for_each_entry(link, &fwnode->suppliers, c_hook) {
> +                if (link->flags & FWLINK_FLAG_CYCLE)
> +                        continue;
> +                return link->supplier;

Hmm...

                if (!(link->flags & FWLINK_FLAG_CYCLE))
                        return link->supplier;

?

> +        }
> +
> +        return NULL;

...

> -        if (dev->fwnode && !list_empty(&dev->fwnode->suppliers) &&
> -            !fw_devlink_is_permissive()) {
> -                sup_fw = list_first_entry(&dev->fwnode->suppliers,
> -                                          struct fwnode_link,
> -                                          c_hook)->supplier;
> +        sup_fw = fwnode_links_check_suppliers(dev->fwnode);

dev_fwnode() ?

...

> -        val = !list_empty(&dev->fwnode->suppliers);
> +        mutex_lock(&fwnode_link_lock);
> +        val = !!fwnode_links_check_suppliers(dev->fwnode);

Ditto?

> +        mutex_unlock(&fwnode_link_lock);
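The lookup being reviewed here returns the first supplier whose link is not marked as part of a cycle. A minimal userspace sketch of that logic, written with the inverted condition suggested in the review, might look like the following. The types, the singly linked list, the flag value and the name first_blocking_supplier() are all hypothetical; the kernel code walks a list_head with list_for_each_entry() instead.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative flag value; only the name mirrors the thread. */
#define LINK_FLAG_CYCLE (1u << 0)

/* Hypothetical stand-ins for the supplier node and the link object. */
struct supplier {
    const char *name;
};

struct link {
    struct supplier *supplier;
    unsigned int flags;
    struct link *next; /* simple singly linked list for the sketch */
};

/*
 * Return the first supplier whose link is not part of a known cycle,
 * or NULL when every link is a cycle link or the list is empty.
 */
struct supplier *first_blocking_supplier(const struct link *head)
{
    const struct link *l;

    for (l = head; l != NULL; l = l->next) {
        if (!(l->flags & LINK_FLAG_CYCLE))
            return l->supplier;
    }

    return NULL;
}
```

A NULL result then means that nothing should block the consumer from probing, which matches the commit message's statement that cycle links should not block their consumers.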
On Fri, Jan 27, 2023 at 10:30 AM Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote: > On Thu, Jan 26, 2023 at 04:11:32PM -0800, Saravana Kannan wrote: > > DL_FLAG_AUTOREMOVE_SUPPLIER | \ > > DL_FLAG_AUTOPROBE_CONSUMER | \ > > DL_FLAG_SYNC_STATE_ONLY | \ > > - DL_FLAG_INFERRED) > > + DL_FLAG_INFERRED | \ > > + DL_FLAG_CYCLE) > > You can make less churn by squeezing the new one above the last one. And avoiding some future churn by introducing alphabetical order. Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds
On Thu, Jan 26, 2023 at 04:11:38PM -0800, Saravana Kannan wrote: > This allows fw_devlink to track and enforce supplier-consumer > dependencies for scmi_device. > Is there any dependency in the series, if so Acked-by: Sudeep Holla <sudeep.holla@arm.com> after you incorporate Andy's suggestion. Let me know if you want me to pick this up.
On Thu, Jan 26, 2023 at 04:11:27PM -0800, Saravana Kannan wrote: > Dmitry, Maxim(s), Miquel, Luca, Doug, Colin, Martin, Jean-Philippe, > > I've Cc-ed you because I had pointed you to v1 of this series + the > patches in that thread at one point or another as a fix to some issue > you were facing. It'd appreciate it if you can test this series and > report any issues, or things it fixed and give Tested-bys. I applied this on my working net-next/main development branch and can confirm I am able to successfully boot the Beaglebone Black. Tested-by: Colin Foster <colin.foster@in-advantage.com>
On Fri, Jan 27, 2023 at 12:30 PM Colin Foster <colin.foster@in-advantage.com> wrote: > > On Thu, Jan 26, 2023 at 04:11:27PM -0800, Saravana Kannan wrote: > > Dmitry, Maxim(s), Miquel, Luca, Doug, Colin, Martin, Jean-Philippe, > > > > I've Cc-ed you because I had pointed you to v1 of this series + the > > patches in that thread at one point or another as a fix to some issue > > you were facing. It'd appreciate it if you can test this series and > > report any issues, or things it fixed and give Tested-bys. > > I applied this on my working net-next/main development branch and can > confirm I am able to successfully boot the Beaglebone Black. > > Tested-by: Colin Foster <colin.foster@in-advantage.com> Thanks! -Saravana
On Fri, Jan 27, 2023 at 1:22 AM Andy Shevchenko <andriy.shevchenko@linux.intel.com> wrote: > > On Thu, Jan 26, 2023 at 04:11:28PM -0800, Saravana Kannan wrote: > > When a device X is bound successfully to a driver, if it has a child > > firmware node Y that doesn't have a struct device created by then, we > > delete fwnode links where the child firmware node Y is the supplier. We > > did this to avoid blocking the consumers of the child firmware node Y > > from deferring probe indefinitely. > > > > While that a step in the right direction, it's better to make the > > consumers of the child firmware node Y to be consumers of the device X > > because device X is probably implementing whatever functionality is > > represented by child firmware node Y. By doing this, we capture the > > device dependencies more accurately and ensure better > > probe/suspend/resume ordering. > > ... > > > static unsigned int defer_sync_state_count = 1; > > static DEFINE_MUTEX(fwnode_link_lock); > > static bool fw_devlink_is_permissive(void); > > +static void __fw_devlink_link_to_consumers(struct device *dev); > > static bool fw_devlink_drv_reg_done; > > static bool fw_devlink_best_effort; > > I'm wondering if may avoid adding more forward declarations... > > Perhaps it's a sign that devlink code should be split to its own > module? I've thought about that before, but I'm not there yet. Maybe once my remaining refactors and TODOs are done, it'd be a good time to revisit this question. But I don't think it should be done for the reason of forward declaration as we'd just end up moving these into base.h and we can do that even today. > > ... > > > -int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup) > > +static int __fwnode_link_add(struct fwnode_handle *con, > > + struct fwnode_handle *sup) > > I believe we tolerate a bit longer lines, so you may still have it on a single > line. That'd make it >80 cols. I'm going to leave it as is. > > ... 
> > > +int fwnode_link_add(struct fwnode_handle *con, struct fwnode_handle *sup) > > +{ > > > + int ret = 0; > > Redundant assignment. Thanks. Will fix in v3. > > > + mutex_lock(&fwnode_link_lock); > > + ret = __fwnode_link_add(con, sup); > > + mutex_unlock(&fwnode_link_lock); > > return ret; > > } > > ... > > > if (dev->fwnode && dev->fwnode->dev == dev) { > > You may have above something like > > > fwnode = dev_fwnode(dev); I'll leave it as-is for now. I see dev->fwnode vs dev_fwnode() don't always give the same results. I need to re-examine other places I use dev->fwnode in fw_devlink code before I start using that function. But in general it seems like a good idea. I'll add this to my TODOs. > if (fwnode && fwnode->dev == dev) { > > > struct fwnode_handle *child; > > fwnode_links_purge_suppliers(dev->fwnode); > > + mutex_lock(&fwnode_link_lock); > > fwnode_for_each_available_child_node(dev->fwnode, child) > > - fw_devlink_purge_absent_suppliers(child); > > + __fw_devlink_pickup_dangling_consumers(child, > > + dev->fwnode); > > __fw_devlink_pickup_dangling_consumers(child, fwnode); I like the dev->fwnode->dev == dev check. It makes it super clear that I'm checking "The device's fwnode points back to the device". If I just use fwnode->dev == dev, then one will have to go back and read what fwnode is set to, etc. Also, when reading all these function calls it's easier to see that I'm working on the dev's fwnode (where dev is the device that was just bound to a driver) instead of some other fwnode. So I find it more readable as is and the compiler would optimize it anyway. If you feel strongly about this, I can change to use fwnode instead of dev->fwnode. Thanks, Saravana
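One way dev->fwnode and dev_fwnode() can disagree is sketched below. This is a simplified userspace model, assuming the helper prefers the fwnode embedded in the device's devicetree node whenever one is attached and only otherwise falls back to dev->fwnode. The structures are cut-down stand-ins and model_dev_fwnode() is a hypothetical name, so treat it as an illustration of the concern raised above rather than the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-ins; the real kernel structures carry much more state. */
struct fwnode_handle {
    int id;
};

struct device_node {
    struct fwnode_handle fwnode; /* fwnode embedded in the devicetree node */
};

struct device {
    struct device_node *of_node;  /* may be NULL */
    struct fwnode_handle *fwnode; /* may be NULL, or differ from of_node's */
};

/*
 * Model of the accessor (assumption): prefer the devicetree node's fwnode
 * when one is attached, otherwise fall back to dev->fwnode.
 */
struct fwnode_handle *model_dev_fwnode(struct device *dev)
{
    assert(dev != NULL);
    return dev->of_node ? &dev->of_node->fwnode : dev->fwnode;
}
```

Under that assumption, a device whose of_node and fwnode fields were set independently would give different answers for the two spellings, which is why auditing the existing users first seems prudent.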
On Fri, Jan 27, 2023 at 1:30 AM Andy Shevchenko
<andriy.shevchenko@linux.intel.com> wrote:
>
> On Thu, Jan 26, 2023 at 04:11:32PM -0800, Saravana Kannan wrote:
> > fw_devlink uses DL_FLAG_SYNC_STATE_ONLY device link flag for two
> > purposes:
> >
> > 1. To allow a parent device to proxy its child device's dependency on a
> >    supplier so that the supplier doesn't get its sync_state() callback
> >    before the child device/consumer can be added and probed. In this
> >    usage scenario, we need to ignore cycles for ensure correctness of
> >    sync_state() callbacks.
> >
> > 2. When there are dependency cycles in firmware, we don't know which of
> >    those dependencies are valid. So, we have to ignore them all wrt
> >    probe ordering while still making sure the sync_state() callbacks
> >    come correctly.
> >
> > However, when detecting dependency cycles, there can be multiple
> > dependency cycles between two devices that we need to detect. For
> > example:
> >
> > A -> B -> A and A -> C -> B -> A.
> >
> > To detect multiple cycles correct, we need to be able to differentiate
> > DL_FLAG_SYNC_STATE_ONLY device links used for (1) vs (2) above.
> >
> > To allow this differentiation, add a DL_FLAG_CYCLE that can be use to
> > mark use case (2). We can then use the DL_FLAG_CYCLE to decide which
> > DL_FLAG_SYNC_STATE_ONLY device links to follow when looking for
> > dependency cycles.
>
> ...
>
> > +static inline bool device_link_flag_is_sync_state_only(u32 flags)
> > +{
> > +        return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE))
> > +                == (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
>
> Weird indentation, why not
>
>         return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE)) ==
>                (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
>
> ?

Ack. Will fix in v3.

>
> > +}
>
> ...
>
> >                                 DL_FLAG_AUTOREMOVE_SUPPLIER | \
> >                                 DL_FLAG_AUTOPROBE_CONSUMER | \
> >                                 DL_FLAG_SYNC_STATE_ONLY | \
> > -                               DL_FLAG_INFERRED)
> > +                               DL_FLAG_INFERRED | \
> > +                               DL_FLAG_CYCLE)
>
> You can make less churn by squeezing the new one above the last one.

I feel like this part is getting bike shedded. I'm going to leave it
as is. It's done in the order it's defined in the header and keeping
it that way makes it way more easier to read than worry about a single
line churn.

-Saravana

>
> --
> With Best Regards,
> Andy Shevchenko
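The helper quoted above masks out the two flags that may legitimately accompany a sync-state-only link and then compares what is left against the exact expected pair. A standalone sketch of that check, using the layout suggested in the review, is shown below. The flag names follow the thread, but the bit positions here are illustrative and uint32_t stands in for the kernel's u32.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BIT(n) (1u << (n))

/* Names follow the thread; the bit positions are illustrative only. */
#define DL_FLAG_PM_RUNTIME      BIT(2)
#define DL_FLAG_MANAGED         BIT(6)
#define DL_FLAG_SYNC_STATE_ONLY BIT(7)
#define DL_FLAG_INFERRED        BIT(8)
#define DL_FLAG_CYCLE           BIT(9)

/*
 * A link counts as sync-state-only when, ignoring INFERRED and CYCLE,
 * exactly SYNC_STATE_ONLY and MANAGED are set and nothing else.
 */
static inline bool device_link_flag_is_sync_state_only(uint32_t flags)
{
    return (flags & ~(DL_FLAG_INFERRED | DL_FLAG_CYCLE)) ==
           (DL_FLAG_SYNC_STATE_ONLY | DL_FLAG_MANAGED);
}
```

Because the comparison is an equality and not a bit test, any extra flag such as DL_FLAG_PM_RUNTIME makes the check fail, while DL_FLAG_CYCLE and DL_FLAG_INFERRED are tolerated.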
On Mon, Jan 30, 2023 at 12:56 AM Naresh Kamboju <naresh.kamboju@linaro.org> wrote: > > Build test pass on arm, arm64, i386, mips, parisc, powerpc, riscv, s390, sh, > sparc and x86_64. > > Boot and LTP smoke pass on qemu-arm64, qemu-armv7, qemu-i386 and qemu-x86_64. > Boot failed on FVP. > > Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> > > Please refer following link for details of testing. > FVP boot log failed. > https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/testrun/14389034/suite/boot/test/gcc-12-lkftconfig-64k_page_size/details/ Sudeep pointed me to what the issue might be. But it's strange that you are hitting an issue now. I'm pretty sure I haven't changed this part since v1. I'd also expect the limited assumptions I made to have not been affected between v1 and v2. Anyway, I'll look at this and fix it in v3. -Saravana
On Mon, Jan 30, 2023 at 4:09 AM Maxim Kiselev <bigunclemax@gmail.com> wrote: > > Hi Saravana & Miquel. > > Sorry for the long response. I finally got access to my test device > and tried this patch series. > > And unfortunately it didn't solve my issue. I'm still getting a > hanging f1070000.ethernet dependency > from the nvmem-cell mac@6 subnode. Thanks for testing the series. Btw, don't top post. It's frowned upon. Top post means your reply is on the top before the email you are replying to. See how my first line of reply in inline with your email I'm replying to? > > Here are related parts of my kernel log and device tree: > > > [ 2.713302] device: 'mtd-0': device_add > [ 2.719528] device: 'spi0': device_add > [ 2.724180] device: 'spi0.0': device_add > [ 2.728957] spi-nor spi0.0: mx66l51235f (65536 Kbytes) > [ 2.735338] 7 fixed-partitions partitions found on MTD device spi0.0 > [ 2.741978] device: > 'f1010600.spi:m25p80@0:partitions:partition@1': device_add > [ 2.749636] Creating 7 MTD partitions on "spi0.0": > [ 2.754564] 0x000000000000-0x000000080000 : "SPI.U_BOOT" > [ 2.759981] device: 'mtd0': device_add > [ 2.764323] device: 'mtd0': device_add > [ 2.768280] device: 'mtd0ro': device_add > [ 2.772624] 0x0000000a0000-0x0000000c0000 : "SPI.INV_INFO" > [ 2.778218] device: 'mtd1': device_add > [ 2.782549] device: 'mtd1': device_add > [ 2.786582] device: 'mtd1ro': device_add > ... 
> [ 5.426625] mvneta_bm f10c0000.bm: Buffer Manager for network > controller enabled > [ 5.492867] platform f1070000.ethernet: error -EPROBE_DEFER: > wait for supplier mac@6 > [ 5.528636] device: 'Fixed MDIO bus.0': device_add > [ 5.533726] device: 'fixed-0': device_add > [ 5.547564] device: 'f1072004.mdio-eth-mii': device_add > [ 5.616368] device: 'f1072004.mdio-eth-mii:00': device_add > [ 5.645127] device: 'f1072004.mdio-eth-mii:1e': device_add > [ 5.651530] devices_kset: Moving f1070000.ethernet to end of list > [ 5.657948] platform f1070000.ethernet: error -EPROBE_DEFER: > wait for supplier mac@6 > > spi@10600 { > m25p80@0 { > compatible = "mx66l51235l"; > > partitions { > compatible = "fixed-partitions"; > > partition@0 { > label = "SPI.U_BOOT"; > }; > partition@1 { > compatible = "nvmem-cells"; > label = "SPI.INV_INFO"; > macaddr: mac@6 { > reg = <0x6 0x6>; > }; > }; > ... > }; > }; > }; > > enet1: ethernet@70000 { > nvmem-cells = <&macaddr>; > nvmem-cell-names = "mac-address"; > phy-mode = "rgmii"; > phy = <&phy0>; > }; > > > Maybe I should provide some additional debug info? I took a look at it and I think I know the issue. But it'll be good if you can point me to the dts (not dtsi) file that corresponds to the board you are seeing this issue on so I can double check my guess by looking at the exact code/drivers. The main problem/mistake is the nvmem framework is using a "struct bus" instead of a "struct class" to keep a list of the nvmem devices. And we can't change it now because it'd affect the sysfs paths significantly and might break userspace ABI. Can you try the patch at the end of this email under these configurations and tell me which ones fail vs pass? I don't need logs for any pass/failures. 1. On top of this series 2. Without this series 3. On top of the series but with the call to fwnode_dev_initialized() deleted? 4. Without this series, but with the call to fwnode_dev_initialized() deleted? -Saravana Sorry about tabs to spaces conversion. 
Email client issue. diff --git a/drivers/nvmem/core.c b/drivers/nvmem/core.c index 321d7d63e068..23d94c0ecccf 100644 --- a/drivers/nvmem/core.c +++ b/drivers/nvmem/core.c @@ -752,6 +752,7 @@ static int nvmem_add_cells_from_of(struct nvmem_device *nvmem) struct nvmem_device *nvmem_register(const struct nvmem_config *config) { struct nvmem_device *nvmem; + struct fwnode_handle *fwnode; int rval; if (!config->dev) @@ -804,9 +805,18 @@ struct nvmem_device *nvmem_register(const struct nvmem_config *config) nvmem->keepout = config->keepout; nvmem->nkeepout = config->nkeepout; if (config->of_node) - nvmem->dev.of_node = config->of_node; + fwnode = of_fwnode_handle(config->of_node); else if (!config->no_of_node) - nvmem->dev.of_node = config->dev->of_node; + fwnode = of_fwnode_handle(config->dev->of_node); + device_set_node(&nvmem->dev, fwnode); + + /* + * If the fwnode doesn't have another device associated with it, mark + * the fwnode as initialized since no driver is going to bind to the + * nvmem. + */ + if (fwnode && !fwnode->dev) + fwnode_dev_initialized(fwnode, true); switch (config->id) { case NVMEM_DEVID_NONE:
Hi Saravana, On Mon, Jan 30, 2023 at 03:03:01PM -0800, Saravana Kannan wrote: > On Mon, Jan 30, 2023 at 12:56 AM Naresh Kamboju > <naresh.kamboju@linaro.org> wrote: > > > > Build test pass on arm, arm64, i386, mips, parisc, powerpc, riscv, s390, sh, > > sparc and x86_64. > > > > Boot and LTP smoke pass on qemu-arm64, qemu-armv7, qemu-i386 and qemu-x86_64. > > Boot failed on FVP. > > > > Reported-by: Linux Kernel Functional Testing <lkft@linaro.org> > > > > Please refer following link for details of testing. > > FVP boot log failed. > > https://qa-reports.linaro.org/~anders.roxell/linux-mainline-patches/build/lore_kernel_org_linux-devicetree_20230127001141_407071-1-saravanak_google_com/testrun/14389034/suite/boot/test/gcc-12-lkftconfig-64k_page_size/details/ > > Sudeep pointed me to what the issue might be. But it's strange that > you are hitting an issue now. I'm pretty sure I haven't changed this > part since v1. I'd also expect the limited assumptions I made to have > not been affected between v1 and v2. > Sorry I hadn't seen or tested v1. FYI The fwnode non-NULL check as in your nvmem diff/suggestion and the diff I replied on the gpiolib patch thread fixes the issues. > Anyway, I'll look at this and fix it in v3. > If you add that fwnode check, feel free to add my tested by. -- Regards, Sudeep
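The non-NULL check mentioned here presumably guards the fwnode dereference in the gpiolib hunk, which would fault on a device that has no firmware node at all. A userspace model of the guarded logic is sketched below, assuming that reading of the report. The structures, the flag value, model_fwnode_dev_initialized() and clear_initialized_if_unowned() are hypothetical stand-ins, not the actual fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative flag value; only the name mirrors the kernel flag. */
#define FWNODE_FLAG_INITIALIZED (1u << 2)

struct device;

/* Cut-down stand-ins for the kernel structures. */
struct fwnode_handle {
    struct device *dev; /* device that owns this fwnode, if any */
    unsigned int flags;
};

struct device {
    struct fwnode_handle *fwnode; /* NULL when there is no firmware node */
};

/* Model of the setter: set or clear the initialized flag. */
static void model_fwnode_dev_initialized(struct fwnode_handle *fwnode,
                                         bool initialized)
{
    if (fwnode == NULL)
        return;
    if (initialized)
        fwnode->flags |= FWNODE_FLAG_INITIALIZED;
    else
        fwnode->flags &= ~FWNODE_FLAG_INITIALIZED;
}

/*
 * Clear the flag only when a fwnode exists and no other device owns it.
 * Returns true when the flag was cleared.
 */
bool clear_initialized_if_unowned(struct device *dev)
{
    struct fwnode_handle *fwnode;

    assert(dev != NULL);
    fwnode = dev->fwnode;
    if (fwnode == NULL || fwnode->dev != NULL)
        return false;

    model_fwnode_dev_initialized(fwnode, false);
    return true;
}
```

The order of the two conditions matters: the NULL test has to come before fwnode->dev is read.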
On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote: > > Hi Saravana, > > > Can you try the patch at the end of this email under these > > configurations and tell me which ones fail vs pass? I don't need logs > > I did these tests and here is the results: Did you hand edit the In-Reply-To: in the header? Because in the thread you are reply to the wrong email, but the context in your email seems to be from the right email. For example, see how your reply isn't under the email you are replying to in this thread overview: https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r > 1. On top of this series - Not works > 2. Without this series - Works > 3. On top of the series with the fwnode_dev_initialized() deleted - Not works > 4. Without this series, with the fwnode_dev_initialized() deleted - Works > > So your nvmem/core.c patch helps only when it is applied without the series. > But despite the fact that this helps to avoid getting stuck at probing > my ethernet device, there is still regression. > > When the ethernet module is loaded it takes a lot of time to drop dependency > from the nvmem-cell with mac address. > > Please look at the kernel logs below. The kernel logs below really aren't that useful for me in their current state. See more below. ---8<---- <snip> --->8---- > P.S. Your nvmem patch definitely helps to avoid a device probe stuck > but look like it is not best way to solve a problem which we discussed > in the MTD thread. > > P.P.S. Also I don't know why your nvmem-cell patch doesn't help when it was > applied on top of this series. Maybe I missed something. Yeah, I'm not too sure if the test was done correctly. You also didn't answer my question about the dts from my earlier email. https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t So, can you please retest config 1 with all pr_debug and dev_dbg in drivers/core/base.c changed to the _info variants? 
And then share the kernel log from the beginning of boot? Maybe attach it to the email so it doesn't get word wrapped by your email client. And please point me to the .dts that corresponds to your board. Without that, I can't debug much. Thanks, Saravana
On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <bigunclemax@gmail.com> wrote: > > пт, 3 февр. 2023 г. в 09:07, Saravana Kannan <saravanak@google.com>: > > > > On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote: > > > > > > Hi Saravana, > > > > > > > Can you try the patch at the end of this email under these > > > > configurations and tell me which ones fail vs pass? I don't need logs > > > > > > I did these tests and here is the results: > > > > Did you hand edit the In-Reply-To: in the header? Because in the > > thread you are reply to the wrong email, but the context in your email > > seems to be from the right email. > > > > For example, see how your reply isn't under the email you are replying > > to in this thread overview: > > https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r > > > > > 1. On top of this series - Not works > > > 2. Without this series - Works > > > 3. On top of the series with the fwnode_dev_initialized() deleted - Not works > > > 4. Without this series, with the fwnode_dev_initialized() deleted - Works > > > > > > So your nvmem/core.c patch helps only when it is applied without the series. > > > But despite the fact that this helps to avoid getting stuck at probing > > > my ethernet device, there is still regression. > > > > > > When the ethernet module is loaded it takes a lot of time to drop dependency > > > from the nvmem-cell with mac address. > > > > > > Please look at the kernel logs below. > > > > The kernel logs below really aren't that useful for me in their > > current state. See more below. > > > > ---8<---- <snip> --->8---- > > > > > P.S. Your nvmem patch definitely helps to avoid a device probe stuck > > > but look like it is not best way to solve a problem which we discussed > > > in the MTD thread. > > > > > > P.P.S. Also I don't know why your nvmem-cell patch doesn't help when it was > > > applied on top of this series. Maybe I missed something. 
> > > > Yeah, I'm not too sure if the test was done correctly. You also didn't > > answer my question about the dts from my earlier email. > > https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t > > > > So, can you please retest config 1 with all pr_debug and dev_dbg in > > drivers/core/base.c changed to the _info variants? And then share the > > kernel log from the beginning of boot? Maybe attach it to the email so > > it doesn't get word wrapped by your email client. And please point me > > to the .dts that corresponds to your board. Without that, I can't > > debug much. > > > > Thanks, > > Saravana > > > Did you hand edit the In-Reply-To: in the header? Because in the > > thread you are reply to the wrong email, but the context in your email > > seems to be from the right email. > > Sorry for that, it seems like I accidently deleted it. > > > So, can you please retest config 1 with all pr_debug and dev_dbg in > > drivers/core/base.c changed to the _info variants? And then share the > > kernel log from the beginning of boot? Maybe attach it to the email so > > it doesn't get word wrapped by your email client. And please point me > > to the .dts that corresponds to your board. Without that, I can't > > debug much. > > Ok, I retested config 1 with all _debug logs changed to the _info. I > added the kernel log and the dts file to the attachment of this email. Ah, so your device is not supported/present upstream? Even though it's not upstream, I'll help fix this because it should fix what I believe are unreported issues in upstream. Ok I know why configs 1 - 4 behaved the way they did and why my test patch didn't help. After staring at mtd/nvmem code for a few hours I think mtd/nvmem interaction is kind of a mess. mtd core creates "partition" platform devices (including for nvmem-cells) that are probed by drivers in drivers/nvmem. However, there's no driver for "nvmem-cells" partition platform device. 
However, the nvmem core creates nvmem_device when
nvmem_register() is called by MTD or these partition platform devices
created by MTD. But these nvmem_devices are added to a nvmem_bus but
the bus has no means to even register a driver (it should really be a
nvmem_class and not nvmem_bus). And the nvmem_device sometimes points
to the DT node of the MTD device or sometimes the partition platform
devices or maybe no DT node at all.

So it's a mess of multiple devices pointing to the same DT node with
no clear way to identify which ones will point to a DT node and which
ones will probe and which ones won't. In the future, we shouldn't
allow adding new compatible strings for partitions for which we don't
plan on adding nvmem drivers.

Can you give the patch at the end of the email a shot? It should fix
the issue with this series and without this series. It just avoids
this whole mess by not creating useless platform device for
nvmem-cells compatible DT nodes.

Thanks,
Saravana

diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
index d442fa94c872..88a213f4d651 100644
--- a/drivers/mtd/mtdpart.c
+++ b/drivers/mtd/mtdpart.c
@@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
 {
        struct mtd_part_parser *parser;
        struct device_node *np;
+       struct device_node *child;
        struct property *prop;
        struct device *dev;
        const char *compat;
@@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
        else
                np = of_get_child_by_name(np, "partitions");

+       for_each_child_of_node(np, child)
+               if (of_device_is_compatible(child, "nvmem-cells"))
+                       of_node_set_flag(child, OF_POPULATED);
+
        of_property_for_each_string(np, "compatible", prop, compat) {
                parser = mtd_part_get_compatible_parser(compat);
                if (!parser)
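The idea in the patch above is to walk the children of the partitions node and pre-mark those that are compatible with "nvmem-cells" as populated, so that no platform device gets created for them later. A userspace sketch of that loop is given below. The array-based child list, struct node, NODE_POPULATED and mark_nvmem_cells_populated() are hypothetical simplifications; a real devicetree node can carry several compatible strings, which of_device_is_compatible() handles and this sketch does not.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Illustrative flag value standing in for the "populated" marker. */
#define NODE_POPULATED (1u << 0)

/* Hypothetical, simplified devicetree node. */
struct node {
    const char *compatible; /* single compatible string, or NULL */
    unsigned int flags;
};

/*
 * Mark every child that is compatible with "nvmem-cells" as populated.
 * Returns the number of children that were marked.
 */
size_t mark_nvmem_cells_populated(struct node *children, size_t n)
{
    size_t i, marked = 0;

    assert(children != NULL || n == 0);

    for (i = 0; i < n; i++) {
        if (children[i].compatible != NULL &&
            strcmp(children[i].compatible, "nvmem-cells") == 0) {
            children[i].flags |= NODE_POPULATED;
            marked++;
        }
    }

    return marked;
}
```

Children without a compatible string, or with a different one, are left untouched, so regular partitions would still be handled by whichever parser matches them.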
On Sun, Feb 5, 2023 at 5:32 PM Saravana Kannan <saravanak@google.com> wrote: > > On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <bigunclemax@gmail.com> wrote: > > > > пт, 3 февр. 2023 г. в 09:07, Saravana Kannan <saravanak@google.com>: > > > > > > On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote: > > > > > > > > Hi Saravana, > > > > > > > > > Can you try the patch at the end of this email under these > > > > > configurations and tell me which ones fail vs pass? I don't need logs > > > > > > > > I did these tests and here is the results: > > > > > > Did you hand edit the In-Reply-To: in the header? Because in the > > > thread you are reply to the wrong email, but the context in your email > > > seems to be from the right email. > > > > > > For example, see how your reply isn't under the email you are replying > > > to in this thread overview: > > > https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r > > > > > > > 1. On top of this series - Not works > > > > 2. Without this series - Works > > > > 3. On top of the series with the fwnode_dev_initialized() deleted - Not works > > > > 4. Without this series, with the fwnode_dev_initialized() deleted - Works > > > > > > > > So your nvmem/core.c patch helps only when it is applied without the series. > > > > But despite the fact that this helps to avoid getting stuck at probing > > > > my ethernet device, there is still regression. > > > > > > > > When the ethernet module is loaded it takes a lot of time to drop dependency > > > > from the nvmem-cell with mac address. > > > > > > > > Please look at the kernel logs below. > > > > > > The kernel logs below really aren't that useful for me in their > > > current state. See more below. > > > > > > ---8<---- <snip> --->8---- > > > > > > > P.S. Your nvmem patch definitely helps to avoid a device probe stuck > > > > but look like it is not best way to solve a problem which we discussed > > > > in the MTD thread. 
> > > > > > > > P.P.S. Also I don't know why your nvmem-cell patch doesn't help when it was > > > > applied on top of this series. Maybe I missed something. > > > > > > Yeah, I'm not too sure if the test was done correctly. You also didn't > > > answer my question about the dts from my earlier email. > > > https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t > > > > > > So, can you please retest config 1 with all pr_debug and dev_dbg in > > > drivers/core/base.c changed to the _info variants? And then share the > > > kernel log from the beginning of boot? Maybe attach it to the email so > > > it doesn't get word wrapped by your email client. And please point me > > > to the .dts that corresponds to your board. Without that, I can't > > > debug much. > > > > > > Thanks, > > > Saravana > > > > > Did you hand edit the In-Reply-To: in the header? Because in the > > > thread you are reply to the wrong email, but the context in your email > > > seems to be from the right email. > > > > Sorry for that, it seems like I accidently deleted it. > > > > > So, can you please retest config 1 with all pr_debug and dev_dbg in > > > drivers/core/base.c changed to the _info variants? And then share the > > > kernel log from the beginning of boot? Maybe attach it to the email so > > > it doesn't get word wrapped by your email client. And please point me > > > to the .dts that corresponds to your board. Without that, I can't > > > debug much. > > > > Ok, I retested config 1 with all _debug logs changed to the _info. I > > added the kernel log and the dts file to the attachment of this email. > > Ah, so your device is not supported/present upstream? Even though it's > not upstream, I'll help fix this because it should fix what I believe > are unreported issues in upstream. > > Ok I know why configs 1 - 4 behaved the way they did and why my test > patch didn't help. 
> > After staring at mtd/nvmem code for a few hours I think mtd/nvmem > interaction is kind of a mess. mtd core creates "partition" platform > devices (including for nvmem-cells) that are probed by drivers in > drivers/nvmem. However, there's no driver for "nvmem-cells" partition > platform device. However, the nvmem core creates nvmem_device when > nvmem_register() is called by MTD or these partition platform devices > created by MTD. But these nvmem_devices are added to a nvmem_bus but > the bus has no means to even register a driver (it should really be a > nvmem_class and not nvmem_bus). And the nvmem_device sometimes points > to the DT node of the MTD device or sometimes the partition platform > devices or maybe no DT node at all. > > So it's a mess of multiple devices pointing to the same DT node with > no clear way to identify which ones will point to a DT node and which > ones will probe and which ones won't. In the future, we shouldn't > allow adding new compatible strings for partitions for which we don't > plan on adding nvmem drivers. > > Can you give the patch at the end of the email a shot? It should fix > the issue with this series and without this series. It just avoids > this whole mess by not creating useless platform device for > nvmem-cells compatible DT nodes. 
Actually, without this series, the patch below will need an additional
line of code inside the if block:

        fwnode_dev_initialized(of_fwnode_handle(child), true);

-Saravana

>
> Thanks,
> Saravana
>
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index d442fa94c872..88a213f4d651 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
>  {
>         struct mtd_part_parser *parser;
>         struct device_node *np;
> +       struct device_node *child;
>         struct property *prop;
>         struct device *dev;
>         const char *compat;
> @@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
>         else
>                 np = of_get_child_by_name(np, "partitions");
>
> +       for_each_child_of_node(np, child)
> +               if (of_device_is_compatible(child, "nvmem-cells"))
> +                       of_node_set_flag(child, OF_POPULATED);
> +
>         of_property_for_each_string(np, "compatible", prop, compat) {
>                 parser = mtd_part_get_compatible_parser(compat);
>                 if (!parser)
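For readers who want to see the idea of the test patch in isolation, here is a minimal userspace sketch of what the added loop does: it walks the children of the partitions node and flags every "nvmem-cells" child as already populated, so that a later populate pass would not create a platform device for it. This is only a model. `struct model_node`, `MODEL_POPULATED`, `model_mark_nvmem_cells()` and `model_would_create_device()` are hypothetical stand-ins for `struct device_node`, `OF_POPULATED` and the kernel helpers, and the single `compatible` string per node is a simplifying assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's OF_POPULATED flag bit. */
#define MODEL_POPULATED (1u << 0)

/* Hypothetical, much simplified stand-in for struct device_node. */
struct model_node {
	const char *compatible;     /* one compatible string, or NULL */
	unsigned int flags;
	struct model_node *sibling; /* next child of the same parent */
};

/*
 * Flag every "nvmem-cells" child as already populated, mirroring the
 * loop in the test patch. Returns the number of children flagged.
 */
static int model_mark_nvmem_cells(struct model_node *first_child)
{
	int marked = 0;

	for (struct model_node *c = first_child; c; c = c->sibling) {
		if (c->compatible && strcmp(c->compatible, "nvmem-cells") == 0) {
			c->flags |= MODEL_POPULATED;
			marked++;
		}
	}
	return marked;
}

/*
 * Model of the populate decision: a node gets a platform device only if
 * it has a compatible string and is not flagged as populated yet.
 */
static int model_would_create_device(const struct model_node *n)
{
	assert(n != NULL);
	return n->compatible != NULL && !(n->flags & MODEL_POPULATED);
}
```

Calling `model_mark_nvmem_cells()` on a child list before the populate step leaves other partition nodes untouched, which matches the intent stated in the email: only the nvmem-cells nodes lose their (driverless) platform device.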
Hi Saravana, + Srinivas, nvmem maintainer saravanak@google.com wrote on Sun, 5 Feb 2023 17:32:57 -0800: > On Fri, Feb 3, 2023 at 1:39 AM Maxim Kiselev <bigunclemax@gmail.com> wrote: > > > > пт, 3 февр. 2023 г. в 09:07, Saravana Kannan <saravanak@google.com>: > > > > > > On Thu, Feb 2, 2023 at 9:36 AM Maxim Kiselev <bigunclemax@gmail.com> wrote: > > > > > > > > Hi Saravana, > > > > > > > > > Can you try the patch at the end of this email under these > > > > > configurations and tell me which ones fail vs pass? I don't need logs > > > > > > > > I did these tests and here is the results: > > > > > > Did you hand edit the In-Reply-To: in the header? Because in the > > > thread you are reply to the wrong email, but the context in your email > > > seems to be from the right email. > > > > > > For example, see how your reply isn't under the email you are replying > > > to in this thread overview: > > > https://lore.kernel.org/lkml/20230127001141.407071-1-saravanak@google.com/#r > > > > > > > 1. On top of this series - Not works > > > > 2. Without this series - Works > > > > 3. On top of the series with the fwnode_dev_initialized() deleted - Not works > > > > 4. Without this series, with the fwnode_dev_initialized() deleted - Works > > > > > > > > So your nvmem/core.c patch helps only when it is applied without the series. > > > > But despite the fact that this helps to avoid getting stuck at probing > > > > my ethernet device, there is still regression. > > > > > > > > When the ethernet module is loaded it takes a lot of time to drop dependency > > > > from the nvmem-cell with mac address. > > > > > > > > Please look at the kernel logs below. > > > > > > The kernel logs below really aren't that useful for me in their > > > current state. See more below. > > > > > > ---8<---- <snip> --->8---- > > > > > > > P.S. 
Your nvmem patch definitely helps to avoid a device probe stuck > > > > but look like it is not best way to solve a problem which we discussed > > > > in the MTD thread. > > > > > > > > P.P.S. Also I don't know why your nvmem-cell patch doesn't help when it was > > > > applied on top of this series. Maybe I missed something. > > > > > > Yeah, I'm not too sure if the test was done correctly. You also didn't > > > answer my question about the dts from my earlier email. > > > https://lore.kernel.org/lkml/CAGETcx8FpmbaRm2CCwqt3BRBpgbogwP5gNB+iA5OEtuxWVTNLA@mail.gmail.com/#t > > > > > > So, can you please retest config 1 with all pr_debug and dev_dbg in > > > drivers/core/base.c changed to the _info variants? And then share the > > > kernel log from the beginning of boot? Maybe attach it to the email so > > > it doesn't get word wrapped by your email client. And please point me > > > to the .dts that corresponds to your board. Without that, I can't > > > debug much. > > > > > > Thanks, > > > Saravana > > > > > Did you hand edit the In-Reply-To: in the header? Because in the > > > thread you are reply to the wrong email, but the context in your email > > > seems to be from the right email. > > > > Sorry for that, it seems like I accidently deleted it. > > > > > So, can you please retest config 1 with all pr_debug and dev_dbg in > > > drivers/core/base.c changed to the _info variants? And then share the > > > kernel log from the beginning of boot? Maybe attach it to the email so > > > it doesn't get word wrapped by your email client. And please point me > > > to the .dts that corresponds to your board. Without that, I can't > > > debug much. > > > > Ok, I retested config 1 with all _debug logs changed to the _info. I > > added the kernel log and the dts file to the attachment of this email. > > Ah, so your device is not supported/present upstream? Even though it's > not upstream, I'll help fix this because it should fix what I believe > are unreported issues in upstream. 
>
> Ok I know why configs 1 - 4 behaved the way they did and why my test
> patch didn't help.
>
> After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> interaction is kind of a mess.

nvmem is a recent subsystem but mtd carries a lot of legacy stuff we
cannot really re-wire without breaking users, so nvmem on top of mtd
of course inherits from the fragile designs in place.

> mtd core creates "partition" platform
> devices (including for nvmem-cells) that are probed by drivers in
> drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> platform device. However, the nvmem core creates nvmem_device when
> nvmem_register() is called by MTD or these partition platform devices
> created by MTD. But these nvmem_devices are added to a nvmem_bus but
> the bus has no means to even register a driver (it should really be a
> nvmem_class and not nvmem_bus).

Srinivas, do you think we could change this?

> And the nvmem_device sometimes points
> to the DT node of the MTD device or sometimes the partition platform
> devices or maybe no DT node at all.

I guess this comes from the fact that this is not strongly defined in
mtd and depends on the situation (not mentioning 20 years of history
there as well). "mtd" is a bit inconsistent on what it means. Older
designs mixed: controllers, ECC engines when relevant and memories;
while these three components are completely separated. Hence
sometimes the mtd device ends up being the top level controller,
sometimes it's just one partition...

But I'm surprised not all of them point to a DT node. Could you show us
an example? Because that might likely be unexpected (or perhaps I am
missing something).

> So it's a mess of multiple devices pointing to the same DT node with
> no clear way to identify which ones will point to a DT node and which
> ones will probe and which ones won't. In the future, we shouldn't
> allow adding new compatible strings for partitions for which we don't
> plan on adding nvmem drivers.
>
> Can you give the patch at the end of the email a shot? It should fix
> the issue with this series and without this series. It just avoids
> this whole mess by not creating useless platform device for
> nvmem-cells compatible DT nodes.

Thanks a lot for your help.

>
> Thanks,
> Saravana
>
> diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
> index d442fa94c872..88a213f4d651 100644
> --- a/drivers/mtd/mtdpart.c
> +++ b/drivers/mtd/mtdpart.c
> @@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master,
>  {
>         struct mtd_part_parser *parser;
>         struct device_node *np;
> +       struct device_node *child;
>         struct property *prop;
>         struct device *dev;
>         const char *compat;
> @@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master,
>         else
>                 np = of_get_child_by_name(np, "partitions");
>
> +       for_each_child_of_node(np, child)
> +               if (of_device_is_compatible(child, "nvmem-cells"))
> +                       of_node_set_flag(child, OF_POPULATED);

What about adding a comment in the final patch explaining why we need
that? Otherwise it's a little bit obscure.

> +
>         of_property_for_each_string(np, "compatible", prop, compat) {
>                 parser = mtd_part_get_compatible_parser(compat);
>                 if (!parser)

Thanks,
Miquèl
On Sun, Feb 5, 2023 at 7:33 PM Saravana Kannan <saravanak@google.com> wrote:
>
> ---8<---- <snip> --->8----
> > After staring at mtd/nvmem code for a few hours I think mtd/nvmem > interaction is kind of a mess. mtd core creates "partition" platform > devices (including for nvmem-cells) that are probed by drivers in > drivers/nvmem. However, there's no driver for "nvmem-cells" partition > platform device. However, the nvmem core creates nvmem_device when > nvmem_register() is called by MTD or these partition platform devices > created by MTD. But these nvmem_devices are added to a nvmem_bus but > the bus has no means to even register a driver (it should really be a > nvmem_class and not nvmem_bus). And the nvmem_device sometimes points > to the DT node of the MTD device or sometimes the partition platform > devices or maybe no DT node at all. > > So it's a mess of multiple devices pointing to the same DT node with > no clear way to identify which ones will point to a DT node and which > ones will probe and which ones won't. In the future, we shouldn't > allow adding new compatible strings for partitions for which we don't > plan on adding nvmem drivers. That won't work. Having a compatible string cannot mean there must be a driver. Rob
On Mon, Feb 6, 2023 at 7:19 AM Rob Herring <robh+dt@kernel.org> wrote:
>
> On Sun, Feb 5, 2023 at 7:33 PM Saravana Kannan <saravanak@google.com> wrote:
> >
> > ---8<---- <snip> --->8----
> >
> > Ah, so your device is not supported/present upstream?
Even though it's > > not upstream, I'll help fix this because it should fix what I believe > > are unreported issues in upstream. > > > > Ok I know why configs 1 - 4 behaved the way they did and why my test > > patch didn't help. > > > > After staring at mtd/nvmem code for a few hours I think mtd/nvmem > > interaction is kind of a mess. mtd core creates "partition" platform > > devices (including for nvmem-cells) that are probed by drivers in > > drivers/nvmem. However, there's no driver for "nvmem-cells" partition > > platform device. However, the nvmem core creates nvmem_device when > > nvmem_register() is called by MTD or these partition platform devices > > created by MTD. But these nvmem_devices are added to a nvmem_bus but > > the bus has no means to even register a driver (it should really be a > > nvmem_class and not nvmem_bus). And the nvmem_device sometimes points > > to the DT node of the MTD device or sometimes the partition platform > > devices or maybe no DT node at all. > > > > So it's a mess of multiple devices pointing to the same DT node with > > no clear way to identify which ones will point to a DT node and which > > ones will probe and which ones won't. In the future, we shouldn't > > allow adding new compatible strings for partitions for which we don't > > plan on adding nvmem drivers. > > That won't work. Having a compatible string cannot mean there must be a driver. Right, I know what you mean Rob and I know where you are coming from (DT isn't just about Linux or even driver core). But what I'm saying is that this seems to already be the case for MTD partitions after commit: bcdf0315a61a mtd: call of_platform_populate() for MTD partitions So, if we are adding compatible properties only for some of them, then I'm saying we should make sure people write drivers for them going forward. I don't know enough about MTD partitions to know why only some of them have compatible properties. -Saravana
On Mon, Feb 6, 2023 at 1:39 AM Miquel Raynal <miquel.raynal@bootlin.com> wrote:
>
> Hi Saravana,
>
> + Srinivas, nvmem maintainer
>
> saravanak@google.com wrote on Sun, 5 Feb 2023 17:32:57 -0800:
>
> > ---8<---- <snip> --->8----
> >
> > Ah, so your device is not supported/present upstream? Even though it's
> > not upstream, I'll help fix this because it should fix what I believe
> > are unreported issues in upstream.
> >
> > Ok I know why configs 1 - 4 behaved the way they did and why my test
> > patch didn't help.
> >
> > After staring at mtd/nvmem code for a few hours I think mtd/nvmem
> > interaction is kind of a mess.
>
> nvmem is a recent subsystem but mtd carries a lot of legacy stuff we
> cannot really re-wire without breaking users, so nvmem on top of mtd
> of course inherits from the fragile designs in place.

Thanks for the context. Yeah, I figured. That's why I explicitly
limited my comment to "interaction". Although, I'd love to see the MTD
parsers all be converted to proper drivers that probe. MTD is
essentially repeating the driver matching logic. I think it can be
cleaned up to move to proper drivers and still not break backward
compatibility. Not saying it'll be trivial, but it should be possible.
Ironically MTD uses mtd_class but has real drivers that work on the
device (compared to nvmem_bus below).

> > mtd core creates "partition" platform
> > devices (including for nvmem-cells) that are probed by drivers in
> > drivers/nvmem. However, there's no driver for "nvmem-cells" partition
> > platform device. However, the nvmem core creates nvmem_device when
> > nvmem_register() is called by MTD or these partition platform devices
> > created by MTD. But these nvmem_devices are added to a nvmem_bus but
> > the bus has no means to even register a driver (it should really be a
> > nvmem_class and not nvmem_bus).
>
> Srinivas, do you think we could change this?

Yeah, this part gets a bit tricky. It depends on whether the sysfs
files for nvmem devices are considered an ABI.
Changing from bus to
class would change the sysfs path for nvmem devices from:
/sys/bus/nvmem to /sys/class/nvmem

> > And the nvmem_device sometimes points
> > to the DT node of the MTD device or sometimes the partition platform
> > devices or maybe no DT node at all.
>
> I guess this comes from the fact that this is not strongly defined in
> mtd and depends on the situation (not mentioning 20 years of history
> there as well). "mtd" is a bit inconsistent on what it means. Older
> designs mixed: controllers, ECC engines when relevant and memories;
> while these three components are completely separated. Hence
> sometimes the mtd device ends up being the top level controller,
> sometimes it's just one partition...
>
> But I'm surprised not all of them point to a DT node. Could you show us
> an example? Because that might likely be unexpected (or perhaps I am
> missing something).

Well, the logic that sets the DT node for nvmem_device is like so:

        if (config->of_node)
                nvmem->dev.of_node = config->of_node;
        else if (!config->no_of_node)
                nvmem->dev.of_node = config->dev->of_node;

So there's definitely a path (where both if's could be false) where
the DT node will not get set. I don't know if any existing user of
nvmem_register() actually hits that path, but the code allows it.

> > So it's a mess of multiple devices pointing to the same DT node with
> > no clear way to identify which ones will point to a DT node and which
> > ones will probe and which ones won't. In the future, we shouldn't
> > allow adding new compatible strings for partitions for which we don't
> > plan on adding nvmem drivers.
> >
> > Can you give the patch at the end of the email a shot? It should fix
> > the issue with this series and without this series. It just avoids
> > this whole mess by not creating useless platform device for
> > nvmem-cells compatible DT nodes.
>
> Thanks a lot for your help.

No problem. I want fw_devlink to work for everyone.
> > > > Thanks, > > Saravana > > > > diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c > > index d442fa94c872..88a213f4d651 100644 > > --- a/drivers/mtd/mtdpart.c > > +++ b/drivers/mtd/mtdpart.c > > @@ -577,6 +577,7 @@ static int mtd_part_of_parse(struct mtd_info *master, > > { > > struct mtd_part_parser *parser; > > struct device_node *np; > > + struct device_node *child; > > struct property *prop; > > struct device *dev; > > const char *compat; > > @@ -594,6 +595,10 @@ static int mtd_part_of_parse(struct mtd_info *master, > > else > > np = of_get_child_by_name(np, "partitions"); > > > > + for_each_child_of_node(np, child) > > + if (of_device_is_compatible(child, "nvmem-cells")) > > + of_node_set_flag(child, OF_POPULATED); > > What about a comment explaining why we need that in the final patch > (with a comment)? Otherwise it's a little bit obscure. This wasn't meant to be reviewed :) Just a quick patch to make sure I'm going down the right path. Once Maxim confirms I was going to roll this into a proper patch. But point noted. Will add a comment. Thanks, Saravana
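The of_node selection logic quoted in the reply above has three possible outcomes, which can be easier to see in a small userspace sketch. The structs below are hypothetical stand-ins for the kernel's `struct nvmem_config`, `struct device` and `struct device_node`; only the `of_node`, `no_of_node` and `dev` fields from the quoted snippet are modelled, and the "node not set" case is represented as NULL on the assumption that the field starts out zeroed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real types live in the kernel. */
struct model_of_node { const char *name; };
struct model_device { struct model_of_node *of_node; };

struct model_config {
	struct model_device *dev;      /* parent device */
	struct model_of_node *of_node; /* explicit node, may be NULL */
	bool no_of_node;               /* do not inherit the parent's node */
};

/*
 * Mirror the quoted if/else-if chain and return the node that the
 * nvmem device would end up pointing to, or NULL if none is set.
 */
static struct model_of_node *model_pick_of_node(const struct model_config *cfg)
{
	assert(cfg != NULL && cfg->dev != NULL);

	if (cfg->of_node)
		return cfg->of_node;      /* 1: explicit node wins */
	if (!cfg->no_of_node)
		return cfg->dev->of_node; /* 2: inherit from the parent */
	return NULL;                      /* 3: no DT node at all */
}
```

Case 3 is the one under discussion: with no explicit node and `no_of_node` set, the nvmem device has no DT node even when its parent has one. Case 2 can also yield no node when the parent itself has none.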
Hi Saravana, > > > > > So, can you please retest config 1 with all pr_debug and dev_dbg in > > > > > drivers/core/base.c changed to the _info variants? And then share the > > > > > kernel log from the beginning of boot? Maybe attach it to the email so > > > > > it doesn't get word wrapped by your email client. And please point me > > > > > to the .dts that corresponds to your board. Without that, I can't > > > > > debug much. > > > > > > > > Ok, I retested config 1 with all _debug logs changed to the _info. I > > > > added the kernel log and the dts file to the attachment of this email. > > > > > > Ah, so your device is not supported/present upstream? Even though it's > > > not upstream, I'll help fix this because it should fix what I believe > > > are unreported issues in upstream. > > > > > > Ok I know why configs 1 - 4 behaved the way they did and why my test > > > patch didn't help. > > > > > > After staring at mtd/nvmem code for a few hours I think mtd/nvmem > > > interaction is kind of a mess. > > > > nvmem is a recent subsystem but mtd carries a lot of legacy stuff we > > cannot really re-wire without breaking users, so nvmem on top of mtd > > of course inherit from the fragile designs in place. > > Thanks for the context. Yeah, I figured. That's why I explicitly > limited my comment to "interaction". Although, I'd love to see the MTD > parsers all be converted to proper drivers that probe. MTD is > essentially repeating the driver matching logic. I think it can be > cleaned up to move to proper drivers and still not break backward > compatibility. Not saying it'll be trivial, but it should be possible. > Ironically MTD uses mtd_class but has real drivers that work on the > device (compared to nvmem_bus below). > > > > mtd core creates "partition" platform > > > devices (including for nvmem-cells) that are probed by drivers in > > > drivers/nvmem. However, there's no driver for "nvmem-cells" partition > > > platform device. 
However, the nvmem core creates nvmem_device when
> > > nvmem_register() is called by MTD or these partition platform devices
> > > created by MTD. But these nvmem_devices are added to a nvmem_bus but
> > > the bus has no means to even register a driver (it should really be a
> > > nvmem_class and not nvmem_bus).
> >
> > Srinivas, do you think we could change this?
>
> Yeah, this part gets a bit tricky. It depends on whether the sysfs
> files for nvmem devices are considered an ABI. Changing from bus to
> class would change the sysfs path for nvmem devices from:
> /sys/bus/nvmem to /sys/class/nvmem

Ok, so this is a no :)

> > > And the nvmem_device sometimes points
> > > to the DT node of the MTD device or sometimes the partition platform
> > > devices or maybe no DT node at all.
> >
> > I guess this comes from the fact that this is not strongly defined in
> > mtd and depends on the situation (not mentioning 20 years of history
> > there as well). "mtd" is a bit inconsistent on what it means. Older
> > designs mixed: controllers, ECC engines when relevant and memories;
> > while these three components are completely separated. Hence
> > sometimes the mtd device ends up being the top level controller,
> > sometimes it's just one partition...
> >
> > But I'm surprised not all of them point to a DT node. Could you show us
> > an example? Because that might likely be unexpected (or perhaps I am
> > missing something).
>
> Well, the logic that sets the DT node for nvmem_device is like so:
>
>         if (config->of_node)
>                 nvmem->dev.of_node = config->of_node;
>         else if (!config->no_of_node)
>                 nvmem->dev.of_node = config->dev->of_node;
>
> So there's definitely a path (where both if's could be false) where
> the DT node will not get set. I don't know if any existing user of
> nvmem_register() actually hits that path, but the code allows it.

It's an actual path.
I just checked in more detail; this is the change from 2018 which
uses the no_of_node flag:
c4dfa25ab307 ("mtd: add support for reading MTD devices via the nvmem API")

It basically allows any mtd device to be accessible (read-only) through
nvmem. So mtd partitions or such which are not described in the DT may
just be accessed through nvmem (that is my current understanding).

There was later a patch in 2021 which prevented this flag from being
automatically set, so that if partitions (well, mtd devices in general)
were described in the DT, they would provide a valid of_node in order
to be used as cell providers (again, my understanding):
658c4448bbbf ("mtd: core: add nvmem-cells compatible to parse mtd as nvmem cells")

But I guess the major problem comes from the nvmem-cells compatible. I am
wondering if it would make sense to kind of transpose the meaning of
this compatible into a property. But, well, backward compatibility
would still be a problem I guess...

> > > So it's a mess of multiple devices pointing to the same DT node with
> > > no clear way to identify which ones will point to a DT node and which
> > > ones will probe and which ones won't. In the future, we shouldn't
> > > allow adding new compatible strings for partitions for which we don't
> > > plan on adding nvmem drivers.
> > >
> > > Can you give the patch at the end of the email a shot? It should fix
> > > the issue with this series and without this series. It just avoids
> > > this whole mess by not creating useless platform device for
> > > nvmem-cells compatible DT nodes.
> >
> > Thanks a lot for your help.
>
> No problem. I want fw_devlink to work for everyone.
>

Thanks,
Miquèl