Message ID | 20201121203040.146252-1-hdegoede@redhat.com |
---|---|
Headers | show |
Series | ACPI: scan: Split root scanning into 2 steps | expand |
On Sat, Nov 21, 2020 at 9:30 PM Hans de Goede <hdegoede@redhat.com> wrote: > > Hi Rafael, > > A while ago (almost 2 years ago) I discussed an issue with you about > some devices, where some of the methods used during device-addition > (such as _HID) may rely on OpRegions of other devices: > > https://www.spinics.net/lists/linux-acpi/msg86303.html > > An example of this is the Acer Switch 10E SW3-016 model. The _HID method > of the ACPI node for the UART attached Bluetooth, reads GPIOs to detect > the installed wifi chip and update the _HID for the Bluetooth's ACPI node > accordingly. The current ACPI scan code calls _HID before the GPIO > controller's OpRegions are available, leading to the wrong _HID being > used and Bluetooth not working. > > Last week I bought a second hand Acer device, not knowing it was this > exact model. Since I now have access to affected hardware I decided to > take a shot at fixing this. > > In the discussion you suggested to split the acpi_bus_scan of the root > into 2 steps, first scan devices with an empty _DEP, putting other > acpi_handle-s on a list of deferred devices and then in step 2 scan the > rest. > > I'm happy to report that, at least on the affected device, this works > nicely. While working on this I came up with a less drastic way to > deal with this. As you will see in patch 4 of this series, I decided > to first add a more KISS method of deciding which devices to defer > to the second scan step by matching by HID. This has the disadvantage > of not being a generic solution. But it has the advantage of being a > solution which does not potentially regress other devices. > > Then in patch 5 I actually do add the option to defer or not based on > _DEP being empty. I've put this behind a kernel commandline option as > I'm not sure we should do this everywhere by default. At least no without > a lot more testing. > > Patch 6 fixes an issue with patch 5 which causes battery devices to stop > working. > > And patch 7 adds some extra HIDs to the list of HIDs which should be > ignored when checking if the _DEP list is empty from Linux' pov, iow > some extra HIDs which Linux does not bind to. > > Please let me know what you think about this patch-set. I would be happy > to see just patches 1-4 merged. I took patches 1 and 2, because IMO they are generally useful (I rewrote the changelogs to avoid mentioning the rest of the series though), but I have some reservations regarding the rest. First off, I'm not really sure if failing acpi_add_single_object() for devices with missing dependencies is a good idea. IIRC there is nothing in there that should depend on any opregions supplied by the other devices, so it should be safe to allow it to complete. That, in turn, will allow the flags in struct acpi_device to be used to mark the "deferred" devices without allocating more memory. Next, in theory, devices with dependencies may also appear during hotplug, so it would be prudent to handle that on every invocation of acpi_bus_scan() and not just when it runs for the root object. So my approach would be to allow the first namespace walk in acpi_bus_scan() to complete, change acpi_bus_attach() to optionally skip the devices with missing dependencies and return a result indicating whether or not it has set flags.visited for any devices and run it in a loop on the "root" device object until it says that no new devices have been "attached". Let me cut a prototype patch for that and get back to you.
On Sat, Nov 21, 2020 at 9:31 PM Hans de Goede <hdegoede@redhat.com> wrote: > > When battery devices get added during the second step of the now 2 step > scan-process, then acpi_walk_dep_device_list() may have already been > called for some deps during the first step. > > In this case acpi_device_dep_initialize() should not add these deps to > the acpi_dep_list; and it should not increase adev->dep_unmet. > > Add a check for already registered (and bound to a driver) devices to > acpi_device_dep_initialize(). This fixes battery devices (which honor the > dep_unmet value) not working with acpi.defer_scan_based_on_dep=1. I'd rather avoid having to fix this issue at all ...
On Wednesday, December 2, 2020 2:49:17 PM CET Rafael J. Wysocki wrote: > On Sat, Nov 21, 2020 at 9:30 PM Hans de Goede <hdegoede@redhat.com> wrote: > > > > Hi Rafael, > > > > A while ago (almost 2 years ago) I discussed an issue with you about > > some devices, where some of the methods used during device-addition > > (such as _HID) may rely on OpRegions of other devices: > > > > https://www.spinics.net/lists/linux-acpi/msg86303.html > > > > An example of this is the Acer Switch 10E SW3-016 model. The _HID method > > of the ACPI node for the UART attached Bluetooth, reads GPIOs to detect > > the installed wifi chip and update the _HID for the Bluetooth's ACPI node > > accordingly. The current ACPI scan code calls _HID before the GPIO > > controller's OpRegions are available, leading to the wrong _HID being > > used and Bluetooth not working. > > > > Last week I bought a second hand Acer device, not knowing it was this > > exact model. Since I now have access to affected hardware I decided to > > take a shot at fixing this. > > > > In the discussion you suggested to split the acpi_bus_scan of the root > > into 2 steps, first scan devices with an empty _DEP, putting other > > acpi_handle-s on a list of deferred devices and then in step 2 scan the > > rest. > > > > I'm happy to report that, at least on the affected device, this works > > nicely. While working on this I came up with a less drastic way to > > deal with this. As you will see in patch 4 of this series, I decided > > to first add a more KISS method of deciding which devices to defer > > to the second scan step by matching by HID. This has the disadvantage > > of not being a generic solution. But it has the advantage of being a > > solution which does not potentially regress other devices. > > > > Then in patch 5 I actually do add the option to defer or not based on > > _DEP being empty. I've put this behind a kernel commandline option as > > I'm not sure we should do this everywhere by default. At least no without > > a lot more testing. > > > > Patch 6 fixes an issue with patch 5 which causes battery devices to stop > > working. > > > > And patch 7 adds some extra HIDs to the list of HIDs which should be > > ignored when checking if the _DEP list is empty from Linux' pov, iow > > some extra HIDs which Linux does not bind to. > > > > Please let me know what you think about this patch-set. I would be happy > > to see just patches 1-4 merged. > > I took patches 1 and 2, because IMO they are generally useful (I > rewrote the changelogs to avoid mentioning the rest of the series > though), but I have some reservations regarding the rest. > > First off, I'm not really sure if failing acpi_add_single_object() for > devices with missing dependencies is a good idea. IIRC there is > nothing in there that should depend on any opregions supplied by the > other devices, so it should be safe to allow it to complete. That, in > turn, will allow the flags in struct acpi_device to be used to mark > the "deferred" devices without allocating more memory. > > Next, in theory, devices with dependencies may also appear during > hotplug, so it would be prudent to handle that on every invocation of > acpi_bus_scan() and not just when it runs for the root object. > > So my approach would be to allow the first namespace walk in > acpi_bus_scan() to complete, change acpi_bus_attach() to optionally > skip the devices with missing dependencies and return a result > indicating whether or not it has set flags.visited for any devices and > run it in a loop on the "root" device object until it says that no new > devices have been "attached". > > Let me cut a prototype patch for that and get back to you. Maybe something like the patch below (untested). I borrowed a few items from your patches, hopefully not a problem. The multiple passes idea would require using a static variable which would be slightly inelegant, so this assumes that two passes should be sufficient. --- drivers/acpi/scan.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 52 insertions(+), 3 deletions(-) Index: linux-pm/drivers/acpi/scan.c =================================================================== --- linux-pm.orig/drivers/acpi/scan.c +++ linux-pm/drivers/acpi/scan.c @@ -1961,12 +1961,50 @@ static int acpi_scan_attach_handler(stru return ret; } -static void acpi_bus_attach(struct acpi_device *device) +/* + * List of IDs for which we defer adding them to the second pass of the + * scanning, because some of their methods used during addition depend on + * OpRegions registered by the drivers for other ACPI devices. + */ +static const struct acpi_device_id acpi_defer_attach_ids[] = { + { "BCM2E5B", 0 }, /* Acer SW3-016 bluetooth vs GPIO OpRegs */ + {"", 0}, +}; + +static bool acpi_scan_should_defer_attach(struct acpi_device *adev) +{ + if (acpi_match_device_ids(adev, acpi_defer_attach_ids)) + return true; + + /* + * We check for _ADR here to avoid deferring the adding of the following: + * 1. PCI devices + * 2. ACPI nodes describing USB ports + * Note checking for _ADR catches more then just these cases ... + */ + if (adev->pnp.type.bus_address) + return false; + + return adev->dep_unmet > 0; +} + +static void __acpi_bus_attach(struct acpi_device *device, bool first_pass) { struct acpi_device *child; acpi_handle ejd; int ret; + if (first_pass) { + if (acpi_scan_should_defer_attach(device)) + return; + } else if (device->flags.visited) { + /* + * This is not the first pass in the given scan and the device + * has been "attached" already, so get to the children right away. + */ + goto ok; + } + if (ACPI_SUCCESS(acpi_bus_get_ejd(device->handle, &ejd))) register_dock_dependent_device(device, ejd); @@ -2013,12 +2051,23 @@ static void acpi_bus_attach(struct acpi_ ok: list_for_each_entry(child, &device->children, node) - acpi_bus_attach(child); + __acpi_bus_attach(child, first_pass); - if (device->handler && device->handler->hotplug.notify_online) + if (first_pass && device->handler && + device->handler->hotplug.notify_online) device->handler->hotplug.notify_online(device); } +static void acpi_bus_attach(struct acpi_device *device) +{ + /* + * Assume two passes to be sufficient to satisfy all of the operation + * region dependencies. + */ + __acpi_bus_attach(device, true); + __acpi_bus_attach(device, false); +} + void acpi_walk_dep_device_list(acpi_handle handle) { struct acpi_dep_data *dep, *tmp;
Hi, On 12/2/20 2:49 PM, Rafael J. Wysocki wrote: > On Sat, Nov 21, 2020 at 9:30 PM Hans de Goede <hdegoede@redhat.com> wrote: >> >> Hi Rafael, >> >> A while ago (almost 2 years ago) I discussed an issue with you about >> some devices, where some of the methods used during device-addition >> (such as _HID) may rely on OpRegions of other devices: >> >> https://www.spinics.net/lists/linux-acpi/msg86303.html >> >> An example of this is the Acer Switch 10E SW3-016 model. The _HID method >> of the ACPI node for the UART attached Bluetooth, reads GPIOs to detect >> the installed wifi chip and update the _HID for the Bluetooth's ACPI node >> accordingly. The current ACPI scan code calls _HID before the GPIO >> controller's OpRegions are available, leading to the wrong _HID being >> used and Bluetooth not working. >> >> Last week I bought a second hand Acer device, not knowing it was this >> exact model. Since I now have access to affected hardware I decided to >> take a shot at fixing this. >> >> In the discussion you suggested to split the acpi_bus_scan of the root >> into 2 steps, first scan devices with an empty _DEP, putting other >> acpi_handle-s on a list of deferred devices and then in step 2 scan the >> rest. >> >> I'm happy to report that, at least on the affected device, this works >> nicely. While working on this I came up with a less drastic way to >> deal with this. As you will see in patch 4 of this series, I decided >> to first add a more KISS method of deciding which devices to defer >> to the second scan step by matching by HID. This has the disadvantage >> of not being a generic solution. But it has the advantage of being a >> solution which does not potentially regress other devices. >> >> Then in patch 5 I actually do add the option to defer or not based on >> _DEP being empty. I've put this behind a kernel commandline option as >> I'm not sure we should do this everywhere by default. At least no without >> a lot more testing. >> >> Patch 6 fixes an issue with patch 5 which causes battery devices to stop >> working. >> >> And patch 7 adds some extra HIDs to the list of HIDs which should be >> ignored when checking if the _DEP list is empty from Linux' pov, iow >> some extra HIDs which Linux does not bind to. >> >> Please let me know what you think about this patch-set. I would be happy >> to see just patches 1-4 merged. > > I took patches 1 and 2, because IMO they are generally useful (I > rewrote the changelogs to avoid mentioning the rest of the series > though), That is fine. Thanks for taking those. > but I have some reservations regarding the rest. > > First off, I'm not really sure if failing acpi_add_single_object() for > devices with missing dependencies is a good idea. IIRC there is > nothing in there that should depend on any opregions supplied by the > other devices, so it should be safe to allow it to complete. Actually acpi_add_single_object() does depend on ACPI methods which may depend on opregions, that is the whole reason why this series is necessary. Otherwise we could just delay the binding of the driver based in dep_unmet which would be easier. Here are 2 actual examples of acpi_add_single_object() calling ACPI methods which may depend on opregions: 1. acpi_add_single_object() calls acpi_init_device_object() which calls acpi_set_pnp_ids() which fills a bunch if fields of struct acpi_device with info returned by the acpi_get_object_info() call. Specifically it stores the value returned by the _HID method in the acpi_device_pnp array for the device and that _HID method is actually the problem in the example device which started this series. The _HID method of the bluetooth device reads 2 GPIOs to get a hw-id (0-3) and then translates the hwid to a _HID string. If the GPIO opregion's handlers have not registered yet then the reading of the GPIOs is correctly skipped, and hwid 0 is assumed, which is wrong in this case. 2. I've also seen examples where _STA depends on GPIOs in a similar manner; and acpi_add_single_object() calls acpi_bus_get_power_flags() which calls acpi_bus_init_power() which calls acpi_device_is_present() which depends on _STA results. > That, in > turn, will allow the flags in struct acpi_device to be used to mark > the "deferred" devices without allocating more memory. See above, we would need to either redo a bunch of the device addition; and/or potentially even undo it in the case of _STA changing from present -> not present once the OpRegions have registered. > Next, in theory, devices with dependencies may also appear during > hotplug, so it would be prudent to handle that on every invocation of > acpi_bus_scan() and not just when it runs for the root object. > > So my approach would be to allow the first namespace walk in > acpi_bus_scan() to complete, change acpi_bus_attach() to optionally > skip the devices with missing dependencies and return a result > indicating whether or not it has set flags.visited for any devices and > run it in a loop on the "root" device object until it says that no new > devices have been "attached". > > Let me cut a prototype patch for that and get back to you. See above for why the prototype patch will not work. By the time acpi_bus_attach() runs, the wrong HID has already been stored by acpi_set_pnp_ids(). Note I'm not saying that your approach cannot work, we could e.g. re-fetch the HID before running bus_attach. But once we start doing extra work, replacing fields in the earlier instantiated acpi_device, then there is not that much difference between the 2 approaches and then the question becomes which way is cleaner ? I still favor my own approach because that simply delays the entire instantiation, rather then doing it when we don't have all the _DEPs yet and then "patching up" the acpi_device later. Note I did consider the "patching up" approach, but I rejected it (*). The patching-up approach feels fragile / more error-prone to me, which is why I chose to simply defer the entire instantiation. Regards, Hans *) In hindsight I should have probably written that down somewhere, but the whole though process behind my choices it is not something which I could quickly describe in 1 or 2 paragraphs in the commit message. Some other notes about this: 1. I considered the patching-up approach, but I did not implement it. 2. Another reason for delaying the entire instantiation is that we have some quirk handling in various places which relies on _HID / on the pnp-ids and since those are still in flux when the deps have not been met yet, some of that quirk handling could (theoretically) also go wonky if we instantiate the device too soon.
On Wednesday, December 2, 2020 4:51:59 PM CET Rafael J. Wysocki wrote: > On Wednesday, December 2, 2020 2:49:17 PM CET Rafael J. Wysocki wrote: > > On Sat, Nov 21, 2020 at 9:30 PM Hans de Goede <hdegoede@redhat.com> wrote: > > > > > > Hi Rafael, > > > > > > A while ago (almost 2 years ago) I discussed an issue with you about > > > some devices, where some of the methods used during device-addition > > > (such as _HID) may rely on OpRegions of other devices: > > > > > > https://www.spinics.net/lists/linux-acpi/msg86303.html > > > > > > An example of this is the Acer Switch 10E SW3-016 model. The _HID method > > > of the ACPI node for the UART attached Bluetooth, reads GPIOs to detect > > > the installed wifi chip and update the _HID for the Bluetooth's ACPI node > > > accordingly. The current ACPI scan code calls _HID before the GPIO > > > controller's OpRegions are available, leading to the wrong _HID being > > > used and Bluetooth not working. > > > > > > Last week I bought a second hand Acer device, not knowing it was this > > > exact model. Since I now have access to affected hardware I decided to > > > take a shot at fixing this. > > > > > > In the discussion you suggested to split the acpi_bus_scan of the root > > > into 2 steps, first scan devices with an empty _DEP, putting other > > > acpi_handle-s on a list of deferred devices and then in step 2 scan the > > > rest. > > > > > > I'm happy to report that, at least on the affected device, this works > > > nicely. While working on this I came up with a less drastic way to > > > deal with this. As you will see in patch 4 of this series, I decided > > > to first add a more KISS method of deciding which devices to defer > > > to the second scan step by matching by HID. This has the disadvantage > > > of not being a generic solution. But it has the advantage of being a > > > solution which does not potentially regress other devices. > > > > > > Then in patch 5 I actually do add the option to defer or not based on > > > _DEP being empty. I've put this behind a kernel commandline option as > > > I'm not sure we should do this everywhere by default. At least no without > > > a lot more testing. > > > > > > Patch 6 fixes an issue with patch 5 which causes battery devices to stop > > > working. > > > > > > And patch 7 adds some extra HIDs to the list of HIDs which should be > > > ignored when checking if the _DEP list is empty from Linux' pov, iow > > > some extra HIDs which Linux does not bind to. > > > > > > Please let me know what you think about this patch-set. I would be happy > > > to see just patches 1-4 merged. > > > > I took patches 1 and 2, because IMO they are generally useful (I > > rewrote the changelogs to avoid mentioning the rest of the series > > though), but I have some reservations regarding the rest. > > > > First off, I'm not really sure if failing acpi_add_single_object() for > > devices with missing dependencies is a good idea. IIRC there is > > nothing in there that should depend on any opregions supplied by the > > other devices, so it should be safe to allow it to complete. That, in > > turn, will allow the flags in struct acpi_device to be used to mark > > the "deferred" devices without allocating more memory. > > > > Next, in theory, devices with dependencies may also appear during > > hotplug, so it would be prudent to handle that on every invocation of > > acpi_bus_scan() and not just when it runs for the root object. > > > > So my approach would be to allow the first namespace walk in > > acpi_bus_scan() to complete, change acpi_bus_attach() to optionally > > skip the devices with missing dependencies and return a result > > indicating whether or not it has set flags.visited for any devices and > > run it in a loop on the "root" device object until it says that no new > > devices have been "attached". > > > > Let me cut a prototype patch for that and get back to you. > > Maybe something like the patch below (untested). I borrowed a few items from > your patches, hopefully not a problem. > > The multiple passes idea would require using a static variable which would > be slightly inelegant, so this assumes that two passes should be sufficient. > An update. This one has been lightly tested, but it doesn't make any practical difference on the system where it was run AFAICS. I found a missing ! in acpi_scan_should_defer_attach() and then realized that looking for _ADR wasn't really necessary, because _ADR devices should not be affected by this in a meaningful way anyway (scan handlers and ACPI drivers match on the _HID and/or _CID basis and the status check/power up in __acpi_bus_attach() should be skipped for them, because they may be "glued" to their "physical" counterparts before this code runs even - looks like a bug). --- drivers/acpi/scan.c | 47 ++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 44 insertions(+), 3 deletions(-) Index: linux-pm/drivers/acpi/scan.c =================================================================== --- linux-pm.orig/drivers/acpi/scan.c +++ linux-pm/drivers/acpi/scan.c @@ -1979,12 +1979,42 @@ static int acpi_scan_attach_handler(stru return ret; } -static void acpi_bus_attach(struct acpi_device *device) +/* + * List of IDs for which we defer adding them to the second pass of the + * scanning, because some of their methods used during addition depend on + * OpRegions registered by the drivers for other ACPI devices. + */ +static const struct acpi_device_id acpi_defer_attach_ids[] = { + { "BCM2E5B", 0 }, /* Acer SW3-016 bluetooth vs GPIO OpRegs */ + {"", 0}, +}; + +static bool acpi_scan_should_defer_attach(struct acpi_device *adev) +{ + if (!acpi_match_device_ids(adev, acpi_defer_attach_ids)) + return true; + + return adev->dep_unmet > 0; +} + +static void __acpi_bus_attach(struct acpi_device *device, bool first_pass) { struct acpi_device *child; acpi_handle ejd; int ret; + if (first_pass) { + if (acpi_scan_should_defer_attach(device)) + return; + } else if (device->flags.visited) { + /* + * This is not the first pass in the given scan and the device + * has been "attached" already, so get to the children right + * away. + */ + goto ok; + } + if (ACPI_SUCCESS(acpi_bus_get_ejd(device->handle, &ejd))) register_dock_dependent_device(device, ejd); @@ -2031,12 +2061,23 @@ static void acpi_bus_attach(struct acpi_ ok: list_for_each_entry(child, &device->children, node) - acpi_bus_attach(child); + __acpi_bus_attach(child, first_pass); - if (device->handler && device->handler->hotplug.notify_online) + if (first_pass && device->handler && + device->handler->hotplug.notify_online) device->handler->hotplug.notify_online(device); } +static void acpi_bus_attach(struct acpi_device *device) +{ + /* + * Assume two passes to be sufficient to satisfy all of the operation + * region dependencies. + */ + __acpi_bus_attach(device, true); + __acpi_bus_attach(device, false); +} + void acpi_walk_dep_device_list(acpi_handle handle) { struct acpi_dep_data *dep, *tmp;
On Wed, Dec 2, 2020 at 8:39 PM Hans de Goede <hdegoede@redhat.com> wrote: > > Hi, > > On 12/2/20 2:49 PM, Rafael J. Wysocki wrote: > > On Sat, Nov 21, 2020 at 9:30 PM Hans de Goede <hdegoede@redhat.com> wrote: > >> > >> Hi Rafael, > >> > >> A while ago (almost 2 years ago) I discussed an issue with you about > >> some devices, where some of the methods used during device-addition > >> (such as _HID) may rely on OpRegions of other devices: > >> > >> https://www.spinics.net/lists/linux-acpi/msg86303.html > >> > >> An example of this is the Acer Switch 10E SW3-016 model. The _HID method > >> of the ACPI node for the UART attached Bluetooth, reads GPIOs to detect > >> the installed wifi chip and update the _HID for the Bluetooth's ACPI node > >> accordingly. The current ACPI scan code calls _HID before the GPIO > >> controller's OpRegions are available, leading to the wrong _HID being > >> used and Bluetooth not working. > >> > >> Last week I bought a second hand Acer device, not knowing it was this > >> exact model. Since I now have access to affected hardware I decided to > >> take a shot at fixing this. > >> > >> In the discussion you suggested to split the acpi_bus_scan of the root > >> into 2 steps, first scan devices with an empty _DEP, putting other > >> acpi_handle-s on a list of deferred devices and then in step 2 scan the > >> rest. > >> > >> I'm happy to report that, at least on the affected device, this works > >> nicely. While working on this I came up with a less drastic way to > >> deal with this. As you will see in patch 4 of this series, I decided > >> to first add a more KISS method of deciding which devices to defer > >> to the second scan step by matching by HID. This has the disadvantage > >> of not being a generic solution. But it has the advantage of being a > >> solution which does not potentially regress other devices. > >> > >> Then in patch 5 I actually do add the option to defer or not based on > >> _DEP being empty. I've put this behind a kernel commandline option as > >> I'm not sure we should do this everywhere by default. At least no without > >> a lot more testing. > >> > >> Patch 6 fixes an issue with patch 5 which causes battery devices to stop > >> working. > >> > >> And patch 7 adds some extra HIDs to the list of HIDs which should be > >> ignored when checking if the _DEP list is empty from Linux' pov, iow > >> some extra HIDs which Linux does not bind to. > >> > >> Please let me know what you think about this patch-set. I would be happy > >> to see just patches 1-4 merged. > > > > I took patches 1 and 2, because IMO they are generally useful (I > > rewrote the changelogs to avoid mentioning the rest of the series > > though), > > That is fine. Thanks for taking those. > > > but I have some reservations regarding the rest. > > > > First off, I'm not really sure if failing acpi_add_single_object() for > > devices with missing dependencies is a good idea. IIRC there is > > nothing in there that should depend on any opregions supplied by the > > other devices, so it should be safe to allow it to complete. > > Actually acpi_add_single_object() does depend on ACPI methods > which may depend on opregions, that is the whole reason why > this series is necessary. Otherwise we could just delay the > binding of the driver based in dep_unmet which would be easier. > > Here are 2 actual examples of acpi_add_single_object() calling > ACPI methods which may depend on opregions: > > 1. acpi_add_single_object() calls acpi_init_device_object() which > calls acpi_set_pnp_ids() which fills a bunch if fields of > struct acpi_device with info returned by the acpi_get_object_info() > call. > > Specifically it stores the value returned by the _HID method in > the acpi_device_pnp array for the device and that _HID method is > actually the problem in the example device which started this > series. The _HID method of the bluetooth device reads 2 GPIOs > to get a hw-id (0-3) and then translates the hwid to a _HID > string. If the GPIO opregion's handlers have not registered yet > then the reading of the GPIOs is correctly skipped, and hwid > 0 is assumed, which is wrong in this case. > > 2. I've also seen examples where _STA depends on GPIOs in a similar > manner; and acpi_add_single_object() calls acpi_bus_get_power_flags() > which calls acpi_bus_init_power() which calls acpi_device_is_present() > which depends on _STA results. Well, this means that there is a bug in acpi_bus_attach() which shouldn't call acpi_bus_init_power() which has been called already. And it all means that either deferring acpi_add_single_object() is needed and so there need to be 2 passes in acpi_bus_attach() overall, or acpi_add_single_object() needs to avoid calling methods that may depend on supplied opregions. I guess the latter is rather unrealistic, so the only practical choice is the former. However, I still don't think that the extra list of "dependent devices" is needed.
Hi, On 12/2/20 8:57 PM, Rafael J. Wysocki wrote: > On Wed, Dec 2, 2020 at 8:39 PM Hans de Goede <hdegoede@redhat.com> wrote: >> >> Hi, >> >> On 12/2/20 2:49 PM, Rafael J. Wysocki wrote: >>> On Sat, Nov 21, 2020 at 9:30 PM Hans de Goede <hdegoede@redhat.com> wrote: >>>> >>>> Hi Rafael, >>>> >>>> A while ago (almost 2 years ago) I discussed an issue with you about >>>> some devices, where some of the methods used during device-addition >>>> (such as _HID) may rely on OpRegions of other devices: >>>> >>>> https://www.spinics.net/lists/linux-acpi/msg86303.html >>>> >>>> An example of this is the Acer Switch 10E SW3-016 model. The _HID method >>>> of the ACPI node for the UART attached Bluetooth, reads GPIOs to detect >>>> the installed wifi chip and update the _HID for the Bluetooth's ACPI node >>>> accordingly. The current ACPI scan code calls _HID before the GPIO >>>> controller's OpRegions are available, leading to the wrong _HID being >>>> used and Bluetooth not working. >>>> >>>> Last week I bought a second hand Acer device, not knowing it was this >>>> exact model. Since I now have access to affected hardware I decided to >>>> take a shot at fixing this. >>>> >>>> In the discussion you suggested to split the acpi_bus_scan of the root >>>> into 2 steps, first scan devices with an empty _DEP, putting other >>>> acpi_handle-s on a list of deferred devices and then in step 2 scan the >>>> rest. >>>> >>>> I'm happy to report that, at least on the affected device, this works >>>> nicely. While working on this I came up with a less drastic way to >>>> deal with this. As you will see in patch 4 of this series, I decided >>>> to first add a more KISS method of deciding which devices to defer >>>> to the second scan step by matching by HID. This has the disadvantage >>>> of not being a generic solution. But it has the advantage of being a >>>> solution which does not potentially regress other devices. >>>> >>>> Then in patch 5 I actually do add the option to defer or not based on >>>> _DEP being empty. I've put this behind a kernel commandline option as >>>> I'm not sure we should do this everywhere by default. At least no without >>>> a lot more testing. >>>> >>>> Patch 6 fixes an issue with patch 5 which causes battery devices to stop >>>> working. >>>> >>>> And patch 7 adds some extra HIDs to the list of HIDs which should be >>>> ignored when checking if the _DEP list is empty from Linux' pov, iow >>>> some extra HIDs which Linux does not bind to. >>>> >>>> Please let me know what you think about this patch-set. I would be happy >>>> to see just patches 1-4 merged. >>> >>> I took patches 1 and 2, because IMO they are generally useful (I >>> rewrote the changelogs to avoid mentioning the rest of the series >>> though), >> >> That is fine. Thanks for taking those. >> >>> but I have some reservations regarding the rest. >>> >>> First off, I'm not really sure if failing acpi_add_single_object() for >>> devices with missing dependencies is a good idea. IIRC there is >>> nothing in there that should depend on any opregions supplied by the >>> other devices, so it should be safe to allow it to complete. >> >> Actually acpi_add_single_object() does depend on ACPI methods >> which may depend on opregions, that is the whole reason why >> this series is necessary. Otherwise we could just delay the >> binding of the driver based in dep_unmet which would be easier. >> >> Here are 2 actual examples of acpi_add_single_object() calling >> ACPI methods which may depend on opregions: >> >> 1. acpi_add_single_object() calls acpi_init_device_object() which >> calls acpi_set_pnp_ids() which fills a bunch if fields of >> struct acpi_device with info returned by the acpi_get_object_info() >> call. >> >> Specifically it stores the value returned by the _HID method in >> the acpi_device_pnp array for the device and that _HID method is >> actually the problem in the example device which started this >> series. The _HID method of the bluetooth device reads 2 GPIOs >> to get a hw-id (0-3) and then translates the hwid to a _HID >> string. If the GPIO opregion's handlers have not registered yet >> then the reading of the GPIOs is correctly skipped, and hwid >> 0 is assumed, which is wrong in this case. >> >> 2. I've also seen examples where _STA depends on GPIOs in a similar >> manner; and acpi_add_single_object() calls acpi_bus_get_power_flags() >> which calls acpi_bus_init_power() which calls acpi_device_is_present() >> which depends on _STA results. > > Well, this means that there is a bug in acpi_bus_attach() which > shouldn't call acpi_bus_init_power() which has been called already. I'm afraid we have a bit of a misunderstanding here, the problem is not that acpi_bus_attach() calls acpi_bus_init_power(), the problem is that acpi_bus_init_power() (which is called from acpi_add_single_object()) depends on the value returned by _STA and that in turn may depend on some OpRegions being available. IOW it is the same problem as the _HID problem. > And it all means that either deferring acpi_add_single_object() is > needed and so there need to be 2 passes in acpi_bus_attach() overall, > or acpi_add_single_object() needs to avoid calling methods that may > depend on supplied opregions. I guess the latter is rather > unrealistic, so the only practical choice is the former. I agree. > However, I still don't think that the extra list of "dependent > devices" is needed. I'm not sure what you are trying to say here? Do you mean this list: +/* + * List of HIDs for which we defer adding them to the second step of the + * scanning of the root, because some of their methods used during addition + * depend on OpRegions registered by the drivers for other ACPI devices. + */ +static const char * const acpi_defer_add_hids[] = { + "BCM2E5B", /* Acer SW3-016 bluetooth HID when GPIO OpRegs or not available yet */ + NULL +}; + ? That indeed is not necessary if you take the entire set and always enable the new behavior instead of using the module option. I guess we could go that route for 5.12, and get it into next as soon as 5.11-rc1 is available for maximum testing. Do you want me to do a new version of the series which drops the acpi_defer_add_hids[] thing and the module option and simply always uses the new behavior? Regards, Hans
On Thu, Dec 3, 2020 at 10:53 AM Hans de Goede <hdegoede@redhat.com> wrote: > > Hi, > > On 12/2/20 8:57 PM, Rafael J. Wysocki wrote: > > On Wed, Dec 2, 2020 at 8:39 PM Hans de Goede <hdegoede@redhat.com> wrote: > >> > >> Hi, > >> > >> On 12/2/20 2:49 PM, Rafael J. Wysocki wrote: > >>> On Sat, Nov 21, 2020 at 9:30 PM Hans de Goede <hdegoede@redhat.com> wrote: > >>>> > >>>> Hi Rafael, > >>>> > >>>> A while ago (almost 2 years ago) I discussed an issue with you about > >>>> some devices, where some of the methods used during device-addition > >>>> (such as _HID) may rely on OpRegions of other devices: > >>>> > >>>> https://www.spinics.net/lists/linux-acpi/msg86303.html > >>>> > >>>> An example of this is the Acer Switch 10E SW3-016 model. The _HID method > >>>> of the ACPI node for the UART attached Bluetooth, reads GPIOs to detect > >>>> the installed wifi chip and update the _HID for the Bluetooth's ACPI node > >>>> accordingly. The current ACPI scan code calls _HID before the GPIO > >>>> controller's OpRegions are available, leading to the wrong _HID being > >>>> used and Bluetooth not working. > >>>> > >>>> Last week I bought a second hand Acer device, not knowing it was this > >>>> exact model. Since I now have access to affected hardware I decided to > >>>> take a shot at fixing this. > >>>> > >>>> In the discussion you suggested to split the acpi_bus_scan of the root > >>>> into 2 steps, first scan devices with an empty _DEP, putting other > >>>> acpi_handle-s on a list of deferred devices and then in step 2 scan the > >>>> rest. > >>>> > >>>> I'm happy to report that, at least on the affected device, this works > >>>> nicely. While working on this I came up with a less drastic way to > >>>> deal with this. As you will see in patch 4 of this series, I decided > >>>> to first add a more KISS method of deciding which devices to defer > >>>> to the second scan step by matching by HID. This has the disadvantage > >>>> of not being a generic solution. But it has the advantage of being a > >>>> solution which does not potentially regress other devices. > >>>> > >>>> Then in patch 5 I actually do add the option to defer or not based on > >>>> _DEP being empty. I've put this behind a kernel commandline option as > >>>> I'm not sure we should do this everywhere by default. At least no without > >>>> a lot more testing. > >>>> > >>>> Patch 6 fixes an issue with patch 5 which causes battery devices to stop > >>>> working. > >>>> > >>>> And patch 7 adds some extra HIDs to the list of HIDs which should be > >>>> ignored when checking if the _DEP list is empty from Linux' pov, iow > >>>> some extra HIDs which Linux does not bind to. > >>>> > >>>> Please let me know what you think about this patch-set. I would be happy > >>>> to see just patches 1-4 merged. > >>> > >>> I took patches 1 and 2, because IMO they are generally useful (I > >>> rewrote the changelogs to avoid mentioning the rest of the series > >>> though), > >> > >> That is fine. Thanks for taking those. > >> > >>> but I have some reservations regarding the rest. > >>> > >>> First off, I'm not really sure if failing acpi_add_single_object() for > >>> devices with missing dependencies is a good idea. IIRC there is > >>> nothing in there that should depend on any opregions supplied by the > >>> other devices, so it should be safe to allow it to complete. > >> > >> Actually acpi_add_single_object() does depend on ACPI methods > >> which may depend on opregions, that is the whole reason why > >> this series is necessary. Otherwise we could just delay the > >> binding of the driver based in dep_unmet which would be easier. > >> > >> Here are 2 actual examples of acpi_add_single_object() calling > >> ACPI methods which may depend on opregions: > >> > >> 1. acpi_add_single_object() calls acpi_init_device_object() which > >> calls acpi_set_pnp_ids() which fills a bunch if fields of > >> struct acpi_device with info returned by the acpi_get_object_info() > >> call. > >> > >> Specifically it stores the value returned by the _HID method in > >> the acpi_device_pnp array for the device and that _HID method is > >> actually the problem in the example device which started this > >> series. The _HID method of the bluetooth device reads 2 GPIOs > >> to get a hw-id (0-3) and then translates the hwid to a _HID > >> string. If the GPIO opregion's handlers have not registered yet > >> then the reading of the GPIOs is correctly skipped, and hwid > >> 0 is assumed, which is wrong in this case. > >> > >> 2. I've also seen examples where _STA depends on GPIOs in a similar > >> manner; and acpi_add_single_object() calls acpi_bus_get_power_flags() > >> which calls acpi_bus_init_power() which calls acpi_device_is_present() > >> which depends on _STA results. > > > > Well, this means that there is a bug in acpi_bus_attach() which > > shouldn't call acpi_bus_init_power() which has been called already. > > I'm afraid we have a bit of a misunderstanding here, the problem is > not that acpi_bus_attach() calls acpi_bus_init_power(), the problem is > that acpi_bus_init_power() (which is called from acpi_add_single_object()) > depends on the value returned by _STA and that in turn may depend on > some OpRegions being available. IOW it is the same problem as the _HID > problem. This was a side note. Arguably, acpi_bus_init_power() should not be called twice in a row for the same device, which is actually done by the current code. Another side note: while we can avoid evaluating _STA for the devices whose enumeration we want to defer, avoiding to evaluate _HID (or _CID for that matter) for them will be rather hard. > > And it all means that either deferring acpi_add_single_object() is > > needed and so there need to be 2 passes in acpi_bus_attach() overall, > > or acpi_add_single_object() needs to avoid calling methods that may > > depend on supplied opregions. I guess the latter is rather > > unrealistic, so the only practical choice is the former. > > I agree. > > > However, I still don't think that the extra list of "dependent > > devices" is needed. > > I'm not sure what you are trying to say here? Do you mean this list: > > +/* > + * List of HIDs for which we defer adding them to the second step of the > + * scanning of the root, because some of their methods used during addition > + * depend on OpRegions registered by the drivers for other ACPI devices. > + */ > +static const char * const acpi_defer_add_hids[] = { > + "BCM2E5B", /* Acer SW3-016 bluetooth HID when GPIO OpRegs or not available yet */ > + NULL > +}; > + > > ? No, the one created by patch [4/7] in your series, acpi_deferred_devices. > That indeed is not necessary if you take the entire set and always enable the > new behavior instead of using the module option. I guess we could go that route > for 5.12, and get it into next as soon as 5.11-rc1 is available for maximum > testing. > > Do you want me to do a new version of the series which drops the acpi_defer_add_hids[] > thing and the module option and simply always uses the new behavior? No, something else. Stay tuned.
On Thursday, December 3, 2020 3:27:27 PM CET Rafael J. Wysocki wrote: > On Thu, Dec 3, 2020 at 10:53 AM Hans de Goede <hdegoede@redhat.com> wrote: [cut] > > That indeed is not necessary if you take the entire set and always enable the > > new behavior instead of using the module option. I guess we could go that route > > for 5.12, and get it into next as soon as 5.11-rc1 is available for maximum > > testing. I'd prefer to do the whole thing in 5.11-rc and possibly revert something if it causes problems to occur. > > Do you want me to do a new version of the series which drops the acpi_defer_add_hids[] > > thing and the module option and simply always uses the new behavior? > > No, something else. Stay tuned. The patch below illustrates what I'd like to do. It at least doesn't kill my test-bed system, but also it doesn't cause the enumeration ordering to change on that system. It really is three patches in one and (again) I borrowed some code from your patches in the $subject series. [It is on top of the "ACPI: scan: Add PNP0D80 to the _DEP exceptions list" patch I've just posted.] Please let me know what you think. --- drivers/acpi/scan.c | 141 ++++++++++++++++++++++++++++++++++++++++++++-------- 1 file changed, 120 insertions(+), 21 deletions(-) Index: linux-pm/drivers/acpi/scan.c =================================================================== --- linux-pm.orig/drivers/acpi/scan.c +++ linux-pm/drivers/acpi/scan.c @@ -1646,17 +1646,77 @@ void acpi_device_add_finalize(struct acp kobject_uevent(&device->dev.kobj, KOBJ_ADD); } -static int acpi_add_single_object(struct acpi_device **child, - acpi_handle handle, int type, - unsigned long long sta) +/* + * List of IDs for which we defer enumeration them to the second pass, because + * some of their methods used during addition depend on OpRegions registered by + * the drivers for other ACPI devices. + */ +static const char * const acpi_defer_enumeration_ids[] = { + "BCM2E5B", /* Acer SW3-016 bluetooth HID vs GPIO OpRegs */ + NULL +}; + +static bool acpi_should_defer_enumeration(acpi_handle handle, + struct acpi_device_info *info) +{ + struct acpi_handle_list dep_devices; + acpi_status status; + int i; + + if (acpi_info_matches_ids(info, acpi_defer_enumeration_ids)) + return true; + + /* + * We check for _HID here to avoid deferring the enumeration of: + * 1. PCI devices + * 2. ACPI nodes describing USB ports + * However, checking for _HID catches more then just these cases ... + */ + if (!(info->valid & ACPI_VALID_HID)) + return false; + + status = acpi_evaluate_reference(handle, "_DEP", NULL, &dep_devices); + if (ACPI_FAILURE(status)) + return false; + + for (i = 0; i < dep_devices.count; i++) { + struct acpi_device_info *dep_info; + bool ignore; + + status = acpi_get_object_info(dep_devices.handles[i], &dep_info); + if (ACPI_FAILURE(status)) + continue; + + ignore = acpi_info_matches_ids(dep_info, acpi_ignore_dep_ids); + kfree(dep_info); + + if (ignore) + continue; + + return true; + } + + return false; +} + +static int __acpi_add_single_object(struct acpi_device **child, + acpi_handle handle, int type, + unsigned long long sta, bool check_dep) { int result; struct acpi_device *device; struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL }; struct acpi_device_info *info = NULL; - if (handle != ACPI_ROOT_OBJECT && type == ACPI_BUS_TYPE_DEVICE) + if (handle != ACPI_ROOT_OBJECT && type == ACPI_BUS_TYPE_DEVICE) { acpi_get_object_info(handle, &info); + if (check_dep && info && + acpi_should_defer_enumeration(handle, info)) { + kfree(info); + acpi_handle_info(handle, "Missing dependencies\n"); + return -EAGAIN; + } + } device = kzalloc(sizeof(struct acpi_device), GFP_KERNEL); if (!device) { @@ -1696,6 +1756,13 @@ static int acpi_add_single_object(struct return 0; } +static int acpi_add_single_object(struct acpi_device **child, + acpi_handle handle, int type, + unsigned long long sta) +{ + return __acpi_add_single_object(child, handle, type, sta, false); +} + static acpi_status acpi_get_resource_memory(struct acpi_resource *ares, void *context) { @@ -1892,8 +1959,8 @@ static void acpi_device_dep_initialize(s } } -static acpi_status acpi_bus_check_add(acpi_handle handle, u32 lvl_not_used, - void *not_used, void **return_value) +static acpi_status acpi_bus_check_add(acpi_handle handle, bool check_dep, + struct acpi_device **adev_p) { struct acpi_device *device = NULL; int type; @@ -1913,7 +1980,7 @@ static acpi_status acpi_bus_check_add(ac return AE_OK; } - acpi_add_single_object(&device, handle, type, sta); + __acpi_add_single_object(&device, handle, type, sta, check_dep); if (!device) return AE_CTRL_DEPTH; @@ -1921,12 +1988,24 @@ static acpi_status acpi_bus_check_add(ac acpi_device_dep_initialize(device); out: - if (!*return_value) - *return_value = device; + if (!*adev_p) + *adev_p = device; return AE_OK; } +static acpi_status acpi_bus_check_add_1(acpi_handle handle, u32 lvl_not_used, + void *not_used, void **ret_p) +{ + return acpi_bus_check_add(handle, true, (struct acpi_device **)ret_p); +} + +static acpi_status acpi_bus_check_add_2(acpi_handle handle, u32 lvl_not_used, + void *not_used, void **ret_p) +{ + return acpi_bus_check_add(handle, false, (struct acpi_device **)ret_p); +} + static void acpi_default_enumeration(struct acpi_device *device) { /* @@ -1994,12 +2073,16 @@ static int acpi_scan_attach_handler(stru return ret; } -static void acpi_bus_attach(struct acpi_device *device) +static void acpi_bus_attach(struct acpi_device *device, bool first_pass) { struct acpi_device *child; + bool skip = !first_pass && device->flags.visited; acpi_handle ejd; int ret; + if (skip) + goto ok; + if (ACPI_SUCCESS(acpi_bus_get_ejd(device->handle, &ejd))) register_dock_dependent_device(device, ejd); @@ -2046,9 +2129,9 @@ static void acpi_bus_attach(struct acpi_ ok: list_for_each_entry(child, &device->children, node) - acpi_bus_attach(child); + acpi_bus_attach(child, first_pass); - if (device->handler && device->handler->hotplug.notify_online) + if (!skip && device->handler && device->handler->hotplug.notify_online) device->handler->hotplug.notify_online(device); } @@ -2066,7 +2149,8 @@ void acpi_walk_dep_device_list(acpi_hand adev->dep_unmet--; if (!adev->dep_unmet) - acpi_bus_attach(adev); + acpi_bus_attach(adev, true); + list_del(&dep->node); kfree(dep); } @@ -2091,17 +2175,32 @@ EXPORT_SYMBOL_GPL(acpi_walk_dep_device_l */ int acpi_bus_scan(acpi_handle handle) { - void *device = NULL; + struct acpi_device *device = NULL; + + /* Pass 1: Avoid enumerating devices with missing dependencies. */ - if (ACPI_SUCCESS(acpi_bus_check_add(handle, 0, NULL, &device))) + if (ACPI_SUCCESS(acpi_bus_check_add(handle, true, &device))) acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX, - acpi_bus_check_add, NULL, NULL, &device); + acpi_bus_check_add_1, NULL, NULL, + (void **)&device); - if (device) { - acpi_bus_attach(device); - return 0; - } - return -ENODEV; + if (!device) + return -ENODEV; + + acpi_bus_attach(device, true); + + /* Pass 2: Enumerate all of the remaining devices. */ + + device = NULL; + + if (ACPI_SUCCESS(acpi_bus_check_add(handle, false, &device))) + acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX, + acpi_bus_check_add_2, NULL, NULL, + (void **)&device); + + acpi_bus_attach(device, false); + + return 0; } EXPORT_SYMBOL(acpi_bus_scan);
Hi, On 12/5/20 4:44 PM, Rafael J. Wysocki wrote: > On Thursday, December 3, 2020 3:27:27 PM CET Rafael J. Wysocki wrote: >> On Thu, Dec 3, 2020 at 10:53 AM Hans de Goede <hdegoede@redhat.com> wrote: > > [cut] > >>> That indeed is not necessary if you take the entire set and always enable the >>> new behavior instead of using the module option. I guess we could go that route >>> for 5.12, and get it into next as soon as 5.11-rc1 is available for maximum >>> testing. > > I'd prefer to do the whole thing in 5.11-rc and possibly revert something if it > causes problems to occur. Ok, that is you call to make, so that is fine with me. >>> Do you want me to do a new version of the series which drops the acpi_defer_add_hids[] >>> thing and the module option and simply always uses the new behavior? >> >> No, something else. Stay tuned. > > The patch below illustrates what I'd like to do. It at least doesn't kill my > test-bed system, but also it doesn't cause the enumeration ordering to change > on that system. > > It really is three patches in one and (again) I borrowed some code from your > patches in the $subject series. Borrowing some of my code is fine, no worries about that. I'm happy as some fix for this gets upstream in some form :) > [It is on top of the "ACPI: scan: Add PNP0D80 > to the _DEP exceptions list" patch I've just posted.] > > > Please let me know what you think. I think this should work fine. I've 2 small remarks inline below, but nothing big / important. My list of kernel things to look into is longer then my available time (something which I assume you are familiar with), so for now I've only taken a good look at your proposed solution and not actually tested it. Let me know if you want me to give this a spin on the device with the _HID issue as is, or if you have something closer to a final version ready which you want me to test. > --- > drivers/acpi/scan.c | 141 ++++++++++++++++++++++++++++++++++++++++++++-------- > 1 file changed, 120 insertions(+), 21 deletions(-) > > Index: linux-pm/drivers/acpi/scan.c > =================================================================== > --- linux-pm.orig/drivers/acpi/scan.c > +++ linux-pm/drivers/acpi/scan.c > @@ -1646,17 +1646,77 @@ void acpi_device_add_finalize(struct acp > kobject_uevent(&device->dev.kobj, KOBJ_ADD); > } > > -static int acpi_add_single_object(struct acpi_device **child, > - acpi_handle handle, int type, > - unsigned long long sta) > +/* > + * List of IDs for which we defer enumeration them to the second pass, because > + * some of their methods used during addition depend on OpRegions registered by > + * the drivers for other ACPI devices. > + */ > +static const char * const acpi_defer_enumeration_ids[] = { > + "BCM2E5B", /* Acer SW3-016 bluetooth HID vs GPIO OpRegs */ > + NULL > +}; Note that since you defer if there are unmet _DEP-s, this won't be necessary: This list was purely there as a safer way to select devices which to defer, the kernel cmdline option in my patch-set would switch between either using this list, or checking _DEP. Since we are going to fully go for using _DEP now, this can be dropped. > + > +static bool acpi_should_defer_enumeration(acpi_handle handle, > + struct acpi_device_info *info) > +{ > + struct acpi_handle_list dep_devices; > + acpi_status status; > + int i; > + > + if (acpi_info_matches_ids(info, acpi_defer_enumeration_ids)) > + return true; > + > + /* > + * We check for _HID here to avoid deferring the enumeration of: > + * 1. PCI devices > + * 2. ACPI nodes describing USB ports > + * However, checking for _HID catches more then just these cases ... > + */ > + if (!(info->valid & ACPI_VALID_HID)) > + return false; Interesting approach (using _HID), I went with _ADR since _ADR indicates (more or less) that the ACPI fwnode is a companion node for a device which will be enumerated through other means (such as PCI devices), rather then one where the ACPI code will instantiate a platform_device, or i2c_client (etc) for it. Maybe if we want to play it safe check both, and if there either is no HID, or there is an ADR do not defer ? Note just thinking out loud here, I'm fine with either approach. > + > + status = acpi_evaluate_reference(handle, "_DEP", NULL, &dep_devices); > + if (ACPI_FAILURE(status)) > + return false; > + > + for (i = 0; i < dep_devices.count; i++) { > + struct acpi_device_info *dep_info; > + bool ignore; > + > + status = acpi_get_object_info(dep_devices.handles[i], &dep_info); > + if (ACPI_FAILURE(status)) > + continue; > + > + ignore = acpi_info_matches_ids(dep_info, acpi_ignore_dep_ids); > + kfree(dep_info); > + > + if (ignore) > + continue; > + > + return true; > + } > + > + return false; > +} > + > +static int __acpi_add_single_object(struct acpi_device **child, > + acpi_handle handle, int type, > + unsigned long long sta, bool check_dep) > { > int result; > struct acpi_device *device; > struct acpi_buffer buffer = { ACPI_ALLOCATE_BUFFER, NULL }; > struct acpi_device_info *info = NULL; > > - if (handle != ACPI_ROOT_OBJECT && type == ACPI_BUS_TYPE_DEVICE) > + if (handle != ACPI_ROOT_OBJECT && type == ACPI_BUS_TYPE_DEVICE) { > acpi_get_object_info(handle, &info); > + if (check_dep && info && > + acpi_should_defer_enumeration(handle, info)) { > + kfree(info); > + acpi_handle_info(handle, "Missing dependencies\n"); > + return -EAGAIN; > + } > + } > > device = kzalloc(sizeof(struct acpi_device), GFP_KERNEL); > if (!device) { > @@ -1696,6 +1756,13 @@ static int acpi_add_single_object(struct > return 0; > } > > +static int acpi_add_single_object(struct acpi_device **child, > + acpi_handle handle, int type, > + unsigned long long sta) > +{ > + return __acpi_add_single_object(child, handle, type, sta, false); > +} > + > static acpi_status acpi_get_resource_memory(struct acpi_resource *ares, > void *context) > { > @@ -1892,8 +1959,8 @@ static void acpi_device_dep_initialize(s > } > } > > -static acpi_status acpi_bus_check_add(acpi_handle handle, u32 lvl_not_used, > - void *not_used, void **return_value) > +static acpi_status acpi_bus_check_add(acpi_handle handle, bool check_dep, > + struct acpi_device **adev_p) > { > struct acpi_device *device = NULL; > int type; > @@ -1913,7 +1980,7 @@ static acpi_status acpi_bus_check_add(ac > return AE_OK; > } > > - acpi_add_single_object(&device, handle, type, sta); > + __acpi_add_single_object(&device, handle, type, sta, check_dep); > if (!device) > return AE_CTRL_DEPTH; > > @@ -1921,12 +1988,24 @@ static acpi_status acpi_bus_check_add(ac > acpi_device_dep_initialize(device); > > out: > - if (!*return_value) > - *return_value = device; > + if (!*adev_p) > + *adev_p = device; > > return AE_OK; > } > > +static acpi_status acpi_bus_check_add_1(acpi_handle handle, u32 lvl_not_used, > + void *not_used, void **ret_p) > +{ > + return acpi_bus_check_add(handle, true, (struct acpi_device **)ret_p); > +} > + > +static acpi_status acpi_bus_check_add_2(acpi_handle handle, u32 lvl_not_used, > + void *not_used, void **ret_p) > +{ > + return acpi_bus_check_add(handle, false, (struct acpi_device **)ret_p); > +} > + > static void acpi_default_enumeration(struct acpi_device *device) > { > /* > @@ -1994,12 +2073,16 @@ static int acpi_scan_attach_handler(stru > return ret; > } > > -static void acpi_bus_attach(struct acpi_device *device) > +static void acpi_bus_attach(struct acpi_device *device, bool first_pass) > { > struct acpi_device *child; > + bool skip = !first_pass && device->flags.visited; > acpi_handle ejd; > int ret; > > + if (skip) > + goto ok; > + > if (ACPI_SUCCESS(acpi_bus_get_ejd(device->handle, &ejd))) > register_dock_dependent_device(device, ejd); > > @@ -2046,9 +2129,9 @@ static void acpi_bus_attach(struct acpi_ > > ok: > list_for_each_entry(child, &device->children, node) > - acpi_bus_attach(child); > + acpi_bus_attach(child, first_pass); > > - if (device->handler && device->handler->hotplug.notify_online) > + if (!skip && device->handler && device->handler->hotplug.notify_online) > device->handler->hotplug.notify_online(device); > } > > @@ -2066,7 +2149,8 @@ void acpi_walk_dep_device_list(acpi_hand > > adev->dep_unmet--; > if (!adev->dep_unmet) > - acpi_bus_attach(adev); > + acpi_bus_attach(adev, true); > + > list_del(&dep->node); > kfree(dep); > } > @@ -2091,17 +2175,32 @@ EXPORT_SYMBOL_GPL(acpi_walk_dep_device_l > */ > int acpi_bus_scan(acpi_handle handle) > { > - void *device = NULL; > + struct acpi_device *device = NULL; > + > + /* Pass 1: Avoid enumerating devices with missing dependencies. */ > > - if (ACPI_SUCCESS(acpi_bus_check_add(handle, 0, NULL, &device))) > + if (ACPI_SUCCESS(acpi_bus_check_add(handle, true, &device))) > acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX, > - acpi_bus_check_add, NULL, NULL, &device); > + acpi_bus_check_add_1, NULL, NULL, > + (void **)&device); > > - if (device) { > - acpi_bus_attach(device); > - return 0; > - } > - return -ENODEV; > + if (!device) > + return -ENODEV; > + > + acpi_bus_attach(device, true); > + > + /* Pass 2: Enumerate all of the remaining devices. */ > + > + device = NULL; > + > + if (ACPI_SUCCESS(acpi_bus_check_add(handle, false, &device))) > + acpi_walk_namespace(ACPI_TYPE_ANY, handle, ACPI_UINT32_MAX, > + acpi_bus_check_add_2, NULL, NULL, > + (void **)&device); > + > + acpi_bus_attach(device, false); > + > + return 0; > } > EXPORT_SYMBOL(acpi_bus_scan); Regards, Hans
On Sat, Dec 5, 2020 at 6:03 PM Hans de Goede <hdegoede@redhat.com> wrote: > > Hi, > > On 12/5/20 4:44 PM, Rafael J. Wysocki wrote: > > On Thursday, December 3, 2020 3:27:27 PM CET Rafael J. Wysocki wrote: > >> On Thu, Dec 3, 2020 at 10:53 AM Hans de Goede <hdegoede@redhat.com> wrote: > > > > [cut] > > > >>> That indeed is not necessary if you take the entire set and always enable the > >>> new behavior instead of using the module option. I guess we could go that route > >>> for 5.12, and get it into next as soon as 5.11-rc1 is available for maximum > >>> testing. > > > > I'd prefer to do the whole thing in 5.11-rc and possibly revert something if it > > causes problems to occur. > > Ok, that is you call to make, so that is fine with me. > > >>> Do you want me to do a new version of the series which drops the acpi_defer_add_hids[] > >>> thing and the module option and simply always uses the new behavior? > >> > >> No, something else. Stay tuned. > > > > The patch below illustrates what I'd like to do. It at least doesn't kill my > > test-bed system, but also it doesn't cause the enumeration ordering to change > > on that system. > > > > It really is three patches in one and (again) I borrowed some code from your > > patches in the $subject series. > > Borrowing some of my code is fine, no worries about that. I'm happy as some > fix for this gets upstream in some form :) > > > [It is on top of the "ACPI: scan: Add PNP0D80 > > to the _DEP exceptions list" patch I've just posted.] > > > > > > Please let me know what you think. > > I think this should work fine. I've 2 small remarks inline below, but nothing > big / important. > > My list of kernel things to look into is longer then my available time > (something which I assume you are familiar with), so for now I've only taken > a good look at your proposed solution and not actually tested it. > > Let me know if you want me to give this a spin on the device with the _HID > issue as is, or if you have something closer to a final version ready > which you want me to test. I will, thanks! > > --- > > drivers/acpi/scan.c | 141 ++++++++++++++++++++++++++++++++++++++++++++-------- > > 1 file changed, 120 insertions(+), 21 deletions(-) > > > > Index: linux-pm/drivers/acpi/scan.c > > =================================================================== > > --- linux-pm.orig/drivers/acpi/scan.c > > +++ linux-pm/drivers/acpi/scan.c > > @@ -1646,17 +1646,77 @@ void acpi_device_add_finalize(struct acp > > kobject_uevent(&device->dev.kobj, KOBJ_ADD); > > } > > > > -static int acpi_add_single_object(struct acpi_device **child, > > - acpi_handle handle, int type, > > - unsigned long long sta) > > +/* > > + * List of IDs for which we defer enumeration them to the second pass, because > > + * some of their methods used during addition depend on OpRegions registered by > > + * the drivers for other ACPI devices. > > + */ > > +static const char * const acpi_defer_enumeration_ids[] = { > > + "BCM2E5B", /* Acer SW3-016 bluetooth HID vs GPIO OpRegs */ > > + NULL > > +}; > > Note that since you defer if there are unmet _DEP-s, this won't be necessary: > > This list was purely there as a safer way to select devices which to defer, > the kernel cmdline option in my patch-set would switch between either using > this list, or checking _DEP. Since we are going to fully go for using _DEP > now, this can be dropped. OK If that is the case, I'd prefer to check the _DEP list even earlier, possibly in acpi_bus_check_add(), so as to avoid having to evaluate _HID or _CID for devices with non-trivial _DEP lists (after taking acpi_ignore_dep_ids[] into account) in the first pass. > > + > > +static bool acpi_should_defer_enumeration(acpi_handle handle, > > + struct acpi_device_info *info) > > +{ > > + struct acpi_handle_list dep_devices; > > + acpi_status status; > > + int i; > > + > > + if (acpi_info_matches_ids(info, acpi_defer_enumeration_ids)) > > + return true; > > + > > + /* > > + * We check for _HID here to avoid deferring the enumeration of: > > + * 1. PCI devices > > + * 2. ACPI nodes describing USB ports > > + * However, checking for _HID catches more then just these cases ... > > + */ > > + if (!(info->valid & ACPI_VALID_HID)) > > + return false; > > Interesting approach (using _HID), I went with _ADR since _ADR indicates > (more or less) that the ACPI fwnode is a companion node for a device > which will be enumerated through other means (such as PCI devices), rather > then one where the ACPI code will instantiate a platform_device, or > i2c_client (etc) for it. > > Maybe if we want to play it safe check both, and if there either is no > HID, or there is an ADR do not defer ? Note just thinking out loud here, > I'm fine with either approach. By the spec checking _HID should be equivalent to checking _ADR (Section 6.1 of ACPI 6.3 says "A device object must contain either an _HID object or an _ADR, but should not contain both"), but the presence of _HID indicates that the device is expected to be enumerated via ACPI and so _DEP is more likely to really matter IMV.
Hi, On 12/7/20 6:27 PM, Rafael J. Wysocki wrote: > On Sat, Dec 5, 2020 at 6:03 PM Hans de Goede <hdegoede@redhat.com> wrote: >> >> Hi, >> >> On 12/5/20 4:44 PM, Rafael J. Wysocki wrote: >>> On Thursday, December 3, 2020 3:27:27 PM CET Rafael J. Wysocki wrote: >>>> On Thu, Dec 3, 2020 at 10:53 AM Hans de Goede <hdegoede@redhat.com> wrote: >>> >>> [cut] >>> >>>>> That indeed is not necessary if you take the entire set and always enable the >>>>> new behavior instead of using the module option. I guess we could go that route >>>>> for 5.12, and get it into next as soon as 5.11-rc1 is available for maximum >>>>> testing. >>> >>> I'd prefer to do the whole thing in 5.11-rc and possibly revert something if it >>> causes problems to occur. >> >> Ok, that is you call to make, so that is fine with me. >> >>>>> Do you want me to do a new version of the series which drops the acpi_defer_add_hids[] >>>>> thing and the module option and simply always uses the new behavior? >>>> >>>> No, something else. Stay tuned. >>> >>> The patch below illustrates what I'd like to do. It at least doesn't kill my >>> test-bed system, but also it doesn't cause the enumeration ordering to change >>> on that system. >>> >>> It really is three patches in one and (again) I borrowed some code from your >>> patches in the $subject series. >> >> Borrowing some of my code is fine, no worries about that. I'm happy as some >> fix for this gets upstream in some form :) >> >>> [It is on top of the "ACPI: scan: Add PNP0D80 >>> to the _DEP exceptions list" patch I've just posted.] >>> >>> >>> Please let me know what you think. >> >> I think this should work fine. I've 2 small remarks inline below, but nothing >> big / important. >> >> My list of kernel things to look into is longer then my available time >> (something which I assume you are familiar with), so for now I've only taken >> a good look at your proposed solution and not actually tested it. >> >> Let me know if you want me to give this a spin on the device with the _HID >> issue as is, or if you have something closer to a final version ready >> which you want me to test. > > I will, thanks! > >>> --- >>> drivers/acpi/scan.c | 141 ++++++++++++++++++++++++++++++++++++++++++++-------- >>> 1 file changed, 120 insertions(+), 21 deletions(-) >>> >>> Index: linux-pm/drivers/acpi/scan.c >>> =================================================================== >>> --- linux-pm.orig/drivers/acpi/scan.c >>> +++ linux-pm/drivers/acpi/scan.c >>> @@ -1646,17 +1646,77 @@ void acpi_device_add_finalize(struct acp >>> kobject_uevent(&device->dev.kobj, KOBJ_ADD); >>> } >>> >>> -static int acpi_add_single_object(struct acpi_device **child, >>> - acpi_handle handle, int type, >>> - unsigned long long sta) >>> +/* >>> + * List of IDs for which we defer enumeration them to the second pass, because >>> + * some of their methods used during addition depend on OpRegions registered by >>> + * the drivers for other ACPI devices. >>> + */ >>> +static const char * const acpi_defer_enumeration_ids[] = { >>> + "BCM2E5B", /* Acer SW3-016 bluetooth HID vs GPIO OpRegs */ >>> + NULL >>> +}; >> >> Note that since you defer if there are unmet _DEP-s, this won't be necessary: >> >> This list was purely there as a safer way to select devices which to defer, >> the kernel cmdline option in my patch-set would switch between either using >> this list, or checking _DEP. Since we are going to fully go for using _DEP >> now, this can be dropped. > > OK > > If that is the case, I'd prefer to check the _DEP list even earlier, > possibly in acpi_bus_check_add(), so as to avoid having to evaluate > _HID or _CID for devices with non-trivial _DEP lists (after taking > acpi_ignore_dep_ids[] into account) in the first pass. That is fine by me. Note that in my non scientific measurement the slowdown of my patch (with the cmdline option set to using _DEP as base of the decision to defer or not) was almost non measurable (seemed to be about 10ms) on a low-power Cherry Trail system. So I don't think that we need to worry about optimizing this too much. We currently do evaluate _HID and _CID of all the deps repeatedly, typically only a few devices are used as deps for most other devices. So a possible future optimization would be an acpi_device_info cache (mapping handle-s to device_info) for devices listed as _DEP. But again, I don't think we need to worry too much about performance here. >>> + >>> +static bool acpi_should_defer_enumeration(acpi_handle handle, >>> + struct acpi_device_info *info) >>> +{ >>> + struct acpi_handle_list dep_devices; >>> + acpi_status status; >>> + int i; >>> + >>> + if (acpi_info_matches_ids(info, acpi_defer_enumeration_ids)) >>> + return true; >>> + >>> + /* >>> + * We check for _HID here to avoid deferring the enumeration of: >>> + * 1. PCI devices >>> + * 2. ACPI nodes describing USB ports >>> + * However, checking for _HID catches more then just these cases ... >>> + */ >>> + if (!(info->valid & ACPI_VALID_HID)) >>> + return false; >> >> Interesting approach (using _HID), I went with _ADR since _ADR indicates >> (more or less) that the ACPI fwnode is a companion node for a device >> which will be enumerated through other means (such as PCI devices), rather >> then one where the ACPI code will instantiate a platform_device, or >> i2c_client (etc) for it. >> >> Maybe if we want to play it safe check both, and if there either is no >> HID, or there is an ADR do not defer ? Note just thinking out loud here, >> I'm fine with either approach. > > By the spec checking _HID should be equivalent to checking _ADR > (Section 6.1 of ACPI 6.3 says "A device object must contain either an > _HID object or an _ADR, but should not contain both"), but the > presence of _HID indicates that the device is expected to be > enumerated via ACPI and so _DEP is more likely to really matter IMV. Ah, I see I did not know about the either a HID or an ADR rule. I just needed something available in acpi_device_info with which I could skip PCI devices. So I ended up picking ADR, if you prefer HID that works for me. Regards, Hans
linux 5.11 rc1 "ACPI: scan: Defer enumeration of devices with _DEP lists" cause my v891w z3735f tablet boot failed, there is only one cursor in the upper left corner. linux 5.11 rc5 "ACPI: scan: Make acpi_bus_get_device() clear return pointer on error" fixed this problem.