Message ID | 20180608084115.19242-1-m.szyprowski@samsung.com |
---|---|
State | New |
Headers | show |
Series | PM / runtime: Fix state of suppliers of links created during consumer probe | expand |
On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > When device links with DL_FLAG_PM_RUNTIME are created during consumer > probe, the supplier device is unconditionally runtime activated by > pm_runtime_get_sync(). However this get() is not noticed by > rpm_put_suppliers() called at the end of probe(), because the newly > created link lacks a rpm_active state bit. This worked when probe() called > pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch > 1e8378619841 ("PM / runtime: Fixup reference counting of device > link suppliers at probe") changed probe() to rely on the generic > rpm_put_suppliers() code. This patch updates the newly created link state > to proper value. > > Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") > Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> > --- > This fixes Exynos IOMMU driver being unconditionally set to runtime active > state since the mentioned commit. > --- > drivers/base/core.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/base/core.c b/drivers/base/core.c > index ad7b50897bcc..8ffd353f40ce 100644 > --- a/drivers/base/core.c > +++ b/drivers/base/core.c > @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, > * runtime PM usage counter after consumer probe > * in driver_probe_device(). > */ > - if (flags & DL_FLAG_PM_RUNTIME) > + if (flags & DL_FLAG_PM_RUNTIME) { A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? > pm_runtime_get_sync(supplier); > + link->rpm_active = true; > + } > > link->status = DL_STATE_CONSUMER_PROBE; > break; > -- > 2.17.1 > This seems to correct to me! Thanks for the fix! Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Kind regards Uffe
Hi Ulf, On 2018-06-08 11:29, Ulf Hansson wrote: > On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >> When device links with DL_FLAG_PM_RUNTIME are created during consumer >> probe, the supplier device is unconditionally runtime activated by >> pm_runtime_get_sync(). However this get() is not noticed by >> rpm_put_suppliers() called at the end of probe(), because the newly >> created link lacks a rpm_active state bit. This worked when probe() called >> pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch >> 1e8378619841 ("PM / runtime: Fixup reference counting of device >> link suppliers at probe") changed probe() to rely on the generic >> rpm_put_suppliers() code. This patch updates the newly created link state >> to proper value. >> >> Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") >> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >> --- >> This fixes Exynos IOMMU driver being unconditionally set to runtime active >> state since the mentioned commit. >> --- >> drivers/base/core.c | 4 +++- >> 1 file changed, 3 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/base/core.c b/drivers/base/core.c >> index ad7b50897bcc..8ffd353f40ce 100644 >> --- a/drivers/base/core.c >> +++ b/drivers/base/core.c >> @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, >> * runtime PM usage counter after consumer probe >> * in driver_probe_device(). >> */ >> - if (flags & DL_FLAG_PM_RUNTIME) >> + if (flags & DL_FLAG_PM_RUNTIME) { > A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? > >> pm_runtime_get_sync(supplier); >> + link->rpm_active = true; >> + } >> >> link->status = DL_STATE_CONSUMER_PROBE; >> break; >> -- >> 2.17.1 >> > This seems to correct to me! Thanks for the fix! > > Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> Now I've investigated this issue further and there is still a regression caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") commit in the following case: There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function and does nothing more, because it only registers the device in the system and doesn't touch any hardware at all. IOMMU gets resumed during jpeg-codec probe(). After the mentioned patch (and my fix), IOMMU is kept runtime active, because rpm_idle() returns early with -EAGAIN and doesn't call __rpm_callback()/rpm_put_suppliers(). This is valid until the first use of jpeg-codec (so the first transition of jpeg-codec device from runtime active to runtime suspended state). Do you have any suggestion how this regression can be solved? I'm really curious if there was any case which got fixed the the mentioned commit? Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
On Tue, Jun 12, 2018 at 9:10 AM, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > Hi Ulf, > > On 2018-06-08 11:29, Ulf Hansson wrote: >> On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>> When device links with DL_FLAG_PM_RUNTIME are created during consumer >>> probe, the supplier device is unconditionally runtime activated by >>> pm_runtime_get_sync(). However this get() is not noticed by >>> rpm_put_suppliers() called at the end of probe(), because the newly >>> created link lacks a rpm_active state bit. This worked when probe() called >>> pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch >>> 1e8378619841 ("PM / runtime: Fixup reference counting of device >>> link suppliers at probe") changed probe() to rely on the generic >>> rpm_put_suppliers() code. This patch updates the newly created link state >>> to proper value. >>> >>> Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") >>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>> --- >>> This fixes Exynos IOMMU driver being unconditionally set to runtime active >>> state since the mentioned commit. >>> --- >>> drivers/base/core.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/base/core.c b/drivers/base/core.c >>> index ad7b50897bcc..8ffd353f40ce 100644 >>> --- a/drivers/base/core.c >>> +++ b/drivers/base/core.c >>> @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, >>> * runtime PM usage counter after consumer probe >>> * in driver_probe_device(). >>> */ >>> - if (flags & DL_FLAG_PM_RUNTIME) >>> + if (flags & DL_FLAG_PM_RUNTIME) { >> A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? >> >>> pm_runtime_get_sync(supplier); >>> + link->rpm_active = true; >>> + } >>> >>> link->status = DL_STATE_CONSUMER_PROBE; >>> break; >>> -- >>> 2.17.1 >>> >> This seems to correct to me! Thanks for the fix! >> >> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> > > Now I've investigated this issue further and there is still a regression > caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device > link suppliers at probe") commit in the following case: > > There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer > (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() > procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function > and does nothing more, because it only registers the device in the system > and doesn't touch any hardware at all. IOMMU gets resumed during > jpeg-codec probe(). After the mentioned patch (and my fix), IOMMU is kept > runtime active, because rpm_idle() returns early with -EAGAIN and doesn't > call __rpm_callback()/rpm_put_suppliers(). This is valid until the first > use of jpeg-codec (so the first transition of jpeg-codec device from > runtime active to runtime suspended state). > > Do you have any suggestion how this regression can be solved? I'm really > curious if there was any case which got fixed the the mentioned commit? Having had a deeper look into this code I'm now thinking that I should revert the Ulf's commit, because the case mentioned in its changelog should be covered exactly by the condition modified by your patch. Thanks, Rafael
+ Jon, Todor On 12 June 2018 at 09:10, Marek Szyprowski <m.szyprowski@samsung.com> wrote: > Hi Ulf, > > On 2018-06-08 11:29, Ulf Hansson wrote: >> On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>> When device links with DL_FLAG_PM_RUNTIME are created during consumer >>> probe, the supplier device is unconditionally runtime activated by >>> pm_runtime_get_sync(). However this get() is not noticed by >>> rpm_put_suppliers() called at the end of probe(), because the newly >>> created link lacks a rpm_active state bit. This worked when probe() called >>> pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch >>> 1e8378619841 ("PM / runtime: Fixup reference counting of device >>> link suppliers at probe") changed probe() to rely on the generic >>> rpm_put_suppliers() code. This patch updates the newly created link state >>> to proper value. >>> >>> Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") >>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>> --- >>> This fixes Exynos IOMMU driver being unconditionally set to runtime active >>> state since the mentioned commit. >>> --- >>> drivers/base/core.c | 4 +++- >>> 1 file changed, 3 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/base/core.c b/drivers/base/core.c >>> index ad7b50897bcc..8ffd353f40ce 100644 >>> --- a/drivers/base/core.c >>> +++ b/drivers/base/core.c >>> @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, >>> * runtime PM usage counter after consumer probe >>> * in driver_probe_device(). >>> */ >>> - if (flags & DL_FLAG_PM_RUNTIME) >>> + if (flags & DL_FLAG_PM_RUNTIME) { >> A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? >> >>> pm_runtime_get_sync(supplier); >>> + link->rpm_active = true; >>> + } >>> >>> link->status = DL_STATE_CONSUMER_PROBE; >>> break; >>> -- >>> 2.17.1 >>> >> This seems to correct to me! Thanks for the fix! >> >> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> > > Now I've investigated this issue further and there is still a regression > caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device > link suppliers at probe") commit in the following case: > > There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer > (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() > procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function > and does nothing more, because it only registers the device in the system > and doesn't touch any hardware at all. IOMMU gets resumed during > jpeg-codec probe(). Right, so the IOMMU is runtime resumed because of the call to pm_runtime_get_sync(supplier) above. This happened already before. According to your information, it seems like this is actually unnecessary and it would be nice if it could be avoided. > After the mentioned patch (and my fix), IOMMU is kept > runtime active, because rpm_idle() returns early with -EAGAIN and doesn't > call __rpm_callback()/rpm_put_suppliers(). Yes, this is because the runtime PM usage count is now correctly balanced. In other words, the earlier call to pm_runtime_get_sync(supplier), keeps the IOMMU active. > This is valid until the first > use of jpeg-codec (so the first transition of jpeg-codec device from > runtime active to runtime suspended state). Yep, I get it. Thanks for the detailed description! > > Do you have any suggestion how this regression can be solved? I'm really > curious if there was any case which got fixed by the mentioned commit? So after a second thought, it seems like we should just drop the call to pm_runtime_get_sync(supplier) above altogether. It's already been done in case the the DL_FLAG_PM_RUNTIME and DL_FLAG_RPM_ACTIVE flags are set. I assume you don't have DL_FLAG_RPM_ACTIVE? It may be that the commit introduces regressions, and thanks for reporting and helping out. In my tests (just using a test driver), I noticed the problems. And if I recall correctly, so did Jon and Todor. However, yes, it's need to be able to correctly manages created device links during ->probe(). Kind regards Uffe
On 12 June 2018 at 10:22, Rafael J. Wysocki <rafael@kernel.org> wrote: > On Tue, Jun 12, 2018 at 9:10 AM, Marek Szyprowski > <m.szyprowski@samsung.com> wrote: >> Hi Ulf, >> >> On 2018-06-08 11:29, Ulf Hansson wrote: >>> On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>>> When device links with DL_FLAG_PM_RUNTIME are created during consumer >>>> probe, the supplier device is unconditionally runtime activated by >>>> pm_runtime_get_sync(). However this get() is not noticed by >>>> rpm_put_suppliers() called at the end of probe(), because the newly >>>> created link lacks a rpm_active state bit. This worked when probe() called >>>> pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch >>>> 1e8378619841 ("PM / runtime: Fixup reference counting of device >>>> link suppliers at probe") changed probe() to rely on the generic >>>> rpm_put_suppliers() code. This patch updates the newly created link state >>>> to proper value. >>>> >>>> Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") >>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>>> --- >>>> This fixes Exynos IOMMU driver being unconditionally set to runtime active >>>> state since the mentioned commit. >>>> --- >>>> drivers/base/core.c | 4 +++- >>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/base/core.c b/drivers/base/core.c >>>> index ad7b50897bcc..8ffd353f40ce 100644 >>>> --- a/drivers/base/core.c >>>> +++ b/drivers/base/core.c >>>> @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, >>>> * runtime PM usage counter after consumer probe >>>> * in driver_probe_device(). >>>> */ >>>> - if (flags & DL_FLAG_PM_RUNTIME) >>>> + if (flags & DL_FLAG_PM_RUNTIME) { >>> A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? >>> >>>> pm_runtime_get_sync(supplier); >>>> + link->rpm_active = true; >>>> + } >>>> >>>> link->status = DL_STATE_CONSUMER_PROBE; >>>> break; >>>> -- >>>> 2.17.1 >>>> >>> This seems to correct to me! Thanks for the fix! >>> >>> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> >> >> Now I've investigated this issue further and there is still a regression >> caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device >> link suppliers at probe") commit in the following case: >> >> There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer >> (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() >> procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function >> and does nothing more, because it only registers the device in the system >> and doesn't touch any hardware at all. IOMMU gets resumed during >> jpeg-codec probe(). After the mentioned patch (and my fix), IOMMU is kept >> runtime active, because rpm_idle() returns early with -EAGAIN and doesn't >> call __rpm_callback()/rpm_put_suppliers(). This is valid until the first >> use of jpeg-codec (so the first transition of jpeg-codec device from >> runtime active to runtime suspended state). >> >> Do you have any suggestion how this regression can be solved? I'm really >> curious if there was any case which got fixed the the mentioned commit? > > Having had a deeper look into this code I'm now thinking that I should > revert the Ulf's commit, because the case mentioned in its changelog > should be covered exactly by the condition modified by your patch. That would break the case for DL_FLAG_STATELESS. Which is the one I have been focusing on (and clearly messed up the other case, sorry about that). Please have a look at my last reply to Marek. Actually, I think we can improve the situation compared how it were before, as we don't need to unconditionally runtime resume the supplier, but rather rely on the DL_FLAG_RPM_ACTIVE flag. Kind regards Uffe
On Tue, Jun 12, 2018 at 10:26 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: > + Jon, Todor > > On 12 June 2018 at 09:10, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >> Hi Ulf, >> >> On 2018-06-08 11:29, Ulf Hansson wrote: >>> On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>>> When device links with DL_FLAG_PM_RUNTIME are created during consumer >>>> probe, the supplier device is unconditionally runtime activated by >>>> pm_runtime_get_sync(). However this get() is not noticed by >>>> rpm_put_suppliers() called at the end of probe(), because the newly >>>> created link lacks a rpm_active state bit. This worked when probe() called >>>> pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch >>>> 1e8378619841 ("PM / runtime: Fixup reference counting of device >>>> link suppliers at probe") changed probe() to rely on the generic >>>> rpm_put_suppliers() code. This patch updates the newly created link state >>>> to proper value. >>>> >>>> Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") >>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>>> --- >>>> This fixes Exynos IOMMU driver being unconditionally set to runtime active >>>> state since the mentioned commit. >>>> --- >>>> drivers/base/core.c | 4 +++- >>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/base/core.c b/drivers/base/core.c >>>> index ad7b50897bcc..8ffd353f40ce 100644 >>>> --- a/drivers/base/core.c >>>> +++ b/drivers/base/core.c >>>> @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, >>>> * runtime PM usage counter after consumer probe >>>> * in driver_probe_device(). >>>> */ >>>> - if (flags & DL_FLAG_PM_RUNTIME) >>>> + if (flags & DL_FLAG_PM_RUNTIME) { >>> A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? >>> >>>> pm_runtime_get_sync(supplier); >>>> + link->rpm_active = true; >>>> + } >>>> >>>> link->status = DL_STATE_CONSUMER_PROBE; >>>> break; >>>> -- >>>> 2.17.1 >>>> >>> This seems to correct to me! Thanks for the fix! >>> >>> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> >> >> Now I've investigated this issue further and there is still a regression >> caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device >> link suppliers at probe") commit in the following case: >> >> There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer >> (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() >> procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function >> and does nothing more, because it only registers the device in the system >> and doesn't touch any hardware at all. IOMMU gets resumed during >> jpeg-codec probe(). > > Right, so the IOMMU is runtime resumed because of the call to > pm_runtime_get_sync(supplier) above. This happened already before. > > According to your information, it seems like this is actually > unnecessary and it would be nice if it could be avoided. > >> After the mentioned patch (and my fix), IOMMU is kept >> runtime active, because rpm_idle() returns early with -EAGAIN and doesn't >> call __rpm_callback()/rpm_put_suppliers(). > > Yes, this is because the runtime PM usage count is now correctly > balanced. In other words, the earlier call to > pm_runtime_get_sync(supplier), keeps the IOMMU active. > >> This is valid until the first >> use of jpeg-codec (so the first transition of jpeg-codec device from >> runtime active to runtime suspended state). > > Yep, I get it. Thanks for the detailed description! > >> >> Do you have any suggestion how this regression can be solved? I'm really >> curious if there was any case which got fixed by the mentioned commit? > > So after a second thought, it seems like we should just drop the call > to pm_runtime_get_sync(supplier) above altogether. It's already been > done in case the the DL_FLAG_PM_RUNTIME and DL_FLAG_RPM_ACTIVE flags > are set. The problem is that nobody sets DL_FLAG_RPM_ACTIVE and relying on that to get the refcounts right would be fragile anyway. > I assume you don't have DL_FLAG_RPM_ACTIVE? > > It may be that the commit introduces regressions, and thanks for > reporting and helping out. > > In my tests (just using a test driver), I noticed the problems. And if > I recall correctly, so did Jon and Todor. So actually I'd like to hear about them in the first place. > However, yes, it's need to be able to correctly manages created device > links during ->probe(). Right. So I'm reverting commit 1e8378619841 and lets start again. Thanks, Rafael
On Tue, Jun 12, 2018 at 10:31 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: > On 12 June 2018 at 10:22, Rafael J. Wysocki <rafael@kernel.org> wrote: >> On Tue, Jun 12, 2018 at 9:10 AM, Marek Szyprowski >> <m.szyprowski@samsung.com> wrote: >>> Hi Ulf, >>> >>> On 2018-06-08 11:29, Ulf Hansson wrote: >>>> On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>>>> When device links with DL_FLAG_PM_RUNTIME are created during consumer >>>>> probe, the supplier device is unconditionally runtime activated by >>>>> pm_runtime_get_sync(). However this get() is not noticed by >>>>> rpm_put_suppliers() called at the end of probe(), because the newly >>>>> created link lacks a rpm_active state bit. This worked when probe() called >>>>> pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch >>>>> 1e8378619841 ("PM / runtime: Fixup reference counting of device >>>>> link suppliers at probe") changed probe() to rely on the generic >>>>> rpm_put_suppliers() code. This patch updates the newly created link state >>>>> to proper value. >>>>> >>>>> Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") >>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>>>> --- >>>>> This fixes Exynos IOMMU driver being unconditionally set to runtime active >>>>> state since the mentioned commit. >>>>> --- >>>>> drivers/base/core.c | 4 +++- >>>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/drivers/base/core.c b/drivers/base/core.c >>>>> index ad7b50897bcc..8ffd353f40ce 100644 >>>>> --- a/drivers/base/core.c >>>>> +++ b/drivers/base/core.c >>>>> @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, >>>>> * runtime PM usage counter after consumer probe >>>>> * in driver_probe_device(). >>>>> */ >>>>> - if (flags & DL_FLAG_PM_RUNTIME) >>>>> + if (flags & DL_FLAG_PM_RUNTIME) { >>>> A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? >>>> >>>>> pm_runtime_get_sync(supplier); >>>>> + link->rpm_active = true; >>>>> + } >>>>> >>>>> link->status = DL_STATE_CONSUMER_PROBE; >>>>> break; >>>>> -- >>>>> 2.17.1 >>>>> >>>> This seems to correct to me! Thanks for the fix! >>>> >>>> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> >>> >>> Now I've investigated this issue further and there is still a regression >>> caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device >>> link suppliers at probe") commit in the following case: >>> >>> There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer >>> (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() >>> procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function >>> and does nothing more, because it only registers the device in the system >>> and doesn't touch any hardware at all. IOMMU gets resumed during >>> jpeg-codec probe(). After the mentioned patch (and my fix), IOMMU is kept >>> runtime active, because rpm_idle() returns early with -EAGAIN and doesn't >>> call __rpm_callback()/rpm_put_suppliers(). This is valid until the first >>> use of jpeg-codec (so the first transition of jpeg-codec device from >>> runtime active to runtime suspended state). >>> >>> Do you have any suggestion how this regression can be solved? I'm really >>> curious if there was any case which got fixed the the mentioned commit? >> >> Having had a deeper look into this code I'm now thinking that I should >> revert the Ulf's commit, because the case mentioned in its changelog >> should be covered exactly by the condition modified by your patch. > > That would break the case for DL_FLAG_STATELESS. Which is the one I > have been focusing on (and clearly messed up the other case, sorry > about that). Well, so why don't we start over? This time with a clear problem description, please? > Please have a look at my last reply to Marek. Actually, I think we can > improve the situation compared how it were before, as we don't need to > unconditionally runtime resume the supplier, but rather rely on the > DL_FLAG_RPM_ACTIVE flag. I have read it. Thanks, Rafael
On 12 June 2018 at 10:33, Rafael J. Wysocki <rafael@kernel.org> wrote: > On Tue, Jun 12, 2018 at 10:31 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: >> On 12 June 2018 at 10:22, Rafael J. Wysocki <rafael@kernel.org> wrote: >>> On Tue, Jun 12, 2018 at 9:10 AM, Marek Szyprowski >>> <m.szyprowski@samsung.com> wrote: >>>> Hi Ulf, >>>> >>>> On 2018-06-08 11:29, Ulf Hansson wrote: >>>>> On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>>>>> When device links with DL_FLAG_PM_RUNTIME are created during consumer >>>>>> probe, the supplier device is unconditionally runtime activated by >>>>>> pm_runtime_get_sync(). However this get() is not noticed by >>>>>> rpm_put_suppliers() called at the end of probe(), because the newly >>>>>> created link lacks a rpm_active state bit. This worked when probe() called >>>>>> pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch >>>>>> 1e8378619841 ("PM / runtime: Fixup reference counting of device >>>>>> link suppliers at probe") changed probe() to rely on the generic >>>>>> rpm_put_suppliers() code. This patch updates the newly created link state >>>>>> to proper value. >>>>>> >>>>>> Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") >>>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>>>>> --- >>>>>> This fixes Exynos IOMMU driver being unconditionally set to runtime active >>>>>> state since the mentioned commit. >>>>>> --- >>>>>> drivers/base/core.c | 4 +++- >>>>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/drivers/base/core.c b/drivers/base/core.c >>>>>> index ad7b50897bcc..8ffd353f40ce 100644 >>>>>> --- a/drivers/base/core.c >>>>>> +++ b/drivers/base/core.c >>>>>> @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, >>>>>> * runtime PM usage counter after consumer probe >>>>>> * in driver_probe_device(). >>>>>> */ >>>>>> - if (flags & DL_FLAG_PM_RUNTIME) >>>>>> + if (flags & DL_FLAG_PM_RUNTIME) { >>>>> A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? >>>>> >>>>>> pm_runtime_get_sync(supplier); >>>>>> + link->rpm_active = true; >>>>>> + } >>>>>> >>>>>> link->status = DL_STATE_CONSUMER_PROBE; >>>>>> break; >>>>>> -- >>>>>> 2.17.1 >>>>>> >>>>> This seems to correct to me! Thanks for the fix! >>>>> >>>>> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> >>>> >>>> Now I've investigated this issue further and there is still a regression >>>> caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device >>>> link suppliers at probe") commit in the following case: >>>> >>>> There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer >>>> (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() >>>> procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function >>>> and does nothing more, because it only registers the device in the system >>>> and doesn't touch any hardware at all. IOMMU gets resumed during >>>> jpeg-codec probe(). After the mentioned patch (and my fix), IOMMU is kept >>>> runtime active, because rpm_idle() returns early with -EAGAIN and doesn't >>>> call __rpm_callback()/rpm_put_suppliers(). This is valid until the first >>>> use of jpeg-codec (so the first transition of jpeg-codec device from >>>> runtime active to runtime suspended state). >>>> >>>> Do you have any suggestion how this regression can be solved? I'm really >>>> curious if there was any case which got fixed the the mentioned commit? >>> >>> Having had a deeper look into this code I'm now thinking that I should >>> revert the Ulf's commit, because the case mentioned in its changelog >>> should be covered exactly by the condition modified by your patch. >> >> That would break the case for DL_FLAG_STATELESS. Which is the one I >> have been focusing on (and clearly messed up the other case, sorry >> about that). > > Well, so why don't we start over? I think it's easier with a new fix on top. Reverting it would break DL_FLAG_STATELESS, so just moving things into another broken state. But, it's up to you. [...] Kind regards Uffe
Hi Ulf, On 2018-06-12 10:26, Ulf Hansson wrote: > + Jon, Todor > > On 12 June 2018 at 09:10, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >> On 2018-06-08 11:29, Ulf Hansson wrote: >>> On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>>> When device links with DL_FLAG_PM_RUNTIME are created during consumer >>>> probe, the supplier device is unconditionally runtime activated by >>>> pm_runtime_get_sync(). However this get() is not noticed by >>>> rpm_put_suppliers() called at the end of probe(), because the newly >>>> created link lacks a rpm_active state bit. This worked when probe() called >>>> pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch >>>> 1e8378619841 ("PM / runtime: Fixup reference counting of device >>>> link suppliers at probe") changed probe() to rely on the generic >>>> rpm_put_suppliers() code. This patch updates the newly created link state >>>> to proper value. >>>> >>>> Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") >>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>>> --- >>>> This fixes Exynos IOMMU driver being unconditionally set to runtime active >>>> state since the mentioned commit. >>>> --- >>>> drivers/base/core.c | 4 +++- >>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>> >>>> diff --git a/drivers/base/core.c b/drivers/base/core.c >>>> index ad7b50897bcc..8ffd353f40ce 100644 >>>> --- a/drivers/base/core.c >>>> +++ b/drivers/base/core.c >>>> @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, >>>> * runtime PM usage counter after consumer probe >>>> * in driver_probe_device(). >>>> */ >>>> - if (flags & DL_FLAG_PM_RUNTIME) >>>> + if (flags & DL_FLAG_PM_RUNTIME) { >>> A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? >>> >>>> pm_runtime_get_sync(supplier); >>>> + link->rpm_active = true; >>>> + } >>>> >>>> link->status = DL_STATE_CONSUMER_PROBE; >>>> break; >>>> -- >>>> 2.17.1 >>>> >>> This seems to correct to me! Thanks for the fix! >>> >>> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> >> Now I've investigated this issue further and there is still a regression >> caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device >> link suppliers at probe") commit in the following case: >> >> There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer >> (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() >> procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function >> and does nothing more, because it only registers the device in the system >> and doesn't touch any hardware at all. IOMMU gets resumed during >> jpeg-codec probe(). > Right, so the IOMMU is runtime resumed because of the call to > pm_runtime_get_sync(supplier) above. This happened already before. > > According to your information, it seems like this is actually > unnecessary and it would be nice if it could be avoided. Not, it is necessary because the device driver should be able to do anything in its probe() and until it calls pm_runtime_enable() on itself, the proper resuming of parent(s) and supplier(s) are handled by PM/device core. That's my understanding of how this works. >> After the mentioned patch (and my fix), IOMMU is kept >> runtime active, because rpm_idle() returns early with -EAGAIN and doesn't >> call __rpm_callback()/rpm_put_suppliers(). > Yes, this is because the runtime PM usage count is now correctly > balanced. In other words, the earlier call to > pm_runtime_get_sync(supplier), keeps the IOMMU active. > >> This is valid until the first >> use of jpeg-codec (so the first transition of jpeg-codec device from >> runtime active to runtime suspended state). > Yep, I get it. Thanks for the detailed description! > >> Do you have any suggestion how this regression can be solved? I'm really >> curious if there was any case which got fixed by the mentioned commit? > So after a second thought, it seems like we should just drop the call > to pm_runtime_get_sync(supplier) above altogether. It's already been > done in case the the DL_FLAG_PM_RUNTIME and DL_FLAG_RPM_ACTIVE flags > are set. > > I assume you don't have DL_FLAG_RPM_ACTIVE? Right, exynos-iommu driver don't set DL_FLAG_RPM_ACTIVE, because all it wants is iommu's device runtime pm state to follow the supplier device runtime pm state. > It may be that the commit introduces regressions, and thanks for > reporting and helping out. > > In my tests (just using a test driver), I noticed the problems. And if > I recall correctly, so did Jon and Todor. > > However, yes, it's need to be able to correctly manages created device > links during ->probe(). Please CC: me if you post a patch related to device links and runtime pm handling. Best regards -- Marek Szyprowski, PhD Samsung R&D Institute Poland
On Tue, Jun 12, 2018 at 10:37 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: > On 12 June 2018 at 10:33, Rafael J. Wysocki <rafael@kernel.org> wrote: >> On Tue, Jun 12, 2018 at 10:31 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: >>> On 12 June 2018 at 10:22, Rafael J. Wysocki <rafael@kernel.org> wrote: >>>> On Tue, Jun 12, 2018 at 9:10 AM, Marek Szyprowski >>>> <m.szyprowski@samsung.com> wrote: >>>>> Hi Ulf, >>>>> >>>>> On 2018-06-08 11:29, Ulf Hansson wrote: >>>>>> On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>>>>>> When device links with DL_FLAG_PM_RUNTIME are created during consumer >>>>>>> probe, the supplier device is unconditionally runtime activated by >>>>>>> pm_runtime_get_sync(). However this get() is not noticed by >>>>>>> rpm_put_suppliers() called at the end of probe(), because the newly >>>>>>> created link lacks a rpm_active state bit. This worked when probe() called >>>>>>> pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch >>>>>>> 1e8378619841 ("PM / runtime: Fixup reference counting of device >>>>>>> link suppliers at probe") changed probe() to rely on the generic >>>>>>> rpm_put_suppliers() code. This patch updates the newly created link state >>>>>>> to proper value. >>>>>>> >>>>>>> Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") >>>>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>>>>>> --- >>>>>>> This fixes Exynos IOMMU driver being unconditionally set to runtime active >>>>>>> state since the mentioned commit. >>>>>>> --- >>>>>>> drivers/base/core.c | 4 +++- >>>>>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/drivers/base/core.c b/drivers/base/core.c >>>>>>> index ad7b50897bcc..8ffd353f40ce 100644 >>>>>>> --- a/drivers/base/core.c >>>>>>> +++ b/drivers/base/core.c >>>>>>> @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, >>>>>>> * runtime PM usage counter after consumer probe >>>>>>> * in driver_probe_device(). >>>>>>> */ >>>>>>> - if (flags & DL_FLAG_PM_RUNTIME) >>>>>>> + if (flags & DL_FLAG_PM_RUNTIME) { >>>>>> A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? >>>>>> >>>>>>> pm_runtime_get_sync(supplier); >>>>>>> + link->rpm_active = true; >>>>>>> + } >>>>>>> >>>>>>> link->status = DL_STATE_CONSUMER_PROBE; >>>>>>> break; >>>>>>> -- >>>>>>> 2.17.1 >>>>>>> >>>>>> This seems to correct to me! Thanks for the fix! >>>>>> >>>>>> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> >>>>> >>>>> Now I've investigated this issue further and there is still a regression >>>>> caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device >>>>> link suppliers at probe") commit in the following case: >>>>> >>>>> There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer >>>>> (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() >>>>> procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function >>>>> and does nothing more, because it only registers the device in the system >>>>> and doesn't touch any hardware at all. IOMMU gets resumed during >>>>> jpeg-codec probe(). After the mentioned patch (and my fix), IOMMU is kept >>>>> runtime active, because rpm_idle() returns early with -EAGAIN and doesn't >>>>> call __rpm_callback()/rpm_put_suppliers(). This is valid until the first >>>>> use of jpeg-codec (so the first transition of jpeg-codec device from >>>>> runtime active to runtime suspended state). >>>>> >>>>> Do you have any suggestion how this regression can be solved? I'm really >>>>> curious if there was any case which got fixed the the mentioned commit? >>>> >>>> Having had a deeper look into this code I'm now thinking that I should >>>> revert the Ulf's commit, because the case mentioned in its changelog >>>> should be covered exactly by the condition modified by your patch. >>> >>> That would break the case for DL_FLAG_STATELESS. Which is the one I >>> have been focusing on (and clearly messed up the other case, sorry >>> about that). >> >> Well, so why don't we start over? > > I think it's easier with a new fix on top. Reverting it would break > DL_FLAG_STATELESS, so just moving things into another broken state. Yes, the DL_FLAG_STATELESS is broken and it needs to be fixed, but I'm not really sure if the approach in your commit is the best one. Even if it is, the right way to do it is to revert things to a previous state and do a proper fix then IMO. > But, it's up to you. So as I said before, I'll revert it and let's fix the problem properly. Thanks, Rafael
[...] >>> Now I've investigated this issue further and there is still a regression >>> caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device >>> link suppliers at probe") commit in the following case: >>> >>> There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer >>> (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() >>> procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function >>> and does nothing more, because it only registers the device in the system >>> and doesn't touch any hardware at all. IOMMU gets resumed during >>> jpeg-codec probe(). >> Right, so the IOMMU is runtime resumed because of the call to >> pm_runtime_get_sync(supplier) above. This happened already before. >> >> According to your information, it seems like this is actually >> unnecessary and it would be nice if it could be avoided. > > Not, it is necessary because the device driver should be able to do anything > in its probe() and until it calls pm_runtime_enable() on itself, the proper > resuming of parent(s) and supplier(s) are handled by PM/device core. That's > my understanding of how this works. I see. So clearly, you would like the driver core to consider DL_FLAG_RPM_ACTIVE, before it runtime resumed the supplier. > >>> After the mentioned patch (and my fix), IOMMU is kept >>> runtime active, because rpm_idle() returns early with -EAGAIN and doesn't >>> call __rpm_callback()/rpm_put_suppliers(). >> Yes, this is because the runtime PM usage count is now correctly >> balanced. In other words, the earlier call to >> pm_runtime_get_sync(supplier), keeps the IOMMU active. >> >>> This is valid until the first >>> use of jpeg-codec (so the first transition of jpeg-codec device from >>> runtime active to runtime suspended state). >> Yep, I get it. Thanks for the detailed description! >> >>> Do you have any suggestion how this regression can be solved? I'm really >>> curious if there was any case which got fixed by the mentioned commit? >> So after a second thought, it seems like we should just drop the call >> to pm_runtime_get_sync(supplier) above altogether. It's already been >> done in case the the DL_FLAG_PM_RUNTIME and DL_FLAG_RPM_ACTIVE flags >> are set. >> >> I assume you don't have DL_FLAG_RPM_ACTIVE? > > Right, exynos-iommu driver don't set DL_FLAG_RPM_ACTIVE, because all it > wants > is iommu's device runtime pm state to follow the supplier device runtime pm > state. Got it, thanks! > >> It may be that the commit introduces regressions, and thanks for >> reporting and helping out. >> >> In my tests (just using a test driver), I noticed the problems. And if >> I recall correctly, so did Jon and Todor. >> >> However, yes, it's need to be able to correctly manages created device >> links during ->probe(). > > Please CC: me if you post a patch related to device links and runtime pm > handling. Yes, count on that! Kind regards Uffe
On 12 June 2018 at 10:42, Rafael J. Wysocki <rafael@kernel.org> wrote: > On Tue, Jun 12, 2018 at 10:37 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: >> On 12 June 2018 at 10:33, Rafael J. Wysocki <rafael@kernel.org> wrote: >>> On Tue, Jun 12, 2018 at 10:31 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: >>>> On 12 June 2018 at 10:22, Rafael J. Wysocki <rafael@kernel.org> wrote: >>>>> On Tue, Jun 12, 2018 at 9:10 AM, Marek Szyprowski >>>>> <m.szyprowski@samsung.com> wrote: >>>>>> Hi Ulf, >>>>>> >>>>>> On 2018-06-08 11:29, Ulf Hansson wrote: >>>>>>> On 8 June 2018 at 10:41, Marek Szyprowski <m.szyprowski@samsung.com> wrote: >>>>>>>> When device links with DL_FLAG_PM_RUNTIME are created during consumer >>>>>>>> probe, the supplier device is unconditionally runtime activated by >>>>>>>> pm_runtime_get_sync(). However this get() is not noticed by >>>>>>>> rpm_put_suppliers() called at the end of probe(), because the newly >>>>>>>> created link lacks a rpm_active state bit. This worked when probe() called >>>>>>>> pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch >>>>>>>> 1e8378619841 ("PM / runtime: Fixup reference counting of device >>>>>>>> link suppliers at probe") changed probe() to rely on the generic >>>>>>>> rpm_put_suppliers() code. This patch updates the newly created link state >>>>>>>> to proper value. >>>>>>>> >>>>>>>> Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") >>>>>>>> Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> >>>>>>>> --- >>>>>>>> This fixes Exynos IOMMU driver being unconditionally set to runtime active >>>>>>>> state since the mentioned commit. >>>>>>>> --- >>>>>>>> drivers/base/core.c | 4 +++- >>>>>>>> 1 file changed, 3 insertions(+), 1 deletion(-) >>>>>>>> >>>>>>>> diff --git a/drivers/base/core.c b/drivers/base/core.c >>>>>>>> index ad7b50897bcc..8ffd353f40ce 100644 >>>>>>>> --- a/drivers/base/core.c >>>>>>>> +++ b/drivers/base/core.c >>>>>>>> @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, >>>>>>>> * runtime PM usage counter after consumer probe >>>>>>>> * in driver_probe_device(). >>>>>>>> */ >>>>>>>> - if (flags & DL_FLAG_PM_RUNTIME) >>>>>>>> + if (flags & DL_FLAG_PM_RUNTIME) { >>>>>>> A thought: Why isn't the DL_FLAG_RPM_ACTIVE part of the condition above? Rafael? >>>>>>> >>>>>>>> pm_runtime_get_sync(supplier); >>>>>>>> + link->rpm_active = true; >>>>>>>> + } >>>>>>>> >>>>>>>> link->status = DL_STATE_CONSUMER_PROBE; >>>>>>>> break; >>>>>>>> -- >>>>>>>> 2.17.1 >>>>>>>> >>>>>>> This seems to correct to me! Thanks for the fix! >>>>>>> >>>>>>> Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org> >>>>>> >>>>>> Now I've investigated this issue further and there is still a regression >>>>>> caused by 1e8378619841 ("PM / runtime: Fixup reference counting of device >>>>>> link suppliers at probe") commit in the following case: >>>>>> >>>>>> There are 2 devices: jpeg-codec and iommu. IOMMU is linked as a consumer >>>>>> (with DL_FLAG_PM_RUNTIME flag) to jpeg-codec during jpeg-codec probe() >>>>>> procedure. Jpeg-codec calls pm_runtime_enalbe() in is probe() function >>>>>> and does nothing more, because it only registers the device in the system >>>>>> and doesn't touch any hardware at all. IOMMU gets resumed during >>>>>> jpeg-codec probe(). After the mentioned patch (and my fix), IOMMU is kept >>>>>> runtime active, because rpm_idle() returns early with -EAGAIN and doesn't >>>>>> call __rpm_callback()/rpm_put_suppliers(). This is valid until the first >>>>>> use of jpeg-codec (so the first transition of jpeg-codec device from >>>>>> runtime active to runtime suspended state). >>>>>> >>>>>> Do you have any suggestion how this regression can be solved? I'm really >>>>>> curious if there was any case which got fixed the the mentioned commit? >>>>> >>>>> Having had a deeper look into this code I'm now thinking that I should >>>>> revert the Ulf's commit, because the case mentioned in its changelog >>>>> should be covered exactly by the condition modified by your patch. >>>> >>>> That would break the case for DL_FLAG_STATELESS. Which is the one I >>>> have been focusing on (and clearly messed up the other case, sorry >>>> about that). >>> >>> Well, so why don't we start over? >> >> I think it's easier with a new fix on top. Reverting it would break >> DL_FLAG_STATELESS, so just moving things into another broken state. > > Yes, the DL_FLAG_STATELESS is broken and it needs to be fixed, but I'm > not really sure if the approach in your commit is the best one. > > Even if it is, the right way to do it is to revert things to a > previous state and do a proper fix then IMO. > >> But, it's up to you. > > So as I said before, I'll revert it and let's fix the problem properly. Got it. I am working on a new fix then, taking also the other cases but DL_FLAG_STATELESS into account. Kind regards Uffe
On Tuesday, June 12, 2018 10:48:13 AM CEST Ulf Hansson wrote: > On 12 June 2018 at 10:42, Rafael J. Wysocki <rafael@kernel.org> wrote: > > On Tue, Jun 12, 2018 at 10:37 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: > >> On 12 June 2018 at 10:33, Rafael J. Wysocki <rafael@kernel.org> wrote: [cut] > >>> > >>> Well, so why don't we start over? > >> > >> I think it's easier with a new fix on top. Reverting it would break > >> DL_FLAG_STATELESS, so just moving things into another broken state. > > > > Yes, the DL_FLAG_STATELESS is broken and it needs to be fixed, but I'm > > not really sure if the approach in your commit is the best one. > > > > Even if it is, the right way to do it is to revert things to a > > previous state and do a proper fix then IMO. > > > >> But, it's up to you. > > > > So as I said before, I'll revert it and let's fix the problem properly. > > Got it. I am working on a new fix then, taking also the other cases > but DL_FLAG_STATELESS into account. Actually, wouldn't something as simple as the patch below be sufficient here? --- drivers/base/core.c | 9 +++++++++ 1 file changed, 9 insertions(+) Index: linux-pm/drivers/base/core.c =================================================================== --- linux-pm.orig/drivers/base/core.c +++ linux-pm/drivers/base/core.c @@ -229,6 +229,15 @@ struct device_link *device_link_add(stru /* Determine the initial link state. */ if (flags & DL_FLAG_STATELESS) { link->status = DL_STATE_NONE; + /* + * If the link is being added at probe time, it is necessary to + * bump up the supplier usage counter here even if the link + * itself is stateless to balance the usage counter properly. + */ + if (supplier->links.status == DL_DEV_DRIVER_BOUND && + consumer->links.status == DL_DEV_PROBING && + flags & DL_FLAG_PM_RUNTIME) + pm_runtime_get_noresume(supplier); } else { switch (supplier->links.status) { case DL_DEV_DRIVER_BOUND:
On 12 June 2018 at 11:10, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: > On Tuesday, June 12, 2018 10:48:13 AM CEST Ulf Hansson wrote: >> On 12 June 2018 at 10:42, Rafael J. Wysocki <rafael@kernel.org> wrote: >> > On Tue, Jun 12, 2018 at 10:37 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: >> >> On 12 June 2018 at 10:33, Rafael J. Wysocki <rafael@kernel.org> wrote: > > [cut] > >> >>> >> >>> Well, so why don't we start over? >> >> >> >> I think it's easier with a new fix on top. Reverting it would break >> >> DL_FLAG_STATELESS, so just moving things into another broken state. >> > >> > Yes, the DL_FLAG_STATELESS is broken and it needs to be fixed, but I'm >> > not really sure if the approach in your commit is the best one. >> > >> > Even if it is, the right way to do it is to revert things to a >> > previous state and do a proper fix then IMO. >> > >> >> But, it's up to you. >> > >> > So as I said before, I'll revert it and let's fix the problem properly. >> >> Got it. I am working on a new fix then, taking also the other cases >> but DL_FLAG_STATELESS into account. > > Actually, wouldn't something as simple as the patch below be sufficient > here? > > --- > drivers/base/core.c | 9 +++++++++ > 1 file changed, 9 insertions(+) > > Index: linux-pm/drivers/base/core.c > =================================================================== > --- linux-pm.orig/drivers/base/core.c > +++ linux-pm/drivers/base/core.c > @@ -229,6 +229,15 @@ struct device_link *device_link_add(stru > /* Determine the initial link state. */ > if (flags & DL_FLAG_STATELESS) { > link->status = DL_STATE_NONE; > + /* > + * If the link is being added at probe time, it is necessary to > + * bump up the supplier usage counter here even if the link > + * itself is stateless to balance the usage counter properly. > + */ > + if (supplier->links.status == DL_DEV_DRIVER_BOUND && > + consumer->links.status == DL_DEV_PROBING && > + flags & DL_FLAG_PM_RUNTIME) I don't think checking the supplier->links.status is needed. For example, does this work if the supplier don't have a driver bound? Otherwise, yes, this should work. Moreover, it's good that we can keep calling pm_runtime_put() for all suppliers after probe is done. This prevents them from being unnecessary runtime resumed, which is the thing I didn't really like in my approach. Are going to send a patch, or you want me to pick it up? > + pm_runtime_get_noresume(supplier); > } else { > switch (supplier->links.status) { > case DL_DEV_DRIVER_BOUND: > Below this else, I think we should also convert the pm_runtime_get_sync(supplier), to a pm_runtime_get_noresume(supplier). Callers of device_link_add() should know that they need to explicitly use DL_FLAG_RPM_ACTIVE, if they want the supplier to be resumed. Of course, I think that is better done in a separate patch, if revert is needed. :-) Kind regards Uffe
On Tue, Jun 12, 2018 at 11:56 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: > On 12 June 2018 at 11:10, Rafael J. Wysocki <rjw@rjwysocki.net> wrote: >> On Tuesday, June 12, 2018 10:48:13 AM CEST Ulf Hansson wrote: >>> On 12 June 2018 at 10:42, Rafael J. Wysocki <rafael@kernel.org> wrote: >>> > On Tue, Jun 12, 2018 at 10:37 AM, Ulf Hansson <ulf.hansson@linaro.org> wrote: >>> >> On 12 June 2018 at 10:33, Rafael J. Wysocki <rafael@kernel.org> wrote: >> >> [cut] >> >>> >>> >>> >>> Well, so why don't we start over? >>> >> >>> >> I think it's easier with a new fix on top. Reverting it would break >>> >> DL_FLAG_STATELESS, so just moving things into another broken state. >>> > >>> > Yes, the DL_FLAG_STATELESS is broken and it needs to be fixed, but I'm >>> > not really sure if the approach in your commit is the best one. >>> > >>> > Even if it is, the right way to do it is to revert things to a >>> > previous state and do a proper fix then IMO. >>> > >>> >> But, it's up to you. >>> > >>> > So as I said before, I'll revert it and let's fix the problem properly. >>> >>> Got it. I am working on a new fix then, taking also the other cases >>> but DL_FLAG_STATELESS into account. >> >> Actually, wouldn't something as simple as the patch below be sufficient >> here? >> >> --- >> drivers/base/core.c | 9 +++++++++ >> 1 file changed, 9 insertions(+) >> >> Index: linux-pm/drivers/base/core.c >> =================================================================== >> --- linux-pm.orig/drivers/base/core.c >> +++ linux-pm/drivers/base/core.c >> @@ -229,6 +229,15 @@ struct device_link *device_link_add(stru >> /* Determine the initial link state. */ >> if (flags & DL_FLAG_STATELESS) { >> link->status = DL_STATE_NONE; >> + /* >> + * If the link is being added at probe time, it is necessary to >> + * bump up the supplier usage counter here even if the link >> + * itself is stateless to balance the usage counter properly. >> + */ >> + if (supplier->links.status == DL_DEV_DRIVER_BOUND && >> + consumer->links.status == DL_DEV_PROBING && >> + flags & DL_FLAG_PM_RUNTIME) > > I don't think checking the supplier->links.status is needed. For > example, does this work if the supplier don't have a driver bound? I wanted to follow the stateful branch which does this check, but since _put_suppliers() doesn't look at the status, you are right that this should be done in any case, so the stateful branch is missing it too in some cases. > Otherwise, yes, this should work. Moreover, it's good that we can keep > calling pm_runtime_put() for all suppliers after probe is done. This > prevents them from being unnecessary runtime resumed, which is the > thing I didn't really like in my approach. > > Are going to send a patch, or you want me to pick it up? I'll send it. > >> + pm_runtime_get_noresume(supplier); >> } else { >> switch (supplier->links.status) { >> case DL_DEV_DRIVER_BOUND: >> > > Below this else, I think we should also convert the > pm_runtime_get_sync(supplier), to a pm_runtime_get_noresume(supplier). > > Callers of device_link_add() should know that they need to explicitly > use DL_FLAG_RPM_ACTIVE, if they want the supplier to be resumed. > > Of course, I think that is better done in a separate patch, if revert > is needed. :-) Right, but since the supplied usage counter bumping up is also missing in some cases in the stateful branch, I think it needs to be done outside of that big "if ()" and so I will use noresume in it. And if someone really wants suppliers to resume, they should use DL_FLAG_RPM_ACTIVE as documented.
[...] >>> } else { >>> switch (supplier->links.status) { >>> case DL_DEV_DRIVER_BOUND: >>> >> >> Below this else, I think we should also convert the >> pm_runtime_get_sync(supplier), to a pm_runtime_get_noresume(supplier). >> >> Callers of device_link_add() should know that they need to explicitly >> use DL_FLAG_RPM_ACTIVE, if they want the supplier to be resumed. >> >> Of course, I think that is better done in a separate patch, if revert >> is needed. :-) > > Right, but since the supplied usage counter bumping up is also missing > in some cases in the stateful branch, I think it needs to be done > outside of that big "if ()" and so I will use noresume in it. And if > someone really wants suppliers to resume, they should use > DL_FLAG_RPM_ACTIVE as documented. That makes perfect sense to me as well! Thanks for helping out and kind regards Uffe
diff --git a/drivers/base/core.c b/drivers/base/core.c index ad7b50897bcc..8ffd353f40ce 100644 --- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -259,8 +259,10 @@ struct device_link *device_link_add(struct device *consumer, * runtime PM usage counter after consumer probe * in driver_probe_device(). */ - if (flags & DL_FLAG_PM_RUNTIME) + if (flags & DL_FLAG_PM_RUNTIME) { pm_runtime_get_sync(supplier); + link->rpm_active = true; + } link->status = DL_STATE_CONSUMER_PROBE; break;
When device links with DL_FLAG_PM_RUNTIME are created during consumer probe, the supplier device is unconditionally runtime activated by pm_runtime_get_sync(). However this get() is not noticed by rpm_put_suppliers() called at the end of probe(), because the newly created link lacks a rpm_active state bit. This worked when probe() called pm_runtime_(resume,suspend)_suppliers() unconditionally, but patch 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") changed probe() to rely on the generic rpm_put_suppliers() code. This patch updates the newly created link state to proper value. Fixes: 1e8378619841 ("PM / runtime: Fixup reference counting of device link suppliers at probe") Signed-off-by: Marek Szyprowski <m.szyprowski@samsung.com> --- This fixes Exynos IOMMU driver being unconditionally set to runtime active state since the mentioned commit. --- drivers/base/core.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) -- 2.17.1