diff mbox series

[v2] media: qcom: camss: fix VFE pm domain off

Message ID 20241128-vfe_pm_domain_off-v2-1-0bcbbe7daaaf@mainlining.org
State New
Headers show
Series [v2] media: qcom: camss: fix VFE pm domain off | expand

Commit Message

Barnabás Czémán Nov. 28, 2024, 7:39 p.m. UTC
Fix NULL pointer check before device_link_del
is called.

Unable to handle kernel NULL pointer dereference at virtual address 000000000000032c
Call trace:
 device_link_put_kref+0xc/0xb8
 device_link_del+0x30/0x48
 vfe_pm_domain_off+0x24/0x38 [qcom_camss]
 vfe_put+0x9c/0xd0 [qcom_camss]
 vfe_set_power+0x48/0x58 [qcom_camss]
 pipeline_pm_power_one+0x154/0x158 [videodev]
 pipeline_pm_power+0x74/0xfc [videodev]
 v4l2_pipeline_pm_use+0x54/0x90 [videodev]
 v4l2_pipeline_pm_put+0x14/0x34 [videodev]
 video_release+0x2c/0x44 [qcom_camss]
 v4l2_release+0xe4/0xec [videodev]

Fixes: eb73facec2c2 ("media: qcom: camss: Use common VFE pm_domain_on/pm_domain_off where applicable")
Tested-by: Yassine Oudjana <y.oudjana@protonmail.com>
Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
---
Changes in v2:
- Add backtrace to the commit message.
- Link to v1: https://lore.kernel.org/r/20241122-vfe_pm_domain_off-v1-1-81d18f56563d@mainlining.org
---
 drivers/media/platform/qcom/camss/camss-vfe.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


---
base-commit: ed9a4ad6e5bd3a443e81446476718abebee47e82
change-id: 20241122-vfe_pm_domain_off-c57008e54167

Best regards,

Comments

Vladimir Zapolskiy Nov. 29, 2024, 8:48 a.m. UTC | #1
On 11/28/24 21:39, Barnabás Czémán wrote:
> Fix NULL pointer check before device_link_del
> is called.
> 
> Unable to handle kernel NULL pointer dereference at virtual address 000000000000032c
> Call trace:
>   device_link_put_kref+0xc/0xb8
>   device_link_del+0x30/0x48
>   vfe_pm_domain_off+0x24/0x38 [qcom_camss]
>   vfe_put+0x9c/0xd0 [qcom_camss]
>   vfe_set_power+0x48/0x58 [qcom_camss]
>   pipeline_pm_power_one+0x154/0x158 [videodev]
>   pipeline_pm_power+0x74/0xfc [videodev]
>   v4l2_pipeline_pm_use+0x54/0x90 [videodev]
>   v4l2_pipeline_pm_put+0x14/0x34 [videodev]
>   video_release+0x2c/0x44 [qcom_camss]
>   v4l2_release+0xe4/0xec [videodev]
> 
> Fixes: eb73facec2c2 ("media: qcom: camss: Use common VFE pm_domain_on/pm_domain_off where applicable")
> Tested-by: Yassine Oudjana <y.oudjana@protonmail.com>
> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
> ---
> Changes in v2:
> - Add backtrace to the commit message.
> - Link to v1: https://lore.kernel.org/r/20241122-vfe_pm_domain_off-v1-1-81d18f56563d@mainlining.org
> ---
>   drivers/media/platform/qcom/camss/camss-vfe.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c b/drivers/media/platform/qcom/camss/camss-vfe.c
> index 80a62ba11295042802cbaec617fb87c492ea6a55..1bf1473331f63b9ab106d21ea263c84d851c8a31 100644
> --- a/drivers/media/platform/qcom/camss/camss-vfe.c
> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
> @@ -595,7 +595,7 @@ void vfe_isr_reset_ack(struct vfe_device *vfe)
>    */
>   void vfe_pm_domain_off(struct vfe_device *vfe)
>   {
> -	if (!vfe->genpd)
> +	if (!vfe->genpd_link)
>   		return;
>   
>   	device_link_del(vfe->genpd_link);
> 

I object to this change, there might be a problem in the code, however it
is not yet identified.

vfe->genpd is not NULL, if vfe_pm_domain_on()/vfe_pm_domain_off() are
called appropriately, the "fix" does not fix the real problem, it veils it.

--
Best wishes,
Vladimir
Bryan O'Donoghue Nov. 29, 2024, 11:06 a.m. UTC | #2
On 29/11/2024 08:48, Vladimir Zapolskiy wrote:
> On 11/28/24 21:39, Barnabás Czémán wrote:
>> Fix NULL pointer check before device_link_del
>> is called.
>>
>> Unable to handle kernel NULL pointer dereference at virtual address 
>> 000000000000032c
>> Call trace:
>>   device_link_put_kref+0xc/0xb8
>>   device_link_del+0x30/0x48
>>   vfe_pm_domain_off+0x24/0x38 [qcom_camss]
>>   vfe_put+0x9c/0xd0 [qcom_camss]
>>   vfe_set_power+0x48/0x58 [qcom_camss]
>>   pipeline_pm_power_one+0x154/0x158 [videodev]
>>   pipeline_pm_power+0x74/0xfc [videodev]
>>   v4l2_pipeline_pm_use+0x54/0x90 [videodev]
>>   v4l2_pipeline_pm_put+0x14/0x34 [videodev]
>>   video_release+0x2c/0x44 [qcom_camss]
>>   v4l2_release+0xe4/0xec [videodev]
>>
>> Fixes: eb73facec2c2 ("media: qcom: camss: Use common VFE pm_domain_on/ 
>> pm_domain_off where applicable")
>> Tested-by: Yassine Oudjana <y.oudjana@protonmail.com>
>> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
>> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
>> ---
>> Changes in v2:
>> - Add backtrace to the commit message.
>> - Link to v1: https://lore.kernel.org/r/20241122-vfe_pm_domain_off- 
>> v1-1-81d18f56563d@mainlining.org
>> ---
>>   drivers/media/platform/qcom/camss/camss-vfe.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c b/drivers/ 
>> media/platform/qcom/camss/camss-vfe.c
>> index 
>> 80a62ba11295042802cbaec617fb87c492ea6a55..1bf1473331f63b9ab106d21ea263c84d851c8a31 100644
>> --- a/drivers/media/platform/qcom/camss/camss-vfe.c
>> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
>> @@ -595,7 +595,7 @@ void vfe_isr_reset_ack(struct vfe_device *vfe)
>>    */
>>   void vfe_pm_domain_off(struct vfe_device *vfe)
>>   {
>> -    if (!vfe->genpd)
>> +    if (!vfe->genpd_link)
>>           return;
>>       device_link_del(vfe->genpd_link);
>>
> 
> I object to this change, there might be a problem in the code, however it
> is not yet identified.
> 
> vfe->genpd is not NULL, if vfe_pm_domain_on()/vfe_pm_domain_off() are
> called appropriately, the "fix" does not fix the real problem, it veils it.
> 
> -- 
> Best wishes,
> Vladimir
> 
> 

Let's walk through the logic.

vfe->genpd =

Can happen in vfe_subdev_init();

vfe_pm_domain_on() can fail @ vfe->genpd_link =

If it fails then I _suppose_ we are still calling vfe_pm_domain_off() at 
least that's the only logically way I see this error can manifest.

@Barnabás can you confirm that this is the case ?

If not, can you please provide more detail ?

---
bod
Vladimir Zapolskiy Nov. 29, 2024, 11:20 a.m. UTC | #3
On 11/29/24 13:06, Bryan O'Donoghue wrote:
> On 29/11/2024 08:48, Vladimir Zapolskiy wrote:
>> On 11/28/24 21:39, Barnabás Czémán wrote:
>>> Fix NULL pointer check before device_link_del
>>> is called.
>>>
>>> Unable to handle kernel NULL pointer dereference at virtual address
>>> 000000000000032c
>>> Call trace:
>>>    device_link_put_kref+0xc/0xb8
>>>    device_link_del+0x30/0x48
>>>    vfe_pm_domain_off+0x24/0x38 [qcom_camss]
>>>    vfe_put+0x9c/0xd0 [qcom_camss]
>>>    vfe_set_power+0x48/0x58 [qcom_camss]
>>>    pipeline_pm_power_one+0x154/0x158 [videodev]
>>>    pipeline_pm_power+0x74/0xfc [videodev]
>>>    v4l2_pipeline_pm_use+0x54/0x90 [videodev]
>>>    v4l2_pipeline_pm_put+0x14/0x34 [videodev]
>>>    video_release+0x2c/0x44 [qcom_camss]
>>>    v4l2_release+0xe4/0xec [videodev]
>>>
>>> Fixes: eb73facec2c2 ("media: qcom: camss: Use common VFE pm_domain_on/
>>> pm_domain_off where applicable")
>>> Tested-by: Yassine Oudjana <y.oudjana@protonmail.com>
>>> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
>>> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
>>> ---
>>> Changes in v2:
>>> - Add backtrace to the commit message.
>>> - Link to v1: https://lore.kernel.org/r/20241122-vfe_pm_domain_off-
>>> v1-1-81d18f56563d@mainlining.org
>>> ---
>>>    drivers/media/platform/qcom/camss/camss-vfe.c | 2 +-
>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c b/drivers/
>>> media/platform/qcom/camss/camss-vfe.c
>>> index
>>> 80a62ba11295042802cbaec617fb87c492ea6a55..1bf1473331f63b9ab106d21ea263c84d851c8a31 100644
>>> --- a/drivers/media/platform/qcom/camss/camss-vfe.c
>>> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
>>> @@ -595,7 +595,7 @@ void vfe_isr_reset_ack(struct vfe_device *vfe)
>>>     */
>>>    void vfe_pm_domain_off(struct vfe_device *vfe)
>>>    {
>>> -    if (!vfe->genpd)
>>> +    if (!vfe->genpd_link)
>>>            return;
>>>        device_link_del(vfe->genpd_link);
>>>
>>
>> I object to this change, there might be a problem in the code, however it
>> is not yet identified.
>>
>> vfe->genpd is not NULL, if vfe_pm_domain_on()/vfe_pm_domain_off() are
>> called appropriately, the "fix" does not fix the real problem, it veils it.
>>
>> -- 
>> Best wishes,
>> Vladimir
>>
>>
> 
> Let's walk through the logic.
> 
> vfe->genpd =
> 
> Can happen in vfe_subdev_init();
> 
> vfe_pm_domain_on() can fail @ vfe->genpd_link =
> 
> If it fails then I _suppose_ we are still calling vfe_pm_domain_off() at
> least that's the only logically way I see this error can manifest.

There should be no room for suppositions, the source code is open.

If the described by you case is true, and vfe_pm_domain_on() fails,
then vfe_pm_domain_off() shall not be called, otherwise that's the
real problem and it shall be fixed instead of being veiled by the
proposed change.

> @Barnabás can you confirm that this is the case ?
> 
> If not, can you please provide more detail ?

The change does not describe how to reproduce the problem, which commit
base is tested, which platform is testes, there is no enough information,
unfortunately.

--
Best wishes,
Vladimir
Bryan O'Donoghue Nov. 29, 2024, 11:23 a.m. UTC | #4
On 29/11/2024 11:20, Vladimir Zapolskiy wrote:
> There should be no room for suppositions, the source code is open.
> 
> If the described by you case is true, and vfe_pm_domain_on() fails,
> then vfe_pm_domain_off() shall not be called, otherwise that's the
> real problem and it shall be fixed instead of being veiled by the
> proposed change.

Maybe, maybe not I'd like to hear more from Barnabás on that, who is in 
a position to replicate this bug.

---
bod
Barnabás Czémán Nov. 29, 2024, 11:44 a.m. UTC | #5
On 2024-11-29 12:20, Vladimir Zapolskiy wrote:
> On 11/29/24 13:06, Bryan O'Donoghue wrote:
>> On 29/11/2024 08:48, Vladimir Zapolskiy wrote:
>>> On 11/28/24 21:39, Barnabás Czémán wrote:
>>>> Fix NULL pointer check before device_link_del
>>>> is called.
>>>> 
>>>> Unable to handle kernel NULL pointer dereference at virtual address
>>>> 000000000000032c
>>>> Call trace:
>>>>    device_link_put_kref+0xc/0xb8
>>>>    device_link_del+0x30/0x48
>>>>    vfe_pm_domain_off+0x24/0x38 [qcom_camss]
>>>>    vfe_put+0x9c/0xd0 [qcom_camss]
>>>>    vfe_set_power+0x48/0x58 [qcom_camss]
>>>>    pipeline_pm_power_one+0x154/0x158 [videodev]
>>>>    pipeline_pm_power+0x74/0xfc [videodev]
>>>>    v4l2_pipeline_pm_use+0x54/0x90 [videodev]
>>>>    v4l2_pipeline_pm_put+0x14/0x34 [videodev]
>>>>    video_release+0x2c/0x44 [qcom_camss]
>>>>    v4l2_release+0xe4/0xec [videodev]
>>>> 
>>>> Fixes: eb73facec2c2 ("media: qcom: camss: Use common VFE 
>>>> pm_domain_on/
>>>> pm_domain_off where applicable")
>>>> Tested-by: Yassine Oudjana <y.oudjana@protonmail.com>
>>>> Reviewed-by: Bryan O'Donoghue <bryan.odonoghue@linaro.org>
>>>> Signed-off-by: Barnabás Czémán <barnabas.czeman@mainlining.org>
>>>> ---
>>>> Changes in v2:
>>>> - Add backtrace to the commit message.
>>>> - Link to v1: https://lore.kernel.org/r/20241122-vfe_pm_domain_off-
>>>> v1-1-81d18f56563d@mainlining.org
>>>> ---
>>>>    drivers/media/platform/qcom/camss/camss-vfe.c | 2 +-
>>>>    1 file changed, 1 insertion(+), 1 deletion(-)
>>>> 
>>>> diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c 
>>>> b/drivers/
>>>> media/platform/qcom/camss/camss-vfe.c
>>>> index
>>>> 80a62ba11295042802cbaec617fb87c492ea6a55..1bf1473331f63b9ab106d21ea263c84d851c8a31 
>>>> 100644
>>>> --- a/drivers/media/platform/qcom/camss/camss-vfe.c
>>>> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
>>>> @@ -595,7 +595,7 @@ void vfe_isr_reset_ack(struct vfe_device *vfe)
>>>>     */
>>>>    void vfe_pm_domain_off(struct vfe_device *vfe)
>>>>    {
>>>> -    if (!vfe->genpd)
>>>> +    if (!vfe->genpd_link)
>>>>            return;
>>>>        device_link_del(vfe->genpd_link);
>>>> 
>>> 
>>> I object to this change, there might be a problem in the code, 
>>> however it
>>> is not yet identified.
>>> 
>>> vfe->genpd is not NULL, if vfe_pm_domain_on()/vfe_pm_domain_off() are
>>> called appropriately, the "fix" does not fix the real problem, it 
>>> veils it.
>>> 
>>> -- Best wishes,
>>> Vladimir
>>> 
>>> 
>> 
>> Let's walk through the logic.
>> 
>> vfe->genpd =
>> 
>> Can happen in vfe_subdev_init();
>> 
>> vfe_pm_domain_on() can fail @ vfe->genpd_link =
>> 
>> If it fails then I _suppose_ we are still calling vfe_pm_domain_off() 
>> at
>> least that's the only logically way I see this error can manifest.
> 
> There should be no room for suppositions, the source code is open.
> 
> If the described by you case is true, and vfe_pm_domain_on() fails,
> then vfe_pm_domain_off() shall not be called, otherwise that's the
> real problem and it shall be fixed instead of being veiled by the
> proposed change.
> 
>> @Barnabás can you confirm that this is the case ?
>> 
>> If not, can you please provide more detail ?
> 
> The change does not describe how to reproduce the problem, which commit
> base is tested, which platform is testes, there is no enough 
> information,
> unfortunately.
I can reproduce the problem with megapixels-sensorprofile on msm8953 and
it can be reproduced with megapixels on msm8996.
The base is the last commit on next.
> 
> --
> Best wishes,
> Vladimir
Bryan O'Donoghue Nov. 29, 2024, 12:25 p.m. UTC | #6
On 29/11/2024 11:44, barnabas.czeman@mainlining.org wrote:
>> The change does not describe how to reproduce the problem, which commit
>> base is tested, which platform is testes, there is no enough information,
>> unfortunately.
> I can reproduce the problem with megapixels-sensorprofile on msm8953 and
> it can be reproduced with megapixels on msm8996.
> The base is the last commit on next.

Can you verify if vfe_domain_on has run and if so whether or not 
genpd_link is NULL when that function exists.

That's the question.

---
bod
Barnabás Czémán Nov. 29, 2024, 1:46 p.m. UTC | #7
On 2024-11-29 13:25, Bryan O'Donoghue wrote:
> On 29/11/2024 11:44, barnabas.czeman@mainlining.org wrote:
>>> The change does not describe how to reproduce the problem, which 
>>> commit
>>> base is tested, which platform is testes, there is no enough 
>>> information,
>>> unfortunately.
>> I can reproduce the problem with megapixels-sensorprofile on msm8953 
>> and
>> it can be reproduced with megapixels on msm8996.
>> The base is the last commit on next.
> 
> Can you verify if vfe_domain_on has run and if so whether or not 
> genpd_link is NULL when that function exists.
> 
I have added some debug logs it seems pm_domain_on and pm_domain_off is 
called twice on the same object.
[   63.473360] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
42973800
[   63.481524] qcom-camss 1b00020.camss: pm_domain_on 19840080 link 
4e413800
[   63.481555] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
42973800
[   63.481632] qcom-camss 1b00020.camss: pm_domain_off 19840080 link 
4e413800
[   63.481641] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 
42973800
[   63.654004] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 0
> That's the question.
> 
> ---
> bod
Bryan O'Donoghue Nov. 29, 2024, 10:08 p.m. UTC | #8
On 29/11/2024 13:46, barnabas.czeman@mainlining.org wrote:
> On 2024-11-29 13:25, Bryan O'Donoghue wrote:
>> On 29/11/2024 11:44, barnabas.czeman@mainlining.org wrote:
>>>> The change does not describe how to reproduce the problem, which commit
>>>> base is tested, which platform is testes, there is no enough 
>>>> information,
>>>> unfortunately.
>>> I can reproduce the problem with megapixels-sensorprofile on msm8953 and
>>> it can be reproduced with megapixels on msm8996.
>>> The base is the last commit on next.
>>
>> Can you verify if vfe_domain_on has run and if so whether or not 
>> genpd_link is NULL when that function exists.
>>
> I have added some debug logs it seems pm_domain_on and pm_domain_off is 
> called twice on the same object.
> [   63.473360] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
> 42973800
> [   63.481524] qcom-camss 1b00020.camss: pm_domain_on 19840080 link 
> 4e413800
> [   63.481555] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
> 42973800
> [   63.481632] qcom-camss 1b00020.camss: pm_domain_off 19840080 link 
> 4e413800
> [   63.481641] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 
> 42973800
> [   63.654004] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 0
>> That's the question.
>>
>> ---
>> bod

Could you provide this output ?

index 80a62ba112950..b25b8f6b00be1 100644
--- a/drivers/media/platform/qcom/camss/camss-vfe.c
+++ b/drivers/media/platform/qcom/camss/camss-vfe.c
@@ -595,6 +595,9 @@ void vfe_isr_reset_ack(struct vfe_device *vfe)
   */
  void vfe_pm_domain_off(struct vfe_device *vfe)
  {
+dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
+        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
+
         if (!vfe->genpd)
                 return;

@@ -609,7 +612,8 @@ void vfe_pm_domain_off(struct vfe_device *vfe)
  int vfe_pm_domain_on(struct vfe_device *vfe)
  {
         struct camss *camss = vfe->camss;
-
+dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
+        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
         if (!vfe->genpd)
                 return 0;

---
bod
Barnabás Czémán Nov. 29, 2024, 10:45 p.m. UTC | #9
On 2024-11-29 23:08, Bryan O'Donoghue wrote:
> On 29/11/2024 13:46, barnabas.czeman@mainlining.org wrote:
>> On 2024-11-29 13:25, Bryan O'Donoghue wrote:
>>> On 29/11/2024 11:44, barnabas.czeman@mainlining.org wrote:
>>>>> The change does not describe how to reproduce the problem, which 
>>>>> commit
>>>>> base is tested, which platform is testes, there is no enough 
>>>>> information,
>>>>> unfortunately.
>>>> I can reproduce the problem with megapixels-sensorprofile on msm8953 
>>>> and
>>>> it can be reproduced with megapixels on msm8996.
>>>> The base is the last commit on next.
>>> 
>>> Can you verify if vfe_domain_on has run and if so whether or not 
>>> genpd_link is NULL when that function exists.
>>> 
>> I have added some debug logs it seems pm_domain_on and pm_domain_off 
>> is called twice on the same object.
>> [   63.473360] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
>> 42973800
>> [   63.481524] qcom-camss 1b00020.camss: pm_domain_on 19840080 link 
>> 4e413800
>> [   63.481555] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
>> 42973800
>> [   63.481632] qcom-camss 1b00020.camss: pm_domain_off 19840080 link 
>> 4e413800
>> [   63.481641] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 
>> 42973800
>> [   63.654004] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 0
>>> That's the question.
>>> 
>>> ---
>>> bod
> 
> Could you provide this output ?
> 
> index 80a62ba112950..b25b8f6b00be1 100644
> --- a/drivers/media/platform/qcom/camss/camss-vfe.c
> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
> @@ -595,6 +595,9 @@ void vfe_isr_reset_ack(struct vfe_device *vfe)
>   */
>  void vfe_pm_domain_off(struct vfe_device *vfe)
>  {
> +dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
> +        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
> +
>         if (!vfe->genpd)
>                 return;
> 
> @@ -609,7 +612,8 @@ void vfe_pm_domain_off(struct vfe_device *vfe)
>  int vfe_pm_domain_on(struct vfe_device *vfe)
>  {
>         struct camss *camss = vfe->camss;
> -
> +dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
> +        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
>         if (!vfe->genpd)
>                 return 0;
> 
> ---
> bod
I think logging in pm_domain_on should be placed after device_link_add 
because only NULL
will be visible.
[   83.040694] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
000000009bd8355f genpd_link 0000000000000000
[   83.049293] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 0 genpd 
00000000bfb65e7c genpd_link 0000000000000000
[   83.049353] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
000000009bd8355f genpd_link 00000000ccb0acd9
[   83.049641] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 0 genpd 
00000000bfb65e7c genpd_link 00000000348ac3c1
[   83.049654] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
000000009bd8355f genpd_link 00000000ccb0acd9
[   83.241498] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
000000009bd8355f genpd_link 0000000000000000
Bryan O'Donoghue Nov. 29, 2024, 11:07 p.m. UTC | #10
On 29/11/2024 22:45, barnabas.czeman@mainlining.org wrote:
> On 2024-11-29 23:08, Bryan O'Donoghue wrote:
>> On 29/11/2024 13:46, barnabas.czeman@mainlining.org wrote:
>>> On 2024-11-29 13:25, Bryan O'Donoghue wrote:
>>>> On 29/11/2024 11:44, barnabas.czeman@mainlining.org wrote:
>>>>>> The change does not describe how to reproduce the problem, which 
>>>>>> commit
>>>>>> base is tested, which platform is testes, there is no enough 
>>>>>> information,
>>>>>> unfortunately.
>>>>> I can reproduce the problem with megapixels-sensorprofile on 
>>>>> msm8953 and
>>>>> it can be reproduced with megapixels on msm8996.
>>>>> The base is the last commit on next.
>>>>
>>>> Can you verify if vfe_domain_on has run and if so whether or not 
>>>> genpd_link is NULL when that function exists.
>>>>
>>> I have added some debug logs it seems pm_domain_on and pm_domain_off 
>>> is called twice on the same object.
>>> [   63.473360] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
>>> 42973800
>>> [   63.481524] qcom-camss 1b00020.camss: pm_domain_on 19840080 link 
>>> 4e413800
>>> [   63.481555] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
>>> 42973800
>>> [   63.481632] qcom-camss 1b00020.camss: pm_domain_off 19840080 link 
>>> 4e413800
>>> [   63.481641] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 
>>> 42973800
>>> [   63.654004] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 0
>>>> That's the question.
>>>>
>>>> ---
>>>> bod
>>
>> Could you provide this output ?
>>
>> index 80a62ba112950..b25b8f6b00be1 100644
>> --- a/drivers/media/platform/qcom/camss/camss-vfe.c
>> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
>> @@ -595,6 +595,9 @@ void vfe_isr_reset_ack(struct vfe_device *vfe)
>>   */
>>  void vfe_pm_domain_off(struct vfe_device *vfe)
>>  {
>> +dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
>> +        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
>> +
>>         if (!vfe->genpd)
>>                 return;
>>
>> @@ -609,7 +612,8 @@ void vfe_pm_domain_off(struct vfe_device *vfe)
>>  int vfe_pm_domain_on(struct vfe_device *vfe)
>>  {
>>         struct camss *camss = vfe->camss;
>> -
>> +dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
>> +        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
>>         if (!vfe->genpd)
>>                 return 0;
>>
>> ---
>> bod
> I think logging in pm_domain_on should be placed after device_link_add 
> because only NULL
> will be visible.
> [   83.040694] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
> 000000009bd8355f genpd_link 0000000000000000
> [   83.049293] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 0 genpd 
> 00000000bfb65e7c genpd_link 0000000000000000
> [   83.049353] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
> 000000009bd8355f genpd_link 00000000ccb0acd9
> [   83.049641] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 0 genpd 
> 00000000bfb65e7c genpd_link 00000000348ac3c1
> [   83.049654] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
> 000000009bd8355f genpd_link 00000000ccb0acd9
> [   83.241498] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
> 000000009bd8355f genpd_link 0000000000000000

Could you add

+++ b/drivers/media/platform/qcom/camss/camss-vfe.c
@@ -786,7 +786,7 @@ int vfe_get(struct vfe_device *vfe)
         int ret;

         mutex_lock(&vfe->power_lock);
-
+dev_info(vfe->camss->dev, "%s vfe %d power_count %d\n", __func__, 
vfe->id, vfe->power_count);
         if (vfe->power_count == 0) {
                 ret = vfe->res->hw_ops->pm_domain_on(vfe);
                 if (ret < 0)
@@ -823,6 +823,7 @@ int vfe_get(struct vfe_device *vfe)

         mutex_unlock(&vfe->power_lock);

+dev_info(camss->vfe->dev, "%s vfe %d err=%d\n", __func__, 
camss->vfe->id, 0);
         return 0;

  error_reset:
@@ -835,7 +836,7 @@ int vfe_get(struct vfe_device *vfe)

  error_pm_domain:
         mutex_unlock(&vfe->power_lock);
-
+dev_info(camss->vfe->dev, "%s vfe %d err=%d\n", __func__, 
camss->vfe->id, ret);
         return ret;
  }

?

---
bod
Barnabás Czémán Nov. 29, 2024, 11:52 p.m. UTC | #11
On 2024-11-30 00:07, Bryan O'Donoghue wrote:
> On 29/11/2024 22:45, barnabas.czeman@mainlining.org wrote:
>> On 2024-11-29 23:08, Bryan O'Donoghue wrote:
>>> On 29/11/2024 13:46, barnabas.czeman@mainlining.org wrote:
>>>> On 2024-11-29 13:25, Bryan O'Donoghue wrote:
>>>>> On 29/11/2024 11:44, barnabas.czeman@mainlining.org wrote:
>>>>>>> The change does not describe how to reproduce the problem, which 
>>>>>>> commit
>>>>>>> base is tested, which platform is testes, there is no enough 
>>>>>>> information,
>>>>>>> unfortunately.
>>>>>> I can reproduce the problem with megapixels-sensorprofile on 
>>>>>> msm8953 and
>>>>>> it can be reproduced with megapixels on msm8996.
>>>>>> The base is the last commit on next.
>>>>> 
>>>>> Can you verify if vfe_domain_on has run and if so whether or not 
>>>>> genpd_link is NULL when that function exists.
>>>>> 
>>>> I have added some debug logs it seems pm_domain_on and pm_domain_off 
>>>> is called twice on the same object.
>>>> [   63.473360] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
>>>> 42973800
>>>> [   63.481524] qcom-camss 1b00020.camss: pm_domain_on 19840080 link 
>>>> 4e413800
>>>> [   63.481555] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
>>>> 42973800
>>>> [   63.481632] qcom-camss 1b00020.camss: pm_domain_off 19840080 link 
>>>> 4e413800
>>>> [   63.481641] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 
>>>> 42973800
>>>> [   63.654004] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 
>>>> 0
>>>>> That's the question.
>>>>> 
>>>>> ---
>>>>> bod
>>> 
>>> Could you provide this output ?
>>> 
>>> index 80a62ba112950..b25b8f6b00be1 100644
>>> --- a/drivers/media/platform/qcom/camss/camss-vfe.c
>>> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
>>> @@ -595,6 +595,9 @@ void vfe_isr_reset_ack(struct vfe_device *vfe)
>>>   */
>>>  void vfe_pm_domain_off(struct vfe_device *vfe)
>>>  {
>>> +dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
>>> +        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
>>> +
>>>         if (!vfe->genpd)
>>>                 return;
>>> 
>>> @@ -609,7 +612,8 @@ void vfe_pm_domain_off(struct vfe_device *vfe)
>>>  int vfe_pm_domain_on(struct vfe_device *vfe)
>>>  {
>>>         struct camss *camss = vfe->camss;
>>> -
>>> +dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
>>> +        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
>>>         if (!vfe->genpd)
>>>                 return 0;
>>> 
>>> ---
>>> bod
>> I think logging in pm_domain_on should be placed after device_link_add 
>> because only NULL
>> will be visible.
>> [   83.040694] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
>> 000000009bd8355f genpd_link 0000000000000000
>> [   83.049293] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 0 genpd 
>> 00000000bfb65e7c genpd_link 0000000000000000
>> [   83.049353] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
>> 000000009bd8355f genpd_link 00000000ccb0acd9
>> [   83.049641] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 0 genpd 
>> 00000000bfb65e7c genpd_link 00000000348ac3c1
>> [   83.049654] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
>> 000000009bd8355f genpd_link 00000000ccb0acd9
>> [   83.241498] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
>> 000000009bd8355f genpd_link 0000000000000000
> 
> Could you add
> 
> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
> @@ -786,7 +786,7 @@ int vfe_get(struct vfe_device *vfe)
>         int ret;
> 
>         mutex_lock(&vfe->power_lock);
> -
> +dev_info(vfe->camss->dev, "%s vfe %d power_count %d\n", __func__, 
> vfe->id, vfe->power_count);
>         if (vfe->power_count == 0) {
>                 ret = vfe->res->hw_ops->pm_domain_on(vfe);
>                 if (ret < 0)
> @@ -823,6 +823,7 @@ int vfe_get(struct vfe_device *vfe)
> 
>         mutex_unlock(&vfe->power_lock);
> 
> +dev_info(camss->vfe->dev, "%s vfe %d err=%d\n", __func__, 
> camss->vfe->id, 0);
>         return 0;
> 
>  error_reset:
> @@ -835,7 +836,7 @@ int vfe_get(struct vfe_device *vfe)
> 
>  error_pm_domain:
>         mutex_unlock(&vfe->power_lock);
> -
> +dev_info(camss->vfe->dev, "%s vfe %d err=%d\n", __func__, 
> camss->vfe->id, ret);
>         return ret;
>  }
> 
> ?
> 
> ---
> bod
I have added little more from the logs because it is only failing in 
edge cases megapixels-sensorprofile failing by
different reason quickly and trying to release the device.
[   54.719030] qcom-camss 1b00020.camss: vfe_get vfe 0 err=0
[   54.750124] qcom-camss 1b00020.camss: vfe_get vfe 0 power_count 1
[   54.750236] qcom-camss 1b00020.camss: vfe_get vfe 0 err=0
[   54.751270] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 0 genpd 
00000000beaef03c genpd_link 00000000251644d9
[   54.751433] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
000000007ce2da53 genpd_link 0000000000000000
[   54.755463] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 0 genpd 
00000000beaef03c genpd_link 00000000251644d9
[   54.755531] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
000000007ce2da53 genpd_link 0000000058dcd4d6
[   55.861084] qcom-camss 1b00020.camss: vfe_get vfe 0 power_count 1
[   55.861177] qcom-camss 1b00020.camss: vfe_get vfe 0 err=0
[   55.861778] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 0 genpd 
00000000beaef03c genpd_link 0000000000000000
[   55.861860] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
000000007ce2da53 genpd_link 0000000000000000
[   55.862242] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 0 genpd 
00000000beaef03c genpd_link 00000000251644d9
[   55.862258] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
000000007ce2da53 genpd_link 00000000e41c1881
[  143.911690] qcom-camss 1b00020.camss: vfe_get vfe 0 power_count 2
[  143.911762] qcom-camss 1b00020.camss: vfe_get vfe 0 err=0
[  143.911800] qcom-camss 1b00020.camss: vfe_get vfe 0 power_count 3
[  143.911821] qcom-camss 1b00020.camss: vfe_get vfe 0 err=0
[  143.911858] qcom-camss 1b00020.camss: vfe_get vfe 1 power_count 0
[  143.911871] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
000000007ce2da53 genpd_link 0000000000000000
[  143.912393] qcom-camss 1b00020.camss: vfe_get vfe 1 err=0
[  143.912489] qcom-camss 1b00020.camss: vfe_get vfe 1 power_count 1
[  143.912513] qcom-camss 1b00020.camss: vfe_get vfe 1 err=0
[  143.912553] qcom-camss 1b00020.camss: vfe_get vfe 1 power_count 2
[  143.912572] qcom-camss 1b00020.camss: vfe_get vfe 1 err=0
[  143.921750] qcom-camss 1b00020.camss: vfe_get vfe 0 power_count 3
[  143.921802] qcom-camss 1b00020.camss: vfe_get vfe 0 err=0
[  143.922670] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 0 genpd 
00000000beaef03c genpd_link 0000000000000000
[  143.922725] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
000000007ce2da53 genpd_link 00000000d1fcd54b
[  143.922853] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 0 genpd 
00000000beaef03c genpd_link 00000000251644d9
[  143.922868] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
000000007ce2da53 genpd_link 00000000d1fcd54b
[  144.126535] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
000000007ce2da53 genpd_link 0000000000000000
Bryan O'Donoghue Nov. 30, 2024, 9:48 p.m. UTC | #12
On 29/11/2024 23:52, barnabas.czeman@mainlining.org wrote:
> On 2024-11-30 00:07, Bryan O'Donoghue wrote:
>> On 29/11/2024 22:45, barnabas.czeman@mainlining.org wrote:
>>> On 2024-11-29 23:08, Bryan O'Donoghue wrote:
>>>> On 29/11/2024 13:46, barnabas.czeman@mainlining.org wrote:
>>>>> On 2024-11-29 13:25, Bryan O'Donoghue wrote:
>>>>>> On 29/11/2024 11:44, barnabas.czeman@mainlining.org wrote:
>>>>>>>> The change does not describe how to reproduce the problem, which 
>>>>>>>> commit
>>>>>>>> base is tested, which platform is testes, there is no enough 
>>>>>>>> information,
>>>>>>>> unfortunately.
>>>>>>> I can reproduce the problem with megapixels-sensorprofile on 
>>>>>>> msm8953 and
>>>>>>> it can be reproduced with megapixels on msm8996.
>>>>>>> The base is the last commit on next.
>>>>>>
>>>>>> Can you verify if vfe_domain_on has run and if so whether or not 
>>>>>> genpd_link is NULL when that function exists.
>>>>>>
>>>>> I have added some debug logs it seems pm_domain_on and 
>>>>> pm_domain_off is called twice on the same object.
>>>>> [   63.473360] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
>>>>> 42973800
>>>>> [   63.481524] qcom-camss 1b00020.camss: pm_domain_on 19840080 link 
>>>>> 4e413800
>>>>> [   63.481555] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 link 
>>>>> 42973800
>>>>> [   63.481632] qcom-camss 1b00020.camss: pm_domain_off 19840080 
>>>>> link 4e413800
>>>>> [   63.481641] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 
>>>>> link 42973800
>>>>> [   63.654004] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 link 0
>>>>>> That's the question.
>>>>>>
>>>>>> ---
>>>>>> bod
>>>>
>>>> Could you provide this output ?
>>>>
>>>> index 80a62ba112950..b25b8f6b00be1 100644
>>>> --- a/drivers/media/platform/qcom/camss/camss-vfe.c
>>>> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
>>>> @@ -595,6 +595,9 @@ void vfe_isr_reset_ack(struct vfe_device *vfe)
>>>>   */
>>>>  void vfe_pm_domain_off(struct vfe_device *vfe)
>>>>  {
>>>> +dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
>>>> +        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
>>>> +
>>>>         if (!vfe->genpd)
>>>>                 return;
>>>>
>>>> @@ -609,7 +612,8 @@ void vfe_pm_domain_off(struct vfe_device *vfe)
>>>>  int vfe_pm_domain_on(struct vfe_device *vfe)
>>>>  {
>>>>         struct camss *camss = vfe->camss;
>>>> -
>>>> +dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
>>>> +        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
>>>>         if (!vfe->genpd)
>>>>                 return 0;
>>>>
>>>> ---
>>>> bod
>>> I think logging in pm_domain_on should be placed after 
>>> device_link_add because only NULL
>>> will be visible.
>>> [   83.040694] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
>>> 000000009bd8355f genpd_link 0000000000000000
>>> [   83.049293] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 0 genpd 
>>> 00000000bfb65e7c genpd_link 0000000000000000
>>> [   83.049353] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
>>> 000000009bd8355f genpd_link 00000000ccb0acd9
>>> [   83.049641] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 0 
>>> genpd 00000000bfb65e7c genpd_link 00000000348ac3c1
>>> [   83.049654] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 
>>> genpd 000000009bd8355f genpd_link 00000000ccb0acd9
>>> [   83.241498] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 
>>> genpd 000000009bd8355f genpd_link 0000000000000000
>>
>> Could you add
>>
>> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
>> @@ -786,7 +786,7 @@ int vfe_get(struct vfe_device *vfe)
>>         int ret;
>>
>>         mutex_lock(&vfe->power_lock);
>> -
>> +dev_info(vfe->camss->dev, "%s vfe %d power_count %d\n", __func__, 
>> vfe->id, vfe->power_count);
>>         if (vfe->power_count == 0) {
>>                 ret = vfe->res->hw_ops->pm_domain_on(vfe);
>>                 if (ret < 0)
>> @@ -823,6 +823,7 @@ int vfe_get(struct vfe_device *vfe)
>>
>>         mutex_unlock(&vfe->power_lock);
>>
>> +dev_info(camss->vfe->dev, "%s vfe %d err=%d\n", __func__, camss->vfe- 
>> >id, 0);
>>         return 0;
>>
>>  error_reset:
>> @@ -835,7 +836,7 @@ int vfe_get(struct vfe_device *vfe)
>>
>>  error_pm_domain:
>>         mutex_unlock(&vfe->power_lock);
>> -
>> +dev_info(camss->vfe->dev, "%s vfe %d err=%d\n", __func__, camss->vfe- 
>> >id, ret);
>>         return ret;
>>  }
>>
>> ?
>>
>> ---
>> bod
> I have added little more from the logs because it is only failing in 
> edge cases megapixels-sensorprofile failing by
> different reason quickly and trying to release the device.
> [   54.719030] qcom-camss 1b00020.camss: vfe_get vfe 0 err=0
> [   54.750124] qcom-camss 1b00020.camss: vfe_get vfe 0 power_count 1
> [   54.750236] qcom-camss 1b00020.camss: vfe_get vfe 0 err=0
> [   54.751270] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 0 genpd 
> 00000000beaef03c genpd_link 00000000251644d9

> [   54.751433] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
> 000000007ce2da53 genpd_link 0000000000000000
> [   54.755531] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
> 000000007ce2da53 genpd_link 0000000058dcd4d6

that's a bug genpd_link should be NULL unless power_count != 0

> [  143.922868] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
> 000000007ce2da53 genpd_link 00000000d1fcd54b
> [  144.126535] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
> 000000007ce2da53 genpd_link 0000000000000000

this is the corollary of the bug

can you provide the output of the attached please ?
Barnabás Czémán Nov. 30, 2024, 10:58 p.m. UTC | #13
On 2024-11-30 22:48, Bryan O'Donoghue wrote:
> On 29/11/2024 23:52, barnabas.czeman@mainlining.org wrote:
>> On 2024-11-30 00:07, Bryan O'Donoghue wrote:
>>> On 29/11/2024 22:45, barnabas.czeman@mainlining.org wrote:
>>>> On 2024-11-29 23:08, Bryan O'Donoghue wrote:
>>>>> On 29/11/2024 13:46, barnabas.czeman@mainlining.org wrote:
>>>>>> On 2024-11-29 13:25, Bryan O'Donoghue wrote:
>>>>>>> On 29/11/2024 11:44, barnabas.czeman@mainlining.org wrote:
>>>>>>>>> The change does not describe how to reproduce the problem, 
>>>>>>>>> which commit
>>>>>>>>> base is tested, which platform is testes, there is no enough 
>>>>>>>>> information,
>>>>>>>>> unfortunately.
>>>>>>>> I can reproduce the problem with megapixels-sensorprofile on 
>>>>>>>> msm8953 and
>>>>>>>> it can be reproduced with megapixels on msm8996.
>>>>>>>> The base is the last commit on next.
>>>>>>> 
>>>>>>> Can you verify if vfe_domain_on has run and if so whether or not 
>>>>>>> genpd_link is NULL when that function exists.
>>>>>>> 
>>>>>> I have added some debug logs it seems pm_domain_on and 
>>>>>> pm_domain_off is called twice on the same object.
>>>>>> [   63.473360] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 
>>>>>> link 42973800
>>>>>> [   63.481524] qcom-camss 1b00020.camss: pm_domain_on 19840080 
>>>>>> link 4e413800
>>>>>> [   63.481555] qcom-camss 1b00020.camss: pm_domain_on 19842ce8 
>>>>>> link 42973800
>>>>>> [   63.481632] qcom-camss 1b00020.camss: pm_domain_off 19840080 
>>>>>> link 4e413800
>>>>>> [   63.481641] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 
>>>>>> link 42973800
>>>>>> [   63.654004] qcom-camss 1b00020.camss: pm_domain_off 19842ce8 
>>>>>> link 0
>>>>>>> That's the question.
>>>>>>> 
>>>>>>> ---
>>>>>>> bod
>>>>> 
>>>>> Could you provide this output ?
>>>>> 
>>>>> index 80a62ba112950..b25b8f6b00be1 100644
>>>>> --- a/drivers/media/platform/qcom/camss/camss-vfe.c
>>>>> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
>>>>> @@ -595,6 +595,9 @@ void vfe_isr_reset_ack(struct vfe_device *vfe)
>>>>>   */
>>>>>  void vfe_pm_domain_off(struct vfe_device *vfe)
>>>>>  {
>>>>> +dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
>>>>> +        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
>>>>> +
>>>>>         if (!vfe->genpd)
>>>>>                 return;
>>>>> 
>>>>> @@ -609,7 +612,8 @@ void vfe_pm_domain_off(struct vfe_device *vfe)
>>>>>  int vfe_pm_domain_on(struct vfe_device *vfe)
>>>>>  {
>>>>>         struct camss *camss = vfe->camss;
>>>>> -
>>>>> +dev_info(camss->dev, "%s VFE %d genpd %pK genpd_link %pK\n",
>>>>> +        __func__, vfe->id, vfe->genpd, vfe->genpd_link);
>>>>>         if (!vfe->genpd)
>>>>>                 return 0;
>>>>> 
>>>>> ---
>>>>> bod
>>>> I think logging in pm_domain_on should be placed after 
>>>> device_link_add because only NULL
>>>> will be visible.
>>>> [   83.040694] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 
>>>> genpd 000000009bd8355f genpd_link 0000000000000000
>>>> [   83.049293] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 0 
>>>> genpd 00000000bfb65e7c genpd_link 0000000000000000
>>>> [   83.049353] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 
>>>> genpd 000000009bd8355f genpd_link 00000000ccb0acd9
>>>> [   83.049641] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 0 
>>>> genpd 00000000bfb65e7c genpd_link 00000000348ac3c1
>>>> [   83.049654] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 
>>>> genpd 000000009bd8355f genpd_link 00000000ccb0acd9
>>>> [   83.241498] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 
>>>> genpd 000000009bd8355f genpd_link 0000000000000000
>>> 
>>> Could you add
>>> 
>>> +++ b/drivers/media/platform/qcom/camss/camss-vfe.c
>>> @@ -786,7 +786,7 @@ int vfe_get(struct vfe_device *vfe)
>>>         int ret;
>>> 
>>>         mutex_lock(&vfe->power_lock);
>>> -
>>> +dev_info(vfe->camss->dev, "%s vfe %d power_count %d\n", __func__, 
>>> vfe->id, vfe->power_count);
>>>         if (vfe->power_count == 0) {
>>>                 ret = vfe->res->hw_ops->pm_domain_on(vfe);
>>>                 if (ret < 0)
>>> @@ -823,6 +823,7 @@ int vfe_get(struct vfe_device *vfe)
>>> 
>>>         mutex_unlock(&vfe->power_lock);
>>> 
>>> +dev_info(camss->vfe->dev, "%s vfe %d err=%d\n", __func__, 
>>> camss->vfe- >id, 0);
>>>         return 0;
>>> 
>>>  error_reset:
>>> @@ -835,7 +836,7 @@ int vfe_get(struct vfe_device *vfe)
>>> 
>>>  error_pm_domain:
>>>         mutex_unlock(&vfe->power_lock);
>>> -
>>> +dev_info(camss->vfe->dev, "%s vfe %d err=%d\n", __func__, 
>>> camss->vfe- >id, ret);
>>>         return ret;
>>>  }
>>> 
>>> ?
>>> 
>>> ---
>>> bod
>> I have added little more from the logs because it is only failing in 
>> edge cases megapixels-sensorprofile failing by
>> different reason quickly and trying to release the device.
>> [   54.719030] qcom-camss 1b00020.camss: vfe_get vfe 0 err=0
>> [   54.750124] qcom-camss 1b00020.camss: vfe_get vfe 0 power_count 1
>> [   54.750236] qcom-camss 1b00020.camss: vfe_get vfe 0 err=0
>> [   54.751270] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 0 genpd 
>> 00000000beaef03c genpd_link 00000000251644d9
> 
>> [   54.751433] qcom-camss 1b00020.camss: vfe_pm_domain_on VFE 1 genpd 
>> 000000007ce2da53 genpd_link 0000000000000000
>> [   54.755531] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
>> 000000007ce2da53 genpd_link 0000000058dcd4d6
> 
> that's a bug genpd_link should be NULL unless power_count != 0
> 
>> [  143.922868] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
>> 000000007ce2da53 genpd_link 00000000d1fcd54b
>> [  144.126535] qcom-camss 1b00020.camss: vfe_pm_domain_off VFE 1 genpd 
>> 000000007ce2da53 genpd_link 0000000000000000
> 
> this is the corollary of the bug
> 
> can you provide the output of the attached please ?
[   50.787730] qcom-camss 1b00020.camss: vfe_get/806 vfe 1 power_count 0
[   50.794888] qcom-camss 1b00020.camss: vfe_get/811 vfe 1 power_count 0
[   50.795040] qcom-camss 1b00020.camss: vfe_get/816 vfe 1 power_count 0
[   50.795131] qcom-camss 1b00020.camss: vfe_get/822 vfe 1 power_count 0
[   50.795172] qcom-camss 1b00020.camss: vfe_get/827 vfe 1 power_count 0
[   50.795180] qcom-camss 1b00020.camss: vfe_get/830 vfe 1 power_count 0
[   50.795188] qcom-camss 1b00020.camss: vfe_get/841 vfe 1 power_count 1
[   50.795413] qcom-camss 1b00020.camss: vfe_put/868 vfe 1 power_count 1
[   50.795422] qcom-camss 1b00020.camss: vfe_put/874 vfe 1 power_count 1
[   50.795429] qcom-camss 1b00020.camss: vfe_put/882 vfe 1 power_count 1
[   50.795468] qcom-camss 1b00020.camss: vfe_put/884 vfe 1 power_count 1
[   50.799936] qcom-camss 1b00020.camss: vfe_put/886 vfe 1 power_count 1
[   50.800247] qcom-camss 1b00020.camss: vfe_put/888 vfe 1 power_count 1
[   50.800257] qcom-camss 1b00020.camss: vfe_put/891 vfe 1 power_count 1
[   50.800263] qcom-camss 1b00020.camss: vfe_put/893 vfe 1 power_count 0
[   51.086159] qcom-camss 1b00020.camss: vfe_get/801 vfe 0 power_count 0
[   51.088158] qcom-camss 1b00020.camss: vfe_get/806 vfe 0 power_count 0
[   51.092782] qcom-camss 1b00020.camss: vfe_get/811 vfe 0 power_count 0
[   51.092872] qcom-camss 1b00020.camss: vfe_get/816 vfe 0 power_count 0
[   51.092945] qcom-camss 1b00020.camss: vfe_get/822 vfe 0 power_count 0
[   51.092980] qcom-camss 1b00020.camss: vfe_get/827 vfe 0 power_count 0
[   51.092987] qcom-camss 1b00020.camss: vfe_get/830 vfe 0 power_count 0
[   51.092994] qcom-camss 1b00020.camss: vfe_get/841 vfe 0 power_count 1
[   51.117104] qcom-camss 1b00020.camss: vfe_get/841 vfe 0 power_count 2
[   52.181802] qcom-camss 1b00020.camss: vfe_put/868 vfe 0 power_count 2
[   52.181828] qcom-camss 1b00020.camss: vfe_put/891 vfe 0 power_count 2
[   52.181834] qcom-camss 1b00020.camss: vfe_put/893 vfe 0 power_count 1
[   52.189017] qcom-camss 1b00020.camss: vfe_get/841 vfe 0 power_count 2
[   64.920259] qcom-camss 1b00020.camss: vfe_get/841 vfe 0 power_count 3
[   64.920337] qcom-camss 1b00020.camss: vfe_get/841 vfe 0 power_count 4
[   64.920368] qcom-camss 1b00020.camss: vfe_get/801 vfe 1 power_count 0
[   64.920656] qcom-camss 1b00020.camss: vfe_get/806 vfe 1 power_count 0
[   64.920667] qcom-camss 1b00020.camss: vfe_get/811 vfe 1 power_count 0
[   64.920706] qcom-camss 1b00020.camss: vfe_get/816 vfe 1 power_count 0
[   64.920734] qcom-camss 1b00020.camss: vfe_get/822 vfe 1 power_count 0
[   64.920868] qcom-camss 1b00020.camss: vfe_get/827 vfe 1 power_count 0
[   64.920877] qcom-camss 1b00020.camss: vfe_get/830 vfe 1 power_count 0
[   64.920886] qcom-camss 1b00020.camss: vfe_get/841 vfe 1 power_count 1
[   64.920963] qcom-camss 1b00020.camss: vfe_get/841 vfe 1 power_count 2
[   64.921008] qcom-camss 1b00020.camss: vfe_get/841 vfe 1 power_count 3
[   64.921871] qcom-camss 1b00020.camss: vfe_put/868 vfe 0 power_count 4
[   64.921896] qcom-camss 1b00020.camss: vfe_put/891 vfe 0 power_count 4
[   64.921904] qcom-camss 1b00020.camss: vfe_put/893 vfe 0 power_count 3
[   64.927278] qcom-camss 1b00020.camss: vfe_get/841 vfe 0 power_count 4
[   65.096857] qcom-camss 1b00020.camss: vfe_put/868 vfe 1 power_count 3
[   65.096883] qcom-camss 1b00020.camss: vfe_put/891 vfe 1 power_count 3
[   65.096889] qcom-camss 1b00020.camss: vfe_put/893 vfe 1 power_count 2
[   65.096903] qcom-camss 1b00020.camss: vfe_put/868 vfe 1 power_count 2
[   65.096908] qcom-camss 1b00020.camss: vfe_put/891 vfe 1 power_count 2
[   65.096914] qcom-camss 1b00020.camss: vfe_put/893 vfe 1 power_count 1
[   65.096927] qcom-camss 1b00020.camss: vfe_put/868 vfe 1 power_count 1
[   65.096933] qcom-camss 1b00020.camss: vfe_put/874 vfe 1 power_count 1
[   65.096938] qcom-camss 1b00020.camss: vfe_put/882 vfe 1 power_count 1
[   65.096958] qcom-camss 1b00020.camss: vfe_put/884 vfe 1 power_count 1
[   65.096964] qcom-camss 1b00020.camss: vfe_put/886 vfe 1 power_count 1
diff mbox series

Patch

diff --git a/drivers/media/platform/qcom/camss/camss-vfe.c b/drivers/media/platform/qcom/camss/camss-vfe.c
index 80a62ba11295042802cbaec617fb87c492ea6a55..1bf1473331f63b9ab106d21ea263c84d851c8a31 100644
--- a/drivers/media/platform/qcom/camss/camss-vfe.c
+++ b/drivers/media/platform/qcom/camss/camss-vfe.c
@@ -595,7 +595,7 @@  void vfe_isr_reset_ack(struct vfe_device *vfe)
  */
 void vfe_pm_domain_off(struct vfe_device *vfe)
 {
-	if (!vfe->genpd)
+	if (!vfe->genpd_link)
 		return;
 
 	device_link_del(vfe->genpd_link);