diff mbox series

soc: qcom: pd-mapper: defer probing on sdm845

Message ID 20250205-qcom_pdm_defer-v1-1-a2e9a39ea9b9@oltmanns.dev
State New
Headers show
Series soc: qcom: pd-mapper: defer probing on sdm845 | expand

Commit Message

Frank Oltmanns Feb. 5, 2025, 9:57 p.m. UTC
On xiaomi-beryllium and oneplus-enchilada audio does not work reliably
with the in-kernel pd-mapper. Deferring the probe solves these issues.
Specifically, audio only works reliably with the in-kernel pd-mapper, if
the probe succeeds when remoteproc3 triggers the first successful probe.
I.e., probes from remoteproc0, 1, and 2 need to be deferred until
remoteproc3 has been probed.

Introduce a device specific quirk that lists the first auxdev for which
the probe must be executed. Until then, defer probes from other auxdevs.

Fixes: 1ebcde047c54 ("soc: qcom: add pd-mapper implementation")
Cc: stable@vger.kernel.org
Signed-off-by: Frank Oltmanns <frank@oltmanns.dev>
---
The in-kernel pd-mapper has been causing audio issues on sdm845
devices (specifically, xiaomi-beryllium and oneplus-enchilada). I
observed that Stephan’s approach [1] - which defers module probing by
blocklisting the module and triggering a later probe - works reliably.

Inspired by this, I experimented with delaying the probe within the
module itself by returning -EPROBE_DEFER in qcom_pdm_probe() until a
certain time (13.9 seconds after boot, based on ktime_get()) had
elapsed. This method also restored audio functionality.

Further logging of auxdev->id in qcom_pdm_probe() led to an interesting
discovery: audio only works reliably with the in-kernel pd-mapper when
the first successful probe is triggered by remoteproc3. In other words,
probes from remoteproc0, 1, and 2 must be deferred until remoteproc3 has
been probed.

To address this, I propose introducing a quirk table (which currently
only contains sdm845) to defer probing until the correct auxiliary
device (remoteproc3) initiates the probe.

I look forward to your feedback.

Thanks,
  Frank

[1]: https://lore.kernel.org/linux-arm-msm/Zwj3jDhc9fRoCCn6@linaro.org/
---
 drivers/soc/qcom/qcom_pd_mapper.c | 43 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)


---
base-commit: 7f048b202333b967782a98aa21bb3354dc379bbf
change-id: 20250205-qcom_pdm_defer-3dc1271d74d9

Best regards,

Comments

Bjorn Andersson Feb. 6, 2025, 2:54 a.m. UTC | #1
On Wed, Feb 05, 2025 at 10:57:11PM +0100, Frank Oltmanns wrote:
> On xiaomi-beryllium and oneplus-enchilada audio does not work reliably
> with the in-kernel pd-mapper. Deferring the probe solves these issues.
> Specifically, audio only works reliably with the in-kernel pd-mapper, if
> the probe succeeds when remoteproc3 triggers the first successful probe.
> I.e., probes from remoteproc0, 1, and 2 need to be deferred until
> remoteproc3 has been probed.
> 
> Introduce a device specific quirk that lists the first auxdev for which
> the probe must be executed. Until then, defer probes from other auxdevs.
> 
> Fixes: 1ebcde047c54 ("soc: qcom: add pd-mapper implementation")
> Cc: stable@vger.kernel.org
> Signed-off-by: Frank Oltmanns <frank@oltmanns.dev>
> ---
> The in-kernel pd-mapper has been causing audio issues on sdm845
> devices (specifically, xiaomi-beryllium and oneplus-enchilada). I
> observed that Stephan’s approach [1] - which defers module probing by
> blocklisting the module and triggering a later probe - works reliably.
> 
> Inspired by this, I experimented with delaying the probe within the
> module itself by returning -EPROBE_DEFER in qcom_pdm_probe() until a
> certain time (13.9 seconds after boot, based on ktime_get()) had
> elapsed. This method also restored audio functionality.
> 
> Further logging of auxdev->id in qcom_pdm_probe() led to an interesting
> discovery: audio only works reliably with the in-kernel pd-mapper when
> the first successful probe is triggered by remoteproc3. In other words,
> probes from remoteproc0, 1, and 2 must be deferred until remoteproc3 has
> been probed.
> 

The remoteproc numbering is assigned at the time of registering each
remoteproc driver, and does not necessarily relate to the order in which
they are launched. That said, it sounds like what you're saying is that
is that audio only works if we launch the pd-mapper after the
remoteprocs has started?

Can you please confirm which remoteproc is which in your numbering? (In
particular, which remoteproc instance is the audio DSP?)

> To address this, I propose introducing a quirk table (which currently
> only contains sdm845) to defer probing until the correct auxiliary
> device (remoteproc3) initiates the probe.
> 
> I look forward to your feedback.
> 

I don't think the proposed workaround is our path forward, but I very
much appreciate your initiative and the insights it provides! Seems to
me that we have a race-condition in the pdr helper.

Regards,
Bjorn

> Thanks,
>   Frank
> 
> [1]: https://lore.kernel.org/linux-arm-msm/Zwj3jDhc9fRoCCn6@linaro.org/
> ---
>  drivers/soc/qcom/qcom_pd_mapper.c | 43 +++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/drivers/soc/qcom/qcom_pd_mapper.c b/drivers/soc/qcom/qcom_pd_mapper.c
> index 154ca5beb47160cc404a46a27840818fe3187420..34b26df665a888ac4872f56e948e73b561ae3b6b 100644
> --- a/drivers/soc/qcom/qcom_pd_mapper.c
> +++ b/drivers/soc/qcom/qcom_pd_mapper.c
> @@ -46,6 +46,11 @@ struct qcom_pdm_data {
>  	struct list_head services;
>  };
>  
> +struct qcom_pdm_probe_first_dev_quirk {
> +	const char *name;
> +	u32 id;
> +};
> +
>  static DEFINE_MUTEX(qcom_pdm_mutex); /* protects __qcom_pdm_data */
>  static struct qcom_pdm_data *__qcom_pdm_data;
>  
> @@ -526,6 +531,11 @@ static const struct qcom_pdm_domain_data *x1e80100_domains[] = {
>  	NULL,
>  };
>  
> +static const struct qcom_pdm_probe_first_dev_quirk first_dev_remoteproc3 = {
> +	.id = 3,
> +	.name = "pd-mapper"
> +};
> +
>  static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
>  	{ .compatible = "qcom,apq8016", .data = NULL, },
>  	{ .compatible = "qcom,apq8064", .data = NULL, },
> @@ -566,6 +576,10 @@ static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
>  	{},
>  };
>  
> +static const struct of_device_id qcom_pdm_defer[] __maybe_unused = {
> +	{ .compatible = "qcom,sdm845", .data = &first_dev_remoteproc3, },
> +	{},
> +};
>  static void qcom_pdm_stop(struct qcom_pdm_data *data)
>  {
>  	qcom_pdm_free_domains(data);
> @@ -637,6 +651,25 @@ static struct qcom_pdm_data *qcom_pdm_start(void)
>  	return ERR_PTR(ret);
>  }
>  
> +static bool qcom_pdm_ready(struct auxiliary_device *auxdev)
> +{
> +	const struct of_device_id *match;
> +	struct device_node *root;
> +	struct qcom_pdm_probe_first_dev_quirk *first_dev;
> +
> +	root = of_find_node_by_path("/");
> +	if (!root)
> +		return true;
> +
> +	match = of_match_node(qcom_pdm_defer, root);
> +	of_node_put(root);
> +	if (!match)
> +		return true;
> +
> +	first_dev = (struct qcom_pdm_probe_first_dev_quirk *) match->data;
> +	return (auxdev->id == first_dev->id) && !strcmp(auxdev->name, first_dev->name);
> +}
> +
>  static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>  			  const struct auxiliary_device_id *id)
>  
> @@ -647,6 +680,15 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>  	mutex_lock(&qcom_pdm_mutex);
>  
>  	if (!__qcom_pdm_data) {
> +		if (!qcom_pdm_ready(auxdev)) {
> +			pr_debug("%s: Deferring probe for device %s (id: %u)\n",
> +				__func__, auxdev->name, auxdev->id);
> +			ret = -EPROBE_DEFER;
> +			goto probe_stop;
> +		}
> +		pr_debug("%s: Probing for device %s (id: %u), starting pdm\n",
> +			__func__, auxdev->name, auxdev->id);
> +
>  		data = qcom_pdm_start();
>  
>  		if (IS_ERR(data))
> @@ -659,6 +701,7 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>  
>  	auxiliary_set_drvdata(auxdev, __qcom_pdm_data);
>  
> +probe_stop:
>  	mutex_unlock(&qcom_pdm_mutex);
>  
>  	return ret;
> 
> ---
> base-commit: 7f048b202333b967782a98aa21bb3354dc379bbf
> change-id: 20250205-qcom_pdm_defer-3dc1271d74d9
> 
> Best regards,
> -- 
> Frank Oltmanns <frank@oltmanns.dev>
>
Frank Oltmanns Feb. 6, 2025, 5:57 a.m. UTC | #2
On 2025-02-05 at 20:54:53 -0600, Bjorn Andersson <andersson@kernel.org> wrote:
> On Wed, Feb 05, 2025 at 10:57:11PM +0100, Frank Oltmanns wrote:
>> On xiaomi-beryllium and oneplus-enchilada audio does not work reliably
>> with the in-kernel pd-mapper. Deferring the probe solves these issues.
>> Specifically, audio only works reliably with the in-kernel pd-mapper, if
>> the probe succeeds when remoteproc3 triggers the first successful probe.
>> I.e., probes from remoteproc0, 1, and 2 need to be deferred until
>> remoteproc3 has been probed.
>>
>> Introduce a device specific quirk that lists the first auxdev for which
>> the probe must be executed. Until then, defer probes from other auxdevs.
>>
>> Fixes: 1ebcde047c54 ("soc: qcom: add pd-mapper implementation")
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Frank Oltmanns <frank@oltmanns.dev>
>> ---
>> The in-kernel pd-mapper has been causing audio issues on sdm845
>> devices (specifically, xiaomi-beryllium and oneplus-enchilada). I
>> observed that Stephan’s approach [1] - which defers module probing by
>> blocklisting the module and triggering a later probe - works reliably.
>>
>> Inspired by this, I experimented with delaying the probe within the
>> module itself by returning -EPROBE_DEFER in qcom_pdm_probe() until a
>> certain time (13.9 seconds after boot, based on ktime_get()) had
>> elapsed. This method also restored audio functionality.
>>
>> Further logging of auxdev->id in qcom_pdm_probe() led to an interesting
>> discovery: audio only works reliably with the in-kernel pd-mapper when
>> the first successful probe is triggered by remoteproc3. In other words,
>> probes from remoteproc0, 1, and 2 must be deferred until remoteproc3 has
>> been probed.
>>
>
> The remoteproc numbering is assigned at the time of registering each
> remoteproc driver, and does not necessarily relate to the order in which
> they are launched. That said, it sounds like what you're saying is that
> is that audio only works if we launch the pd-mapper after the
> remoteprocs has started?

Almost, but not quite. You are right, that remoteproc3 in my setup is
always the last one that probes the pd-mapper.

However, when experimenting with different timings I saw that the
pd-mapper really do has to respond to the probe from remoteproc3 (I'm
not sure I'm using the right terminology here, but I hope my intent
comes across). If the pd-mapper responds to remoteproc3's probe with
-EPROBE_DEFER there will again be subsequent probes from the other
remoteprocs. If we act on those probes, there is a chance that audio
(mic in my case) does not work. So, my conclusion was that remoteproc3's
probe has to be answered first before responding to the other probes.

Further note that in my experiments remoteproc1 was always the first to
do the probing, followed by either remoteproc0 or remoteproc2.
remoteproc3 was always the last.

> Can you please confirm which remoteproc is which in your numbering? (In
> particular, which remoteproc instance is the audio DSP?)

remoteproc0: adsp
remoteproc1: cdsp
remoteproc2: slpi
remoteproc3: 4080000.remoteproc

(I took them from the kernel messages "remoteproc remoteproc<X>: <xyz>
is available".)

>> To address this, I propose introducing a quirk table (which currently
>> only contains sdm845) to defer probing until the correct auxiliary
>> device (remoteproc3) initiates the probe.
>>
>> I look forward to your feedback.
>>
>
> I don't think the proposed workaround is our path forward, but I very
> much appreciate your initiative and the insights it provides!

Thank you! I was hoping that somebody with more experience in the QCOM
universe can draw further conclusions from this.

> Seems to
> me that we have a race-condition in the pdr helper.

If you need further experimenting or can give me rough guidance on where
to look next, I'll be glad to help.

Thanks again,
  Frank

>
> Regards,
> Bjorn
>
>> Thanks,
>>   Frank
>>
>> [1]: https://lore.kernel.org/linux-arm-msm/Zwj3jDhc9fRoCCn6@linaro.org/
>> ---
>>  drivers/soc/qcom/qcom_pd_mapper.c | 43 +++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 43 insertions(+)
>>
>> diff --git a/drivers/soc/qcom/qcom_pd_mapper.c b/drivers/soc/qcom/qcom_pd_mapper.c
>> index 154ca5beb47160cc404a46a27840818fe3187420..34b26df665a888ac4872f56e948e73b561ae3b6b 100644
>> --- a/drivers/soc/qcom/qcom_pd_mapper.c
>> +++ b/drivers/soc/qcom/qcom_pd_mapper.c
>> @@ -46,6 +46,11 @@ struct qcom_pdm_data {
>>  	struct list_head services;
>>  };
>>
>> +struct qcom_pdm_probe_first_dev_quirk {
>> +	const char *name;
>> +	u32 id;
>> +};
>> +
>>  static DEFINE_MUTEX(qcom_pdm_mutex); /* protects __qcom_pdm_data */
>>  static struct qcom_pdm_data *__qcom_pdm_data;
>>
>> @@ -526,6 +531,11 @@ static const struct qcom_pdm_domain_data *x1e80100_domains[] = {
>>  	NULL,
>>  };
>>
>> +static const struct qcom_pdm_probe_first_dev_quirk first_dev_remoteproc3 = {
>> +	.id = 3,
>> +	.name = "pd-mapper"
>> +};
>> +
>>  static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
>>  	{ .compatible = "qcom,apq8016", .data = NULL, },
>>  	{ .compatible = "qcom,apq8064", .data = NULL, },
>> @@ -566,6 +576,10 @@ static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
>>  	{},
>>  };
>>
>> +static const struct of_device_id qcom_pdm_defer[] __maybe_unused = {
>> +	{ .compatible = "qcom,sdm845", .data = &first_dev_remoteproc3, },
>> +	{},
>> +};
>>  static void qcom_pdm_stop(struct qcom_pdm_data *data)
>>  {
>>  	qcom_pdm_free_domains(data);
>> @@ -637,6 +651,25 @@ static struct qcom_pdm_data *qcom_pdm_start(void)
>>  	return ERR_PTR(ret);
>>  }
>>
>> +static bool qcom_pdm_ready(struct auxiliary_device *auxdev)
>> +{
>> +	const struct of_device_id *match;
>> +	struct device_node *root;
>> +	struct qcom_pdm_probe_first_dev_quirk *first_dev;
>> +
>> +	root = of_find_node_by_path("/");
>> +	if (!root)
>> +		return true;
>> +
>> +	match = of_match_node(qcom_pdm_defer, root);
>> +	of_node_put(root);
>> +	if (!match)
>> +		return true;
>> +
>> +	first_dev = (struct qcom_pdm_probe_first_dev_quirk *) match->data;
>> +	return (auxdev->id == first_dev->id) && !strcmp(auxdev->name, first_dev->name);
>> +}
>> +
>>  static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>>  			  const struct auxiliary_device_id *id)
>>
>> @@ -647,6 +680,15 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>>  	mutex_lock(&qcom_pdm_mutex);
>>
>>  	if (!__qcom_pdm_data) {
>> +		if (!qcom_pdm_ready(auxdev)) {
>> +			pr_debug("%s: Deferring probe for device %s (id: %u)\n",
>> +				__func__, auxdev->name, auxdev->id);
>> +			ret = -EPROBE_DEFER;
>> +			goto probe_stop;
>> +		}
>> +		pr_debug("%s: Probing for device %s (id: %u), starting pdm\n",
>> +			__func__, auxdev->name, auxdev->id);
>> +
>>  		data = qcom_pdm_start();
>>
>>  		if (IS_ERR(data))
>> @@ -659,6 +701,7 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>>
>>  	auxiliary_set_drvdata(auxdev, __qcom_pdm_data);
>>
>> +probe_stop:
>>  	mutex_unlock(&qcom_pdm_mutex);
>>
>>  	return ret;
>>
>> ---
>> base-commit: 7f048b202333b967782a98aa21bb3354dc379bbf
>> change-id: 20250205-qcom_pdm_defer-3dc1271d74d9
>>
>> Best regards,
>> --
>> Frank Oltmanns <frank@oltmanns.dev>
>>
Frank Oltmanns Feb. 6, 2025, 6:44 a.m. UTC | #3
Hi again,

On 2025-02-06 at 06:57:46 +0100, Frank Oltmanns <frank@oltmanns.dev> wrote:
> On 2025-02-05 at 20:54:53 -0600, Bjorn Andersson <andersson@kernel.org> wrote:
>> On Wed, Feb 05, 2025 at 10:57:11PM +0100, Frank Oltmanns wrote:
>>> On xiaomi-beryllium and oneplus-enchilada audio does not work reliably
>>> with the in-kernel pd-mapper. Deferring the probe solves these issues.
>>> Specifically, audio only works reliably with the in-kernel pd-mapper, if
>>> the probe succeeds when remoteproc3 triggers the first successful probe.
>>> I.e., probes from remoteproc0, 1, and 2 need to be deferred until
>>> remoteproc3 has been probed.
>>>
>>> Introduce a device specific quirk that lists the first auxdev for which
>>> the probe must be executed. Until then, defer probes from other auxdevs.
>>>
>>> Fixes: 1ebcde047c54 ("soc: qcom: add pd-mapper implementation")
>>> Cc: stable@vger.kernel.org
>>> Signed-off-by: Frank Oltmanns <frank@oltmanns.dev>
>>> ---
>>> The in-kernel pd-mapper has been causing audio issues on sdm845
>>> devices (specifically, xiaomi-beryllium and oneplus-enchilada). I
>>> observed that Stephan’s approach [1] - which defers module probing by
>>> blocklisting the module and triggering a later probe - works reliably.
>>>
>>> Inspired by this, I experimented with delaying the probe within the
>>> module itself by returning -EPROBE_DEFER in qcom_pdm_probe() until a
>>> certain time (13.9 seconds after boot, based on ktime_get()) had
>>> elapsed. This method also restored audio functionality.
>>>
>>> Further logging of auxdev->id in qcom_pdm_probe() led to an interesting
>>> discovery: audio only works reliably with the in-kernel pd-mapper when
>>> the first successful probe is triggered by remoteproc3. In other words,
>>> probes from remoteproc0, 1, and 2 must be deferred until remoteproc3 has
>>> been probed.
>>>
>>
>> The remoteproc numbering is assigned at the time of registering each
>> remoteproc driver, and does not necessarily relate to the order in which
>> they are launched. That said, it sounds like what you're saying is that
>> is that audio only works if we launch the pd-mapper after the
>> remoteprocs has started?
>
> Almost, but not quite. You are right, that remoteproc3 in my setup is
> always the last one that probes the pd-mapper.
>
> However, when experimenting with different timings I saw that the
> pd-mapper really do has to respond to the probe from remoteproc3 (I'm
> not sure I'm using the right terminology here, but I hope my intent
> comes across). If the pd-mapper responds to remoteproc3's probe with
> -EPROBE_DEFER there will again be subsequent probes from the other
> remoteprocs. If we act on those probes, there is a chance that audio
> (mic in my case) does not work. So, my conclusion was that remoteproc3's
> probe has to be answered first before responding to the other probes.
>
> Further note that in my experiments remoteproc1 was always the first to
> do the probing, followed by either remoteproc0 or remoteproc2.
> remoteproc3 was always the last.
>
>> Can you please confirm which remoteproc is which in your numbering? (In
>> particular, which remoteproc instance is the audio DSP?)
>
> remoteproc0: adsp
> remoteproc1: cdsp
> remoteproc2: slpi
> remoteproc3: 4080000.remoteproc

I'm sorry but there's one additional thing that I really should have
mentioned earlier: My issue is specifically with *call* audio.

Call audio is only available using out-of-tree patches. The ones I'm
currently using are here:
https://gitlab.com/sdm845-mainline/linux/-/commits/sdm845-6.13-rc2-r2?ref_type=tags

Best regards,
  Frank

>
> (I took them from the kernel messages "remoteproc remoteproc<X>: <xyz>
> is available".)
>
>>> To address this, I propose introducing a quirk table (which currently
>>> only contains sdm845) to defer probing until the correct auxiliary
>>> device (remoteproc3) initiates the probe.
>>>
>>> I look forward to your feedback.
>>>
>>
>> I don't think the proposed workaround is our path forward, but I very
>> much appreciate your initiative and the insights it provides!
>
> Thank you! I was hoping that somebody with more experience in the QCOM
> universe can draw further conclusions from this.
>
>> Seems to
>> me that we have a race-condition in the pdr helper.
>
> If you need further experimenting or can give me rough guidance on where
> to look next, I'll be glad to help.
>
> Thanks again,
>   Frank
>
>>
>> Regards,
>> Bjorn
>>
>>> Thanks,
>>>   Frank
>>>
>>> [1]: https://lore.kernel.org/linux-arm-msm/Zwj3jDhc9fRoCCn6@linaro.org/
>>> ---
>>>  drivers/soc/qcom/qcom_pd_mapper.c | 43 +++++++++++++++++++++++++++++++++++++++
>>>  1 file changed, 43 insertions(+)
>>>
>>> diff --git a/drivers/soc/qcom/qcom_pd_mapper.c b/drivers/soc/qcom/qcom_pd_mapper.c
>>> index 154ca5beb47160cc404a46a27840818fe3187420..34b26df665a888ac4872f56e948e73b561ae3b6b 100644
>>> --- a/drivers/soc/qcom/qcom_pd_mapper.c
>>> +++ b/drivers/soc/qcom/qcom_pd_mapper.c
>>> @@ -46,6 +46,11 @@ struct qcom_pdm_data {
>>>  	struct list_head services;
>>>  };
>>>
>>> +struct qcom_pdm_probe_first_dev_quirk {
>>> +	const char *name;
>>> +	u32 id;
>>> +};
>>> +
>>>  static DEFINE_MUTEX(qcom_pdm_mutex); /* protects __qcom_pdm_data */
>>>  static struct qcom_pdm_data *__qcom_pdm_data;
>>>
>>> @@ -526,6 +531,11 @@ static const struct qcom_pdm_domain_data *x1e80100_domains[] = {
>>>  	NULL,
>>>  };
>>>
>>> +static const struct qcom_pdm_probe_first_dev_quirk first_dev_remoteproc3 = {
>>> +	.id = 3,
>>> +	.name = "pd-mapper"
>>> +};
>>> +
>>>  static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
>>>  	{ .compatible = "qcom,apq8016", .data = NULL, },
>>>  	{ .compatible = "qcom,apq8064", .data = NULL, },
>>> @@ -566,6 +576,10 @@ static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
>>>  	{},
>>>  };
>>>
>>> +static const struct of_device_id qcom_pdm_defer[] __maybe_unused = {
>>> +	{ .compatible = "qcom,sdm845", .data = &first_dev_remoteproc3, },
>>> +	{},
>>> +};
>>>  static void qcom_pdm_stop(struct qcom_pdm_data *data)
>>>  {
>>>  	qcom_pdm_free_domains(data);
>>> @@ -637,6 +651,25 @@ static struct qcom_pdm_data *qcom_pdm_start(void)
>>>  	return ERR_PTR(ret);
>>>  }
>>>
>>> +static bool qcom_pdm_ready(struct auxiliary_device *auxdev)
>>> +{
>>> +	const struct of_device_id *match;
>>> +	struct device_node *root;
>>> +	struct qcom_pdm_probe_first_dev_quirk *first_dev;
>>> +
>>> +	root = of_find_node_by_path("/");
>>> +	if (!root)
>>> +		return true;
>>> +
>>> +	match = of_match_node(qcom_pdm_defer, root);
>>> +	of_node_put(root);
>>> +	if (!match)
>>> +		return true;
>>> +
>>> +	first_dev = (struct qcom_pdm_probe_first_dev_quirk *) match->data;
>>> +	return (auxdev->id == first_dev->id) && !strcmp(auxdev->name, first_dev->name);
>>> +}
>>> +
>>>  static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>>>  			  const struct auxiliary_device_id *id)
>>>
>>> @@ -647,6 +680,15 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>>>  	mutex_lock(&qcom_pdm_mutex);
>>>
>>>  	if (!__qcom_pdm_data) {
>>> +		if (!qcom_pdm_ready(auxdev)) {
>>> +			pr_debug("%s: Deferring probe for device %s (id: %u)\n",
>>> +				__func__, auxdev->name, auxdev->id);
>>> +			ret = -EPROBE_DEFER;
>>> +			goto probe_stop;
>>> +		}
>>> +		pr_debug("%s: Probing for device %s (id: %u), starting pdm\n",
>>> +			__func__, auxdev->name, auxdev->id);
>>> +
>>>  		data = qcom_pdm_start();
>>>
>>>  		if (IS_ERR(data))
>>> @@ -659,6 +701,7 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>>>
>>>  	auxiliary_set_drvdata(auxdev, __qcom_pdm_data);
>>>
>>> +probe_stop:
>>>  	mutex_unlock(&qcom_pdm_mutex);
>>>
>>>  	return ret;
>>>
>>> ---
>>> base-commit: 7f048b202333b967782a98aa21bb3354dc379bbf
>>> change-id: 20250205-qcom_pdm_defer-3dc1271d74d9
>>>
>>> Best regards,
>>> --
>>> Frank Oltmanns <frank@oltmanns.dev>
>>>
Frank Oltmanns Feb. 9, 2025, 12:07 p.m. UTC | #4
On 2025-02-06 at 12:25:41 -0600, Steev Klimaszewski <steev@kali.org> wrote:
Hi Steev,

> Hi Frank,
>
> On Thu, Feb 6, 2025 at 12:45 AM Frank Oltmanns <frank@oltmanns.dev> wrote:
>>
>> Hi again,
>>
>> On 2025-02-06 at 06:57:46 +0100, Frank Oltmanns <frank@oltmanns.dev> wrote:
>> > On 2025-02-05 at 20:54:53 -0600, Bjorn Andersson <andersson@kernel.org> wrote:
>> >> On Wed, Feb 05, 2025 at 10:57:11PM +0100, Frank Oltmanns wrote:
>> >>> On xiaomi-beryllium and oneplus-enchilada audio does not work reliably
>> >>> with the in-kernel pd-mapper. Deferring the probe solves these issues.
>> >>> Specifically, audio only works reliably with the in-kernel pd-mapper, if
>> >>> the probe succeeds when remoteproc3 triggers the first successful probe.
>> >>> I.e., probes from remoteproc0, 1, and 2 need to be deferred until
>> >>> remoteproc3 has been probed.
>> >>>
>> >>> Introduce a device specific quirk that lists the first auxdev for which
>> >>> the probe must be executed. Until then, defer probes from other auxdevs.
>> >>>
>> >>> Fixes: 1ebcde047c54 ("soc: qcom: add pd-mapper implementation")
>> >>> Cc: stable@vger.kernel.org
>> >>> Signed-off-by: Frank Oltmanns <frank@oltmanns.dev>
>> >>> ---
>> >>> The in-kernel pd-mapper has been causing audio issues on sdm845
>> >>> devices (specifically, xiaomi-beryllium and oneplus-enchilada). I
>> >>> observed that Stephan’s approach [1] - which defers module probing by
>> >>> blocklisting the module and triggering a later probe - works reliably.
>> >>>
>> >>> Inspired by this, I experimented with delaying the probe within the
>> >>> module itself by returning -EPROBE_DEFER in qcom_pdm_probe() until a
>> >>> certain time (13.9 seconds after boot, based on ktime_get()) had
>> >>> elapsed. This method also restored audio functionality.
>> >>>
>> >>> Further logging of auxdev->id in qcom_pdm_probe() led to an interesting
>> >>> discovery: audio only works reliably with the in-kernel pd-mapper when
>> >>> the first successful probe is triggered by remoteproc3. In other words,
>> >>> probes from remoteproc0, 1, and 2 must be deferred until remoteproc3 has
>> >>> been probed.
>> >>>
>> >>
>> >> The remoteproc numbering is assigned at the time of registering each
>> >> remoteproc driver, and does not necessarily relate to the order in which
>> >> they are launched. That said, it sounds like what you're saying is that
>> >> is that audio only works if we launch the pd-mapper after the
>> >> remoteprocs has started?
>> >
>> > Almost, but not quite. You are right, that remoteproc3 in my setup is
>> > always the last one that probes the pd-mapper.
>> >
>> > However, when experimenting with different timings I saw that the
>> > pd-mapper really do has to respond to the probe from remoteproc3 (I'm
>> > not sure I'm using the right terminology here, but I hope my intent
>> > comes across). If the pd-mapper responds to remoteproc3's probe with
>> > -EPROBE_DEFER there will again be subsequent probes from the other
>> > remoteprocs. If we act on those probes, there is a chance that audio
>> > (mic in my case) does not work. So, my conclusion was that remoteproc3's
>> > probe has to be answered first before responding to the other probes.
>> >
>> > Further note that in my experiments remoteproc1 was always the first to
>> > do the probing, followed by either remoteproc0 or remoteproc2.
>> > remoteproc3 was always the last.
>> >
>> >> Can you please confirm which remoteproc is which in your numbering? (In
>> >> particular, which remoteproc instance is the audio DSP?)
>> >
>> > remoteproc0: adsp
>> > remoteproc1: cdsp
>> > remoteproc2: slpi
>> > remoteproc3: 4080000.remoteproc
>>
>> I'm sorry but there's one additional thing that I really should have
>> mentioned earlier: My issue is specifically with *call* audio.
>>
>> Call audio is only available using out-of-tree patches. The ones I'm
>> currently using are here:
>> https://gitlab.com/sdm845-mainline/linux/-/commits/sdm845-6.13-rc2-r2?ref_type=tags
>>
>> Best regards,
>>   Frank
>>
>> >
>> > (I took them from the kernel messages "remoteproc remoteproc<X>: <xyz>
>> > is available".)
>> >
>> >>> To address this, I propose introducing a quirk table (which currently
>> >>> only contains sdm845) to defer probing until the correct auxiliary
>> >>> device (remoteproc3) initiates the probe.
>> >>>
>> >>> I look forward to your feedback.
>> >>>
>> >>
>> >> I don't think the proposed workaround is our path forward, but I very
>> >> much appreciate your initiative and the insights it provides!
>> >
>> > Thank you! I was hoping that somebody with more experience in the QCOM
>> > universe can draw further conclusions from this.
>> >
>> >> Seems to
>> >> me that we have a race-condition in the pdr helper.
>> >
>> > If you need further experimenting or can give me rough guidance on where
>> > to look next, I'll be glad to help.
>> >
>> > Thanks again,
>> >   Frank
>> >
>> >>
>> >> Regards,
>> >> Bjorn
>> >>
>> >>> Thanks,
>> >>>   Frank
>> >>>
>> >>> [1]: https://lore.kernel.org/linux-arm-msm/Zwj3jDhc9fRoCCn6@linaro.org/
>> >>> ---
>> >>>  drivers/soc/qcom/qcom_pd_mapper.c | 43 +++++++++++++++++++++++++++++++++++++++
>> >>>  1 file changed, 43 insertions(+)
>> >>>
>> >>> diff --git a/drivers/soc/qcom/qcom_pd_mapper.c b/drivers/soc/qcom/qcom_pd_mapper.c
>> >>> index 154ca5beb47160cc404a46a27840818fe3187420..34b26df665a888ac4872f56e948e73b561ae3b6b 100644
>> >>> --- a/drivers/soc/qcom/qcom_pd_mapper.c
>> >>> +++ b/drivers/soc/qcom/qcom_pd_mapper.c
>> >>> @@ -46,6 +46,11 @@ struct qcom_pdm_data {
>> >>>     struct list_head services;
>> >>>  };
>> >>>
>> >>> +struct qcom_pdm_probe_first_dev_quirk {
>> >>> +   const char *name;
>> >>> +   u32 id;
>> >>> +};
>> >>> +
>> >>>  static DEFINE_MUTEX(qcom_pdm_mutex); /* protects __qcom_pdm_data */
>> >>>  static struct qcom_pdm_data *__qcom_pdm_data;
>> >>>
>> >>> @@ -526,6 +531,11 @@ static const struct qcom_pdm_domain_data *x1e80100_domains[] = {
>> >>>     NULL,
>> >>>  };
>> >>>
>> >>> +static const struct qcom_pdm_probe_first_dev_quirk first_dev_remoteproc3 = {
>> >>> +   .id = 3,
>> >>> +   .name = "pd-mapper"
>> >>> +};
>> >>> +
>> >>>  static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
>> >>>     { .compatible = "qcom,apq8016", .data = NULL, },
>> >>>     { .compatible = "qcom,apq8064", .data = NULL, },
>> >>> @@ -566,6 +576,10 @@ static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
>> >>>     {},
>> >>>  };
>> >>>
>> >>> +static const struct of_device_id qcom_pdm_defer[] __maybe_unused = {
>> >>> +   { .compatible = "qcom,sdm845", .data = &first_dev_remoteproc3, },
>> >>> +   {},
>> >>> +};
>> >>>  static void qcom_pdm_stop(struct qcom_pdm_data *data)
>> >>>  {
>> >>>     qcom_pdm_free_domains(data);
>> >>> @@ -637,6 +651,25 @@ static struct qcom_pdm_data *qcom_pdm_start(void)
>> >>>     return ERR_PTR(ret);
>> >>>  }
>> >>>
>> >>> +static bool qcom_pdm_ready(struct auxiliary_device *auxdev)
>> >>> +{
>> >>> +   const struct of_device_id *match;
>> >>> +   struct device_node *root;
>> >>> +   struct qcom_pdm_probe_first_dev_quirk *first_dev;
>> >>> +
>> >>> +   root = of_find_node_by_path("/");
>> >>> +   if (!root)
>> >>> +           return true;
>> >>> +
>> >>> +   match = of_match_node(qcom_pdm_defer, root);
>> >>> +   of_node_put(root);
>> >>> +   if (!match)
>> >>> +           return true;
>> >>> +
>> >>> +   first_dev = (struct qcom_pdm_probe_first_dev_quirk *) match->data;
>> >>> +   return (auxdev->id == first_dev->id) && !strcmp(auxdev->name, first_dev->name);
>> >>> +}
>> >>> +
>> >>>  static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>> >>>                       const struct auxiliary_device_id *id)
>> >>>
>> >>> @@ -647,6 +680,15 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>> >>>     mutex_lock(&qcom_pdm_mutex);
>> >>>
>> >>>     if (!__qcom_pdm_data) {
>> >>> +           if (!qcom_pdm_ready(auxdev)) {
>> >>> +                   pr_debug("%s: Deferring probe for device %s (id: %u)\n",
>> >>> +                           __func__, auxdev->name, auxdev->id);
>> >>> +                   ret = -EPROBE_DEFER;
>> >>> +                   goto probe_stop;
>> >>> +           }
>> >>> +           pr_debug("%s: Probing for device %s (id: %u), starting pdm\n",
>> >>> +                   __func__, auxdev->name, auxdev->id);
>> >>> +
>> >>>             data = qcom_pdm_start();
>> >>>
>> >>>             if (IS_ERR(data))
>> >>> @@ -659,6 +701,7 @@ static int qcom_pdm_probe(struct auxiliary_device *auxdev,
>> >>>
>> >>>     auxiliary_set_drvdata(auxdev, __qcom_pdm_data);
>> >>>
>> >>> +probe_stop:
>> >>>     mutex_unlock(&qcom_pdm_mutex);
>> >>>
>> >>>     return ret;
>> >>>
>> >>> ---
>> >>> base-commit: 7f048b202333b967782a98aa21bb3354dc379bbf
>> >>> change-id: 20250205-qcom_pdm_defer-3dc1271d74d9
>> >>>
>> >>> Best regards,
>> >>> --
>> >>> Frank Oltmanns <frank@oltmanns.dev>
>> >>>
>>
>
> I know that Bjorn already said this change is a no, but I wanted to
> mention that I have a Lenovo Yoga C630 (WOS) device here that is also
> an sdm845, and with this patch applied, it has the opposite effect and
> breaks audio on it.

Interesting! Just so that I get a better understanding: Is remoteproc3
loaded at all? Can you please send me the output of:

$ dmesg | grep "remoteproc.: .* is available"

and

$ dmesg | grep "remoteproc.: .* is now up"

(no need to apply the patch for that)

Thanks,
  Frank

>
> -- steev
diff mbox series

Patch

diff --git a/drivers/soc/qcom/qcom_pd_mapper.c b/drivers/soc/qcom/qcom_pd_mapper.c
index 154ca5beb47160cc404a46a27840818fe3187420..34b26df665a888ac4872f56e948e73b561ae3b6b 100644
--- a/drivers/soc/qcom/qcom_pd_mapper.c
+++ b/drivers/soc/qcom/qcom_pd_mapper.c
@@ -46,6 +46,11 @@  struct qcom_pdm_data {
 	struct list_head services;
 };
 
+struct qcom_pdm_probe_first_dev_quirk {
+	const char *name;
+	u32 id;
+};
+
 static DEFINE_MUTEX(qcom_pdm_mutex); /* protects __qcom_pdm_data */
 static struct qcom_pdm_data *__qcom_pdm_data;
 
@@ -526,6 +531,11 @@  static const struct qcom_pdm_domain_data *x1e80100_domains[] = {
 	NULL,
 };
 
+static const struct qcom_pdm_probe_first_dev_quirk first_dev_remoteproc3 = {
+	.id = 3,
+	.name = "pd-mapper"
+};
+
 static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
 	{ .compatible = "qcom,apq8016", .data = NULL, },
 	{ .compatible = "qcom,apq8064", .data = NULL, },
@@ -566,6 +576,10 @@  static const struct of_device_id qcom_pdm_domains[] __maybe_unused = {
 	{},
 };
 
+static const struct of_device_id qcom_pdm_defer[] __maybe_unused = {
+	{ .compatible = "qcom,sdm845", .data = &first_dev_remoteproc3, },
+	{},
+};
 static void qcom_pdm_stop(struct qcom_pdm_data *data)
 {
 	qcom_pdm_free_domains(data);
@@ -637,6 +651,25 @@  static struct qcom_pdm_data *qcom_pdm_start(void)
 	return ERR_PTR(ret);
 }
 
+static bool qcom_pdm_ready(struct auxiliary_device *auxdev)
+{
+	const struct of_device_id *match;
+	struct device_node *root;
+	struct qcom_pdm_probe_first_dev_quirk *first_dev;
+
+	root = of_find_node_by_path("/");
+	if (!root)
+		return true;
+
+	match = of_match_node(qcom_pdm_defer, root);
+	of_node_put(root);
+	if (!match)
+		return true;
+
+	first_dev = (struct qcom_pdm_probe_first_dev_quirk *) match->data;
+	return (auxdev->id == first_dev->id) && !strcmp(auxdev->name, first_dev->name);
+}
+
 static int qcom_pdm_probe(struct auxiliary_device *auxdev,
 			  const struct auxiliary_device_id *id)
 
@@ -647,6 +680,15 @@  static int qcom_pdm_probe(struct auxiliary_device *auxdev,
 	mutex_lock(&qcom_pdm_mutex);
 
 	if (!__qcom_pdm_data) {
+		if (!qcom_pdm_ready(auxdev)) {
+			pr_debug("%s: Deferring probe for device %s (id: %u)\n",
+				__func__, auxdev->name, auxdev->id);
+			ret = -EPROBE_DEFER;
+			goto probe_stop;
+		}
+		pr_debug("%s: Probing for device %s (id: %u), starting pdm\n",
+			__func__, auxdev->name, auxdev->id);
+
 		data = qcom_pdm_start();
 
 		if (IS_ERR(data))
@@ -659,6 +701,7 @@  static int qcom_pdm_probe(struct auxiliary_device *auxdev,
 
 	auxiliary_set_drvdata(auxdev, __qcom_pdm_data);
 
+probe_stop:
 	mutex_unlock(&qcom_pdm_mutex);
 
 	return ret;