@@ -430,6 +430,9 @@ int acpm_do_xfer(const struct acpm_handle *handle, const struct acpm_xfer *xfer)
return -EOPNOTSUPP;
}
+ msg.chan_id = xfer->acpm_chan_id;
+ msg.chan_type = EXYNOS_MBOX_CHAN_TYPE_DOORBELL;
+
scoped_guard(mutex, &achan->tx_lock) {
tx_front = readl(achan->tx.front);
idx = (tx_front + 1) % achan->qlen;
@@ -446,25 +449,15 @@ int acpm_do_xfer(const struct acpm_handle *handle, const struct acpm_xfer *xfer)
/* Advance TX front. */
writel(idx, achan->tx.front);
- }
- msg.chan_id = xfer->acpm_chan_id;
- msg.chan_type = EXYNOS_MBOX_CHAN_TYPE_DOORBELL;
- ret = mbox_send_message(achan->chan, (void *)&msg);
- if (ret < 0)
- return ret;
-
- ret = acpm_wait_for_message_response(achan, xfer);
+ ret = mbox_send_message(achan->chan, (void *)&msg);
+ if (ret < 0)
+ return ret;
- /*
- * NOTE: we might prefer not to need the mailbox ticker to manage the
- * transfer queueing since the protocol layer queues things by itself.
- * Unfortunately, we have to kick the mailbox framework after we have
- * received our message.
- */
- mbox_client_txdone(achan->chan, ret);
+ mbox_client_txdone(achan->chan, 0);
+ }
- return ret;
+ return acpm_wait_for_message_response(achan, xfer);
}
/**
The mailbox framework allows a single inflight request at a time. If a
request is sent while another is still active, it is queued in the
mailbox core ring buffer.

The ACPM protocol did not serialize its calls to the mailbox subsystem,
so the timeout ticks could start in parallel for multiple requests while
only one was actually inflight. Consider a hypothetical case where the
xfer timeout is 100ms and an ACPM transaction takes 90ms:

  |   0ms: Message #0 is queued in mailbox layer and sent out, then sits
  |        at acpm_dequeue_by_polling() with a timeout of 100ms
  |   1ms: Message #1 is queued in mailbox layer but not sent out yet.
  |        Since send_message() doesn't block, it also sits at
  |        acpm_dequeue_by_polling() with a timeout of 100ms
  |   ...
  |  90ms: Message #0 is completed, txdone is called and message #1 is
  |        sent
  | 101ms: Message #1 times out since the count started at 1ms, even
  |        though it has only been inflight for 11ms.

Fix the problem by moving mbox_send_message() and mbox_client_txdone()
immediately after the message has been written to the TX queue, while
still holding the ACPM TX queue lock. We thus tie together the TX write
with the doorbell ring and mark the TX as done after the doorbell has
been rung. This guarantees that the doorbell has been rung before the
timeout ticks start.

We should also see some performance improvement, as we no longer wait to
receive a response before ringing the doorbell for the next request, so
the ACPM firmware should be able to drain the TX queue faster. Another
benefit is that a request can no longer ring the doorbell on behalf of
another, which eases debugging.

Finally, the mailbox software queue will always contain a single
doorbell request due to the serialization done at the ACPM TX queue
level. Protocols like ACPM that handle their own hardware queues need a
passthrough mailbox API, where they can just ring the doorbell or flip a
bit directly in the mailbox controller.
The mailbox software queue mechanism and the locking done in the mailbox
core are not really needed, so hopefully this lays the foundation for a
passthrough mailbox API.

Reported-by: André Draszik <andre.draszik@linaro.org>
Fixes: a88927b534ba ("firmware: add Exynos ACPM protocol driver")
Signed-off-by: Tudor Ambarus <tudor.ambarus@linaro.org>
---
Changes in v2:
- update commit message and fix the time shown in the example.
- pass zero for the second argument of mbox_client_txdone().
  mbox_send_message() returns a non-negative token on success, and
  mbox_client_txdone() expects the status of the last transmission.
  Doesn't change behavior for ACPM, but fix it for correctness.
- add the arm_scmi list to Cc, they had a similar fix at:
  Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=da1642bc97c4e
- checkpatch complains that Reported-by: should be followed by Closes:,
  but the problem was reported offline, so I don't have a Closes: tag.
- Link to v1: https://lore.kernel.org/r/20250605-acpm-timeout-v1-1-1dbfdbee30da@linaro.org
---
 drivers/firmware/samsung/exynos-acpm.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

---
base-commit: a0bea9e39035edc56a994630e6048c8a191a99d8
change-id: 20250605-acpm-timeout-434578500a34

Best regards,