Message ID | 20250227072341.28693-1-lihuisong@huawei.com |
---|---|
State | New |
Headers | show |
Series | mailbox: pcc: Fix can't clear level interrupt of type3 in cornor case | expand |
在 2025/3/1 0:32, Sudeep Holla 写道: > On Thu, Feb 27, 2025 at 03:23:41PM +0800, Huisong Li wrote: >> The mbox_chan_received_data() will call Rx callback of mbox client driver >> using type3 to set the flag of command completion. Then driver can continue >> to do something like sending a new command. In this case, the rest of the >> interrupt handler function may be concurrent with pcc_send_data(). >> > Understood and valid issue/bug. > >> The 'chan_in_use' flag of a channel is true after sending a command. And >> the flag of the new command may be cleared by the running interrupt handler >> in cornor case. As a result, the interrupt being level triggered can't be >> cleared in pcc_mbox_irq() and it will be disabled after the number of >> handled times exceeds the specified value. The error log is as follows: >> >> [519082.811553] kunpeng_hccs HISI04B2:00: PCC command executed timeout! > ^^^^ > These timestamps are useless, needs to be dropped. Got it. Thanks. > >> [519082.828532] kunpeng_hccs HISI04B2:00: get port link status info failed, ret = -110. >> [519082.833438] irq 13: nobody cared (try booting with the "irqpoll" option) >> [519082.844622] CPU: 304 PID: 15206 Comm: systemd-journal Kdump: loaded Tainted: G OE 5.10.0 #5 >> [519082.854959] Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M., BIOS Nezha B800 V3.1.0 01/02/2024 > "To be filled by O.E.M." interesting. Either as silicon vendor, some prefer > to leave it this way to ensure the O.E.M fill them correctly or the firmware > engineers are not bothered to get these right as nothing breaks without > these. > > Anyways, good example of what not to have in the products, as it is completely > useless. > > [...] Thanks for pointing it out. Please ignore this. will drop the line. > >> To solve this issue, pcc_mbox_irq() clear 'chann_in_use' flag immediately >> after clearing interrupt ack register. >> > This may be correct way of fixing the issue here, but I am questioning the > existence of this flag now. I have some rework on this file. I will pick This flag is for shared interrupt case on type3. please see: 3db174e478cb ("mailbox: pcc: Support shared interrupt for multiple subspaces") We may need to fix it first before your refactoring this file. After all, it's an issue. A little modification is more easily to backport and merge. If it's ok for you, I'll update this commit log. > this up if I think this is needed as it may conflict with my changes or > we will drop the flag completely. Give a week or so, will post the changes > and we can take it from there. > Looking forward to your rework. /Huisong > > .
diff --git a/drivers/mailbox/pcc.c b/drivers/mailbox/pcc.c index 82102a4c5d68..077ff98366cb 100644 --- a/drivers/mailbox/pcc.c +++ b/drivers/mailbox/pcc.c @@ -333,10 +333,20 @@ static irqreturn_t pcc_mbox_irq(int irq, void *p) if (pcc_chan_reg_read_modify_write(&pchan->plat_irq_ack)) return IRQ_NONE; + /* + * The mbox_chan_received_data() will call Rx callback of mbox + * client driver using type3 to set the flag of command completion. + * Then driver can continue to do something like sending a new + * command. In this case, the rest of the interrupt handler + * function may be concurrent with pcc_send_data(). + * To avoid the 'chan_in_use' flag of new command being cleared by + * interrupt handler, clear this flag immediately after clearing + * interrupt ack register. + */ + pchan->chan_in_use = false; mbox_chan_received_data(chan, NULL); check_and_ack(pchan, chan); - pchan->chan_in_use = false; return IRQ_HANDLED; }
The mbox_chan_received_data() will call Rx callback of mbox client driver using type3 to set the flag of command completion. Then driver can continue to do something like sending a new command. In this case, the rest of the interrupt handler function may be concurrent with pcc_send_data(). The 'chan_in_use' flag of a channel is true after sending a command. And the flag of the new command may be cleared by the running interrupt handler in cornor case. As a result, the interrupt being level triggered can't be cleared in pcc_mbox_irq() and it will be disabled after the number of handled times exceeds the specified value. The error log is as follows: [519082.811553] kunpeng_hccs HISI04B2:00: PCC command executed timeout! [519082.828532] kunpeng_hccs HISI04B2:00: get port link status info failed, ret = -110. [519082.833438] irq 13: nobody cared (try booting with the "irqpoll" option) [519082.844622] CPU: 304 PID: 15206 Comm: systemd-journal Kdump: loaded Tainted: G OE 5.10.0 #5 [519082.854959] Hardware name: To be filled by O.E.M. To be filled by O.E.M./To be filled by O.E.M., BIOS Nezha B800 V3.1.0 01/02/2024 [519082.867467] Call trace: [519082.870709] dump_backtrace+0x0/0x210 [519082.875145] show_stack+0x1c/0x2c [519082.879240] dump_stack+0xec/0x130 [519082.883421] __report_bad_irq+0x50/0x190 [519082.888122] note_interrupt+0x1e4/0x260 [519082.892740] handle_irq_event+0x144/0x17c [519082.897519] handle_fasteoi_irq+0xd0/0x240 [519082.902386] __handle_domain_irq+0x80/0xf0 [519082.907255] gic_handle_irq+0x74/0x2d0 [519082.911774] el1_irq+0xbc/0x140 [519082.915698] mnt_clone_write+0x0/0x70 [519082.920131] file_update_time+0xcc/0x160 [519082.924832] fault_dirty_shared_page+0xe8/0x150 [519082.930133] do_shared_fault+0x80/0x1d0 [519082.934737] do_fault+0x118/0x1a4 [519082.938821] handle_pte_fault+0x154/0x230 [519082.943600] __handle_mm_fault+0x1ac/0x390 [519082.948465] handle_mm_fault+0xf0/0x250 [519082.953075] do_page_fault+0x184/0x454 [519082.957595] do_translation_fault+0xac/0xd4 [519082.962555] do_mem_abort+0x44/0xb4 [519082.966817] el0_da+0x40/0x74 [519082.970554] el0_sync_handler+0x60/0xb4 [519082.975160] el0_sync+0x168/0x180 [519082.979243] handlers: [519082.982300] [<0000000039882697>] pcc_mbox_irq [519082.987433] Disabling IRQ #13 To solve this issue, pcc_mbox_irq() clear 'chann_in_use' flag immediately after clearing interrupt ack register. Signed-off-by: Huisong Li <lihuisong@huawei.com> --- drivers/mailbox/pcc.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)