Message ID | 20240303150807.68117-1-jonas.gorski@gmail.com |
---|---|
State | New |
Series | [v2] serial: core: only stop transmit when HW fifo is empty |
Hello,

On 3/3/2024 7:08 AM, Jonas Gorski wrote:
> If the circular buffer is empty, it just means we fit all characters to
> send into the HW fifo, but not that the hardware finished transmitting
> them.
>
> So if we immediately call stop_tx() after that, this may abort any
> pending characters in the HW fifo, and cause dropped characters on the
> console.
>
> Fix this by only stopping tx when the tx HW fifo is actually empty.
>
> Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers")
> Cc: stable@vger.kernel.org
> Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
> ---
> (this is v2 of the bcm63xx-uart fix attempt)
>
> v1 -> v2
> * replace workaround with fix for core issue
> * add Cc: for stable
>
> I'm somewhat confident this is the core issue causing the broken output
> with bcm63xx-uart, and there is no actual need for the UART_TX_NOSTOP.
>
> I wouldn't be surprised if this also fixes mxs-uart for which
> UART_TX_NOSTOP was introduced.
>
> If it does, there is no need for the flag anymore.
>
>  include/linux/serial_core.h | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
> index 55b1f3ba48ac..bb0f2d4ac62f 100644
> --- a/include/linux/serial_core.h
> +++ b/include/linux/serial_core.h
> @@ -786,7 +786,8 @@ enum UART_TX_FLAGS {
>      if (pending < WAKEUP_CHARS) {                                  \
>          uart_write_wakeup(__port);                                 \
>                                                                     \
> -        if (!((flags) & UART_TX_NOSTOP) && pending == 0)           \
> +        if (!((flags) & UART_TX_NOSTOP) && pending == 0 &&         \
> +            __port->ops->tx_empty(__port))                         \
>              __port->ops->stop_tx(__port);                          \
>      }                                                              \
>                                                                     \

I just upgraded to kernel 6.9 and discovered through a git bisect that
this patch (7bfb915a597a301abb892f620fe5c283a9fdbd77) causes a problem
with the legacy pxa.c serial driver (CONFIG_SERIAL_PXA_NON8250). I'm
using it with a PXA168-based ARM device for a serial console as well as
getty.

With this patch applied, transmissions get hung up before they finish.
The data isn't lost, because the next time a transmit occurs, the
delayed data finally goes out -- but something seems to be causing it to
get stuck right at the end of many, but not all, transmissions. For
example, if I type "ps" and hit enter, nothing shows up until I hit
enter again, which finally kickstarts the whole TX process and then I
get all of the queued ps output.

I'm really confused about this symptom because it seems at face value
like this patch would only ever improve the situation by preventing
stop_tx() from being called too early. There's something about the pxa
driver that is happier when stop_tx() is called with an empty buffer
even if the UART is reporting that it's not empty yet.

I tested some other random systems in qemu and couldn't reproduce this
issue, so the problem may very well be limited just to this
driver/hardware... I realize this driver is old and deprecated (I'm
likely one of the few users left of it) so I'm hesitant to call it a
regression. Maybe it's really a bug in this driver that the new patch
exposes?

I even thought, "heck, I should probably be using the newer 8250_pxa
driver instead", but that one is even worse -- it drops TX characters
like crazy, regardless of whether this patch is applied. I want to look
into that problem eventually.

I'm hoping there is some kind of simple fix that can be made to the pxa
driver to work around it with this new behavior. Can anyone think of a
reason that this driver would not like this change? It seems
counterintuitive to me -- the patch makes perfect sense.

Thanks,
Doug
Hi again,

On 5/16/2024 9:22 PM, Doug Brown wrote:
> I'm hoping there is some kind of simple fix that can be made to the pxa
> driver to work around it with this new behavior. Can anyone think of a
> reason that this driver would not like this change? It seems
> counterintuitive to me -- the patch makes perfect sense.

After further experimentation, I've come to the conclusion that this is
a bug in the pxa uart driver, and this patch simply exposed the bug.
I'll submit a patch to fix the issue in the pxa driver.

If anyone's interested in the details: basically, the pxa driver in its
current state doesn't work correctly if it receives a TX interrupt when
the circular buffer is empty. It handles it, but then gets stuck waiting
for the next TX IRQ that will never happen because no characters were
transmitted. The way stop_tx() was previously being called before the
transmitter was empty, it prevented that situation from happening
because toggling the TX interrupt enable flag off (with stop_tx) and
back on (with the next start_tx) causes a new TX interrupt to fire and
kickstarts the transmit process again.

The 8250 driver, for example, isn't affected by this problem because it
effectively does stop_tx() on its own if it detects an empty circular
buffer in the TX interrupt handler. Adding similar logic to the pxa
driver fixes it.

Doug
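The mechanism Doug describes can be illustrated with a small userspace toy model. This is only a sketch under stated assumptions: every name here (`toy_port`, `toy_start_tx`, `toy_stop_tx`, `toy_tx_irq`, `toy_write`) is hypothetical and none of it is taken from pxa.c or the serial core. The model assumes that enabling the TX interrupt raises one interrupt, that calling start_tx while it is already enabled does nothing, and that the hardware only raises a further TX interrupt after it has actually been given bytes. Real hardware has timing that this model ignores, so it shows the stuck state but not the exact symptom seen on the PXA168.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a TX interrupt path; not pxa.c code. */
struct toy_port {
    size_t queued;        /* bytes waiting in the software circular buffer */
    size_t sent;          /* bytes handed to the (instantaneous) toy hardware */
    bool tx_irq_enabled;  /* models the TX interrupt enable bit */
    bool irq_pending;     /* a TX interrupt is waiting to be serviced */
};

/* Assumption: turning the enable bit on raises one TX interrupt;
 * if the bit is already set, nothing happens. */
static inline void toy_start_tx(struct toy_port *p)
{
    if (!p->tx_irq_enabled) {
        p->tx_irq_enabled = true;
        p->irq_pending = true;
    }
}

static inline void toy_stop_tx(struct toy_port *p)
{
    p->tx_irq_enabled = false;
}

/* TX interrupt handler. With stop_when_empty set, it disables the TX
 * interrupt itself when the buffer is empty, similar in spirit to what
 * the message above says the 8250 driver does. */
static inline void toy_tx_irq(struct toy_port *p, bool stop_when_empty)
{
    p->irq_pending = false;
    if (p->queued == 0) {
        /* Nothing was transmitted, so no further interrupt will arrive. */
        if (stop_when_empty)
            toy_stop_tx(p);
        return;
    }
    p->sent += p->queued;
    p->queued = 0;
    p->irq_pending = true;  /* hardware drains and interrupts again */
}

/* Queue n bytes, kick the transmitter, service interrupts until idle.
 * Returns the total number of bytes sent so far. */
static inline size_t toy_write(struct toy_port *p, size_t n, bool stop_when_empty)
{
    p->queued += n;
    toy_start_tx(p);
    while (p->irq_pending)
        toy_tx_irq(p, stop_when_empty);
    return p->sent;
}
```

In this model, two consecutive `toy_write()` calls with `stop_when_empty` set to false leave the second batch queued, because the enable bit was never cleared and `toy_start_tx()` therefore raises no new interrupt. With `stop_when_empty` set to true, the handler clears the bit on the empty-buffer interrupt and the next write goes out normally.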
diff --git a/include/linux/serial_core.h b/include/linux/serial_core.h
index 55b1f3ba48ac..bb0f2d4ac62f 100644
--- a/include/linux/serial_core.h
+++ b/include/linux/serial_core.h
@@ -786,7 +786,8 @@ enum UART_TX_FLAGS {
     if (pending < WAKEUP_CHARS) {                                  \
         uart_write_wakeup(__port);                                 \
                                                                    \
-        if (!((flags) & UART_TX_NOSTOP) && pending == 0)           \
+        if (!((flags) & UART_TX_NOSTOP) && pending == 0 &&         \
+            __port->ops->tx_empty(__port))                         \
             __port->ops->stop_tx(__port);                          \
     }                                                              \
                                                                    \
If the circular buffer is empty, it just means we fit all characters to
send into the HW fifo, but not that the hardware finished transmitting
them.

So if we immediately call stop_tx() after that, this may abort any
pending characters in the HW fifo, and cause dropped characters on the
console.

Fix this by only stopping tx when the tx HW fifo is actually empty.

Fixes: 8275b48b2780 ("tty: serial: introduce transmit helpers")
Cc: stable@vger.kernel.org
Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com>
---
(this is v2 of the bcm63xx-uart fix attempt)

v1 -> v2
* replace workaround with fix for core issue
* add Cc: for stable

I'm somewhat confident this is the core issue causing the broken output
with bcm63xx-uart, and there is no actual need for the UART_TX_NOSTOP.

I wouldn't be surprised if this also fixes mxs-uart for which
UART_TX_NOSTOP was introduced.

If it does, there is no need for the flag anymore.

 include/linux/serial_core.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
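The distinction the commit message draws, between an empty circular buffer and an empty hardware FIFO, can be shown with a minimal standalone sketch. The names below (`toy_uart`, `TOY_TX_NOSTOP`, `toy_tx_empty`, `toy_should_stop_old`, `toy_should_stop_new`) are hypothetical stand-ins chosen for illustration; they only mirror the shape of the condition in the diff and are not the kernel's macro or its `uart_ops` callbacks.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for UART_TX_NOSTOP; the value is arbitrary. */
#define TOY_TX_NOSTOP 0x1u

/* Hypothetical toy state, not struct uart_port. */
struct toy_uart {
    size_t pending;     /* bytes still in the software circular buffer */
    size_t fifo_level;  /* bytes still sitting in the hardware TX FIFO */
    unsigned int flags;
};

/* Stand-in for a driver's tx_empty() callback: true once the hardware
 * has nothing left to shift out. */
static inline bool toy_tx_empty(const struct toy_uart *u)
{
    return u->fifo_level == 0;
}

/* Condition before the patch: stop as soon as the circular buffer is
 * drained, even if the FIFO still holds unsent bytes. */
static inline bool toy_should_stop_old(const struct toy_uart *u)
{
    return !(u->flags & TOY_TX_NOSTOP) && u->pending == 0;
}

/* Condition after the patch: additionally require the hardware to
 * report that transmission has finished. */
static inline bool toy_should_stop_new(const struct toy_uart *u)
{
    return !(u->flags & TOY_TX_NOSTOP) && u->pending == 0 &&
           toy_tx_empty(u);
}
```

For a state with `pending == 0` and `fifo_level == 3`, the old condition says "stop" while the new one does not; once `fifo_level` reaches 0, both agree. Whether calling stop_tx() early actually drops characters depends on the hardware, which is the case the commit message describes for bcm63xx-uart.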