Message ID | 20240321231149.519549-1-thiago.bauermann@linaro.org |
---|---|
Series | Fix attaching to process when it has zombie threads |
Hi!

On Fri, 22 Mar 2024 at 00:12, Thiago Jung Bauermann <thiago.bauermann@linaro.org> wrote:
>
> Hello,
>
> This patch series fixes a GDB hang when attaching to a multi-threaded
> inferior which happens often (but not always) on aarch64-linux and
> powerpc64le-linux, as described in PR 31312. See patch 3 for a detailed
> description of the problem.
>
> Patches 1 and 2 are preparatory patches because I want to use existing code
> to parse the /proc/PID/stat file to get the thread's starttime value, so
> that GDB and gdbserver aren't fooled by PID reuse.
>
> This patch series was tested on native and extended-remote aarch64-linux
> and armv8l-linux-gnueabihf and no regressions were found, except for the
> following:
>
> When running gdb.threads/detach-step-over.exp on armv8l-linux-gnueabihf
> extended-remote, sometimes GDBserver dies with:
>
> builtin_spawn /home/thiago.bauermann/.cache/builds/gdb-native-aarch32/gdb/testsuite/outputs/gdb.threads/detach-step-over/detach-step-over
> Remote debugging from host 127.0.0.1, port 56624
> Process /home/thiago.bauermann/.cache/builds/gdb-native-aarch32/gdb/testsuite/outputs/gdb.threads/detach-step-over/detach-step-over created; pid = 840876
> Attached; pid = 840821
> Detaching from process 840821
> Attached; pid = 840821
> /home/thiago.bauermann/src/binutils-gdb/gdbserver/linux-low.cc:1956: A problem internal to GDBserver has been detected.
> unsuspend LWP 840821, suspended=-1
>
> The assertion triggered is this one:
>
> /* Decrement LWP's suspend count.  */
>
> static void
> lwp_suspended_decr (struct lwp_info *lwp)
> {
>   lwp->suspended--;
>
>   if (lwp->suspended < 0)
>     {
>       struct thread_info *thread = get_lwp_thread (lwp);
>
>       internal_error ("unsuspend LWP %ld, suspended=%d\n", lwpid_of (thread),
>                       lwp->suspended);
>     }
> }
>
> Unfortunately for the moment I don't have time to further debug this
> problem and I didn't want to keep sitting on these patches until I can come
> back to this issue.
>
> Note that of all the testcases in the GDB testsuite, only
> detach-step-over.exp triggers the GDBserver internal error so it's a
> localized problem.
>
> This is why I'm posting the patch series as an RFC. Considering that it
> fixes a problem that is causing instability in the testsuite results for
> aarch64-linux and powerpc64le-linux, does it make sense to commit it as is,
> and then investigate the GDBserver internal error on armv8l-linux-gnueabihf
> later?

I quickly looked at the series, patches 1 and 2 LGTM, I would say patch 3
too, but it seems to be causing the random failures you mention :-(

However, I think your rationale is OK, trading many failures for a single,
localized one.
But of course, I'm not a maintainer :-)

Thanks,

Christophe

>
> Thiago Jung Bauermann (3):
>   gdb/nat: Use procfs(5) indexes in linux_common_core_of_thread
>   gdb/nat: Factor linux_find_proc_stat_field out of
>     linux_common_core_of_thread
>   gdb/nat/linux: Fix attaching to process when it has zombie threads
>
>  gdb/nat/linux-osdata.c | 65 +++++++++++++++++++++++++++++++++---------
>  gdb/nat/linux-osdata.h |  7 +++++
>  gdb/nat/linux-procfs.c | 19 ++++++++++++
>  3 files changed, 77 insertions(+), 14 deletions(-)
>
>
> base-commit: b42aa684f6ff2bce9b8bc58aa89574723f17f1ce
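
For anyone following along who hasn't looked at /proc/PID/stat parsing before, here is a rough sketch of the idea the cover letter mentions: pulling the starttime value (field 22 in proc(5)) out of a stat line. This is not the code from the series. The names proc_stat_field and proc_stat_starttime are made up for illustration, and the sketch works on a string already read from the file rather than opening /proc itself.

```cpp
#include <exception>
#include <optional>
#include <sstream>
#include <string>

// Hypothetical helper (NOT GDB's linux_find_proc_stat_field): return the
// 1-based FIELD, numbered as in proc(5), from the contents of a
// /proc/PID/stat line.  Only fields >= 3 are supported here.
static std::optional<std::string>
proc_stat_field (const std::string &stat, int field)
{
  // Field 2 (comm) is wrapped in parentheses and may itself contain
  // spaces or ')' characters, so find where it ends via the last ')'.
  std::string::size_type close = stat.rfind (')');
  if (close == std::string::npos || field < 3)
    return std::nullopt;

  std::istringstream rest (stat.substr (close + 1));
  std::string token;

  // The first whitespace-separated token after ')' is field 3 (state).
  for (int i = 3; i <= field; i++)
    if (!(rest >> token))
      return std::nullopt;

  return token;
}

// Hypothetical wrapper: starttime is field 22 in proc(5).
static std::optional<unsigned long long>
proc_stat_starttime (const std::string &stat)
{
  std::optional<std::string> text = proc_stat_field (stat, 22);
  if (!text)
    return std::nullopt;

  try
    {
      return std::stoull (*text);
    }
  catch (const std::exception &)
    {
      // Not a number: treat the line as malformed.
      return std::nullopt;
    }
}
```

One way such a value could be used, assuming the caller recorded starttime when it first saw a thread: read the stat line again later and compare. If the two values differ, the PID now most likely belongs to a different task.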