Message ID | 20240507022249.554831-1-thiago.bauermann@linaro.org |
---|---|
Headers | show |
Series | Add support for AArch64 MOPS instructions | expand |
Hi Thiago, Thanks for the series! I just wanted to clarify the approach being used for this feature. From what I noticed, we are implementing motion through these sequences of MOPS instructions as atomic, right? Similar to the atomic sequences. Is there a reason for that? From what I gathered, the Linux Kernel implements the ability to single-step through each of the MOPS instructions. The catch is that if the M and E instructions get interrupted, the sequence will get restarted, and the debugger will need to be aware of that. Was the atomic block approach for MOPS instructions used as a simplification? I suppose displaced-stepping these sequences would be a bit tricky, but doable if they were relocated as a block. I'm wondering what the performance impact would be of requiring serialization of execution in a non-stop debugging scenario with multiple threads. One last point, doesn't record/replay require single-stepping each instruction individually? Does that go against the atomic block approach? I'll go stare at the code. Regards, Luis On 5/7/24 03:22, Thiago Jung Bauermann wrote: > Hello, > > I'm sending v2 because Christophe made a suggestion for the > gdb.arch/aarch64-mops-atomic-inst.exp testcase, so patch 4 incoroporates > it. > > The other patches are unchanged from v1. > > Here is the original cover letter for convenience: > > This patch series implements GDB support for the new instructions in > AArch64's MOPS feature. Patch 1 has a small overview. > > What is needed from GDB is recognizing the MOPS sequences of instructions > as atomic so that they can be stepped over during instruction single > stepping, and also to avoid doing displaced stepping with them. This is > done in patch 1. > > Patch 2 adds support for the new instructions to the record an replay > target. > > The other patches add testcases to test each of the aspects above, plus > one testcase to verify the interaction of the MOPS instructions with > watchpoints. > > Tested on Ubuntu 23.10 aarch64-linux-gnu with no regressions, using the > Arm FVP emulator as well as QEMU v8.2. > > Thiago Jung Bauermann (5): > gdb/aarch64: Implement software single stepping for MOPS instructions > gdb/aarch64: Add record support for MOPS instructions. > gdb/testsuite: Add gdb.arch/aarch64-mops-watchpoint.exp > gdb/testsuite: Add gdb.arch/aarch64-mops-atomic-inst.exp > gdb/testsuite: Add gdb.reverse/aarch64-mops.exp > > gdb/aarch64-tdep.c | 191 +++++++++++++++++- > .../gdb.arch/aarch64-mops-atomic-inst.c | 69 +++++++ > .../gdb.arch/aarch64-mops-atomic-inst.exp | 94 +++++++++ > .../gdb.arch/aarch64-mops-watchpoint.c | 66 ++++++ > .../gdb.arch/aarch64-mops-watchpoint.exp | 79 ++++++++ > gdb/testsuite/gdb.reverse/aarch64-mops.c | 71 +++++++ > gdb/testsuite/gdb.reverse/aarch64-mops.exp | 171 ++++++++++++++++ > gdb/testsuite/lib/gdb.exp | 61 ++++++ > 8 files changed, 800 insertions(+), 2 deletions(-) > create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-atomic-inst.c > create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-atomic-inst.exp > create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-watchpoint.c > create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-watchpoint.exp > create mode 100644 gdb/testsuite/gdb.reverse/aarch64-mops.c > create mode 100644 gdb/testsuite/gdb.reverse/aarch64-mops.exp > > > base-commit: 84a069db6714ddcf444095ed09dbcd7404834694
Hello Luis, Thank you for your comments! Luis Machado <luis.machado@arm.com> writes: > Hi Thiago, > > Thanks for the series! > > I just wanted to clarify the approach being used for this feature. From what I noticed, > we are implementing motion through these sequences of MOPS instructions as atomic, > right? Similar to the atomic sequences. Yes, patch 1 makes GDB consider MOPS sequences as atomic. > Is there a reason for that? From what I gathered, the Linux Kernel implements the > ability to single-step through each of the MOPS instructions. The catch is that > if the M and E instructions get interrupted, the sequence will get restarted, and > the debugger will need to be aware of that. > > Was the atomic block approach for MOPS instructions used as a simplification? No, it was just my mistaken interpretation of the Arm ARM. It says that "the prologue, main, and epilogue instructions are expected to be run in succession and to appear consecutively in memory", and I interpreted that to mean that they're atomic. I didn't know that the kernel could restart the sequence if the execution gets interrupted. I tried to test how GDB reacts to the sequence getting restarted by forcing the inferior to be scheduled to another CPU (by changing its CPU affinity set) while the MOPS instructions were single-stepped, but I wasn't able to trigger that behaviour. Though thinking about it, I don't believe it would affect GDB in any way. When GDB resumes the inferior, it will figure things out from scratch when the inferior stops again so it doesn't make a difference whether it stops in the next instruction or in the prologue instruction. So IIUC I don't think we need to do anything specific about single-stepping MOPS instructions then. Maybe one thing that GDB would need to handle is when the user requests a breakpoint in the main or epilogue instruction of the sequence. In that case I think GDB would have to put the trap instruction either in the prologue instruction or the one after the epilogue instruction. WDYT? > I suppose displaced-stepping these sequences would be a bit tricky, but doable if > they were relocated as a block. I'm wondering what the performance impact would be > of requiring serialization of execution in a non-stop debugging scenario with > multiple threads. I didn't think of the possibility of relocating the sequence as a block, so patch 1 also disables displaced stepping for MOPS instructions. Thank you for the suggestion. I will implement your idea, though next week we have the Linaro Connect conference so I think I'll only have something about two weeks from now. > One last point, doesn't record/replay require single-stepping each instruction > individually? Does that go against the atomic block approach? That's a good question. I'm not sure how the record and replay backend deals with that, to be honest. But it does work, and I added a testcase specifically to test reversing the MOPS sequences, and it passes both on QEMU and FVP. > I'll go stare at the code. Thank you! You can skip all of the first patch except for the last hunk, which disables displaced stepping for the MOPS instructions. In the next few days I'll post a v3 where patch 1 does only that, and make the corresponding adjustments the testcase in patch 4 which tests single stepping. After Linaro Connect I'll work on the displaced stepping support and (if necessary) adjusting breakpoints in the middle of MOPS sequences.
Thiago Jung Bauermann <thiago.bauermann@linaro.org> writes: > Maybe one thing that GDB would need to handle is when the user requests > a breakpoint in the main or epilogue instruction of the sequence. In > that case I think GDB would have to put the trap instruction either in > the prologue instruction or the one after the epilogue instruction. WDYT? I did some tests with the setp/setm/sete, cpyp/cpym/cpye and cpyfp/cpyfm/cpyfe sequences and they all work fine if a software breakpoint is set on their main or epilogue instructions. Tested on Arm FVP and QEMU 8.2. So at least for the emulators, adjusting the breakpoint address isn't necessary.
On 5/10/24 06:20, Thiago Jung Bauermann wrote: > Thiago Jung Bauermann <thiago.bauermann@linaro.org> writes: > >> Maybe one thing that GDB would need to handle is when the user requests >> a breakpoint in the main or epilogue instruction of the sequence. In >> that case I think GDB would have to put the trap instruction either in >> the prologue instruction or the one after the epilogue instruction. WDYT? > > I did some tests with the setp/setm/sete, cpyp/cpym/cpye and > cpyfp/cpyfm/cpyfe sequences and they all work fine if a software > breakpoint is set on their main or epilogue instructions. Tested on > Arm FVP and QEMU 8.2. > > So at least for the emulators, adjusting the breakpoint address isn't > necessary. > Thanks for investingating that. Looks like it will be simpler then. Regarding displaced stepping, we can disable displaced stepping temporarily for these instructions. If we find a convenient way to relocate the whole block of MOPS instructions, then we can add that later on.
On 5/9/24 00:24, Thiago Jung Bauermann wrote: >> One last point, doesn't record/replay require single-stepping each instruction >> individually? Does that go against the atomic block approach? > That's a good question. I'm not sure how the record and replay backend > deals with that, to be honest. But it does work, and I added a testcase > specifically to test reversing the MOPS sequences, and it passes both on > QEMU and FVP. On x86 we require it. That's because we record the current value of all registers of interest, including PC, which means if you tried to disassemble many instructions and record them all, you'd be recording only the PC of the first instruction many times. I haven't looked at the Arm tdep stuff for record but I can't imagine it would be too different. If I'm right, the test works because you don't try to reverse-stepi to the middle of the asm statement, you just look at the full state before and after. If GDB will treat the block as always atomic, then I guess we don't need to save the PCs of every instruction, so this approach is fine, but if people are expected to be able to stepi in between the blocks, reverse should be able to do the same. If things are atomic, should we try to warn users who try to stepi/reverse-stepi through those blocks? Sidenote, I haven't had time to look at the code yet, but personally, I'd like to see the test introduced at the same commit where support was added, because that would make it easier to bisect in the future, if necessary :)
Hello Guinevere, Guinevere Larsen <blarsen@redhat.com> writes: > On 5/9/24 00:24, Thiago Jung Bauermann wrote: > >>> One last point, doesn't record/replay require single-stepping each instruction >>> individually? Does that go against the atomic block approach? >> >> That's a good question. I'm not sure how the record and replay backend >> deals with that, to be honest. But it does work, and I added a testcase >> specifically to test reversing the MOPS sequences, and it passes both on >> QEMU and FVP. > > On x86 we require it. That's because we record the current value of all registers of interest, > including PC, which means if you tried to disassemble many instructions and record them all, you'd > be recording only the PC of the first instruction many times. I haven't looked at the Arm tdep > stuff for record but I can't imagine it would be too different. If I'm right, the test works because > you don't try to reverse-stepi to the middle of the asm statement, you just look at the full state > before and after. Thank you for the explanation! > If GDB will treat the block as always atomic, then I guess we don't need to save the PCs of every > instruction, so this approach is fine, but if people are expected to be able to stepi in between the > blocks, reverse should be able to do the same. If things are atomic, should we try to warn users > who try to stepi/reverse-stepi through those blocks? For MOPS instructions specifically in later versions of the patch series we're not treating them as atomic so this issue doesn't apply anymore. But for other atomic instructions (e.g., acquire/release) if the user stepi's (steps-i?) on the first instruction in the atomic sequence, GDB sets a breakpoint at the end of the sequence and continues the inferior until there. It doesn't show any warning or notice, though perhaps that would be friendlier. > Sidenote, I haven't had time to look at the code yet, but personally, I'd like to see the test > introduced at the same commit where support was added, because that would make it easier to > bisect in the future, if necessary :) Makes sense. Done for v4.