[v2,0/5] Add support for AArch64 MOPS instructions

Message ID	20240507022249.554831-1-thiago.bauermann@linaro.org
Headers	show Delivered-To: patch@linaro.org Received-SPF: pass (google.com: domain of gdb-patches-bounces+patch=linaro.org@sourceware.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BEAAA3858D1E From: Thiago Jung Bauermann <thiago.bauermann@linaro.org> To: gdb-patches@sourceware.org Cc: Christophe Lyon <christophe.lyon@linaro.org> Subject: [PATCH v2 0/5] Add support for AArch64 MOPS instructions Date: Mon, 6 May 2024 23:22:44 -0300 Message-ID: <20240507022249.554831-1-thiago.bauermann@linaro.org> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: list Errors-To: gdb-patches-bounces+patch=linaro.org@sourceware.org
Series	Add support for AArch64 MOPS instructions \| expand [v2,0/5] Add support for AArch64 MOPS instructions [v2,1/5] gdb/aarch64: Implement software single stepping for MOPS instructions [v2,2/5] gdb/aarch64: Add record support for MOPS instructions. [v2,3/5] gdb/testsuite: Add gdb.arch/aarch64-mops-watchpoint.exp [v2,4/5] gdb/testsuite: Add gdb.arch/aarch64-mops-atomic-inst.exp [v2,5/5] gdb/testsuite: Add gdb.reverse/aarch64-mops.exp

Thiago Jung Bauermann May 7, 2024, 2:22 a.m. UTC

Hello,

I'm sending v2 because Christophe made a suggestion for the
gdb.arch/aarch64-mops-atomic-inst.exp testcase, so patch 4 incoroporates
it.

The other patches are unchanged from v1.

Here is the original cover letter for convenience:

This patch series implements GDB support for the new instructions in
AArch64's MOPS feature.  Patch 1 has a small overview.

What is needed from GDB is recognizing the MOPS sequences of instructions
as atomic so that they can be stepped over during instruction single
stepping, and also to avoid doing displaced stepping with them.  This is
done in patch 1.

Patch 2 adds support for the new instructions to the record an replay
target.

The other patches add testcases to test each of the aspects above, plus
one testcase to verify the interaction of the MOPS instructions with
watchpoints.

Tested on Ubuntu 23.10 aarch64-linux-gnu with no regressions, using the
Arm FVP emulator as well as QEMU v8.2.

Thiago Jung Bauermann (5):
  gdb/aarch64: Implement software single stepping for MOPS instructions
  gdb/aarch64: Add record support for MOPS instructions.
  gdb/testsuite: Add gdb.arch/aarch64-mops-watchpoint.exp
  gdb/testsuite: Add gdb.arch/aarch64-mops-atomic-inst.exp
  gdb/testsuite: Add gdb.reverse/aarch64-mops.exp

 gdb/aarch64-tdep.c                            | 191 +++++++++++++++++-
 .../gdb.arch/aarch64-mops-atomic-inst.c       |  69 +++++++
 .../gdb.arch/aarch64-mops-atomic-inst.exp     |  94 +++++++++
 .../gdb.arch/aarch64-mops-watchpoint.c        |  66 ++++++
 .../gdb.arch/aarch64-mops-watchpoint.exp      |  79 ++++++++
 gdb/testsuite/gdb.reverse/aarch64-mops.c      |  71 +++++++
 gdb/testsuite/gdb.reverse/aarch64-mops.exp    | 171 ++++++++++++++++
 gdb/testsuite/lib/gdb.exp                     |  61 ++++++
 8 files changed, 800 insertions(+), 2 deletions(-)
 create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-atomic-inst.c
 create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-atomic-inst.exp
 create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-watchpoint.c
 create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-watchpoint.exp
 create mode 100644 gdb/testsuite/gdb.reverse/aarch64-mops.c
 create mode 100644 gdb/testsuite/gdb.reverse/aarch64-mops.exp


base-commit: 84a069db6714ddcf444095ed09dbcd7404834694

Luis Machado May 8, 2024, 8:52 a.m. UTC | #1

Hi Thiago,

Thanks for the series!

I just wanted to clarify the approach being used for this feature. From what I noticed,
we are implementing motion through these sequences of MOPS instructions as atomic,
right? Similar to the atomic sequences.

Is there a reason for that? From what I gathered, the Linux Kernel implements the
ability to single-step through each of the MOPS instructions. The catch is that
if the M and E instructions get interrupted, the sequence will get restarted, and
the debugger will need to be aware of that.

Was the atomic block approach for MOPS instructions used as a simplification?

I suppose displaced-stepping these sequences would be a bit tricky, but doable if
they were relocated as a block. I'm wondering what the performance impact would be
of requiring serialization of execution in a non-stop debugging scenario with
multiple threads.

One last point, doesn't record/replay require single-stepping each instruction
individually? Does that go against the atomic block approach?

I'll go stare at the code.

Regards,
Luis

On 5/7/24 03:22, Thiago Jung Bauermann wrote:
> Hello,
> 
> I'm sending v2 because Christophe made a suggestion for the
> gdb.arch/aarch64-mops-atomic-inst.exp testcase, so patch 4 incoroporates
> it.
> 
> The other patches are unchanged from v1.
> 
> Here is the original cover letter for convenience:
> 
> This patch series implements GDB support for the new instructions in
> AArch64's MOPS feature.  Patch 1 has a small overview.
> 
> What is needed from GDB is recognizing the MOPS sequences of instructions
> as atomic so that they can be stepped over during instruction single
> stepping, and also to avoid doing displaced stepping with them.  This is
> done in patch 1.
> 
> Patch 2 adds support for the new instructions to the record an replay
> target.
> 
> The other patches add testcases to test each of the aspects above, plus
> one testcase to verify the interaction of the MOPS instructions with
> watchpoints.
> 
> Tested on Ubuntu 23.10 aarch64-linux-gnu with no regressions, using the
> Arm FVP emulator as well as QEMU v8.2.
> 
> Thiago Jung Bauermann (5):
>   gdb/aarch64: Implement software single stepping for MOPS instructions
>   gdb/aarch64: Add record support for MOPS instructions.
>   gdb/testsuite: Add gdb.arch/aarch64-mops-watchpoint.exp
>   gdb/testsuite: Add gdb.arch/aarch64-mops-atomic-inst.exp
>   gdb/testsuite: Add gdb.reverse/aarch64-mops.exp
> 
>  gdb/aarch64-tdep.c                            | 191 +++++++++++++++++-
>  .../gdb.arch/aarch64-mops-atomic-inst.c       |  69 +++++++
>  .../gdb.arch/aarch64-mops-atomic-inst.exp     |  94 +++++++++
>  .../gdb.arch/aarch64-mops-watchpoint.c        |  66 ++++++
>  .../gdb.arch/aarch64-mops-watchpoint.exp      |  79 ++++++++
>  gdb/testsuite/gdb.reverse/aarch64-mops.c      |  71 +++++++
>  gdb/testsuite/gdb.reverse/aarch64-mops.exp    | 171 ++++++++++++++++
>  gdb/testsuite/lib/gdb.exp                     |  61 ++++++
>  8 files changed, 800 insertions(+), 2 deletions(-)
>  create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-atomic-inst.c
>  create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-atomic-inst.exp
>  create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-watchpoint.c
>  create mode 100644 gdb/testsuite/gdb.arch/aarch64-mops-watchpoint.exp
>  create mode 100644 gdb/testsuite/gdb.reverse/aarch64-mops.c
>  create mode 100644 gdb/testsuite/gdb.reverse/aarch64-mops.exp
> 
> 
> base-commit: 84a069db6714ddcf444095ed09dbcd7404834694

Thiago Jung Bauermann May 9, 2024, 3:24 a.m. UTC | #2

Hello Luis,

Thank you for your comments!

Luis Machado <luis.machado@arm.com> writes:

> Hi Thiago,
>
> Thanks for the series!
>
> I just wanted to clarify the approach being used for this feature. From what I noticed,
> we are implementing motion through these sequences of MOPS instructions as atomic,
> right? Similar to the atomic sequences.

Yes, patch 1 makes GDB consider MOPS sequences as atomic.

> Is there a reason for that? From what I gathered, the Linux Kernel implements the
> ability to single-step through each of the MOPS instructions. The catch is that
> if the M and E instructions get interrupted, the sequence will get restarted, and
> the debugger will need to be aware of that.
>
> Was the atomic block approach for MOPS instructions used as a simplification?

No, it was just my mistaken interpretation of the Arm ARM. It says that
"the prologue, main, and epilogue instructions are expected to be run in
succession and to appear consecutively in memory", and I interpreted
that to mean that they're atomic. I didn't know that the kernel could
restart the sequence if the execution gets interrupted.

I tried to test how GDB reacts to the sequence getting restarted by
forcing the inferior to be scheduled to another CPU (by changing its CPU
affinity set) while the MOPS instructions were single-stepped, but I
wasn't able to trigger that behaviour.

Though thinking about it, I don't believe it would affect GDB in any
way. When GDB resumes the inferior, it will figure things out from
scratch when the inferior stops again so it doesn't make a difference
whether it stops in the next instruction or in the prologue instruction.
So IIUC I don't think we need to do anything specific about
single-stepping MOPS instructions then.

Maybe one thing that GDB would need to handle is when the user requests
a breakpoint in the main or epilogue instruction of the sequence. In
that case I think GDB would have to put the trap instruction either in
the prologue instruction or the one after the epilogue instruction. WDYT?

> I suppose displaced-stepping these sequences would be a bit tricky, but doable if
> they were relocated as a block. I'm wondering what the performance impact would be
> of requiring serialization of execution in a non-stop debugging scenario with
> multiple threads.

I didn't think of the possibility of relocating the sequence as a block,
so patch 1 also disables displaced stepping for MOPS instructions. Thank
you for the suggestion. I will implement your idea, though next week we
have the Linaro Connect conference so I think I'll only have something
about two weeks from now.

> One last point, doesn't record/replay require single-stepping each instruction
> individually? Does that go against the atomic block approach?

That's a good question. I'm not sure how the record and replay backend
deals with that, to be honest. But it does work, and I added a testcase
specifically to test reversing the MOPS sequences, and it passes both on
QEMU and FVP.

> I'll go stare at the code.

Thank you! You can skip all of the first patch except for the last hunk,
which disables displaced stepping for the MOPS instructions.

In the next few days I'll post a v3 where patch 1 does only that, and
make the corresponding adjustments the testcase in patch 4 which tests
single stepping.

After Linaro Connect I'll work on the displaced stepping support and (if
necessary) adjusting breakpoints in the middle of MOPS sequences.

Thiago Jung Bauermann May 10, 2024, 5:20 a.m. UTC | #3

Thiago Jung Bauermann <thiago.bauermann@linaro.org> writes:

> Maybe one thing that GDB would need to handle is when the user requests
> a breakpoint in the main or epilogue instruction of the sequence. In
> that case I think GDB would have to put the trap instruction either in
> the prologue instruction or the one after the epilogue instruction. WDYT?

I did some tests with the setp/setm/sete, cpyp/cpym/cpye and
cpyfp/cpyfm/cpyfe sequences and they all work fine if a software
breakpoint is set on their main or epilogue instructions. Tested on
Arm FVP and QEMU 8.2.

So at least for the emulators, adjusting the breakpoint address isn't
necessary.

Luis Machado May 10, 2024, 1:11 p.m. UTC | #4

On 5/10/24 06:20, Thiago Jung Bauermann wrote:
> Thiago Jung Bauermann <thiago.bauermann@linaro.org> writes:
> 
>> Maybe one thing that GDB would need to handle is when the user requests
>> a breakpoint in the main or epilogue instruction of the sequence. In
>> that case I think GDB would have to put the trap instruction either in
>> the prologue instruction or the one after the epilogue instruction. WDYT?
> 
> I did some tests with the setp/setm/sete, cpyp/cpym/cpye and
> cpyfp/cpyfm/cpyfe sequences and they all work fine if a software
> breakpoint is set on their main or epilogue instructions. Tested on
> Arm FVP and QEMU 8.2.
> 
> So at least for the emulators, adjusting the breakpoint address isn't
> necessary.
> 

Thanks for investingating that. Looks like it will be simpler then.

Regarding displaced stepping, we can disable displaced stepping temporarily
for these instructions. If we find a convenient way to relocate the whole block
of MOPS instructions, then we can add that later on.

Guinevere Larsen May 10, 2024, 3:07 p.m. UTC | #5

On 5/9/24 00:24, Thiago Jung Bauermann wrote:
>> One last point, doesn't record/replay require single-stepping each instruction
>> individually? Does that go against the atomic block approach?
> That's a good question. I'm not sure how the record and replay backend
> deals with that, to be honest. But it does work, and I added a testcase
> specifically to test reversing the MOPS sequences, and it passes both on
> QEMU and FVP.

On x86 we require it. That's because we record the current value of all 
registers of interest, including PC, which means if you tried to 
disassemble many instructions and record them all, you'd be recording 
only the PC of the first instruction many times.  I haven't looked at 
the Arm tdep stuff for record but I can't imagine it would be too 
different. If I'm right, the test works because you don't try to 
reverse-stepi to the middle of the asm statement, you just look at the 
full state before and after.

If GDB will treat the block as always atomic, then I guess we don't need 
to save the PCs of every instruction, so this approach is fine, but if 
people are expected to be able to stepi in between the blocks, reverse 
should be able to do the same. If things are atomic, should we try to 
warn users who try to stepi/reverse-stepi through those blocks?

Sidenote, I haven't had time to look at the code yet, but personally, 
I'd like to see the test introduced at the same commit where support was 
added, because that would make it easier to bisect in the future, if 
necessary :)

Thiago Jung Bauermann May 23, 2024, 1:43 a.m. UTC | #6

Hello Guinevere,

Guinevere Larsen <blarsen@redhat.com> writes:

> On 5/9/24 00:24, Thiago Jung Bauermann wrote:
>
>>>  One last point, doesn't record/replay require single-stepping each instruction
>>> individually? Does that go against the atomic block approach?
>>
>> That's a good question. I'm not sure how the record and replay backend
>> deals with that, to be honest. But it does work, and I added a testcase
>> specifically to test reversing the MOPS sequences, and it passes both on
>> QEMU and FVP.
>
> On x86 we require it. That's because we record the current value of all registers of interest,
> including PC, which means if you tried to disassemble many instructions and record them all, you'd
> be recording only the PC of the first instruction many times.  I haven't looked at the Arm tdep
> stuff for record but I can't imagine it would be too different. If I'm right, the test works because
> you don't try to reverse-stepi to the middle of the asm statement, you just look at the full state
> before and after.

Thank you for the explanation!

> If GDB will treat the block as always atomic, then I guess we don't need to save the PCs of every
> instruction, so this approach is fine, but if people are expected to be able to stepi in between the
> blocks, reverse should be able to do the same. If things are atomic, should we try to warn users
> who try to stepi/reverse-stepi through those blocks?

For MOPS instructions specifically in later versions of the patch series
we're not treating them as atomic so this issue doesn't apply
anymore. But for other atomic instructions (e.g., acquire/release) if
the user stepi's (steps-i?) on the first instruction in the atomic
sequence, GDB sets a breakpoint at the end of the sequence and continues
the inferior until there. It doesn't show any warning or notice, though
perhaps that would be friendlier.

> Sidenote, I haven't had time to look at the code yet, but personally, I'd like to see the test
> introduced at the same commit where support was added, because that would make it easier to
> bisect in the future, if necessary :)

Makes sense. Done for v4.

[v2,0/5] Add support for AArch64 MOPS instructions

Message

Comments