Message ID | CAOgzsHWxMVNNns2UBiUbYdiVd8U_FZbc+20Xmtbca1nhEEKWbw@mail.gmail.com |
---|---|
State | New |
Hi Greg

> Ah... Yes, using A9 (GICv1) which means you don't have grouping without the
> security extensions.

OK, switching the GIC to version 2 seems to work, in the sense that Linux still boots and I get a FIQ.

I still have some problems: it seems as if the exception in the bugsplat below is called from handle_fasteoi_irq (or is it just interrupted?). That would mean that the CPU is not jumping to the FIQ handler but to the normal IRQ handler. This might point to a problem in the QEMU FIQ code. But I am not sure, so the error might also be in the Linux user mode.

I have loaded a firmware my driver module with "set_fiq_handler", but the area where the FIQ has landed (0xffff1240) is filled completely with zeros?

Best regards
Tim

Bad mode in data abort handler detected
Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP ARM
Modules linked in: firq(O) ipv6
CPU: 0 PID: 103 Comm: systemd-udevd Tainted: G O 3.14.0 #1
task: bf2b9300 ti: bf362000 task.ti: bf362000
PC is at 0xffff1240
LR is at handle_fasteoi_irq+0x9c/0x13c
pc : [<ffff1240>] lr : [<8005cda0>] psr: 600f01d1
sp : bf363e70 ip : 07a7e79d fp : 00000000
r10: 76f92008 r9 : 80590080 r8 : 76e8e4d0
r7 : f8200100 r6 : bf363fb0 r5 : bf008414 r4 : bf0083c0
r3 : 80230d04 r2 : 0000002f r1 : 00000000 r0 : bf0083c0
Flags: nZCv IRQs off FIQs off Mode FIQ_32 ISA ARM Segment user
Control: 10c53c7d Table: 60004059 DAC: 00000015
Process systemd-udevd (pid: 103, stack limit = 0xbf362240)
Stack: (0xbf363e70 to 0xbf364000)
3e60: bf0083c0 00000000 0000002f 80230d04
3e80: bf0083c0 bf008414 bf363fb0 f8200100 76e8e4d0 80590080 76f92008 00000000
3ea0: 07a7e79d bf363e70 8005cda0 ffff1240 600f01d1 ffffffff 8005cd04 0000002f
3ec0: 0000002f 800598bc 8058cc70 8000ed00 f820010c 8059684c bf363ef8 80008528
3ee0: 80023730 80023744 200f0113 ffffffff bf363f2c 80012180 00000000 805baa00
3f00: 00000000 00000100 00000002 00000022 00000000 bf362000 76e8e4d0 80590080
3f20: 76f92008 00000000 0000000a bf363f40 80023730 80023744 200f0113 ffffffff
3f40: bf007a14 8059ac00 00000000 0000000a ffff8dd7 00400140 bf0079c0 8058cc70
3f60: 00000022 00000000 f8200100 76e8e4d0 76f9201c 76f92008 00000000 80023af0
3f80: 8058cc70 8000ed04 f820010c 8059684c bf363fb0 80008528 00000000 76dd3b44
3fa0: 600f0010 ffffffff 0000000c 8001233c 00000000 00000000 76f93428 76f93428
3fc0: 76f93438 00000000 76f93448 0000000c 76e8e4d0 76f9201c 76f92008 00000000
3fe0: 00000000 7ec115c0 76f60914 76dd3b44 600f0010 ffffffff 9fffd821 9fffdc21
[<8005cda0>] (handle_fasteoi_irq) from [<80230d04>] (gic_eoi_irq+0x0/0x4c)
[<80230d04>] (gic_eoi_irq) from [<f8200100>] (0xf8200100)
Code: ee02af10 f57ff06f e59d8000 e59d9004 (e599b00c)
---[ end trace 3dc3571209a017e1 ]---
Kernel panic - not syncing: Fatal exception in interrupt
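As a cross-check of the "Flags:" line in the dump, the psr word can be decoded by hand. The following is only a minimal Python sketch, assuming the standard ARMv7-A CPSR layout (N/Z/C/V in bits 31..28, I in bit 7, F in bit 6, T in bit 5, mode in bits 4..0). The names `decode_psr` and `MODES` are hypothetical and exist only for this illustration.

```python
# Hypothetical helper: decode an ARMv7-A CPSR/SPSR value.
# Mode encodings cover the common ARMv7-A modes only.
MODES = {
    0x10: "USR", 0x11: "FIQ", 0x12: "IRQ", 0x13: "SVC",
    0x17: "ABT", 0x1B: "UND", 0x1F: "SYS",
}

def decode_psr(psr):
    # Upper-case letter means the condition flag is set, as in the kernel dump.
    flags = "".join(
        name.upper() if psr & (1 << bit) else name.lower()
        for name, bit in (("n", 31), ("z", 30), ("c", 29), ("v", 28))
    )
    return {
        "flags": flags,
        "irqs_masked": bool(psr & (1 << 7)),  # CPSR.I
        "fiqs_masked": bool(psr & (1 << 6)),  # CPSR.F
        "thumb": bool(psr & (1 << 5)),        # CPSR.T
        "mode": MODES.get(psr & 0x1F, "unknown"),
    }

# psr from the register dump, plus two saved values visible in the stack dump.
for value in (0x600F01D1, 0x200F0113, 0x600F0010):
    print(hex(value), decode_psr(value))
```

For 0x600f01d1 this gives FIQ mode with both I and F set, which agrees with "IRQs off FIQs off Mode FIQ_32" in the oops.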
Hi Tim, Responses inline. Regards, Greg On 4 November 2014 09:40, Tim Sander <tim@krieglstein.org> wrote: > Hi Greg > > Ah... Yes, using A9 (GICv1) which means you don't have grouping without > the > > security extensions. > Ok switching the GIC to version 2 works seems to work. In a way that Linux > still > boots up and i get a FIQ. > > I have some problems still: > It seems as if the exeption of the bugsplat below > is called from handle_fasteoi_irq (or is it just interrupted?). Which > would mean > that the cpu is not jumping to the FIQ handler but the normal irq handler. > This > might point to a problem in the qemu FIQ code. But i am not sure, so the > error > might also be in the linux user mode. > > I have loaded a firmware my driver module with "set_fiq_handler" but the > area > where the fiq has landed (0xfff1240) is filled completly with zeros? > > Best regards > Tim > > Bad mode in data abort handler detected > Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP ARM > Modules linked in: firq(O) ipv6 > CPU: 0 PID: 103 Comm: systemd-udevd Tainted: G O 3.14.0 #1 > task: bf2b9300 ti: bf362000 task.ti: bf362000 > PC is at 0xffff1240 > LR is at handle_fasteoi_irq+0x9c/0x13c > pc : [<ffff1240>] lr : [<8005cda0>] psr: 600f01d1 > sp : bf363e70 ip : 07a7e79d fp : 00000000 > r10: 76f92008 r9 : 80590080 r8 : 76e8e4d0 > r7 : f8200100 r6 : bf363fb0 r5 : bf008414 r4 : bf0083c0 > r3 : 80230d04 r2 : 0000002f r1 : 00000000 r0 : bf0083c0 > Flags: nZCv IRQs off FIQs off Mode FIQ_32 ISA ARM Segment user > It looks like we are in FIQ mode and interrupts have been masked. 
> Control: 10c53c7d Table: 60004059 DAC: 00000015 > Process systemd-udevd (pid: 103, stack limit = 0xbf362240) > Stack: (0xbf363e70 to 0xbf364000) > 3e60: bf0083c0 00000000 0000002f > 80230d04 > 3e80: bf0083c0 bf008414 bf363fb0 f8200100 76e8e4d0 80590080 76f92008 > 00000000 > 3ea0: 07a7e79d bf363e70 8005cda0 ffff1240 600f01d1 ffffffff 8005cd04 > 0000002f > 3ec0: 0000002f 800598bc 8058cc70 8000ed00 f820010c 8059684c bf363ef8 > 80008528 > 3ee0: 80023730 80023744 200f0113 ffffffff bf363f2c 80012180 00000000 > 805baa00 > 3f00: 00000000 00000100 00000002 00000022 00000000 bf362000 76e8e4d0 > 80590080 > 3f20: 76f92008 00000000 0000000a bf363f40 80023730 80023744 200f0113 > ffffffff > 3f40: bf007a14 8059ac00 00000000 0000000a ffff8dd7 00400140 bf0079c0 > 8058cc70 > 3f60: 00000022 00000000 f8200100 76e8e4d0 76f9201c 76f92008 00000000 > 80023af0 > 3f80: 8058cc70 8000ed04 f820010c 8059684c bf363fb0 80008528 00000000 > 76dd3b44 > 3fa0: 600f0010 ffffffff 0000000c 8001233c 00000000 00000000 76f93428 > 76f93428 > 3fc0: 76f93438 00000000 76f93448 0000000c 76e8e4d0 76f9201c 76f92008 > 00000000 > 3fe0: 00000000 7ec115c0 76f60914 76dd3b44 600f0010 ffffffff 9fffd821 > 9fffdc21 > [<8005cda0>] (handle_fasteoi_irq) from [<80230d04>] (gic_eoi_irq+0x0/0x4c) > It certainly looks like we are going down the standard IRQ patch as you suggested. I'm not a Linux driver guy, but do you see any kind of activity (break points, printfs, ...) through your FIQ handler? > [<80230d04>] (gic_eoi_irq) from [<f8200100>] (0xf8200100) > Code: ee02af10 f57ff06f e59d8000 e59d9004 (e599b00c) > ---[ end trace 3dc3571209a017e1 ]--- > Kernel panic - not syncing: Fatal exception in interrupt > > It is hard to determine entirely what is happening here based on this info. I do have code of my own that routes KGDB interrupts as FIQs and with the workaround I see the FIQs handled as expected. Some things we can try to get more info in hopes of pinpointing where to look: 1. 
At the top of hw/intc/arm_gic.c there is the following commented-out line: //#define DEBUG_GIC Uncomment the line, rebuild and rerun. This will give us some trace of what is going through the GIC code.
2. Run QEMU with the "-d int" option, which prints a message on each interrupt. We should see an FIQ at some point if they are occurring. The only issue is that there will be numerous IRQs, so you'll have to search through them to find an "exception 6 [FIQ]".
3. If you set a breakpoint in your driver, is it possible to see from the kernel debugger that FIQs are on? Clearly you have to try this from a path where interrupts are masked. I see the following on my system mentioned above:
...
Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
...
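Searching the "-d int" output for the FIQ by hand gets tedious once the IRQs pile up. A small Python sketch along these lines could do the filtering. It assumes the log lines have the form "Taking exception N [NAME]", as in the output posted in this thread; `find_exceptions` and the sample list are hypothetical and only for illustration.

```python
import re

# Assumed line format, taken from the log excerpts in this thread.
EXC_RE = re.compile(r"Taking exception (\d+) \[([^\]]+)\]")

def find_exceptions(lines, wanted="FIQ"):
    """Yield (line_number, exception_number) for each exception named `wanted`."""
    for lineno, line in enumerate(lines, start=1):
        match = EXC_RE.search(line)
        if match and match.group(2) == wanted:
            yield lineno, int(match.group(1))

# Illustrative sample; in practice iterate over the captured log file instead.
sample = [
    "Taking exception 2 [SVC]",
    "Taking exception 2 [SVC]",
    "arm_gic: Raised pending FIQ 49 (cpu 0)",
    "Taking exception 6 [FIQ]",
    "Taking exception 3 [Prefetch Abort]",
]

for lineno, number in find_exceptions(sample):
    print("line", lineno, "exception", number)
```

Passing `wanted="Prefetch Abort"` or `wanted="Data Abort"` picks out the aborts that follow the FIQ instead.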
Hi Greg > > Bad mode in data abort handler detected > > Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP ARM > > Modules linked in: firq(O) ipv6 > > CPU: 0 PID: 103 Comm: systemd-udevd Tainted: G O 3.14.0 #1 > > task: bf2b9300 ti: bf362000 task.ti: bf362000 > > PC is at 0xffff1240 > > LR is at handle_fasteoi_irq+0x9c/0x13c > > pc : [<ffff1240>] lr : [<8005cda0>] psr: 600f01d1 > > sp : bf363e70 ip : 07a7e79d fp : 00000000 > > r10: 76f92008 r9 : 80590080 r8 : 76e8e4d0 > > r7 : f8200100 r6 : bf363fb0 r5 : bf008414 r4 : bf0083c0 > > r3 : 80230d04 r2 : 0000002f r1 : 00000000 r0 : bf0083c0 > > Flags: nZCv IRQs off FIQs off Mode FIQ_32 ISA ARM Segment user > > It looks like we are in FIQ mode and interrupts have been masked. Indeed. > > Control: 10c53c7d Table: 60004059 DAC: 00000015 > > Process systemd-udevd (pid: 103, stack limit = 0xbf362240) > > Stack: (0xbf363e70 to 0xbf364000) > > 3e60: bf0083c0 00000000 0000002f > > 80230d04 > > 3e80: bf0083c0 bf008414 bf363fb0 f8200100 76e8e4d0 80590080 76f92008 > > 00000000 > > 3ea0: 07a7e79d bf363e70 8005cda0 ffff1240 600f01d1 ffffffff 8005cd04 > > 0000002f > > 3ec0: 0000002f 800598bc 8058cc70 8000ed00 f820010c 8059684c bf363ef8 > > 80008528 > > 3ee0: 80023730 80023744 200f0113 ffffffff bf363f2c 80012180 00000000 > > 805baa00 > > 3f00: 00000000 00000100 00000002 00000022 00000000 bf362000 76e8e4d0 > > 80590080 > > 3f20: 76f92008 00000000 0000000a bf363f40 80023730 80023744 200f0113 > > ffffffff > > 3f40: bf007a14 8059ac00 00000000 0000000a ffff8dd7 00400140 bf0079c0 > > 8058cc70 > > 3f60: 00000022 00000000 f8200100 76e8e4d0 76f9201c 76f92008 00000000 > > 80023af0 > > 3f80: 8058cc70 8000ed04 f820010c 8059684c bf363fb0 80008528 00000000 > > 76dd3b44 > > 3fa0: 600f0010 ffffffff 0000000c 8001233c 00000000 00000000 76f93428 > > 76f93428 > > 3fc0: 76f93438 00000000 76f93448 0000000c 76e8e4d0 76f9201c 76f92008 > > 00000000 > > 3fe0: 00000000 7ec115c0 76f60914 76dd3b44 600f0010 ffffffff 9fffd821 > > 9fffdc21 > > [<8005cda0>] 
(handle_fasteoi_irq) from [<80230d04>] (gic_eoi_irq+0x0/0x4c) > > It certainly looks like we are going down the standard IRQ patch as you > suggested. I'm not a Linux driver guy, but do you see any kind of activity > (break points, printfs, ...) through your FIQ handler? I am reaching 0xffff1224 which i believe is the fiq vector address on the vexpress? > > [<80230d04>] (gic_eoi_irq) from [<f8200100>] (0xf8200100) > > Code: ee02af10 f57ff06f e59d8000 e59d9004 (e599b00c) > > ---[ end trace 3dc3571209a017e1 ]--- > > Kernel panic - not syncing: Fatal exception in interrupt > > It is hard to determine entirely what is happening here based on this > info. I do have code of my own that routes KGDB interrupts as FIQs and > with the workaround I see the FIQs handled as expected. Some things we can > try to get more info in hopes of pinpointing where to look: > > 1. At the top of hw/intc/arm_gic.c there is the following commented out > line: > //#define DEBUG_GIC > Uncomment the line, rebuild and rerun. This will give us some trace on > what is going through the GIC code. I have commented out some debug lines but i see: Breakpoint 1, gic_update_with_grouping (s=0x5555564dba80) at hw/intc/arm_gic.c:120 120 DPRINTF("Raised pending FIQ %d (cpu %d)\n", best_irq, cpu); With the expected irq nr. 49 (32+17). > 2. Run qemu with the "-d int" option which will print a message on each > interrupt. We should see an FIQ at some point if they are occurring. The > only issue is that there will be numerous IRQs, so you'll have to parse > through them to find an "exception 6 [FIQ]. 
Here is the relevant output when the FIQ hits:
Taking exception 2 [SVC]
Taking exception 2 [SVC]
pml: pml_timer_tick: raise_irq
arm_gic: Raised pending FIQ 49 (cpu 0)
Taking exception 6 [FIQ]
pml: pml_write: update control flags: 1
pml: pml_update: start timer
pml: pml_update: lower irq
pml: pml_read: read magic
pml: pml_write: update control flags: 3
pml: pml_update: start timer
Taking exception 3 [Prefetch Abort]
...with IFSR 0x5 IFAR 0x80221d70
Taking exception 4 [Data Abort]
...with DFSR 0x805 DFAR 0x805c604c
Taking exception 4 [Data Abort]
...with DFSR 0x805 DFAR 0x805c604c
Taking exception 4 [Data Abort]

So the FIQ is hitting, but unfortunately I have no idea where the data aborts are coming from. I have shifted all other IRQs besides 49 to group 1 so that only IRQ 49 is a FIQ. Might it be that I am seeing some security violations? The IFAR address falls in __idr_pre_get, which lives in the Linux kernel in lib/idr.c and seems to implement integer ID management.

> 3. If you set a breakpoint in your driver, is it possible to see that
> FIQs are on from the kernel debugger. Clearly you have to try this from
> a path where interrupts are masked. I see the following on my system
> mentioned above:
> ...
> Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel
> ...

So you mean by debugging via the QEMU debug port? I have not enabled kgdb. As stated above, I was not able to catch the FIQ there. But it might be that i get

I have debugged QEMU to see if the IRQ is routed correctly.
The depeest call i could find is this: bt #0 tcg_handle_interrupt (cpu=0x555556450790, mask=16) at /home/sander/speedy/soc/qemu/translate-all.c:1503 #1 0x0000555555755323 in cpu_interrupt (cpu=0x555556450790, mask=16) at /home/sander/speedy/soc/qemu/include/qom/cpu.h:556 #2 0x00005555557561b7 in arm_cpu_set_irq (opaque=0x555556450790, irq=1, level=1) at /home/sander/speedy/soc/qemu/target-arm/cpu.c:261 #3 0x00005555558193ec in qemu_set_irq (irq=0x55555642c840, level=1) at hw/core/irq.c:43 #4 0x0000555555879073 in gic_update_with_grouping (s=0x5555564dba80) at hw/intc/arm_gic.c:132 #5 0x000055555587936d in gic_update (s=0x5555564dba80) at hw/intc/arm_gic.c:180 #6 0x00005555558798a7 in gic_set_irq (opaque=0x5555564dba80, irq=49, level=1) at hw/intc/arm_gic.c:264 #7 0x00005555558193ec in qemu_set_irq (irq=0x555556432b00, level=1) at hw/core/irq.c:43 #8 0x0000555555661d4d in a9mp_priv_set_irq (opaque=0x5555564d7260, irq=17, level=1) at /home/sander/speedy/soc/qemu/hw/cpu/a9mpcore.c:17 #9 0x00005555558193ec in qemu_set_irq (irq=0x5555564f3c00, level=1) at hw/core/irq.c:43 #10 0x00005555558f6fed in qemu_irq_raise (irq=0x5555564f3c00) at /home/sander/speedy/soc/qemu/include/hw/irq.h:16 #11 0x00005555558f7363 in pml_timer_tick (opaque=0x555556595020) at hw/timer/pml.c:95 #12 0x000055555599be6e in aio_bh_poll (ctx=0x5555563fdad0) at async.c:82 #13 0x00005555559b2d9f in aio_dispatch (ctx=0x5555563fdad0) at aio-posix.c:137 #14 0x000055555599c2cb in aio_ctx_dispatch (source=0x5555563fdad0, callback=0x0, user_data=0x0) at async.c:221 #15 0x00007ffff7901e04 in g_main_context_dispatch () from /lib/x86_64-linux-gnu/libglib-2.0.so.0 #16 0x00005555559b0a79 in glib_pollfds_poll () at main-loop.c:200 #17 0x00005555559b0b7a in os_host_main_loop_wait (timeout=0) at main-loop.c:245 #18 0x00005555559b0c52 in main_loop_wait (nonblocking=1) at main-loop.c:494 #19 0x0000555555791d8b in main_loop () at vl.c:1872 #20 0x00005555557998d5 in main (argc=22, argv=0x7fffffffda38, 
envp=0x7fffffffdaf0) at vl.c:4348

I am not sure if arm_cpu_set_irq(opaque=0x555556450790, irq=1, level=1) represents a FIQ and if mask 16 is the correct mask for the FIQ request.

Frame #6 clearly shows that IRQ 49, which is configured to Group 0, is triggered. All other interrupts are configured to Group 1 by my Linux kernel. The call in frame #4, gic_update_with_grouping, shows that grouping within the GIC is enabled and that the IRQ is triggered as a FIQ within QEMU. All of this looks good as far as I understand. So I am pretty confident that QEMU is working correctly (minus the Prefetch and Data Aborts).

Best regards
Tim
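The IFSR and DFSR values in the "-d int" output above can be decoded to see what kind of aborts these are. The sketch below is a rough Python illustration, assuming the ARMv7-A short-descriptor fault status format (status in bit 10 and bits 3..0, write-not-read in bit 11 of the DFSR). Both values have bit 9 clear, which is consistent with that format, but this is an assumption about the kernel configuration. `decode_fsr` and `FS_NAMES` are hypothetical names, and the table lists only a subset of the encodings.

```python
# Subset of ARMv7-A short-descriptor fault status encodings (FS[4:0]).
FS_NAMES = {
    0b00001: "alignment fault",
    0b00101: "translation fault, section",
    0b00111: "translation fault, page",
    0b01001: "domain fault, section",
    0b01011: "domain fault, page",
    0b01101: "permission fault, section",
    0b01111: "permission fault, page",
}

def decode_fsr(fsr):
    # FS[4] lives in bit 10, FS[3:0] in bits 3..0.
    fs = (((fsr >> 10) & 1) << 4) | (fsr & 0xF)
    return {
        "status": FS_NAMES.get(fs, "other (0b{:05b})".format(fs)),
        # WnR is only meaningful for the DFSR.
        "write": bool(fsr & (1 << 11)),
    }

print("IFSR 0x5   ->", decode_fsr(0x5))
print("DFSR 0x805 ->", decode_fsr(0x805))
```

Under that assumption both values decode to a section translation fault, and the DFSR additionally marks the faulting access as a write.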
On 12 November 2014 07:56, Tim Sander <tim@krieglstein.org> wrote: > Hi Greg > > > > Bad mode in data abort handler detected > > > Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP ARM > > > Modules linked in: firq(O) ipv6 > > > CPU: 0 PID: 103 Comm: systemd-udevd Tainted: G O 3.14.0 #1 > > > task: bf2b9300 ti: bf362000 task.ti: bf362000 > > > PC is at 0xffff1240 > > > LR is at handle_fasteoi_irq+0x9c/0x13c > > > pc : [<ffff1240>] lr : [<8005cda0>] psr: 600f01d1 > > > sp : bf363e70 ip : 07a7e79d fp : 00000000 > > > r10: 76f92008 r9 : 80590080 r8 : 76e8e4d0 > > > r7 : f8200100 r6 : bf363fb0 r5 : bf008414 r4 : bf0083c0 > > > r3 : 80230d04 r2 : 0000002f r1 : 00000000 r0 : bf0083c0 > > > Flags: nZCv IRQs off FIQs off Mode FIQ_32 ISA ARM Segment user > > > > It looks like we are in FIQ mode and interrupts have been masked. > Indeed. > > > > Control: 10c53c7d Table: 60004059 DAC: 00000015 > > > Process systemd-udevd (pid: 103, stack limit = 0xbf362240) > > > Stack: (0xbf363e70 to 0xbf364000) > > > 3e60: bf0083c0 00000000 0000002f > > > 80230d04 > > > 3e80: bf0083c0 bf008414 bf363fb0 f8200100 76e8e4d0 80590080 76f92008 > > > 00000000 > > > 3ea0: 07a7e79d bf363e70 8005cda0 ffff1240 600f01d1 ffffffff 8005cd04 > > > 0000002f > > > 3ec0: 0000002f 800598bc 8058cc70 8000ed00 f820010c 8059684c bf363ef8 > > > 80008528 > > > 3ee0: 80023730 80023744 200f0113 ffffffff bf363f2c 80012180 00000000 > > > 805baa00 > > > 3f00: 00000000 00000100 00000002 00000022 00000000 bf362000 76e8e4d0 > > > 80590080 > > > 3f20: 76f92008 00000000 0000000a bf363f40 80023730 80023744 200f0113 > > > ffffffff > > > 3f40: bf007a14 8059ac00 00000000 0000000a ffff8dd7 00400140 bf0079c0 > > > 8058cc70 > > > 3f60: 00000022 00000000 f8200100 76e8e4d0 76f9201c 76f92008 00000000 > > > 80023af0 > > > 3f80: 8058cc70 8000ed04 f820010c 8059684c bf363fb0 80008528 00000000 > > > 76dd3b44 > > > 3fa0: 600f0010 ffffffff 0000000c 8001233c 00000000 00000000 76f93428 > > > 76f93428 > > > 3fc0: 76f93438 00000000 76f93448 
0000000c 76e8e4d0 76f9201c 76f92008 > > > 00000000 > > > 3fe0: 00000000 7ec115c0 76f60914 76dd3b44 600f0010 ffffffff 9fffd821 > > > 9fffdc21 > > > [<8005cda0>] (handle_fasteoi_irq) from [<80230d04>] > (gic_eoi_irq+0x0/0x4c) > > > > It certainly looks like we are going down the standard IRQ patch as you > > suggested. I'm not a Linux driver guy, but do you see any kind of > activity > > (break points, printfs, ...) through your FIQ handler? > I am reaching 0xffff1224 which i believe is the fiq vector address on the > vexpress? > Hmmm.... not sure. As you mentioned previously (and as seen in the above register dump), I would expect offset 0x1240 (pc=0xffff1240) for an FIQ. I'm not sure what is at offset 0x1224, but on my Linux kernel it appears that offset 0x1220 is vector_addrexcptn (not pabort), that happens to occupy the HYP trap vector. > > > > [<80230d04>] (gic_eoi_irq) from [<f8200100>] (0xf8200100) > > > Code: ee02af10 f57ff06f e59d8000 e59d9004 (e599b00c) > > > ---[ end trace 3dc3571209a017e1 ]--- > > > Kernel panic - not syncing: Fatal exception in interrupt > > > > It is hard to determine entirely what is happening here based on this > > info. I do have code of my own that routes KGDB interrupts as FIQs and > > with the workaround I see the FIQs handled as expected. Some things we > can > > try to get more info in hopes of pinpointing where to look: > > > > 1. At the top of hw/intc/arm_gic.c there is the following commented > out > > line: > > //#define DEBUG_GIC > > Uncomment the line, rebuild and rerun. This will give us some trace > on > > what is going through the GIC code. > I have commented out some debug lines but i see: > Breakpoint 1, gic_update_with_grouping (s=0x5555564dba80) at > hw/intc/arm_gic.c:120 > 120 DPRINTF("Raised pending FIQ %d (cpu %d)\n", > best_irq, cpu); > > With the expected irq nr. 49 (32+17). > > > 2. Run qemu with the "-d int" option which will print a message on > each > > interrupt. 
We should see an FIQ at some point if they are occurring. > The > > only issue is that there will be numerous IRQs, so you'll have to parse > > through them to find an "exception 6 [FIQ]. > Here is the relevant output when the FIQ hits: > Taking exception 2 [SVC] > Taking exception 2 [SVC] > pml: pml_timer_tick: raise_irq > arm_gic: Raised pending FIQ 49 (cpu 0) > Taking exception 6 [FIQ] > This looks to me like the GIC has caught the interrupt and communicated it to the CPU causing it to take the FIQ exception. > pml: pml_write: update control flags: 1 > pml: pml_update: start timer > pml: pml_update: lower irq > pml: pml_read: read magic > pml: pml_write: update control flags: 3 > pml: pml_update: start timer > Is pml your test driver? It looks like it initiates the interrupt and possibly performs some handling following it? > Taking exception 3 [Prefetch Abort] > ...with IFSR 0x5 IFAR 0x80221d70 > Taking exception 4 [Data Abort] > ...with DFSR 0x805 DFAR 0x805c604c > Taking exception 4 [Data Abort] > ...with DFSR 0x805 DFAR 0x805c604c > Taking exception 4 [Data Abort] > > So the fiq is hitting but unfortunatly i have no idea where the data > aborts are coming from. > The data aborts are likely a side effect of the prefetch abort taken before them; it is the interesting one. > I have shifted all other Irqs besides 49 to group 1 so that only irq 49 is > a FIQ. > Might it be that i am seeing some secure violations... > The address of the IFAR __idr_pre_get which lives in the linux kernel in > lib/idr.c seems to > be implementing ann integer ID management. > > > 3. If you set a breakpoint in your driver, is it possible to see that > > FIQs are on from the kernel debugger. Clearly you have to try this > from > > a path where interrupts are masked. I see the following on my system > > mentioned above: > > ... > > Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel > > ... > So you mean by debugging via the qemu debug port? I have not enabled the > kgdb. 
> As stated above, i was not able to catch the fiq irq there. But it might > be that i get > > I have debugged qemu to see if the irq is routed correctly. The depeest > call i could find is this: bt > #0 tcg_handle_interrupt (cpu=0x555556450790, mask=16) at > /home/sander/speedy/soc/qemu/translate-all.c:1503 > #1 0x0000555555755323 in cpu_interrupt (cpu=0x555556450790, mask=16) > at /home/sander/speedy/soc/qemu/include/qom/cpu.h:556 > #2 0x00005555557561b7 in arm_cpu_set_irq (opaque=0x555556450790, irq=1, > level=1) > at /home/sander/speedy/soc/qemu/target-arm/cpu.c:261 > #3 0x00005555558193ec in qemu_set_irq (irq=0x55555642c840, level=1) at > hw/core/irq.c:43 > #4 0x0000555555879073 in gic_update_with_grouping (s=0x5555564dba80) at > hw/intc/arm_gic.c:132 > #5 0x000055555587936d in gic_update (s=0x5555564dba80) at > hw/intc/arm_gic.c:180 > #6 0x00005555558798a7 in gic_set_irq (opaque=0x5555564dba80, irq=49, > level=1) at hw/intc/arm_gic.c:264 > #7 0x00005555558193ec in qemu_set_irq (irq=0x555556432b00, level=1) at > hw/core/irq.c:43 > #8 0x0000555555661d4d in a9mp_priv_set_irq (opaque=0x5555564d7260, > irq=17, level=1) > at /home/sander/speedy/soc/qemu/hw/cpu/a9mpcore.c:17 > #9 0x00005555558193ec in qemu_set_irq (irq=0x5555564f3c00, level=1) at > hw/core/irq.c:43 > #10 0x00005555558f6fed in qemu_irq_raise (irq=0x5555564f3c00) at > /home/sander/speedy/soc/qemu/include/hw/irq.h:16 > #11 0x00005555558f7363 in pml_timer_tick (opaque=0x555556595020) at > hw/timer/pml.c:95 > #12 0x000055555599be6e in aio_bh_poll (ctx=0x5555563fdad0) at async.c:82 > #13 0x00005555559b2d9f in aio_dispatch (ctx=0x5555563fdad0) at > aio-posix.c:137 > #14 0x000055555599c2cb in aio_ctx_dispatch (source=0x5555563fdad0, > callback=0x0, user_data=0x0) at async.c:221 > #15 0x00007ffff7901e04 in g_main_context_dispatch () from > /lib/x86_64-linux-gnu/libglib-2.0.so.0 > #16 0x00005555559b0a79 in glib_pollfds_poll () at main-loop.c:200 > #17 0x00005555559b0b7a in os_host_main_loop_wait (timeout=0) 
at
> main-loop.c:245
> #18 0x00005555559b0c52 in main_loop_wait (nonblocking=1) at main-loop.c:494
> #19 0x0000555555791d8b in main_loop () at vl.c:1872
> #20 0x00005555557998d5 in main (argc=22, argv=0x7fffffffda38,
> envp=0x7fffffffdaf0) at vl.c:4348
>
> I am not sure if arm_cpu_set_irq(opaque=0x555556450790, irq=1, level=1)
> represents a fiq
> and if mask 16 is the correct mask for the fiq request.

Yeah, this routine handles both IRQs and FIQs. I don't see anything above that stands out as suspicious. It may be interesting to try the same test driver on an A15 emulation if it is not too much trouble. This would rule out the possibility that the A9 workaround is not sufficient to make the GIC behave as a GICv2.

> Row #6 show clearly that irq 49 configured to Group 0 is triggered. All
> other interrupt are configured to Group 1
> from my Linux kernel. The call to #4 gic_update_with_grouping shows that
> grouping within the GIC is enabled
> and that irq is triggered as FIQ within qemu. All of this looks good as
> far as i understand. So i am pretty confident
> that qemu is working correctly (minus the Prefetch and Data Aborts).

I agree that QEMU appears to be handling the FIQ properly and it appears that the CPU is trying to dispatch it. I understand that Linux FIQ handling is a little trickier than IRQ handling, so I suspect that something in either the Linux kernel handling or your driver is going awry during handling of the FIQ or as a result of it. Let me know if you need any additional help or if you discover any misbehavior.

> Best regards
> Tim

Regards,
Greg
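The numbers in the backtrace can be checked with a little arithmetic. The Python sketch below is only an illustration: the SPI base of 32 and the GICD_IGROUPRn layout (offset 0x080, one bit per interrupt, bit clear meaning Group 0) follow the GICv2 architecture, while the two QEMU constants are assumptions recalled from QEMU headers of that period and should be verified against the tree being debugged. All function names are hypothetical.

```python
SPI_BASE = 32          # GIC interrupt IDs 0..31 are SGIs/PPIs; SPIs start at 32
GICD_IGROUPR = 0x080   # distributor offset of the first interrupt group register

# Assumed QEMU values (verify against target-arm/cpu.h in your tree):
ARM_CPU_FIQ = 1            # index of the FIQ input line of the CPU
CPU_INTERRUPT_FIQ = 0x10   # interrupt request mask used for FIQ

def spi_to_gic_id(spi):
    """Map a shared peripheral interrupt number to its GIC interrupt ID."""
    return SPI_BASE + spi

def igroupr_location(gic_id):
    """Return (register offset, bit) selecting the group of `gic_id`."""
    return GICD_IGROUPR + 4 * (gic_id // 32), gic_id % 32

gic_id = spi_to_gic_id(17)              # irq=17 in a9mp_priv_set_irq
offset, bit = igroupr_location(gic_id)  # where Group 0/1 is selected
print("GIC ID", gic_id, "-> IGROUPR offset", hex(offset), "bit", bit)
print("mask 16 is the FIQ mask:", 16 == CPU_INTERRUPT_FIQ)
```

If those assumed constants hold, irq=1 and mask=16 in frames #0 to #2 both correspond to the FIQ line, which fits the conclusion that QEMU routes interrupt 49 as an FIQ.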
Am Mittwoch, 12. November 2014, 10:00:03 schrieb Greg Bellows: > On 12 November 2014 07:56, Tim Sander <tim@krieglstein.org> wrote: > > Hi Greg > > > > > > Bad mode in data abort handler detected > > > > Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP ARM > > > > Modules linked in: firq(O) ipv6 > > > > CPU: 0 PID: 103 Comm: systemd-udevd Tainted: G O 3.14.0 #1 > > > > task: bf2b9300 ti: bf362000 task.ti: bf362000 > > > > PC is at 0xffff1240 > > > > LR is at handle_fasteoi_irq+0x9c/0x13c > > > > pc : [<ffff1240>] lr : [<8005cda0>] psr: 600f01d1 > > > > sp : bf363e70 ip : 07a7e79d fp : 00000000 > > > > r10: 76f92008 r9 : 80590080 r8 : 76e8e4d0 > > > > r7 : f8200100 r6 : bf363fb0 r5 : bf008414 r4 : bf0083c0 > > > > r3 : 80230d04 r2 : 0000002f r1 : 00000000 r0 : bf0083c0 > > > > Flags: nZCv IRQs off FIQs off Mode FIQ_32 ISA ARM Segment user > > > > > > It looks like we are in FIQ mode and interrupts have been masked. > > > > Indeed. > > > > > > Control: 10c53c7d Table: 60004059 DAC: 00000015 > > > > Process systemd-udevd (pid: 103, stack limit = 0xbf362240) > > > > Stack: (0xbf363e70 to 0xbf364000) > > > > 3e60: bf0083c0 00000000 0000002f > > > > 80230d04 > > > > 3e80: bf0083c0 bf008414 bf363fb0 f8200100 76e8e4d0 80590080 76f92008 > > > > 00000000 > > > > 3ea0: 07a7e79d bf363e70 8005cda0 ffff1240 600f01d1 ffffffff 8005cd04 > > > > 0000002f > > > > 3ec0: 0000002f 800598bc 8058cc70 8000ed00 f820010c 8059684c bf363ef8 > > > > 80008528 > > > > 3ee0: 80023730 80023744 200f0113 ffffffff bf363f2c 80012180 00000000 > > > > 805baa00 > > > > 3f00: 00000000 00000100 00000002 00000022 00000000 bf362000 76e8e4d0 > > > > 80590080 > > > > 3f20: 76f92008 00000000 0000000a bf363f40 80023730 80023744 200f0113 > > > > ffffffff > > > > 3f40: bf007a14 8059ac00 00000000 0000000a ffff8dd7 00400140 bf0079c0 > > > > 8058cc70 > > > > 3f60: 00000022 00000000 f8200100 76e8e4d0 76f9201c 76f92008 00000000 > > > > 80023af0 > > > > 3f80: 8058cc70 8000ed04 f820010c 8059684c bf363fb0 80008528 
00000000 > > > > 76dd3b44 > > > > 3fa0: 600f0010 ffffffff 0000000c 8001233c 00000000 00000000 76f93428 > > > > 76f93428 > > > > 3fc0: 76f93438 00000000 76f93448 0000000c 76e8e4d0 76f9201c 76f92008 > > > > 00000000 > > > > 3fe0: 00000000 7ec115c0 76f60914 76dd3b44 600f0010 ffffffff 9fffd821 > > > > 9fffdc21 > > > > [<8005cda0>] (handle_fasteoi_irq) from [<80230d04>] > > > > (gic_eoi_irq+0x0/0x4c) > > > > > It certainly looks like we are going down the standard IRQ patch as you > > > suggested. I'm not a Linux driver guy, but do you see any kind of > > > > activity > > > > > (break points, printfs, ...) through your FIQ handler? > > > > I am reaching 0xffff1224 which i believe is the fiq vector address on the > > vexpress? > > Hmmm.... not sure. As you mentioned previously (and as seen in the above > register dump), I would expect offset 0x1240 (pc=0xffff1240) for an FIQ. > I'm not sure what is at offset 0x1224, but on my Linux kernel it appears > that offset 0x1220 is vector_addrexcptn (not pabort), that happens to > occupy the HYP trap vector. Zounds! You're right, i think this was a typo in my debug script. Which i didn't notice. But i am even reaching 0x1240 before but not 0x1244 which means it aborts on the first fiq instructions. Here is the "-d int" output directly after the FIQ hits: Taking exception 3 [Prefetch Abort] ...with IFSR 0x5 IFAR 0x800c8dcc //kmem_cache_alloc Taking exception 3 [Prefetch Abort] ...with IFSR 0x5 IFAR 0x8001be00 //v7_pabort Taking exception 3 [Prefetch Abort] and then it continue to fail on v7_pabort repeatedly. This shows that there is something fishy going on. It is failing on the presumed handler for the prefetch abort? But as i see earlier resolved prefetched abort errors i can conclude that it works up to the point where the CPU is in FIQ mode. FIQ is special in a way that static mapped memory is needed to avoid a page lookup as this fails under linux in fiq mode. 
But 0x800c8dcc (kmem_cache_alloc) is not called in the FIQ handler which obviously can't use any Linux infrastructure. And as i do not reach the breakpoint 0xffff1244 these misses happen on the execution of the first address of the FIQ handler. > > > > [<80230d04>] (gic_eoi_irq) from [<f8200100>] (0xf8200100) > > > > Code: ee02af10 f57ff06f e59d8000 e59d9004 (e599b00c) > > > > ---[ end trace 3dc3571209a017e1 ]--- > > > > Kernel panic - not syncing: Fatal exception in interrupt > > > > > > It is hard to determine entirely what is happening here based on this > > > info. I do have code of my own that routes KGDB interrupts as FIQs and > > > with the workaround I see the FIQs handled as expected. Some things we > > > > can > > > > > try to get more info in hopes of pinpointing where to look: > > > 1. At the top of hw/intc/arm_gic.c there is the following commented > > > > out > > > > > line: > > > //#define DEBUG_GIC > > > > > > Uncomment the line, rebuild and rerun. This will give us some trace > > > > on > > > > > what is going through the GIC code. > > > > I have commented out some debug lines but i see: > > Breakpoint 1, gic_update_with_grouping (s=0x5555564dba80) at > > hw/intc/arm_gic.c:120 > > 120 DPRINTF("Raised pending FIQ %d (cpu %d)\n", > > best_irq, cpu); > > > > With the expected irq nr. 49 (32+17). > > > > > 2. Run qemu with the "-d int" option which will print a message on > > > > each > > > > > interrupt. We should see an FIQ at some point if they are occurring. > > > > The > > > > > only issue is that there will be numerous IRQs, so you'll have to parse > > > through them to find an "exception 6 [FIQ]. > > > > Here is the relevant output when the FIQ hits: > > Taking exception 2 [SVC] > > Taking exception 2 [SVC] > > pml: pml_timer_tick: raise_irq > > arm_gic: Raised pending FIQ 49 (cpu 0) > > Taking exception 6 [FIQ] > > This looks to me like the GIC has caught the interrupt and communicated it > to the CPU causing it to take the FIQ exception. 
> > > pml: pml_write: update control flags: 1 > > pml: pml_update: start timer > > pml: pml_update: lower irq > > pml: pml_read: read magic > > pml: pml_write: update control flags: 3 > > pml: pml_update: start timer > > Is pml your test driver? It looks like it initiates the interrupt and > possibly performs some handling following it? Yes, its just a simple set of some registers to control an interrupt. There is i added debug output to this driver to see if and when the FIQ is accessing the registers. But i see no accesses from FIQ mode. > > Taking exception 3 [Prefetch Abort] > > ...with IFSR 0x5 IFAR 0x80221d70 > > Taking exception 4 [Data Abort] > > ...with DFSR 0x805 DFAR 0x805c604c > > Taking exception 4 [Data Abort] > > ...with DFSR 0x805 DFAR 0x805c604c > > Taking exception 4 [Data Abort] > > > > So the fiq is hitting but unfortunatly i have no idea where the data > > aborts are coming from. > > The data aborts are likely a side effect of the prefetch abort taken before > them; it is the interesting one. Still as above the address is odd. In FIQ mode it should not jump to this address at all !?! This is definetly Linux memory space and i am not calling anything linux related from FIQ. > > I have shifted all other Irqs besides 49 to group 1 so that only irq 49 is > > a FIQ. > > Might it be that i am seeing some secure violations... > > The address of the IFAR __idr_pre_get which lives in the linux kernel in > > lib/idr.c seems to > > be implementing ann integer ID management. > > > > > 3. If you set a breakpoint in your driver, is it possible to see that > > > FIQs are on from the kernel debugger. Clearly you have to try this > > > > from > > > > > a path where interrupts are masked. I see the following on my system > > > > > > mentioned above: > > > ... > > > Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment kernel > > > ... > > > > So you mean by debugging via the qemu debug port? I have not enabled the > > kgdb. 
> > As stated above, i was not able to catch the fiq irq there. But it might > > be that i get > > > > I have debugged qemu to see if the irq is routed correctly. The depeest > > call i could find is this: bt > > #0 tcg_handle_interrupt (cpu=0x555556450790, mask=16) at > > /home/sander/speedy/soc/qemu/translate-all.c:1503 > > #1 0x0000555555755323 in cpu_interrupt (cpu=0x555556450790, mask=16) > > > > at /home/sander/speedy/soc/qemu/include/qom/cpu.h:556 > > > > #2 0x00005555557561b7 in arm_cpu_set_irq (opaque=0x555556450790, irq=1, > > level=1) > > > > at /home/sander/speedy/soc/qemu/target-arm/cpu.c:261 > > > > #3 0x00005555558193ec in qemu_set_irq (irq=0x55555642c840, level=1) at > > hw/core/irq.c:43 > > #4 0x0000555555879073 in gic_update_with_grouping (s=0x5555564dba80) at > > hw/intc/arm_gic.c:132 > > #5 0x000055555587936d in gic_update (s=0x5555564dba80) at > > hw/intc/arm_gic.c:180 > > #6 0x00005555558798a7 in gic_set_irq (opaque=0x5555564dba80, irq=49, > > level=1) at hw/intc/arm_gic.c:264 > > #7 0x00005555558193ec in qemu_set_irq (irq=0x555556432b00, level=1) at > > hw/core/irq.c:43 > > #8 0x0000555555661d4d in a9mp_priv_set_irq (opaque=0x5555564d7260, > > irq=17, level=1) > > > > at /home/sander/speedy/soc/qemu/hw/cpu/a9mpcore.c:17 > > > > #9 0x00005555558193ec in qemu_set_irq (irq=0x5555564f3c00, level=1) at > > hw/core/irq.c:43 > > #10 0x00005555558f6fed in qemu_irq_raise (irq=0x5555564f3c00) at > > /home/sander/speedy/soc/qemu/include/hw/irq.h:16 > > #11 0x00005555558f7363 in pml_timer_tick (opaque=0x555556595020) at > > hw/timer/pml.c:95 > > #12 0x000055555599be6e in aio_bh_poll (ctx=0x5555563fdad0) at async.c:82 > > #13 0x00005555559b2d9f in aio_dispatch (ctx=0x5555563fdad0) at > > aio-posix.c:137 > > #14 0x000055555599c2cb in aio_ctx_dispatch (source=0x5555563fdad0, > > callback=0x0, user_data=0x0) at async.c:221 > > #15 0x00007ffff7901e04 in g_main_context_dispatch () from > > /lib/x86_64-linux-gnu/libglib-2.0.so.0 > > #16 0x00005555559b0a79 in 
glib_pollfds_poll () at main-loop.c:200 > > #17 0x00005555559b0b7a in os_host_main_loop_wait (timeout=0) at > > main-loop.c:245 > > #18 0x00005555559b0c52 in main_loop_wait (nonblocking=1) at > > main-loop.c:494 > > #19 0x0000555555791d8b in main_loop () at vl.c:1872 > > #20 0x00005555557998d5 in main (argc=22, argv=0x7fffffffda38, > > envp=0x7fffffffdaf0) at vl.c:4348 > > > > I am not sure if arm_cpu_set_irq(opaque=0x555556450790, irq=1, level=1) > > represents a fiq > > and if mask 16 is the correct mask for the fiq request. > > Yeah this routine handles both IRQs and FIQs. I don't see anything above > that stands out as suspicious. It may be interesting to try the same test > driver on an A15 emulation if it is not too much trouble. This would rule > out the A9 workaround not being sufficient for being GICv2. Given the fact that the addresses in which the fault appears are bogus and not accessed by the fiq handler at all. I have seen that starting up a different cpu is just a matter of a command line option. So i started up my modified vexpress board (pml hw added) with cortex a15 cpu. Unfortunatly the results are pretty similar: pml: pml_timer_tick: raise_irq arm_gic: Raised pending FIQ 49 (cpu 0) Taking exception 6 [FIQ] pml: pml_write: update control flags: 1 pml: pml_update: start timer pml: pml_update: lower irq pml: pml_read: read magic pml: pml_write: update control flags: 3 pml: pml_update: start timer Taking exception 4 [Data Abort] ...with DFSR 0x5 DFAR 0xbf3d2334 //address not in Kernel space? Taking exception 3 [Prefetch Abort] ...with IFSR 0x5 IFAR 0x800120e0 //__dabt_svc Taking exception 3 [Prefetch Abort] ...with IFSR 0x5 IFAR 0x80012240 //__pabt_svc Taking exception 3 [Prefetch Abort] ...with IFSR 0x5 IFAR 0x80012240//__pabt_svc Taking exception 3 [Prefetch Abort] > > Row #6 show clearly that irq 49 configured to Group 0 is triggered. All > > other interrupt are configured to Group 1 > > from my Linux kernel. 
The call to #4 gic_update_with_grouping shows that
> > grouping within the GIC is enabled
> > and that irq is triggered as FIQ within qemu. All of this looks good as
> > far as i understand. So i am pretty confident
> > that qemu is working correctly (minus the Prefetch and Data Aborts).
>
> I agree that QEMU appears to be handling the FIQ properly and it appears
> that the CPU is trying to dispatch it. I understand that the Linux FIQ
> handling is a little trickier than IRQs, so I suspect that either something
> in the Linux kernel handling or your driver is going awry during handling
> or as a result of the FIQ.

Yes, FIQs are tricky as you need to avoid page lookup failures. These are
undesirable in a FIQ anyway. So all the memory I access is statically mapped
so that it is always available in the page table.

Best regards
Tim
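The fault status values in the "-d int" output above (DFSR 0x805, DFSR 0x5) can be decoded by hand. The following is a minimal sketch, assuming the guest kernel uses the ARMv7 short-descriptor translation table format (no LPAE); the struct and function names are made up for illustration and only the two fault encodings that appear in this thread are named.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of an ARMv7 short-descriptor DFSR value, as printed by "-d int".
 * Hypothetical helper type, not taken from QEMU or Linux. */
struct dfsr_fields {
    unsigned fs;      /* fault status: bit 10 concatenated with bits [3:0] */
    unsigned domain;  /* bits [7:4] */
    unsigned wnr;     /* bit 11: 1 = the faulting access was a write */
};

static struct dfsr_fields decode_dfsr(uint32_t dfsr)
{
    struct dfsr_fields f;

    f.fs = (((dfsr >> 10) & 1u) << 4) | (dfsr & 0xfu);
    f.domain = (dfsr >> 4) & 0xfu;
    f.wnr = (dfsr >> 11) & 1u;
    return f;
}

/* Name only the encodings seen in this thread; everything else is left
 * to the fault status table in the architecture manual. */
static const char *fs_name(unsigned fs)
{
    assert(fs < 32u);  /* FS is a 5-bit field */
    switch (fs) {
    case 0x05: return "translation fault, section (first level)";
    case 0x07: return "translation fault, page (second level)";
    default:   return "other";
    }
}
```

With these helpers, DFSR 0x805 decodes to a write that hit a first-level translation fault, and DFSR 0x5 to a read with the same fault type. Both are page table lookups that found no mapping, which is the kind of failure the static mappings are meant to avoid.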
On 13 November 2014 07:58, Tim Sander <tim@krieglstein.org> wrote: > Am Mittwoch, 12. November 2014, 10:00:03 schrieb Greg Bellows: > > On 12 November 2014 07:56, Tim Sander <tim@krieglstein.org> wrote: > > > Hi Greg > > > > > > > > Bad mode in data abort handler detected > > > > > Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP ARM > > > > > Modules linked in: firq(O) ipv6 > > > > > CPU: 0 PID: 103 Comm: systemd-udevd Tainted: G O 3.14.0 > #1 > > > > > task: bf2b9300 ti: bf362000 task.ti: bf362000 > > > > > PC is at 0xffff1240 > > > > > LR is at handle_fasteoi_irq+0x9c/0x13c > > > > > pc : [<ffff1240>] lr : [<8005cda0>] psr: 600f01d1 > > > > > sp : bf363e70 ip : 07a7e79d fp : 00000000 > > > > > r10: 76f92008 r9 : 80590080 r8 : 76e8e4d0 > > > > > r7 : f8200100 r6 : bf363fb0 r5 : bf008414 r4 : bf0083c0 > > > > > r3 : 80230d04 r2 : 0000002f r1 : 00000000 r0 : bf0083c0 > > > > > Flags: nZCv IRQs off FIQs off Mode FIQ_32 ISA ARM Segment user > > > > > > > > It looks like we are in FIQ mode and interrupts have been masked. > > > > > > Indeed. 
> > > > > > > > Control: 10c53c7d Table: 60004059 DAC: 00000015 > > > > > Process systemd-udevd (pid: 103, stack limit = 0xbf362240) > > > > > Stack: (0xbf363e70 to 0xbf364000) > > > > > 3e60: bf0083c0 00000000 > 0000002f > > > > > 80230d04 > > > > > 3e80: bf0083c0 bf008414 bf363fb0 f8200100 76e8e4d0 80590080 > 76f92008 > > > > > 00000000 > > > > > 3ea0: 07a7e79d bf363e70 8005cda0 ffff1240 600f01d1 ffffffff > 8005cd04 > > > > > 0000002f > > > > > 3ec0: 0000002f 800598bc 8058cc70 8000ed00 f820010c 8059684c > bf363ef8 > > > > > 80008528 > > > > > 3ee0: 80023730 80023744 200f0113 ffffffff bf363f2c 80012180 > 00000000 > > > > > 805baa00 > > > > > 3f00: 00000000 00000100 00000002 00000022 00000000 bf362000 > 76e8e4d0 > > > > > 80590080 > > > > > 3f20: 76f92008 00000000 0000000a bf363f40 80023730 80023744 > 200f0113 > > > > > ffffffff > > > > > 3f40: bf007a14 8059ac00 00000000 0000000a ffff8dd7 00400140 > bf0079c0 > > > > > 8058cc70 > > > > > 3f60: 00000022 00000000 f8200100 76e8e4d0 76f9201c 76f92008 > 00000000 > > > > > 80023af0 > > > > > 3f80: 8058cc70 8000ed04 f820010c 8059684c bf363fb0 80008528 > 00000000 > > > > > 76dd3b44 > > > > > 3fa0: 600f0010 ffffffff 0000000c 8001233c 00000000 00000000 > 76f93428 > > > > > 76f93428 > > > > > 3fc0: 76f93438 00000000 76f93448 0000000c 76e8e4d0 76f9201c > 76f92008 > > > > > 00000000 > > > > > 3fe0: 00000000 7ec115c0 76f60914 76dd3b44 600f0010 ffffffff > 9fffd821 > > > > > 9fffdc21 > > > > > [<8005cda0>] (handle_fasteoi_irq) from [<80230d04>] > > > > > > (gic_eoi_irq+0x0/0x4c) > > > > > > > It certainly looks like we are going down the standard IRQ patch as > you > > > > suggested. I'm not a Linux driver guy, but do you see any kind of > > > > > > activity > > > > > > > (break points, printfs, ...) through your FIQ handler? > > > > > > I am reaching 0xffff1224 which i believe is the fiq vector address on > the > > > vexpress? > > > > Hmmm.... not sure. 
As you mentioned previously (and as seen in the above
> > register dump), I would expect offset 0x1240 (pc=0xffff1240) for an FIQ.
> > I'm not sure what is at offset 0x1224, but on my Linux kernel it appears
> > that offset 0x1220 is vector_addrexcptn (not pabort), that happens to
> > occupy the HYP trap vector.
> Zounds! You're right, I think this was a typo in my debug script, which I
> didn't notice. But I am even reaching 0x1240, but not 0x1244, which means

I wouldn't expect it to reach 0x1244 as that is the word after what I believe
should be a branch at 0x1240 to the FIQ handler. This would mean we are not
overrunning the vector table though.

> it aborts on the first fiq instructions. Here is the "-d int" output directly
> after the FIQ hits:
> Taking exception 3 [Prefetch Abort]
> ...with IFSR 0x5 IFAR 0x800c8dcc //kmem_cache_alloc
> Taking exception 3 [Prefetch Abort]
> ...with IFSR 0x5 IFAR 0x8001be00 //v7_pabort
> Taking exception 3 [Prefetch Abort]
> and then it continues to fail on v7_pabort repeatedly. This shows that there is
> something fishy going on. It is failing on the presumed handler for the
> prefetch abort? But as I see earlier prefetch aborts being resolved I can
> conclude that it works up to the point where the CPU is in FIQ mode.
> FIQ is special in a way that statically mapped memory is needed to avoid a page
> lookup as this fails under linux in fiq mode. But 0x800c8dcc (kmem_cache_alloc)
> is not called in the FIQ handler which obviously can't use any Linux
> infrastructure. And as I do not reach the breakpoint 0xffff1244 these misses
> happen on the execution of the first address of the FIQ handler.

Can we check the vector table to see if the FIQ entry is as expected? It
appears that the pabort may be in the right place, but it would be good to see
if the FIQ entry is correct (branching to right place). I'd expect that we
should be branching to __fiq_svc?
Maybe setting a breakpoint in the first level handler may be useful? > > > > > > [<80230d04>] (gic_eoi_irq) from [<f8200100>] (0xf8200100) > > > > > Code: ee02af10 f57ff06f e59d8000 e59d9004 (e599b00c) > > > > > ---[ end trace 3dc3571209a017e1 ]--- > > > > > Kernel panic - not syncing: Fatal exception in interrupt > > > > > > > > It is hard to determine entirely what is happening here based on this > > > > info. I do have code of my own that routes KGDB interrupts as FIQs > and > > > > with the workaround I see the FIQs handled as expected. Some things > we > > > > > > can > > > > > > > try to get more info in hopes of pinpointing where to look: > > > > 1. At the top of hw/intc/arm_gic.c there is the following > commented > > > > > > out > > > > > > > line: > > > > //#define DEBUG_GIC > > > > > > > > Uncomment the line, rebuild and rerun. This will give us some > trace > > > > > > on > > > > > > > what is going through the GIC code. > > > > > > I have commented out some debug lines but i see: > > > Breakpoint 1, gic_update_with_grouping (s=0x5555564dba80) at > > > hw/intc/arm_gic.c:120 > > > 120 DPRINTF("Raised pending FIQ %d (cpu %d)\n", > > > best_irq, cpu); > > > > > > With the expected irq nr. 49 (32+17). > > > > > > > 2. Run qemu with the "-d int" option which will print a message on > > > > > > each > > > > > > > interrupt. We should see an FIQ at some point if they are > occurring. > > > > > > The > > > > > > > only issue is that there will be numerous IRQs, so you'll have to > parse > > > > through them to find an "exception 6 [FIQ]. > > > > > > Here is the relevant output when the FIQ hits: > > > Taking exception 2 [SVC] > > > Taking exception 2 [SVC] > > > pml: pml_timer_tick: raise_irq > > > arm_gic: Raised pending FIQ 49 (cpu 0) > > > Taking exception 6 [FIQ] > > > > This looks to me like the GIC has caught the interrupt and communicated > it > > to the CPU causing it to take the FIQ exception. 
> > > > > pml: pml_write: update control flags: 1 > > > pml: pml_update: start timer > > > pml: pml_update: lower irq > > > pml: pml_read: read magic > > > pml: pml_write: update control flags: 3 > > > pml: pml_update: start timer > > > > Is pml your test driver? It looks like it initiates the interrupt and > > possibly performs some handling following it? > Yes, its just a simple set of some registers to control an interrupt. > There is > i added debug output to this driver to see if and when the FIQ is accessing > the registers. But i see no accesses from FIQ mode. > > > > Taking exception 3 [Prefetch Abort] > > > ...with IFSR 0x5 IFAR 0x80221d70 > > > Taking exception 4 [Data Abort] > > > ...with DFSR 0x805 DFAR 0x805c604c > > > Taking exception 4 [Data Abort] > > > ...with DFSR 0x805 DFAR 0x805c604c > > > Taking exception 4 [Data Abort] > > > > > > So the fiq is hitting but unfortunatly i have no idea where the data > > > aborts are coming from. > > > > The data aborts are likely a side effect of the prefetch abort taken > before > > them; it is the interesting one. > Still as above the address is odd. In FIQ mode it should not jump to this > address at all !?! This is definetly Linux memory space and i am not > calling > anything linux related from FIQ. > I'm a bit confused as it appears the exception pattern has changed. Previously, we were seeing pabt, dabt, dabt, ..., but then up above the output is pabt, pabt, pabt, ... . So, either we are jumping somewhere random thus breaking repeatability or something else changed? This is also reflected in the A15 output below, but its different. > > > > I have shifted all other Irqs besides 49 to group 1 so that only irq > 49 is > > > a FIQ. > > > Might it be that i am seeing some secure violations... > > > The address of the IFAR __idr_pre_get which lives in the linux kernel > in > > > lib/idr.c seems to > > > be implementing ann integer ID management. > > > > > > > 3. 
If you set a breakpoint in your driver, is it possible to see > that > > > > FIQs are on from the kernel debugger. Clearly you have to try > this > > > > > > from > > > > > > > a path where interrupts are masked. I see the following on my system > > > > > > > > mentioned above: > > > > ... > > > > Flags: nZCv IRQs off FIQs on Mode SVC_32 ISA ARM Segment > kernel > > > > ... > > > > > > So you mean by debugging via the qemu debug port? I have not enabled > the > > > kgdb. > > > As stated above, i was not able to catch the fiq irq there. But it > might > > > be that i get > > > > > > I have debugged qemu to see if the irq is routed correctly. The depeest > > > call i could find is this: bt > > > #0 tcg_handle_interrupt (cpu=0x555556450790, mask=16) at > > > /home/sander/speedy/soc/qemu/translate-all.c:1503 > > > #1 0x0000555555755323 in cpu_interrupt (cpu=0x555556450790, mask=16) > > > > > > at /home/sander/speedy/soc/qemu/include/qom/cpu.h:556 > > > > > > #2 0x00005555557561b7 in arm_cpu_set_irq (opaque=0x555556450790, > irq=1, > > > level=1) > > > > > > at /home/sander/speedy/soc/qemu/target-arm/cpu.c:261 > > > > > > #3 0x00005555558193ec in qemu_set_irq (irq=0x55555642c840, level=1) at > > > hw/core/irq.c:43 > > > #4 0x0000555555879073 in gic_update_with_grouping (s=0x5555564dba80) > at > > > hw/intc/arm_gic.c:132 > > > #5 0x000055555587936d in gic_update (s=0x5555564dba80) at > > > hw/intc/arm_gic.c:180 > > > #6 0x00005555558798a7 in gic_set_irq (opaque=0x5555564dba80, irq=49, > > > level=1) at hw/intc/arm_gic.c:264 > > > #7 0x00005555558193ec in qemu_set_irq (irq=0x555556432b00, level=1) at > > > hw/core/irq.c:43 > > > #8 0x0000555555661d4d in a9mp_priv_set_irq (opaque=0x5555564d7260, > > > irq=17, level=1) > > > > > > at /home/sander/speedy/soc/qemu/hw/cpu/a9mpcore.c:17 > > > > > > #9 0x00005555558193ec in qemu_set_irq (irq=0x5555564f3c00, level=1) at > > > hw/core/irq.c:43 > > > #10 0x00005555558f6fed in qemu_irq_raise (irq=0x5555564f3c00) at > > > 
/home/sander/speedy/soc/qemu/include/hw/irq.h:16 > > > #11 0x00005555558f7363 in pml_timer_tick (opaque=0x555556595020) at > > > hw/timer/pml.c:95 > > > #12 0x000055555599be6e in aio_bh_poll (ctx=0x5555563fdad0) at > async.c:82 > > > #13 0x00005555559b2d9f in aio_dispatch (ctx=0x5555563fdad0) at > > > aio-posix.c:137 > > > #14 0x000055555599c2cb in aio_ctx_dispatch (source=0x5555563fdad0, > > > callback=0x0, user_data=0x0) at async.c:221 > > > #15 0x00007ffff7901e04 in g_main_context_dispatch () from > > > /lib/x86_64-linux-gnu/libglib-2.0.so.0 > > > #16 0x00005555559b0a79 in glib_pollfds_poll () at main-loop.c:200 > > > #17 0x00005555559b0b7a in os_host_main_loop_wait (timeout=0) at > > > main-loop.c:245 > > > #18 0x00005555559b0c52 in main_loop_wait (nonblocking=1) at > > > main-loop.c:494 > > > #19 0x0000555555791d8b in main_loop () at vl.c:1872 > > > #20 0x00005555557998d5 in main (argc=22, argv=0x7fffffffda38, > > > envp=0x7fffffffdaf0) at vl.c:4348 > > > > > > I am not sure if arm_cpu_set_irq(opaque=0x555556450790, irq=1, level=1) > > > represents a fiq > > > and if mask 16 is the correct mask for the fiq request. > > > > Yeah this routine handles both IRQs and FIQs. I don't see anything above > > that stands out as suspicious. It may be interesting to try the same > test > > driver on an A15 emulation if it is not too much trouble. This would > rule > > out the A9 workaround not being sufficient for being GICv2. > Given the fact that the addresses in which the fault appears are bogus and > not > accessed by the fiq handler at all. I have seen that starting up a > different cpu > is just a matter of a command line option. So i started up my modified > vexpress > board (pml hw added) with cortex a15 cpu. 
Unfortunately the results are pretty similar:
> pml: pml_timer_tick: raise_irq
> arm_gic: Raised pending FIQ 49 (cpu 0)
> Taking exception 6 [FIQ]
> pml: pml_write: update control flags: 1
> pml: pml_update: start timer
> pml: pml_update: lower irq
> pml: pml_read: read magic
> pml: pml_write: update control flags: 3
> pml: pml_update: start timer
> Taking exception 4 [Data Abort]
> ...with DFSR 0x5 DFAR 0xbf3d2334 //address not in Kernel space?
> Taking exception 3 [Prefetch Abort]
> ...with IFSR 0x5 IFAR 0x800120e0 //__dabt_svc
> Taking exception 3 [Prefetch Abort]
> ...with IFSR 0x5 IFAR 0x80012240 //__pabt_svc
> Taking exception 3 [Prefetch Abort]
> ...with IFSR 0x5 IFAR 0x80012240 //__pabt_svc
> Taking exception 3 [Prefetch Abort]

Good data point. Interesting that we take a data abort first rather than a
prefetch abort.

> > > Row #6 shows clearly that irq 49 configured to Group 0 is triggered. All
> > > other interrupts are configured to Group 1
> > > from my Linux kernel. The call to #4 gic_update_with_grouping shows that
> > > grouping within the GIC is enabled
> > > and that irq is triggered as FIQ within qemu. All of this looks good as
> > > far as i understand. So i am pretty confident
> > > that qemu is working correctly (minus the Prefetch and Data Aborts).
> >
> > I agree that QEMU appears to be handling the FIQ properly and it appears
> > that the CPU is trying to dispatch it. I understand that the Linux FIQ
> > handling is a little trickier than IRQs, so I suspect that either something
> > in the Linux kernel handling or your driver is going awry during handling
> > or as a result of the FIQ.
> Yes, FIQs are tricky as you need to avoid page lookup failures. These are
> undesirable in a FIQ anyway. So all the memory I access is statically mapped
> so that it is always available in the page table.
>
> Best regards
> Tim
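The "Flags: ... IRQs off FIQs off Mode FIQ_32" line quoted earlier in this message is derived from the psr value in the register dump (600f01d1). As a cross-check, here is a small sketch that decodes the same bits by hand. The macro and function names are invented for this example; the bit positions (mode in bits [4:0], T in bit 5, F in bit 6, I in bit 7) are the architectural ARMv7 CPSR layout.

```c
#include <assert.h>
#include <stdint.h>

#define PSR_MODE_MASK 0x1fu
#define PSR_T_BIT     (1u << 5)  /* Thumb state */
#define PSR_F_BIT     (1u << 6)  /* 1 = FIQs masked */
#define PSR_I_BIT     (1u << 7)  /* 1 = IRQs masked */

/* Extract the 5-bit mode field. */
static unsigned psr_mode(uint32_t psr)
{
    unsigned mode = psr & PSR_MODE_MASK;

    assert(mode & 0x10u);  /* every 32-bit ARMv7 mode encoding has M[4] set */
    return mode;
}

/* Name the modes that can show up in the dumps of this thread. */
static const char *psr_mode_name(uint32_t psr)
{
    switch (psr_mode(psr)) {
    case 0x10: return "USR";
    case 0x11: return "FIQ";
    case 0x12: return "IRQ";
    case 0x13: return "SVC";
    case 0x17: return "ABT";
    case 0x1b: return "UND";
    case 0x1f: return "SYS";
    default:   return "other";
    }
}

static int psr_irqs_masked(uint32_t psr) { return (psr & PSR_I_BIT) != 0; }
static int psr_fiqs_masked(uint32_t psr) { return (psr & PSR_F_BIT) != 0; }
static int psr_thumb(uint32_t psr)       { return (psr & PSR_T_BIT) != 0; }
```

For 0x600f01d1 this gives mode FIQ with both IRQs and FIQs masked and ARM state, matching the Flags line. The value 0x200f0113 that appears further down the stack dump decodes to SVC mode with both interrupt types unmasked.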
Am Donnerstag, 13. November 2014, 09:09:33 schrieb Greg Bellows: > On 13 November 2014 07:58, Tim Sander <tim@krieglstein.org> wrote: > > Am Mittwoch, 12. November 2014, 10:00:03 schrieb Greg Bellows: > > > On 12 November 2014 07:56, Tim Sander <tim@krieglstein.org> wrote: > > > > Hi Greg > > > > > > > > > > Bad mode in data abort handler detected > > > > > > Internal error: Oops - bad mode: 0 [#1] PREEMPT SMP ARM > > > > > > Modules linked in: firq(O) ipv6 > > > > > > CPU: 0 PID: 103 Comm: systemd-udevd Tainted: G O 3.14.0 > > > > #1 > > > > > > > > task: bf2b9300 ti: bf362000 task.ti: bf362000 > > > > > > PC is at 0xffff1240 > > > > > > LR is at handle_fasteoi_irq+0x9c/0x13c > > > > > > pc : [<ffff1240>] lr : [<8005cda0>] psr: 600f01d1 > > > > > > sp : bf363e70 ip : 07a7e79d fp : 00000000 > > > > > > r10: 76f92008 r9 : 80590080 r8 : 76e8e4d0 > > > > > > r7 : f8200100 r6 : bf363fb0 r5 : bf008414 r4 : bf0083c0 > > > > > > r3 : 80230d04 r2 : 0000002f r1 : 00000000 r0 : bf0083c0 > > > > > > Flags: nZCv IRQs off FIQs off Mode FIQ_32 ISA ARM Segment > > > > > > user > > > > > > > > > > It looks like we are in FIQ mode and interrupts have been masked. > > > > > > > > Indeed. 
> > > > > > > > > > Control: 10c53c7d Table: 60004059 DAC: 00000015 > > > > > > Process systemd-udevd (pid: 103, stack limit = 0xbf362240) > > > > > > Stack: (0xbf363e70 to 0xbf364000) > > > > > > 3e60: bf0083c0 00000000 > > > > 0000002f > > > > > > > > 80230d04 > > > > > > 3e80: bf0083c0 bf008414 bf363fb0 f8200100 76e8e4d0 80590080 > > > > 76f92008 > > > > > > > > 00000000 > > > > > > 3ea0: 07a7e79d bf363e70 8005cda0 ffff1240 600f01d1 ffffffff > > > > 8005cd04 > > > > > > > > 0000002f > > > > > > 3ec0: 0000002f 800598bc 8058cc70 8000ed00 f820010c 8059684c > > > > bf363ef8 > > > > > > > > 80008528 > > > > > > 3ee0: 80023730 80023744 200f0113 ffffffff bf363f2c 80012180 > > > > 00000000 > > > > > > > > 805baa00 > > > > > > 3f00: 00000000 00000100 00000002 00000022 00000000 bf362000 > > > > 76e8e4d0 > > > > > > > > 80590080 > > > > > > 3f20: 76f92008 00000000 0000000a bf363f40 80023730 80023744 > > > > 200f0113 > > > > > > > > ffffffff > > > > > > 3f40: bf007a14 8059ac00 00000000 0000000a ffff8dd7 00400140 > > > > bf0079c0 > > > > > > > > 8058cc70 > > > > > > 3f60: 00000022 00000000 f8200100 76e8e4d0 76f9201c 76f92008 > > > > 00000000 > > > > > > > > 80023af0 > > > > > > 3f80: 8058cc70 8000ed04 f820010c 8059684c bf363fb0 80008528 > > > > 00000000 > > > > > > > > 76dd3b44 > > > > > > 3fa0: 600f0010 ffffffff 0000000c 8001233c 00000000 00000000 > > > > 76f93428 > > > > > > > > 76f93428 > > > > > > 3fc0: 76f93438 00000000 76f93448 0000000c 76e8e4d0 76f9201c > > > > 76f92008 > > > > > > > > 00000000 > > > > > > 3fe0: 00000000 7ec115c0 76f60914 76dd3b44 600f0010 ffffffff > > > > 9fffd821 > > > > > > > > 9fffdc21 > > > > > > [<8005cda0>] (handle_fasteoi_irq) from [<80230d04>] > > > > > > > > (gic_eoi_irq+0x0/0x4c) > > > > > > > > > It certainly looks like we are going down the standard IRQ patch as > > > > you > > > > > > > suggested. I'm not a Linux driver guy, but do you see any kind of > > > > > > > > activity > > > > > > > > > (break points, printfs, ...) 
through your FIQ handler? > > > > > > > > I am reaching 0xffff1224 which i believe is the fiq vector address on > > > > the > > > > > > vexpress? > > > > > > Hmmm.... not sure. As you mentioned previously (and as seen in the > > > above > > > register dump), I would expect offset 0x1240 (pc=0xffff1240) for an FIQ. > > > I'm not sure what is at offset 0x1224, but on my Linux kernel it appears > > > that offset 0x1220 is vector_addrexcptn (not pabort), that happens to > > > occupy the HYP trap vector. > > > > Zounds! You're right, i think this was a typo in my debug script. Which i > > didn't notice. But i am even reaching 0x1240 before but not 0x1244 which > > means > > I wouldn't expect it to reach 0x1244 as that is the word after what I > believe should be a branch at 0x1240 to the FIQ handler. This would mean > we are not overrunning the vector table though. There is reason that the FIQ entry is the last in the Vector table. Its allowed to put code directly at the interrupt vector table which saves one jump. Thats what Linux assumes when using arch/arm/kernel/fiq.c set_fiq_handler does. It takes some binary blob and copies it just at the FIQ handler space. > > it aborts on the first fiq instructions. Here is the "-d int" output > > directly > > after the FIQ hits: > > Taking exception 3 [Prefetch Abort] > > ...with IFSR 0x5 IFAR 0x800c8dcc //kmem_cache_alloc > > Taking exception 3 [Prefetch Abort] > > ...with IFSR 0x5 IFAR 0x8001be00 //v7_pabort > > Taking exception 3 [Prefetch Abort] > > and then it continue to fail on v7_pabort repeatedly. This shows that > > there is > > something fishy going on. It is failing on the presumed handler for the > > prefetch abort? But as i see earlier resolved prefetched abort errors i > > can > > conclude that it works up to the point where the CPU is in FIQ mode. > > FIQ is special in a way that static mapped memory is needed to avoid a > > page > > lookup as this fails under linux in fiq mode. 
But 0x800c8dcc (kmem_cache_alloc)
> > is not called in the FIQ handler which obviously can't use any Linux
> > infrastructure. And as i do not reach the breakpoint 0xffff1244 these misses
> > happen on the execution of the first address of the FIQ handler.
>
> Can we check the vector table to see if the FIQ entry is as expected? It
> appears that the pabort may be in the right place, but it would be good to
> see if the FIQ entry is correct (branching to right place). I'd expect
> that we should be branching to __fiq_svc? Maybe setting a breakpoint in
> the first level handler may be useful?

Ok, I was also digging into this. I thought I had checked this already, but
alas it seems the values are not the ones I put into set_fiq_handler. So I dug
a little deeper and tried to find out the values of vbar, mvbar and hvbar.
This is the gcc inline assembly syntax from my kernel module written in C:

asm("mrc p15, 0, %0, c12, c0, 0" : "=r"(vbar) : : "cc");
asm("mrc p15, 0, %0, c12, c0, 1" : "=r"(mvbar) : : "cc"); <- not implemented?
asm("mrc p15, 4, %0, c12, c0, 0" : "=r"(hvbar) : : "cc"); <- not implemented?

It seems as if neither mvbar nor hvbar are implemented and that vbar returns
zero !? I also have a problem with the addresses: the fiq handler lies at
0xffff1240 but the vectors_page in Linux points to 0xbfffe000? You were
talking about the fact that the security extensions were not implemented. I
was not aware that the different vbars were already part of the security
stuff?

> > > > > > [<80230d04>] (gic_eoi_irq) from [<f8200100>] (0xf8200100)
> > > > > > Code: ee02af10 f57ff06f e59d8000 e59d9004 (e599b00c)
> > > > > > ---[ end trace 3dc3571209a017e1 ]---
> > > > > > Kernel panic - not syncing: Fatal exception in interrupt
> > > > >
> > > > > It is hard to determine entirely what is happening here based on this
> > > > > info.
I do have code of my own that routes KGDB interrupts as FIQs > > > > and > > > > > > > with the workaround I see the FIQs handled as expected. Some things > > > > we > > > > > > can > > > > > > > > > try to get more info in hopes of pinpointing where to look: > > > > > 1. At the top of hw/intc/arm_gic.c there is the following > > > > commented > > > > > > out > > > > > > > > > line: > > > > > //#define DEBUG_GIC > > > > > > > > > > Uncomment the line, rebuild and rerun. This will give us some > > > > trace > > > > > > on > > > > > > > > > what is going through the GIC code. > > > > > > > > I have commented out some debug lines but i see: > > > > Breakpoint 1, gic_update_with_grouping (s=0x5555564dba80) at > > > > hw/intc/arm_gic.c:120 > > > > 120 DPRINTF("Raised pending FIQ %d (cpu > > > > %d)\n", > > > > best_irq, cpu); > > > > > > > > With the expected irq nr. 49 (32+17). > > > > > > > > > 2. Run qemu with the "-d int" option which will print a message > > > > > on > > > > > > > > each > > > > > > > > > interrupt. We should see an FIQ at some point if they are > > > > occurring. > > > > > > The > > > > > > > > > only issue is that there will be numerous IRQs, so you'll have to > > > > parse > > > > > > > through them to find an "exception 6 [FIQ]. > > > > > > > > Here is the relevant output when the FIQ hits: > > > > Taking exception 2 [SVC] > > > > Taking exception 2 [SVC] > > > > pml: pml_timer_tick: raise_irq > > > > arm_gic: Raised pending FIQ 49 (cpu 0) > > > > Taking exception 6 [FIQ] > > > > > > This looks to me like the GIC has caught the interrupt and communicated > > > > it > > > > > to the CPU causing it to take the FIQ exception. > > > > > > > pml: pml_write: update control flags: 1 > > > > pml: pml_update: start timer > > > > pml: pml_update: lower irq > > > > pml: pml_read: read magic > > > > pml: pml_write: update control flags: 3 > > > > pml: pml_update: start timer > > > > > > Is pml your test driver? 
It looks like it initiates the interrupt and > > > possibly performs some handling following it? > > > > Yes, its just a simple set of some registers to control an interrupt. > > There is > > i added debug output to this driver to see if and when the FIQ is > > accessing > > the registers. But i see no accesses from FIQ mode. > > > > > > Taking exception 3 [Prefetch Abort] > > > > ...with IFSR 0x5 IFAR 0x80221d70 > > > > Taking exception 4 [Data Abort] > > > > ...with DFSR 0x805 DFAR 0x805c604c > > > > Taking exception 4 [Data Abort] > > > > ...with DFSR 0x805 DFAR 0x805c604c > > > > Taking exception 4 [Data Abort] > > > > > > > > So the fiq is hitting but unfortunatly i have no idea where the data > > > > aborts are coming from. > > > > > > The data aborts are likely a side effect of the prefetch abort taken > > > > before > > > > > them; it is the interesting one. > > > > Still as above the address is odd. In FIQ mode it should not jump to this > > address at all !?! This is definetly Linux memory space and i am not > > calling > > anything linux related from FIQ. > > I'm a bit confused as it appears the exception pattern has changed. > Previously, we were seeing pabt, dabt, dabt, ..., but then up above the > output is pabt, pabt, pabt, ... . So, either we are jumping somewhere > random thus breaking repeatability or something else changed? This is also > reflected in the A15 output below, but its different. 
Mh, it seems that the first Prefetch Abort IFAR is random and IFSR is also differnt: First run: Taking exception 6 [FIQ] pml: pml_write: update control flags: 1 pml: pml_update: start timer pml: pml_update: lower irq pml: pml_read: read magic pml: pml_write: update control flags: 3 pml: pml_update: start timer Taking exception 3 [Prefetch Abort] ...with IFSR 0x17 IFAR 0x76f6a5e4 Taking exception 4 [Data Abort] ...with DFSR 0x17 DFAR 0x76fcc25c Taking exception 3 [Prefetch Abort] ...with IFSR 0x17 IFAR 0x76e7f884 Taking exception 3 [Prefetch Abort] ...with IFSR 0x17 IFAR 0x76e7e61c Taking exception 4 [Data Abort] ...with DFSR 0x17 DFAR 0x76ee5b40 Taking exception 4 [Data Abort] ...with DFSR 0x17 DFAR 0x76fce060 Taking exception 4 [Data Abort] ...with DFSR 0x17 DFAR 0x76fd245c Taking exception 4 [Data Abort] ...with DFSR 0x17 DFAR 0x76fcdf88 Second run: pml: pml_timer_tick: raise_irq arm_gic: Raised pending FIQ 49 (cpu 0) Taking exception 6 [FIQ] pml: pml_write: update control flags: 1 pml: pml_update: start timer pml: pml_update: lower irq pml: pml_read: read magic pml: pml_write: update control flags: 3 pml: pml_update: start timer Taking exception 2 [SVC] Taking exception 2 [SVC] Taking exception 1 [Undefined Instruction] Taking exception 2 [SVC] Taking exception 2 [SVC] Taking exception 1 [Undefined Instruction] Taking exception 2 [SVC] Taking exception 2 [SVC] Taking exception 2 [SVC] Taking exception 2 [SVC] Third run: arm_gic: Raised pending FIQ 49 (cpu 0) Taking exception 6 [FIQ] pml: pml_write: update control flags: 1 pml: pml_update: start timer pml: pml_update: lower irq pml: pml_read: read magic pml: pml_write: update control flags: 3 pml: pml_update: start timer Taking exception 2 [SVC] Taking exception 3 [Prefetch Abort] ...with IFSR 0x17 IFAR 0x76ea8000 Taking exception 2 [SVC] Taking exception 1 [Undefined Instruction] Taking exception 2 [SVC] Taking exception 2 [SVC] Taking exception 2 [SVC] Taking exception 2 [SVC] Taking exception 4 [Data Abort] 
...with DFSR 0x817 DFAR 0x76d20000
Taking exception 2 [SVC]
Taking exception 2 [SVC]

But then again, I think that the kernel and qemu disagree about the position
of the vector table. So it seems it is just accessing random memory in the
interrupt, which leads to different results each time.

So I will just outline what I do in the kernel to get a FIQ from the GIC:
* I create some static mappings to make sure I get no page fault in FIQ mode
* then I reprogram the GIC so that all IRQs are mapped to group 1
* except for one special IRQ, which is programmed to group 0
* FIQ mode is enabled for group 0

It seems as if the GIC implementation in qemu knows this dance, as I am seeing
the FIQ happening. But then I have my doubts (due to the missing vbars and the
different addresses, kernel vs. qemu) that the CPU is up to the task in qemu?

The code I am using has been ported from an Altera SoC Cortex-A9 and works
there. While some addresses were hardcoded, I *think* that I have meanwhile
found all the wrong addresses in the static mappings which bit me earlier. But
now I have the impression that I have to dig on the qemu side again.

Best regards
Tim
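The GIC steps outlined above come down to a few register computations. The sketch below is not the driver code from this thread; it only illustrates, with made-up helper names, how interrupt 49 (32 + 17) maps onto the GICv2 distributor group registers (GICD_IGROUPRn starting at offset 0x080, one bit per interrupt, 1 = Group 1) and which CPU interface control bit routes Group 0 to FIQ.

```c
#include <assert.h>
#include <stdint.h>

#define GIC_SPI_BASE       32u        /* ID of the first shared peripheral interrupt */
#define GICD_IGROUPR_BASE  0x080u     /* offset of GICD_IGROUPR0 in the distributor */
#define GICC_CTLR_FIQEN    (1u << 3)  /* GICC_CTLR: signal Group 0 interrupts as FIQ */

/* Interrupt ID as seen by the GIC for a given SPI number (17 -> 49). */
static unsigned spi_to_irq(unsigned spi)
{
    return GIC_SPI_BASE + spi;
}

/* Distributor offset of the group register word that holds 'irq'. */
static uint32_t igroupr_offset(unsigned irq)
{
    assert(irq < 1020u);  /* IDs 1020 and above are special in GICv2 */
    return GICD_IGROUPR_BASE + 4u * (irq / 32u);
}

/* Bit inside that word that selects the group of 'irq'. */
static uint32_t igroupr_bit(unsigned irq)
{
    return 1u << (irq % 32u);
}

/* Value to write to group register word 'word' so that every interrupt
 * is Group 1 except 'fiq_irq', which stays in Group 0. */
static uint32_t igroupr_value(unsigned word, unsigned fiq_irq)
{
    uint32_t value = 0xffffffffu;

    if (fiq_irq / 32u == word)
        value &= ~igroupr_bit(fiq_irq);
    return value;
}
```

For interrupt 49 this yields the register at distributor offset 0x084, bit 17, and a word value of 0xfffdffff, while all other group register words are written as 0xffffffff.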
On 13 November 2014 16:26, Tim Sander <tim@krieglstein.org> wrote:
> This is the gcc inline assembly syntax from my kernel module written in c:
> asm("mrc p15, 0, %0, c12, c0, 0" : "=r"(vbar) : : "cc");
> asm("mrc p15, 0, %0, c12, c0, 1" : "=r"(mvbar) : : "cc"); <- not implemented?
> asm("mrc p15, 4, %0, c12, c0, 0" : "=r"(hvbar) : : "cc"); <- not implemented?
>
> It seems as if neither mvbar nor hvbar are implemented and that vbar returns
> zero !? I also have a problem with the addresses:
> The fiq handler lies at 0xffff1240 but the vectors_page in Linux points to
> 0xbfffe000? You where talking about the fact that the security extensions
> where not implemented. I was not aware that the different vbar's where
> already part of the security stuff?

MVBAR is part of the Security extensions. HVBAR is part of the
Virtualization extensions. In mainline QEMU we implement neither
of those extensions, and so don't implement the associated
registers. (Strictly speaking, VBAR is also only in the
Security extensions, but we provide it as a workaround for
guests that assume our CPUs should implement it.)

-- PMM
On 13 November 2014 10:46, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 13 November 2014 16:26, Tim Sander <tim@krieglstein.org> wrote:
> > This is the gcc inline assembly syntax from my kernel module written in c:
> > asm("mrc p15, 0, %0, c12, c0, 0" : "=r"(vbar) : : "cc");
> > asm("mrc p15, 0, %0, c12, c0, 1" : "=r"(mvbar) : : "cc"); <- not implemented?
> > asm("mrc p15, 4, %0, c12, c0, 0" : "=r"(hvbar) : : "cc"); <- not implemented?
> >
> > It seems as if neither mvbar nor hvbar are implemented and that vbar returns
> > zero !? I also have a problem with the addresses:
> > The fiq handler lies at 0xffff1240 but the vectors_page in Linux points to
> > 0xbfffe000? You where talking about the fact that the security extensions
> > where not implemented. I was not aware that the different vbar's where
> > already part of the security stuff?
>
> MVBAR is part of the Security extensions. HVBAR is part of the
> Virtualization extensions. In mainline QEMU we implement neither
> of those extensions, and so don't implement the associated
> registers. (Strictly speaking, VBAR is also only in the
> Security extensions, but we provide it as a workaround for
> guests that assume our CPUs should implement it.)

Peter beat me to it. None of the VBAR registers should matter in your case
which coincides with the use of hivecs.

It may be worthwhile to put a kernel breakpoint in handle_fiq_as_nmi() just
to see where it goes. If CONFIG_ARM_GIC is enabled it should take you to your
handler I suspect. Plus, if you get there then we have likely proven that
QEMU is getting the kernel to the right place. I set a BP in this routine on
my A9 run and appear to be hitting it correctly.

Let me know how I can help in debugging the QEMU side of things.

> -- PMM
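As a small illustration of the hivecs remark above: without the Security extensions, the exception vector base is selected only by SCTLR.V (bit 13), and the FIQ entry is the last of the eight vectors. The sketch below computes the vector addresses from an SCTLR value. The function names are made up for this example, and the sample value 0x10c53c7d is the "Control:" word from the oops dump earlier in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define SCTLR_V_BIT    (1u << 13)  /* SCTLR.V: 1 selects high vectors */
#define VECTOR_IDX_FIQ 7u          /* FIQ is the last of the 8 entries */

/* Vector base for a CPU without VBAR: 0x00000000 or 0xffff0000. */
uint32_t exception_vector_base(uint32_t sctlr)
{
    return (sctlr & SCTLR_V_BIT) ? 0xffff0000u : 0x00000000u;
}

/* Address of one 4-byte vector table entry (index 0..7). */
uint32_t exception_vector_address(uint32_t sctlr, unsigned index)
{
    assert(index < 8u);
    return exception_vector_base(sctlr) + 4u * index;
}
```

With the Control value from the oops, `exception_vector_address(0x10c53c7d, VECTOR_IDX_FIQ)` evaluates to 0xffff001c, which is where the FIQ vector is expected when hivecs are in use.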
> > > 0xbfffe000? You where talking about the fact that the security
> > > extensions where not implemented. I was not aware that the different
> > > vbar's where already part of the security stuff?
> >
> > MVBAR is part of the Security extensions. HVBAR is part of the
> > Virtualization extensions. In mainline QEMU we implement neither
> > of those extensions, and so don't implement the associated
> > registers. (Strictly speaking, VBAR is also only in the
> > Security extensions, but we provide it as a workaround for
> > guests that assume our CPUs should implement it.)
>
> Peter beat me to it. None of the VBAR registers should matter in your case
> which coincides with the use of hivecs.

While writing this mail I found out that the integrated debugger is causing
harm in combination with the FIQ. So everything inside the braces below seems
to be related to this problem. But I still wanted to keep the data points for
reference:

{
OK, so qemu only implements the SCTLR.V bit to control the memory address of
the interrupt vectors. So it's either 0 or 0xffff0000. That is fine with me.
Currently I have the problem that a call to set_fiq_handler does not place
the binary code I load at the address qemu is jumping to, which is presumably
0xffff1240. I have checked that SCTLR.V = 1 under Linux, which is fine.

The background to set_fiq_handler, as I understand it, is that it copies the
given code directly to the address where the FIQ vector is located. This
works because the FIQ is the last entry, and thus there is some memory space
for a short interrupt handler. I checked the memory when entering the FIQ
with the integrated gdb:
(gdb) info reg
r0    0x0          0
r1    0x0          0
r2    0x1          1
r3    0x76eb34c8   1995125960
r4    0x76eb34c8   1995125960
r5    0x76f633b8   1995846584
r6    0x2a         42
r7    0x76f4c28c   1995752076
r8    0xf8200100   -132120320
r9    0xe0040000   -536608768
r10   0x60004059   1610629209
r11   0x0          0
r12   0x0          0
sp    0x908be000   0x908be000
lr    0x76dfc108   1994375432
pc    0xffff1240   0xffff1240 <firq_fiq_handler>
cpsr  0x600f01d1   1611596241
(gdb) x 0xffff1240
0xe599b00c

But my firq_fiq_handler starts with 0xee12af10? I know that this works on
real hardware, so I suspect that this is an error within qemu? Or at least
that there is something amiss in the way the memory is initialized or handled.

Is there a way to instrument the memory below the vector table to get debug
logs if the memory is modified?
}

> It may be worthwhile to put a kernel breakpoint in handle_fiq_as_nmi() just
> to see where it goes. If CONFIG_ARM_GIC is enabled it should take you to
> your handler I suspect. Plus, if you get there then we have likely proven
> that QEMU is getting the kernel to the right place. I set a BP in this
> routine on my A9 run and appear to be hitting it correctly.

So you are talking about the Linux kernel, right? CONFIG_ARM_GIC=y, check,
but I can't find handle_fiq_as_nmi? Even a fuzzier "rgrep nmi * |grep fiq"
does not find anything.

Concerning the fact that qemu is jumping to the right address:
I have put a breakpoint at 0xffff001c, which is the FIQ base vector address.
There is an instruction 0xea000480 which seems to be a pc relative branch to
0x1224 which then lands at 0xffff1240.

But the internal debugger gives me some concerns. If I do at the gdb command
line:
hb *0xffff001c
hb *0xffff1240
the debugger only stops at the first breakpoint. If I leave out the first
breakpoint, the debugger stops at 0xffff1240. As I know that from 0xffff001c
it should jump right to 0xffff1240, I would expect both breakpoints to be
triggered.

Then if I reach the breakpoint at 0xffff1240 I know I am in the FIQ code. But
(gdb) x 0xffff1240 gives the wrong value. Nevertheless I now see (after
correcting the static map of the GIC) the following debug output of my test
device when single-stepping from PC=0xffff1240:
Taking exception 6 [FIQ]
pml: pml_write: update control flags: 1
pml: pml_update: stop timer
pml: pml_update: lower irq
pml: pml_read: read magic
pml: pml_write: update control flags: 3
pml: pml_update: stop timer

This means that some code has been executed, most probably my FIQ handler,
but the debugger showed me:
Breakpoint 1, firq_fiq_handler () at fiq.S:26
26 mrc p15, 0, r10, c2, c0, 0 @ read TTBR0 < ok
(gdb) s <- oh my why is it single stepping into the kernel from FIQ?
test_ti_thread_flag (flag=1, ti=0x8f84e000) at include/asm-generic/preempt.h:71
71 return !--*preempt_count_ptr() && tif_need_resched();
(gdb) s <- next step does not look any better...
test_bit (addr=0x8f84e000, nr=1) at include/asm-generic/bitops/non-atomic.h:105
105 return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));

The second run is even stranger:
Breakpoint 1, firq_fiq_handler () at fiq.S:26
26 mrc p15, 0, r10, c2, c0, 0 @ read TTBR0
(gdb) s
Cannot access memory at address 0x4
(gdb) c
Continuing.
Cannot access memory at address 0x4
...
qemu seems completely unusable from here on...

I am pretty sure now that my FIQ handler is executed.
I see multiple accesses to my virtual pml test hardware:
arm_gic: Raised pending FIQ 49 (cpu 0)
pml: pml_write: update control flags: 1
pml: pml_update: start timer
pml: pml_update: lower irq
pml: pml_read: read magic
pml: pml_write: update control flags: 3
pml: pml_update: start timer
arm_gic: Enabled IRQ 37
[ OK ] Found device /dev/ttyAMA0.
pml: pml_timer_tick: raise_irq
arm_gic: Raised pending FIQ 49 (cpu 0)
pml: pml_write: update control flags: 1
pml: pml_update: start timer
pml: pml_update: lower irq
pml: pml_read: read magic
pml: pml_write: update control flags: 3
pml: pml_update: start timer
pml: pml_timer_tick: raise_irq

Which seems like normal operation. Especially the log message shows that
other stuff gets executed.

But after a while the interrupts stop and nothing happens. The system is not
reacting to keypresses anymore. Not even Ctrl-A-X. But this seems as if the
debug output in the GIC and/or my pml test driver locked qemu up?

Also, if I connect to the gdb port while the FIQ is running, qemu stops the
execution.

But besides the problems with the debugger, which set me off course, qemu
seems to happily emulate FIQs, which is really nice :-)

Best regards
Tim
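The branch at the FIQ vector mentioned in the message above can be decoded by hand. The sketch below applies the standard A32 rule for B/BL (sign-extended imm24, shifted left by 2, added to the instruction address plus 8). The function names are made up for this example; only the instruction word 0xea000480 and the address 0xffff001c come from the message.

```c
#include <assert.h>
#include <stdint.h>

/* True if `insn` is an A32 B or BL encoding (bits 27:25 == 0b101). */
int is_arm_branch(uint32_t insn)
{
    return (insn & 0x0e000000u) == 0x0a000000u;
}

/* Target of an A32 B/BL located at `insn_addr`. */
uint32_t arm_branch_target(uint32_t insn_addr, uint32_t insn)
{
    uint32_t imm24, offset;

    assert(is_arm_branch(insn));

    imm24 = insn & 0x00ffffffu;
    offset = imm24 << 2;                 /* word offset to byte offset */
    if (imm24 & 0x00800000u)
        offset |= 0xfc000000u;           /* sign-extend the 26-bit offset */

    /* PC reads as the instruction address + 8 in ARM state;
     * unsigned arithmetic wraps modulo 2^32 as intended. */
    return insn_addr + 8u + offset;
}
```

For the values in the message, `arm_branch_target(0xffff001c, 0xea000480)` gives 0xffff1224, which matches the "0x1224" offset quoted there. The breakpoint address 0xffff1240 lies 0x1c bytes beyond that target; the sketch says nothing about how execution gets from one to the other.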
On 14 November 2014 09:34, Tim Sander <tim@krieglstein.org> wrote:
> > > > 0xbfffe000? You where talking about the fact that the security
> > > > extensions where not implemented. I was not aware that the different
> > > > vbar's where already part of the security stuff?
> > >
> > > MVBAR is part of the Security extensions. HVBAR is part of the
> > > Virtualization extensions. In mainline QEMU we implement neither
> > > of those extensions, and so don't implement the associated
> > > registers. (Strictly speaking, VBAR is also only in the
> > > Security extensions, but we provide it as a workaround for
> > > guests that assume our CPUs should implement it.)
> >
> > Peter beat me to it. None of the VBAR registers should matter in your
> > case which coincides with the use of hivecs.
>
> While writing this mail i found out that the integrated debugger is causing
> harm in combination with the fiq. So everything below the braces seems to
> be related to the this problem. But i still wanted to keep the data points
> for reference:
>
> {
> Ok, so qemu only implements the SCTLR.V bit to control the memory address
> of the interrupt vector. So its either 0 or 0xffff0000. That is fine with
> me. Currently i have the problem that a call to set_fiq_handler does not
> place the binary stuff loaded at the address where qemu is jumping to which
> is presumably 0xffff1240. I have checked that SCTLR.V =1 under linux which
> is fine.
>
> The background info to set_fiq_handler from my understanding is that it
> copies the given stuff directly at the address where the FIQ vector is
> located. This works as the FIQ is the last entry and thus there is some
> memory space for a short interrupt handler. I checked the memory when
> entering the FIQ with the integrated gdb:
> (gdb) info reg
> r0    0x0          0
> r1    0x0          0
> r2    0x1          1
> r3    0x76eb34c8   1995125960
> r4    0x76eb34c8   1995125960
> r5    0x76f633b8   1995846584
> r6    0x2a         42
> r7    0x76f4c28c   1995752076
> r8    0xf8200100   -132120320
> r9    0xe0040000   -536608768
> r10   0x60004059   1610629209
> r11   0x0          0
> r12   0x0          0
> sp    0x908be000   0x908be000
> lr    0x76dfc108   1994375432
> pc    0xffff1240   0xffff1240 <firq_fiq_handler>
> cpsr  0x600f01d1   1611596241
> (gdb) x 0xffff1240
> 0xe599b00c
>
> But my firq_fiq_handler starts with 0xee12af10? I know that this works on
> real hardware so i suspect that this an error within qemu? Or at least that
> there is something amiss in the way the memory is initialized or handled.
>
> Is there a way to instrument the memory below the vector table to get debug
> logs if the memory is modified?
> }
>
> > It may be worthwhile to put a kernel breakpoint in handle_fiq_as_nmi()
> > just to see where it goes. If CONFIG_ARM_GIC is enabled it should take
> > you to your handler I suspect. Plus, if you get there then we have likely
> > proven that QEMU is getting the kernel to the right place. I set a BP in
> > this routine on my A9 run and appear to be hitting it correctly.
>
> So you are talking about the linux kernel, right? CONFIG_ARM_GIC=y check
> but i can't find handle_fiq_as_nmi? Even a fuzzier "rgrep nmi * |grep fiq"
> does not find anything.

Maybe we are working off different versions of the kernel sources. I'm using
a kernel variant of v3.18-rc1. I took a look at my 3.15 kernel and it does
not have the routine, so perhaps yours is an earlier version as well.

I don't spend much time working in or tracking the Linux kernel, so I am not
sure when the difference was introduced. I just found it to be a convenient
function to set a BP for early FIQ debugging, you may have something
different.

Interestingly, as I researched the Linux FIQ support I found this mail
thread...

http://www.spinics.net/lists/arm-kernel/msg14960.html

As I don't have access to your code, I could not verify that the SVC SPSR was
being preserved, but it may be worth you looking into it as I can see this
potentially resulting in all kinds of random behavior. More interestingly,
this comment and code appears to have been changed in later versions of the
FIQ code, so perhaps this has been fixed or improved (My 3.18 kernel does not
have the comment).

> Concerning the fact that qemu is jumping to the right address:
> To i have put a breakpoint to 0xffff001c which is the fiq base vector
> address. There is an instruction 0xea000480 which seems to be a pc relative
> branch to 0x1224 which then lands at 0xffff1240.
>
> But the internal debugger gives me some concerns. If i do at the gdb
> command line:
> hb *0xffff001c
> hb *0xffff1240
> The debugger only stops at the first breakpoint. If i leave the first
> breakpoint away the debugger stops at 0xffff1240. As i know that at
> 0xffff01c it should jump right to 0xffff1240 i would expect that both
> breakpoints are triggered.
>
> Then if i reach the breakpoint at 0xffff1240 i know i am at the fiq code.
> But (gdb) x 0xffff1240 gives the wrong value. Nevertheless i see now (after
> correcting the static map of the GIC) the following debug output of my test
> device when single-stepping from PC=0xffff1240:
> Taking exception 6 [FIQ]
> pml: pml_write: update control flags: 1
> pml: pml_update: stop timer
> pml: pml_update: lower irq
> pml: pml_read: read magic
> pml: pml_write: update control flags: 3
> pml: pml_update: stop timer
>
> This means that there has been some code executed, most probably my FIQ
> handler, but the debugger showed me:
> Breakpoint 1, firq_fiq_handler () at fiq.S:26
> 26 mrc p15, 0, r10, c2, c0, 0 @ read TTBR0 < ok
> (gdb) s <- oh my why is it single stepping into the kernel from FIQ?
> test_ti_thread_flag (flag=1, ti=0x8f84e000) at include/asm-generic/preempt.h:71
> 71 return !--*preempt_count_ptr() && tif_need_resched();
> (gdb) s <- next step does not look any better...
> test_bit (addr=0x8f84e000, nr=1) at include/asm-generic/bitops/non-atomic.h:105
> 105 return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
>
> The second run is even stranger:
> Breakpoint 1, firq_fiq_handler () at fiq.S:26
> 26 mrc p15, 0, r10, c2, c0, 0 @ read TTBR0
> (gdb) s
> Cannot access memory at address 0x4
> (gdb) c
> Continuing.
> Cannot access memory at address 0x4
> ...
> qemu seems completly unusable from here on...
>
> I am pretty sure now that my FIQ handler is executed.
> I see multiple accesses to my virtual pml test hardware:
> arm_gic: Raised pending FIQ 49 (cpu 0)
> pml: pml_write: update control flags: 1
> pml: pml_update: start timer
> pml: pml_update: lower irq
> pml: pml_read: read magic
> pml: pml_write: update control flags: 3
> pml: pml_update: start timer
> arm_gic: Enabled IRQ 37
> [ OK ] Found device /dev/ttyAMA0.
> pml: pml_timer_tick: raise_irq
> arm_gic: Raised pending FIQ 49 (cpu 0)
> pml: pml_write: update control flags: 1
> pml: pml_update: start timer
> pml: pml_update: lower irq
> pml: pml_read: read magic
> pml: pml_write: update control flags: 3
> pml: pml_update: start timer
> pml: pml_timer_tick: raise_irq
>
> Which seems like normal operation. Especially the log
> message shows that other stuff gets executed.
>
> But after a while the interrupts stop and nothing happens
> The system is not reacting to keypresses anymore. Not even
> Ctrl-A-X. But this seems as if the debug output in the GIC and/or
> my pml test driver locked the qemu up?

Hmmm... almost sounds like we lost an interrupt or ack which could be in
QEMU. Does execution cease if run as A15?

> Also if i connect to the gdb port while the fiq is running the
> qemu stops the execution.
>
> But besides the problems with the debugger which set me of course
> the qemu seems to happy emulate FIQs, which is really nice :-)

I'm happy to hear that we found a working scenario, but hangs and such should
not happen. I need to determine a way to look into this more myself to see
if it is related to grouping or FIQ support.

> Best regards
> Tim
Hi Greg

On Friday, 14 November 2014, 10:50:40, Greg Bellows wrote:
> On 14 November 2014 09:34, Tim Sander <tim@krieglstein.org> wrote:
> > > > > 0xbfffe000? You where talking about the fact that the security
> > > > > extensions where not implemented. I was not aware that the
> > > > > different vbar's where already part of the security stuff?
> > > >
> > > > MVBAR is part of the Security extensions. HVBAR is part of the
> > > > Virtualization extensions. In mainline QEMU we implement neither
> > > > of those extensions, and so don't implement the associated
> > > > registers. (Strictly speaking, VBAR is also only in the
> > > > Security extensions, but we provide it as a workaround for
> > > > guests that assume our CPUs should implement it.)
> > >
> > > Peter beat me to it. None of the VBAR registers should matter in your
> > > case which coincides with the use of hivecs.
> >
> > While writing this mail i found out that the integrated debugger is
> > causing harm in combination with the fiq. So everything below the braces
> > seems to be related to the this problem. But i still wanted to keep the
> > data points for reference:
> >
> > {
> > Ok, so qemu only implements the SCTLR.V bit to control the memory address
> > of the interrupt vector. So its either 0 or 0xffff0000. That is fine with
> > me. Currently i have the problem that a call to set_fiq_handler does not
> > place the binary stuff loaded at the address where qemu is jumping to
> > which is presumably 0xffff1240. I have checked that SCTLR.V =1 under
> > linux which is fine.
> >
> > The background info to set_fiq_handler from my understanding is that it
> > copies the given stuff directly at the address where the FIQ vector is
> > located. This works as the FIQ is the last entry and thus there is some
> > memory space for a short interrupt handler. I checked the memory when
> > entering the FIQ with the integrated gdb:
> > (gdb) info reg
> > r0    0x0          0
> > r1    0x0          0
> > r2    0x1          1
> > r3    0x76eb34c8   1995125960
> > r4    0x76eb34c8   1995125960
> > r5    0x76f633b8   1995846584
> > r6    0x2a         42
> > r7    0x76f4c28c   1995752076
> > r8    0xf8200100   -132120320
> > r9    0xe0040000   -536608768
> > r10   0x60004059   1610629209
> > r11   0x0          0
> > r12   0x0          0
> > sp    0x908be000   0x908be000
> > lr    0x76dfc108   1994375432
> > pc    0xffff1240   0xffff1240 <firq_fiq_handler>
> > cpsr  0x600f01d1   1611596241
> > (gdb) x 0xffff1240
> > 0xe599b00c
> >
> > But my firq_fiq_handler starts with 0xee12af10? I know that this works on
> > real hardware so i suspect that this an error within qemu? Or at least
> > that there is something amiss in the way the memory is initialized or
> > handled.
> >
> > Is there a way to instrument the memory below the vector table to get
> > debug logs if the memory is modified?
> > }
> >
> > > It may be worthwhile to put a kernel breakpoint in handle_fiq_as_nmi()
> > > just to see where it goes. If CONFIG_ARM_GIC is enabled it should take
> > > you to your handler I suspect. Plus, if you get there then we have
> > > likely proven that QEMU is getting the kernel to the right place. I set
> > > a BP in this routine on my A9 run and appear to be hitting it correctly.
> >
> > So you are talking about the linux kernel, right? CONFIG_ARM_GIC=y check
> > but i can't find handle_fiq_as_nmi? Even a fuzzier "rgrep nmi * |grep fiq"
> > does not find anything.
>
> Maybe we are working off different versions of the kernel sources. I'm
> using a kernel variant of v3.18-rc1. I took a look at my 3.15 kernel and
> it does not have the routine, so perhaps yours is an earlier version as
> well.

I am on 3.14, as I am working with rt-preempt kernels right now.

> I don't spend much time working in or tracking the Linux kernel, so I am
> not sure when the difference was introduced. I just found it to be a
> convenient function to set a BP for early FIQ debugging, you may have
> something different.
>
> Interestingly, as I researched the Linux FIQ support I found this mail
> thread...
>
> http://www.spinics.net/lists/arm-kernel/msg14960.html
>
> As I don't have access to your code, I could not verify that the SVC SPSR
> was being preserved, but it may be worth you looking into it as I can see
> this potentially resulting in all kinds of random behavior. More
> interestingly, this comment and code appears to have been changed in later
> versions of the FIQ code, so perhaps this has been fixed or improved (My
> 3.18 kernel does not have the comment).

I have not been following the 3.18 kernel concerning the FIQ, but I will take
a look. Regarding the above link, I think preserving SPSR is only needed if
mode switching is being done from the FIQ. As I just return from the handler,
I am assuming that the problem above is not mine. The only problem I have
(besides the qemu debugger) is that I am missing some static mappings, so I
get a bad mode error when hitting a page fault in FIQ mode.

> > Concerning the fact that qemu is jumping to the right address:
> > To i have put a breakpoint to 0xffff001c which is the fiq base vector
> > address. There is an instruction 0xea000480 which seems to be a pc
> > relative branch to 0x1224 which then lands at 0xffff1240.
> >
> > But the internal debugger gives me some concerns. If i do at the gdb
> > command line:
> > hb *0xffff001c
> > hb *0xffff1240
> > The debugger only stops at the first breakpoint. If i leave the first
> > breakpoint away the debugger stops at 0xffff1240. As i know that at
> > 0xffff01c it should jump right to 0xffff1240 i would expect that both
> > breakpoints are triggered.
> >
> > Then if i reach the breakpoint at 0xffff1240 i know i am at the fiq code.
> > But (gdb) x 0xffff1240 gives the wrong value. Nevertheless i see now
> > (after correcting the static map of the GIC) the following debug output
> > of my test device when single-stepping from PC=0xffff1240:
> > Taking exception 6 [FIQ]
> > pml: pml_write: update control flags: 1
> > pml: pml_update: stop timer
> > pml: pml_update: lower irq
> > pml: pml_read: read magic
> > pml: pml_write: update control flags: 3
> > pml: pml_update: stop timer
> >
> > This means that there has been some code executed, most probably my FIQ
> > handler, but the debugger showed me:
> > Breakpoint 1, firq_fiq_handler () at fiq.S:26
> > 26 mrc p15, 0, r10, c2, c0, 0 @ read TTBR0 < ok
> > (gdb) s <- oh my why is it single stepping into the kernel from FIQ?
> > test_ti_thread_flag (flag=1, ti=0x8f84e000) at include/asm-generic/preempt.h:71
> > 71 return !--*preempt_count_ptr() && tif_need_resched();
> > (gdb) s <- next step does not look any better...
> > test_bit (addr=0x8f84e000, nr=1) at include/asm-generic/bitops/non-atomic.h:105
> > 105 return 1UL & (addr[BIT_WORD(nr)] >> (nr & (BITS_PER_LONG-1)));
> >
> > The second run is even stranger:
> > Breakpoint 1, firq_fiq_handler () at fiq.S:26
> > 26 mrc p15, 0, r10, c2, c0, 0 @ read TTBR0
> > (gdb) s
> > Cannot access memory at address 0x4
> > (gdb) c
> > Continuing.
> > Cannot access memory at address 0x4
> > ...
> > qemu seems completly unusable from here on...
> >
> > I am pretty sure now that my FIQ handler is executed.
> > I see multiple accesses to my virtual pml test hardware:
> > arm_gic: Raised pending FIQ 49 (cpu 0)
> > pml: pml_write: update control flags: 1
> > pml: pml_update: start timer
> > pml: pml_update: lower irq
> > pml: pml_read: read magic
> > pml: pml_write: update control flags: 3
> > pml: pml_update: start timer
> > arm_gic: Enabled IRQ 37
> > [ OK ] Found device /dev/ttyAMA0.
> > pml: pml_timer_tick: raise_irq
> > arm_gic: Raised pending FIQ 49 (cpu 0)
> > pml: pml_write: update control flags: 1
> > pml: pml_update: start timer
> > pml: pml_update: lower irq
> > pml: pml_read: read magic
> > pml: pml_write: update control flags: 3
> > pml: pml_update: start timer
> > pml: pml_timer_tick: raise_irq
> >
> > Which seems like normal operation. Especially the log
> > message shows that other stuff gets executed.
> >
> > But after a while the interrupts stop and nothing happens
> > The system is not reacting to keypresses anymore. Not even
> > Ctrl-A-X. But this seems as if the debug output in the GIC and/or
> > my pml test driver locked the qemu up?
>
> Hmmm... almost sounds like we lost an interrupt or ack which could be in
> QEMU. Does execution cease if run as A15?

By now I think that it is not related to the CPU core but to the gdb debug
port. As soon as the debugger is open and a FIQ is hit, problems start. This
was a bit unfortunate for my tests, as I was using the integrated debugger to
debug the FIQ. But its results are completely bogus and definitely do not
represent the qemu execution, which is running fine as far as I can see from
the debug output of my virtual hardware driver.

> > Also if i connect to the gdb port while the fiq is running the
> > qemu stops the execution.
> >
> > But besides the problems with the debugger which set me of course
> > the qemu seems to happy emulate FIQs, which is really nice :-)
>
> I'm happy to hear that we found a working scenario, but hangs and such
> should not happen. I need to determine a way to look into this more
> myself to see if it is related to grouping or FIQ support.

I can prepare a ptxdist.org based environment and the patches for my test
driver for you if you need a target. This should give you a Linux environment
in less than 30 minutes of work and about 30 minutes of compile time
(depending on the CPU).

Best regards
Tim
--- a/hw/cpu/a9mpcore.c
+++ b/hw/cpu/a9mpcore.c
@@ -29,6 +29,8 @@ static void a9mp_priv_initfn(Object *obj)
     object_initialize(&s->gic, sizeof(s->gic), TYPE_ARM_GIC);
     qdev_set_parent_bus(DEVICE(&s->gic), sysbus_get_default());
+    qdev_prop_set_uint32(DEVICE(&s->gic), "revision", 2);
 
     object_initialize(&s->gtimer, sizeof(s->gtimer), TYPE_A9_GTIMER);
     qdev_set_parent_bus(DEVICE(&s->gtimer), sysbus_get_default());