Message ID | 561E009B.3070001@linaro.org |
---|---|
State | New |
Headers | show |
On Oct 14, 2015, at 9:24 PM, Jungseok Lee wrote: > On Oct 14, 2015, at 4:13 PM, AKASHI Takahiro wrote: >> On 10/09/2015 11:24 PM, James Morse wrote: >>> Hi Jungseok, >>> >>> On 07/10/15 16:28, Jungseok Lee wrote: >>>> Currently, a call trace drops a process stack walk when a separate IRQ >>>> stack is used. It makes a call trace information much less useful when >>>> a system gets paniked in interrupt context. >>> >>> panicked >>> >>>> This patch addresses the issue with the following schemes: >>>> >>>> - Store aborted stack frame data >>>> - Decide whether another stack walk is needed or not via current sp >>>> - Loosen the frame pointer upper bound condition >>> >>> It may be worth merging this patch with its predecessor - anyone trying to >>> bisect a problem could land between these two patches, and spend time >>> debugging the truncated call traces. >>> >>> >>>> diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h >>>> index 6ea82e8..e5904a1 100644 >>>> --- a/arch/arm64/include/asm/irq.h >>>> +++ b/arch/arm64/include/asm/irq.h >>>> @@ -2,13 +2,25 @@ >>>> #define __ASM_IRQ_H >>>> >>>> #include <linux/irqchip/arm-gic-acpi.h> >>>> +#include <asm/stacktrace.h> >>>> >>>> #include <asm-generic/irq.h> >>>> >>>> struct irq_stack { >>>> void *stack; >>>> + struct stackframe frame; >>>> }; >>>> >>>> +DECLARE_PER_CPU(struct irq_stack, irq_stacks); >>> >>> Good idea, storing this in the per-cpu data makes it immune to stack >>> corruption. >> >> Is this the only reason that you have a dummy stack frame in per-cpu data? >> By placing this frame in an interrupt stack, I think, we will be able to eliminate >> changes in dump_stace(). and >> >>> >>>> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c >>>> index 407991b..5124649 100644 >>>> --- a/arch/arm64/kernel/stacktrace.c >>>> +++ b/arch/arm64/kernel/stacktrace.c >>>> @@ -43,7 +43,27 @@ int notrace unwind_frame(struct stackframe *frame) >>>> low = frame->sp; >>>> high = ALIGN(low, THREAD_SIZE); >>>> >>>> - if (fp < low || fp > high - 0x18 || fp & 0xf) >>>> + /* >>>> + * A frame pointer would reach an upper bound if a prologue of the >>>> + * first function of call trace looks as follows: >>>> + * >>>> + * stp x29, x30, [sp,#-16]! >>>> + * mov x29, sp >>>> + * >>>> + * Thus, the upper bound is (top of stack - 0x20) with consideration >>> >>> The terms 'top' and 'bottom' of the stack are confusing, your 'top' appears >>> to be the highest address, which is used first, making it the bottom of the >>> stack. >>> >>> I would try to use the terms low/est and high/est, in keeping with the >>> variable names in use here. >>> >>> >>>> + * of a 16-byte empty space in THREAD_START_SP. >>>> + * >>>> + * The value, 0x20, however, does not cover all cases as interrupts >>>> + * are handled using a separate stack. That is, a call trace can start >>>> + * from elx_irq exception vectors. The symbols could not be promoted >>>> + * to candidates for a stack trace under the restriction, 0x20. >>>> + * >>>> + * The scenario is handled without complexity as 1) considering >>>> + * (bottom of stack + THREAD_START_SP) as a dummy frame pointer, the >>>> + * content of which is 0, and 2) allowing the case, which changes >>>> + * the value to 0x10 from 0x20. >>> >>> Where has 0x20 come from? The old value was 0x18. >>> >>> My understanding is the highest part of the stack looks like this: >>> high [ off-stack ] >>> high - 0x08 [ left free by THREAD_START_SP ] >>> high - 0x10 [ left free by THREAD_START_SP ] >>> high - 0x18 [#1 x30 ] >>> high - 0x20 [#1 x29 ] >>> >>> So the condition 'fp > high - 0x18' prevents returning either 'left free' >>> address, or off-stack-value as a frame. Changing it to 'fp > high - 0x10' >>> allows the first half of that reserved area to be a valid stack frame. >>> >>> This change is breaking perf using incantations [0] and [1]: >>> >>> Before, with just patch 1/2: >>> ---__do_softirq >>> | >>> |--92.95%-- __handle_domain_irq >>> | __irqentry_text_start >>> | el1_irq >>> | >>> >>> After, with both patches: >>> ---__do_softirq >>> | >>> |--83.83%-- __handle_domain_irq >>> | __irqentry_text_start >>> | el1_irq >>> | | >>> | |--99.39%-- 0x400008040d00000c >>> | --0.61%-- [...] >>> | >> >> This also shows that walk_stackframe() doesn't walk through a process stack. >> Now I'm trying the following hack on top of Jungseok's patch. >> (It doesn't traverse from an irq stack to an process stack yet. I need modify >> unwind_frame().) > > I've got a difference between perf and dump_backtrace() as reviewing perf call > chain operation. Perf relies on walk_stackframe(), but dump_backtrace() does not. > That is, a symbol is printed out *before* unwind_frame() call in case of perf. > By contrast, dump_backtrace() records a symbol *after* unwind_frame(). I think > perf behavior is correct since frame.pc is retrieved from a valid stack frame. > > So, the following diff is a prerequisite. It looks reasonable to remove dump_mem() > call since frame.sp is calculated incorrectly now. If accepted, dump_backtrace() > could utilize walk_stackframe(), which simplifies the code. > > ----8<---- > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c > index f93aae5..e18be43 100644 > --- a/arch/arm64/kernel/traps.c > +++ b/arch/arm64/kernel/traps.c > @@ -103,12 +103,15 @@ static void dump_mem(const char *lvl, const char *str, unsigned long bottom, > set_fs(fs); > } > > -static void dump_backtrace_entry(unsigned long where, unsigned long stack) > +static void dump_backtrace_entry(unsigned long where) > { > + /* > + * PC has a physical address when MMU is disabled. > + */ > + if (!kernel_text_address(where)) > + where = (unsigned long)phys_to_virt(where); > + > print_ip_sym(where); > - if (in_exception_text(where)) > - dump_mem("", "Exception stack", stack, > - stack + sizeof(struct pt_regs), false); > } > > static void dump_instr(const char *lvl, struct pt_regs *regs) > @@ -172,12 +175,17 @@ static void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk) > pr_emerg("Call trace:\n"); > while (1) { > unsigned long where = frame.pc; > + unsigned long stack; > int ret; > > + dump_backtrace_entry(where); > ret = unwind_frame(&frame); > if (ret < 0) > break; > - dump_backtrace_entry(where, frame.sp); > + stack = frame.sp; > + if (in_exception_text(where)) > + dump_mem("", "Exception stack", stack, > + stack + sizeof(struct pt_regs), false); > } > } > ----8<---- > >> Thanks, >> -Takahiro AKASHI >> ----8<---- >> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S >> index 650cc05..5fbd1ea 100644 >> --- a/arch/arm64/kernel/entry.S >> +++ b/arch/arm64/kernel/entry.S >> @@ -185,14 +185,12 @@ alternative_endif >> mov x23, sp >> and x23, x23, #~(THREAD_SIZE - 1) >> cmp x20, x23 // check irq re-enterance >> + mov x19, sp >> beq 1f >> - str x29, [x19, #IRQ_FRAME_FP] >> - str x21, [x19, #IRQ_FRAME_SP] >> - str x22, [x19, #IRQ_FRAME_PC] >> - mov x29, x24 >> -1: mov x19, sp >> - csel x23, x19, x24, eq // x24 = top of irq stack >> - mov sp, x23 >> + mov sp, x24 // x24 = top of irq stack >> + stp x29, x22, [sp, #-32]! >> + mov x29, sp >> +1: >> .endm >> >> /* > > Is it possible to decide which stack is used without aborted SP information? We could know which stack is used via current SP, but how could we decide a variable 'low' in unwind_frame() when walking a process stack? Sorry for confusion. Best Regards Jungseok Lee-- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S index 650cc05..5fbd1ea 100644 --- a/arch/arm64/kernel/entry.S +++ b/arch/arm64/kernel/entry.S @@ -185,14 +185,12 @@ alternative_endif mov x23, sp and x23, x23, #~(THREAD_SIZE - 1) cmp x20, x23 // check irq re-enterance + mov x19, sp beq 1f - str x29, [x19, #IRQ_FRAME_FP] - str x21, [x19, #IRQ_FRAME_SP] - str x22, [x19, #IRQ_FRAME_PC] - mov x29, x24 -1: mov x19, sp - csel x23, x19, x24, eq // x24 = top of irq stack - mov sp, x23 + mov sp, x24 // x24 = top of irq stack + stp x29, x22, [sp, #-32]! + mov x29, sp +1: .endm /*