Message ID | 20171107165226.22546-1-alex.bennee@linaro.org |
---|---|
State | New |
Headers | show |
Series | accel/tcg/translate-all: expand cpu_restore_state retaddr check | expand |
On 7 November 2017 at 16:52, Alex Bennée <alex.bennee@linaro.org> wrote: > We are still seeing signals during translation time when we walk over > a page protection boundary. This expands the check to ensure the > retaddr is inside the code generation buffer. The original suggestion > was to check versus tcg_ctx.code_gen_ptr but as we now segment the > translation buffer we have to settle for just a general check for > being inside. > > Signed-off-by: Alex Bennée <alex.bennee@linaro.org> > Reported-by: Peter Maydell <peter.maydell@linaro.org> > Suggested-by: Paolo Bonzini <pbonzini@redhat.com> > Cc: Richard Henderson <rth@twiddle.net> > --- > accel/tcg/translate-all.c | 20 ++++++++++++-------- > 1 file changed, 12 insertions(+), 8 deletions(-) > > diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c > index 34c5e28d07..eb255af402 100644 > --- a/accel/tcg/translate-all.c > +++ b/accel/tcg/translate-all.c > @@ -357,16 +357,20 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr) > TranslationBlock *tb; > bool r = false; > > - /* A retaddr of zero is invalid so we really shouldn't have ended > - * up here. The target code has likely forgotten to check retaddr > - * != 0 before attempting to restore state. We return early to > - * avoid blowing up on a recursive tb_lock(). The target must have > - * previously survived a failed cpu_restore_state because > - * tb_find_pc(0) would have failed anyway. It still should be > - * fixed though. > + /* The retaddr has to be in the region of current code buffer. If > + * it's not we will not be able to resolve it here. If it is zero > + * the calling code has likely forgotten to check retaddr before > + * calling here. This part of the comment isn't correct -- it's entirely expected that we will get here with a zero retaddr, because that is how the rest of the code tells this function "no state restoration required". > If it is not in the translated code we could be > + * faulting during translation itself. > + * > + * Either way we need return early to avoid blowing up on a > + * recursive tb_lock() as we can't resolve it here. > */ > > - if (!retaddr) { > + if (!retaddr || > + (retaddr < (uintptr_t) tcg_init_ctx.code_gen_buffer) || > + (retaddr > (uintptr_t) (tcg_init_ctx.code_gen_buffer + > + tcg_init_ctx.code_gen_buffer_size))) { > return r; > } thanks -- PMM
Peter Maydell <peter.maydell@linaro.org> writes: > On 7 November 2017 at 16:52, Alex Bennée <alex.bennee@linaro.org> wrote: >> We are still seeing signals during translation time when we walk over >> a page protection boundary. This expands the check to ensure the >> retaddr is inside the code generation buffer. The original suggestion >> was to check versus tcg_ctx.code_gen_ptr but as we now segment the >> translation buffer we have to settle for just a general check for >> being inside. >> >> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> >> Reported-by: Peter Maydell <peter.maydell@linaro.org> >> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> >> Cc: Richard Henderson <rth@twiddle.net> >> --- >> accel/tcg/translate-all.c | 20 ++++++++++++-------- >> 1 file changed, 12 insertions(+), 8 deletions(-) >> >> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c >> index 34c5e28d07..eb255af402 100644 >> --- a/accel/tcg/translate-all.c >> +++ b/accel/tcg/translate-all.c >> @@ -357,16 +357,20 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr) >> TranslationBlock *tb; >> bool r = false; >> >> - /* A retaddr of zero is invalid so we really shouldn't have ended >> - * up here. The target code has likely forgotten to check retaddr >> - * != 0 before attempting to restore state. We return early to >> - * avoid blowing up on a recursive tb_lock(). The target must have >> - * previously survived a failed cpu_restore_state because >> - * tb_find_pc(0) would have failed anyway. It still should be >> - * fixed though. >> + /* The retaddr has to be in the region of current code buffer. If >> + * it's not we will not be able to resolve it here. If it is zero >> + * the calling code has likely forgotten to check retaddr before >> + * calling here. > > This part of the comment isn't correct -- it's entirely expected > that we will get here with a zero retaddr, because that is > how the rest of the code tells this function "no state restoration > required". Then why call cpu_restore_state at all? We should be consistent as there are plenty of places that do things like: if (pc) { /* now we have a real cpu fault */ cpu_restore_state(cs, pc); } I'm happy to make a 0 retaddr officially valid and actually document it in exec-all.h. It's not like most callers even bother checking the return code. > >> If it is not in the translated code we could be >> + * faulting during translation itself. >> + * >> + * Either way we need return early to avoid blowing up on a >> + * recursive tb_lock() as we can't resolve it here. >> */ >> >> - if (!retaddr) { >> + if (!retaddr || >> + (retaddr < (uintptr_t) tcg_init_ctx.code_gen_buffer) || >> + (retaddr > (uintptr_t) (tcg_init_ctx.code_gen_buffer + >> + tcg_init_ctx.code_gen_buffer_size))) { >> return r; >> } > > thanks > -- PMM -- Alex Bennée
On 7 November 2017 at 18:45, Alex Bennée <alex.bennee@linaro.org> wrote: > > Peter Maydell <peter.maydell@linaro.org> writes: > >> On 7 November 2017 at 16:52, Alex Bennée <alex.bennee@linaro.org> wrote: >>> We are still seeing signals during translation time when we walk over >>> a page protection boundary. This expands the check to ensure the >>> retaddr is inside the code generation buffer. The original suggestion >>> was to check versus tcg_ctx.code_gen_ptr but as we now segment the >>> translation buffer we have to settle for just a general check for >>> being inside. >>> >>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org> >>> Reported-by: Peter Maydell <peter.maydell@linaro.org> >>> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> >>> Cc: Richard Henderson <rth@twiddle.net> >>> --- >>> accel/tcg/translate-all.c | 20 ++++++++++++-------- >>> 1 file changed, 12 insertions(+), 8 deletions(-) >>> >>> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c >>> index 34c5e28d07..eb255af402 100644 >>> --- a/accel/tcg/translate-all.c >>> +++ b/accel/tcg/translate-all.c >>> @@ -357,16 +357,20 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr) >>> TranslationBlock *tb; >>> bool r = false; >>> >>> - /* A retaddr of zero is invalid so we really shouldn't have ended >>> - * up here. The target code has likely forgotten to check retaddr >>> - * != 0 before attempting to restore state. We return early to >>> - * avoid blowing up on a recursive tb_lock(). The target must have >>> - * previously survived a failed cpu_restore_state because >>> - * tb_find_pc(0) would have failed anyway. It still should be >>> - * fixed though. >>> + /* The retaddr has to be in the region of current code buffer. If >>> + * it's not we will not be able to resolve it here. If it is zero >>> + * the calling code has likely forgotten to check retaddr before >>> + * calling here. >> >> This part of the comment isn't correct -- it's entirely expected >> that we will get here with a zero retaddr, because that is >> how the rest of the code tells this function "no state restoration >> required". > > Then why call cpu_restore_state at all? We should be consistent as there > are plenty of places that do things like: > > if (pc) { > /* now we have a real cpu fault */ > cpu_restore_state(cs, pc); > } > > I'm happy to make a 0 retaddr officially valid and actually document it > in exec-all.h. It's not like most callers even bother checking the > return code. Hmm, there's more places than I expected that do that "don't call if 0" check than I thought. Overall it seems better to me to officially allow the zero, rather than having lots of callsites that all have to remember to check. Incidentally if retaddr is zero then (retaddr < (uintptr_t) tcg_init_ctx.code_gen_buffer) is always true and you don't need to explicitly check for zero, though it might be clearer to do so if we think we might change the rest of the condition in future. thanks -- PMM
On 11/07/2017 07:53 PM, Peter Maydell wrote: >> Then why call cpu_restore_state at all? We should be consistent as there >> are plenty of places that do things like: >> >> if (pc) { >> /* now we have a real cpu fault */ >> cpu_restore_state(cs, pc); >> } >> >> I'm happy to make a 0 retaddr officially valid and actually document it >> in exec-all.h. It's not like most callers even bother checking the >> return code. This is exactly the discussion that we had last time, and we did just that. > Hmm, there's more places than I expected that do that "don't call > if 0" check than I thought. Overall it seems better to me to officially > allow the zero, rather than having lots of callsites that all have > to remember to check. ... what we didn't do is go through and change all of the call sites to remove the check for zero. > Incidentally if retaddr is zero then > (retaddr < (uintptr_t) tcg_init_ctx.code_gen_buffer) > is always true and you don't need to explicitly check for zero, though > it might be clearer to do so if we think we might change the rest > of the condition in future. Indeed, I was thinking retaddr - code_gen_buffer < code_gen_buffer_size which works well with unsigned arithmetic. And a large comment re zero. r~
diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c index 34c5e28d07..eb255af402 100644 --- a/accel/tcg/translate-all.c +++ b/accel/tcg/translate-all.c @@ -357,16 +357,20 @@ bool cpu_restore_state(CPUState *cpu, uintptr_t retaddr) TranslationBlock *tb; bool r = false; - /* A retaddr of zero is invalid so we really shouldn't have ended - * up here. The target code has likely forgotten to check retaddr - * != 0 before attempting to restore state. We return early to - * avoid blowing up on a recursive tb_lock(). The target must have - * previously survived a failed cpu_restore_state because - * tb_find_pc(0) would have failed anyway. It still should be - * fixed though. + /* The retaddr has to be in the region of current code buffer. If + * it's not we will not be able to resolve it here. If it is zero + * the calling code has likely forgotten to check retaddr before + * calling here. If it is not in the translated code we could be + * faulting during translation itself. + * + * Either way we need return early to avoid blowing up on a + * recursive tb_lock() as we can't resolve it here. */ - if (!retaddr) { + if (!retaddr || + (retaddr < (uintptr_t) tcg_init_ctx.code_gen_buffer) || + (retaddr > (uintptr_t) (tcg_init_ctx.code_gen_buffer + + tcg_init_ctx.code_gen_buffer_size))) { return r; }
We are still seeing signals during translation time when we walk over a page protection boundary. This expands the check to ensure the retaddr is inside the code generation buffer. The original suggestion was to check versus tcg_ctx.code_gen_ptr but as we now segment the translation buffer we have to settle for just a general check for being inside. Signed-off-by: Alex Bennée <alex.bennee@linaro.org> Reported-by: Peter Maydell <peter.maydell@linaro.org> Suggested-by: Paolo Bonzini <pbonzini@redhat.com> Cc: Richard Henderson <rth@twiddle.net> --- accel/tcg/translate-all.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) -- 2.14.2