Message ID | 1396052834-26834-2-git-send-email-rth@twiddle.net |
---|---|
State | Superseded |
Richard Henderson <rth@twiddle.net> writes:

> From: Peter Maydell <peter.maydell@linaro.org>
>
> The code which patches x86 jump instructions assumes it can do an
> unaligned write of a uint32_t. This is actually safe on x86, but it's
> still undefined behaviour. We have infrastructure for doing efficient
> unaligned accesses which doesn't engage in undefined behaviour, so
> use it.
>
> This is technically fractionally less efficient, at least with gcc 4.6;
> instead of one instruction:
>   7b2: 89 3e mov %edi,(%rsi)
> we get an extra spurious store to the stack slot:
>   7b2: 89 7c 24 64 mov %edi,0x64(%rsp)
>   7b6: 89 3e mov %edi,(%rsi)

Ehh? Is that gcc just being silly and putting parameters for an inline
on the stack frame?

>
> Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
> Signed-off-by: Richard Henderson <rth@twiddle.net>
<snip>

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>

On 1 April 2014 13:09, Alex Bennée <alex.bennee@linaro.org> wrote:
>
> Richard Henderson <rth@twiddle.net> writes:
>
>> From: Peter Maydell <peter.maydell@linaro.org>
>>
>> The code which patches x86 jump instructions assumes it can do an
>> unaligned write of a uint32_t. This is actually safe on x86, but it's
>> still undefined behaviour. We have infrastructure for doing efficient
>> unaligned accesses which doesn't engage in undefined behaviour, so
>> use it.
>>
>> This is technically fractionally less efficient, at least with gcc 4.6;
>> instead of one instruction:
>>   7b2: 89 3e mov %edi,(%rsi)
>> we get an extra spurious store to the stack slot:
>>   7b2: 89 7c 24 64 mov %edi,0x64(%rsp)
>>   7b6: 89 3e mov %edi,(%rsi)
>
> Ehh? Is that gcc just being silly and putting parameters for an inline
> on the stack frame?

It's gcc being dumb and not noticing that it has no requirement
to store the inline parameter to the stack frame because it's
optimised away the reference to the address of the parameter.
Possibly more recent gcc versions do better.

thanks
-- PMM

diff --git a/include/exec/exec-all.h b/include/exec/exec-all.h
index f9ac332..1c49a21 100644
--- a/include/exec/exec-all.h
+++ b/include/exec/exec-all.h
@@ -229,7 +229,7 @@ void ppc_tb_set_jmp_target(unsigned long jmp_addr, unsigned long addr);
 static inline void tb_set_jmp_target1(uintptr_t jmp_addr, uintptr_t addr)
 {
     /* patch the branch destination */
-    *(uint32_t *)jmp_addr = addr - (jmp_addr + 4);
+    stl_p((void*)jmp_addr, addr - (jmp_addr + 4));
     /* no need to flush icache explicitly */
 }
 #elif defined(__aarch64__)
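
For readers unfamiliar with the idiom, here is a minimal sketch of one common way to do an unaligned 32-bit store without undefined behaviour: copy the bytes with a fixed-size memcpy instead of dereferencing a cast pointer. This is not QEMU's stl_p implementation, which may differ; the names store_u32_unaligned, load_u32_unaligned and patch_rel32 are hypothetical and exist only for illustration. The sketch stores in host byte order and works on buffer offsets rather than real code addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not QEMU's stl_p): store a 32-bit value in host
 * byte order at an address that may not be 4-byte aligned. */
static inline void store_u32_unaligned(void *ptr, uint32_t val)
{
    memcpy(ptr, &val, sizeof(val));
}

/* Hypothetical counterpart: load a 32-bit value from a possibly
 * unaligned address. */
static inline uint32_t load_u32_unaligned(const void *ptr)
{
    uint32_t val;
    memcpy(&val, ptr, sizeof(val));
    return val;
}

/* Hypothetical illustration of the patching arithmetic in the diff:
 * write the rel32 displacement found at offset disp_off in code[] so
 * that the jump lands at offset target_off. The displacement is
 * relative to the end of the 4-byte field, hence the "+ 4". */
static inline void patch_rel32(uint8_t *code, size_t len,
                               size_t disp_off, size_t target_off)
{
    assert(disp_off + 4 <= len);
    uint32_t rel = (uint32_t)(target_off - (disp_off + 4));
    store_u32_unaligned(code + disp_off, rel);
}
```

With optimisation enabled, compilers targeting x86 typically reduce a fixed-size memcpy like this to a plain mov, which is consistent with the single store (plus the spurious stack store) reported for gcc 4.6 in the thread above.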