Message ID | 1343649500-18491-11-git-send-email-anton.vorontsov@linaro.org |
---|---|
State | New |
On Mon, Jul 30, 2012 at 04:58:20AM -0700, Anton Vorontsov wrote:
> This makes the code more isolated.
>
> The downside of this is that we now have an additional branch and the
> code itself is 8 bytes longer. But on the bright side, this new layout
> can be more cache friendly since cr_alignment address might be already
> in the cache line (not that I measured anything, it's just fun to think
> about it).

The caches are Harvard, so mixing data and code together does not increase
performance. Having data which is used by the same code in the same cache
line results in better performance.

The additional branch will also cause a pipeline stall on older CPUs.

So no, I don't see any way that this is a performance improvement. Please
leave this as is.

On Mon, Jul 30, 2012 at 03:15:44PM +0100, Russell King - ARM Linux wrote:
> On Mon, Jul 30, 2012 at 04:58:20AM -0700, Anton Vorontsov wrote:
> > This makes the code more isolated.
> >
> > The downside of this is that we now have an additional branch and the
> > code itself is 8 bytes longer. But on the bright side, this new layout
> > can be more cache friendly since cr_alignment address might be already
> > in the cache line (not that I measured anything, it's just fun to think
> > about it).
>
> The caches are Harvard, so mixing data and code together does not increase
> performance. Having data which is used by the same code in the same cache
> line results in better performance.
>
> The additional branch will also cause a pipeline stall on older CPUs.
>
> So no, I don't see any way that this is a performance improvement. Please
> leave this as is.

Sure, will drop it.

Thanks!

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 6aeb9b8..6b04ab5 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -266,8 +266,6 @@ __pabt_svc:
 ENDPROC(__pabt_svc)
 
 	.align	5
-.LCcralign:
-	.word	cr_alignment
 #ifdef MULTI_DABORT
 .LCprocfns:
 	.word	processor
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index c3c09ac..5a05e7f 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -38,9 +38,13 @@
 
 	.macro	alignment_trap, rtemp
 #ifdef CONFIG_ALIGNMENT_TRAP
-	ldr	\rtemp, .LCcralign
+	ldr	\rtemp, 1f
 	ldr	\rtemp, [\rtemp]
 	mcr	p15, 0, \rtemp, c1, c0
+	b	2f
+1:
+	.word	cr_alignment
+2:
 #endif
 	.endm
 
diff --git a/arch/arm/kernel/kgdb_fiq_entry.S b/arch/arm/kernel/kgdb_fiq_entry.S
index 7be3726..e7c05fc 100644
--- a/arch/arm/kernel/kgdb_fiq_entry.S
+++ b/arch/arm/kernel/kgdb_fiq_entry.S
@@ -18,9 +18,6 @@
 
 	.text
 
-@ This is needed for usr_entry/alignment_trap
-.LCcralign:
-	.long	cr_alignment
 .LCdohandle:
 	.long	kgdb_fiq_do_handle
 
This makes the code more isolated.

The downside of this is that we now have an additional branch and the
code itself is 8 bytes longer. But on the bright side, this new layout
can be more cache friendly since cr_alignment address might be already
in the cache line (not that I measured anything, it's just fun to think
about it).

Signed-off-by: Anton Vorontsov <anton.vorontsov@linaro.org>
---
 arch/arm/kernel/entry-armv.S     | 2 --
 arch/arm/kernel/entry-header.S   | 6 +++++-
 arch/arm/kernel/kgdb_fiq_entry.S | 3 ---
 3 files changed, 5 insertions(+), 6 deletions(-)
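
The review objection in this thread (split instruction and data caches gain nothing from a literal placed in the code stream) can be illustrated with a toy model. This is only a sketch under loud assumptions: the 32-byte line size, the addresses, and the names `Cache` and `cold_misses` are all hypothetical, only a single cache level is modelled, and the second load (of the `cr_alignment` value itself) is left out because it is identical in both layouts. It is not a measurement of any real CPU.

```python
from dataclasses import dataclass, field

LINE_SIZE = 32  # bytes per cache line; an assumed value for illustration


@dataclass
class Cache:
    """Toy cache that only tracks which lines it currently holds."""
    lines: set = field(default_factory=set)
    misses: int = 0

    def access(self, addr):
        """Touch addr; count a miss and fill the line if it is absent."""
        line = addr // LINE_SIZE
        if line not in self.lines:
            self.misses += 1
            self.lines.add(line)


def cold_misses(code_addr, n_insns, literal_addr, harvard=True):
    """Count cold misses for one pass through the alignment_trap macro.

    Instruction fetches always go to the I-cache. The PC-relative load of
    the literal goes to the D-cache, which is a separate cache on a
    Harvard design and the same cache on a unified one.
    """
    icache = Cache()
    dcache = Cache() if harvard else icache
    for i in range(n_insns):            # fetch each 4-byte instruction
        icache.access(code_addr + 4 * i)
    dcache.access(literal_addr)         # ldr \rtemp, <literal>
    return icache.misses + (dcache.misses if harvard else 0)


# Old layout: 3 instructions, literal (.LCcralign) far away from the macro.
old_harvard = cold_misses(0x1000, 3, 0x2000, harvard=True)
# New layout: 4 instructions (extra "b 2f"), literal right behind them.
new_harvard = cold_misses(0x1000, 4, 0x1010, harvard=True)

old_unified = cold_misses(0x1000, 3, 0x2000, harvard=False)
new_unified = cold_misses(0x1000, 4, 0x1010, harvard=False)

print(old_harvard, new_harvard, old_unified, new_unified)
```

In this model the inline literal saves a miss only when the cache is unified. With split caches both layouts cost the same number of cold misses, and the new layout still pays for the extra branch.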