| Message ID | 1406304831.24842.54.camel@kazak.uk.xensource.com |
| --- | --- |
| State | New |
On 07/25/2014 05:13 PM, Ian Campbell wrote:
> On Fri, 2014-07-25 at 17:03 +0100, Ian Campbell wrote:
>> On Fri, 2014-07-25 at 16:48 +0100, Julien Grall wrote:
>>> On 07/25/2014 04:48 PM, Ian Campbell wrote:
>>>> On Fri, 2014-07-25 at 16:42 +0100, Julien Grall wrote:
>>>>> Hi Ian,
>>>>>
>>>>> On 07/25/2014 04:22 PM, Ian Campbell wrote:
>>>>>> bitops, cmpxchg, atomics: Import:
>>>>>>     c32ffce ARM: 7984/1: prefetch: add prefetchw invocations for barriered atomics
>>>>>
>>>>> Compared to Linux, we don't have specific prefetch* helpers. We
>>>>> directly use the compiler builtin ones. Shouldn't we import the
>>>>> ARM-specific helpers to gain in performance?
>>>>
>>>> My binaries are full of pld instructions where I think I would expect
>>>> them, so it seems like the compiler builtin ones are sufficient.
>>>>
>>>> I suspect the Linux define is there to cope with older compilers or
>>>> something.
>>>
>>> If so:
>>
>> The compiled output is very different if I use the arch-specific
>> explicit variants. The explicit variant generates (lots) more pldw and
>> (somewhat) fewer pld. I've no idea what this means...
>
> It's a bit more obvious for aarch64, where gcc 4.8 doesn't generate any
> prefetches at all via the builtins...
>
> Here's what I've got in my tree. I've no idea if we should take some or
> all of it...

I don't think it will be harmful for ARMv7 to use specific prefetch*
helpers.

[..]

> +/*
> + * Prefetching support
> + */
> +#define ARCH_HAS_PREFETCH
> +static inline void prefetch(const void *ptr)
> +{
> +    asm volatile("prfm pldl1keep, %a0\n" : : "p" (ptr));
> +}
> +
> +#define ARCH_HAS_PREFETCHW
> +static inline void prefetchw(const void *ptr)
> +{
> +    asm volatile("prfm pstl1keep, %a0\n" : : "p" (ptr));
> +}
> +
> +#define ARCH_HAS_SPINLOCK_PREFETCH
> +static inline void spin_lock_prefetch(const void *x)
> +{
> +    prefetchw(x);
> +}

Looking at the code, spin_lock_prefetch is never called in the tree. I'm
not sure we should keep this helper.

Regards,
```diff
diff --git a/xen/include/asm-arm/arm32/processor.h b/xen/include/asm-arm/arm32/processor.h
index f41644d..6feacc9 100644
--- a/xen/include/asm-arm/arm32/processor.h
+++ b/xen/include/asm-arm/arm32/processor.h
@@ -119,6 +119,23 @@ struct cpu_user_regs
 #define cpu_has_erratum_766422() \
     (unlikely(current_cpu_data.midr.bits == 0x410fc0f4))
 
+#define ARCH_HAS_PREFETCH
+static inline void prefetch(const void *ptr)
+{
+    __asm__ __volatile__(
+        "pld\t%a0"
+        :: "p" (ptr));
+}
+
+#define ARCH_HAS_PREFETCHW
+static inline void prefetchw(const void *ptr)
+{
+    __asm__ __volatile__(
+        ".arch_extension mp\n"
+        "pldw\t%a0"
+        :: "p" (ptr));
+}
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __ASM_ARM_ARM32_PROCESSOR_H */
diff --git a/xen/include/asm-arm/arm64/processor.h b/xen/include/asm-arm/arm64/processor.h
index 5bf0867..56b1002 100644
--- a/xen/include/asm-arm/arm64/processor.h
+++ b/xen/include/asm-arm/arm64/processor.h
@@ -106,6 +106,28 @@ struct cpu_user_regs
 #define cpu_has_erratum_766422() 0
 
+/*
+ * Prefetching support
+ */
+#define ARCH_HAS_PREFETCH
+static inline void prefetch(const void *ptr)
+{
+    asm volatile("prfm pldl1keep, %a0\n" : : "p" (ptr));
+}
+
+#define ARCH_HAS_PREFETCHW
+static inline void prefetchw(const void *ptr)
+{
+    asm volatile("prfm pstl1keep, %a0\n" : : "p" (ptr));
+}
+
+#define ARCH_HAS_SPINLOCK_PREFETCH
+static inline void spin_lock_prefetch(const void *x)
+{
+    prefetchw(x);
+}
+
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __ASM_ARM_ARM64_PROCESSOR_H */
```