Message ID | CAAqcGH==r93Adq-3TdUcKMpRz7+gOeD7Tk0giz352UD0LmtxVQ@mail.gmail.com |
---|---|
State | Superseded |
On Thu, Mar 20, 2014 at 12:38:49PM +0200, Riku Voipio <riku.voipio@linaro.org> wrote:
> First part of the patch adds memory fence for aarch64. Strictly, this isn't
> necessary, as the fallback to __atomic_thread_fence (__ATOMIC_SEQ_CST)
> would output the same dmb ish instruction.

Thanks! We indeed plan to move the test for __atomic_thread_fence to earlier in the file, so it would become the default (are there any gcc versions that support aarch64 but not __atomic_thread_fence? I guess not).

However, if aarch64 has some efficient load and store barriers, using them could be advantageous, i.e., if you have efficient implementations for the _ACQUIRE and _RELEASE variants, these could be quite useful.

> Second part makes sure the ancient arm float format doesn't get used.

As a historical sidenote, the ancient arm float format is precisely the reason behind this, btw. :)

However, we do have working code for both cases, and correctness beats speed, especially for these not-so-often used functions. I would prefer a massive #if statement listing every useful architecture out there, and updating that, over a negated one that breaks when double happens to be decimal ieee or somesuch bullshit that, in 20 years, might not be bullshit but the sane default.

It's not as if new arm architectures will be introduced that often.

> Alternative would be to drop all ancient arm float support - anyone still
> using the abi wouldn't be upgrading to new libev.

Well, I am sure old arm isn't the only platform with nonstandard fp (and I suspect future platforms might switch as well), so I would prefer a change that simply adds aarch64 to the list of "sane" architectures.

Look at it this way: we also thought we had picked up all noteworthy architectures with our fences, but we regularly get patches :)
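To make the _ACQUIRE and _RELEASE point concrete, here is a minimal sketch of how the weaker fences could be expressed with the compiler builtins and used to hand a value from one thread to another. It assumes a compiler that provides the `__atomic_*` builtins (gcc 4.7 or later, or clang). The `MY_FENCE_*` macros and the `publish`/`try_consume` helpers are hypothetical names for illustration only, not part of libecb or libev.

```c
/* Sketch only: MY_FENCE_* are illustrative names, not libecb macros. */
/* Assumes the __atomic_* builtins are available (gcc >= 4.7 or clang). */
#define MY_FENCE          __atomic_thread_fence (__ATOMIC_SEQ_CST)
#define MY_FENCE_ACQUIRE  __atomic_thread_fence (__ATOMIC_ACQUIRE)
#define MY_FENCE_RELEASE  __atomic_thread_fence (__ATOMIC_RELEASE)

static int payload; /* plain data, written before the flag is set */
static int ready;   /* flag, only accessed through atomic builtins */

/* writer side: make the payload visible before the flag */
static void
publish (int value)
{
  payload = value;
  MY_FENCE_RELEASE; /* orders the payload store before the flag store */
  __atomic_store_n (&ready, 1, __ATOMIC_RELAXED);
}

/* reader side: returns 1 and fills *out once the flag has been seen */
static int
try_consume (int *out)
{
  if (!__atomic_load_n (&ready, __ATOMIC_RELAXED))
    return 0;

  MY_FENCE_ACQUIRE; /* orders the flag load before the payload load */
  *out = payload;
  return 1;
}
```

Only the release/acquire pair is needed for this pattern. The full `MY_FENCE` is shown for comparison with the sequentially consistent fallback discussed above.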
--- a/ev.c
+++ b/ev.c
@@ -616,6 +616,8 @@
       #define ECB_MEMORY_FENCE_RELEASE __asm__ __volatile__ ("")
     #elif __powerpc__ || __ppc__ || __powerpc64__ || __ppc64__
       #define ECB_MEMORY_FENCE __asm__ __volatile__ ("sync" : : : "memory")
+    #elif __aarch64__
+      #define ECB_MEMORY_FENCE __asm__ __volatile__ ("dmb ish" : : : "memory")
     #elif defined __ARM_ARCH_6__ || defined __ARM_ARCH_6J__ \
        || defined __ARM_ARCH_6K__ || defined __ARM_ARCH_6ZK__
       #define ECB_MEMORY_FENCE __asm__ __volatile__ ("mcr p15,0,%0,c7,c10,5" : : "r" (0) : "memory")
@@ -1043,22 +1045,14 @@
 
 /* basically, everything uses "ieee pure-endian" floating point numbers */
 /* the only noteworthy exception is ancient armle, which uses order 43218765 */
-#if 0 \
-    || __i386 || __i386__ \
-    || __amd64 || __amd64__ || __x86_64 || __x86_64__ \
-    || __powerpc__ || __ppc__ || __powerpc64__ || __ppc64__ \
-    || defined __arm__ && defined __ARM_EABI__ \
-    || defined __s390__ || defined __s390x__ \
-    || defined __mips__ \
-    || defined __alpha__ \
-    || defined __hppa__ \
-    || defined __ia64__ \
-    || defined _M_IX86 || defined _M_AMD64 || defined _M_IA64
-  #define ECB_STDFP 1
-  #include <string.h> /* for memcpy */
-#else
+#if defined(__arm__) && ! (defined(__ARM_EABI__) || defined(__EABI__) \
+  || defined(__VFP_FP__) || defined(_WIN32_WCE) || defined(ANDROID))
   #define ECB_STDFP 0
   #include <math.h> /* for frexp*, ldexp* */
+  #warning "building for ancient arm"
+#else
+  #define ECB_STDFP 1
+  #include <string.h> /* for memcpy */
 #endif
 
 #ifndef ECB_NO_LIBM
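For comparison, the alternative preferred in the reply (keep the positive list and add aarch64 to it) might look roughly like the sketch below. The list is abbreviated, and `MY_STDFP` and `double_is_pure_endian_ieee` are hypothetical names used for illustration; they are not taken from ev.c. The helper is a runtime sanity check that an architecture on the list really stores `double` as IEEE 754 binary64 in the native integer byte order.

```c
#include <stdint.h>
#include <string.h> /* for memcpy */

/* Sketch only: abbreviated allowlist with aarch64 added. */
/* MY_STDFP is an illustrative stand-in, not the real ECB_STDFP. */
#if 0 \
    || defined __i386__ || defined __x86_64__ \
    || defined __aarch64__ \
    || (defined __arm__ && defined __ARM_EABI__)
  #define MY_STDFP 1
#else
  #define MY_STDFP 0 /* unknown platform: take the slow, portable path */
#endif

/* returns 1 if 1.0 has the binary64 bit pattern in native integer order */
static int
double_is_pure_endian_ieee (void)
{
  double d = 1.0;
  uint64_t bits;

  if (sizeof d != sizeof bits)
    return 0;

  memcpy (&bits, &d, sizeof bits);

  /* sign 0, biased exponent 0x3ff, mantissa 0 */
  return bits == UINT64_C (0x3ff0000000000000);
}
```

With a positive list, an unknown or future platform falls back to the slower frexp/ldexp path instead of silently assuming a memory layout, which is the "correctness beats speed" trade-off argued for in the reply. On a platform with swapped 32-bit halves, such as the old ARM format mentioned in the comment, the check above would return 0.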