[Gc] Re: [libatomic_ops] bug with gcc/x86_64/CAS
patrick.marlier at unine.ch
Thu Feb 18 08:18:05 PST 2010
I think FIX 1 is better because it uses one fewer register.
The cmpxchg instruction can modify the rAX register, so the compiler
needs to reload rAX after it.
With FIX 1, rAX is used for "result", so no other register is needed
for it.
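To make the rAX point concrete, here is a plain C model of what cmpxchg
does. It is only a sketch: it is not atomic, and the name cmpxchg_model
and the pointer standing in for rAX are made up for illustration.

```c
#include <stdint.h>

/* Non-atomic model of "cmpxchg src, mem" (hypothetical helper, for
 * illustration only). *rax stands in for the rAX register.
 * Returns the value the zero flag would have after the instruction. */
static int cmpxchg_model(uint64_t *mem, uint64_t *rax, uint64_t src)
{
    if (*mem == *rax) {
        *mem = src;   /* comparison succeeded: store the new value */
        return 1;     /* ZF = 1 */
    }
    *rax = *mem;      /* comparison failed: rAX is overwritten */
    return 0;         /* ZF = 0 */
}
```

The failure path is the one that matters here: rAX no longer holds the
"old" value, so the asm statement has to tell the compiler that rAX is
written.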
Alternatively, __sync_bool_compare_and_swap is a bit more efficient:
when the result feeds a branch, the setz instruction is not needed
and the compiler can branch directly with jnz/jz. (The __sync_
builtins were introduced in GCC 4.1, if I trust the online
documentation.)
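A minimal sketch of the builtin-based variant, assuming GCC 4.1 or
later. The AO_t typedef and the name cas_full_builtin are stand-ins for
this example, not the libatomic_ops source.

```c
#include <stddef.h>

/* Stand-in for the libatomic_ops word type (assumption for this sketch). */
typedef size_t AO_t;

/* Hypothetical wrapper: returns nonzero if *addr was equal to old and
 * has been replaced by new_val, zero otherwise. */
static int cas_full_builtin(volatile AO_t *addr, AO_t old, AO_t new_val)
{
    /* GCC builtin; documented as a full memory barrier. */
    return __sync_bool_compare_and_swap(addr, old, new_val);
}
```

When the caller writes "if (cas_full_builtin(...))", the compiler is
free to branch on the flags set by cmpxchg without materializing the
result in a register first.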
PS: Note that in FIX 2 you should read "2" (not "0") as the matching
constraint on the last input (I think my previous message was not
sent). With "0", old is tied to the "=m" memory operand, which I guess
is why you get a bad memory constraint.
**** Possible FIX 2: set RAX as an early-clobber output ****
AO_INLINE int
AO_compare_and_swap_full(volatile AO_t *addr,
                         AO_t old, AO_t new_val)
{
  char result;
  /* %2 ("=&a") marks rAX as an early-clobber output; "2" ties the old input to it. */
  __asm__ __volatile__("lock; cmpxchgq %4, %0; setz %1"
                       : "=m"(*addr), "=q"(result), "=&a"(old)
                       : "m"(*addr), "r"(new_val), "2"(old) : "memory");
  return (int) result;
}