[Gc] Re: [libatomic_ops] bug with gcc/x86_64/CAS

Patrick Marlier patrick.marlier at unine.ch
Thu Feb 18 08:18:05 PST 2010

I think FIX 1 is better because it uses one register fewer.
The cmpxchg instruction can change the rAX register (on failure it loads 
the current memory value into it), so the compiler needs to reset rAX 
after that.
In the case of FIX 1, rAX is used for "result", so no other register is 
needed for it.
Otherwise, __sync_bool_compare_and_swap is a bit more efficient 
because, when the result is used in a branch, it doesn't need the setz 
instruction (the compiler can emit jnz/jz directly). (The __sync_ 
builtins were introduced in GCC 4.1, if I trust the online documentation.)
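
For illustration, here is a minimal sketch of the builtin-based variant mentioned above. It is not the libatomic_ops code: the AO_t typedef is a stand-in (assumed to be a word-sized unsigned integer), and the names cas_full_builtin and atomic_increment are hypothetical. The second function shows the kind of call site where the result feeds a branch directly, which is where the setz can be avoided.

```c
#include <stddef.h>

/* Stand-in for libatomic_ops' AO_t (assumption: word-sized unsigned). */
typedef size_t AO_t;

/* Hypothetical wrapper: returns nonzero if *addr equaled old_val and
   was atomically replaced by new_val. Uses the GCC __sync builtin. */
static int cas_full_builtin(volatile AO_t *addr, AO_t old_val, AO_t new_val)
{
    return __sync_bool_compare_and_swap(addr, old_val, new_val);
}

/* Hypothetical call site: the CAS result is consumed by a branch,
   so the compiler is free to branch on the flags directly.
   Returns the value seen before the increment. */
static AO_t atomic_increment(volatile AO_t *addr)
{
    AO_t seen;
    do {
        seen = *addr;
    } while (!cas_full_builtin(addr, seen, seen + 1));
    return seen;
}
```

Whether the generated code really drops the setz depends on the compiler version and optimization level, so it is worth checking the assembly output (gcc -O2 -S).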

Patrick Marlier.

PS: Note that in FIX 2, the matching constraint for "old" should read 
"2" (not "0"). (I think my previous message was not sent.) I guess this 
is why you got a bad memory constraint.

**** Possible FIX 2: set RAX as earlyclobbered output ****
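int AO_compare_and_swap_full(volatile AO_t *addr,
                             AO_t old, AO_t new_val)
{
  char result;
  /* %2: rAX holds old on entry and may be overwritten by cmpxchg,
     so it is declared as an earlyclobbered output ("=&a"). */
  __asm__ __volatile__("lock; cmpxchgq %4, %0; setz %1"
                       : "=m"(*addr), "=q"(result), "=&a"(old)
                       : "m"(*addr), "r"(new_val), "2"(old) : "memory");
  return (int) result;
}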
