Re: [Gc] [libatomic_ops] bug with gcc/x86_64/CAS

Ivan Maidanski ivmai at mail.ru
Wed Feb 17 05:51:12 PST 2010


Hi!
Patrick MARLIER <patrick.marlier at unine.ch> wrote:
> Hi everybody,
> 
> I think I found a bug into libatomic_ops into AO_compare_and_swap_full 
> function for gcc and x86_64 cpu.
> 
> Platform: linux
> Compiler: gcc
> CPU: x86_64
> 
> Example:
> The application want to exchange the content of <var> to 1 if it is 
> equal to 0.
> 
> ; user
> mov $0x1,%rdx
> xor %rax,%rax
> 
> AO_compare_and_swap_full:
> lock cmpxchg %rdx, (var)
> sete %cl
> 
> ; user
> test %cl,%cl
> je AO_compare_and_swap_full
> 
> Problem:
> cmpxchg instruction change the content of RAX register if it fails (set 
> to current value of the memory location).
> In this example, if cmpxchg fails RAX register will not be 0 at the next
> loop.
> 
> Fix:
> The RAX register must be clobbered.
> 
> **** CURRENT ****
> AO_INLINE int
> AO_compare_and_swap_full(volatile AO_t *addr,
> AO_t old, AO_t new_val)
> {
> char result;
> __asm__ __volatile__("lock; cmpxchgq %3, %0; setz %1"
> : "=m"(*addr), "=q"(result)
> : "m"(*addr), "r" (new_val), "a"(old) : "memory");
> return (int) result;
> }
> 
> **** Possible FIX 1: use also RAX as output ****
> AO_INLINE int
> AO_compare_and_swap_full(volatile AO_t *addr,
> AO_t old, AO_t new_val)
> {
> char result;
> __asm__ __volatile__("lock; cmpxchgq %3, %0; setz %1"
> : "=m"(*addr), "=a"(result)
> : "m"(*addr), "r" (new_val), "a"(old) : "memory");
> return (int) result;
> }
> 
> **** Possible FIX 2: set RAX as earlyclobbered output ****
> AO_INLINE int
> AO_compare_and_swap_full(volatile AO_t *addr,
> AO_t old, AO_t new_val)
> {
> char result;
> __asm__ __volatile__("lock; cmpxchgq %4, %0; setz %1"
> : "=m"(*addr), "=q"(result) , "=&a" (old)
> : "m"(*addr), "r" (new_val), "0"(old) : "memory");
> return (int) result;
> }
> 
> I am not an expert so maybe a better solution could be done.
> 
> Patrick Marlier.
> 
> PS: I already posted it into sourceforge.

Hans -
Could you comment this?
(at least, using fix 2 should be ok, IMHO.)
I guess this should be changed for x86 too and for cmpxchg16b. What's about ia64?

Bye.


More information about the Gc mailing list