[Gc] [libatomic_ops] bug with gcc/x86_64/CAS

Boehm, Hans hans.boehm at hp.com
Wed Feb 17 21:59:31 PST 2010


> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com 
> [mailto:gc-bounces at napali.hpl.hp.com] On Behalf Of Ivan Maidanski
> Sent: Wednesday, February 17, 2010 8:45 AM
> To: gc at napali.hpl.hp.com
> Cc: Andrew Haley
> Subject: Re[2]: [Gc] [libatomic_ops] bug with gcc/x86_64/CAS
> Hi!
> Andrew Haley <aph at redhat.com> wrote:
> > On 02/17/2010 12:17 PM, Patrick MARLIER wrote:
> > 
> > > I think I found a bug into libatomic_ops into 
> > > AO_compare_and_swap_full function for gcc and x86_64 cpu.
> > > 
> > > **** Possible FIX 2: set RAX as earlyclobbered output 
> **** AO_INLINE 
> > > int AO_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t 
> > > new_val) { char result; __asm__ __volatile__("lock; 
> cmpxchgq %4, %0; 
> > > setz %1"
> > > : "=m"(*addr), "=q"(result) , "=&a" (old)
> > > : "m"(*addr), "r" (new_val), "0"(old) : "memory"); return (int) 
> > > result; }
> > 
> > I think this asm is best, but it's pretty questionable to 
> use an asm 
> > at all, given that this is a built-in gcc function.
> Andrew -
> 1. could you explain why fix 1 is not so good as fix 2;
If I understand correctly, fix 1 overconstrains the location of result, which might generate worse code.  It also seems to be a less accurate description of what's going on.  I would also go with fix 2.
> 2. thanks for noting that __sync_... built-in funcs exist in GCC but:
> - from which GCC version they are supported?;
> - they are supported for all targets (where applicable) or 
> for Intel only?;
> - it would be good if someone send me a draft (at least) code 
> using them;
> - as only bug fixes are accepted for gc72, the changes would 
> go to the next major release (unless Hans says the opposite).
I think it's fine to use __sync for certain platforms on which they are known to have worked correctly for a while.  Unfortunately, I don't think they are currently a very good generic solution to the problem.  Most importantly for this discussion, they don't always do what they claim.  For example, last I checked __sync_synchronize generated a no-op on X86.  It's unclear to me whether this is fixable, since a fix may significantly slow down a lot of working code.  Some of the Itanium primitives didn't have the right ordering constraints, which is probably fixable.

I think the right long term fix here is to go to the C++0x atomics.  At that point we can hopefully retire libatomic_ops altogether.

But if a particular primitive is known to do the right thing on a platform, as in this case, I don't see a good reason not to use it.  I think that's true for __sync_bool_compare_and_swap on X86.  (Unlike some other architectures, it's hard to get that one wrong on X86, since the machine instruction includes a full fence, which behaves as expected.)

In this case I'm fine with either fix 2 or __sync_bool_compare_and_swap for 7.2, but only in the X86-specific code.

> Bye.
> 2. not clear to me what do you mean by "but it's pretty 
> questionable to use an asm at all, given that this is a 
> built-in gcc function" - you mean CAS 
> _______________________________________________
> Gc mailing list
> Gc at linux.hpl.hp.com
> https://www.hpl.hp.com/hosted/linux/mail-archives/gc/

More information about the Gc mailing list