[Gc] [libatomic_ops] bug with gcc/x86_64/CAS

Boehm, Hans hans.boehm at hp.com
Thu Feb 18 09:59:36 PST 2010

I understand that they've been supported.  The problem is that they haven't always behaved as documented, and the documentation has been imprecise with respect to memory ordering,  in part because the API just doesn't provide enough options.  Part of libatomic_ops charter is to try to deal with those issues.  I think it does so better than __sync, but less well than a full implementation of the C++0x atomics would.



From: John Plevyak [mailto:jplevyak at gmail.com]
Sent: Thursday, February 18, 2010 8:23 AM
To: Boehm, Hans
Cc: Ivan Maidanski; gc at napali.hpl.hp.com; Andrew Haley
Subject: Re: Re[2]: [Gc] [libatomic_ops] bug with gcc/x86_64/CAS

FYI: GCC has supported the __sync_ primitives from 4.1

It looks like the major targets are supported by the end of 4.1.X,
PowerPC, Alpha, ARM, SPARC, MIPS, etc.


On Wed, Feb 17, 2010 at 9:59 PM, Boehm, Hans <hans.boehm at hp.com<mailto:hans.boehm at hp.com>> wrote:

> -----Original Message-----
> From: gc-bounces at napali.hpl.hp.com<mailto:gc-bounces at napali.hpl.hp.com>
> [mailto:gc-bounces at napali.hpl.hp.com<mailto:gc-bounces at napali.hpl.hp.com>] On Behalf Of Ivan Maidanski
> Sent: Wednesday, February 17, 2010 8:45 AM
> To: gc at napali.hpl.hp.com<mailto:gc at napali.hpl.hp.com>
> Cc: Andrew Haley
> Subject: Re[2]: [Gc] [libatomic_ops] bug with gcc/x86_64/CAS
> Hi!
> Andrew Haley <aph at redhat.com<mailto:aph at redhat.com>> wrote:
> > On 02/17/2010 12:17 PM, Patrick MARLIER wrote:
> >
> > > I think I found a bug into libatomic_ops into
> > > AO_compare_and_swap_full function for gcc and x86_64 cpu.
> > >
> > > **** Possible FIX 2: set RAX as earlyclobbered output
> **** AO_INLINE
> > > int AO_compare_and_swap_full(volatile AO_t *addr, AO_t old, AO_t
> > > new_val) { char result; __asm__ __volatile__("lock;
> cmpxchgq %4, %0;
> > > setz %1"
> > > : "=m"(*addr), "=q"(result) , "=&a" (old)
> > > : "m"(*addr), "r" (new_val), "0"(old) : "memory"); return (int)
> > > result; }
> >
> > I think this asm is best, but it's pretty questionable to
> use an asm
> > at all, given that this is a built-in gcc function.
> Andrew -
> 1. could you explain why fix 1 is not so good as fix 2;

If I understand correctly, fix 1 overconstrains the location of result, which might generate worse code.  It also seems to be a less accurate description of what's going on.  I would also go with fix 2.

> 2. thanks for noting that __sync_... built-in funcs exist in GCC but:
> - from which GCC version they are supported?;
> - they are supported for all targets (where applicable) or
> for Intel only?;
> - it would be good if someone send me a draft (at least) code
> using them;
> - as only bug fixes are accepted for gc72, the changes would
> go to the next major release (unless Hans says the opposite).

I think it's fine to use __sync for certain platforms on which they are known to have worked correctly for a while.  Unfortunately, I don't think they are currently a very good generic solution to the problem.  Most importantly for this discussion, they don't always do what they claim.  For example, last I checked __sync_synchronize generated a no-op on X86.  It's unclear to me whether this is fixable, since a fix may significantly slow down a lot of working code.  Some of the Itanium primitives didn't have the right ordering constraints, which is probably fixable.

I think the right long term fix here is to go to the C++0x atomics.  At that point we can hopefully retire libatomic_ops altogether.

But if a particular primitive is known to do the right thing on a platform, as in this case, I don't see a good reason not to use it.  I think that's true for __sync_bool_compare_and_swap on X86.  (Unlike some other architectures, it's hard to get that one wrong on X86, since the machine instruction includes a full fence, which behaves as expected.)

In this case I'm fine with either fix 2 or __sync_bool_compare_and_swap for 7.2, but only in the X86-specific code.


> Bye.
> 2. not clear to me what do you mean by "but it's pretty
> questionable to use an asm at all, given that this is a
> built-in gcc function" - you mean CAS
> _______________________________________________
> Gc mailing list
> Gc at linux.hpl.hp.com<mailto:Gc at linux.hpl.hp.com>
> https://www.hpl.hp.com/hosted/linux/mail-archives/gc/
Gc mailing list
Gc at linux.hpl.hp.com<mailto:Gc at linux.hpl.hp.com>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: https://napali.hpl.hp.com/pipermail/gc/attachments/20100218/b9994a09/attachment.html

More information about the Gc mailing list