[Gc] providing memory information

Sean Middleditch elanthis at awesomeplay.com
Sat Apr 3 16:11:44 PST 2004


On Fri, 2004-04-02 at 16:07 -0800, Boehm, Hans wrote:

> That's an interesting idea, but I think it's hard to do efficiently and
> completely correctly.

My hacked approximations seem to run more than fast enough for my app's
needs; I can generally check about 10,000,000 strings in half a second
(compiler optimizations turned off).  And that's only needed when I
assign the string pointer to a storage type; just passing them around
uses nothing but the pointer, which is (of course) as fast as it can
possibly get.  That is, of course, only my library's/applications' use
of these facilities.

> 
> GC_in_heap is easy, using GC_base() or GC_find_header().

Right, already doing that.  Just adding a macro, for consistency with
the other functions if they get added, would be nice, perhaps.  Or could
a GC_in_heap be done a little more efficiently than just GC_base()?

> 
> GC_on_stack is problematic, because thread stacks can appear and disappear.
> The collector keeps track of the stack bases, but not the current stack
> pointers.  And it doesn't keep them in a data structure that would allow
> quick lookup by address.

It's not too hard to work out new bounds at all.  I can actually get a
stack base using getcontext(), but it isn't portable, and I'm not sure
how it interacts with threads on all systems.  The current pointer can
be found just by creating a dummy variable and taking its address (in
a function that is never inlined).  I've seen that used on a number of
systems/architectures, so I'm guessing it's fairly portable.
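
A minimal sketch of the dummy-variable trick described above.  It is an
illustration under stated assumptions, not code from the library or from
the collector: stack_init(), on_stack() and the demo function are
hypothetical names, the noinline attribute is GCC/Clang specific, only the
calling thread's stack is considered, and comparing addresses of unrelated
objects as integers is assumed to behave sensibly on the platform.

```c
#include <assert.h>
#include <stdint.h>

#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE /* assumption: the compiler will not inline these */
#endif

/* Hypothetical global: an address near the base of the main thread's stack. */
static uintptr_t stack_base;

/* Call once, early in main(), with the address of one of main()'s locals. */
static void stack_init(const void *addr_of_local_in_main)
{
    stack_base = (uintptr_t)addr_of_local_in_main;
}

/* Returns 1 if p lies between the recorded base and the current frame.
 * Works for either stack growth direction because the bounds are sorted. */
NOINLINE static int on_stack(const void *p)
{
    volatile char marker = 0;             /* dummy variable in this frame */
    uintptr_t here = (uintptr_t)&marker;  /* approximates the stack pointer */
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo = here < stack_base ? here : stack_base;
    uintptr_t hi = here < stack_base ? stack_base : here;

    assert(stack_base != 0);              /* stack_init() must run first */
    return addr >= lo && addr <= hi;
}

/* Demonstration: a buffer in a deeper frame is reported as on-stack. */
NOINLINE static int demo_local_buffer_is_on_stack(void)
{
    char buf[32];
    buf[0] = '\0';
    return on_stack(buf);
}
```

Because the base is only recorded once, this shares the "not thread
friendly" limitation: a buffer on another thread's stack would not be
recognised.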

> 
> GC_is_static() has similar issues with dlopen().  GC_is_static_root() is
> a good approximation, though it can miss recently added dynamic libraries.

Something like "is in rootset" would also work perfectly well, since
that's the real goal; at least for my usage.  Dunno if that would make
it more accurate?

What I'm currently doing, and which seems to work well, although not
ideally and possibly not as portably as I'd like, is:

 find stack base at initialization time (not thread friendly)
 find data segment using sbrk
 if the string is on the stack
   copy it
 else if the string is on the GC heap
   store pointer
 else if the string is above the data segment pointer
   copy it
 else (probably on the static segment)
   store the pointer

It's specifically the data segment part I don't like.  I need that to
detect if a string was allocated in some heap besides the GCd heap.  I'm
pretty sure that'll break real quick outside of the tests I've written
for the library.
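
The decision order listed above can be sketched in C roughly as follows.
This is a hedged illustration, not the library's actual code: mem_layout,
classify_string() and fake_in_gc_heap() are hypothetical names.  In a real
build the in_gc_heap callback would presumably wrap GC_base() and data_end
would come from sbrk(0) at startup; here a stand-in predicate is used so
the sketch compiles without the collector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { STR_COPY, STR_STORE_POINTER } str_action;

/* Hypothetical snapshot of the process layout, filled in at startup. */
typedef struct {
    uintptr_t stack_lo;               /* lowest stack address considered  */
    uintptr_t stack_hi;               /* highest stack address considered */
    uintptr_t data_end;               /* e.g. the value of sbrk(0) at init */
    int (*in_gc_heap)(const void *p); /* e.g. a wrapper around GC_base()  */
} mem_layout;

/* Applies the tests in the same order as the pseudocode. */
static str_action classify_string(const mem_layout *m, const char *s)
{
    uintptr_t a = (uintptr_t)s;

    assert(m != NULL && m->in_gc_heap != NULL);

    if (a >= m->stack_lo && a <= m->stack_hi)
        return STR_COPY;              /* on the stack: lifetime unknown   */
    if (m->in_gc_heap(s))
        return STR_STORE_POINTER;     /* the collector keeps it alive     */
    if (a >= m->data_end)
        return STR_COPY;              /* probably some non-GC heap        */
    return STR_STORE_POINTER;         /* probably the static segment      */
}

/* Stand-in predicate for illustration only: pretends that the GC heap
 * occupies the made-up address range [0x4000, 0x5000). */
static int fake_in_gc_heap(const void *p)
{
    uintptr_t a = (uintptr_t)p;
    return a >= 0x4000 && a < 0x5000;
}
```

The GC heap test has to come before the data segment test, since collected
memory may itself sit above the recorded break.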

If there was a way to check that the pointer was within any of the
roots, that would work as well.  Even if it isn't that efficient, it
would just be used in the event that the quicker, more stable tests
failed.
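
As a rough illustration of such a slow-path check, a linear scan over a
table of root ranges might look like the sketch below.  The root_range
type and in_any_root() are hypothetical; how the table would be obtained
from the collector is left open and is not something this sketch shows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical half-open address range [start, end) of one root segment. */
typedef struct {
    uintptr_t start;
    uintptr_t end;
} root_range;

/* Returns 1 if p falls inside any of the n ranges, 0 otherwise.
 * O(n) per lookup, which is acceptable for a fallback test. */
static int in_any_root(const root_range *roots, size_t n, const void *p)
{
    uintptr_t a = (uintptr_t)p;
    size_t i;

    assert(roots != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (a >= roots[i].start && a < roots[i].end)
            return 1;
    }
    return 0;
}
```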

> 
> Hans
> 
> > -----Original Message-----
> > From: gc-bounces at napali.hpl.hp.com
> > [mailto:gc-bounces at napali.hpl.hp.com]On Behalf Of Sean Middleditch
> > Sent: Friday, April 02, 2004 11:35 AM
> > To: gc at napali.hpl.hp.com
> > Subject: [Gc] providing memory information
> > 
> > 
> > One of the libraries I'm working on currently is an attempt to provide
> > maximum efficiency for handling strings.  Namely, for applications that
> > just store/compare/display a lot of strings from a myriad of sources
> > (string literals, GCd string memory, non-GCd string memory,
> > stack-allocated buffers, etc) and want to avoid excessive unnecessary
> > copying.
> > 
> > I have said library working quite well, with the exception of some
> > fragile platform-specific code.  Code which, I might note, is already
> > used in the GC, of course.  I was hoping that perhaps some simple
> > functions could be added to utilize this information for libraries/apps
> > like mine that try to be intelligent with different memory "sources."
> > 
> > Namely, I'd like to see something like:
> > GC_API int GC_on_stack GC_PROTO((GC_PTR));
> > GC_API int GC_is_static GC_PROTO((GC_PTR));
> > GC_API int GC_in_heap GC_PROTO((GC_PTR));
> > 
> > Each of these would return 1 (true) if the pointer points to the
> > specified memory location (stack, static segment(s), and GC managed
> > heap, respectively), or 0 otherwise.
> > 
> > The general idea is that some apps need to handle objects in each
> > location slightly differently.  For example, objects on the stack or in
> > the non-collected (system malloc() allocated) heap have to be copied vs
> > just passing around a pointer since you can't control the lifetime.
> > (Especially important when the pointer comes from code you can't modify
> > to be more GC friendly.)  Or, put another way, you can simply know you
> > never need to make a copy of data in the static segment or the collected
> > heap, since the information will never disappear under you.
> > -- 
> > Sean Middleditch <elanthis at awesomeplay.com>
> > AwesomePlay Productions, Inc.
> > 
-- 
Sean Middleditch <elanthis at awesomeplay.com>
AwesomePlay Productions, Inc.


