[GC] Porting boehm-gc to RTEMS for GCJ

Boehm, Hans hans.boehm at hp.com
Wed Jul 13 17:47:09 PDT 2011



> -----Original Message-----
> From: Joel Sherrill [mailto:joel.sherrill at gmail.com]
> Sent: Wednesday, July 13, 2011 1:53 PM
> To: Boehm, Hans
> Cc: Jie Liu; gc at linux.hpl.hp.com
> Subject: Re: [GC] Porting boehm-gc to RTEMS for GCJ
> 
> On Wed, Jul 13, 2011 at 2:51 PM, Boehm, Hans <hans.boehm at hp.com> wrote:
> >> From: Joel Sherrill
> >>
> >> On Wed, Jul 13, 2011 at 9:03 AM, Jie Liu <lj8175 at gmail.com> wrote:
> >> > Hi,
> >> >
> >> > If I don't port "Thread support"[1] for RTEMS operating system
> while
> >> > porting GCJ to it, can I run multiple threads that are created in Java
> ?
> >>
> >> RTEMS has the non-POSIX task suspend and resume.  These
> >> add an additional blocking state to a thread's state.  They are
> >> very lightweight.  Would these be suitable to implement the
> following?
> >>
> >> GC_stop_world()
> >>     Stops all threads which may access the garbage collected heap,
> >> other than the caller.
> >> GC_start_world()
> >>     Restart other threads.
> > It depends.  Is it possible to retrieve the register state of the
> suspended threads?  The GC needs these to find the stack pointer, and
> pointers residing in machine registers.
> 
> The method of blocking is independent of the answer to this question.
> 
> Yes, the stack pointer is easy to get.
> 
> Other pointers are a bit more complicated.  The RTEMS thread context
> area only contains registers which are preserved across subroutine
> calls.  So on x86, it does not contain eax, ecx and edx.  Context
> switch
> is a subroutine call and caller can assume those are clobbered.
Can context switches be preemptive, or is this purely a cooperative threading system?  In the former case, it seems to me that all registers need to be preserved somewhere.  And the collector needs to see them, since they may contain the only copy of a pointer.  In the cooperative case, we don't care about the clobbered registers, since they can't contain anything interesting.
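
A minimal sketch of that distinction in C.  Everything here is hypothetical: struct toy_context, its fields and toy_collect_register_roots are illustrative names, not RTEMS or collector types.  It only shows which saved registers would have to be treated as possible roots in each case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical register counts, using 32-bit x86 as the example. */
#define N_CALLEE_SAVED 4   /* e.g. ebx, esi, edi, ebp */
#define N_CALLER_SAVED 3   /* e.g. eax, ecx, edx */

/* Hypothetical saved context of one suspended thread. */
struct toy_context {
    uintptr_t sp;                            /* saved stack pointer */
    uintptr_t callee_saved[N_CALLEE_SAVED];  /* saved by every context switch */
    uintptr_t caller_saved[N_CALLER_SAVED];  /* only meaningful after preemption */
    int preempted;                           /* nonzero if switched out preemptively */
};

/* Copy every register value that might hold a heap pointer into out[],
 * writing at most cap entries.  Returns the number of values written. */
size_t toy_collect_register_roots(const struct toy_context *ctx,
                                  uintptr_t *out, size_t cap)
{
    size_t n = 0;

    /* Preserved across calls, so they may hold live pointers in both cases. */
    for (size_t i = 0; i < N_CALLEE_SAVED && n < cap; i++)
        out[n++] = ctx->callee_saved[i];

    /* After a cooperative switch the caller already assumes these are
     * clobbered; after preemption they may hold the only copy of a pointer. */
    if (ctx->preempted) {
        for (size_t i = 0; i < N_CALLER_SAVED && n < cap; i++)
            out[n++] = ctx->caller_saved[i];
    }
    return n;
}
```

The saved stack pointer is kept separately because it bounds the in-use part of the stack rather than being a root itself.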

> 
> > You may run into other issues with suspend.  The usual problem is
> that it's possible to suspend a thread that's holding a critical
> resource needed for the system to function properly.  We've had some
> issues along these lines with the signal-based approach as well, and
> you may run into slightly different ones with explicit thread
> suspension.
> 
> OK.  Then the port should stick with the standard POSIX way
> of doing things.  That code is well tested anyway which does
> have its advantages. :)
Indeed, if that works.  That code does rely on a bit more than POSIX.  But it seems to work on POSIX platforms, aside from some known issues with pthread_cancel, which I believe gcj does not rely on.

> 
> >>
> >> > I ask this question because an RTEMS GCJ program with multiple
> >> > threads but no memory allocation in the threads runs
> >> > successfully. If there is memory allocation in the threads, the
> >> > program may fail, e.g. new char[660] PASSes but new char[680] and
> >> > larger will FAIL in a new thread. The error shows up as a wrong
> >> > jump address such as 0xFF0720FF, or as the program hanging with a
> >> > stack error.
> >>
> >> I personally don't understand the memory layout requirements in
> >> general terms for GC.
> > Could you be more specific?
> 
> How many areas of memory does GC manage?  Special management
> on top of malloc()? Allocation from stack like alloca()?
> 
> RTEMS is logically a single process in an unprotected flat address
> space
> with multiple threads.  Without VM, it is important to understand where
> things physically are located.
The GC has to be able to find all pieces of the address space (other than the GC heap, which it knows about anyway) that may contain pointers into the GC heap.  The entire static data section may be a suitable conservative overestimate of that.
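
As a rough illustration of what "may contain pointers into the GC heap" means, here is a small conservative scan in C.  toy_scan_range is a hypothetical helper, not the collector's actual marking code; it assumes the root region is an array of aligned words and the heap is a single contiguous range.

```c
#include <stddef.h>
#include <stdint.h>

/* Count the words in the root region [lo, hi) whose value, read as an
 * address, falls inside the heap range [heap_lo, heap_hi).  A real
 * collector would mark the referenced object instead of counting. */
size_t toy_scan_range(const uintptr_t *lo, const uintptr_t *hi,
                      uintptr_t heap_lo, uintptr_t heap_hi)
{
    size_t hits = 0;

    for (const uintptr_t *p = lo; p < hi; p++) {
        /* Conservative: any bit pattern that looks like a heap address
         * is treated as a pointer, even if it is really an integer. */
        if (*p >= heap_lo && *p < heap_hi)
            hits++;
    }
    return hits;
}
```

Passing the whole static data section as [lo, hi) is the overestimate mentioned above: it costs some scanning time and may retain a little garbage, but it cannot miss a root stored there.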

> 
> >> RTEMS does not have a main so there
> >> is no main stack.
> > Presumably there is still a stack associated with the original
> thread?
> 
> Yes.  As long as it doesn't exit.
I think that should be OK.  You probably want to consider the first thread that invokes the GC as the main thread.  Any further threads that access the GC heap then need to be registered with the GC, e.g. by intercepting the thread creation calls.
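
The bookkeeping this implies could look roughly like the C sketch below.  The table, the integer thread ids and the toy_* functions are all hypothetical and are not the collector's own registration interface; the sketch is not thread-safe and only shows the "first caller is main, everyone else registers" idea.

```c
#include <stddef.h>

#define TOY_MAX_THREADS 8

/* One entry per thread that may touch the GC heap. */
struct toy_thread {
    int   id;          /* hypothetical thread identifier */
    void *stack_base;  /* where stack scanning for this thread would start */
};

static struct toy_thread toy_threads[TOY_MAX_THREADS];
static size_t toy_thread_count = 0;

/* Register a thread.  The first thread to call this lands in slot 0 and
 * is treated as the "main" thread.  Returns the slot index, or -1 if the
 * table is full.  Registering the same id twice returns the old slot. */
int toy_register_thread(int id, void *stack_base)
{
    for (size_t i = 0; i < toy_thread_count; i++) {
        if (toy_threads[i].id == id)
            return (int)i;
    }
    if (toy_thread_count == TOY_MAX_THREADS)
        return -1;
    toy_threads[toy_thread_count].id = id;
    toy_threads[toy_thread_count].stack_base = stack_base;
    return (int)toy_thread_count++;
}

/* Nonzero if id is the thread that first invoked the registration. */
int toy_is_main_thread(int id)
{
    return toy_thread_count > 0 && toy_threads[0].id == id;
}
```

A wrapper around the platform's thread creation call would invoke toy_register_thread from the new thread before it first allocates.
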
> 
> Any guidance on sizing what we refer to as the initialization task's
> stack?  I know for C it can usually be fairly small for many
> applications
> but for Ada it can need to be many times larger.  What factors go
> into determining the maximum size to reserve?
Unfortunately, currently the collector can run on any thread, and it potentially needs a fair amount of stack space.  I think we currently try to insist on at least 65K or 128K on some other platforms.
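
One way to apply that when sizing stacks is to treat the figure as a floor, as in the C sketch below.  TOY_MIN_GC_STACK and toy_stack_size are hypothetical names, and 128K is simply the larger of the two figures above taken as an assumption, not a documented requirement for RTEMS.

```c
#include <stddef.h>

/* Assumed floor for any thread the collector might run on. */
#define TOY_MIN_GC_STACK ((size_t)128 * 1024)

/* Return the stack size to reserve for a thread whose own code needs
 * app_needs bytes: never less than the assumed collector floor. */
size_t toy_stack_size(size_t app_needs)
{
    return app_needs < TOY_MIN_GC_STACK ? TOY_MIN_GC_STACK : app_needs;
}
```

The result would then be handed to whatever sets the stack size at thread creation, for example pthread_attr_setstacksize on a POSIX system.
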
> 
> >
> >> Thread stack sizes are fixed and don't grow.
> > That doesn't really matter to the collector, since you don't want the
> collector to trace from parts of the stack region that are not
> currently in use.  As far as the collector is concerned, it's important
> to be able to find the "in use" stack regions.
> 
> OK.  I think that is doable since you can easily get to the stack
> of any thread.
> 
> >
> >>
> >> What is the relationship between the various types of memory?
> > Stacks and static data are allocated by the
> underlying system, and the collector has to be able to find those
> regions (GC_register_data_segments, GC_register_dynamic_libraries,
> GC_push_all_stacks, GC_push_current_stack).  Heap memory is allocated
> using GET_MEM, and then tracked internally (and portably) by the
> collector. GET_MEM can be defined in a platform-specific way.
> 
> By static data, you mean the .data and .sdata segments?
And usually .bss.  On many systems there may be many of each, due to dynamic libraries.
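
Because there may be several such regions, a port typically ends up keeping a list of them.  The C sketch below shows one possible shape; toy_add_roots and the fixed-size table are hypothetical and are not the collector's real root-registration code.  How the bounds of .data, .sdata and .bss are obtained (linker symbols, loader tables) is platform-specific and left out.

```c
#include <stddef.h>

#define TOY_MAX_SEGMENTS 16

/* One static data region, as the half-open byte range [lo, hi). */
struct toy_segment {
    const char *lo;
    const char *hi;
};

static struct toy_segment toy_segments[TOY_MAX_SEGMENTS];
static size_t toy_segment_count = 0;

/* Record one region to be scanned as roots.  Returns 1 on success,
 * 0 if the range is empty or the table is full. */
int toy_add_roots(const void *lo, const void *hi)
{
    const char *l = (const char *)lo;
    const char *h = (const char *)hi;

    if (l >= h || toy_segment_count == TOY_MAX_SEGMENTS)
        return 0;
    toy_segments[toy_segment_count].lo = l;
    toy_segments[toy_segment_count].hi = h;
    toy_segment_count++;
    return 1;
}

/* Total number of bytes currently registered as roots. */
size_t toy_total_root_bytes(void)
{
    size_t total = 0;

    for (size_t i = 0; i < toy_segment_count; i++)
        total += (size_t)(toy_segments[i].hi - toy_segments[i].lo);
    return total;
}
```

On a statically linked RTEMS image the list might hold just one entry per section; with dynamic libraries each loaded library would add its own.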

Hans

> 
> And stacks are obviously known by threading system.
> 
> Sorry to ask such fundamental questions.  GC has a lot of target
> architectures
> and RTEMS has even more.  I want to make sure we are making decisions
> that are very broad so the next architecture supported is easy.  And
> when
> something is broken in the RTEMS port, I have a clue as to what might
> be
> broken.
> 
> Just give us pointers and we will get there. :)
> 
> Thanks.
> 
> --joel
> 
> >
> > Hans
> >> Where does it come from?
> >>
> >> > [1]http://www.hpl.hp.com/personal/Hans_Boehm/gc/porting.html
> >> >
> >> > Thanks,
> >> > Jie
> >>
> >> And thanks again.
> >>
> >> --joel sherrill
> >>
> >> _______________________________________________
> >> Gc mailing list
> >> Gc at linux.hpl.hp.com
> >> http://www.hpl.hp.com/hosted/linux/mail-archives/gc/
> >


