Only the Good Die Young

by christineflood

It’s called many things in the literature but we’ll use “generational hypothesis”.  The insight is that most objects only survive for a short while, or a small number of GC cycles.  Conversely any object that has survived several GC cycles will probably survive a lot more.  If your program behaves in this fashion, and a lot of lisp and Java programs did, it makes sense to focus your GC effort on efficiently collecting those young objects.  You can do this by partitioning your heap into multiple generations.  A small frequently collected young generation and a larger less frequently collected old generation.

The astute reader (all of you I’m sure) will be wondering how we can collect the young  generation without having to collect the much larger old generation?  We do this by keeping track of references from the old generation to the young generation.  When we garbage collect the young generation we also have to look at the objects in the old generation that point to the young generation. We can keep a little cheat sheet of the locations of these objects  which is called a card table.  Each card represents an area of memory.  If there is an object in that memory that points to the young generation than that card is marked.  When it is time to perform a young generation garbage collection you only need to scan the areas represented by marked cards, not the entire old generation.  This card table must be kept up to date, and the only way to do this is adding a tax to writes which updates the card table if appropriate.  We call this tax a write barrier.

The first high performance GC’s in the JVM were generational collectors.  ExactVM had a semi-space collector and a mark/compact old generation.  Hotspot had an eden (where objects are allocated), a semi-space collector, and a mark/compact old generation.

Next up will be several approaches to making GC pauses shorter.