Our columnist, Kirk Pepperdine, details the expected benefits of Sun's new "Garbage First (G1)" garbage collector.
Published August 2008, Author Kirk Pepperdine
If your application is still running on the 1.4 or even 1.5 JVM, a compelling reason to upgrade to Java 6 is just over the horizon. That reason is the Garbage First garbage collector, otherwise known as G1. After I explain how it works, I'll explain why I think this collector is about to change the face of GC.
As revolutionary as I expect the effects of the G1 collector to be, the collector itself is only an evolutionary step away from the generational collectors that we now use. If you are a GC gearhead, then you know that the Java heap is divided into three regions: young, tenured, and perm. You also know that young is further divided into eden and two survivor spaces. When eden fills up, the collector finds all live objects and evacuates them to one of the survivor spaces. Older objects will eventually find their way into tenured space. This division allows us to treat younger objects differently from how we treat older objects. The GC implementors knew that young objects tend to die very quickly. This observation is known as the "weak generational hypothesis".
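For readers who want to see this layout on their own JVM, the following is a minimal sketch that lists the memory pools the running JVM exposes through the standard `java.lang.management` API. The class name `HeapPoolsSketch` and the helper `heapPoolNames` are made up for illustration. The pool names that get printed depend on the JVM version and on which collector is selected, so treat the output as a way to explore rather than as a fixed list.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class: lists the memory pools of the running JVM.
public class HeapPoolsSketch {

    // Returns the names of all pools the JVM classifies as heap memory.
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // getUsage() may return null if the pool is no longer valid.
            MemoryUsage usage = pool.getUsage();
            String used = (usage == null) ? "n/a" : String.valueOf(usage.getUsed());
            System.out.println(pool.getType() + "  " + pool.getName() + "  used=" + used);
        }
    }
}
```

Running it under different collector settings is an easy way to see how the same eden, survivor, and old spaces show up under different names.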
Since most objects die within a few 10s to 100s of microseconds, when eden is "full", only very few of the objects are live. The surviving live objects can quickly be copied to a survivor space, and the entire eden space can then be marked as free. I started calling this Object Harvesting as opposed to Garbage Collection because I feel it more accurately reflects what is going on. Now because we don't collect but only harvest, young generation GC (harvesting) is very cheap. The same evacuation technique can be used for the survivor spaces, making those spaces cheap to manage also. However, there is no other space to evacuate tenured to, and consequently full GCs remain somewhat expensive. I expect that with the G1 collector, this is about to change.
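The harvesting idea can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how HotSpot is implemented: objects are plain Java objects holding references to each other, "eden" is a list, and the names `EvacuationSketch`, `Obj`, and `evacuate` are invented for the example. The point it tries to show is that the work done is proportional to the number of reachable objects, not to the amount of garbage.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Toy model of young-generation "harvesting" (hypothetical names throughout).
public class EvacuationSketch {

    // A toy heap object: nothing but outgoing references.
    static final class Obj {
        final List<Obj> refs = new ArrayList<>();
    }

    // Traces from the roots and "copies" every reachable object to a survivor list.
    // Dead objects are never visited, which is why the cost tracks the survivors.
    static List<Obj> evacuate(List<Obj> roots) {
        List<Obj> survivor = new ArrayList<>();
        Set<Obj> seen = Collections.newSetFromMap(new IdentityHashMap<Obj, Boolean>());
        Deque<Obj> work = new ArrayDeque<>(roots);
        while (!work.isEmpty()) {
            Obj o = work.pop();
            if (seen.add(o)) {          // first visit: copy it and follow its references
                survivor.add(o);
                for (Obj ref : o.refs) {
                    work.push(ref);
                }
            }
        }
        return survivor;
    }

    public static void main(String[] args) {
        // Fill "eden" with 1000 objects, almost all of them unreachable.
        List<Obj> eden = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            eden.add(new Obj());
        }
        // Two roots; one of them references a third object (and back, forming a cycle).
        eden.get(0).refs.add(eden.get(2));
        eden.get(2).refs.add(eden.get(0));
        List<Obj> roots = Arrays.asList(eden.get(0), eden.get(1));

        int allocated = eden.size();
        List<Obj> survivor = evacuate(roots);
        eden.clear(); // the whole space is declared free in one step

        System.out.println("copied " + survivor.size() + " of " + allocated + " objects");
    }
}
```

Note that the garbage is never touched individually: once the survivors have been copied out, the whole space is simply declared empty.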
In the G1 collector, memory will be divided into multiple regions (currently 1 megabyte each). Each of these regions will belong to a generation. Some will be assigned to young, some to survivor, and others to old. The collector will be parallelized (multi-threaded) and will work concurrently with your application. While the activities of the young collection will be much the same as they are today, the biggest change will be in the collection of regions allocated as tenured space.
The first phase of collection is to mark all live objects. In young regions, all live objects will be harvested (copied) to an empty region (survivor). In tenured, the mark phase will be used to calculate the liveness of each region, that is, how much of it is still occupied by live objects. Regions that are 100% clean will be put back on the free list. Regions that have low liveness will be put into a list of candidate regions to be collected (err, harvested). When a young generation collection occurs, the collector will grab a couple of these low-liveness tenured regions and harvest them. In other words, there will no longer be a separate old space collector. This notion is reflected in G1's nickname, the one and a half garbage collector.
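The triage of tenured regions described above can be sketched in a few lines. This is an illustration under stated assumptions, not G1's actual policy: the 30% cut-off (`CANDIDATE_THRESHOLD`), the choice to take the emptiest candidates first, and all class and method names are hypothetical. Only the 1 megabyte region size and the three outcomes (free list, candidate list, left alone) come from the description in this article.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Toy model of sorting tenured regions after the mark phase (hypothetical names).
public class RegionTriageSketch {

    static final int REGION_BYTES = 1 << 20;          // 1 MB regions, as described in the article
    static final double CANDIDATE_THRESHOLD = 0.30;   // assumption: cut-off chosen for illustration

    static final class Region {
        final int id;
        final int liveBytes;   // what the mark phase found to be live in this region
        Region(int id, int liveBytes) {
            this.id = id;
            this.liveBytes = liveBytes;
        }
        double liveFraction() {
            return (double) liveBytes / REGION_BYTES;
        }
    }

    static final class Triage {
        final List<Region> freeList = new ArrayList<>();    // completely empty regions
        final List<Region> candidates = new ArrayList<>();  // mostly garbage, cheap to harvest
        final List<Region> leftAlone = new ArrayList<>();   // mostly live, already compact
    }

    static Triage triage(List<Region> tenured) {
        Triage t = new Triage();
        for (Region r : tenured) {
            if (r.liveBytes == 0) {
                t.freeList.add(r);
            } else if (r.liveFraction() < CANDIDATE_THRESHOLD) {
                t.candidates.add(r);
            } else {
                t.leftAlone.add(r);
            }
        }
        // Assumption: emptiest regions first, so the cheapest ones are harvested first.
        t.candidates.sort(Comparator.comparingDouble(Region::liveFraction));
        return t;
    }

    // The "couple of regions" a young collection would take along.
    static List<Region> pickForYoungCollection(Triage t, int howMany) {
        int n = Math.min(howMany, t.candidates.size());
        return new ArrayList<>(t.candidates.subList(0, n));
    }

    public static void main(String[] args) {
        List<Region> tenured = new ArrayList<>();
        tenured.add(new Region(0, 0));
        tenured.add(new Region(1, 100_000));
        tenured.add(new Region(2, 900_000));
        tenured.add(new Region(3, 200_000));
        tenured.add(new Region(4, 0));

        Triage t = triage(tenured);
        System.out.println("free: " + t.freeList.size()
                + ", candidates: " + t.candidates.size()
                + ", left alone: " + t.leftAlone.size());
        for (Region r : pickForYoungCollection(t, 2)) {
            System.out.println("harvest region " + r.id + " (live bytes: " + r.liveBytes + ")");
        }
    }
}
```

The design choice worth noticing is that mostly-live regions are simply skipped; nothing is gained by moving objects that are already packed tightly.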
Why is this going to be better than what we have now? Currently if we want a low latency collector, we must resort to using the Concurrent Mark Sweep (CMS) collector. While this does a great job of increasing the liveliness of our applications by reducing the stop-the-world nature of GC that we've all grown to know and love, CMS can still create some devastating pause times. It does this because CMS does not compact. A non-compacting collector will eventually leave your heap looking like Swiss cheese, and when it does, it needs to compact. To compact, it must stop all application threads and perform memory-to-memory copying to eliminate these holes, on top of the free list management, and that ain't cheap.
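The Swiss cheese problem is easy to reproduce in a toy model. The sketch below is an illustration only, assuming a heap made of fixed-size slots tracked in a boolean array; the names `SwissCheeseSketch`, `largestHole`, and `compact` are invented for the example, and a real compactor would also have to update every reference to each object it moves. It shows a heap that is half empty and yet cannot satisfy a request for two adjacent slots until it is compacted.

```java
// Toy model of heap fragmentation in a non-compacting collector (hypothetical names).
public class SwissCheeseSketch {

    // Counts all free slots, wherever they are.
    static int totalFree(boolean[] used) {
        int free = 0;
        for (boolean u : used) {
            if (!u) {
                free++;
            }
        }
        return free;
    }

    // Length of the longest run of adjacent free slots.
    static int largestHole(boolean[] used) {
        int best = 0;
        int run = 0;
        for (boolean u : used) {
            if (!u) {
                run++;
                best = Math.max(best, run);
            } else {
                run = 0;
            }
        }
        return best;
    }

    // An allocation needs contiguous space, not just enough free space in total.
    static boolean canAllocate(boolean[] used, int slots) {
        return largestHole(used) >= slots;
    }

    // Slides all used slots to the front, leaving one large hole at the end.
    static boolean[] compact(boolean[] used) {
        int live = used.length - totalFree(used);
        boolean[] compacted = new boolean[used.length];
        for (int i = 0; i < live; i++) {
            compacted[i] = true;
        }
        return compacted;
    }

    public static void main(String[] args) {
        // Every other slot has been freed: plenty of room, none of it adjacent.
        boolean[] heap = new boolean[16];
        for (int i = 0; i < heap.length; i++) {
            heap[i] = (i % 2 == 0);
        }
        System.out.println("free slots: " + totalFree(heap)
                + ", largest hole: " + largestHole(heap)
                + ", can allocate 2: " + canAllocate(heap, 2));

        boolean[] compacted = compact(heap);
        System.out.println("after compaction, largest hole: " + largestHole(compacted)
                + ", can allocate 2: " + canAllocate(compacted, 2));
    }
}
```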
In fact, the cost of copying objects was one of the reasons we didn't have a generational collector for so long. The engineers figured that the cost of copying would far outweigh the benefits. This is evidenced by advice found in Sun's own 1.4 GC tuning guides. They recommended that spaces be sized in such a way as to minimize copying. It is a recommendation that we also offered until I quickly sorted out that doing just the opposite worked much better. As a sidebar, this led to my infamous answer to a question at Sun Tech Days in Johannesburg where, in the Q&A section of my presentation with Dr. Heinz Kabutz, I was asked for references. My answer was, of course, read the Sun GC tuning guide and then do exactly the opposite. But, I digress.
Fortunately, you won't find that advice anymore in any of Sun's documentation, as we've now come to realize that as objects die, there are fewer and fewer of them to copy. Furthermore, if you give the objects longer to die, they'll do just that. Combine that with the fact that declaring a whole space clean is so cheap, and it's just not worth the effort to avoid copying. And this is exactly what G1 looks like it is going to do: give objects just enough time to die before going in to clean out a space.
Pure speculation on my part, but I also suspect that the objects in any one tenured region will most likely have been promoted at the same time. I also suspect that these objects will die at about the same time, so that once a region starts to empty, it will become empty fairly quickly. What we do know, and this is according to Tony Printezis (author of G1), is that many tenured regions are found to be empty during the mark phase and the others tend to have low liveness. This means that a lot of tenured regions can be collected with little effort. It also means that regions with high liveness will be avoided, and that is a good thing also. High liveness implies very little free space, which implies that the region is already compact. That means we will not waste any time moving objects, something we might do if tenured was a single contiguous space. Moreover, waiting to address a tenured region gives its objects more time to die, and with more time one can expect to see fewer and fewer live objects in it, making it even cheaper to collect.
I suppose there will be some pathological cases where this collector will not work well. I can imagine cases where I can cause this scheme to fail. However, it is my opinion that most server-based applications will benefit from this new collector. Tony has been working with some standard benchmarks and beta customers, and so far the results look very promising. The collector won't be generally available for a little while, but it is coming, and all indications are it will be in the 1.6 JVM at some point. Now all you have to do is prepare for the arrival of this new low latency, more efficient collector by making sure your application is running on the 1.6 JVM.