An introduction to young generation collection

In the post ‘brief introduction to java generation memory’, I explained the roles of the young generation and the old generation. These two generations make up the main heap of a Java application. There is also a third space, the permanent generation, whose size is not affected by the application creating new Java objects. The Java runtime uses it to hold metadata such as class data structures and interned strings. The old generation is also commonly called the tenured generation, and since Java 8 the HotSpot VM has replaced the permanent generation with Metaspace, which is allocated from native memory rather than from the Java heap.

+------+---------------------+---------------------+
| Eden | Survivor Space (S0) | Survivor Space (S1) |
+------+---------------------+---------------------+

The young generation is laid out as shown above: an Eden space, where new objects are allocated, and two survivor spaces, S0 and S1. At any time one survivor space is in use and the other is empty.

During a minor garbage collection, live objects in the Eden space are copied to the empty survivor space. At the same time, live objects in the in-use survivor space that have not yet survived a threshold number of minor collections are also copied to the empty survivor space, while those that have reached the threshold are copied to the tenured generation. After the collection, the Eden space is empty, the previously in-use survivor space is empty, and the previously empty survivor space is in use. When the next minor collection occurs, the Eden space is emptied again and the two survivor spaces swap roles. In short, after every minor garbage collection the Eden space and one of the survivor spaces are empty. Because it works by copying live objects out, this kind of young generation collection is called a copying collection.
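The copy-and-swap steps above can be sketched as a small toy model. This is only an illustration under stated assumptions, not how HotSpot is implemented: the class `YoungGenSketch`, the `Obj` type, its `live` flag (a stand-in for reachability) and the threshold value of 2 are all made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a copying young generation collection (illustrative only).
class YoungGenSketch {
    // Hypothetical object model: a liveness flag and an age counter.
    static final class Obj {
        final String name;
        boolean live; // stand-in for "reachable"; a real collector finds this by tracing
        int age;      // number of minor collections survived so far
        Obj(String name, boolean live) { this.name = name; this.live = live; }
    }

    static final int TENURING_THRESHOLD = 2; // assumed value for this sketch

    final List<Obj> eden = new ArrayList<>();
    List<Obj> fromSurvivor = new ArrayList<>(); // the in-use survivor space
    List<Obj> toSurvivor = new ArrayList<>();   // the empty survivor space
    final List<Obj> tenured = new ArrayList<>();

    void minorCollect() {
        // Live Eden objects are copied to the empty survivor space.
        for (Obj o : eden) {
            if (o.live) {
                o.age = 1;
                toSurvivor.add(o);
            }
        }
        // Live survivors are either copied across or promoted to tenured.
        for (Obj o : fromSurvivor) {
            if (!o.live) continue;
            if (o.age >= TENURING_THRESHOLD) {
                tenured.add(o);
            } else {
                o.age++;
                toSurvivor.add(o);
            }
        }
        // Everything left behind is garbage, so both spaces become empty.
        eden.clear();
        fromSurvivor.clear();
        // The two survivor spaces swap roles.
        List<Obj> tmp = fromSurvivor;
        fromSurvivor = toSurvivor;
        toSurvivor = tmp;
    }

    public static void main(String[] args) {
        YoungGenSketch heap = new YoungGenSketch();
        heap.eden.add(new Obj("a", true));
        heap.eden.add(new Obj("garbage", false));
        for (int i = 1; i <= 3; i++) {
            heap.minorCollect();
            System.out.println("after collection " + i
                    + ": eden=" + heap.eden.size()
                    + " survivor(in use)=" + heap.fromSurvivor.size()
                    + " survivor(empty)=" + heap.toSurvivor.size()
                    + " tenured=" + heap.tenured.size());
        }
    }
}
```

In this model an object that stays live is copied between the survivor spaces twice and is promoted on its third collection, while Eden and one survivor space are empty after every call to `minorCollect()`.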

Because new Java objects are created in the young generation as the application runs, and most of them become unreachable quickly, it is desirable for young generation collections to occur frequently so that the unused objects are reclaimed. A young generation collection usually stops the execution of the application (this is the case for Serial GC, Parallel GC and Concurrent Mark-Sweep GC), so to minimize its impact on application performance the young generation collector has to be fast.
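One way to observe how often collections run and how long they take is the standard `java.lang.management` API. The following is a minimal sketch: the class name `GcCountSketch`, the `sink` field and the loop size are arbitrary choices for the example, and the collector names and counts it prints depend on the JVM and the collector in use.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCountSketch {
    // Writing to a volatile field keeps the JIT from optimizing the allocations away.
    static volatile Object sink;

    // Sums the collection counts reported by all collectors of this JVM.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        // Allocate many short-lived objects; each one is unreachable after the next iteration.
        for (int i = 0; i < 2_000_000; i++) {
            sink = new byte[256];
        }
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: %d collections, %d ms%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        System.out.println("collections during the loop: " + (totalCollections() - before));
    }
}
```

With a generational collector, the bean that corresponds to the young generation will typically show many short collections for an allocation pattern like this one.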

The first step of a young generation collection is to identify the live objects in the young generation. The process starts from the object references on the stack and in the static fields of the loaded classes. If one of these references points to an object residing in the young generation, that object is marked live. If a reference field of this live object refers to another object in the young generation, that object is marked live as well, and so on. Because the young generation is relatively small, live objects of this kind are cheap to find. But what about live objects in the young generation that are referenced only by live objects in the tenured generation? Because the tenured generation is usually much larger, scanning all of it during every young generation collection would take too long and defeat the purpose of a fast collection.
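The tracing step, and the gap it leaves, can be sketched with a small object graph. This is an illustration only: `YoungMarkSketch`, `Node` and `markYoung` are hypothetical names, and a real collector works on raw memory rather than on Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class YoungMarkSketch {
    // Hypothetical heap object: knows which generation it lives in and what it references.
    static final class Node {
        final String name;
        final boolean inYoung;
        final List<Node> refs = new ArrayList<>();
        Node(String name, boolean inYoung) { this.name = name; this.inYoung = inYoung; }
    }

    // Marks young objects reachable from the roots, following only references
    // that stay inside the young generation. Tenured objects are not scanned.
    static Set<Node> markYoung(List<Node> roots) {
        Set<Node> live = new HashSet<>();
        Deque<Node> work = new ArrayDeque<>();
        for (Node root : roots) {
            if (root.inYoung && live.add(root)) work.push(root);
        }
        while (!work.isEmpty()) {
            Node current = work.pop();
            for (Node child : current.refs) {
                if (child.inYoung && live.add(child)) work.push(child);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        Node a = new Node("a", true);
        Node b = new Node("b", true);
        Node old = new Node("old", false);
        Node c = new Node("c", true);
        a.refs.add(b);   // young -> young
        old.refs.add(c); // tenured -> young
        Set<Node> live = markYoung(List.of(a, old));
        for (Node n : List.of(a, b, c)) {
            System.out.println(n.name + " marked live: " + live.contains(n));
        }
    }
}
```

Here `a` and `b` are found, but `c` is missed even though it is live, because the only reference to it sits in a tenured object. That is exactly the case the collector needs extra bookkeeping for.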

The Java HotSpot VM uses a technique called a card table to solve this issue. The tenured generation is divided into chunks of 512 bytes each, called cards, and there is a card table with one entry for each card. Every time a reference field of an object in the tenured generation is assigned a reference to an object in the young generation, the card table entry for the card containing the updated object is marked dirty. During a young generation collection, only the cards in the tenured generation whose card table entries are dirty are scanned for references to live objects in the young generation. This greatly reduces the amount of memory that needs to be scanned, which shortens the young generation collection.
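The card table idea can be sketched in a few lines. This is a simplified model under stated assumptions: addresses are plain `long` values, the `CLEAN`/`DIRTY` encoding and the names `CardTableSketch` and `recordReferenceStore` are invented for the example, and only the 512-byte card size comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;

class CardTableSketch {
    static final int CARD_SHIFT = 9; // 2^9 = 512 bytes per card
    static final byte CLEAN = 0;     // assumed encoding for this sketch
    static final byte DIRTY = 1;

    final long tenuredStart; // hypothetical start address of the tenured generation
    final byte[] cards;      // one entry per 512-byte card

    CardTableSketch(long tenuredStart, long tenuredSizeBytes) {
        this.tenuredStart = tenuredStart;
        // Round up so that a partially filled last card still gets an entry.
        this.cards = new byte[(int) ((tenuredSizeBytes + (1 << CARD_SHIFT) - 1) >>> CARD_SHIFT)];
    }

    // Maps an address inside the tenured generation to its card table index.
    int cardIndex(long address) {
        return (int) ((address - tenuredStart) >>> CARD_SHIFT);
    }

    // Plays the role of the write barrier: called when a reference field at
    // fieldAddress in the tenured generation is updated to point into the young generation.
    void recordReferenceStore(long fieldAddress) {
        cards[cardIndex(fieldAddress)] = DIRTY;
    }

    // Cards a young generation collection would have to scan.
    List<Integer> dirtyCards() {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < cards.length; i++) {
            if (cards[i] == DIRTY) result.add(i);
        }
        return result;
    }

    // After the scan the entries can be reset.
    void clear() {
        java.util.Arrays.fill(cards, CLEAN);
    }

    public static void main(String[] args) {
        CardTableSketch table = new CardTableSketch(0x1000L, 4096); // 8 cards
        table.recordReferenceStore(0x1000L + 513);  // falls into card 1
        table.recordReferenceStore(0x1000L + 4095); // falls into card 7
        System.out.println("cards to scan: " + table.dirtyCards()
                + " out of " + table.cards.length);
    }
}
```

The lookup is a subtraction and a shift, so marking a card costs very little on each reference store, and the collector only has to visit the 512-byte regions recorded as dirty instead of the whole tenured generation.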