Home Garbage Collection (GC) in JVM
Post
Cancel

Garbage Collection (GC) in JVM

Garbage Collection is a vital component of the JVM, ensuring efficient memory management and optimizing application performance. Understanding the nuances of JVM memory structure and garbage collection mechanisms enables developers to write more efficient Java applications and effectively tune the JVM for optimal performance. As the JVM continues to evolve, so do its garbage collection algorithms, reflecting the ongoing commitment to improving Java performance.


Contents


What is Garbage Collection?

Garbage Collection in the JVM is the process of identifying and discarding objects that are no longer needed by a Java application, thereby reclaiming memory and preventing memory leaks. This automatic memory management feature distinguishes Java from languages where manual memory management is required.

Memory Division in JVM for Garbage Collection

The JVM divides its memory into several regions, each serving a different purpose in the garbage collection process:

  1. Heap Memory: This is where Java objects are dynamically allocated. It’s further divided into:
    • Young Generation: Where new objects are created. It’s subdivided into:
      • Eden Space: Where most new objects are allocated.
      • Survivor Spaces (S0 and S1): Areas where objects are moved from Eden after surviving one garbage collection cycle.
    • Old Generation (Tenured Space): Contains objects that have lived for a longer time in the heap.
    • Permanent Generation (PermGen) or Metaspace: In older versions of Java, this space stores metadata about the classes in the application. In newer Java versions (Java 8 onwards), it’s replaced by Metaspace, which is not a part of the heap and grows automatically.
  2. Non-Heap Memory: This includes Method Area, Stack Memory and other memory required for the JVM to execute like Code Cache.

Why Two Survivor Spaces?

The primary reason for having two Survivor Spaces in the JVM is to efficiently manage and age the objects in the Young Generation, specifically through a process called the “Copying Collection.”

Copying Collection Process

  • Minimizing Fragmentation: In the copying collection process, live objects are moved from one area to another. Having two Survivor Spaces allows the JVM to move objects from one Survivor Space to the other, thereby minimizing fragmentation that would occur if objects were continuously shuffled around in a single space.

  • Aging of Objects: The JVM uses these Survivor Spaces to keep track of object ages. When objects survive a Minor GC (garbage collection in the Young Generation), they are moved from the Eden space to one of the Survivor Spaces. With each subsequent GC cycle, surviving objects are copied back and forth between the two Survivor Spaces. Each move increments the age of the object. When an object reaches a certain age threshold, it is promoted to the Old Generation.

Efficient Memory Management

  • Reduced GC Overhead: By alternating between the two Survivor Spaces, the JVM can clear a whole Survivor Space in each Minor GC, which is more efficient and faster than selectively clearing individual objects.

  • Optimized Use of Space: This technique ensures that at any point in time, one of the Survivor Spaces is empty and can be used to store surviving objects in the next GC cycle. This approach optimizes the use of available memory in the Young Generation.


Imagine a scenario where the Eden space is filled up, triggering a Minor GC. The GC process identifies live objects in Eden and moves them to S0. In the next GC cycle, live objects from both Eden and S0 are moved to S1, and S0 is cleared. This process continues, with each cycle swapping the roles of S0 and S1.

Why Replace PermGen with Metaspace after Java 8?

The Limitations of PermGen

  • Fixed Size: PermGen had a fixed maximum size. If the space was exhausted, it led to a java.lang.OutOfMemoryError: PermGen space error, which was hard to predict and handle.
  • Memory Management Complexity: The size of PermGen needed to be estimated and specified using JVM arguments. Incorrect sizing often resulted in either wasted memory or out-of-memory errors.

Benefits of Metaspace

  • Dynamic Resizing: Metaspace uses native memory (outside the Java heap) for the storage of class metadata and is automatically resized by the JVM. This adaptability reduces the need for tuning the memory space and lowers the risk of out-of-memory errors.
  • Improved Performance: The dynamic nature of Metaspace can lead to better performance, particularly in environments where applications dynamically load and unload a lot of classes.

Garbage Collection in Metaspace

Can Metaspace Be Scanned by GC?

  • Yes: The garbage collector does scan Metaspace but in a different manner compared to the Java heap. The GC can reclaim memory from Metaspace when classes are unloaded.
  • GC Behavior: Unlike the regular heap space, where GC is more frequent and predictable, GC in Metaspace is typically triggered under different scenarios, such as when the space is close to its maximum limit (which is by default unlimited, but can be set using -XX:MaxMetaspaceSize).

Managing Metaspace

  • Monitoring and Management: It’s important to monitor Metaspace usage, especially in applications that dynamically load a lot of classes, to avoid excessive native memory usage.
  • JVM Options: Developers can still use JVM options like -XX:MetaspaceSize and -XX:MaxMetaspaceSize to control the initial size and maximum size of Metaspace.

Metadata contains information about the objects in the heap, including their class definitions, method tables, and related information. Depending on the version of Java, the JVM stores this data in the permanent generation or Metaspace.

The JVM relies on this information to perform tasks like class loading, bytecode verification, and dynamic binding.

How Does Garbage Collection Work in JVM?

Garbage collection in the JVM involves several steps and is managed by the Garbage Collector. The primary processes are:

  1. Marking: The first step is to mark all live objects (objects that are still being used by the application).

  2. Normal Deletion: After marking live objects, the garbage collector removes all the unmarked objects, reclaiming their memory.

  3. Deletion with Compaction: To optimize memory usage and prevent fragmentation, the JVM not only deletes unused objects but may also compact the remaining objects. This process involves moving the objects closer together.

Garbage Collection Phases

  1. Minor GC: This happens in the Young Generation when Eden space fills up. Objects that are still alive are moved to one of the Survivor spaces.

  2. Major/Full GC: This involves cleaning the Old Generation and is more time-consuming. It happens less frequently than Minor GC.

  3. Incremental or Concurrent GC: Modern JVMs (like HotSpot) use algorithms that perform Garbage Collection incrementally or concurrently, reducing pause times and making the process more efficient.

Example

Consider a scenario where a Java application creates many temporary objects. As the Eden space in the Young Generation fills up, a Minor GC will be triggered, and still-alive objects are moved to a Survivor space.

Continuing to fill and trigger Minor GCs, objects that survive multiple cycles are eventually moved to the Old Generation. A Major GC is invoked when the Old Generation needs cleaning, which takes longer and affects application performance more significantly due to the complete pause of the application threads.

Minor GC vs. Full GC

Minor GC

Definition and Role

  • Minor GC refers to the process of garbage collection in the Young Generation (Eden Space and Survivor Spaces) of the JVM heap. It’s triggered when the Eden space becomes full.

Characteristics

  • Speed and Frequency: Minor GC is typically faster and more frequent than Full GC since it deals with a smaller memory area.
  • Object Promotion: Objects that survive Minor GCs are eventually moved (promoted) to the Old Generation after reaching a certain age threshold.
  • Stop-The-World Event: While Minor GC is happening, application threads are paused, though the pause times are usually short.

Example of Minor GC

  • When new objects are continuously created, and the Eden space fills up, the JVM performs a Minor GC, clearing out unused objects from Eden and moving surviving objects to one of the Survivor spaces.

Full GC

Definition and Role

  • Full GC involves cleaning up the entire heap memory, including the Young Generation (Eden and Survivor Spaces) and the Old Generation (Tenured Space), and sometimes the Permanent Generation or Metaspace, depending on the JVM version.

Characteristics

  • Thoroughness: Full GC is more comprehensive, as it looks at the entire heap.
  • Duration and Impact: It usually takes longer and has a more significant impact on application performance due to the larger area of memory being cleaned.
  • Stop-The-World Event: Full GC is also a stop-the-world event, where application threads are paused, potentially leading to noticeable pauses in application performance.

Causes of Full GC

  • Full GCs are often triggered when:
    • The Old Generation becomes full.
    • System.gc() is explicitly called in the code (though it’s generally discouraged).
    • JVM needs to collect the PermGen or Metaspace areas (for metadata like class definitions).

Example of Full GC

  • Consider an application with a large number of long-living objects. As these objects fill up the Old Generation, the JVM triggers a Full GC to reclaim space across the entire heap, including the Old Generation.

Key Differences of Minor GC and Full GC

  1. Scope: Minor GC cleans only the Young Generation, whereas Full GC cleans the entire heap (and sometimes beyond, like the Metaspace).
  2. Frequency and Duration: Minor GCs are more frequent but shorter in duration. Full GCs are less frequent but longer and more disruptive.
  3. Impact on Performance: Minor GCs generally have less impact on performance due to their shorter pause times, while Full GCs can significantly affect performance due to longer pauses.

Tuning Garbage Collection

Garbage Collection tuning is an essential aspect of JVM performance tuning. It involves adjusting the size of the heap and its different regions and choosing the right garbage collector and parameters based on the application’s needs. Some common garbage collectors in HotSpot JVM include:

  • Serial GC: Suitable for small applications with a single-threaded environment.
  • Parallel GC: Ideal for medium to large applications with multi-threading.
  • Concurrent Mark Sweep (CMS) GC: Targets shorter garbage collection pause times.
  • G1 GC: Designed for large heaps, balancing throughput and pause time.

Choosing and tuning a garbage collector depends on application requirements, such as response time constraints and throughput needs.

This post is licensed under CC BY 4.0 by the author.
ip