How is your garbage collector by the way?

It is good to know how your garbage collector behaves and how much pause time it adds to your application. Following this blog post, I could easily generate GC statistics and convert them into charts to analyze and compare. I’ll keep this short and go through the steps quickly.

Collect Logs

Add some JVM arguments to print GC stats, something like:

-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCApplicationStoppedTime -Xloggc:gc-stats.log

Add JVM arguments to choose your garbage collection algorithm (the last flag, -XX:+PrintFlagsFinal, prints the final value of every JVM flag, so you can check that the JVM has understood your configuration):

-XX:+UseG1GC -XX:+PrintFlagsFinal

to use G1 (garbage first), or

-XX:+UseParallelGC -XX:+UseParallelOldGC -XX:+PrintFlagsFinal

to use the parallel garbage collection algorithm.
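To check what the JVM actually settled on, the -XX:+PrintFlagsFinal output can be scanned for the collector flags. The following is a minimal sketch, assuming the Java 8-style table layout (type, name, `=` or `:=`, value, then a category in braces). The function name `parse_flags` and the sample text are illustrative, not part of any tool.

```python
import re

# One row of -XX:+PrintFlagsFinal output looks roughly like:
#      bool UseG1GC                                   := true                                {product}
# Assumption: this layout (Java 8 style); other JVM versions may differ.
FLAG_LINE = re.compile(r"^\s*(\S+)\s+(\S+)\s+(:?=)\s*(\S*)\s+\{")


def parse_flags(text):
    """Return {flag_name: (value, modified)} parsed from PrintFlagsFinal output."""
    flags = {}
    for line in text.splitlines():
        match = FLAG_LINE.match(line)
        if match:
            _type, name, assign, value = match.groups()
            # ":=" marks a flag changed from its default, either on the
            # command line or by JVM ergonomics; "=" means the default value.
            flags[name] = (value, assign == ":=")
    return flags


# Illustrative sample, not captured from a real run.
sample = """[Global flags]
     bool UseG1GC                                   := true                                {product}
     bool UseParallelGC                              = false                               {product}
"""

for name, (value, modified) in parse_flags(sample).items():
    print(name, value, "(modified)" if modified else "(default)")
```

In practice you would redirect the JVM's standard output to a file and pass its contents to `parse_flags`, then look up `UseG1GC` or `UseParallelGC`.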

Now run your application and collect the GC log file (gc-stats.log if you used the -Xloggc value above).
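Before charting anything, a quick sanity check of the log can be done by hand. The sketch below assumes the log contains the "Total time for which application threads were stopped" lines that -XX:+PrintGCApplicationStoppedTime writes on Java 8. The helper names `stopped_times` and `summarize`, and the sample lines, are made up for illustration.

```python
import re

# Message written by -XX:+PrintGCApplicationStoppedTime (Java 8-style logging).
STOPPED = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)


def stopped_times(lines):
    """Return every stopped-time value (in seconds) found in the given lines."""
    pauses = []
    for line in lines:
        match = STOPPED.search(line)
        if match:
            pauses.append(float(match.group(1)))
    return pauses


def summarize(pauses):
    """Return (count, total_seconds, longest_seconds) for a list of pauses."""
    if not pauses:
        return 0, 0.0, 0.0
    return len(pauses), sum(pauses), max(pauses)


# Illustrative input only; the timestamps and durations are invented.
sample = [
    "1.234: Total time for which application threads were stopped: 0.0150000 seconds",
    "a log line that is not a stopped-time record",
    "2.247: Total time for which application threads were stopped: 0.0050000 seconds",
]

count, total, longest = summarize(stopped_times(sample))
print(f"{count} pauses, {total:.4f}s in total, longest {longest:.4f}s")
```

For a real log, pass the open file instead of the sample list, for example `stopped_times(open("gc-stats.log"))`. Comparing these three numbers between a G1 run and a parallel run gives a rough first impression before looking at charts.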

Install Naarad

Follow the instructions on this page to install Naarad. It is an open-source Python project on GitHub, developed by LinkedIn.

Convert and Chart

Naarad comes with a few example configurations that can be used in a simple command like this to parse GC statistics and generate HTML reports:

./bin/naarad -c examples/conf/config-gc -i logs/g1/ -o output/g1/

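This reads the log file gc_g1.log (the name is specified in the configuration file) from logs/g1 and generates reports in output/g1. The log you collected therefore has to be saved under that name, or the configuration changed to match.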
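To compare collectors, the same command has to be repeated once per log directory. Here is a small sketch that builds those commands, using only the -c, -i and -o options shown above. The `logs/parallel/` directory, the helper names, and the idea that one configuration file fits both runs are assumptions; each directory must contain the log file named in the configuration.

```python
import subprocess


def naarad_command(collector, config="examples/conf/config-gc"):
    """Build the Naarad command line for one collector's log directory."""
    return [
        "./bin/naarad",
        "-c", config,
        "-i", f"logs/{collector}/",    # assumed layout: one directory per collector
        "-o", f"output/{collector}/",
    ]


def run_reports(collectors, dry_run=True):
    """Print the commands (dry run) or execute them from the Naarad checkout."""
    commands = [naarad_command(c) for c in collectors]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # check=True stops at the first failing report
            subprocess.run(command, check=True)
    return commands


# Dry run by default, so nothing is executed until you pass dry_run=False.
run_reports(["g1", "parallel"])
```

Keeping the dry run as the default makes it easy to review the paths before anything is written to the output directories.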

This utility gives me graphs like the one below, which shows GC pauses when using the parallel and G1 algorithms respectively. There are a lot more of these that I’m going to explore.

More

Naarad can also collect system telemetry, including but not limited to CPU and memory usage, which can be very useful and should be easy to set up. I haven’t tried that myself, but more information is provided here if you are interested.