Monitoring your Tomcat application server, or any other Java process, using JMX and OpenNMS

This post will show you how to monitor a Tomcat application server using JMX MBeans and a network management application. The first part will guide you through securely enabling JMX management on the Java process. Once you have done this, you can use tools like VisualVM to monitor your process and its MBeans. If you want to go a step further and have this data gathered all the time, you might want to continue with the second part and integrate it with OpenNMS.

Requirements

It is assumed that you have Tomcat running on the server you want to manage, and the Sun Java 7 SDK (not just the JRE) on the machine you will manage it from. Both can be the same machine, but we keep them separate here. You can apply this to any other Java process with little modification (mostly to the MBean names).

Enable JMX on Server

Add a few Java arguments to your CATALINA_OPTS to enable remote connection to JMX, and then restart the application server:

export CATALINA_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.port=1100 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=true -Dcom.sun.management.jmxremote.password.file=/jre/lib/management/jmxremote.password"

Adjusting the paths to match your Java installation, copy /jre/lib/management/jmxremote.password.template into /jre/lib/management/jmxremote.password and set the passwords for monitorRole and controlRole at the end of the file as you wish. We will assume you have set them to MONITOR_PASS and CONTROL_PASS here.

Make sure that your firewall doesn’t block port 1100. You may change the port number as you wish, but remember to change it everywhere mentioned in this guide.
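A quick way to check the firewall rule from the monitoring machine is to try opening a TCP connection to the port. The sketch below is only an illustration: the class name PortProbe and the 3 second timeout are made up for this example, and it only probes the port configured above (RMI may use an additional port for the actual connection, which this check does not cover).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper, not part of Tomcat or OpenNMS.
class PortProbe {
    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        Socket socket = new Socket();
        try {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, dropped by a firewall, or timed out.
            return false;
        } finally {
            try {
                socket.close();
            } catch (IOException ignored) {
                // Nothing useful to do if closing fails.
            }
        }
    }

    public static void main(String[] args) {
        // Defaults are assumptions: localhost and the port used in this guide.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 1100;
        System.out.println(host + ":" + port + " reachable: " + isReachable(host, port, 3000));
    }
}
```

Run it with the Tomcat host name and port as arguments; a false result usually means the port is filtered or the JVM was not started with the JMX options.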

Now run a JMX utility such as JConsole or VisualVM to connect to your JMX server. If using JConsole, connect to a remote host providing the hostname, port (1100), username (controlRole), and password (CONTROL_PASS). If using VisualVM with the VisualVM-MBeans plugin installed, you can browse the MBeans and call some operations.

This might be enough if you want to see what’s going on in the JVM at the moment. You can even double click the numbers (only the ones highlighted) to see a graph over time.

But if you want to keep historical records of different things over time, keep reading. We will demonstrate how to use network management software to connect to JMX and gather information. This will be based on an open source application called OpenNMS, but you can probably use your own software so long as it supports JSR 160.
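To see what a JSR 160 client does under the hood, here is a minimal sketch using the standard javax.management.remote API. It assumes the port and credentials configured earlier; the class name JmxHeapProbe and its method names are invented for this illustration. It reads the HeapMemoryUsage attribute of the java.lang:type=Memory MBean, either from a remote JVM or, when started without arguments, from its own JVM.

```java
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Hypothetical example class; only the javax.management calls are real API.
class JmxHeapProbe {
    /** Builds the RMI connector URL for a JVM started with jmxremote.port. */
    static String serviceUrl(String host, int port) {
        return "service:jmx:rmi:///jndi/rmi://" + host + ":" + port + "/jmxrmi";
    }

    /** Reads the "used" field of HeapMemoryUsage, in bytes. */
    static long usedHeapBytes(MBeanServerConnection connection) throws Exception {
        ObjectName memory = new ObjectName("java.lang:type=Memory");
        CompositeData usage = (CompositeData) connection.getAttribute(memory, "HeapMemoryUsage");
        return (Long) usage.get("used");
    }

    /** Connects with username/password authentication and reads the used heap. */
    static long remoteUsedHeapBytes(String host, int port, String user, String password)
            throws Exception {
        Map<String, Object> env = new HashMap<String, Object>();
        env.put(JMXConnector.CREDENTIALS, new String[] {user, password});
        JMXConnector connector =
                JMXConnectorFactory.connect(new JMXServiceURL(serviceUrl(host, port)), env);
        try {
            return usedHeapBytes(connector.getMBeanServerConnection());
        } finally {
            connector.close();
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 4) {
            // Expected arguments: host port username password
            System.out.println(remoteUsedHeapBytes(
                    args[0], Integer.parseInt(args[1]), args[2], args[3]));
        } else {
            // No arguments: query this JVM's own platform MBean server.
            System.out.println(usedHeapBytes(ManagementFactory.getPlatformMBeanServer()));
        }
    }
}
```

For the setup in this guide the arguments would be the Tomcat host, 1100, monitorRole, and MONITOR_PASS, since reading attributes should not need the control role.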

Install OpenNMS

This document provides instructions for Fedora 16. Please follow the OpenNMS installation guide for other operating systems.

Add OpenNMS repository.

rpm -Uvh http://yum.opennms.org/repofiles/opennms-repo-stable-fc16.noarch.rpm

Install, init, and start PostgreSQL server, if you don’t have one.

yum -y install postgresql postgresql-server
/sbin/service postgresql initdb
/sbin/service postgresql start
/sbin/chkconfig postgresql on

To allow OpenNMS, which runs as root, to connect to PostgreSQL as opennms, you need to relax some access requirements. Edit /var/lib/pgsql/data/pg_hba.conf and make sure you have entries like this (change the last column):

local   all         all                               trust
host    all         all         127.0.0.1/32          trust
host    all         all         ::1/128               trust

Now restart PostgreSQL for these changes to take effect:

/sbin/service postgresql restart

Now you are ready to install OpenNMS:

yum -y install opennms
/opt/opennms/bin/runjava -S /usr/java/latest/bin/java
/opt/opennms/bin/install -dis
/sbin/service opennms start

Discover and Configure

If you have followed the instructions in the previous section, you can now access the OpenNMS web interface at http://:8980/opennms/. Enter admin for both username and password when prompted. Just remember to open port 8980 in your firewall for the hosts you will be accessing the web interface from.

The easiest way to set up OpenNMS to monitor a JMX service is to hijack the configuration it uses to monitor its own JMX interface. The only things we need to change are the port number, username, password, and a few names that are different in Tomcat. Please see this wiki page for an explanation.

Please modify the configuration files listed below so that they match what is provided here.

/opt/opennms/etc/capsd-configuration.xml

<protocol-plugin protocol="OpenNMS-JVM" class-name="org.opennms.netmgt.capsd.plugins.Jsr160Plugin" scan="on" user-defined="false">
    <property key="port" value="1100" />
    <property key="factory" value="PASSWORD-CLEAR"/>
    <property key="username" value="controlRole" />
    <property key="password" value="CONTROL_PASS" />
    <property key="protocol" value="rmi"/>
    <property key="urlPath" value="/jmxrmi"/>
    <property key="timeout" value="3000" />
    <property key="retry" value="2" />
    <property key="type" value="default" />
</protocol-plugin>

/opt/opennms/etc/collectd-configuration.xml

<service name="OpenNMS-JVM" interval="300000" user-defined="false" status="on">
        <parameter key="port" value="1100"/>
        <parameter key="factory" value="PASSWORD-CLEAR"/>
        <parameter key="username" value="controlRole" />
        <parameter key="password" value="CONTROL_PASS" />
        <parameter key="retry" value="2"/>
        <parameter key="timeout" value="3000"/>
        <parameter key="protocol" value="rmi"/>
        <parameter key="urlPath" value="/jmxrmi"/>
        <parameter key="rrd-base-name" value="java" />
        <parameter key="ds-name" value="opennms-jvm"/>
        <parameter key="friendly-name" value="opennms-jvm"/>
        <parameter key="collection" value="jsr160"/>
        <parameter key="thresholding-enabled" value="true"/>
</service>

/opt/opennms/etc/poller-configuration.xml

<service name="OpenNMS-JVM" interval="300000" user-defined="false" status="on">
  <parameter key="port" value="1100"/>
  <parameter key="factory" value="PASSWORD-CLEAR"/>
  <parameter key="username" value="controlRole"/>
  <parameter key="password" value="CONTROL_PASS"/>
  <parameter key="retry" value="2"/>
  <parameter key="timeout" value="3000"/>
  <parameter key="rrd-repository" value="/opt/opennms/share/rrd/response" />
  <parameter key="ds-name" value="opennms-jvm"/>
  <parameter key="friendly-name" value="opennms-jvm"/>
</service>

/opt/opennms/etc/jmx-datacollection-config.xml

...
<mbean name="JVM MemoryPool:Eden Space" objectname="java.lang:type=MemoryPool,name=PS Eden Space">
...
<mbean name="JVM MemoryPool:Survivor Space" objectname="java.lang:type=MemoryPool,name=PS Survivor Space">
...
<mbean name="JVM MemoryPool:Perm Gen" objectname="java.lang:type=MemoryPool,name=PS Perm Gen">
...
<mbean name="JVM MemoryPool:Old Gen" objectname="java.lang:type=MemoryPool,name=PS Old Gen">
...

Now restart the server after making these changes.

/sbin/service opennms restart

Once restarted successfully, go to the web interface and perform the following steps:

  • In the Admin tab, click “Add Interface for Scanning”, then enter and add.
  • In the Events tab, click “All Events” and look for services being discovered.
  • In the Reports tab, click “Resource Graphs”, select Tomcat server in the standard reports, select the opennms-jvm, then click “Graph Selection”.

Here are your graphs.

You need to follow the same pattern to monitor other MBeans. For example, to graph the total compilation time, follow these steps.

First, add an entry to the jmx-datacollection-config.xml file to query the MBean (use JConsole or JVisualVM to find the name of the MBean you are interested in):

<mbean name="JVM Compilation" objectname="java.lang:type=Compilation">
  <attrib name="TotalCompilationTime" alias="TotCompilationTime" type="gauge"/>
</mbean>

Then, add a report template to the snmp-graph.properties file:

report.jvm.compilation.name=JVM Compilation
report.jvm.compilation.columns=TotCompilationTime
report.jvm.compilation.type=interfaceSnmp
report.jvm.compilation.command=--title="JVM Compilation" \
 DEF:compilationTime={rrd1}:TotCompilationTime:AVERAGE \
 LINE2:compilationTime#0000ff:"Compilation Time" \
 GPRINT:compilationTime:AVERAGE:" Avg \\: %8.2lf %s\\n"

Finally, don’t forget to add this new graph to the list of graphs (mind the comma and backslash at the end of the line):

reports=mib2.HCbits, mib2.bits, mib2.percentdiscards, mib2.percenterrors, \
...
jvm.gc.copy, jvm.gc.msc, jvm.gc.parnew, jvm.gc.cms, jvm.gc.psms, jvm.gc.pss, jvm.compilation, \
...
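Before wiring a new MBean into OpenNMS, it can help to confirm the object name and attribute by reading them once from code. The sketch below does that for the compilation example, using the same object name and attribute as the configuration entry. The class name CompilationTimeCheck is made up for this illustration, and it queries the JVM it runs in rather than the remote Tomcat.

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

// Hypothetical example class for checking an MBean attribute by name.
class CompilationTimeCheck {
    /**
     * Returns the accumulated JIT compilation time in milliseconds,
     * or -1 if this JVM does not register a Compilation MBean.
     */
    static long totalCompilationTimeMillis() {
        try {
            MBeanServer server = ManagementFactory.getPlatformMBeanServer();
            // Same object name and attribute as in jmx-datacollection-config.xml.
            ObjectName name = new ObjectName("java.lang:type=Compilation");
            if (!server.isRegistered(name)) {
                return -1;
            }
            return (Long) server.getAttribute(name, "TotalCompilationTime");
        } catch (JMException e) {
            throw new IllegalStateException("Could not read TotalCompilationTime", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TotalCompilationTime (ms): " + totalCompilationTimeMillis());
    }
}
```

If the attribute reads fine here but no data shows up in OpenNMS, the object name or alias in the configuration files is the first thing to compare.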

Android app to "Google" images

Some time ago I wrote a simple Android app to search for images using the Google Search API, and I thought it was worth sharing in case someone else needs to do the same. I had to create a custom search engine (https://www.google.com/cse/) and create an API project (https://code.google.com/apis/console) to get set up. The rest is ordinary stuff you all know.

There is one thing to mention though: Using Java search API you have to specify the web sites you want to search in. You can’t search the whole Internet using this API. You could do this using image search API, but since it is deprecated it is not worth investing in.

The complete working code along with a pre-built .apk is also provided. I have implemented other interesting features too, like search suggestions and infinite loading, which will come in handy for you down the road.

Simply clone https://github.com/normanatashbar/imagesearch.git or download by clicking here. Please remember to change the search engine ID and API key with your own if you are using this code as a base.

How to start investigating Java’s OutOfMemoryError

This blog post addresses services/support people and doesn’t provide too much detailed information about memory management in Java.

You have run out of memory, now what? First of all, you need to understand what kind of OutOfMemoryError it is. You may run out of OS virtual memory, native memory allocated to the Java process, or Java heap. The error message normally gives a good indication of the specifics; here are a few examples:

OutOfMemoryError: Java heap space
OutOfMemoryError: PermGen space
OutOfMemoryError: unable to create new native thread
OutOfMemoryError: requested XXX bytes for ChunkPool::allocate

A typical Java process has a heap where objects live, which is divided into different sections. Depending on the JVM implementation they might be called: Eden Space (New), From Space (Survivor 1), To Space (Survivor 2), Old Generation (Tenured), and Perm Generation (which is considered outside the heap in some implementations). For a detailed explanation see this page or this page. On top of that, add the memory required for class loaders, the garbage collection process, thread stacks, JNI, native memory buffers, etc. For a detailed explanation see this page.
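If you can run code inside the JVM in question, the same sections can be listed through the standard java.lang.management API. This is only a sketch for getting familiar with the pool names on your JVM; the class name MemoryPoolReport and the output layout are invented for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class that lists the JVM's memory pools.
class MemoryPoolReport {
    /** One line per pool: name, HEAP or NON_HEAP, used bytes, max bytes (-1 if undefined). */
    static List<String> describePools() {
        List<String> lines = new ArrayList<String>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            lines.add(String.format("%-28s %-8s used=%d max=%d",
                    pool.getName(), pool.getType().name(), usage.getUsed(), usage.getMax()));
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

The pool names depend on the JVM and the garbage collector in use, so expect names such as "PS Eden Space" on one setup and different ones on another.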

Here, I’m going to list what you need to know before analyzing an OutOfMemoryError:

1- What is the architecture of your machine and of the JVM running your Java process: 32-bit, 64-bit, or something else? How to proceed from here depends on the answer to this question, because a 32-bit Java process is limited to about 3GB of usable virtual memory in user space with default settings, and 4GB in the best case.

$ uname -a
Linux houman-laptop 3.9.10-100.fc17.x86_64 #1 SMP Sun Jul 14 01:31:27 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
$ java -version
java version "1.7.0_25"
Java(TM) SE Runtime Environment (build 1.7.0_25-b15)
Java HotSpot(TM) 64-Bit Server VM (build 23.25-b01, mixed mode)

The commands above tell me that I’m running a 64-bit JVM on a 64-bit machine.

2- How much memory is allocated to the Java process?

$ ps awwwxo pid,user,%mem,%cpu,vsz,rss,cmd | head -1; ps awwwxo pid,user,%mem,%cpu,vsz,rss,cmd | grep tomcat
 PID USER     %MEM %CPU    VSZ   RSS CMD
2885 houman   10.2  1.0 3405352 826028 /usr/java/jdk1.7.0_25/bin/java ... org.apache.catalina.startup.Bootstrap start
5774 houman    0.0  0.0 109408   872 grep --color=auto tomcat

The command above suggests that the OS has allocated 3.4GB of virtual memory (VSZ) to my Java process, of which 0.8GB (RSS) is currently resident in physical memory.

3- How much memory is allocated to Java heap?

$ /usr/java/jdk1.7.0_25/bin/jmap -heap 2885
Attaching to process ID 2885, please wait...
Debugger attached successfully.
Server compiler detected.
JVM version is 23.25-b01
using thread-local object allocation.
Parallel GC with 4 thread(s)
Heap Configuration:
   MinHeapFreeRatio = 40
   MaxHeapFreeRatio = 70
   MaxHeapSize      = 536870912 (512.0MB)
   NewSize          = 1310720 (1.25MB)
   MaxNewSize       = 17592186044415 MB
   OldSize          = 5439488 (5.1875MB)
   NewRatio         = 2
   SurvivorRatio    = 8
   PermSize         = 134217728 (128.0MB)
   MaxPermSize      = 268435456 (256.0MB)
   G1HeapRegionSize = 0 (0.0MB)
Heap Usage:
PS Young Generation
Eden Space:
   capacity = 55836672 (53.25MB)
   used     = 44561864 (42.49750518798828MB)
   free     = 11274808 (10.752494812011719MB)
   79.8075214797902% used
From Space:
   capacity = 59637760 (56.875MB)
   used     = 51424800 (49.042510986328125MB)
   free     = 8212960 (7.832489013671875MB)
   86.2285907451923% used
To Space:
   capacity = 59637760 (56.875MB)
   used     = 0 (0.0MB)
   free     = 59637760 (56.875MB)
   0.0% used
PS Old Generation
   capacity = 300941312 (287.0MB)
   used     = 124813120 (119.03106689453125MB)
   free     = 176128192 (167.96893310546875MB)
   41.47423933607361% used
PS Perm Generation
   capacity = 209780736 (200.0625MB)
   used     = 124733744 (118.95536804199219MB)
   free     = 85046992 (81.10713195800781MB)
   59.459103051292566% used
37997 interned Strings occupying 4151064 bytes.

It says, for example, that my Java process has 287MB in the old generation (119MB used) and 200MB in the perm generation (118MB used). It also shows the ultimate limits for the Java process. When utilization of these sections gets high and the capacity is near the maximum available (Eden + From + To + PS Old grow towards MaxHeapSize, and PS Perm grows towards MaxPermSize), chances are you are running out of heap space, one way or another. Referring to this picture might help you understand the output better:

This command will tell you about garbage collections performed and the time spent doing so:

$ /usr/java/jdk1.7.0_25/bin/jstat -gcutil 2885
  S0     S1     E      O      P     YGC     YGCT    FGC    FGCT     GCT   
  0.00   86.34  20.72  41.71  59.46 83      1.991   6      1.832    3.823
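The collection counts and times that jstat reports (YGC/YGCT and FGC/FGCT) are also exposed per collector through GarbageCollectorMXBean. The following sketch prints them for the JVM it runs in; the class name GcStats is made up, and the collector names you see depend on the garbage collector your JVM uses.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical example class that summarizes garbage collector activity.
class GcStats {
    /** One line per collector with its collection count and accumulated time. */
    static List<String> summarize() {
        List<String> lines = new ArrayList<String>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both values are -1 when the collector does not report them.
            lines.add(gc.getName() + ": " + gc.getCollectionCount() + " collections, "
                    + gc.getCollectionTime() + " ms");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : summarize()) {
            System.out.println(line);
        }
    }
}
```

Sampling these numbers twice and comparing them shows how much time was spent collecting in between, which is the same idea as running jstat with an interval.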

This command will tell you which classes have instances taking up the most space in the heap:

$ /usr/java/jdk1.7.0_25/bin/jmap -histo 2885 | head -20
 num     #instances         #bytes  class name
----------------------------------------------
   1:          1244       81546688  [Lorg.apache.activemq.command.DataStructure;
   2:        229010       31452120  
   3:        229010       31156304  
   4:        239863       28604688  [C
   5:         18433       22196816  
   6:         39290       15346496  [B
   7:         18433       14414408  
   8:         14020       13783136  
   9:         71423        6119320  [Ljava.util.HashMap$Entry;
  10:        237082        5689968  java.lang.String
  11:         50255        5038688  [I
  12:        125650        4020800  java.util.HashMap$Entry
  13:         97931        3917240  java.util.LinkedHashMap$Entry
  14:         42557        3404560  java.lang.reflect.Method
  15:        107680        3022072  [Ljava.lang.String;
  16:         38100        2438400  java.util.LinkedHashMap
  17:         19563        2362808  java.lang.Class

To investigate further, we might need to refer to the application logs, or use more sophisticated tools (like YourKit or Eclipse Memory Analyzer).

4- How much more memory are you using besides heap? Subtract your heap and perm generation capacity from the RSS figure, and what is left is what you are using for everything else.

$ bc
bc 1.06.95
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
826.028 - (53.25+56.875+56.875+287.0+200.0625)
171.9655

For me, it was 171MB. If this number is too high, it is worth checking threads. Running too many threads can affect memory usage of the application.

$ /usr/java/jdk1.7.0_25/bin/jstack 2885
Output is too long...
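When the jstack output is too long to read, a count of threads per state is often enough for a first impression. The sketch below gets that count from ThreadMXBean for the JVM it runs in. The class name ThreadStateCount is invented for this example, and to inspect another process the same bean would have to be reached over a JMX connection instead.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

// Hypothetical example class that counts live threads by state.
class ThreadStateCount {
    static Map<Thread.State, Integer> countByState() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<Thread.State, Integer> counts =
                new EnumMap<Thread.State, Integer>(Thread.State.class);
        for (ThreadInfo info : threads.getThreadInfo(threads.getAllThreadIds())) {
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            Integer previous = counts.get(info.getThreadState());
            counts.put(info.getThreadState(), previous == null ? 1 : previous + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println("Live threads by state: " + countByState());
    }
}
```

Each thread reserves its own stack outside the heap, so a count in the thousands is a reasonable place to start looking.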

There is no easy way to figure out what is wrong if native memory usage is too high, although some methods are mentioned here.

How to read dates from Oracle database?

Last week I came across this piece of code, which made the alarm bells ring for me. Can you see the issue with this code?

/**
 * Given a date, strip it of its timezone, and return the date as if it was GMT0
 * @param oracleTimestamp
 * @param localTimeZone
 * @return
 */
private Date toLocalTime(Date oracleTimestamp, TimeZone localTimeZone)
{
    if (oracleTimestamp == null)
        return null;
    Calendar local = localTimeZone == null ? Calendar.getInstance() : Calendar.getInstance(localTimeZone);
    local.clear();
    long localToUtcDelta = local.getTimeZone().getOffset(oracleTimestamp.getTime());
    return new Date(oracleTimestamp.getTime() + localToUtcDelta);
}

The fundamental issue is that there is no concept of locality or time zone in the Date object. This method takes a single, unique moment in time and returns another single, unique moment in time, possibly a few hours away. Since the event happened at one moment only, one of these values is wrong. It means we start off with a wrong value and then try to amend it later. Let us review some background in order to understand the issue a little more.

java.util.Date is time-zone independent and is represented as a long holding the number of milliseconds passed since the Epoch, in UTC. As an example, if you put 1,369,656,000,000 in an epoch converter you will get Mon, 27 May 2013 12:00:00 in GMT, which is Mon, 27 May 2013 22:00:00 in a GMT+10 time zone.
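The same point can be made in a few lines of Java: one Date, formatted for two time zones, gives two different strings while the underlying instant stays the same. The class name InstantRendering and the output pattern are chosen for this illustration only.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical example class: one instant, rendered in different zones.
class InstantRendering {
    /** Formats the given instant as wall-clock time in the given time zone. */
    static String render(long epochMillis, String zoneId) {
        SimpleDateFormat format =
                new SimpleDateFormat("EEE, d MMM yyyy HH:mm:ss", Locale.ENGLISH);
        format.setTimeZone(TimeZone.getTimeZone(zoneId));
        return format.format(new Date(epochMillis));
    }

    public static void main(String[] args) {
        long instant = 1369656000000L;
        System.out.println("GMT:    " + render(instant, "GMT"));    // Mon, 27 May 2013 12:00:00
        System.out.println("GMT+10: " + render(instant, "GMT+10")); // Mon, 27 May 2013 22:00:00
    }
}
```

The time zone belongs to the formatter, not to the Date, which is why shifting the Date itself to "fix" the time zone produces a different moment in time.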

So if you have to do a magical conversion like this, it is a sign that you (or someone else you have been relying on) failed to create the right Date object out of some other representation. Two common cases are parsing from a String and reading from a database. When parsing String values, it is easy to get the right Date by using a DateFormat which is set for GMT:

DateFormat gmtDateFormat = new SimpleDateFormat("yyyyMMddHHmmss");
DateFormat localDateFormat = new SimpleDateFormat("yyyyMMddHHmmss");
gmtDateFormat.setTimeZone(TimeZone.getTimeZone("GMT"));
Date date = gmtDateFormat.parse("20130528160000");
System.out.println("GMT: " + gmtDateFormat.format(date) + " Local: " + localDateFormat.format(date));

Let’s focus on the trickier case for the rest of this post, which is reading dates from a database (Oracle in this case).

The way DATE and TIMESTAMP Oracle types work is a bit different though. They store year, month, day, … values separately, as if you had stored the string representation directly. These two fields don’t store time zone information. There are other types like “TIMESTAMP WITH TIME ZONE” that do store time zone and are a better fit for times when you care about time zone, but let’s assume that we can’t use them right now.

The JDBC driver converts between these two representations for us, but it uses the local time zone of the environment it runs in. If your code runs in a GMT+10 time zone, when you store Date(1,369,656,000,000), you end up with “Mon 27 May 2013 22:00:00” in the database (with the time zone information lost here). Then when you read it back it will be converted back to the right original Date value. This will all break when the time zone at read time is different from that at write time. In order to prevent that, we all agree to store DATE and TIMESTAMP values in GMT, meaning if you look into the database you will see “Mon, 27 May 2013 12:00:00”.
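To make the failure mode concrete, here is a small model of that round trip. It is not driver code, just an assumption-laden simulation: the instant is broken into calendar fields in the writer's zone (what ends up in the column), and those fields are then read back as if they were in the reader's zone. The class and method names are invented for this example.

```java
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical model of storing a Date as zone-less fields and reading it back.
class ZoneRoundTrip {
    static long storeAndReadBack(long epochMillis, TimeZone writeZone, TimeZone readZone) {
        // "Write": split the instant into wall-clock fields in the writer's zone.
        Calendar written = new GregorianCalendar(writeZone);
        written.setTimeInMillis(epochMillis);

        // "Read": interpret the same fields in the reader's zone.
        Calendar read = new GregorianCalendar(readZone);
        read.clear();
        read.set(written.get(Calendar.YEAR), written.get(Calendar.MONTH),
                written.get(Calendar.DAY_OF_MONTH), written.get(Calendar.HOUR_OF_DAY),
                written.get(Calendar.MINUTE), written.get(Calendar.SECOND));
        return read.getTimeInMillis();
    }

    public static void main(String[] args) {
        long instant = 1369656000000L;
        TimeZone plus10 = TimeZone.getTimeZone("GMT+10");
        TimeZone gmt = TimeZone.getTimeZone("GMT");
        // Same zone on both sides: the original instant comes back (shift of 0).
        System.out.println(storeAndReadBack(instant, plus10, plus10) - instant);
        // Written in GMT+10, read in GMT: the value is off by ten hours (36000000 ms).
        System.out.println(storeAndReadBack(instant, plus10, gmt) - instant);
    }
}
```

Passing an explicit GMT calendar on both the write and the read side, as the JDBC calls further down do, amounts to fixing both zones in this model to the same value.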

In plain JDBC it is really easy to read and write correct Date values, without having to jump through hoops, like this:

// nowDate, conn and getSerial() come from the surrounding test class.
// The GMT calendar tells the driver to read and write the column values in GMT
// instead of the JVM's default time zone.
Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("GMT"));

// WRITING
Timestamp nowTimestamp = new Timestamp(nowDate.getTime());
PreparedStatement insertStmt = conn.prepareStatement(
    "INSERT INTO DATE_TEST_TABLE (ID, DATE_COLUMN, TIMESTAMP_COLUMN) VALUES (?, ?, ?)");
try
{
    insertStmt.setInt(1, getSerial());
    insertStmt.setTimestamp(2, nowTimestamp, cal);
    insertStmt.setTimestamp(3, nowTimestamp, cal);
    insertStmt.executeUpdate();
}
finally
{
    insertStmt.close();
}

// READING
PreparedStatement selectStmt = conn.prepareStatement(
    "SELECT ID, DATE_COLUMN, TIMESTAMP_COLUMN FROM DATE_TEST_TABLE ORDER BY ID");
ResultSet result = null;
try
{
    result = selectStmt.executeQuery();
    while (result.next())
    {
        System.out.println(
            String.format("%2s, %s, %s",
                result.getInt(1),
                result.getTimestamp(2, cal).toString(),
                result.getTimestamp(3, cal).toString()
            ));
    }
}
finally
{
    if (result != null)
        result.close();
    selectStmt.close();
}

In JPA, it is not as straightforward, but it is not hard either and requires only a row mapper:

class DateTestObjectRowMapper implements RowMapper
{
    public Object mapRow(ResultSet rs, int rowNum) throws SQLException
    {
        Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone("GMT"));
        int id = rs.getInt(1);
        Date dateField = rs.getTimestamp(2, calendar);
        Date timestampField = rs.getTimestamp(3, calendar);
        return new DateTestObject(id, dateField, timestampField);
    }
}
// READING
String sql = "SELECT ID, DATE_COLUMN, TIMESTAMP_COLUMN FROM DATE_TEST_TABLE ORDER BY ID";
List objects = getJdbcTemplate().queryForList(sql);
System.out.println("Objects (wrong values): " + objects);
List objectsMappedGMT = getJdbcTemplate().query(sql, new DateTestObjectRowMapper());
System.out.println("Objects (right values, using row mapper and GMT calendar): " + objectsMappedGMT);

If you have used the queryForList() method, you are using your local time zone to convert/parse dates which are stored in GMT, and you get the WRONG values back. It is not that they are merely "local dates" rather than GMT ones: there is no such thing as a local Date.

Download the complete code to experiment more for yourself. Run OracleDatePlainJdbcTest and OracleDateJpaTest classes to get started.

A kind-of base Android app

I have just created a simple Android app that can be used as a starting point. This app uses the ActionBarSherlock (ABS) and RoboGuice libraries. ABS gives your application new features (like the action bar and fragments) while still supporting old API levels / devices. RoboGuice, on the other hand, helps you reduce the clutter of inter-dependencies among your application’s components. The app has a simple fragment-based list-details view which takes care of different screen sizes by displaying a separated or combined view depending on the available screen size (again, a relatively new feature). It also provides a basic SQLite database and a content provider on top of it.

Simply fork/clone the code from https://github.com/normanatashbar/basedroid.git and start using it.

Remote Debugging Client Side GWT Code

A while ago I had to remotely debug the client side code of a GWT application that was already compiled and deployed onto a server. After searching around and reading this document a few times, I finally managed to do it following these steps:

  1. Compile and install the application on the server and write down the URL (like http://remote.host:8080/myapp/index.jsp).
  2. Create an Application launcher (Run->Edit Configurations->Applications in IntelliJ).
    • Set the main class to: com.google.gwt.dev.DevMode
    • Put this line in the program arguments: -noserver -war target/myapp -logLevel DEBUG -startupUrl http://remote.host:8080/myapp/index.jsp -port 8080 my.package.myapp.Module
  3. Debug this new run configuration. This will run the GWT Dev Mode and enable you to set breakpoints on the client side code.
  4. Copy and paste the URL provided by Dev Mode into your browser (like http://remote.host:8080/myapp/index.jsp?gwt.codesvr=127.0.0.1:9997). You will need to modify the Dev Mode extension options to allow this new combination (of app server and code server). On Chrome, for example, you can do this by clicking on the greyed out Dev Mode logo if you get an error.
  5. Remotely debug the server in the normal way (JDWP) to be able to debug server-side code as well.