08 May 2020
Note: Everything I tested here is on Java 8 running on Mac OS
Also Note: The memory overheads are most likely implementation, OS and version dependent, so this is really only a guide. If memory usage is important, you should do your own tests using your own objects and versions.
These days I work a lot on the HDFS part of Hadoop. HDFS is a "big data" filesystem which stores all of its filesystem metadata in memory and never pages any of it out to disk. It is not uncommon for the Namenode service within HDFS to run with 100 - 250GB of JVM heap configured, generally using the CMS GC.
When storing potentially hundreds of millions of objects in memory like this, it is important to consider the overhead a Java object adds to the actual data you want to store, and how to reduce it.
Internally, Java uses references to store objects in various structures. When the JVM is running with less than 32GB of heap, these references are 4 bytes in size, thanks to a feature known as compressed oops. For any heap of 32GB or greater, the references increase to 8 bytes. This means there is often little value in increasing your heap size from 30GB to 34GB, as the extra reference overhead will likely leave you with less usable memory than before. If you are already close to the 32GB boundary, any heap increase needs to be significant.
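You can check whether the JVM actually uses compressed references for a given heap size by printing its final flag values, eg:
$ java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops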
Consider this simple program, which creates 10k empty objects:
public class BasicObject {
  public static void main(String[] argv) throws InterruptedException {
    // Create 10k empty objects and hold them in an array so they stay live
    BasicObject[] objs = new BasicObject[10000];
    for (int i = 0; i < 10000; i++) {
      objs[i] = new BasicObject();
    }
    // Sleep so the process stays around long enough to inspect with jmap
    Thread.sleep(1000000);
  }
}
Using jmap, we can dump the live objects on the heap:
$ jmap -histo:live 27071
num #instances #bytes class name
----------------------------------------------
1: 10000 160000 com.sodonnell.BasicObject
2: 1070 96760 [C
3: 488 55736 java.lang.Class
4: 1 40016 [Lcom.sodonnell.BasicObject;
We have 10k instances of BasicObject, which occupy 160,000 bytes, or 16 bytes per object. Therefore, it is reasonable to conclude the object overhead on this platform is 16 bytes.
We can also see a single instance of an array to hold these objects, occupying 40,016 bytes. With each object reference being 4 bytes, that works out as a 16 byte overhead plus 4 bytes per entry. It is important to remember that a Java array occupies all the memory for its defined size, even when its slots are all null. This is easy to prove with the above program by removing the for loop and leaving the array empty.
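For completeness, that variant is just the same program with the loop removed:
public class BasicObject {
  public static void main(String[] argv) throws InterruptedException {
    // No BasicObject instances are ever created, but jmap should still
    // show the array itself occupying 40016 bytes
    BasicObject[] objs = new BasicObject[10000];
    Thread.sleep(1000000);
  }
}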
If we extend the above program to make BasicObject store a reference to another object, we can see how that affects its memory usage:
public class BasicObject {
  // Each instance now holds one reference to another object
  private Object other = new Object();

  public static void main(String[] argv) throws InterruptedException {
    BasicObject[] objs = new BasicObject[10000];
    for (int i = 0; i < 10000; i++) {
      objs[i] = new BasicObject();
    }
    Thread.sleep(1000000);
  }
}
num #instances #bytes class name
----------------------------------------------
1: 10035 160560 java.lang.Object
2: 10000 160000 com.sodonnell.BasicObject
3: 1070 96760 [C
4: 488 55736 java.lang.Class
5: 1 40016 [Lcom.sodonnell.BasicObject;
Ignoring the additional space used by the Object instances, the space used by BasicObject has not changed, which seems surprising at first.
Extending the program to store two objects gives:
num #instances #bytes class name
----------------------------------------------
1: 20035 320560 java.lang.Object
2: 10000 240000 com.sodonnell.BasicObject
3: 1070 96760 [C
4: 488 55736 java.lang.Class
5: 1 40016 [Lcom.sodonnell.BasicObject;
And 3 Objects:
num #instances #bytes class name
----------------------------------------------
1: 30035 480560 java.lang.Object
2: 10000 240000 com.sodonnell.BasicObject
3: 1070 96760 [C
4: 488 55736 java.lang.Class
5: 1 40016 [Lcom.sodonnell.BasicObject;
An empty object needs 16 bytes, as does an object with a single reference. Storing two or three references pushes the usage up to 24 bytes.
A key detail of JVM memory is that it rounds each object's usage up to a multiple of 8 bytes. From this, we can conclude that the object overhead is really 12 bytes on this platform, which gets padded to 16. If we store a single reference in the object, it uses those wasted 4 bytes. Storing a second reference takes the usage to 20 bytes, which gets rounded up to 24, allowing a third reference to use the wasted space, and so on.
The memory overhead of an object is therefore 12 bytes, plus 4 bytes for each object it holds a reference to. If the object holds any primitives, each uses the number of bytes the primitive occupies, eg boolean and byte occupy 1 byte, char and short 2 bytes, int and float 4 bytes, and long and double 8 bytes.
After adding up all these parts, the space is rounded up to a multiple of 8. As a test:
public class BasicObject {
  private Object other = new Object(); // 4 byte reference
  private int myint = 1234;            // 4 bytes
  private short myshort = (short)1;    // 2 bytes
  private long mylong = 1234567;       // 8 bytes

  public static void main(String[] argv) throws InterruptedException {
    BasicObject[] objs = new BasicObject[10000];
    for (int i = 0; i < 10000; i++) {
      objs[i] = new BasicObject();
    }
    Thread.sleep(1000000);
  }
}
Running the above, we expect each instance of BasicObject to require 12 (overhead) + 4 (reference) + 4 (int) + 2 (short) + 8 (long) = 30 bytes, rounded to 32:
num #instances #bytes class name
----------------------------------------------
1: 10000 320000 com.sodonnell.BasicObject
An array turns out to be a special type of object. We can see from the output above that an array of references always uses 16 bytes plus 4 bytes per entry. That equates to the 12 byte object overhead, plus a 4 byte integer to hold the array length, giving a 16 byte overhead in total.
If you create an array of primitives, each entry takes the size of the primitive rather than 4 bytes per reference.
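The test program isn't shown above; a minimal sketch, following the pattern of the earlier examples, would be:
public class BasicObject {
  public static void main(String[] argv) throws InterruptedException {
    // A primitive array stores the values inline, so each slot
    // is 8 bytes for a long rather than a 4 byte reference
    long[] longs = new long[10000];
    for (int i = 0; i < 10000; i++) {
      longs[i] = i;
    }
    Thread.sleep(1000000);
  }
}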
Filling a 10,000 element array with longs, it is more difficult to see from jmap how much space it is using. However, we get:
num #instances #bytes class name
----------------------------------------------
1: 1070 96760 [C
2: 2 80064 [J
3: 487 55632 java.lang.Class
4: 23 28496 [B
5: 528 26424 [Ljava.lang.Object;
It is the "[J" class which represents this array of longs. There are two instances of it, and from the earlier tests (without the array of longs), there appears to be one pre-existing instance used for some internal purpose, holding 48 bytes:
103: 1 48 [J
This means the array of 10k longs uses 80016 bytes, which is as expected - 16 byte overhead plus 8 bytes per entry.
Now that we know the overhead of an object, we can consider a HashMap by storing our simple object into one:
import java.util.HashMap;
import java.util.Map;

public class BasicObject {
  public static void main(String[] argv) throws InterruptedException {
    Map<BasicObject, BasicObject> hmap = new HashMap<>();
    for (int i = 0; i < 10000; i++) {
      // Use the same object as both the key and the value
      BasicObject o = new BasicObject();
      hmap.put(o, o);
    }
    Thread.sleep(1000000);
  }
}
num #instances #bytes class name
----------------------------------------------
1: 10041 321312 java.util.HashMap$Node
2: 10000 160000 com.sodonnell.BasicObject
3: 1075 97064 [C
4: 16 66816 [Ljava.util.HashMap$Node;
5: 487 55632 java.lang.Class
We have our 10k BasicObjects and about 10k HashMap$Node objects, occupying 32 bytes each. There are also 16 arrays of HashMap$Node. Running a simple program that creates no user defined HashMaps shows that Java creates 15 of them in the background somewhere:
num #instances #bytes class name
----------------------------------------------
...
17: 15 1264 [Ljava.util.HashMap$Node;
Therefore we have an overhead of 32 bytes per entry in the Node objects, plus an array occupying 66,816 - 1,264 = 65,552 bytes. We know this array has a 16 byte overhead, and as it stores object references, each entry takes 4 bytes. Therefore (65552 - 16) / 4 = 16384 entries, which is a power of 2.
This means that to store these 10k entries, the overhead is (320,000 + 65,552) / 10,000, or roughly 38 bytes per entry.
A HashMap, by default, starts with an array of 16 slots, although this can be controlled via its constructor. This array is called the table. An entry is stored into the table by taking its hashcode and reducing it to one of the table slots (sketched below). As more entries are added, their hash codes hopefully spread them evenly throughout the table. If two entries hash to the same table slot, they are chained together in a linked list via the Node object.
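As an illustration of that reduction, with a power-of-two table size the slot can be chosen by simply masking the hash, which mirrors what the OpenJDK 8 implementation does internally (the class and method names here are mine):
public class SlotExample {
  // With tableLength a power of 2, (tableLength - 1) is an all-ones
  // bit mask, so this picks a slot without an expensive modulo
  static int slot(int hash, int tableLength) {
    return (tableLength - 1) & hash;
  }

  public static void main(String[] argv) {
    System.out.println(slot("example".hashCode(), 16)); // prints a slot in 0..15
  }
}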
This Node object, which we know occupies 32 bytes per entry, holds a "next" reference for the linked list, the hash value, the key and the value being stored. From that we can see where the 32 byte overhead comes from:
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;
    // constructor and methods omitted
}
12 byte object overhead + 4 (int) + 4 (reference to key) + 4 (reference to value) + 4 (reference to next) = 28, rounded to 32.
When the table in the HashMap fills up to a certain point, the table is doubled in size and all the entries are re-hashed to their new slot in the table.
As an aside, Java 8 has a nice optimisation where a linked list is converted into a tree if a single chain grows too long (more than 8 entries).
Knowing how the HashMap works, we can understand the space used by the array. It has space for 16384 entries, as it started with 16 slots and was doubled several times as we loaded the HashMap. With the default load factor of 0.75, the table holding our 10k entries has room for 16384 * 0.75 = 12288 elements before it must be expanded to a 32k element array.
It's also worth noting that I used the same object as both the key and the value in this HashMap. Often an ID or other object is used as the key. HashMap cannot use a primitive as the key, so if you want to do lookups based on a long, for example, it needs to be wrapped in an object, giving another 24 bytes of overhead (8 for the long + 12 for the object overhead, rounded to 24).
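For illustration, the boxing happens implicitly when you use a primitive as a key (the class name here is hypothetical):
import java.util.HashMap;
import java.util.Map;

public class BoxedKeyExample {
  public static void main(String[] argv) {
    Map<Long, String> byId = new HashMap<>();
    // Autoboxing allocates a java.lang.Long wrapper for the key:
    // 12 bytes of header + 8 bytes for the long, rounded up to 24
    byId.put(123456L, "some value");
  }
}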
The overhead of storing an object into a HashMap is therefore about 38 bytes per entry, provided the same object acts as the key and the value. If a different object is the key, the overhead will be higher, but often that other object would exist anyway, making little difference.
A TreeMap is a structure which implements the Java Map interface, but stores its entries sorted in the order of their keys. Running the same program with a TreeMap gives:
num #instances #bytes class name
----------------------------------------------
1: 10000 400000 java.util.TreeMap$Entry
2: 10000 160000 com.sodonnell.BasicObject
This gives an overhead of 40 bytes per entry, provided the same object is used as the key and the value.
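The driver isn't shown above, but a minimal variant might look like the following. TreeMap needs ordered keys and BasicObject has no natural ordering, so this sketch adds an id field and a Comparable implementation (an assumption on my part; the original test code may have ordered the keys differently):
import java.util.Map;
import java.util.TreeMap;

public class BasicObject implements Comparable<BasicObject> {
  // TreeMap sorts its keys, so give each instance an id to sort on
  private static int counter = 0;
  private final int id = counter++;

  public int compareTo(BasicObject other) {
    return Integer.compare(id, other.id);
  }

  public static void main(String[] argv) throws InterruptedException {
    Map<BasicObject, BasicObject> tmap = new TreeMap<>();
    for (int i = 0; i < 10000; i++) {
      BasicObject o = new BasicObject();
      tmap.put(o, o);
    }
    Thread.sleep(1000000);
  }
}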
You might think that using a HashSet will reduce the overhead compared with a HashMap if you are storing the same object as the key and the value. However, behind the scenes Java uses a HashMap to implement HashSet, setting a dummy object as the value. Therefore the overhead of a HashSet is exactly the same as that of a HashMap. In theory, HashSet could save 8 bytes per entry by using a different Map.Entry object which omits the value reference, but I guess for most cases this is not an enhancement worth making.
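You can see this in the OpenJDK 8 source for HashSet, which delegates to a backing HashMap with a shared dummy value (snippet abbreviated):
public class HashSet<E> extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable {

    private transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }
}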
For most applications, saving a few bytes of memory per entry is just not worth the hassle. But as I mentioned earlier, the HDFS Namenode may need to store hundreds of millions of entries in a sorted, hash-table-like structure. A TreeMap, at 40 bytes per entry, can easily add up to several GB of memory in this scenario.
Within Hadoop lives a data structure called FoldedTreeSet, which somewhat mirrors the Java TreeMap. It implements the SortedSet interface and uses a red-black tree like TreeMap does. However, instead of storing each entry in its own node with 40 bytes of overhead, it allocates a 64 element array per tree node and packs the entries into the array.
Given a 64 element array uses 64 * 4 + 12 + 4 = 272 bytes, and the tree node itself uses 56 bytes, the best case overhead, assuming all 64 array slots are filled, is (272 + 56) / 64 = 5.125 bytes per entry.
Compared to a TreeMap, this would save about 3GB of memory per 100M objects.
Also within Hadoop is a structure called LightWeightResizableGSet. It provides an interface like Set, only it also allows the get() operation, which Set does not. It is therefore somewhat like a HashMap where the key and the value are the same object, exposed through a Set-like interface.
Rather than wrapping the stored object in another object (which is what HashMap does), it keeps an array to use as the hash table and stores the objects directly in the table slots. To handle collisions, it forces the objects being stored to implement an interface called LinkedElement, allowing them to be chained together in a singly linked list.
This means that the overhead of the GSet is 0, 4 or 8 bytes on the object being stored, to add the link reference (depending on rounding to the next 8 byte boundary), plus the overhead of the table array. Assuming the table is 50% filled, the array adds about 8 bytes per entry (4 for the entry's slot, plus 4 wasted in empty slots), giving a total overhead somewhere between 4 and 16 bytes per entry. The array usage would be approximately the same in a HashMap, so the saving versus HashMap's 32 byte Node entries is between 24 and 32 bytes, depending on the rounding to the 8 byte object boundary.
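To make the intrusive linked list concrete, here is a rough sketch of the pattern. The interface is modelled on Hadoop's LightWeightGSet.LinkedElement, but treat the exact names as an approximation:
// Modelled on Hadoop's LightWeightGSet.LinkedElement interface
interface LinkedElement {
  void setNext(LinkedElement next);
  LinkedElement getNext();
}

public class BasicObject implements LinkedElement {
  // The only per-object cost: one reference used to chain entries
  // that hash to the same table slot, replacing HashMap's 32 byte Node
  private LinkedElement next;

  public void setNext(LinkedElement next) {
    this.next = next;
  }

  public LinkedElement getNext() {
    return next;
  }
}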