08 May 2020
Note: Everything I tested here is on Java 8 running on Mac OS
Also Note: The memory overheads are most likely implementation, OS and version dependent, so this is really only a guide. If memory usage is important, you should do your own tests using your own objects and versions.
These days I work a lot on the HDFS part of Hadoop. HDFS is a "big data" filesystem which stores all of its filesystem metadata in memory and never pages any of it out to disk. It is not uncommon for the Namenode service within HDFS to run with 100 - 250GB of JVM heap configured, generally using the CMS GC.
When storing potentially hundreds of millions of objects in memory like this, it is important to consider the overhead a Java object adds to the actual data you want to store, and how to reduce it.
Internally, Java uses references to store objects in various structures. When the JVM is running with less than 32GB of heap, these references are 4 bytes in size, thanks to a feature known as compressed oops. For any heap of 32GB or greater, the references increase to 8 bytes. This means there is often little value in increasing your heap size from 30GB to 34GB, as the extra reference overhead will likely leave you with less usable memory than before. If you are already close to the 32GB boundary, any heap increase needs to be significant.
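You can check whether the JVM actually uses compressed references for a given heap size by printing its final flag values, eg:
$ java -Xmx31g -XX:+PrintFlagsFinal -version | grep UseCompressedOops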
Consider this simple program, which creates 10k empty objects:
public class BasicObject {
  public static void main(String[] argv) throws InterruptedException {
    // Create 10k empty objects and hold them in an array so they stay live
    BasicObject[] objs = new BasicObject[10000];
    for (int i = 0; i < 10000; i++) {
      objs[i] = new BasicObject();
    }
    // Sleep so the process stays around long enough to inspect with jmap
    Thread.sleep(1000000);
  }
}
Using jmap, we can dump the live objects on the heap:
$ jmap -histo:live 27071
num #instances #bytes class name
----------------------------------------------
1: 10000 160000 com.sodonnell.BasicObject
2: 1070 96760 [C
3: 488 55736 java.lang.Class
4: 1 40016 [Lcom.sodonnell.BasicObject;
We have 10k instances of BasicObject, which occupy 160,000 bytes, or 16 bytes per object. Therefore, it is reasonable to conclude the object overhead on this platform is 16 bytes.
We can also see a single instance of an array to hold these objects, occupying 40,016 bytes. With each object reference being 4 bytes, that works out as a 16 byte overhead plus 4 bytes per entry. It is important to remember that a Java array occupies all the memory for its defined size, even when its slots are all null. This is easy to prove with the above program by removing the for loop and leaving the array empty.
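For completeness, that variant is just the same program with the loop removed:
public class BasicObject {
  public static void main(String[] argv) throws InterruptedException {
    // No BasicObject instances are ever created, but jmap should still
    // show the array itself occupying 40016 bytes
    BasicObject[] objs = new BasicObject[10000];
    Thread.sleep(1000000);
  }
}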
If we extend the above program to make BasicObject store a reference to another object, we can see how that affects its memory usage:
public class BasicObject {
  // Each instance now holds one reference to another object
  private Object other = new Object();

  public static void main(String[] argv) throws InterruptedException {
    BasicObject[] objs = new BasicObject[10000];
    for (int i = 0; i < 10000; i++) {
      objs[i] = new BasicObject();
    }
    Thread.sleep(1000000);
  }
}
num #instances #bytes class name
----------------------------------------------
1: 10035 160560 java.lang.Object
2: 10000 160000 com.sodonnell.BasicObject
3: 1070 96760 [C
4: 488 55736 java.lang.Class
5: 1 40016 [Lcom.sodonnell.BasicObject;
Ignoring the additional space used by the Object instances, the space used by BasicObject has not changed, which seems surprising at first.
Extending the program to store two objects gives:
num #instances #bytes class name
----------------------------------------------
1: 20035 320560 java.lang.Object
2: 10000 240000 com.sodonnell.BasicObject
3: 1070 96760 [C
4: 488 55736 java.lang.Class
5: 1 40016 [Lcom.sodonnell.BasicObject;
And 3 Objects:
num #instances #bytes class name
----------------------------------------------
1: 30035 480560 java.lang.Object
2: 10000 240000 com.sodonnell.BasicObject
3: 1070 96760 [C
4: 488 55736 java.lang.Class
5: 1 40016 [Lcom.sodonnell.BasicObject;
An empty object needs 16 bytes, as does an object with a single reference. Storing two or three references pushes the usage up to 24 bytes.
A key detail of JVM memory is that it rounds each object's usage up to a multiple of 8 bytes. From this, we can conclude that the object overhead is really 12 bytes on this platform, which gets padded to 16. If we store a single reference in the object, it uses those wasted 4 bytes. Storing a second reference takes the usage to 20 bytes, which gets rounded up to 24, allowing a third reference to use the wasted space, and so on.
The memory overhead of an object is therefore 12 bytes, plus 4 bytes for each object it holds a reference to. If the object holds any primitives, each uses the number of bytes the primitive occupies, eg boolean and byte occupy 1 byte, char and short 2 bytes, int and float 4 bytes, and long and double 8 bytes.
After adding up all these parts, the space is rounded up to a multiple of 8. As a test:
public class BasicObject {
  private Object other = new Object(); // 4 byte reference
  private int myint = 1234;            // 4 bytes
  private short myshort = (short)1;    // 2 bytes
  private long mylong = 1234567;       // 8 bytes

  public static void main(String[] argv) throws InterruptedException {
    BasicObject[] objs = new BasicObject[10000];
    for (int i = 0; i < 10000; i++) {
      objs[i] = new BasicObject();
    }
    Thread.sleep(1000000);
  }
}
Running the above, we expect each instance of BasicObject to require 12 (overhead) + 4 (reference) + 4 (int) + 2 (short) + 8 (long) = 30 bytes, rounded to 32:
num #instances #bytes class name
----------------------------------------------
1: 10000 320000 com.sodonnell.BasicObject
An array turns out to be a special type of object. We can see from the output above that an array of references always uses 16 bytes plus 4 bytes per entry. That equates to the 12 byte object overhead, plus a 4 byte integer to hold the array length, giving a 16 byte overhead in total.
If you create an array of primitives, each entry takes the size of the primitive rather than 4 bytes per reference.
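The test program isn't shown above; a minimal sketch, following the pattern of the earlier examples, would be:
public class BasicObject {
  public static void main(String[] argv) throws InterruptedException {
    // A primitive array stores the values inline, so each slot
    // is 8 bytes for a long rather than a 4 byte reference
    long[] longs = new long[10000];
    for (int i = 0; i < 10000; i++) {
      longs[i] = i;
    }
    Thread.sleep(1000000);
  }
}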
Filling a 10,000 element array with longs, it is more difficult to see from jmap how much space it is using. However, we get:
num #instances #bytes class name
----------------------------------------------
1: 1070 96760 [C
2: 2 80064 [J
3: 487 55632 java.lang.Class
4: 23 28496 [B
5: 528 26424 [Ljava.lang.Object;
It is the "[J" class which represents this array of longs. There are two instances of it, and from the earlier tests (without the array of longs), there appears to be one pre-existing instance used for some internal purpose, holding 48 bytes:
103: 1 48 [J
This means the array of 10k longs uses 80016 bytes, which is as expected - 16 byte overhead plus 8 bytes per entry.
Now that we know the overhead of an object, we can consider a HashMap by storing our simple object into one:
import java.util.HashMap;
import java.util.Map;

public class BasicObject {
  public static void main(String[] argv) throws InterruptedException {
    Map<BasicObject, BasicObject> hmap = new HashMap<>();
    for (int i = 0; i < 10000; i++) {
      // Use the same object as both the key and the value
      BasicObject o = new BasicObject();
      hmap.put(o, o);
    }
    Thread.sleep(1000000);
  }
}
num #instances #bytes class name
----------------------------------------------
1: 10041 321312 java.util.HashMap$Node
2: 10000 160000 com.sodonnell.BasicObject
3: 1075 97064 [C
4: 16 66816 [Ljava.util.HashMap$Node;
5: 487 55632 java.lang.Class
We have our 10k BasicObjects and about 10k HashMap$Node objects, occupying 32 bytes each. There are also 16 arrays of HashMap$Node. Running a simple program that creates no user defined HashMaps shows that Java creates 15 of them in the background somewhere:
num #instances #bytes class name
----------------------------------------------
...
17: 15 1264 [Ljava.util.HashMap$Node;
Therefore we have an overhead of 32 bytes per entry in the Node objects, plus an array occupying 66,816 - 1,264 = 65,552 bytes. We know this array has a 16 byte overhead, and as it stores object references, each entry takes 4 bytes. Therefore (65552 - 16) / 4 = 16384 entries, which is a power of 2.
This means that to store these 10k entries, the overhead is (320,000 + 65,552) / 10,000, or roughly 38 bytes per entry.
A HashMap, by default, starts with an array of 16 slots, although this can be controlled via its constructor. This array is called the table. An entry is stored into the table by taking its hashcode and reducing it to one of the table slots (sketched below). As more entries are added, their hash codes hopefully spread them evenly throughout the table. If two entries hash to the same table slot, they are chained together in a linked list via the Node object.
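As an illustration of that reduction, with a power-of-two table size the slot can be chosen by simply masking the hash, which mirrors what the OpenJDK 8 implementation does internally (the class and method names here are mine):
public class SlotExample {
  // With tableLength a power of 2, (tableLength - 1) is an all-ones
  // bit mask, so this picks a slot without an expensive modulo
  static int slot(int hash, int tableLength) {
    return (tableLength - 1) & hash;
  }

  public static void main(String[] argv) {
    System.out.println(slot("example".hashCode(), 16)); // prints a slot in 0..15
  }
}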
This Node object, which we know occupies 32 bytes per entry, holds a "next" reference for the linked list, the hash value, the key and the value being stored. From that we can see where the 32 byte overhead comes from:
static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;
    final K key;
    V value;
    Node<K,V> next;
    // constructor and methods omitted
}
12 byte object overhead + 4 (int) + 4 (reference to key) + 4 (reference to value) + 4 (reference to next) = 28, rounded to 32.
When the table in the HashMap fills up to a certain point, the table is doubled in size and all the entries are re-hashed to their new slot in the table.
As an aside, Java 8 has a nice optimisation where a linked list is converted into a tree if a single chain grows too long (more than 8 entries).
Knowing how the HashMap works, we can understand the space used by the array. It has space for 16384 entries, as it started with 16 slots and was doubled several times as we loaded the HashMap. With the default load factor of 0.75, the table holding our 10k entries has room for 16384 * 0.75 = 12288 elements before it must be expanded to a 32k element array.
It's also worth noting that I used the same object as both the key and the value in this HashMap. Often an ID or other object is used as the key. HashMap cannot use a primitive as the key, so if you want to do lookups based on a long, for example, it needs to be wrapped in an object, giving another 24 bytes of overhead (8 for the long + 12 for the object overhead, rounded to 24).
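For illustration, the boxing happens implicitly when you use a primitive as a key (the class name here is hypothetical):
import java.util.HashMap;
import java.util.Map;

public class BoxedKeyExample {
  public static void main(String[] argv) {
    Map<Long, String> byId = new HashMap<>();
    // Autoboxing allocates a java.lang.Long wrapper for the key:
    // 12 bytes of header + 8 bytes for the long, rounded up to 24
    byId.put(123456L, "some value");
  }
}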
The overhead of storing an object into a HashMap is therefore about 38 bytes per entry, provided the same object acts as the key and the value. If a different object is the key, the overhead will be higher, but often that other object would exist anyway, making little difference.
A TreeMap is a structure which implements the Java Map interface, but stores its entries sorted in the order of their keys. Running the same program with a TreeMap gives:
num #instances #bytes class name
----------------------------------------------
1: 10000 400000 java.util.TreeMap$Entry
2: 10000 160000 com.sodonnell.BasicObject
This gives an overhead of 40 bytes per entry, provided the same object is used as the key and the value.
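The driver isn't shown above, but a minimal variant might look like the following. TreeMap needs ordered keys and BasicObject has no natural ordering, so this sketch adds an id field and a Comparable implementation (an assumption on my part; the original test code may have ordered the keys differently):
import java.util.Map;
import java.util.TreeMap;

public class BasicObject implements Comparable<BasicObject> {
  // TreeMap sorts its keys, so give each instance an id to sort on
  private static int counter = 0;
  private final int id = counter++;

  public int compareTo(BasicObject other) {
    return Integer.compare(id, other.id);
  }

  public static void main(String[] argv) throws InterruptedException {
    Map<BasicObject, BasicObject> tmap = new TreeMap<>();
    for (int i = 0; i < 10000; i++) {
      BasicObject o = new BasicObject();
      tmap.put(o, o);
    }
    Thread.sleep(1000000);
  }
}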
You might think that using a HashSet will reduce the overhead compared with a HashMap if you are storing the same object as the key and the value. However, behind the scenes Java uses a HashMap to implement HashSet, setting a dummy object as the value. Therefore the overhead of a HashSet is exactly the same as that of a HashMap. In theory, HashSet could save 8 bytes per entry by using a different Map.Entry object which omits the value reference, but I guess for most cases this is not an enhancement worth making.
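You can see this in the OpenJDK 8 source for HashSet, which delegates to a backing HashMap with a shared dummy value (snippet abbreviated):
public class HashSet<E> extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable {

    private transient HashMap<E,Object> map;

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }
}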
For most applications, saving a few bytes of memory per entry is just not worth the hassle. But as I mentioned earlier, the HDFS Namenode may need to store hundreds of millions of entries in a sorted, hash-table-like structure. A TreeMap, at 40 bytes per entry, can easily add up to several GB of memory in this scenario.
Within Hadoop lives a data structure called FoldedTreeSet, which somewhat mirrors the Java TreeMap. It implements the SortedSet interface and uses a red-black tree like TreeMap does. However, instead of storing each entry in its own node with 40 bytes of overhead, it allocates a 64 element array per tree node and packs the entries into the array.
Given a 64 element array uses 64 * 4 + 12 + 4 = 272 bytes, and the tree node itself uses 56 bytes, the best case overhead, assuming all 64 array slots are filled, is (272 + 56) / 64 = 5.125 bytes per entry.
Compared to a TreeMap, this would save about 3GB of memory per 100M objects.
Also within Hadoop is a structure called LightWeightResizableGSet. It provides an interface like Set, only it also allows the get() operation, which Set does not. It is therefore somewhat like a HashMap where the key and the value are the same object, exposed through a Set-like interface.
Rather than wrapping the stored object in another object (which is what HashMap does), it keeps an array to use as the hash table and stores the objects directly in the table slots. To handle collisions, it forces the objects being stored to implement an interface called LinkedElement, allowing them to be chained together in a singly linked list.
This means that the overhead of the GSet is 0, 4 or 8 bytes on the object being stored, to add the link reference (depending on rounding to the next 8 byte boundary), plus the overhead of the table array. Assuming the table is 50% filled, the array adds about 8 bytes per entry (4 for the entry's slot, plus 4 wasted in empty slots), giving a total overhead somewhere between 4 and 16 bytes per entry. The array usage would be approximately the same in a HashMap, so the saving versus HashMap's 32 byte Node entries is between 24 and 32 bytes, depending on the rounding to the 8 byte object boundary.
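To make the intrusive linked list concrete, here is a rough sketch of the pattern. The interface is modelled on Hadoop's LightWeightGSet.LinkedElement, but treat the exact names as an approximation:
// Modelled on Hadoop's LightWeightGSet.LinkedElement interface
interface LinkedElement {
  void setNext(LinkedElement next);
  LinkedElement getNext();
}

public class BasicObject implements LinkedElement {
  // The only per-object cost: one reference used to chain entries
  // that hash to the same table slot, replacing HashMap's 32 byte Node
  private LinkedElement next;

  public void setNext(LinkedElement next) {
    this.next = next;
  }

  public LinkedElement getNext() {
    return next;
  }
}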