Memory Efficient Java – Mission Impossible?

First, the normal usage in Java leads to massive memory waste as its pointed out in several articles or here in a nice IBM article.

Some examples for the 32 bit JVM:

Using 3 * 4 bytes for an object with no member variables.
Using 4 * 4 bytes for an empty int[] array!
Using 4 * 4 + 7 * 4 = 44 (!) bytes for an empty String object

Of course things are a bit more complicated but in short: the wrapper classes and shorter arrays should be avoided.

Second, doing it a bit more efficient is relative easy – just use the primitive collections from the trove project, where wrapper objects can be avoided. Because the standard collection classes are trimmed to be CPU efficient – not really memory efficient. Also several JVM tuning parameters could be used to reduce RAM usage.

Thrird, doing it more efficient or as efficient as in C++ is very complex and in some cases even impossible.

Let me first explain why it is in some cases impossible to be as efficient as in C++

When you allocate an array of primitives such as int, long or double in Java then the values are stored directly in the array:

int1
int2
int3
...

This is good in terms of memory. Now when you allocate an array of objects then only the references are stored:

ref1  --- > points the object (a position somewhere on the heap)
ref2  --- > points to another object
ref3  --- > etc
...

Imagine that you are using an array of an object Point with only two members latitude and longitude (e.g. both floats). Then you would waste the memory necessary for the reference (4 bytes on the 32bit JVM) and the object overhead (12 bytes). 16 bytes waste; 8 bytes data. Now assume you need 100 mio points (osm data of Germany) => you would waste 1.6 GB RAM and only if you do it efficient 😉

If you think about this a bit you would probably argue that a clever JVM could inline the objects to avoid the overhead and the additional reference. Well, I think this is impossible.

For the JVM it is nearly impossible to inline objects in arrays

You have two solvable problems when you want that the JVM inlines the objects for you:

You need one bit indicating if the entry is null (0) or not (1) – after initializing then all entries are null. Or one could even call a default constructor per configuration or whatever.
The class needs to be final, as otherwise the lengths of one subclass entry could exceed the reserved maximum length.

… but you have one really hard – unsolvable (?) problem:

3. You would need to change the move/reference semantics in Java – a no go in my opinion, not only for the language designers.

But why you would need to change the semantics? Well, imagine you have two such inline arrays and you are doing something ‘unimportant’:

// The point p refers to the memory in the array. Okay.
Point p = inlineArr1[100];
p.setX(111f);

// Uh, what to do now? In C++ you could define to copy the point into the array.
// In Java you would copy the reference - not possible for 2 inline arrays ...
inlineArr2[100] = p;
inlineArr2[100].setX(222f);
float result = p.getX();

What result would we get in the last line? With normal Java arrays you get 222f. But with the inline approach and copy semantics you get 111f – which would be against every Java-thinking.

Instead of using Java-unintuitive copy semantics one workaround could be to forbid assignments to such inline arrays. Those read-only inline arrays would be harder to use but still nice to have IMO 🙂 !

Memory efficient Java – via Primitives

Now let us investigate how we could be in Java as memory efficient as with C++.

The memory problem above is solvable in Java when one would use two float arrays. But it makes programming harder. For example how to swap two of those entries? The normal Java way for the Point class is:

Point tmp = arr[2]; arr[2]=arr[3]; arr[3]=tmp;

But the array wrapper would make it necessary to create a separate swap method which then swaps the two entries for every point:

float tmpx = x[2]; x[2]=x[3]; x[3]=tmpx;
float tmpy = y[2]; y[2]=y[3]; y[3]=tmpy;

If you retrieve the raw floats and would swap the entries outside of the wrapper it would be error prone to add more members later to the object (e.g. like a point ID). And you’ll get probably other problems as well.

But we could put an object oriented wrapper class around it which returns Point objects created from the underlying float arrays, which has the disadvantage of being CPU intensive. Now, in the next part we use a similar idea but operate on more low level arrays which then gives us the possibility to use memory mapped features – turning our datastructure in a simple storage. Read on!

Memory efficient Java – via raw Bytes

A second solution in Java is to use an object oriented wrapper around one big byte array or a ByteBuffer which then can be even memory mapped:

RandomAccessFile file = new RandomAccessFile(fileName, "rw");
MappedByteBuffer byteArray = file.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, size);

The disadvantage is that you will have additional CPU for the conversion and a much more complicated procedure to retrieve and store things. E.g. to retrieve a point you’ll need to write:

Point get(int index) {
  int offset = index * bytesPerPoint;
  return new Point(bytesToFloat(byteArray, offset), bytesToFloat(byteArray, offset + 4));
}

or to serialize the object into bytes:

void set(Point p) {
  int offset = index * bytesPerPoint;
  writeFloat(byteArray, offset, p.getX());
  writeFloat(byteArray, offset + 4, p.getY());
}

The big advantage over the method with a normal primitive array is that then “loading” objects on startup takes milliseconds and not seconds or minutes, which is quite cool and very uncommon for java 😉

Conclusion

In my opinion all the memory efficient programming methods for Java are very ugly. This time the simplicity of C++ easily beats Java, because in Java you need to operate on bits and bytes – in C++ you could just cast the bytes to your objects and be memory efficient as well.

But (sadly?) I’m a Java guy – and I still prefer faster development cycles than 100% memory efficiency. Read in some weeks how I’ve implemented a memory efficient Map of Geohashes or a simple graph in Java for my GraphHopper project.

Update – Resources

Discussion about “struct” in Java
ObjectLayout
Mechanical Sympathy
Persisted low latency messaging
NoSQL indexing and query engine with ultra-low latency

9 thoughts on “Memory Efficient Java – Mission Impossible?”

hander says:

19 April, 2012 at 19:20

Memory is cheap today but not the same as human resources.

Java is faster and easier to work with and performance as great or better than c++ one. Even nasty bugs are far more easy to find and fix them in java.
Frisian says:

20 April, 2012 at 06:14

http://vanillajava.blogspot.com/2011/11/ever-decreasing-cost-of-main-memory.html

’nuff said!
karussell says:

20 April, 2012 at 06:42

In 95% of the use cases I’m with you guys, thats why I’m a Java fan. But what if your data needs to fit into memory of one machine? Like it is the case and makes things a lot simpler for big graph problems e.g. routing of big areas like Germany or Europe?

Secondly a database or a library could use those memory efficient ways and hide all the freaking details.

Thirdly often less memory means more speed (less cache access, less GC if its Java etc)

But at the bottom line, saving every bit and byte is not required and often only the knowledge of the above things lets you write better programs.
Pingback: RSS Digest: Week Ending 21-Apr-2012 | Zahid Qureshi
peterlawrey says:

28 April, 2012 at 07:27

You should look at the efficiency of shared pointers for memory management. In a 64-bit JVM a reference is typically 32-bit but in a 64-bit C++ program a shared pointer can use 48 bytes of memory. (8 bytes for the reference to the object, 8 bytes for the reference to the counter, and 32-bytes for the counter as malloc can have a minimum size of 32-bytes)
- peterlawrey says:
  
  28 April, 2012 at 07:28
  
  If I large amounts of data it can be worth writing wrapper for off heap direct memory or memory mapped files. It not simple, but it can be done and give you much the same result as C++.
Pingback: Running Shortest-Path Algorithms on the German Road Network within a 1.5GB JVM « Find Time for the Karussell
karussell says:

4 January, 2013 at 13:03

Thanks Peter, I already use the off heap technic with good results in my graphhopper project to access a lot memory also for memory restricted environments like Android.