Project Valhalla makes Java memory efficient again

For a brief introduction to Project Valhalla, read this wiki at OpenJDK or watch e.g. this talk from Brian Goetz. Basically it changes the layout of data in memory and introduces a way to define compact collections of objects. Currently only arrays of primitive types like int[] or double[] are “compact”.

Long ago, with JDK 1.5, I was really excited about the new generics feature and tried the first prototype. But I soon understood that this was not really what I had expected: the JVM engineers had introduced a templating mechanism similar to the one in C++, but “only” on the surface, and the memory layout was still the same. So I was a bit disappointed, and writing memory-efficient Java software stayed hard.

Current Memory Layout

Let me describe the current memory layout with an example. If you put many instances of a Point class with two double values (e.g. latitude and longitude) into an array, you’ll waste lots of memory, because every entry in the array is a pointer to a separate Point instance instead of the array holding the double values directly. And that is not all: every Point instance additionally needs some header information. See below for a picture with details. This is not only a waste of memory but also an unnecessary indirection, and for large arrays it can also mean touching many different memory areas just to loop over them. This especially hurts when many small instances are stored.
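To make the difference concrete, here is a minimal sketch of what you have to do by hand today to get a compact layout: store the coordinates in one flat double[] instead of a Point[]. FlatPointList and its methods are names I made up for this example; they are not part of the JDK or of Valhalla.

```java
import java.util.Arrays;
import java.util.Objects;

// Hypothetical helper for illustration only: keeps all points in a single
// double[] so there is no per-point object header and no per-point reference.
final class FlatPointList {
    // Layout: lat of point i at index 2*i, lon of point i at index 2*i + 1.
    private double[] data = new double[16];
    private int size;

    void add(double lat, double lon) {
        if (2 * size == data.length) {
            // Backing array is full: double its capacity.
            data = Arrays.copyOf(data, data.length * 2);
        }
        data[2 * size] = lat;
        data[2 * size + 1] = lon;
        size++;
    }

    double lat(int index) {
        return data[2 * Objects.checkIndex(index, size)];
    }

    double lon(int index) {
        return data[2 * Objects.checkIndex(index, size) + 1];
    }

    int size() {
        return size;
    }
}
```

This is compact, but you lose the Point abstraction and have to pass indices around, which is exactly the kind of code that stays hard to write and maintain.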

Inline type to the rescue

For several years (!) there has been work going on in OpenJDK to address this. It is a major undertaking, as they want to integrate this deeply and want even unmodified applications to benefit from it. From time to time I look at their progress. Earlier it was called “value type”; for a few months now it has been “inline type”. I think they reached a very interesting milestone that you can easily play with:

I was not able to convince IntelliJ to accept the ‘inline’ keyword despite configuring JDK 14 etc. I am not sure whether this requires modifications to the IDE. But Maven worked.

The Usual Point Example

As a first test I created the simple Point class

class Point { double lat; double lon; }

and I wanted to find out the memory usage. The solid but stupid way to do this is to set e.g. -Xmx500m and increase the point count until you get an OutOfMemoryError (Java heap space). The results are:

  • without anything special a point count of 14M is possible.
  • after adding the new “inline” keyword before “class Point”, it was possible to increase the count to 32.5M!
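The measurement can be reproduced with something like the following sketch. PointCountProbe is a name chosen for this example, and the file is written for a standard JDK; I have not verified this exact file against a Valhalla prototype build, where you would add the “inline” keyword to the Point class.

```java
// Sketch of the "increase the count until it breaks" measurement.
final class PointCountProbe {
    static final class Point {
        final double lat;
        final double lon;

        Point(double lat, double lon) {
            this.lat = lat;
            this.lon = lon;
        }
    }

    // Allocates the array and one Point per slot, so every entry
    // really occupies heap space and nothing is shared.
    static Point[] allocate(int count) {
        Point[] points = new Point[count];
        for (int i = 0; i < count; i++) {
            points[i] = new Point(i, i);
        }
        return points;
    }

    public static void main(String[] args) {
        int count = Integer.parseInt(args[0]);
        Point[] points = allocate(count);
        // Only reached if the heap was large enough for all points.
        System.out.println("Allocated " + points.length + " points");
    }
}
```

After compiling, run it e.g. with java -Xmx500m PointCountProbe 14000000 and raise the last argument until you hit the OutOfMemoryError.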

You can also use this inlined Point class with generics like ArrayList<Point>, but you need a so-called “indirect projection”: ArrayList<Point?>. I.e. it allows backward compatibility, but you’ll lose the memory efficiency, at least at the moment, as IMO ArrayList uses Object[] and not E[].

Memory Usage Now And Then

The limit of 32.5M points is explainable via
32 500 000 * 16 / 1024.0 / 1024.0 = 496 MB, i.e. every point instance uses 16 bytes as expected.

The 14M limit means approx 37 bytes per point and is not that easy to explain. The first piece you’ll need is:

In a modern 64-bit JDK, an object has a 12-byte header, padded to a multiple of 8 bytes, so the minimum object size is 16 bytes. For 32-bit JVMs, the overhead is 8 bytes, padded to a multiple of 4 bytes.

Taken from this Stack Overflow answer

Additionally, a reference uses either 4 or 8 bytes depending on the -Xmx setting; read more about “compressed ordinary object pointers (oops)”.
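If you want to check which reference size your JVM is actually using, the following sketch asks HotSpot for the UseCompressedOops flag. OopsCheck is a made-up name, and the code assumes a HotSpot-based JDK; on other JVMs the management bean or the flag may not exist, which is why the sketch falls back to “unknown”.

```java
import java.lang.management.ManagementFactory;
import com.sun.management.HotSpotDiagnosticMXBean;

// HotSpot-specific sketch: reports whether compressed oops are enabled.
final class OopsCheck {
    // Returns "true" or "false" as reported by the JVM,
    // or "unknown" if the flag cannot be queried.
    static String compressedOops() {
        try {
            HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption("UseCompressedOops").getValue();
        } catch (RuntimeException e) {
            // E.g. the flag does not exist on this JVM or the bean is missing.
            return "unknown";
        }
    }

    public static void main(String[] args) {
        System.out.println("UseCompressedOops = " + compressedOops());
    }
}
```

With “true” a reference takes 4 bytes, with “false” it takes 8 bytes on a 64-bit JVM.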

This leads to the following equation for the example: 4 bytes for the reference, plus 12 for the header, 16 for the two doubles and 4 bytes of padding, which rounds the object up to 32 bytes (a multiple of 8 bytes), i.e. 36 bytes per point.

So without project Valhalla you currently waste over 55%:

[Figure: Project Valhalla makes Java memory efficient again]

The memory waste of the current memory layout can be even worse if the object is smaller. For coordinates on earth, a Point with two double values is more precise than necessary and float values are sufficient (you could get by with even less than 8 bytes). An “inlined” point instance then needs just 8 bytes. Without “inline” you need 28 bytes (4+12+2*4+4), which means you waste more than 70%.
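The arithmetic for both cases can be written down as a small sketch. It assumes the numbers used in this post (12-byte header, 8-byte alignment, 4-byte compressed reference); LayoutMath and its method names are invented for this example and do not measure anything on a real JVM.

```java
// Back-of-the-envelope model of the per-entry cost in a Point[] array.
final class LayoutMath {
    static final int HEADER = 12;    // object header in bytes (64-bit JVM)
    static final int REFERENCE = 4;  // array slot in bytes (compressed oops)
    static final int ALIGNMENT = 8;  // objects are padded to a multiple of this

    // Bytes per array entry when each entry points to a separate object.
    static int bytesPerEntry(int fieldBytes) {
        int unpadded = HEADER + fieldBytes;
        // Round up to the next multiple of ALIGNMENT.
        int padded = (unpadded + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
        return REFERENCE + padded;
    }

    // Fraction of the per-entry bytes that is not field data.
    static double wastedFraction(int fieldBytes) {
        int total = bytesPerEntry(fieldBytes);
        return (total - fieldBytes) / (double) total;
    }

    public static void main(String[] args) {
        System.out.println("two doubles: " + bytesPerEntry(16) + " bytes, wasted "
                + wastedFraction(16));
        System.out.println("two floats:  " + bytesPerEntry(8) + " bytes, wasted "
                + wastedFraction(8));
    }
}
```

For 16 bytes of fields this gives 36 bytes per entry, and for 8 bytes of fields 28 bytes, which matches the numbers above.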

Other Valhalla Features

Another feature already implemented is the == operator. Try the following unit test in a current JVM:

assertTrue(new Point(11, 12) == new Point(11, 12));
assertTrue(new Point(12, 12) != new Point(11, 12));

And you’ll notice it fails. With project Valhalla this passes and you do not even have to implement an equals method!
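For comparison, this is roughly what you have to write on a current JVM to get value-based comparison. It is only a sketch: PlainPoint and EqualityDemo are names chosen for this example, and even with equals in place the == operator keeps comparing object identity.

```java
import java.util.Objects;

// Ordinary (identity) class with a hand-written equals and hashCode.
final class PlainPoint {
    final double lat;
    final double lon;

    PlainPoint(double lat, double lon) {
        this.lat = lat;
        this.lon = lon;
    }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof PlainPoint)) {
            return false;
        }
        PlainPoint p = (PlainPoint) other;
        return Double.compare(lat, p.lat) == 0 && Double.compare(lon, p.lon) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(lat, lon);
    }
}

final class EqualityDemo {
    public static void main(String[] args) {
        PlainPoint a = new PlainPoint(11, 12);
        PlainPoint b = new PlainPoint(11, 12);
        System.out.println(a == b);      // false: two distinct objects
        System.out.println(a.equals(b)); // true: same field values
    }
}
```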

At the moment, as far as I know, there is no direct primitive support like ArrayList<int>.

Also, “inline” types do not support declaring an explicit superclass, but you can use composition. For example:

inline class Point3D { double ele; Point point; }