Project Valhalla makes Java memory efficient again

For a brief introduction to Project Valhalla, read this wiki at OpenJDK or watch e.g. this talk from Brian Goetz. Basically it changes the layout of data in memory and makes it possible to define compact collections of objects. Currently only arrays of primitive types like int[] or double[] are “compact”.

Long ago, with JDK 1.5, I was really excited about the new generics feature and tried the first prototype. But soon I understood that this was not really what I had expected: the JVM engineers had introduced a templating mechanism similar to the one in C++, but “only” on the surface, and the memory layout was still the same. So I was a bit disappointed, and writing memory-efficient Java software stayed hard.

Current Memory Layout

Let me describe the current memory layout with an example. If you put many instances of a Point class with two decimal values (e.g. latitude and longitude) into an array, you’ll waste lots of memory: every entry in the array is a pointer to a separate Point instance instead of the double values being stored in the array itself. And that is not all: every Point instance additionally needs some header information. See below for a picture with details. This is not only a waste of memory but also an unnecessary indirection, and for large arrays it can mean touching many different memory areas just for looping. This especially hurts when many small instances are stored.

Inline type to the rescue

For several years (!) there has been work going on in OpenJDK to address this. It is a major undertaking, as they want to integrate this deeply and want even unmodified applications to benefit from it. From time to time I look at their progress – earlier it was called “value type”, for a few months now it has been “inline type”. I think they have reached a very interesting milestone that you can easily play with:

I was not able to convince IntelliJ to accept the ‘inline’ keyword despite configuring JDK 14 etc. Not sure if this requires modifications to the IDE. But Maven worked.

The Usual Point Example

As a first test I created the simple Point class

class Point { double lat; double lon; }

and I wanted to find out the memory usage. The solid but stupid way to do this is to set e.g. -Xmx500m and increase the point count until you get an OutOfMemoryError (Java heap space). The results are:

  • without anything special a point count of 14M is possible.
  • when I added the new “inline” keyword before “class Point”, it was possible to increase the count to 32.5M!

You can also use this inlined Point class with generics like ArrayList<Point>, but you need a so-called “indirect projection”: ArrayList<Point?>. I.e. it allows backward compatibility, but you’ll lose the memory efficiency, at least at the moment, as IMO ArrayList uses Object[] and not E[].

Memory Usage Now And Then

The limit of 32.5M points is explainable via
32 500 000 * 16 / 1024.0 / 1024.0 = 496 MB, i.e. every point instance uses 16 bytes as expected.

The 14M limit means approx 37 bytes per point and is not that easy to explain. The first piece you’ll need is:

In a modern 64-bit JDK, an object has a 12-byte header, padded to a multiple of 8 bytes, so the minimum object size is 16 bytes. For 32-bit JVMs, the overhead is 8 bytes, padded to a multiple of 4 bytes.

Taken from this Stackoverflow answer

Additionally, a reference can use between 4 and 8 bytes depending on the -Xmx setting; read more about “compressed ordinary object pointers (oops)”.

This leads to the following equation for the example: 4 bytes for the reference, 12 for the header, 16 for the two doubles and 4 bytes of padding to fill the 32 bytes (multiple of 8 bytes), i.e. 36 bytes per point, which is close to the roughly 37 bytes measured above.

So without project Valhalla you currently waste over 55%:

[Figure: Project Valhalla makes Java memory efficient again]

The waste of the current memory layout can be even worse if the object is smaller. For coordinates on earth, a Point with two double values is more precise than necessary and float values are sufficient (or even less than 8 bytes per point). An “inlined” point instance with two floats needs just 8 bytes. Without “inline” you need 28 bytes (4+12+2*4+4), which means you waste more than 70%.
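The arithmetic above can be written down as a small size model. This is only a sketch under the assumptions stated in the text (12-byte header, objects padded to 8 bytes, 4-byte compressed references); the class and method names are made up for illustration, and real JVMs may lay out objects differently.

```java
// Rough size model for a Point[] on a 64-bit JVM with compressed oops.
// Assumptions from the text: 12-byte header, 8-byte alignment, 4-byte references.
class PointMemory {
    static final int HEADER_BYTES = 12;
    static final int REFERENCE_BYTES = 4;
    static final int ALIGNMENT = 8;

    /** Bytes per array entry when every entry references a separate object. */
    static long referenceLayout(int fieldBytes) {
        long unpadded = HEADER_BYTES + fieldBytes;
        // round the object size up to the next multiple of ALIGNMENT
        long padded = (unpadded + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
        return REFERENCE_BYTES + padded;
    }

    /** Bytes per array entry when the fields are stored inline in the array. */
    static long inlineLayout(int fieldBytes) {
        return fieldBytes;
    }

    /** Share of the reference layout that is pure overhead, in percent. */
    static double wastePercent(int fieldBytes) {
        long ref = referenceLayout(fieldBytes);
        return 100.0 * (ref - inlineLayout(fieldBytes)) / ref;
    }

    public static void main(String[] args) {
        // two doubles (16 bytes) and two floats (8 bytes)
        for (int fieldBytes : new int[] {2 * Double.BYTES, 2 * Float.BYTES}) {
            System.out.printf("%d payload bytes: %d bytes per point, %.1f%% overhead%n",
                    fieldBytes, referenceLayout(fieldBytes), wastePercent(fieldBytes));
        }
    }
}
```

For two doubles the model gives 36 bytes per point, for two floats 28 bytes, matching the numbers in the text.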

Other Valhalla Features

Another implemented feature is the == operator. Try the following unit test in a current JVM:

assertTrue(new Point(11, 12) == new Point(11, 12));
assertTrue(new Point(12, 12) != new Point(11, 12));

And you’ll notice it fails. With project Valhalla this passes and you do not even have to implement an equals method!
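To see why the test fails today, here is a minimal sketch with an ordinary class in current Java. The Point class with a constructor and the IdentityDemo class are assumptions for illustration: == compares object identity, so value comparison needs a hand-written equals (and hashCode).

```java
import java.util.Objects;

class IdentityDemo {
    public static void main(String[] args) {
        Point a = new Point(11, 12);
        Point b = new Point(11, 12);
        System.out.println(a == b);      // false: two distinct objects
        System.out.println(a.equals(b)); // true: same field values
    }
}

// An ordinary (identity) class as it works in Java today.
final class Point {
    final double lat;
    final double lon;

    Point(double lat, double lon) {
        this.lat = lat;
        this.lon = lon;
    }

    // Without this override, equals would fall back to identity as well.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Point)) return false;
        Point other = (Point) o;
        return Double.compare(lat, other.lat) == 0
                && Double.compare(lon, other.lon) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(lat, lon);
    }
}
```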

At the moment, as far as I know, there is no direct primitive support like ArrayList<int>.

Also “inline”-types do not support declaring an explicit super class, but you can use composition. For example:

inline class Point3D { double ele; Point point; }

My Lenovo On-site Warranty Extension

Three years ago I blogged about the Thinkpad T460 that I had newly bought, and I was very pleased with it. Until the mainboard broke a few weeks ago, just when the 3-year warranty would have been over. Luckily I had bought the 5-year warranty extension with on-site support.

The CPU or something froze the laptop after only a few minutes of working with it. The display was still on but neither the keyboard nor the touch pad responded. Sometimes the CPU fan was active for a few minutes after this, but not at maximum level. The only possibility was to shut down the laptop.

This was no Linux compatibility issue. I had not updated anything, so it happened out of nowhere. I disabled Wifi and Bluetooth and also looked into the kernel logs to confirm that there was no kernel panic, and I even freshly installed Ubuntu 18.04 just to get the same problems. Furthermore I also updated to the latest BIOS version without success.

Day 1, 20.05.2019 (counting working days only)

After these results I called the hotline on Monday and they replied that I should run the extensive diagnostics that come with the BIOS. OK, so I did this, and it occasionally froze during the CPU stress test too. This took me at least 2 hours, as I wanted to be precise and helpful with my answer and provided details, e.g. that I could even reproduce it by unplugging the power cable**, or that it sometimes ran through the CPU stress test only to freeze later during the very long-running “memory test”. Also, the laptop often did not even start for minutes after these freezes.

Day 2

Nothing happened and I had no time to call them again as sometimes you have to work 😉 and improve the fallback laptop.

Day 3

At 11am I still had no response although in the warranty they say “usually the next working day” (üblicherweise am nächsten Werktag) a technician will come to fix it. So I called again. “Funnily” the support Email from Day 1 contained a broken support telephone number for Germany. So the real number has just one zero after “22”, i.e. the correct support number is:

+49 201 22099 888

They confirmed that this seems to be a hardware problem and promised to send me a new mainboard via express courier and also a technician the next day (to be safe I confirmed mobile number and address). OK, IMO not as fast as they say, but acceptable for me, as I have an old laptop with which I can continue working, at least on the important things.

Day 4

At 10am the UPS package with the parts arrived. Why don’t they send it to the technician?***

In the afternoon I called them once again to understand why the technician did not come. And roughly 1h later the technician called me and repaired my laptop 🙂 and it seems to work. Hope the “refurbished” sticker on the mainboard box is not a bad sign.

Day 5

Unfortunately the function key does not work anymore (or maybe never did with the new mainboard). I’m pretty sure this is a hardware issue: first of all, the Fn key is something that the firmware controls, and also when I swap the Ctrl and Fn keys, the Ctrl key works properly for e.g. a brighter display. I tried an older BIOS and the most recent BIOS, but the Fn key still does not work.

I called the hotline and they will send me a new keyboard. I’m unsure why this should fix my issue, as the keyboard worked properly before the mainboard switch, but who knows.

Day 6

The keyboard arrived.

Day 7

The technician replaced the keyboard. The function key is still not working under his Windows and also my Ubuntu. He argued that it still can be a driver issue. I argued that it worked properly with the “freezing”, old mainboard (on Ubuntu).

Day 8-10

No feedback from Lenovo regarding what to do now about this Fn key problem. This time I did not call them and just waited for them to act.

Day 11

Something will arrive in the next two days, they wrote via email.

Day 12

A new mainboard arrived! The technician came one hour later and installed it. The great thing is: everything is working now – finally 🙂

Conclusion

The experience was not as advertised (“expected the next business day”) and could be improved. The most important part to improve is to avoid forcing the customer to call the hotline over and over again to speed things up (by days): where are the parts? where is the technician?

I had to work around a fully dysfunctional laptop only for the first 4 days. (A non-working Fn key is not that bad.)

So, out of 10 stars I would give 7. It isn’t that good and not enough information is passed on to the customer, but it seems that at least they care that issues are fully fixed, and 4 days was kind of acceptable for me. And if there is an issue, they likely pay more money than you paid for the warranty.

All in all I invested roughly one day into calling them, preparing the fallback hardware, writing this blog post and finding out what was actually wrong. I probably invested too much time into making sure that it was not my fault.

**one strange thing left is that the freeze is still reproducible when plugging or unplugging the power cable while the extensive test runs in the BIOS (CPU stress test). So maybe this is unrelated to my issue.

***after a chat with the technician he said that last year the parts were shipped to them, which made indeed more sense to him as well. But after one more thought I think shipping it to the customer could make (a bit) sense if they expect the technician on the road and then he could go directly to the customer without going back to pick the parts.

Lenovo BIOS Update regarding Spectre

Update: since Ubuntu 18.04 this is no longer necessary, as the software update triggers this and everything is done automatically.

In the following I describe the process to update the BIOS of a Lenovo Laptop that runs Linux for the recent problems related to Spectre.

1. Download the BIOS update “Bootable CD” from Lenovo, which is just an ISO image. For example for the Thinkpad T460 download it from here.

2. Extract the BIOS from the ISO as otherwise “dd” cannot copy the ISO properly:

geteltorito -o bios.img /home/user/Downloads/r06uj58d.iso

Check that the USB stick is really at /dev/sdb (confirm size)

sudo fdisk -l /dev/sdb

3. Copy the BIOS to some USB stick:

sudo dd if=bios.img of=/dev/sdb && sync

4. Reboot and, while you see “To interrupt normal startup, press Enter” at boot, press F1 to boot from USB. Connect your laptop to the power cable first, otherwise the update will not proceed, then follow the instructions.

Once you have updated the BIOS you should see a new BIOS version 1.32/1.11 (BIOS/EC ID: R06ET58W/R06HT31W) in the BIOS setup.

As I understand this fixes the Spectre vulnerability only. To improve the situation related to the so called Meltdown problem do a software update:

sudo apt update && sudo apt upgrade

Afterwards you should have at least kernel version 4.4.0-109:

$ uname -r
4.4.0-109-generic

Lenovo ThinkPad T460 – A Good Linux Laptop For Development

After several years with my Dell Latitude E6400 I was searching for a new, more powerful Linux machine for my coding and performance tweaking tasks. Although the Dell XPS line sounded interesting due to the “native” Linux support, it was also expensive with 16GB RAM and 3-year warranty (>2200€), and several users reported problems with CPU whining. I didn’t want to risk this, and reviews of the Lenovo T460 also suggested a more silent and longer lasting experience. So I finally bought the T460 and was just hoping to get good Linux support. Here are my experiences after a few months of usage. Keep in mind that everyone has different requirements, so maybe the title should be “a good Linux laptop for a certain subset of development tasks”. E.g. I’ve not yet tested 3D stuff / hardware acceleration.

[Pictures: front/top view and side view]

My configuration is a T460, Intel i5-6200U CPU, 16GB RAM (PC3-12800 DDR3L, 2 DIMM) and 256GB SSD disc (Serial ATA3, opal 2.0). 14” IPS display, no touch screen, no finger print reader. The price was ~1300€ with 3 year guarantee, plus VAT. I’ve installed Xubuntu 16.04.1 on it.

From the Linux side everything I need is working now. And much of it worked out of the box with (X)Ubuntu 16.04.1, which I find really nice.

I had a few problems initially though:

  1. when playing a video the video didn’t scale and once I fixed this there was a delay. Finally I set the output to “X11 video output (XCB)” under preferences->Video of VLC.
  2. the WLAN worked fine but somehow the LAN disconnected frequently and reconnected automatically afterwards. Very strange. I didn’t find anything on the Internet how to fix this. But my brother mentioned that the software might have a problem with a slow 100Mbit connection and suggested to configure this. I switched to a 1Gbit port (same router) and it solved the problem! Now the LAN did not wake up after suspend (known bug) but I was able to start it via:
     sudo /bin/systemctl --no-block restart NetworkManager.service

Note that sleep via RAM (“suspend”) works, i.e. you can close the lid as usual. Sleep via disc (“hibernate”) does not work, but I find the boot time compelling enough that I do not need hibernation: ~2sec BIOS boot, plus ~13sec until login, plus 2sec to open a browser. BTW: hibernation didn’t work properly with the Dell either in recent Ubuntu versions. BTW: before, the BIOS boot took 10 seconds; to fix this, disable the UEFI Network Stack (IPv4+IPv6) in the BIOS.

The case:

  1. The case is robust, not really beautiful, but also not ugly. An inconspicuous Thinkpad.
  2. the case feels robust but plastic, where the top cover feels of higher quality and not so plastic.
  3. It weighs 1540g without the rear battery and 1870g with it. The height is 2cm without the rear battery and 3.5cm at the back with it.
  4. there is no internal DVD player, use any external
  5. some edges of the case are too sharp for my taste and feel unfinished
  6. The keyboard is ok. Some love the ones of Lenovo, I find it ‘just’ okay. I really like the keys themselves, but I do not like the track point and you cannot disable it without also disabling the extra mouse keys, which I want.
  7. The keyboard also has the function key at the bottom left corner and not the CTRL key. I didn’t like this and switched them in the BIOS. The same for the Fn keys, which I preferred over the other keys, so I needed to switch this in the BIOS too. It still has an ESC key (!) and a big enter key, which is nice to have these days.
  8. The page down/up keys are too close to the arrow keys for my taste, but you can get used to it
  9. Sadly there is no hardware switch to turn WLAN or Bluetooth on/off
  10. There is no LED indicating a power connection which is ugly when the device is turned off and has an empty battery. So you cannot be sure if it really charges.

Connectivity:

  1. 3 USB ports (USB 3.0) which are always on (nice!)
  2. HDMI
  3. LAN port
  4. the usual line for the microphone (mono) & headphone (stereo)
  5. some other stuff like a Display Port I think, look it up

Some important and partially subjective comments on the laptop:

  1. The performance is good and everything works smoothly and fast, but this is probably the case for every ordinary laptop with SSD and a normal CPU.
  2. it is super quiet. When idle it is indeed silent, under load there is only a very minimal “hissing” (my old monitor is louder). And it does not get hot
  3. The IPS 14” matte display is ok and has a resolution of 1920 x 1080.
  4. The RAM is upgradable up to 32GB
  5. My external monitor worked out of the box with the Xubuntu inbuilt switch software (plugging in the device opens this and offers the different choices)
  6. The AC adapter is small but I like the DELL more where the cables could be bundled easier and faster than with this short and wrongly placed ‘rope’.
  7. The touch pad is good and also supports two finger gestures on Linux, but when writing you often hit it at the beginning and this sometimes garbles your text. You can learn to avoid hitting it though. This is probably a software problem, as the software should disable the touch pad while writing. It is called ‘palm detection’, and it seems this is improvable under Linux, e.g. with this post. I do not care much about it, as I learned to avoid this (nearly 100%) and most of the time I use an external keyboard.
  8. The battery is really nice. On the Dell I got only 5 hours even in its early days, going down to something like 3h. Now the T460 lets you do normal work for >13 hours with the internal 23Wh and the additional big (72Wh) rear battery. We’ll have to see how this behaves with the time.
  9. The extra 72Wh rear battery is so thick that the laptop stands inclined (see picture above), which I thought was ugly at the beginning. But it turns out that this is not bad, it only causes minor problems if you have just a small table like in Deutsche Bahn
  10. The microphone and the chat camera are good
  11. Bluetooth works, some devices need special pulse audio setup
  12. I do not like the sound output, it is not clear but also not worse than e.g. the Dell.
  13. My printer and scanner (canon pixma mx 725) works flawlessly, even printing photos
  14. The order via the web shop took roughly 10 days, they say this is so long due to customization. I didn’t care much. Also I got a bit (sales) support and this was done via telephone and good. We’ll see how tech support looks like though. BTW: When you buy the cheapest option you can select the cheapest Windows license and upgrade the other stuff saving a bit money, also I was using a minor discount I think.

Similar products of Lenovo are the T460p (>1100€, more power, less mobile I guess) and the more expensive T460s (>1500€) and the X1 Carbon (>1800€). In all cases the 16GB RAM requirement turned out to be not that simple or expensive. I decided on the T460 because of the battery time, low/lack of noise and price.

TODOs:

  • 3D stuff, so please have a look into other reviews if you develop graphics etc
  • multiple external monitors

Conclusion

So far I like the ThinkPad T460 and can recommend it. It is powerful, has a very long lasting battery, it is silent under normal work and you can get your stuff done quickly and get solid Linux support. The Linux support is so good that I’m wondering why they do not ship it commercially to attract people like me.

The price performance ratio is good in my opinion – I can judge better in the next years when I need support and/or stuff breaks.

On the downside there is the cheap feeling of the case (plastic keyboard side and too sharp-edged) and the “track pad interferences” when you type.

See further discussion also on hacker news.

Fun with Shapefiles, CRSs and GeoTools

Although I’ve been in the “GIS business” for years now, I never had to deal with shapefiles directly. Now it was time to also investigate tools like QGIS and hack together a simple reader for shp files. At least I thought it was simple, but calling me a GIS expert afterwards would be a ridiculous overstatement.

GeoTools fun

A quick look and I decided to go with GeoTools, as I knew it by name and needed a tool in Java. Thanks to QGIS I understood quickly that in my case I had to deal with a list of lists of lines containing coordinates, but how to read that via GeoTools? The internet provided several solutions, but I didn’t find complete examples for my case. As it turned out, I had to explicitly cast 2 times (!): first from “Feature” to “SimpleFeature” and then from “Geometry” to “MultiLineString”. Not sure if this is really necessary. At least this makes learning a new API very hard.

Now I had the initial code:

Map<String, Object> connect = new HashMap<>();
// a File is not sufficient as a shapefile consists of multiple files
connect.put("url", file.toURI().toURL());
DataStore dataStore = DataStoreFinder.getDataStore(connect);
String[] typeNames = dataStore.getTypeNames();
// use the first (for a shapefile usually the only) type
String typeName = typeNames[0];
FeatureSource featureSource = dataStore.getFeatureSource(typeName);
CoordinateReferenceSystem sourceCRS = featureSource.getSchema().getCoordinateReferenceSystem();
FeatureCollection collection = featureSource.getFeatures();
// allow for some error due to different datums ('bursa wolf parameters required')
boolean lenient = true;
// targetCRS is the CRS to transform into; it has to be defined elsewhere
MathTransform transform = CRS.findMathTransform(sourceCRS, targetCRS, lenient);

List<List<GPXEntry>> lineList = new ArrayList<>();
try (FeatureIterator iterator = collection.features()) {
    while (iterator.hasNext()) {
        SimpleFeature feature = (SimpleFeature) iterator.next();
        MultiLineString mlString = (MultiLineString) feature.getDefaultGeometry();
        ...
    }
}

How short and beautiful. But: it did not compile. And that although I was using the recommended “maven procedure”. It seems that GeoTools follows a slightly unusual path in that it requires you to define the repositories in your pom.xml – I only found a solution with the snapshot versions, but this was sufficient for the time being.

CRS fun

At least it seemed to work then. But after some more time I found out that the coordinates had just a tiny offset, so something was wrong with the source or target coordinate reference system (CRS) or with the transformation itself. Again QGIS helped me here and determined the source CRS correctly. But GeoTools was somehow wrong, and initially I thought it was GeoTools’ fault.

But I quickly stumbled over another CRS issue and had to deal with exactly the same CRSs leading to different results. In my case it was CRS.decode("EPSG:4326") vs. DefaultGeographicCRS.WGS84 – so they are identical but the results were different!? It turns out that the coordinate axes are swapped! GeoTools’ fault? No! GeoTools even gave me the solution in its documentation:
“So if you see some data in “EPSG:4326” you have no idea if it is in x/y order or in y/x order”!
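The pitfall can be shown without GeoTools at all. The following is a minimal sketch with a hypothetical helper (AxisOrderDemo and toLonLat are not GeoTools API): the same pair of numbers ends up at a different place depending on the assumed axis order. As far as I know, GeoTools also offers a CRS.decode(String, boolean) overload to force longitude-first order, but check the documentation of your version.

```java
// Hypothetical helper, not part of GeoTools: makes the axis order explicit.
class AxisOrderDemo {
    enum AxisOrder { LAT_LON, LON_LAT }

    /** Returns {lon, lat} regardless of the order the two ordinates arrived in. */
    static double[] toLonLat(double first, double second, AxisOrder order) {
        return order == AxisOrder.LON_LAT
                ? new double[] {first, second}
                : new double[] {second, first};
    }

    public static void main(String[] args) {
        // Berlin is roughly at latitude 52.5, longitude 13.4
        double[] a = toLonLat(52.5, 13.4, AxisOrder.LAT_LON);
        double[] b = toLonLat(52.5, 13.4, AxisOrder.LON_LAT);
        System.out.println(a[0] + ", " + a[1]); // 13.4, 52.5
        System.out.println(b[0] + ", " + b[1]); // 52.5, 13.4 (a completely different place)
    }
}
```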

Deployment

Phew. Okay. I was ready for deployment and used my usual git and mvn assembly procedure to push stuff to my server, but then I got exceptions at runtime about missing classes! Oh no – how can this be when I use maven?
As it turns out GeoTools requires the maven shade plugin in order to bundle the database for correct CRS transformation properly via a plugin architecture I think. And look: the whole jar is now nearly 12MB!

Conclusion

The GIS and Java world are called “enterprise” for a reason. I hope I can help others with my findings. Find the fully working code here.

Units in OpenStreetMap

First of all, this is not a rant, nor am I a (regular) mapper, but I have some years of experience in reading aka ‘interpreting’ OSM data. I invite mappers to read, understand and comment on this post (in this order ;)).

Learning and understanding a specific tag

When I learn about a new tag for GraphHopper, e.g. maxweight, the first thing I do is go to taginfo, look at some common use cases and implement them. Then I increase the parsing area to country-wide and add more parsing code here and there to ignore or include commonly used values, depending on whether they make sense. Then I go worldwide, doing the same. What is left then, see this gist, are some very infrequently used values; some make sense, like ‘15 US ton’, and some don’t, like ‘agriculture’. Now I need to decide whether to fix them, ignore them or add parsing code. In the case of the weight values I did see a reason to support reading values like ‘13000 lbs’ or the most frequent ones like ‘8000 (t(on)s)’, but not e.g. ‘13000 lb’ (10 times worldwide), which I just fixed by converting them to SI units – maybe I should have just added the ‘s’?

OpenStreetMap is a database

In OpenStreetMap the tagging schema is not always clear and differs from local community to local community. It is a good thing that OSM is this flexible. The question now is whether this difference should be reflected in the data itself, or whether a more concise database should be preferred and the local differences moved into the ‘view’, i.e. the editors. I think:

OSM should prefer more concise data if possible and this gets more important as it grows.

Now back to my example of weight values.

SomeoneElse commented today on my non-automatic change, in which I converted ’15 US tons’ (a value with a worldwide occurrence of 5 (!)) to 13.607 “SI” tons, that we should not make it more complex via SI units. But if you look at the US unit system with ‘US tons’, ‘short’ and ‘long tons’, ‘pounds’, ‘lbs’ etc., plus the various ‘weight’ possibilities listed in this good proposal, you can guess that this is already not that easy. So such an edit would probably be better done via an assisting editor that converts between weight units.

Popular OSM editors should make it possible to use local units but convert them into some SI-based unit when stored.
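To make the conversion concrete, here is a minimal sketch of such a unit normalisation. It is not GraphHopper’s actual parsing code; the class name, the accepted unit spellings and the rule that a bare number means metric tonnes are my assumptions for this example.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical parser for weight values, converting everything to metric tonnes.
class MaxWeightParser {
    static final double TONNES_PER_POUND = 0.00045359237;  // 1 lb = 0.45359237 kg
    static final double TONNES_PER_SHORT_TON = 0.90718474; // 1 US (short) ton = 2000 lb

    // a decimal number followed by an optional unit
    private static final Pattern VALUE =
            Pattern.compile("^([0-9]+(?:\\.[0-9]+)?)\\s*(.*)$");

    /** Returns the weight in metric tonnes, or NaN if the value is not understood. */
    static double toTonnes(String raw) {
        Matcher m = VALUE.matcher(raw.trim().toLowerCase(Locale.ROOT));
        if (!m.matches()) return Double.NaN;
        double number = Double.parseDouble(m.group(1));
        String unit = m.group(2).trim();
        switch (unit) {
            case "":  // assumption: a bare number means metric tonnes
            case "t":
                return number;
            case "lb":
            case "lbs":
                return number * TONNES_PER_POUND;
            case "st":
            case "us ton":
            case "us tons":
                return number * TONNES_PER_SHORT_TON;
            default:
                return Double.NaN; // unknown unit, e.g. free text
        }
    }

    public static void main(String[] args) {
        System.out.println(toTonnes("15 US tons"));  // about 13.61 tonnes
        System.out.println(toTonnes("13000 lbs"));   // about 5.9 tonnes
        System.out.println(toTonnes("agriculture")); // NaN
    }
}
```

An editor could run something like this when saving, so that mappers still type the unit they see on the sign while the database stores one consistent unit.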

On my OSM diary someone correctly says: But “we map as it is” includes units in a way too. A limiting sign at a bridge does have a written or implied unit.
I answered: Is mapping really the process down to the database? I doubt that. Mapping means modelling the real situation with the tools we have. The tools will evolve and so should the mapping process making the database more concise and the mapping process less complex.