Fun with Shapefiles, CRSs and GeoTools

Although I’m now in the “GIS business” for years I had never to deal with shapefiles directly. Now it was time also to investigate tools like QGIS and hack together a simple reader for shp files. At least I thought it was simple but calling me a GIS expert afterwards would be a ridiculous understatement.

GeoTools fun

A quick look and I decided to go with GeoTools as I knew it from name and I needed a tool in Java. Thanks to QGIS I understood quickly that in my case I had to deal with a list of a list of lines containing coordinates but how to read that via GeoTools? The internet provided several solutions, but I didn’t found complete examples for my case. As it turned out: I had to explicitly cast 2 times (!) first from “Feature” to “SimpleFeature” and then from “Geometry” to “MultiLineString”. Not sure if this is really necessary. At least this makes learning a new API very hard.

Now I had the initial code:

Map connect = new HashMap();
// a File is not sufficient as a shapefile consists of multiple files
connect.put("url", file.toURI().toURL());
DataStore dataStore = DataStoreFinder.getDataStore(connect);
String[] typeNames = dataStore.getTypeNames();
String typeName = typeNames[0];
FeatureSource featureSource = dataStore.getFeatureSource(typeName);
CoordinateReferenceSystem sourceCRS = featureSource.getSchema().getCoordinateReferenceSystem();
FeatureCollection collection = featureSource.getFeatures();
// allow for some error due to different datums ('bursa wolf parameters required')
boolean lenient = true;
MathTransform transform = CRS.findMathTransform(sourceCRS, targetCRS, lenient);

List<List<GPXEntry>> lineList = new ArrayList<>();
try (FeatureIterator iterator = collection.features()) {
    while (iterator.hasNext()) {
        SimpleFeature feature = (SimpleFeature) iterator.next();
        MultiLineString mlString = (MultiLineString) feature.getDefaultGeometry();
        ...
    }
}

How short and beautiful. But: It did not compile. And that although I was using the recommended “maven procedure”. It seems that GeoTools seems to follow a bit unusual path that it requires you to define the repositories in your pom.xml – I did only find a solution with the snapshot versions but this was sufficient for the time being.

CRS fun

At least it seemed to work then. But after further longish time I found out that the coordinates had just a tiny offset, so something was wrong with the source or target coordinate reference system (CRS) or with the transformation itself. Again QGIS helped me here and determined the source CRS correctly. But GeoTools was somehow wrong and initially I thought it was GeoTools fault.

But I quickly stumbled over another CRS issue and had to deal with exactly the same CRSs leading to different results. In my case it was CRS.decode(“EPSG:4326”) vs. DefaultGeographicCRS.WGS84 – so they are identical but the results were different!? It turns out that the coordinate axes are mixed! GeoTools fault? No! GeoTools even gave me the solution in its documentation:
“So if you see some data in “EPSG:4326” you have no idea if it is in x/y order or in y/x order”!

Deployment

Puh. Okay. I was ready for deployment and used my usual git and mvn assembly procedure to push stuff on my server but then I got exceptions while runtime about missing classes! Oh no – how can this be when I use maven?
As it turns out GeoTools requires the maven shade plugin in order to bundle the database for correct CRS transformation properly via a plugin architecture I think. And look: the whole jar is now nearly 12MB!

Conclusion

The GIS and Java world are called “enterprise” for a reason. I hope I can help others with my findings. Find the fully working code here.