Today I did a bit of research for GraphHopper and stumbled upon yet another minor trick which could speed up the execution of the Dijkstra algorithm. Let me briefly introduce this shortest path algorithm: starting at one node, it repeatedly polls the entry with the smallest distance from a priority queue, marks its node as visited and updates the distances of all unvisited neighbors.
If you need the path (and not only the shortest path tree) you give the method an additional toNode parameter and compare it to distEntry.node to break the loop. Once the node is found you need to recursively extract the path by following the distEntry.parent references back from the last entry.
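To make the description above concrete, here is a minimal sketch of such a method. It is not GraphHopper's actual code: the class SimpleDijkstra, the Edge type, the integer node ids and the addEdge helper are assumptions made for illustration, and the path is extracted with a loop rather than recursion. Since java.util.PriorityQueue has no decrease key operation, the sketch removes the old entry and adds a new one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical class for illustration only, not GraphHopper's API.
class SimpleDijkstra {
    /** Directed edge with a non-negative weight. */
    static final class Edge {
        final int to;
        final double weight;
        Edge(int to, double weight) { this.to = to; this.weight = weight; }
    }

    /** A node, its tentative distance and the entry it was reached from. */
    static final class DistEntry {
        final int node;
        final double distance;
        final DistEntry parent;
        DistEntry(int node, double distance, DistEntry parent) {
            this.node = node; this.distance = distance; this.parent = parent;
        }
    }

    // adjacency list: node id -> outgoing edges
    private final Map<Integer, List<Edge>> graph = new HashMap<>();

    void addEdge(int from, int to, double weight) {
        graph.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, weight));
    }

    /** Returns the node ids from fromNode to toNode, or an empty list if unreachable. */
    List<Integer> findPath(int fromNode, int toNode) {
        Set<Integer> visited = new HashSet<>();
        Map<Integer, DistEntry> map = new HashMap<>(); // node -> its current queue entry
        PriorityQueue<DistEntry> heap =
            new PriorityQueue<DistEntry>((a, b) -> Double.compare(a.distance, b.distance));

        DistEntry start = new DistEntry(fromNode, 0, null);
        map.put(fromNode, start);
        heap.add(start);
        while (!heap.isEmpty()) {
            DistEntry distEntry = heap.poll();
            if (distEntry.node == toNode)
                return extractPath(distEntry); // break the loop once the goal is polled
            visited.add(distEntry.node);
            for (Edge e : graph.getOrDefault(distEntry.node, Collections.emptyList())) {
                if (visited.contains(e.to)) continue;
                double newDist = distEntry.distance + e.weight;
                DistEntry old = map.get(e.to);
                if (old != null && old.distance <= newDist) continue;
                // "decrease key": remove the old entry (a linear scan) and re-add
                if (old != null) heap.remove(old);
                DistEntry next = new DistEntry(e.to, newDist, distEntry);
                map.put(e.to, next);
                heap.add(next);
            }
        }
        return Collections.emptyList();
    }

    /** Follows the parent references back to the start and reverses the result. */
    private static List<Integer> extractPath(DistEntry last) {
        List<Integer> path = new ArrayList<>();
        for (DistEntry e = last; e != null; e = e.parent) path.add(e.node);
        Collections.reverse(path);
        return path;
    }
}
```

The heap.remove(old) call is the expensive part here, because the queue has to scan for the entry before it can remove it.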
So, what should we improve?
Regarding performance I’ve already included a Map to directly get the DistanceEntry for a node, otherwise you would need to search for it in the PriorityQueue, which is too slow. Also Wikipedia says that we could use a Fibonacci heap, which is optimal for decreasing the key (aka weight), but such heaps are very complicated to implement and memory intensive.
It turns out that you can entirely avoid the ‘decrease key’ operation if you do a visited.contains check after polling from the queue: instead of updating an entry you simply add a new one and skip outdated entries when they are polled. This makes your heap bigger but you can avoid the costly update operation and use simpler data structures. Read the full paper “Priority Queues and Dijkstra’s Algorithm”.
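A minimal sketch of this variant could look as follows. The class LazyDijkstra, its Edge type and the integer node ids are again assumptions for illustration. To keep it short it returns only the distance and not the path.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical class for illustration only.
class LazyDijkstra {
    static final class Edge {
        final int to;
        final double weight;
        Edge(int to, double weight) { this.to = to; this.weight = weight; }
    }

    static final class DistEntry {
        final int node;
        final double distance;
        DistEntry(int node, double distance) { this.node = node; this.distance = distance; }
    }

    private final Map<Integer, List<Edge>> graph = new HashMap<>();

    void addEdge(int from, int to, double weight) {
        graph.computeIfAbsent(from, k -> new ArrayList<>()).add(new Edge(to, weight));
    }

    /** Returns the shortest distance, or positive infinity if toNode is unreachable. */
    double shortestDistance(int fromNode, int toNode) {
        Set<Integer> visited = new HashSet<>();
        PriorityQueue<DistEntry> heap =
            new PriorityQueue<DistEntry>((a, b) -> Double.compare(a.distance, b.distance));
        heap.add(new DistEntry(fromNode, 0));
        while (!heap.isEmpty()) {
            DistEntry distEntry = heap.poll();
            // The same node may be queued several times. The entry with the
            // smallest distance is polled first, all later ones are outdated.
            if (visited.contains(distEntry.node)) continue;
            visited.add(distEntry.node);
            if (distEntry.node == toNode) return distEntry.distance;
            for (Edge e : graph.getOrDefault(distEntry.node, Collections.emptyList())) {
                if (visited.contains(e.to)) continue;
                // no map lookup and no decrease key: just add another entry
                heap.add(new DistEntry(e.to, distEntry.distance + e.weight));
            }
        }
        return Double.POSITIVE_INFINITY;
    }
}
```

In this form the heap can hold up to one entry per edge instead of one per node, which is the price for dropping the Map and the update operation.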
What else can we improve?
Now we can tune some data structures:
- Make sure that you are traversing your graph at full speed. E.g. using just the graph in-memory without any dependency on persistent storage could massively improve performance. Also if you use node indices (pointing to an array) instead of node objects you can reduce memory consumption and e.g. use a BitSet instead of a set for the visited collection.
- In case your heap is relatively big (>1000 entries), like for multi-dimensional graphs and even for planar graphs, a cached version with 2 or more stages could give you a 30% boost. If you prefer more complicated but more efficient solutions you could implement the probably faster sequence heap and others.
- If you have a limited range of weights/keys you can try a TreeMap<Key, Set<Node>> which could speed up your code by roughly 10% if you heavily use the decreaseKey method.
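The TreeMap idea from the last bullet can be sketched roughly like this. The class TreeMapQueue and its method names are hypothetical, and integer keys and node ids are assumed. With only a few distinct weights the tree stays small, and decreaseKey just moves the node from one set to another.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical class for illustration only.
class TreeMapQueue {
    // weight -> nodes currently queued with exactly that weight
    private final TreeMap<Integer, Set<Integer>> buckets = new TreeMap<>();
    // node -> its current weight, needed to find its bucket in decreaseKey
    private final Map<Integer, Integer> weightOf = new HashMap<>();

    boolean isEmpty() {
        return buckets.isEmpty();
    }

    void insert(int node, int weight) {
        if (weightOf.containsKey(node))
            throw new IllegalArgumentException("node already queued: " + node);
        weightOf.put(node, weight);
        buckets.computeIfAbsent(weight, k -> new LinkedHashSet<>()).add(node);
    }

    /** Moves the node to the bucket of newWeight; larger or equal weights are ignored. */
    void decreaseKey(int node, int newWeight) {
        Integer oldWeight = weightOf.get(node);
        if (oldWeight == null)
            throw new IllegalArgumentException("node not queued: " + node);
        if (newWeight >= oldWeight) return;
        Set<Integer> oldBucket = buckets.get(oldWeight);
        oldBucket.remove(node);
        if (oldBucket.isEmpty()) buckets.remove(oldWeight);
        weightOf.put(node, newWeight);
        buckets.computeIfAbsent(newWeight, k -> new LinkedHashSet<>()).add(node);
    }

    /** Removes and returns a node with the smallest weight. */
    int pollNode() {
        Map.Entry<Integer, Set<Integer>> first = buckets.firstEntry();
        if (first == null) throw new NoSuchElementException("queue is empty");
        Iterator<Integer> it = first.getValue().iterator();
        int node = it.next();
        it.remove();
        if (first.getValue().isEmpty()) buckets.remove(first.getKey());
        weightOf.remove(node);
        return node;
    }
}
```

Whether this beats a binary heap depends on how many distinct keys occur and how often decreaseKey is called, so it is worth measuring on your own graph.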
For road networks and similar graphs you can apply A*, which reduces the number of visited nodes by guessing where the goal is. The path is still optimal if the guess never overestimates the real remaining distance (e.g. use the straight-line distance in road networks, which is never larger than the real distance).
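As a rough sketch of A* under these assumptions: every node has x/y coordinates, and every edge weight is at least the straight-line distance between its two end nodes, so the estimate never overestimates. The class AStarSketch and all its members are hypothetical. It uses node indices and a BitSet for the visited collection, and it skips outdated queue entries instead of decreasing keys.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical class for illustration only.
class AStarSketch {
    static final class Edge {
        final int to;
        final double weight;
        Edge(int to, double weight) { this.to = to; this.weight = weight; }
    }

    static final class Entry {
        final int node;
        final double distance; // real distance from the start
        final double priority; // distance + estimated rest to the goal
        Entry(int node, double distance, double priority) {
            this.node = node; this.distance = distance; this.priority = priority;
        }
    }

    private final double[] xs;
    private final double[] ys;
    private final List<List<Edge>> edges = new ArrayList<>();

    /** Node i is located at (xs[i], ys[i]). */
    AStarSketch(double[] xs, double[] ys) {
        this.xs = xs;
        this.ys = ys;
        for (int i = 0; i < xs.length; i++) edges.add(new ArrayList<>());
    }

    /** Assumption: weight is at least the straight-line distance from -> to. */
    void addEdge(int from, int to, double weight) {
        edges.get(from).add(new Edge(to, weight));
    }

    /** The guess: straight-line distance to the goal. */
    private double estimate(int node, int goal) {
        return Math.hypot(xs[node] - xs[goal], ys[node] - ys[goal]);
    }

    /** Returns the shortest distance, or positive infinity if goal is unreachable. */
    double shortestDistance(int from, int goal) {
        BitSet visited = new BitSet(xs.length);
        PriorityQueue<Entry> heap =
            new PriorityQueue<Entry>((a, b) -> Double.compare(a.priority, b.priority));
        heap.add(new Entry(from, 0, estimate(from, goal)));
        while (!heap.isEmpty()) {
            Entry cur = heap.poll();
            if (visited.get(cur.node)) continue; // outdated duplicate
            visited.set(cur.node);
            if (cur.node == goal) return cur.distance;
            for (Edge e : edges.get(cur.node)) {
                if (visited.get(e.to)) continue;
                double dist = cur.distance + e.weight;
                // nodes that seem to lead towards the goal are polled earlier
                heap.add(new Entry(e.to, dist, dist + estimate(e.to, goal)));
            }
        }
        return Double.POSITIVE_INFINITY;
    }
}
```

The only difference to plain Dijkstra is the ordering of the queue: entries are sorted by distance plus estimate instead of by distance alone.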
Additionally, if you accept slightly less optimal solutions you can apply heuristics like “don’t explore many more nodes if you’re close to the destination”.
If you don’t want less optimal paths and still want it faster you could