<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Karussell</title>
	<atom:link href="http://karussell.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://karussell.wordpress.com</link>
	<description>Thoughts about Java and more</description>
	<lastBuildDate>Wed, 19 Jun 2013 12:56:20 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='karussell.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Karussell</title>
		<link>http://karussell.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://karussell.wordpress.com/osd.xml" title="Karussell" />
	<atom:link rel='hub' href='http://karussell.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Make Your Dijkstra Faster</title>
		<link>http://karussell.wordpress.com/2012/12/03/make-your-dijkstra-faster/</link>
		<comments>http://karussell.wordpress.com/2012/12/03/make-your-dijkstra-faster/#comments</comments>
		<pubDate>Mon, 03 Dec 2012 20:37:25 +0000</pubDate>
		<dc:creator>karussell</dc:creator>
				<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://karussell.wordpress.com/?p=3839</guid>
		<description><![CDATA[Today I stumbled over yet another minor trick which could speed up the execution of the Dijkstra algorithm. Let me briefly introduce this shortest path algorithm: If you need the path (and not only the shortest path tree) you will give the method an additional toNode parameter and compare this to distEntry.node to break the &#8230; <span class="more-link"><a href="http://karussell.wordpress.com/2012/12/03/make-your-dijkstra-faster/">Continue reading &#187;</a></span>]]></description>
				<content:encoded><![CDATA[<p>Today I stumbled over yet another minor trick which could speed up the execution of the Dijkstra algorithm. Let me briefly introduce this shortest path algorithm:</p>
<p><a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/routing/DijkstraSimple.java" rel="attachment wp-att-3843"><img class="alignnone size-full wp-image-3843" alt="dijkstra-orig" src="http://karussell.files.wordpress.com/2012/12/dijkstra-orig1.png?w=551&#038;h=629" width="551" height="629" /></a></p>
<p>If you need the path (and not only the shortest path tree), pass the method an additional <em>toNode</em> parameter and compare it to <em>distEntry.node</em> to break the loop. Once the target is found, you extract the path by following the <em>distEntry.parent</em> references back from the last entry.</p>
<p><strong>So, what should we improve?</strong></p>
<p>Regarding performance, I&#8217;ve already included a Map to directly get the DistanceEntry for a node; otherwise you would need to search for it in the PriorityQueue, which is too slow. Also, <a href="http://en.wikipedia.org/wiki/Dijkstra%27s_algorithm">Wikipedia says</a> that we could use a Fibonacci heap, which is optimal for decreasing the key (aka weight), but it is very complicated to implement and memory intensive.</p>
<p>It turned out that you can entirely avoid the &#8216;decrease key&#8217; operation if you do a <a href="https://github.com/graphhopper/graphhopper/blob/6a406dea81ba9902c545fd272f534dbb232c2653/src/main/java/com/graphhopper/routing/DijkstraSimple.java#L86"><em>visited.contains</em> check</a> after polling from the queue. This makes your heap bigger but you can avoid the costly update operation and use simpler data structures. Read the full paper <a href="http://www.cs.utexas.edu/~shaikat/papers/TR-07-54.pdf">&#8220;Priority Queues and Dijkstra’s Algorithm&#8221;</a>.</p>
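<p>The following is a minimal sketch of that idea, not the linked GraphHopper code: the class <em>LazyDijkstra</em>, the adjacency-list layout and the sample graph are assumptions made only for illustration. Instead of updating an entry in the queue, a node is simply added again with its smaller distance, and outdated entries are skipped by the <em>visited</em> check after polling.</p>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch; class and method names are not GraphHopper's API.
final class LazyDijkstra {

    // graph.get(u) holds {neighbor, weight} pairs; weights must be non-negative.
    static int[] shortestDistances(List<List<int[]>> graph, int from) {
        int[] dist = new int[graph.size()];
        Arrays.fill(dist, Integer.MAX_VALUE);
        dist[from] = 0;
        BitSet visited = new BitSet(graph.size());
        // Entries are {node, distance}. A node may sit in the queue several times.
        PriorityQueue<int[]> queue = new PriorityQueue<>((a, b) -> Integer.compare(a[1], b[1]));
        queue.add(new int[] {from, 0});
        while (!queue.isEmpty()) {
            int node = queue.poll()[0];
            if (visited.get(node)) {
                continue; // stale duplicate with an outdated, larger distance
            }
            visited.set(node);
            for (int[] edge : graph.get(node)) {
                int candidate = dist[node] + edge[1];
                if (candidate < dist[edge[0]]) {
                    dist[edge[0]] = candidate;
                    queue.add(new int[] {edge[0], candidate}); // re-insert, no decrease-key
                }
            }
        }
        return dist;
    }

    // Tiny made-up graph: 0->1 (4), 0->2 (1), 2->1 (2), 1->3 (5).
    static List<List<int[]>> sampleGraph() {
        List<List<int[]>> graph = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            graph.add(new ArrayList<>());
        }
        graph.get(0).add(new int[] {1, 4});
        graph.get(0).add(new int[] {2, 1});
        graph.get(2).add(new int[] {1, 2});
        graph.get(1).add(new int[] {3, 5});
        return graph;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shortestDistances(sampleGraph(), 0)));
    }
}
```

<p>In the sample graph node 1 is queued twice (first with distance 4, then with 3); the second poll of node 1 is discarded by the <em>visited</em> check. The price is a larger heap, as described above.</p>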
<p><strong>What else can we improve?</strong></p>
<p>Now we can tune some data structures:</p>
<ol>
<li>Make sure that you are traversing your graph at full speed. E.g. using just the in-memory graph without any dependency on persistent storage could massively improve performance. Also, if you use node indices (pointing into an array) instead of node objects you can reduce memory consumption and e.g. use a BitSet instead of a set for the <em>visited</em> collection.</li>
<li>In case your heap is relatively big (&gt;1000 entries), like for multi-dimensional graphs and even for plane graphs, then <a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/coll/MyDijkstraHeap.java">a cached version</a> with 2 or more stages could give you a 30% boost. If you like more complicated and efficient solutions you could implement the probably faster <a href="http://algo2.iti.kit.edu/sanders/papers/falenex.ps.gz">sequence heap</a> and <a href="http://www.cb.uu.se/~cris/Documents/j.patcog.2010.04.002_preprint.pdf">others</a>.</li>
<li>If you have a limited range of weights/keys you can try a <a href="http://stackoverflow.com/a/12903290/194609"><em>TreeMap&lt;Key, Set&lt;Node&gt;&gt; </em></a>which could speed up your code by roughly 10% if you heavily use the decreaseKey method.</li>
</ol>
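<p>To illustrate the third point, here is a rough sketch of a TreeMap based queue. The class <em>BucketQueue</em> and its method names are made up for this example, and integer weights and node ids are assumed; it is not the code from the linked answer. Nodes with the same weight share one bucket, so a decreaseKey is just a move between two buckets.</p>

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical sketch of a priority queue backed by TreeMap<weight, nodes>.
final class BucketQueue {
    private final TreeMap<Integer, Set<Integer>> buckets = new TreeMap<>();

    void insert(int node, int weight) {
        buckets.computeIfAbsent(weight, w -> new HashSet<>()).add(node);
    }

    // The caller has to remember the old weight, e.g. in a distance array.
    void decreaseKey(int node, int oldWeight, int newWeight) {
        Set<Integer> bucket = buckets.get(oldWeight);
        if (bucket != null && bucket.remove(node) && bucket.isEmpty()) {
            buckets.remove(oldWeight); // drop empty buckets so firstEntry() stays valid
        }
        insert(node, newWeight);
    }

    // Returns a node with the smallest weight, or -1 if the queue is empty.
    int pollMin() {
        Map.Entry<Integer, Set<Integer>> first = buckets.firstEntry();
        if (first == null) {
            return -1;
        }
        Iterator<Integer> it = first.getValue().iterator();
        int node = it.next();
        it.remove();
        if (first.getValue().isEmpty()) {
            buckets.remove(first.getKey());
        }
        return node;
    }

    boolean isEmpty() {
        return buckets.isEmpty();
    }

    public static void main(String[] args) {
        BucketQueue queue = new BucketQueue();
        queue.insert(7, 30);
        queue.insert(8, 20);
        queue.decreaseKey(7, 30, 10);
        System.out.println(queue.pollMin()); // node 7 now has the smallest weight
    }
}
```

<p>With few distinct weights the TreeMap stays small, which is where this layout can pay off; with many distinct weights it degenerates to one bucket per node.</p>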
<p>For road networks and others you can apply <a href="http://karussell.files.wordpress.com/2012/07/astar.gif">A*</a>, which reduces the number of visited nodes by guessing where the goal is &#8211; the path is still optimal as long as the guessed distance never exceeds the real remaining distance (e.g. use the straight-line distance in road networks, which is never larger than the real distance):</p>
<p><strong>PRESS ESC IF YOU GET NERVOUS <img src='http://s1.wp.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </strong></p>
<p><a href="http://en.wikipedia.org/wiki/A*_search_algorithm"><img class="alignnone size-full" alt="http://karussell.files.wordpress.com/2012/07/astar.gif?w=551" src="http://karussell.files.wordpress.com/2012/07/astar.gif?w=551"   /></a></p>
<p>Additionally, if you accept some less optimal solutions, you can apply heuristics like &#8220;don&#8217;t explore many more nodes if you&#8217;re close to the destination&#8221;.</p>
<p>If you don&#8217;t want less optimal paths and still want it faster you could</p>
<ul>
<li>use a <a href="http://karussell.files.wordpress.com/2012/06/bidijkstra.gif">bidirectional Dijkstra</a></li>
<li>prepare your graph via &#8216;<a href="http://en.wikipedia.org/wiki/Contraction_hierarchies">contraction hierarchies</a>&#8217;, which introduces several shortcuts and extremely reduces the number of visited nodes. It should also work for non-road networks.</li>
</ul>
<br />Filed under: <a href='http://karussell.wordpress.com/category/algorithm/'>Algorithm</a>, <a href='http://karussell.wordpress.com/category/java/'>Java</a>]]></content:encoded>
			<wfw:commentRss>http://karussell.wordpress.com/2012/12/03/make-your-dijkstra-faster/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/206690a26526f07467ecfd6662f8b152?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">karussell</media:title>
		</media:content>

		<media:content url="http://karussell.files.wordpress.com/2012/12/dijkstra-orig1.png" medium="image">
			<media:title type="html">dijkstra-orig</media:title>
		</media:content>

		<media:content url="http://karussell.files.wordpress.com/2012/07/astar.gif" medium="image">
			<media:title type="html">http://karussell.files.wordpress.com/2012/07/astar.gif</media:title>
		</media:content>
	</item>
		<item>
		<title>Running Shortest-Path Algorithms on the German Road Network within a 1.5GB JVM</title>
		<link>http://karussell.wordpress.com/2012/07/16/running-shortest-path-algorithms-on-the-german-road-network-within-a-1-5gb-jvm/</link>
		<comments>http://karussell.wordpress.com/2012/07/16/running-shortest-path-algorithms-on-the-german-road-network-within-a-1-5gb-jvm/#comments</comments>
		<pubDate>Mon, 16 Jul 2012 08:50:42 +0000</pubDate>
		<dc:creator>karussell</dc:creator>
				<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[GraphHopper]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://karussell.wordpress.com/?p=3736</guid>
		<description><![CDATA[Update: With changes introduced in January 2013 you only need 1GB &#8211; live demo! In one of my last blog posts I wrote about memory efficient ways of coding Java. My conclusion was not a bright one for Java: &#8220;This time the simplicity of C++ easily beats Java, because in Java you need to operate &#8230; <span class="more-link"><a href="http://karussell.wordpress.com/2012/07/16/running-shortest-path-algorithms-on-the-german-road-network-within-a-1-5gb-jvm/">Continue reading &#187;</a></span>]]></description>
				<content:encoded><![CDATA[<p><strong>Update: With changes introduced in January 2013 you only need 1GB &#8211; <a href="http://graphhopper.com/maps/?from=kiel&amp;to=freiburg%20breisgau"> live demo</a>!</strong></p>
<p>In one of my last blog posts I wrote about <a href="http://karussell.wordpress.com/2012/04/18/memory-efficient-java-mission-impossible/">memory efficient ways of coding Java</a>. My conclusion was not a bright one for Java: <em>&#8220;This time the simplicity of C++ easily beats Java, because in Java you need to operate on bits and bytes&#8221;. </em>But I am still convinced that in nearly every other area Java is a good choice. Someone just needs to implement the dirty parts of memory efficient data structures and provide a nice API for the rest of the world. That&#8217;s what I had in mind with</p>
<h3><strong><a href="http://graphhopper.com">GraphHopper</a></strong></h3>
<p>GraphHopper does not attack memory efficient data structures like <a href="http://trove4j.sourceforge.net/">Trove4j</a> etc. Instead it&#8217;ll focus on spatial indices, routing algorithms and other &#8220;geo-graph&#8221; experiments. A road network can already be stored and you can execute Dijkstra, bidirectional Dijkstra, A* etc on it.</p>
<p>Months ago I took the opportunity and tried to import the full road network of Germany via OSM. It failed. I couldn&#8217;t get it working in the first days due to the massive RAM usage of HashMaps for around 100 million data points (only 33 million are associated with roads though). Even using only trove4j brought no success. When using Neo4J the import worked, but it was very slow, and when executing algorithms the memory consumption was too high when too many nodes were requested (long paths).</p>
<p>Then, after some days, I created a <a href="https://github.com/graphhopper/graphhopper/blob/c4079cd49f626d1d7de3435446458fa21ac18420/src/main/java/com/graphhopper/storage/MMapDataAccess.java">memory mapped graph implementation</a> in <a href="https://github.com/graphhopper/graphhopper">GraphHopper</a> and it worked too. But the implementation is a bit tricky to understand, not thread safe (not even for two reading threads <a href="http://stackoverflow.com/q/11153583/194609">yet</a>) and slower than a pure in-memory solution. Even more important: the speed was not very predictable and it was very ugly to debug when off-heap memory got scarce.</p>
<p>I&#8217;ve now created a <a href="https://github.com/graphhopper/graphhopper/blob/c4079cd49f626d1d7de3435446458fa21ac18420/src/main/java/com/graphhopper/storage/RAMDataAccess.java">&#8216;safe&#8217; in-memory graph</a>, which saves the data after import and reads once before it starts. At the moment this is read-thread-safe only, as full thread safety would be too slow and is not necessary (yet).</p>
<p>Now performance-wise on this big network, well &#8230; I won&#8217;t talk about the speed of a normal Dijkstra yet; give me some more time to improve the speed-up techniques. For a smaller network you can see below that even with this simplistic approach (no edge-contraction or edge-reduction at all) the query time is under 150ms, and I guess it will be under 100ms for <em>bidirectional</em> A* (w/o approximation!).</p>
<p>In order to perform realistic route queries on a road network we would like to satisfy two use cases:</p>
<ol>
<li>Searching addresses (cities, streets etc)</li>
<li>Clicking directly on the map to create a route query</li>
</ol>
<p>The first one is simple to solve, although it is very unlikely that we can avoid tons of additional RAM. We can solve it very easily with <a href="http://www.elasticsearch.org/">ElasticSearch</a> or Lucene: just associate the cities, streets etc with the node ids of the graph.</p>
<p>The second use case requires more thinking because we want it memory efficient. A normal quad-tree is not a good choice as it requires too many references. Even for a few million data points it requires several dozen MB in <em>addition</em> to the graph! E.g. 80MB for only 4 million nodes.</p>
<p>The solution is to use a <a href="http://wiki.openstreetmap.org/wiki/QuadTiles">raster</a> over the area &#8211; which can be a simple array addressed by <a href="http://karussell.wordpress.com/2012/05/23/spatial-keys-memory-efficient-geohashes/">spatial keys</a>. And per quadrant (aka tile) we store one array index of the graph as entry point. (In fact this is a quad-tree of depth one!) Then, when a click on the map happens, we calculate the spatial key of this point (A), get the entry point from the array and traverse the graph to find the graph node closest to point A. <a href="https://github.com/graphhopper/graphhopper/blob/c4079cd49f626d1d7de3435446458fa21ac18420/src/main/java/com/graphhopper/storage/index/Location2IDQuadtree.java">Here</a> is an implementation, where only one problem remains (which is solved in the new index).</p>
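<p>A rough sketch of such a raster lookup could look as follows. It is not the linked Location2IDQuadtree: the class <em>RasterIndex</em>, the bit interleaving order, the squared distance in degrees and the greedy walk over neighbors are simplifying assumptions for illustration only. The greedy walk in particular can stop at a node that is only locally the closest one.</p>

```java
import java.util.Arrays;

// Hypothetical sketch: one entry node per tile, tiles addressed by a spatial key.
final class RasterIndex {
    private final double minLat, maxLat, minLon, maxLon;
    private final int bitsPerAxis;
    private final int[] entryNode; // graph node id per tile, -1 if the tile is empty

    RasterIndex(double minLat, double maxLat, double minLon, double maxLon, int bitsPerAxis) {
        this.minLat = minLat;
        this.maxLat = maxLat;
        this.minLon = minLon;
        this.maxLon = maxLon;
        this.bitsPerAxis = bitsPerAxis;
        this.entryNode = new int[1 << (2 * bitsPerAxis)];
        Arrays.fill(entryNode, -1);
    }

    // Interleaves column and row bits (longitude bit first), similar to a binary geohash.
    int spatialKey(double lat, double lon) {
        int cells = 1 << bitsPerAxis;
        int row = clamp((int) ((lat - minLat) / (maxLat - minLat) * cells), cells - 1);
        int col = clamp((int) ((lon - minLon) / (maxLon - minLon) * cells), cells - 1);
        int key = 0;
        for (int bit = bitsPerAxis - 1; bit >= 0; bit--) {
            key = (key << 1) | ((col >> bit) & 1);
            key = (key << 1) | ((row >> bit) & 1);
        }
        return key;
    }

    private static int clamp(int value, int max) {
        return Math.max(0, Math.min(max, value));
    }

    // The first node registered in a tile becomes its entry point.
    void put(int node, double lat, double lon) {
        int key = spatialKey(lat, lon);
        if (entryNode[key] < 0) {
            entryNode[key] = node;
        }
    }

    // Starts at the tile's entry node and walks to a closer neighbor until none is closer.
    int findClosest(double lat, double lon, double[] lats, double[] lons, int[][] neighbors) {
        int current = entryNode[spatialKey(lat, lon)];
        if (current < 0) {
            return -1; // empty tile; a real index would look into surrounding tiles
        }
        double best = squaredDistance(lat, lon, lats[current], lons[current]);
        while (true) {
            int next = current;
            for (int neighbor : neighbors[current]) {
                double d = squaredDistance(lat, lon, lats[neighbor], lons[neighbor]);
                if (d < best) {
                    best = d;
                    next = neighbor;
                }
            }
            if (next == current) {
                return current;
            }
            current = next;
        }
    }

    private static double squaredDistance(double lat1, double lon1, double lat2, double lon2) {
        double dLat = lat1 - lat2;
        double dLon = lon1 - lon2;
        return dLat * dLat + dLon * dLon;
    }

    public static void main(String[] args) {
        RasterIndex index = new RasterIndex(0, 4, 0, 4, 1); // 2 x 2 tiles
        double[] lats = {1.0, 1.5, 3.0};
        double[] lons = {1.0, 1.8, 3.0};
        int[][] neighbors = {{1}, {0, 2}, {1}};
        for (int node = 0; node < lats.length; node++) {
            index.put(node, lats[node], lons[node]);
        }
        System.out.println(index.findClosest(1.4, 1.9, lats, lons, neighbors));
    }
}
```

<p>The index itself is only one int per tile, so its size depends on the raster resolution and not on the number of graph nodes.</p>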
<p><strong>Unfair Comparison</strong></p>
<p>In the last days, just for fun, I took Neo4J and ran my bidirectional Dijkstra on a small data set &#8211; <a href="http://download.geofabrik.de/europe/germany/bayern/unterfranken.html">unterfranken</a> (1 million nodes). GraphHopper is around 8 times faster and uses about a fifth of the RAM:</p>
<p><a href="http://karussell.files.wordpress.com/2012/07/graphhopper-neo4j-perf-comparison.png"><img title="graphhopper-neo4j-perf-comparison" alt="" src="http://karussell.files.wordpress.com/2012/07/graphhopper-neo4j-perf-comparison.png?w=425&#038;h=336" width="425" height="336" /></a></p>
<p>The lower the better &#8211; the chart shows the mean time in seconds per run on this road network for two of the algorithms (<a href="https://github.com/graphhopper/graphhopper/blob/41c17729af490f7a722982284ef22dabd39c1427/src/main/java/com/graphhopper/routing/DijkstraBidirectionRef.java">BiDijkstraRef</a>, <a href="https://github.com/graphhopper/graphhopper/blob/41c17729af490f7a722982284ef22dabd39c1427/src/main/java/com/graphhopper/routing/DijkstraBidirection.java">BiDijkstra</a>). The number in brackets is the memory actually used by the JVM in GB. The lowest possible memory for GraphHopper was around 160MB, but only for the more memory friendly version (BiDijkstra).</p>
<p><strong>For all Neo4J-Bashers</strong>: this is <strong>not</strong> a fair comparison as GraphHopper is highly specialized and will only be usable for 2D networks (roads, public transport) and it also does not have transaction support etc., as pointed out by Michael:</p>
<blockquote class='twitter-tweet'><p>@<a href="https://twitter.com/timetabling">timetabling</a> gcr is a better cache from neo4j enterprise, and concurr. sync is not optional in a full ACID db, but your impl is a good test&mdash; <br />Michael Hunger (@mesirii) <a href='http://twitter.com/#!/mesirii/status/220830527480016897' data-datetime='2012-07-05T10:44:35+00:00'>July 05, 2012</a></p></blockquote>
<p>But we can learn that sometimes it is really worth the effort to create a specialized solution. The right tools for the right job.</p>
<h3><strong>Conclusion</strong></h3>
<p>Although it is not easy to create memory efficient solutions in Java, with <a href="https://github.com/graphhopper/graphhopper">GraphHopper</a> it is possible to import (2.5GB) and use (1.5GB) a road network of the size of Germany on a normal sized machine. This makes it possible to process even large road networks on one machine and e.g. lets you run algorithms on even a single, small Amazon instance. If you reduce memory usage of your (routing) application you are also very likely to avoid <a href="http://java.dzone.com/articles/how-tame-java-gc-pauses">garbage collection tuning</a>.</p>
<p>There is still a lot of room to optimize memory usage and especially speed, because there is a lot of research about road networks! <strong>You&#8217;re invited to fork &amp; contribute!</strong></p>
<br />Filed under: <a href='http://karussell.wordpress.com/category/algorithm/'>Algorithm</a>, <a href='http://karussell.wordpress.com/category/graphhopper/'>GraphHopper</a>, <a href='http://karussell.wordpress.com/category/java/'>Java</a>]]></content:encoded>
			<wfw:commentRss>http://karussell.wordpress.com/2012/07/16/running-shortest-path-algorithms-on-the-german-road-network-within-a-1-5gb-jvm/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/206690a26526f07467ecfd6662f8b152?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">karussell</media:title>
		</media:content>

		<media:content url="http://karussell.files.wordpress.com/2012/07/graphhopper-neo4j-perf-comparison.png" medium="image">
			<media:title type="html">graphhopper-neo4j-perf-comparison</media:title>
		</media:content>
	</item>
		<item>
		<title>Failed Experiment: Memory Efficient Spatial Hashtable</title>
		<link>http://karussell.wordpress.com/2012/06/17/failed-experiment-memory-efficient-spatial-hashtable/</link>
		<comments>http://karussell.wordpress.com/2012/06/17/failed-experiment-memory-efficient-spatial-hashtable/#comments</comments>
		<pubDate>Sun, 17 Jun 2012 22:06:52 +0000</pubDate>
		<dc:creator>karussell</dc:creator>
				<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Graph]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://karussell.wordpress.com/?p=3590</guid>
		<description><![CDATA[The Background of my Idea The idea is to use a hash table for points (aka HashMap in Java) and try to implement neighbor searches. First of all you&#8217;ll need to understand what a spatial key is. Here you can read the details, but in short it is a binary Geohash where you avoid the memory &#8230; <span class="more-link"><a href="http://karussell.wordpress.com/2012/06/17/failed-experiment-memory-efficient-spatial-hashtable/">Continue reading &#187;</a></span>]]></description>
				<content:encoded><![CDATA[<h3>The Background of my Idea</h3>
<p>The idea is to use a hash table for points (aka HashMap in Java) and try to implement neighbor searches. First of all you&#8217;ll need to understand what a spatial key is. Here you can <a href="http://karussell.wordpress.com/2012/05/23/spatial-keys-memory-efficient-geohashes/">read the details</a>, but in short it is a binary <a href="http://en.wikipedia.org/wiki/Geohash">Geohash</a> where you avoid the memory inefficient base 32 representation.</p>
<p>Now that we have the spatial key, you can think about an array which is used to map from indices (like spatial keys) to values. This is the simplest representation of a spatial index as we don&#8217;t need to store the keys at all. <strong>But</strong> it is only memory efficient if we had no empty entries, which is very unlikely for clustered, real-world GIS data.</p>
<p>If we tried to solve this with a normal hash table we would encounter two problems:</p>
<ul>
<li>It is very unlikely that points in the same area come into the same hash bucket &#8211; making neighborhood searches slow i.e. O(n)</li>
<li>It would be necessary to store the entire point &#8211; not only the associated value. Otherwise it would be impossible in case of a hash-collision to detect which point belongs to which value.</li>
</ul>
<p>My idea is to use parts of the spatial key for the hashcode and avoid storing the entire key. It is implemented in Java (open source and <a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/geohash/SpatialHashtable.java">available at GitHub</a>).</p>
<p><a href="http://karussell.files.wordpress.com/2012/05/spatial-key-hashtable2.png"><img title="spatial-key-hashtable" alt="" src="http://karussell.files.wordpress.com/2012/05/spatial-key-hashtable2.png?w=500&#038;h=218" width="500" height="218" /></a></p>
<p>As you can see we&#8217;re still using an array of buckets and a &#8220;somehow&#8221; converted spatial key to get</p>
<h3>The Bucket Index</h3>
<p>Let me explain the necessary bucket index in more detail (see picture above on the right).</p>
<p>We skip the leading bits of every spatial key, as they are identical for an area a lot smaller than the world boundaries, like Germany.</p>
<p>If we use the first part of the spatial key &#8211; in the picture identified as <em>x</em>, then the array is small enough. But with some <a href="http://maps.google.com/maps?hl=en&amp;q=unterfranken"> real world data</a> (also available as <a href="http://download.geofabrik.de/osm/europe/germany/bayern/unterfranken.osm.bz2">osm</a>) this is not sufficient. Too many overflows would happen; some buckets would have several thousand entries! If we move the used part a bit to the right this gets a lot better, e.g. for 4 entries per bucket we have an <a href="http://en.wikipedia.org/wiki/Root-mean-square_deviation">RMS error</a> of about 2.</p>
<p>We now have a form of</p>
<h3>A Hashtable &amp; Quadtree Mixture</h3>
<p>We can tune whether our data structure behaves like a quad tree or a hash table. When moving the bits taken from the spatial key to<strong> the left</strong> we get <strong>quad tree</strong>-like characteristics. Taking the bits more from <strong>the right</strong> we get <strong>hash table</strong>-like characteristics.</p>
<p>This would be fine if we had massive data. But we need to make this approach practical also for e.g. only 2 million data points. In that case the part of the spatial key used as index is only 19 bits long: if we assume 4 entries per bucket we come to approx. 2 million entries (4 * 2^19 = 4 * 524 288). So the bucket index alone is too short. The solution to this problem is to combine the left and the right part of the spatial key with a bit operation:</p>
<p><strong>bucketIndex = x ^ y</strong></p>
<h3>Further reduce memory consumption</h3>
<p>Besides the fact that we now have some kind of a pointer-less or linear quad tree we can further reduce the memory footprint. We store only the required part (e.g. all bits except <em>y</em>) and not the full spatial key. For this it is necessary that our bit operation (or, more generically, the &#8220;hashing scheme&#8221;) is reversible, i.e. we can regain the full spatial key from only the bucket index and the stored part of the key. And in our case the <a href="http://en.wikipedia.org/wiki/Bitwise_operation#XOR"><em>x XOR y</em></a> is reversible. In fact <strong>this memory reduction can be applied to any hashing procedure</strong> which fulfills this &#8216;reversible&#8217; requirement.</p>
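<p>The reversibility can be shown with a few lines of Java. This is only a sketch under assumptions: the class <em>XorBucket</em>, a total key length of 40 bits, and <em>x</em> and <em>y</em> being the top and bottom 19 bits of the key are chosen for illustration (the real implementation skips some leading bits first). Because (x ^ y) ^ x == y, the dropped part <em>y</em> can be recomputed from the bucket index and the stored part.</p>

```java
// Hypothetical sketch of the reversible x ^ y bucket index.
final class XorBucket {
    static final int KEY_BITS = 40;   // assumed total length of the spatial key
    static final int INDEX_BITS = 19; // length of x, y and the bucket index
    static final long MASK = (1L << INDEX_BITS) - 1;

    // bucketIndex = x ^ y, with x the top and y the bottom INDEX_BITS of the key.
    static int bucketIndex(long key) {
        long x = (key >>> (KEY_BITS - INDEX_BITS)) & MASK;
        long y = key & MASK;
        return (int) (x ^ y);
    }

    // Everything except y is stored in the bucket: KEY_BITS - INDEX_BITS = 21 bits.
    static long storedPart(long key) {
        return key >>> INDEX_BITS;
    }

    // Regains the full key: x is read from the stored part, y = bucketIndex ^ x.
    static long restoreKey(int bucketIndex, long stored) {
        long x = (stored >>> (KEY_BITS - 2 * INDEX_BITS)) & MASK;
        long y = (bucketIndex ^ x) & MASK;
        return (stored << INDEX_BITS) | y;
    }

    public static void main(String[] args) {
        long key = 0x123456789AL; // arbitrary value that fits into 40 bits
        long restored = restoreKey(bucketIndex(key), storedPart(key));
        System.out.println(restored == key);
    }
}
```

<p>With these assumed sizes every entry stores 21 instead of 40 bits of the key; the remaining 19 bits are implied by the position in the bucket array.</p>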
<h3>Speed of Neighbor Queries is Bad</h3>
<p>Neighborhood searches are very slow, slower than I expected. The naive approach resulted in 60 seconds for a 10km search &#8211; 30 times slower than it would take to process all 2 million entries. After tuning the overflow schema we are now a bit under 2 seconds. That is still 10 times slower than a normal quad tree and as slow as processing all entries. The reason why this storage is only good for get and put operations is that the same bucket needs to be parsed several times, as the same bucket index is the home for several different locations &#8211; yeah, exactly as intended.</p>
<p><strong>The good news is:</strong></p>
<p><strong>1.</strong> When I moved the bucket-index-window a lot to the left, searches got faster and faster, but creating the storage took dozens of seconds due to the heavy number of overflows, even for my small data set (2 million). This could be improved a bit by applying different overflow strategies, e.g. not using a linear overflow but skipping every two buckets, or <a href="http://en.wikipedia.org/wiki/Hash_table#Collision_resolution">others</a>.</p>
<p><strong>2.</strong> Even in this state this idea can be used as a memory efficient spatial <em>key-value</em> storage without the neighbor search. E.g. you already have a graph of roads but you need an entrance like a HashMap&lt;Point,NodeId&gt; for it, then our data structure is an efficient hash table. Also doing a simple rectangular neighbor search should be fast: requesting only the 8 surrounding bounding boxes. Then no tree traversal is necessary and every box can be done with just a loop through the bucket array.</p>
<p><strong>3.</strong> Another possibility is to use a small quadtree as an entry (mapping spatial keys to ids) for a 2D-graph and then traverse this graph to find neighbors. This is the way I&#8217;ve finally chosen, as I already needed a road network for Dijkstra. So I only need an additional 10MB for the small quadtree index, see a possible next blog entry.</p>
<p><strong>4.</strong> I&#8217;m not alone &#8211; you can take my idea and try implementing more efficient neighbor searches yourself <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  !</p>
<h3>Conclusion</h3>
<p>In this post I&#8217;ve explained how to create a spatial hash table which is optimized in memory usage. This is achieved combining two ideas: using a hash table-alike data structure still &#8216;somehow suited&#8217; for neighborhood searches and reducing the amount of memory while storing only parts of the hashkey. This second idea could be applied on every kind of hash table but only if the hashkey creation is reversible.</p>
<p>The ideas are implemented in Java for the <a href="http://graphhopper.com">GraphHopper project</a> - see the <a href="https://github.com/graphhopper/graphhopper/tree/master/src/main/java/com/graphhopper/geohash">geohash package</a>. Sadly the performance for neighbor searches is really bad, which led me to a different solution (see point 3 of the good news).</p>
<h3>Outlook</h3>
<p>In the literature similar data structures are called <a href="http://www.acm.org/conferences/sac/sac2000/Proceed/FinalPapers/DB-27/node3.html">linear</a> or pointer-less quad trees. After this experiment I came to the conclusion that the best way to implement a <em>memory efficient</em> spatial storage which is also able to perform <em>fast</em> neighbor queries could be a <a href="http://en.wikipedia.org/wiki/Radix_tree">prefix quad tree</a>: still using pointers, but storing two bits in every branch node and avoiding those bits in the leaf nodes. Work on this is currently ongoing in <a href="https://github.com/spatial4j/spatial4j">Spatial4J</a> &amp; <a href="https://github.com/apache/lucene-solr/blob/trunk/lucene/spatial/src/java/org/apache/lucene/spatial/prefix/RecursivePrefixTreeFilter.java">Lucene 4.0</a> &#8211; actually without the use of spatial keys.</p>
<br />Filed under: <a href='http://karussell.wordpress.com/category/algorithm/'>Algorithm</a>, <a href='http://karussell.wordpress.com/category/graph/'>Graph</a>, <a href='http://karussell.wordpress.com/category/java/'>Java</a>]]></content:encoded>
			<wfw:commentRss>http://karussell.wordpress.com/2012/06/17/failed-experiment-memory-efficient-spatial-hashtable/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/206690a26526f07467ecfd6662f8b152?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">karussell</media:title>
		</media:content>

		<media:content url="http://karussell.files.wordpress.com/2012/05/spatial-key-hashtable2.png" medium="image">
			<media:title type="html">spatial-key-hashtable</media:title>
		</media:content>
	</item>
		<item>
		<title>How I found the Googloson and why it has negative Energy</title>
		<link>http://karussell.wordpress.com/2012/06/17/how-i-found-the-googloson-and-why-it-has-negative-energy/</link>
		<comments>http://karussell.wordpress.com/2012/06/17/how-i-found-the-googloson-and-why-it-has-negative-energy/#comments</comments>
		<pubDate>Sun, 17 Jun 2012 19:24:41 +0000</pubDate>
		<dc:creator>karussell</dc:creator>
				<category><![CDATA[nonsense]]></category>

		<guid isPermaLink="false">http://karussell.wordpress.com/?p=3716</guid>
		<description><![CDATA[I have finally found the Googloson with my self-made Small Googtron Collidor (SGC) &#8211; I even had the time to make a photo of it to show you that it is not a fake (see below). Everyone will say I&#8217;ve gimped that, but no, I didn&#8217;t. Really. I have developed a theoretical proof of the existence of &#8230; <span class="more-link"><a href="http://karussell.wordpress.com/2012/06/17/how-i-found-the-googloson-and-why-it-has-negative-energy/">Continue reading &#187;</a></span>]]></description>
				<content:encoded><![CDATA[<p>I have finally found the Googloson with my self-made Small Googtron Collidor (SGC) &#8211; I even had the time to make a photo of it to show you that it is not a fake (see below).</p>
<p>Everyone will say I&#8217;ve gimped that, but no, I didn&#8217;t. Really. I have developed a theoretical proof of the existence of the Googloson. Let me show you &#8230;</p>
<p>There was a lot of hype around Google+ in the last months, so regarding success we can safely assume that:</p>
<p style="text-align:center;">Google+ &lt;&lt; Google</p>
<p>We can only satisfy the equation when we introduce the variable <strong>g</strong>:</p>
<p style="text-align:center;">Google + <strong>g</strong> &lt;&lt; Google</p>
<p style="text-align:center;"><strong>g</strong> &lt;&lt; 0</p>
<p>Without any mistake we can set 0 to 0<a href="http://www.google.com/search?hl=en&amp;q=eV">eV</a>, which is of course equivalent to success. We have obviously found something with <a href="http://en.wikipedia.org/wiki/Antimatter">negative energy</a>. In fact, my latest experiments show how big <strong>g</strong> is and of what kind! It is not in the range of MeV; it is in the range of <a href="http://en.wikipedia.org/wiki/Googolplex">googolplex</a> eV!</p>
<p><a href="http://karussell.files.wordpress.com/2012/06/googloson.jpg"><img class="alignnone size-medium wp-image-3717" src="http://karussell.files.wordpress.com/2012/06/googloson.jpg?w=294&#038;h=300" alt="" width="294" height="300" /></a></p>
<p>And because the number of cups on my SGC is 8 (an integer) we have found a <a href="http://en.wikipedia.org/wiki/Boson">boson</a> &#8211; a so-called Googloson.</p>
<pre>qed</pre>
<p>Now it is up to the reader to give an interpretation of why <strong>g</strong> is negative. In my future research I&#8217;ll try to use <strong>g</strong> to identify the <a href="http://en.wikipedia.org/wiki/Higgs_boson">Higgs boson</a> or even Marshmallows!</p>
]]></content:encoded>
			<wfw:commentRss>http://karussell.wordpress.com/2012/06/17/how-i-found-the-googloson-and-why-it-has-negative-energy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/206690a26526f07467ecfd6662f8b152?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">karussell</media:title>
		</media:content>

		<media:content url="http://karussell.files.wordpress.com/2012/06/googloson.jpg?w=294" medium="image" />
	</item>
		<item>
		<title>Tricks to Speed up Neighbor Searches of Quadtrees. #geo #spatial #java</title>
		<link>http://karussell.wordpress.com/2012/05/29/tricks-to-speed-up-neighbor-searches-of-quadtrees-geo-spatial-java/</link>
		<comments>http://karussell.wordpress.com/2012/05/29/tricks-to-speed-up-neighbor-searches-of-quadtrees-geo-spatial-java/#comments</comments>
		<pubDate>Tue, 29 May 2012 09:55:44 +0000</pubDate>
		<dc:creator>karussell</dc:creator>
				<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[GraphHopper]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://karussell.wordpress.com/?p=3668</guid>
		<description><![CDATA[In Java land there are at least two quadtree implementations which are not yet optimal, so I thought I&#8217;d post some possibilities to tune them. Some of those possibilities are already implemented in my GraphHopper project. Quadtree What is a quadtree? Wikipedia says: &#8220;A quadtree is a tree data structure in which each internal node &#8230; <span class="more-link"><a href="http://karussell.wordpress.com/2012/05/29/tricks-to-speed-up-neighbor-searches-of-quadtrees-geo-spatial-java/">Continue reading &#187;</a></span><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=karussell.wordpress.com&#038;blog=2042483&#038;post=3668&#038;subd=karussell&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>In Java land there are at least two quadtree implementations which are not yet optimal, so I thought I&#8217;d post some possibilities to tune them. Some of those possibilities are already implemented in my <a href="http://graphhopper.com">GraphHopper project</a>.</p>
<h3>Quadtree</h3>
<p>What is a <a href="http://en.wikipedia.org/wiki/Quadtree">quadtree</a>? Wikipedia says: <em>&#8220;A <strong>quadtree</strong> is a <a title="Tree data structure" href="http://en.wikipedia.org/wiki/Tree_data_structure">tree data structure</a> in which each internal node has exactly four children.</em>&#8221; And then you need some leaf nodes to actually store some data &#8211; e.g. points and associated values.</p>
<p>A quadtree is often used for fast neighbor searches of spatial data like points or lines. A quadtree with points could work as follows: fill up a leaf node until it&#8217;s full (e.g. 8 entries), then create a branch node with 4 pointers (north-west, north-east, south-west, south-east) and distribute the leaf node entries among them depending on their location. In this process it may be necessary to create further branch nodes if the entries are too tightly clustered.</p>
<p>Now a simple neighbor search can be implemented recursively. Starting from the root node:</p>
<ul>
<li>If the current node is a leaf node, check whether each of its points is in the search area. If so, add it to the results.</li>
<li>If it is a branch node, check which of the 4 sub-areas intersect with the search area. For every sub-node that intersects, use it as the current node and call this method recursively.</li>
</ul>
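<p>The recursive search described above can be sketched as follows. This is only a minimal illustration and not the GraphHopper implementation: the class QuadTreeSketch, its fields, the leaf capacity of 8, the depth limit and the rectangular search area are all assumptions made for this example.</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal point quadtree; names and limits are assumptions for illustration.
class QuadTreeSketch {
    static final int MAX_LEAF_ENTRIES = 8;
    static final int MAX_DEPTH = 16; // stops endless splitting of heavily clustered points

    final double minLat, maxLat, minLon, maxLon;
    final int depth;
    // Leaf payload as {lat, lon} pairs; null once this node has become a branch.
    List<double[]> points = new ArrayList<>();
    // Children in the order SW, SE, NW, NE; null while this node is a leaf.
    QuadTreeSketch[] children;

    QuadTreeSketch(double minLat, double maxLat, double minLon, double maxLon, int depth) {
        this.minLat = minLat;
        this.maxLat = maxLat;
        this.minLon = minLon;
        this.maxLon = maxLon;
        this.depth = depth;
    }

    void add(double lat, double lon) {
        if (children != null) {
            childFor(lat, lon).add(lat, lon);
            return;
        }
        points.add(new double[]{lat, lon});
        if (points.size() > MAX_LEAF_ENTRIES && depth < MAX_DEPTH)
            split();
    }

    // Turns this leaf into a branch and redistributes its entries.
    private void split() {
        double midLat = (minLat + maxLat) / 2, midLon = (minLon + maxLon) / 2;
        children = new QuadTreeSketch[]{
            new QuadTreeSketch(minLat, midLat, minLon, midLon, depth + 1),
            new QuadTreeSketch(minLat, midLat, midLon, maxLon, depth + 1),
            new QuadTreeSketch(midLat, maxLat, minLon, midLon, depth + 1),
            new QuadTreeSketch(midLat, maxLat, midLon, maxLon, depth + 1)
        };
        List<double[]> old = points;
        points = null;
        for (double[] p : old)
            childFor(p[0], p[1]).add(p[0], p[1]);
    }

    private QuadTreeSketch childFor(double lat, double lon) {
        int idx = (lat >= (minLat + maxLat) / 2 ? 2 : 0) + (lon >= (minLon + maxLon) / 2 ? 1 : 0);
        return children[idx];
    }

    // Collects all points inside the rectangular search area (bounds inclusive).
    void search(double sMinLat, double sMaxLat, double sMinLon, double sMaxLon, List<double[]> result) {
        // Skip this node if its area does not intersect the search area.
        if (sMinLat > maxLat || sMaxLat < minLat || sMinLon > maxLon || sMaxLon < minLon)
            return;
        if (children == null) {
            for (double[] p : points)
                if (p[0] >= sMinLat && p[0] <= sMaxLat && p[1] >= sMinLon && p[1] <= sMaxLon)
                    result.add(p);
            return;
        }
        for (QuadTreeSketch child : children)
            child.search(sMinLat, sMaxLat, sMinLon, sMaxLon, result);
    }
}
```

<p>A rectangle is used as the search area here only to keep the sketch short; a circular search area would replace the two range checks with the intersect and distance tests discussed in the tricks of this post.</p>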
<p><strong>Trick 1 &#8211; Normalize the Distance</strong></p>
<p>Searches are normally done with a point and a radius. To check if the current area of the quadrant intersects with the search area you need to calculate the distance using the <a href="http://en.wikipedia.org/wiki/Haversine_formula">Haversine formula</a>. But you don&#8217;t need to evaluate the full formula every time. E.g. you can skip its last step, which converts the intermediate value <em>a</em> into a distance:</p>
<pre class="brush: java; title: ; notranslate">
R * 2 * Math.asin(Math.sqrt(a));
</pre>
<p>This is OK if you normalize the radius of the search area once up front, so that it can be compared directly with <em>a</em>:</p>
<pre class="brush: java; title: ; notranslate">
double tmp = Math.sin(dist / 2 / R);
return tmp * tmp;
</pre>
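<p>To make the idea concrete, here is a small self-contained sketch. The names NormDistSketch, normDist, dist and normalize as well as the earth radius constant are assumptions for this example; the point is only that comparing the normalized values gives the same answer as comparing real distances, because asin and sqrt are monotonic.</p>

```java
// Hypothetical helper illustrating the normalized Haversine comparison.
class NormDistSketch {
    static final double R = 6371000; // mean earth radius in meters (assumed constant)

    // The Haversine term "a", i.e. the distance before applying R * 2 * asin(sqrt(a)).
    static double normDist(double lat1, double lon1, double lat2, double lon2) {
        double sinDLat = Math.sin(Math.toRadians(lat2 - lat1) / 2);
        double sinDLon = Math.sin(Math.toRadians(lon2 - lon1) / 2);
        return sinDLat * sinDLat
                + Math.cos(Math.toRadians(lat1)) * Math.cos(Math.toRadians(lat2)) * sinDLon * sinDLon;
    }

    // Full Haversine distance in meters, only needed when a real distance is required.
    static double dist(double lat1, double lon1, double lat2, double lon2) {
        return R * 2 * Math.asin(Math.sqrt(normDist(lat1, lon1, lat2, lon2)));
    }

    // Converts a radius in meters into the normalized form, done once per search.
    static double normalize(double dist) {
        double tmp = Math.sin(dist / 2 / R);
        return tmp * tmp;
    }
}
```

<p>Inside the search you would then test normDist(...) &lt;= normalize(radius), with the right-hand side computed only once, instead of dist(...) &lt;= radius. This holds for radii up to half the earth&#8217;s circumference.</p>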
<p><strong>Trick 2 &#8211; Use Smart Intersect Methods</strong></p>
<p>The intersect method should fail fast. E.g. when you again use <a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/util/shapes/Circle.java">a circle as search area</a>, you should calculate its bounding box only once and check the intersection of this box with the quadrant area before applying the heavy intersect calculation with the Haversine formula:</p>
<pre class="brush: java; title: ; notranslate">
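public boolean intersect(BBox b) {
    // Note: the comparisons against normedDist and the last return statement of the
    // top and bottom cases were lost in this listing. They are reconstructed here by
    // symmetry with the middle case; normedDist is an assumed name for the
    // normalized search radius (see Trick 1), bbox is the circle's bounding box.

    // test top intersect
    if (lat &gt; b.maxLat) {
        if (lon &lt; b.minLon)
            return normDist(b.maxLat, b.minLon) &lt;= normedDist;
        if (lon &gt; b.maxLon)
            return normDist(b.maxLat, b.maxLon) &lt;= normedDist;
        return b.maxLat - bbox.minLat &gt; 0;
    }

    // test bottom intersect
    if (lat &lt; b.minLat) {
        if (lon &lt; b.minLon)
            return normDist(b.minLat, b.minLon) &lt;= normedDist;
        if (lon &gt; b.maxLon)
            return normDist(b.minLat, b.maxLon) &lt;= normedDist;
        return bbox.maxLat - b.minLat &gt; 0;
    }

    // test middle intersect
    if (lon &lt; b.minLon)
        return bbox.maxLon - b.minLon &gt; 0;
    if (lon &gt; b.maxLon)
        return b.maxLon - bbox.minLon &gt; 0;
    return true;
}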
</pre>
<p>Also be very sure you define your bounding box properly once and for all. I&#8217;m using: minLon, maxLon followed by minLat, which is south(!), and maxLat. This matches EX_GeographicBoundingBox in the ISO 19115 standard; see <a href="http://osgeo-org.1560.n6.nabble.com/Boundingbox-issue-for-discussion-td3875533.html">this discussion</a>.</p>
<p><strong>Trick 3 &#8211; Use Contains() and not only Intersect()</strong></p>
<p>Now a less obvious trick. You could completely avoid intersect calls for quadrant areas which lie <strong>entirely</strong> in the search area. For this it is necessary to determine quickly whether the search area fully contains the quadrant area or not. E.g. the method for a bounding box containing another <a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/util/shapes/BBox.java">bounding box</a> would be:</p>
<pre class="brush: java; title: ; notranslate">
class BBox {
  public boolean contains(BBox b) {
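    // the two "&lt;=" comparisons were lost in this listing and are restored here
    return maxLat &gt;= b.maxLat &amp;&amp; minLat &lt;= b.minLat &amp;&amp; maxLon &gt;= b.maxLon &amp;&amp; minLon &lt;= b.minLon;
  }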
  ...
}
</pre>
<p>Similar for <a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/util/shapes/BBox.java">BBox</a>.contains(Circle), <a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/util/shapes/Circle.java">Circle</a>.contains(BBox) and Circle.contains(Circle).</p>
<p><strong>Trick 4 &#8211; Sort In Time and not just Adding</strong></p>
<p>Normally you want only the 10 or fewer nearest neighbors and <strong>not all neighbors</strong> within a fixed distance. Especially for search engines like Lucene this should be favoured. For that you should use a priority queue to handle the ordering of the result. But not only for the entries of the leaf nodes &#8211; also when deciding which branch node should be processed next! See the paper <a href="http://www.cs.umd.edu/~hjs/pubs.html">Ranking in spatial databases</a> for more information, where a method for incremental neighbor search is also described. This would make paging through the results very efficient.</p>
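<p>As a rough sketch of the result-ordering part only (not the incremental search from the paper), a bounded max-heap keeps the k best candidates seen so far. The class name KNearestSketch and the use of plain squared Euclidean distances on x/y pairs are simplifying assumptions for this example.</p>

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical sketch: keep only the k nearest candidates while scanning points.
class KNearestSketch {
    // Returns the squared distances of the k closest points to (qx, qy), nearest first.
    static List<Double> kNearest(double[][] points, double qx, double qy, int k) {
        if (k <= 0)
            return new ArrayList<Double>();
        // Max-heap: the worst of the current best k candidates sits on top.
        PriorityQueue<Double> best = new PriorityQueue<Double>(k, Collections.<Double>reverseOrder());
        for (double[] p : points) {
            double dx = p[0] - qx, dy = p[1] - qy;
            double d = dx * dx + dy * dy;
            if (best.size() < k) {
                best.add(d);
            } else if (d < best.peek()) {
                // Closer than the current worst candidate: replace it.
                best.poll();
                best.add(d);
            }
        }
        List<Double> result = new ArrayList<Double>(best);
        Collections.sort(result);
        return result;
    }
}
```

<p>The value on top of the heap is also a useful pruning bound: once k candidates are known, any branch node whose area is farther away than that value cannot contribute and can be skipped.</p>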
<p><strong>Trick 5 &#8211; Use Branch Nodes with more than Four Children</strong></p>
<p>Instead of branch nodes with 4 children you could use a less memory efficient but faster arrangement: use branch nodes with 16 child nodes. Or you could even decide on demand which branch node you should use &#8211; this can lead to partially big branch arrays where the quadtree is complete &#8211; making searching very efficient as then the sub-quadtree is a linear quadtree (see below).</p>
<h3>Tricks for linear QuadTrees</h3>
<p><strong>Trick 6 &#8211; Avoid costly intersect methods</strong></p>
<p>This only applies if your quadtree is a <a href="http://www.acm.org/conferences/sac/sac2000/Proceed/FinalPapers/DB-27/node3.html">linear one</a>, i.e. one where you can access the values by <a href="http://karussell.wordpress.com/2012/05/23/spatial-keys-memory-efficient-geohashes/">spatial keys</a> (aka locational codes). You&#8217;ll need to compute the spatial key of all four corners of your search area bounding box. Then compute the common bit-prefix of these keys and start traversing the quadtree at that prefix instead of at the root. More details are available in the <a href="http://www.cs.umd.edu/%7Ehjs/pubs/bulkload.pdf">paper Speeding Up Construction of Quadtrees for Spatial Indexing</a>, p.12.</p>
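<p>Computing the common bit-prefix can be sketched with a few bit operations. The class and method names are made up for this example, and it assumes that all keys use the same number of bits (totalBits) and fit into a long.</p>

```java
// Hypothetical helper: common bit-prefix of several spatial keys of equal length.
class CommonPrefixSketch {
    // Number of leading bits (out of totalBits) that all keys share.
    static int commonPrefixLength(int totalBits, long... keys) {
        long diff = 0;
        for (long key : keys)
            diff |= key ^ keys[0]; // collect every bit position where a key differs
        if (diff == 0)
            return totalBits;
        // Position of the highest differing bit, counted from the right starting at 0.
        int highestDiffBit = 63 - Long.numberOfLeadingZeros(diff);
        return totalBits - 1 - highestDiffBit;
    }

    // The shared prefix itself, right-aligned.
    static long commonPrefix(int totalBits, long... keys) {
        int len = commonPrefixLength(totalBits, keys);
        return len == 0 ? 0 : keys[0] >>> (totalBits - len);
    }
}
```

<p>For the 6-bit corner keys 110010, 110011, 110110 and 110111 this gives a prefix length of 3 and the prefix 110, so the traversal can start three bits (one and a half levels) below the root.</p>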
<p><strong>Trick 7 &#8211; Bulk processing for linear Quadtrees</strong></p>
<p>If you know that a quadrant is completely contained in the search area you can not only avoid further intersection calls, but also avoid branching entirely and loop from the quadrant&#8217;s bottom-left point to its top-right point. E.g. you are at key=1011 and you know the current quadrant node is contained in the search area. Then you can loop from the index &#8220;1011 000000..&#8221; to &#8220;1011 111111..&#8221;</p>
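<p>The index range for such a loop follows directly from the prefix. A minimal sketch, with hypothetical names and assuming keys of totalBits bits stored in a long and at least one but fewer than 64 free bits:</p>

```java
// Hypothetical helper: first and last key covered by a quadrant prefix.
class KeyRangeSketch {
    // Returns {from, to}: the prefix padded with zeros and with ones respectively.
    static long[] range(long prefix, int prefixBits, int totalBits) {
        int freeBits = totalBits - prefixBits;
        long from = prefix << freeBits;              // e.g. 1011 000000
        long to = from | ((1L << freeBits) - 1);     // e.g. 1011 111111
        return new long[]{from, to};
    }
}
```

<p>With prefix 1011 and 10-bit keys this yields 1011000000 to 1011111111, so all entries of the quadrant can be read with one linear scan over that range.</p>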
<p><strong>Trick 8 &#8211; Store some Bits from the Point in Branch Nodes</strong></p>
<p>For linear quadtrees you have already encoded the point into spatial keys. Now you can use a <a href="http://en.wikipedia.org/wiki/Radix_tree">radix tree</a> to store the data memory efficiently: some bits of the spatial key can be stored directly in the branch nodes. I&#8217;ve also created a <a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/geohash/SpatialHashtable.java">different approach</a> to memory efficient spatial storage, but as it turns out it is not that easy to handle when you need neighbor searches.</p>
<h3>Conclusion</h3>
<p>As you can see, a lot of time can go into tuning a quadtree. But at least the first tricks I mentioned should be used, as they are easy and fast to apply and will make your quadtree <a href="https://issues.apache.org/jira/browse/SIS-45">significantly faster</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://karussell.wordpress.com/2012/05/29/tricks-to-speed-up-neighbor-searches-of-quadtrees-geo-spatial-java/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/206690a26526f07467ecfd6662f8b152?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">karussell</media:title>
		</media:content>
	</item>
		<item>
		<title>Spatial Keys &#8211; Memory Efficient Geohashes</title>
		<link>http://karussell.wordpress.com/2012/05/23/spatial-keys-memory-efficient-geohashes/</link>
		<comments>http://karussell.wordpress.com/2012/05/23/spatial-keys-memory-efficient-geohashes/#comments</comments>
		<pubDate>Wed, 23 May 2012 08:13:45 +0000</pubDate>
		<dc:creator>karussell</dc:creator>
				<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[GraphHopper]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://karussell.wordpress.com/?p=3625</guid>
		<description><![CDATA[When you are operating on geographical data you&#8217;ll use latitude and longitude to specify a location somewhere on earth. To look up some associated information or if you want to do neighborhood searches you could create R-trees, quad-trees or similar spatial data structures to make them efficient. Some people are using Geohashes instead because then &#8230; <span class="more-link"><a href="http://karussell.wordpress.com/2012/05/23/spatial-keys-memory-efficient-geohashes/">Continue reading &#187;</a></span><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=karussell.wordpress.com&#038;blog=2042483&#038;post=3625&#038;subd=karussell&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>When you are operating on geographical data you&#8217;ll use latitude and longitude to specify a location somewhere on earth. To look up some associated information or to do neighborhood searches you could create <a href="http://en.wikipedia.org/wiki/R-tree">R-trees</a>, <a href="http://en.wikipedia.org/wiki/Quadtree">quad-trees</a> or similar spatial data structures to make these operations efficient. Some people are using <a href="http://en.wikipedia.org/wiki/Geohash">Geohashes</a> instead because then the neighborhood searches are relatively easy to implement with simple database queries.</p>
<p>In some cases you&#8217;ll find binary representations of <a href="http://en.wikipedia.org/wiki/Geohash">Geohashes</a>, which I named <strong>spatial keys</strong>. In the literature I found a similar representation named <em>locational code</em> (<a href="http://www.cs.umd.edu/~hjs/pubs/SametVisualComputer89.pdf">Gargantini 1982</a>), and the OpenStreetMap project has <a href="http://wiki.openstreetmap.org/wiki/QuadTiles">QuadTiles</a>. A spatial key works like a Geohash, but the implementation uses a binary representation instead of one with characters:</p>
<p>At the first level (e.g. assume world boundaries) for the latitude we have to decide whether the point is in the northern (1) or southern (0) hemisphere. Then for the longitude we need to know whether it is in the west (0) or in the east (1) of the prime meridian, resulting in <em>11</em> for the image below. Then for the next level we &#8220;step into&#8221; smaller boundaries (lat=0..90°,lon=0..+180°) and we do the same categorization, resulting in a spatial key of 4 bits: <em>11 10</em></p>
<p>The encoding works in Java as follows:</p>
<pre class="brush: java; title: ; notranslate">
long hash = 0;
double minLat = minLatI;
double maxLat = maxLatI;
double minLon = minLonI;
double maxLon = maxLonI;
int i = 0;
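while (true) {
    // Note: the range checks and the midpoint calculations were lost in this
    // listing and are reconstructed here; lat and lon are assumed names for
    // the coordinates of the point to encode.
    if (minLat &lt; maxLat) {
        double midLat = (minLat + maxLat) / 2;
        if (lat &gt; midLat) {
            hash |= 1;
            minLat = midLat;
        } else
            maxLat = midLat;
    }

    hash &lt;&lt;= 1;
    if (minLon &lt; maxLon) {
        double midLon = (minLon + maxLon) / 2;
        if (lon &gt; midLon) {
            hash |= 1;
            minLon = midLon;
        } else
            maxLon = midLon;
    }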

    i++;
    if (i &lt; iterations)
        hash &lt;&lt;= 1;
    else
        break;
}
return hash;
</pre>
<p>When we have calculated 25 levels (50 bits) we are in the same range as float precision. The float precision is approx. 1m=40000km/2^25, assuming that for a lat,lon-float representation we use 3 digits before the decimal point and 5 digits after it. Then a difference of 0.00001 (lat2-lat1) means roughly 1m, which can be easily calculated using the <a href="http://en.wikipedia.org/wiki/Haversine_formula">Haversine formula</a>. So, with spatial keys we can either save around 14 bits per point compared with 2 floats (64 bits), or we are a bit more precise for the same memory usage.</p>
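<p>The numbers in this estimate can be reproduced with a few lines. The class and method names are hypothetical, and the 40000 km circumference is the same rough figure used in the text.</p>

```java
// Hypothetical helper reproducing the precision and memory estimate from the text.
class PrecisionSketch {
    static final double EARTH_CIRCUMFERENCE_M = 40_000_000; // approx. 40000 km

    // Approximate cell width in meters along the equator after the given number of levels.
    static double cellSizeMeters(int levels) {
        return EARTH_CIRCUMFERENCE_M / (1L << levels);
    }

    // Bits saved by a spatial key (2 bits per level) compared with two 32-bit floats.
    static int bitsSaved(int levels) {
        return 2 * Float.SIZE - 2 * levels;
    }
}
```

<p>For 25 levels this gives a cell size of about 1.19 m and 64 - 50 = 14 saved bits, matching the figures above.</p>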
<p><a href="http://karussell.files.wordpress.com/2012/05/spatial-key2.png"><img title="spatial-key" src="http://karussell.files.wordpress.com/2012/05/spatial-key2.png?w=308&#038;h=329" alt="" width="308" height="329" /></a></p>
<p>I chose the order <em>Lat, Lon</em> as it is more common for a spatial point, although it goes against the mathematical convention <em>x,y</em>.</p>
<p>Now that we have defined the spatial key we see that it has the same properties as a Geohash &#8211; e.g. reducing the length (&#8220;removing some bits on the right&#8221;) results in a broader matching region.</p>
<p>Additionally the order of the quadrants could be chosen differently &#8211; instead of the <a href="http://en.wikipedia.org/wiki/Z-order_curve">Z-curve</a> (also known as Morton Code) you could use the <a href="http://en.wikipedia.org/wiki/Hilbert_curve">Hilbert Curve</a>:</p>
<p><a href="http://karussell.files.wordpress.com/2012/05/z-curve2.png"><img title="z-curve" src="http://karussell.files.wordpress.com/2012/05/z-curve2.png?w=500&#038;h=278" alt="" width="500" height="278" /></a></p>
<p>But as you can see, this would make the implementation a bit more complicated and e.g. the orientation of the &#8220;U&#8221; order depends on previous levels -  but the <a href="http://blog.notdot.net/2009/11/Damn-Cool-Algorithms-Spatial-indexing-with-Quadtrees-and-Hilbert-Curves">neighborhood searches would be more efficient</a> &#8211; this is also explained a bit in the <a href="http://www.cs.umd.edu/~hjs/pubs/bulkload.pdf">paper Speeding Up Construction of Quadtrees for Spatial Indexing </a> p.8.</p>
<p>I decided to avoid that optimization. Let me know if you have some working code of <a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/geohash/SpatialKeyAlgo.java">SpatialKeyAlgo</a> for it <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  ! It should be noted that there are other space filling curves like the ones from <a href="http://en.wikipedia.org/wiki/Space-filling_curve">Peano</a> or <a href="http://en.wikipedia.org/wiki/Sierpi%C5%84ski_curve">Sierpinski</a>.</p>
<p><strong>One problem</strong></p>
<p>while implementing this was to get the same point out of decode(encode(point)) again. The encoding/decoding scheme needs to be &#8220;nearly <a href="http://en.wikipedia.org/wiki/Bijection">bijective</a>&#8221; &#8211; i.e. it should avoid rounding problems as well as possible. To illustrate where I got a problem, assume that the point P (e.g. see the image above) is encoded to 1110 &#8211; here we have already lost precision, because a naive decoding places the point at the bottom left of the quadrant 1110. When we decode it back we lose some more precision due to normal <em>double</em> rounding errors. If we then encoded that point again it could end up as <strong>1001</strong> &#8211; the point moved one quadrant to the left and one to the bottom! To avoid that, you need to define the decoded position of the point as the center of the quadrant. I implemented this simply by adding half of the quadrant width to the latitude and longitude. This makes the encoding/decoding stable even if there are minor rounding issues while decoding.</p>
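<p>A minimal sketch of such a stable round trip is shown below. It is not the SpatialKeyAlgo code: the class name, the fixed world bounds and the simplified loops are assumptions for this example. The important part is that decode returns the center of the quadrant.</p>

```java
// Hypothetical encode/decode pair for spatial keys over fixed world bounds.
class SpatialKeySketch {
    // Two bits per level: first the latitude bit, then the longitude bit.
    static long encode(double lat, double lon, int iterations) {
        long hash = 0;
        double minLat = -90, maxLat = 90, minLon = -180, maxLon = 180;
        for (int i = 0; i < iterations; i++) {
            double midLat = (minLat + maxLat) / 2;
            hash <<= 1;
            if (lat > midLat) { hash |= 1; minLat = midLat; } else maxLat = midLat;

            double midLon = (minLon + maxLon) / 2;
            hash <<= 1;
            if (lon > midLon) { hash |= 1; minLon = midLon; } else maxLon = midLon;
        }
        return hash;
    }

    // Returns {lat, lon} of the quadrant center, not of its bottom-left corner.
    static double[] decode(long key, int iterations) {
        double minLat = -90, maxLat = 90, minLon = -180, maxLon = 180;
        for (int bit = 2 * iterations - 1; bit > 0; bit -= 2) {
            double midLat = (minLat + maxLat) / 2;
            if (((key >>> bit) & 1) == 1) minLat = midLat; else maxLat = midLat;

            double midLon = (minLon + maxLon) / 2;
            if (((key >>> (bit - 1)) & 1) == 1) minLon = midLon; else maxLon = midLon;
        }
        return new double[]{(minLat + maxLat) / 2, (minLon + maxLon) / 2};
    }
}
```

<p>Because the decoded point lies strictly inside its quadrant, encoding it again with the same number of iterations reproduces the original key, so encode(decode(key)) is stable.</p>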
<p><strong>A nice property of the spatial key</strong></p>
<p>is that one bounding box e.g. for the starting bits at level 6:</p>
<pre>110011</pre>
<p>goes from the bottom left point</p>
<pre>110011 0000..</pre>
<p>to the top-right point</p>
<pre>110011 1111..</pre>
<p>making it easy to request e.g. the surrounding bounding boxes of a point for every level.</p>
<h3>Conclusion</h3>
<p>As we have seen, the spatial key is just a different representation of a Geohash with the same properties, but it uses a lot less memory. The next time you index Geohashes in your DB use a <em>long</em> value instead of a lengthy <em>string</em>.</p>
<p>In the next post you will learn how we can implement a memory efficient spatial data structure with the help of spatial keys. If you want to look at a normal quadtree implemented with spatial keys you can take a look right now <a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/com/graphhopper/trees/QuadTreeSimple.java">at my GraphHopper project</a>. With this quadtree, neighborhood searches are approx. twice as slow as with one that stores values for latitude and longitude, due to the necessary encoding/decoding. Have a look into the <a href="https://github.com/graphhopper/graphhopper-experiments/blob/master/src/main/java/com/graphhopper/compare/quadtree/App.java">performance comparison project</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://karussell.wordpress.com/2012/05/23/spatial-keys-memory-efficient-geohashes/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/206690a26526f07467ecfd6662f8b152?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">karussell</media:title>
		</media:content>

		<media:content url="http://karussell.files.wordpress.com/2012/05/spatial-key2.png" medium="image">
			<media:title type="html">spatial-key</media:title>
		</media:content>

		<media:content url="http://karussell.files.wordpress.com/2012/05/z-curve2.png" medium="image">
			<media:title type="html">z-curve</media:title>
		</media:content>
	</item>
		<item>
		<title>Memory Efficient Java &#8211; Mission Impossible?</title>
		<link>http://karussell.wordpress.com/2012/04/18/memory-efficient-java-mission-impossible/</link>
		<comments>http://karussell.wordpress.com/2012/04/18/memory-efficient-java-mission-impossible/#comments</comments>
		<pubDate>Wed, 18 Apr 2012 10:14:57 +0000</pubDate>
		<dc:creator>karussell</dc:creator>
				<category><![CDATA[Cpp]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://karussell.wordpress.com/?p=3528</guid>
		<description><![CDATA[First, the normal usage in Java leads to massive memory waste, as is pointed out in several articles, for example in this nice IBM article. Some examples for the 32 bit JVM: Using 3 * 4 bytes for an object with no member variables. Using 4 * 4 bytes for an empty int[] array! Using &#8230; <span class="more-link"><a href="http://karussell.wordpress.com/2012/04/18/memory-efficient-java-mission-impossible/">Continue reading &#187;</a></span><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=karussell.wordpress.com&#038;blog=2042483&#038;post=3528&#038;subd=karussell&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p><strong>First</strong>, the normal usage in Java leads to massive memory waste, as is pointed out in several articles, for example in this <a href="http://www.ibm.com/developerworks/java/library/j-codetoheap/index.html">nice IBM article</a>.</p>
<p><a href="https://github.com/graphhopper/graphhopper/blob/master/src/main/java/de/jetsli/graph/util/Helper.java#L154">Some examples</a> for the 32 bit JVM:</p>
<ol>
<li>Using 3 * 4 bytes for an object with no member variables.</li>
<li>Using 4 * 4 bytes for an empty int[] array!</li>
<li>Using 4 * 4 + 7 * 4 = 44 (!) bytes for an empty String object</li>
</ol>
<p>Of course <a href="http://www.codeinstructions.com/2008/12/java-objects-memory-structure.html">things are a bit</a> more complicated, but in short: wrapper classes and short arrays should be avoided.</p>
<p><strong>Second</strong>, being a bit more efficient is relatively easy &#8211; just use the primitive collections from the Trove project, which avoid wrapper objects. This helps because the standard collection classes are tuned to be CPU efficient, not really memory efficient. Several <a href="http://stuartleneghan.blogspot.com/2012/03/reducing-java-memory-usage-and-garbage.html">JVM tuning parameters</a> could also be used to reduce RAM usage.</p>
<p><strong>Third</strong>, being much more efficient, or as efficient as C++, is very complex and in some cases even impossible.</p>
<h3>Let me first explain why it is in some cases impossible to be as efficient as in C++</h3>
<p>When you allocate an <strong>array of primitives</strong> such as int, long or double in Java then the values are stored directly in the array:</p>
<pre>int1
int2
int3
...</pre>
<p>This is good in terms of memory. Now when you allocate an <strong>array of objects</strong> then only the references are stored:</p>
<pre>ref1  --- &gt; points the object (a position somewhere on the heap)
ref2  --- &gt; points to another object
ref3  --- &gt; etc
...</pre>
<p>Imagine that you are using an array of <em>Point</em> objects with only two members, latitude and longitude (e.g. both floats). Then you would waste the memory necessary for the reference (4 bytes on the 32bit JVM) and the object overhead (12 bytes). That is 16 bytes of waste for 8 bytes of data. Now assume you need 100 million points (OSM data of Germany) =&gt;<strong> you would waste 1.6 GB of RAM</strong>, and that is only if you are otherwise efficient <img src='http://s1.wp.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p>If you think about this a bit you would probably argue that a clever JVM could inline the objects to avoid the overhead and the additional reference. Well, I think this is impossible.</p>
<h4>For the JVM it is nearly impossible to inline objects in arrays</h4>
<p>You have two solvable problems if you want the JVM to inline the objects for you:</p>
<ol>
<li>You need one bit indicating if the entry is null (0) or not (1) &#8211; after initializing then all entries are null. Or one could even call a default constructor per configuration or whatever.</li>
<li>The class needs to be final, as otherwise the lengths of one subclass entry could exceed the reserved maximum length.</li>
</ol>
<p>&#8230; but you have one really hard &#8211; unsolvable (?) problem:</p>
<p><em>     3. You would need to change the move/reference semantics in Java &#8211; a no go in my opinion, not only for the language designers.<br />
</em></p>
<p>But why would you need to change the semantics? Well, <a href="http://stackoverflow.com/a/9387719/194609">imagine you have two such inline arrays</a> and you are doing something &#8216;unimportant&#8217;:</p>
<pre class="brush: java; title: ; notranslate">
// The point p refers to the memory in the array. Okay.
Point p = inlineArr1[100];
p.setX(111f);

// Uh, what to do now? In C++ you could define to copy the point into the array.
// In Java you would copy the reference - not possible for 2 inline arrays ...
inlineArr2[100] = p;
inlineArr2[100].setX(222f);
float result = p.getX();
</pre>
<p>What result would we get in the last line? With normal Java arrays you get 222f. But with the inline approach and copy semantics you get 111f &#8211; which would be against every Java-thinking.</p>
<p>Instead of using Java-unintuitive copy semantics one workaround could be to forbid assignments to such inline arrays. Those read-only inline arrays would be harder to use but<strong> still nice to have IMO <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </strong> !</p>
<h3>Memory efficient Java &#8211; via Primitives</h3>
<p>Now let us investigate how we could be as memory efficient in Java as in C++.</p>
<p>The memory problem above is solvable in Java by using two float arrays. But it makes programming harder. For example how to swap two of those entries? The normal Java way for the Point class is:</p>
<pre class="brush: java; title: ; notranslate">
Point tmp = arr[2]; arr[2]=arr[3]; arr[3]=tmp;
</pre>
<p>But the array wrapper would make it necessary to create a separate swap method which then swaps the two entries for every point:</p>
<pre class="brush: java; title: ; notranslate">
float tmpx = x[2]; x[2]=x[3]; x[3]=tmpx;
float tmpy = y[2]; y[2]=y[3]; y[3]=tmpy;
</pre>
<p>If you retrieved the raw floats and swapped the entries outside of the wrapper, adding more members to the object later (e.g. a point ID) would be error prone. And you&#8217;ll probably get other problems as well.</p>
<p>But we could put an object oriented wrapper class around it which returns Point objects created from the underlying float arrays, which has the disadvantage of being CPU intensive. In the next part we use a similar idea but operate on lower-level arrays, which gives us the possibility to use memory mapped features &#8211; turning our data structure into a simple storage. Read on!</p>
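<p>Such a wrapper around two float arrays could look roughly like this. The class name PointArraySketch and its methods are hypothetical; the sketch only shows that the swap logic lives in one place, so adding a further member later means changing the wrapper and not every caller.</p>

```java
// Hypothetical wrapper storing points in two parallel float arrays.
class PointArraySketch {
    private final float[] lats;
    private final float[] lons;

    PointArraySketch(int size) {
        lats = new float[size];
        lons = new float[size];
    }

    void set(int index, float lat, float lon) {
        lats[index] = lat;
        lons[index] = lon;
    }

    float getLat(int index) {
        return lats[index];
    }

    float getLon(int index) {
        return lons[index];
    }

    // Swaps all members of the two entries; a new member only needs one more line here.
    void swap(int i, int j) {
        float tmpLat = lats[i]; lats[i] = lats[j]; lats[j] = tmpLat;
        float tmpLon = lons[i]; lons[i] = lons[j]; lons[j] = tmpLon;
    }
}
```

<p>The accessors return primitives on purpose: creating a Point object per access would bring back the allocation cost that the two arrays were meant to avoid.</p>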
<h3>Memory efficient Java &#8211; via raw Bytes</h3>
<p>A second solution in Java is to use an object oriented wrapper around one big byte array or a ByteBuffer which then can be even memory mapped:</p>
<pre class="brush: java; title: ; notranslate">
RandomAccessFile file = new RandomAccessFile(fileName, &quot;rw&quot;);
MappedByteBuffer byteArray = file.getChannel().map(FileChannel.MapMode.READ_WRITE, 0, size);
</pre>
<p>The disadvantage is that you will have additional CPU cost for the conversion and a much more complicated procedure to retrieve and store things. E.g. to retrieve a point you&#8217;ll need to write:</p>
<pre class="brush: java; title: ; notranslate">
Point get(int index) {
  int offset = index * bytesPerPoint;
  return new Point(bytesToFloat(byteArray, offset), bytesToFloat(byteArray, offset + 4));
}
</pre>
<p>or to serialize the object into bytes:</p>
<pre class="brush: java; title: ; notranslate">
void set(int index, Point p) {
  int offset = index * bytesPerPoint;
  writeFloat(byteArray, offset, p.getX());
  writeFloat(byteArray, offset + 4, p.getY());
}
</pre>
<p>The big advantage over the method with a normal primitive array is that &#8220;loading&#8221; objects on startup then takes milliseconds and not seconds or minutes, which is quite cool and very uncommon for Java <img src='http://s1.wp.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
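<p>With java.nio.ByteBuffer the manual byte conversion can be left to the buffer itself, because it offers absolute getFloat and putFloat methods. The following is only a sketch under assumptions: the class name PointStoreSketch and the layout of two floats per point are made up for this example, and a heap buffer stands in for the memory mapped one.</p>

```java
import java.nio.ByteBuffer;

// Hypothetical point store on top of a ByteBuffer, 8 bytes (two floats) per point.
class PointStoreSketch {
    static final int BYTES_PER_POINT = 8;
    private final ByteBuffer buffer;

    // Any ByteBuffer works here, e.g. ByteBuffer.allocate(...) or a MappedByteBuffer.
    PointStoreSketch(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    void set(int index, float lat, float lon) {
        int offset = index * BYTES_PER_POINT;
        buffer.putFloat(offset, lat);      // absolute put, does not move the position
        buffer.putFloat(offset + 4, lon);
    }

    float getLat(int index) {
        return buffer.getFloat(index * BYTES_PER_POINT);
    }

    float getLon(int index) {
        return buffer.getFloat(index * BYTES_PER_POINT + 4);
    }
}
```

<p>Since MappedByteBuffer is a subclass of ByteBuffer, the same wrapper could be handed the buffer created via file.getChannel().map(...) above without further changes.</p>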
<h3>Conclusion</h3>
<p>In my opinion all the memory efficient programming methods for Java are very ugly. This time the simplicity of C++ easily beats Java, because in Java you need to operate on bits and bytes &#8211; in C++ you could just cast the bytes to your objects and be memory efficient as well.</p>
<p>But (sadly?) I&#8217;m a Java guy &#8211; and I still prefer faster development cycles over 100% memory efficiency. Read in a few weeks how I&#8217;ve implemented a memory efficient Map of Geohashes or a simple graph in Java for my <a href="http://github.com/graphhopper/graphhopper">GraphHopper</a> project.</p>
<br />Filed under: <a href='http://karussell.wordpress.com/category/cpp/'>Cpp</a>, <a href='http://karussell.wordpress.com/category/java/'>Java</a>  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/karussell.wordpress.com/3528/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/karussell.wordpress.com/3528/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=karussell.wordpress.com&#038;blog=2042483&#038;post=3528&#038;subd=karussell&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://karussell.wordpress.com/2012/04/18/memory-efficient-java-mission-impossible/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/206690a26526f07467ecfd6662f8b152?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">karussell</media:title>
		</media:content>
	</item>
		<item>
		<title>Free Online Graph Theory Books and Resources</title>
		<link>http://karussell.wordpress.com/2012/02/19/free-online-graph-theory-books-and-resources/</link>
		<comments>http://karussell.wordpress.com/2012/02/19/free-online-graph-theory-books-and-resources/#comments</comments>
		<pubDate>Sun, 19 Feb 2012 16:21:04 +0000</pubDate>
		<dc:creator>karussell</dc:creator>
				<category><![CDATA[Algorithm]]></category>
		<category><![CDATA[Cpp]]></category>
		<category><![CDATA[Java]]></category>

		<guid isPermaLink="false">http://karussell.wordpress.com/?p=3487</guid>
		<description><![CDATA[A link compilation of some Hackernews and Stackoverflow posts and a longish personal investigation. The DaMN book and its companion book Graph Theory with Applications, J.A. Bondy and U.S.R. Murty Graph Theory, Reinhard Diestel Graph Theory Tutorials Digraphs: Theory, Algorithms and Applications, 1st Edition Wikipedia &#8211; Graph Algorithms Algorithms and Complexity, Herbert S. Wilf Lecture &#8230; <span class="more-link"><a href="http://karussell.wordpress.com/2012/02/19/free-online-graph-theory-books-and-resources/">Continue reading &#187;</a></span><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=karussell.wordpress.com&#038;blog=2042483&#038;post=3487&#038;subd=karussell&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>A compilation of links from some Hacker News and Stack Overflow posts and a longish personal investigation.</p>
<ul>
<li><a href="http://code.google.com/p/graphbook/">The DaMN book</a> and its companion <a href="http://buzzard.pugetsound.edu/sage-practice/">book</a></li>
<li><a href="http://www.math.jussieu.fr/~jabondy/books/gtwa/gtwa.html">Graph Theory with Applications, J.A. Bondy and U.S.R. Murty</a></li>
<li><a href="http://diestel-graph-theory.com/index.html">Graph Theory, Reinhard Diestel</a></li>
<li><a href="http://www.utm.edu/departments/math/graph/">Graph Theory Tutorials</a></li>
<li><a href="http://www.cs.rhul.ac.uk/books/dbook/">Digraphs: Theory, Algorithms and Applications, 1st Edition</a></li>
<li><a href="http://en.wikipedia.org/wiki/Book:Graph_Algorithms">Wikipedia</a> &#8211; Graph Algorithms</li>
<li><a href="http://www.math.upenn.edu/%7Ewilf/AlgoComp.pdf">Algorithms and Complexity</a>, Herbert S. Wilf</li>
<li>Lecture notes <a title="users.utu.fi/harju/graphtheory/graphtheory.pdf" href="http://delicious.com/redirect?url=http%3A//users.utu.fi/harju/graphtheory/graphtheory.pdf" rel="nofollow" target="_blank">Graph Theory</a> by Tero Harju (Finland)</li>
<li>Lecture notes <a title="math.tut.fi/~ruohonen/GT_English.pdf" href="http://delicious.com/redirect?url=http%3A//math.tut.fi/%7Eruohonen/GT_English.pdf" rel="nofollow" target="_blank">Graph Theory</a> by Keijo Ruohonen (Finland)</li>
<li><a href="http://www.mathcove.net/petersen/lessons/index">Lessons at Math Cove</a></li>
<li><a href="http://cr.yp.to/2005-261/bender2/GT.pdf">Basics of Graphs</a></li>
</ul>
<p>Google books</p>
<ul>
<li><a href="http://books.google.co.in/books?id=Yr2pJA950iAC&amp;dq=graph+theory+DEO&amp;printsec=frontcover&amp;source=bl&amp;ots=9dDlNDUAXQ&amp;sig=3Ac2tS0gWjwfqrWlQA5_omd0Xzk&amp;hl=en&amp;ei=R0Y9S9fxDoTu0wT40MCXBQ&amp;sa=X&amp;oi=book_result&amp;ct=result&amp;resnum=5&amp;ved=0CBYQ6AEwBA#v=onepage&amp;q=&amp;f=false">Graph Theory with Application to Engineering and Computer Science, by Narsingh Deo</a></li>
<li><a href="http://books.google.com/books?id=5l5ps2JkyT0C&amp;printsec=frontcover&amp;dq=a+course+in+combinatorics&amp;source=bl&amp;ots=wSYYY6KPuI&amp;sig=mZLrdj0xo2mTxW4MxYt4tW_E-10&amp;hl=en&amp;ei=PoHHTOaROIHGlQegn8ibAg&amp;sa=X&amp;oi=book_result&amp;ct=result&amp;resnum=2&amp;ved=0CCkQ6AEwAQ#v=onepage&amp;q&amp;f=false">A Course in Combinatorics</a></li>
</ul>
<p>Chapters:</p>
<ul>
<li>Chapter 4. <a href="http://algs4.cs.princeton.edu/40graphs/"><em>Algorithms, 4th Edition</em> by Robert Sedgewick and Kevin Wayne</a> &#8211; a nice explanation with Java examples and exercises</li>
<li>Chapter 5. <a href="http://www.leda-tutorial.org/en/unofficial/ch05.html" rel="nofollow">Graphs and graph algorithms</a></li>
<li><a href="http://www.mat.unb.br/clausahm/area/AnAlg-07.2d/Referencias/HowToThinkAboutAlgorithms-Edmonds.pdf">Graph Search Algorithms</a> in the book &#8220;How to Think About Algorithms&#8221; by Jeff Edmonds</li>
<li><a href="http://www.boost.org/doc/libs/1_48_0/libs/graph/doc/graph_theory_review.html">Boost Docs &#8211; for C++ guys</a></li>
<li>Chapter 7: <a href="http://ww3.algorithmdesign.net/sample/ch07-weights.pdf">Weighted Graphs</a></li>
</ul>
<p>Visualizations</p>
<ul>
<li><a href="http://oopweb.com/Algorithms/Documents/AnimatedAlgorithms/VolumeFrames.html">Dijkstra &amp; Co</a> and some <a href="http://www-b2.is.tokushima-u.ac.jp/~ikeda/suuri/dijkstra/Dijkstra.shtml">more Dijkstra</a></li>
<li><a href="https://github.com/barbeau/AstarVisualizer/wiki">A Star</a></li>
</ul>
<p>Non-free</p>
<ul>
<li><a href="http://www.amazon.co.uk/Introduction-Algorithms-T-Cormen/dp/0262533057">Introduction to Algorithms</a></li>
<li>Graphentheorie by Clark and Holton</li>
<li>Chapters from <a href="http://rads.stackoverflow.com/amzn/click/1848000693" rel="nofollow">The Algorithm Design Manual</a>, by Steven S. Skiena</li>
</ul>
<br />Filed under: <a href='http://karussell.wordpress.com/category/algorithm/'>Algorithm</a>, <a href='http://karussell.wordpress.com/category/cpp/'>Cpp</a>, <a href='http://karussell.wordpress.com/category/java/'>Java</a>  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/karussell.wordpress.com/3487/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/karussell.wordpress.com/3487/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=karussell.wordpress.com&#038;blog=2042483&#038;post=3487&#038;subd=karussell&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://karussell.wordpress.com/2012/02/19/free-online-graph-theory-books-and-resources/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/206690a26526f07467ecfd6662f8b152?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">karussell</media:title>
		</media:content>
	</item>
		<item>
		<title>Rage Against the Android &#8211; Nearly solved! #eclipse #netbeans</title>
		<link>http://karussell.wordpress.com/2012/02/08/rage-against-the-android-nearly-solved-eclipse-netbeans/</link>
		<comments>http://karussell.wordpress.com/2012/02/08/rage-against-the-android-nearly-solved-eclipse-netbeans/#comments</comments>
		<pubDate>Wed, 08 Feb 2012 15:39:43 +0000</pubDate>
		<dc:creator>karussell</dc:creator>
				<category><![CDATA[Android]]></category>
		<category><![CDATA[Eclipse]]></category>
		<category><![CDATA[Java]]></category>
		<category><![CDATA[Maven]]></category>
		<category><![CDATA[NetBeans]]></category>
		<category><![CDATA[TestDrivenTeaching]]></category>
		<category><![CDATA[Tools]]></category>

		<guid isPermaLink="false">http://karussell.wordpress.com/?p=3505</guid>
		<description><![CDATA[After ranting against Android in my previous post I have mainly solved now the untestable situation Android is producing. Thanks to Robolectric all my tests are finally fast and do not unpredictable hang or similar! The tests execute nearly as fast as in a normal Maven project &#8211; plus additional 2 seconds to bootstrap (dex?). Robolectric &#8230; <span class="more-link"><a href="http://karussell.wordpress.com/2012/02/08/rage-against-the-android-nearly-solved-eclipse-netbeans/">Continue reading &#187;</a></span><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=karussell.wordpress.com&#038;blog=2042483&#038;post=3505&#038;subd=karussell&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>After ranting against Android in my <a href="http://karussell.wordpress.com/2012/01/23/rage-against-the-android-eclipse/">previous post</a> I have now mostly solved the untestable situation Android produces.</p>
<p><strong>Thanks to <a href="http://pivotal.github.com/robolectric">Robolectric</a> all my tests are finally fast and no longer hang unpredictably or similar!</strong></p>
<p>The tests execute nearly as fast as in a normal Maven project &#8211; plus an additional 2 seconds to bootstrap (dex?). Robolectric was initially made for Maven integration in IntelliJ, but it also works without Maven and under Eclipse as <a href="http://pivotal.github.com/robolectric/eclipse-quick-start.html">described here</a>. I didn&#8217;t get Robolectric <a href="http://groups.google.com/group/robolectric/browse_thread/thread/ac814076c40d9df1">working via Maven under Eclipse</a> &#8211; and it was overly complex, e.g. you&#8217;ll need an <a href="http://rgladwell.github.com/m2e-android/">Android Maven bridge</a>. But when you use a normal Android project and a separate <strong>Java</strong> project for the tests, it works surprisingly well in Eclipse.</p>
<p>Things you should keep in mind <strong>for Eclipse</strong>:</p>
<ul>
<li><a href="http://pivotal.github.com/robolectric/eclipse-quick-start.html">Eclipse and Robolectric description here</a></li>
<li>In both projects, add a local.properties file containing sdk.dir=/path/to/android-sdk</li>
<li>Android dependencies should come last in the test project&#8217;s build path as well as in the pom.xml for the Maven integration (see below)</li>
</ul>
<p><strong>But now even better</strong>: when you are doing your setup via Maven you can use NetBeans! Without any plugin &#8211; so even in the latest NetBeans version. But how is the setup done for Maven? <a href="http://pivotal.github.com/robolectric/maven-quick-start.html">Read this!</a> After this you only need to open the project with NetBeans (or IntelliJ).</p>
<p>Things you should keep in mind for <strong>Maven and NetBeans</strong>:</p>
<ul>
<li>Use Maven &gt;= 3.0.3</li>
<li>Again: put the android dependencies last!</li>
<li>Specify the SDK path via bashrc, pom.xml or <a href="http://pivotal.github.com/robolectric/maven-quick-start.html">put it into your settings.xml</a>! Sometimes this was not sufficient in NetBeans &#8211; then I used ANDROID_HOME=xy in my actions</li>
<li>Make sure &#8220;compile on save&#8221; is disabled for tests too, as I sometimes got strange exceptions</li>
<li>Deployment to a device can be done via:<br />
mvn -DskipTests=true clean install      # produce apk in target or use -Dandroid.file=myApp.apk in the next step<br />
mvn -Dandroid.device=usb android:deploy # use it<br />
mvn android:run</li>
<li>If you don&#8217;t have the Android plugin for NetBeans, just use &#8216;adb logcat | grep something&#8217; as your UI for the logs</li>
</ul>
<p>That&#8217;s it. Back to hacking and Android TDD!</p>
<br />Filed under: <a href='http://karussell.wordpress.com/category/android/'>Android</a>, <a href='http://karussell.wordpress.com/category/eclipse/'>Eclipse</a>, <a href='http://karussell.wordpress.com/category/java/'>Java</a>, <a href='http://karussell.wordpress.com/category/maven/'>Maven</a>, <a href='http://karussell.wordpress.com/category/netbeans/'>NetBeans</a>, <a href='http://karussell.wordpress.com/category/testdriventeaching/'>TestDrivenTeaching</a>, <a href='http://karussell.wordpress.com/category/tools/'>Tools</a>  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/karussell.wordpress.com/3505/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/karussell.wordpress.com/3505/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=karussell.wordpress.com&#038;blog=2042483&#038;post=3505&#038;subd=karussell&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://karussell.wordpress.com/2012/02/08/rage-against-the-android-nearly-solved-eclipse-netbeans/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/206690a26526f07467ecfd6662f8b152?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">karussell</media:title>
		</media:content>
	</item>
		<item>
		<title>Rage Against the Android #eclipse</title>
		<link>http://karussell.wordpress.com/2012/01/23/rage-against-the-android-eclipse/</link>
		<comments>http://karussell.wordpress.com/2012/01/23/rage-against-the-android-eclipse/#comments</comments>
		<pubDate>Mon, 23 Jan 2012 16:44:45 +0000</pubDate>
		<dc:creator>karussell</dc:creator>
				<category><![CDATA[Android]]></category>
		<category><![CDATA[Eclipse]]></category>

		<guid isPermaLink="false">http://karussell.wordpress.com/?p=3476</guid>
		<description><![CDATA[Developing Android applications on Linux with Eclipse sometimes can get really ugly. Sadly neither NetBeans which has a really nice Android plugin, but cannot execute a single test nor IDEA can rescue me or make me switching but probably they wouldn&#8217;t rescue me due to problems of Android development kit itself &#8211; I&#8217;m not sure. &#8230; <span class="more-link"><a href="http://karussell.wordpress.com/2012/01/23/rage-against-the-android-eclipse/">Continue reading &#187;</a></span><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=karussell.wordpress.com&#038;blog=2042483&#038;post=3476&#038;subd=karussell&#038;ref=&#038;feed=1" width="1" height="1" />]]></description>
				<content:encoded><![CDATA[<p>Developing Android applications on Linux with Eclipse can sometimes get really ugly. Sadly, neither NetBeans (which has a really nice Android plugin but cannot execute a single test) <a href="http://youtrack.jetbrains.net/issue/IDEA-6814">nor IDEA</a> can rescue me or make me switch <img src='http://s0.wp.com/wp-includes/images/smilies/icon_sad.gif' alt=':(' class='wp-smiley' />  &#8211; but they probably wouldn&#8217;t rescue me anyway, due to problems of the Android development kit itself. I&#8217;m not sure.</p>
<p><strong>Update</strong>: <a href="http://karussell.wordpress.com/2012/02/08/rage-against-the-android-nearly-solved-eclipse-netbeans/">Have a look at my solution</a></p>
<p>So, I&#8217;ve collected some of the most common problems I encountered while developing an Android app and how to &#8216;solve&#8217; them:</p>
<ul>
<li><strong>Problem</strong>: Eclipse says &#8216;Your project contains error(s), please fix it before running it.&#8217; and you cannot find a problem.<br />
<strong>Solution</strong>:<br />
1. Open the Problems tab and fix the described errors.<br />
2. Make sure that you have included all necessary jars in your build path.<br />
3. <a href="http://stackoverflow.com/questions/4954316/your-project-contains-errors-please-fix-it-before-running-it">Sometimes</a> even this can help: rm ~/.android/debug.keystore</li>
<li><strong>Problem</strong>: You cannot debug your application.<br />
<strong>Solution</strong>:<br />
1. Check that debuggable = true is set in the application tag of the manifest XML.<br />
2. If that does not help, or if you are getting &#8220;<em>Can&#8217;t bind to local 8601 for debugger</em>&#8221; in the Console tab, then <a href="http://stackoverflow.com/questions/2937532/should-i-worry-about-ddms-console-log-messages-cant-bind-to-local-nnnn-for-deb">read this</a> and make sure you use only the line<br />
127.0.0.1       localhost<br />
in your hosts file. If not, change the file and restart adb (see below).</li>
<li><strong>Problem</strong>: Error &#8220;AdbCommandRejectedException: device not found&#8221;<br />
<strong>Solution</strong>: restart adb (see below)</li>
<li><strong>Problem</strong>: You cannot select a single test case to execute.<br />
<strong>Solution</strong>: Run the whole (Android) JUnit test or a package, and then select one single test via right click to debug it.</li>
</ul>
<p><strong>If nothing seems to help then try one or all of the following steps:</strong><br />
1. restart device<br />
2. restart eclipse<br />
3. restart adb: sudo adb kill-server; sudo adb start-server</p>
<p>Please add your problems and solutions in the comments <img src='http://s1.wp.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<br />Filed under: <a href='http://karussell.wordpress.com/category/android/'>Android</a>, <a href='http://karussell.wordpress.com/category/eclipse/'>Eclipse</a>  <a rel="nofollow" href="http://feeds.wordpress.com/1.0/gocomments/karussell.wordpress.com/3476/"><img alt="" border="0" src="http://feeds.wordpress.com/1.0/comments/karussell.wordpress.com/3476/" /></a> <img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=karussell.wordpress.com&#038;blog=2042483&#038;post=3476&#038;subd=karussell&#038;ref=&#038;feed=1" width="1" height="1" />]]></content:encoded>
			<wfw:commentRss>http://karussell.wordpress.com/2012/01/23/rage-against-the-android-eclipse/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://2.gravatar.com/avatar/206690a26526f07467ecfd6662f8b152?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">karussell</media:title>
		</media:content>
	</item>
	</channel>
</rss>
