Filed under Solr

Detect Stolen and Duplicate Tweets with Solr

A new feature, “duplication detection”, is implemented for jetwick and seems to work pretty well thanks to the great performance of Solr. To try it, go to this tweet and click on the ‘Find Similar’/’Guttenberg’ button below the tweet to investigate existing duplicates. With that feature it is possible for jetwick to skip spam, identify … Continue reading

Use cases of faceted search for Apache Solr

In this post I write about some use cases of facets for Apache Solr. Please submit your own ideas in the comments. This post is split into the following parts: What are facets? How do you enable and use simple facets? What are other use cases? Category … Continue reading

Poor Man’s Monitoring for Solr

For jetwick I’m the developer, PR agent and sadly also the admin ;-). All in one, at once. Here is a minor snippet to get an alert email if your Solr index is either not available or contains too few entries. And get a “resolved” mail if all is fine again. add via this via … Continue reading
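The alert/resolved logic described in this excerpt can be sketched roughly as follows. This is not the snippet from the post, only a minimal illustration. It assumes the check receives the parsed JSON of a Solr select query (where `response.numFound` holds the hit count) or `null` if Solr could not be reached. The function name `checkIndex`, the `minDocs` threshold and the choice to mail only when the state changes are assumptions made for this example.

```javascript
// Hypothetical helper, not taken from the original post.
// solrResponse: parsed JSON of a Solr select query (wt=json),
//               or null if the request failed (index not available).
// minDocs:      assumed minimum number of entries the index should hold.
// wasHealthy:   result of the previous check, so a mail is only
//               produced when the state changes (an assumption).
function checkIndex(solrResponse, minDocs, wasHealthy) {
  const numFound = solrResponse && solrResponse.response
    ? solrResponse.response.numFound
    : null;

  // Healthy means: Solr answered and reported enough entries.
  const healthy = typeof numFound === 'number' && numFound >= minDocs;

  let mail = null;
  if (!healthy && wasHealthy) mail = 'ALERT';     // index just went bad
  if (healthy && !wasHealthy) mail = 'RESOLVED';  // all is fine again

  return { healthy, mail, numFound };
}
```

A cron job or similar scheduler could call such a function periodically, store `healthy` for the next run and hand `mail` to whatever mail command is available.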

jsii – full text search in 1K LOC of JavaScript!

In the previous blog post I tried to introduce node.js and its nice features. Today I will introduce my little search engine prototype called jsii (javascript inverted index). jsii provides an in-memory inverted index within approx. 1000 lines of JavaScript. Some more lines are necessary to set up a server via node.js, so that the … Continue reading
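For readers unfamiliar with the term, an in-memory inverted index maps each term to the set of documents that contain it. The sketch below is not jsii's code. It is a minimal illustration, and the class name `InvertedIndex`, the simple tokenizer and the AND-only query semantics are all assumptions made for this example.

```javascript
// Minimal in-memory inverted index: term -> Set of document ids.
// Illustrative only; names and behaviour are not taken from jsii.
class InvertedIndex {
  constructor() {
    this.postings = new Map();
  }

  // Lowercase the text and split on anything that is not a letter or digit.
  static tokenize(text) {
    return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
  }

  // Register every term of the text under the given document id.
  add(docId, text) {
    for (const term of InvertedIndex.tokenize(text)) {
      if (!this.postings.has(term)) this.postings.set(term, new Set());
      this.postings.get(term).add(docId);
    }
  }

  // Return the ids of documents that contain ALL query terms.
  search(query) {
    const terms = InvertedIndex.tokenize(query);
    if (terms.length === 0) return [];
    let result = null;
    for (const term of terms) {
      const ids = this.postings.get(term) || new Set();
      result = result === null
        ? new Set(ids)
        : new Set([...result].filter((id) => ids.has(id)));
    }
    return [...result];
  }
}

const index = new InvertedIndex();
index.add(1, 'Solr is a search server');
index.add(2, 'node.js runs JavaScript');
index.add(3, 'full text search in JavaScript');
console.log(index.search('javascript search')); // [ 3 ]
```

A real engine adds scoring, stemming and persistence on top of this. The lookup structure itself stays this simple.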

Feeding Solr with its own Logs

I always looked for a simple way to visualize our log data, e.g. from Solr. At that time I had in mind a combination of gnuplot and some shell scripts, but this session from the Lucene Revolution changed my idea. (Look here for all videos from Lucene Revolution.) I thought: “hey, that’s it! Just put the … Continue reading

My Links for Apache Solr 1.4

Here is my Solr/Lucene link list. Last update: Oct 2010. Solr Feature and Get Started Overview Solr FAQ, Solr Articles, Solr Wiki ApacheCon Lucid Works reference guide cheat sheet from a stackoverflow user Analyzers, Tokenizers, Token Filters Auto suggestion: via ngram, the jetwick method as described here or TermsComponent and ShingleFilter One link in the … Continue reading