Here is my Solr/Lucene link list. Last update: Oct 2010
Solr
Feature and Get Started Overview
- Solr FAQ, Solr Articles, Solr Wiki
- ApacheCon
- Lucid Works reference guide
- cheat sheet from a stackoverflow user
- Analyzers, Tokenizers, Token Filters
- Auto suggestion: via ngram, the jetwick method as described here, or TermsComponent and ShingleFilter. One link in the mailing list. Second link.
- Spellchecking
Query
- Solritas – better than xml responses
- Common Query Parameters
- Query Syntax
- Function Query
- DisMaxRequestHandler
- Relevancy FAQ, Relevancy Cookbook
- Debugging Relevance Issues
explain order via &debugQuery=true
and combine with &explainOther=id:juggernaut
or via the pseudo ‘score’ field: &fl=*,score
- Mini cheat sheet
greater than:
solr/select?q=number:[30 TO *]
greater than by date:
solr/select?q=timestamp:[NOW-30DAY TO *]
sort asc or desc (separate multiple sort parameters with a comma):
solr/select?q=*:*&sort=number asc
quick deleteAll from URL:
http://localhost:8983/solr/update?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E
http://localhost:8983/solr/update?stream.body=%3Ccommit/%3E
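The cheat sheet URLs above have to be URL-encoded when they are sent from code rather than typed into a browser. Here is a minimal sketch of how that could look in Python, using only the standard library. The helper names select_url and update_url are made up for illustration, and the base URL assumes the same local Solr instance as the deleteAll links.

```python
from urllib.parse import urlencode

# Assumed base URL of a local Solr instance (same host/port as the deleteAll links).
SOLR = "http://localhost:8983/solr"


def select_url(**params):
    """Build a /select URL; urlencode escapes brackets, spaces, colons and '*'."""
    return SOLR + "/select?" + urlencode(params)


def update_url(xml_body):
    """Build an /update URL that carries the XML command in stream.body."""
    return SOLR + "/update?" + urlencode({"stream.body": xml_body})


# The queries from the mini cheat sheet, built programmatically.
greater_than = select_url(q="number:[30 TO *]")
last_30_days = select_url(q="timestamp:[NOW-30DAY TO *]")
sorted_asc = select_url(q="*:*", sort="number asc")
# Relevance debugging: return the score pseudo field and the explain output.
with_scores = select_url(q="*:*", fl="*,score", debugQuery="true")
# deleteAll needs two requests: the delete itself, then a commit.
delete_all = update_url("<delete><query>*:*</query></delete>")
commit = update_url("<commit/>")

for url in (greater_than, last_30_days, sorted_asc, with_scores, delete_all, commit):
    print(url)
```

The sketch only builds the URLs and sends nothing, so it is safe to run against no server at all. The encoded form differs slightly from the hand-written links (for example '*' and ':' are escaped too), but it decodes to the same parameters.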
Multiple Cores
- multiple indices specific to the web container, e.g. Tomcat
(different logging in Tomcat)
- multicores implemented in Solr with reload, merge (!), …
- lots of cores (solr 4.0 !)
Faceting/Navigators
- Simple Facet Parameters. Keep in mind: with facet.query you can create facet queries, which can then be applied via fq (filter query)
- Multifaceting, Local parameters
- packtpub
- Find out missing facet values: q=-field_name:[* TO *]
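To make the facet.query / fq interplay and the missing-values query more concrete, here is a rough Python sketch that builds the request parameters. The field name price and the range expressions are invented for the example, and facet_params and missing_values_query are hypothetical helper names, not part of any Solr client.

```python
from urllib.parse import urlencode


def facet_params(field, facet_queries, selected=None):
    """Build the query string for a faceted request.

    facet_queries: expressions Solr should count (one facet.query each).
    selected: the facet query the user clicked, re-sent as a filter query (fq).
    """
    # A list of pairs is used because facet.query may occur several times.
    params = [("q", "*:*"), ("facet", "true"), ("facet.field", field)]
    params += [("facet.query", fq) for fq in facet_queries]
    if selected is not None:
        params.append(("fq", selected))
    return urlencode(params)


def missing_values_query(field):
    """Query matching all documents that have no value in the given field."""
    return "-%s:[* TO *]" % field


# Hypothetical price ranges offered as navigators.
ranges = ["price:[0 TO 100]", "price:[100 TO *]"]
first_request = facet_params("price", ranges)
# After the user picks the first range, the same expression becomes the fq.
drill_down = facet_params("price", ranges, selected=ranges[0])

print(first_request)
print(drill_down)
print(missing_values_query("price"))
```

Reusing the exact facet.query string as the fq keeps the count shown in the navigator consistent with the number of results after the click.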
Grouping/Field Collapsing
- Caching -> performance boost: set HashDocSet to 0.005 of all documents!
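As a small worked example of the 0.005 rule of thumb, the following Python sketch computes the HashDocSet maxSize for a given index size and prints the matching solrconfig.xml element. The document count of two million is invented, and the loadFactor of 0.75 is assumed from the stock example configuration.

```python
def hash_doc_set_max_size(num_docs):
    """Return 0.005 of the document count (5 per 1000), but at least 1."""
    # Integer arithmetic avoids floating point rounding surprises.
    return max(1, num_docs * 5 // 1000)


def hash_doc_set_element(num_docs, load_factor="0.75"):
    """Render the solrconfig.xml element; load_factor is an assumed default."""
    return '<HashDocSet maxSize="%d" loadFactor="%s"/>' % (
        hash_doc_set_max_size(num_docs),
        load_factor,
    )


# Hypothetical index with two million documents.
print(hash_doc_set_element(2000000))
```

The value should be recomputed when the index grows considerably, since it is tied to the total number of documents.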
Statistics with the StatsComponent
Updating/Indexing
- Indexing via jdbc, Xml, Json, Csv
- Update via Xml or Csv over http
- packtpub1, packtpub2
Replication for Solr >= 1.4
- See SOLR-561 for more information.
- Scaling article
- Dashboard via solr/admin/replication/index.jsp
- index version via solr/replication?command=details (using ?indexversion instead seems to always return 0?)
- linux script to monitor health of replication
- bugs: SOLR-1781 (and SOLR-978)
Get source via:
- svn co http://svn.apache.org/repos/asf/lucene/dev/trunk/
apply a patch via: patch -p0 < patch-file-name-here
ant clean compile test
Tips and Tricks
- If you have heavy commits (‘realtime updates’), be sure to read this thread about ‘Tuning Solr caches with high commit rates (NRT)’ from Peter Sturge
Lucene
Lucene FAQ
When to prefer Lucene over Solr? Or should I use Hibernate Search?