Feeding Solr with its own Logs

I have always looked for a simple way to visualize our log data, e.g. from Solr. At the time I had a combination of gnuplot and some shell scripts in mind, but this session from Lucene Revolution changed my mind. (Look here for all videos from Lucene Revolution.)

I thought: “Hey, that’s it! Just put the logs into Solr!” So I coded something which simply reads the log files and named it Sogger. It has no sharding and no message queues, but it should work on real systems without any changes to your system (though probably with changes to Sogger).

I hope Sogger doesn’t suck, but it does not come with any warranty, so use it with care! And: It is only a proof of concept – nothing comparable to the guys from loggly.com

To get your logs sogged:

  • Download the ‘Sogger’ code via:
    hg clone http://timefinder.hg.sourceforge.net/hgroot/timefinder/sogger sogger-code
  • Download Solr from trunk:
    svn co -r 1023329 https://svn.apache.org/repos/asf/lucene/dev/trunk solr-code

    Sogger doesn’t necessarily need the trunk version, but I haven’t tested it with other versions yet.

  • compile solr and Sogger with ant
  • cd solr-code/solr/example/
  • copy solrconfig.xml and schema.xml from Sogger into solr/conf
  • copy the *.vm files from Sogger into solr/conf/velocity/
  • start solr
    java -jar start.jar
  • start feeding your logs
    cd sogger-code/
    java -jar dist/Sogger.jar url=http://localhost:8983/solr logFile=data/solr.2010-10-25.log.gz
  • to search your logs do:

Now you should see something like this

Sogger has several advantages over simple “grep-ing” or scripting with your solr logs:

  • full text search. near real time: ~1min 😉
  • performance. I hope committing every minute does not make Solr a lot slower
  • filtering by log level: Quickly find warnings and exceptions
  • filtering by webapp: If you have multiple apps or Solr cores logging into the same file, filtering is really easy with Solr (with grep too, but you’ll have to re-grep the whole log …)
  • open source: you can change the feeding method I used and take care of your special needs. Tell me if you need assistance!
  • new log lines are detected and committed à la tail -f
  • besides text files, Sogger accepts and detects compressed (zip, gzip/gz) files à la zgrep. So you don’t need to change your log handlers or preprocess the files.
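
The zgrep-like detection of compressed files can be sketched as follows. This is not Sogger's code (Sogger is written in Java and its source is not shown here); it is a minimal Python sketch that assumes the file type is recognized by its leading magic bytes and that a zip archive contains a single log file. The name `read_lines` is made up for illustration.

```python
import gzip
import io
import zipfile

GZIP_MAGIC = b"\x1f\x8b"     # first two bytes of every gzip stream
ZIP_MAGIC = b"PK\x03\x04"    # local file header of a non-empty zip archive


def read_lines(path):
    """Yield the lines of a plain, gzip or zip log file without line endings."""
    with open(path, "rb") as fh:
        magic = fh.read(4)

    if magic.startswith(ZIP_MAGIC):
        with zipfile.ZipFile(path) as archive:
            # Assumption: the archive holds exactly one log file.
            with archive.open(archive.namelist()[0]) as member:
                text = io.TextIOWrapper(member, encoding="utf-8", errors="replace")
                for line in text:
                    yield line.rstrip("\r\n")
        return

    # gzip.open and the built-in open accept the same text-mode arguments.
    opener = gzip.open if magic.startswith(GZIP_MAGIC) else open
    with opener(path, "rt", encoding="utf-8", errors="replace") as fh:
        for line in fh:
            yield line.rstrip("\r\n")
```

Checking the content instead of the file extension means a rotated log that was renamed still gets read correctly.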

to do’s:

  • make the log format customizable within a property file:
    line1=regular expression pattern1
    line2=regular expression pattern2
  • read and monitor multiple log files
  • make it a solr plugin via special UpdateHandler?
  • an xy plot (or bar chart) in Velocity for some facets or facet queries would be nice. Something like I had done before with Wicket.
  • I don’t like Velocity … although it is sufficient for this … but should we use Wicket!?
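
The first to-do item (one regular expression per line of a log entry, read from a property file) might look roughly like the sketch below. It is in Python and only an illustration: the keys `line1` and `line2` come from the to-do item, while the patterns, the group names and the assumed two-line entry layout (date, logger and method on the first line, level and message on the second) are guesses. The parser is also simplified and does not implement the full Java properties syntax, e.g. backslash escapes.

```python
import re

# Hypothetical property-file content; patterns and group names are assumptions.
PROPERTIES = r"""
# one pattern per line of a log entry
line1=^(?P<date>\w{3} \d{1,2}, \d{4} \d{1,2}:\d{2}:\d{2} [AP]M) (?P<logger>\S+) (?P<method>\S+)$
line2=^(?P<level>[A-Z]+): (?P<message>.*)$
"""


def load_patterns(text):
    """Parse 'key=regex' lines into a dict of compiled patterns."""
    patterns = {}
    for raw in text.splitlines():
        raw = raw.strip()
        if not raw or raw.startswith("#"):
            continue
        # Split at the first '=' only, so the regex itself may contain '='.
        key, _, value = raw.partition("=")
        patterns[key.strip()] = re.compile(value.strip())
    return patterns


def parse_entry(lines, patterns):
    """Match the n-th line against pattern 'line<n>' and merge the named groups.

    Returns None if a line has no pattern or does not match it.
    """
    fields = {}
    for number, line in enumerate(lines, start=1):
        pattern = patterns.get("line%d" % number)
        match = pattern.match(line) if pattern else None
        if match is None:
            return None
        fields.update(match.groupdict())
    return fields
```

The resulting dict maps directly onto document fields, so a different log format would only need a different property file.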

6 thoughts on “Feeding Solr with its own Logs”

  1. Here’s a trick: when there are multi-line exception prints, the second-to-n lines are often repeated. The usual pattern is:

    timestamp problem id 12345
    Line 15 file bar
    Line 3 file foo
    timestamp id 67890
    Line 15 file bar
    Line 3 file foo

    You can make the 2-n lines a separate facet. This allows you to say “the xyz problem happened 40 times” because they all have the same stacktrace.

  2. Oh yes, I neglected the “multi-line exception prints”. Do you have code which would parse them? (Would you mind contributing it? I can also push it to GitHub if appropriate.)

  3. The ‘parsing’ was just scanning to the end of the line of the log, and then if there are more lines just store the whole character sequence in a string field.

    I did index them also, and the exception names and file names were also useful. But you don’t need any parsing for that, one of the lucene analyzers does all of the work.

  4. Looking forward to working more closely with Sogger; it is perfect for what I need.

    Currently I’m having trouble running it; the error I get:

    Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/solr/client/solrj/request/RequestWriter
    Caused by: java.lang.ClassNotFoundException: org.apache.solr.client.solrj.request.RequestWriter
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
    Could not find the main class: sogger.Feeder. Program will exit.

    I have both the solrj jar and the current directory . in my classpath.
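
The trick from the first and third comments (keep lines 2 to n of a multi-line entry together as one untokenized value, so identical stack traces can be counted like a facet) can be sketched as follows. This is a Python illustration, not code from Sogger or from the commenters. It assumes that every new entry starts with a date such as 2010-10-25, which stands in for the "timestamp" placeholder in the comment; `ENTRY_START` and both function names are made up.

```python
import re
from collections import Counter

# Assumption: a new log entry begins with an ISO-style date.
ENTRY_START = re.compile(r"^\d{4}-\d{2}-\d{2} ")


def group_entries(lines):
    """Yield (first_line, continuation_text) for each log entry."""
    head, rest = None, []
    for line in lines:
        if ENTRY_START.match(line):
            if head is not None:
                yield head, "\n".join(rest)
            head, rest = line, []
        elif head is not None:
            # Lines before the first entry start are ignored.
            rest.append(line)
    if head is not None:
        yield head, "\n".join(rest)


def count_stacktraces(lines):
    """Count how often each distinct block of continuation lines occurs."""
    return Counter(trace for _, trace in group_entries(lines) if trace)
```

In Solr the continuation text would go into a string field, and faceting on that field gives the same counts as the `Counter` here.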
