Although ElasticSearch has a gateway feature, which basically recovers your index on startup if it is corrupted or similar, it is still wise to create backups in case there are bugs in Lucene or ElasticSearch (this assumes you have set up the fs gateway). The backup script looks as follows; it makes use of the possibility to disable flushing for a short time while the final sync runs:
# TO_FOLDER=/something
# FROM=/your-es-installation
DATE=`date +%Y-%m-%d_%H-%M`
TO=$TO_FOLDER/$DATE/
echo "rsync from $FROM to $TO"
# the first rsync can take a while - do not disable flushing yet
rsync -a $FROM $TO
# now disable flushing and do one manual flush
$SCRIPTS/es-flush-disable.sh true
$SCRIPTS/es-flush.sh
# ... and sync again
rsync -a $FROM $TO
# re-enable flushing
$SCRIPTS/es-flush-disable.sh false
# now remove backups older than 7 days
# (-mindepth 1 makes sure $TO_FOLDER itself is never matched)
rm -rf `find $TO_FOLDER -mindepth 1 -maxdepth 1 -mtime +7` &> /dev/null
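The two helper scripts called above are not shown in this post. As a rough idea of what they might do, here is a minimal sketch and not the actual scripts from the gist. It assumes the `index.translog.disable_flush` index setting and the `_flush` endpoint, which exist in older ElasticSearch releases and may differ or be missing in your version; the function names and the `ES_URL` variable are made up for illustration.

```shell
#!/bin/bash
# Hypothetical stand-ins for es-flush-disable.sh and es-flush.sh.
# ES_URL is an assumed variable, not something the original scripts define.
ES_URL=${ES_URL:-localhost:9200}

# Build the settings body; $1 must be "true" or "false".
flush_settings_body() {
  printf '{"index":{"translog.disable_flush":%s}}' "$1"
}

# Disable (true) or re-enable (false) automatic flushing on all indices.
set_flush_disabled() {
  curl -s -XPUT "$ES_URL/_settings" -d "$(flush_settings_body "$1")"
}

# Trigger one manual flush on all indices.
manual_flush() {
  curl -s -XPOST "$ES_URL/_flush"
}
```

With these functions, `set_flush_disabled true; manual_flush` would correspond to the two `$SCRIPTS/...` calls before the second rsync, and `set_flush_disabled false` to the call after it.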
For example, you could call the backup script regularly (even hourly) from cron and it will create a new backup each time. By the way, if you want to take a look at the settings of all indices (e.g. to check whether flushing is still disabled), this might be handy:
curl -XGET 'localhost:9200/_settings?pretty=true'
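If you would rather not read the settings output by eye, a small filter can do the check. This is only a sketch: the function name is hypothetical, and it assumes the response contains a key ending in `disable_flush` with its value on the same line, which depends on your ElasticSearch version and on using `pretty=true`.

```shell
#!/bin/bash
# Hypothetical helper: reads settings JSON on stdin and succeeds (exit 0)
# if some index still has flushing disabled.
flush_disabled_in() {
  grep -Eq 'disable_flush"[[:space:]]*:[[:space:]]*"?true'
}
```

You could then run something like `curl -s 'localhost:9200/_settings?pretty=true' | flush_disabled_in && echo "flushing is still disabled"` after a backup, to catch the case where the script died before re-enabling flushing.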
Here are the complete scripts as a gist, which I’m using for my Jetslide project.
Great post, but I am using the local gateway 🙂
Yes, but what are you doing if ES cannot start from the last snapshot? E.g. because of a bug or because the fs got corrupted?