21 January 2014 Engineering

Introducing Snapshot & Restore

By Igor Motov

In the last year, we saw a tremendous increase in the adoption of Elasticsearch. As more and more companies use Elasticsearch as an integral part of their business, its high availability becomes increasingly important. With the help of automatic replication and failover, Elasticsearch provides a stable, highly available search and analytics platform. However, while replication can protect a cluster from hardware failures, it doesn’t help when someone accidentally deletes an index. Anyone who relies on an Elasticsearch cluster needs to perform regular backups.

It has always been possible to back up an Elasticsearch cluster. However, until version 1.0 the backup process involved turning off index flushing, identifying the locations of primary shards on the file system, copying the data and then remembering to turn flushing back on. We believe that simple is best, and the previous backup process in Elasticsearch didn’t quite fit the definition of simple. That’s why in v1.0 we are introducing a new Snapshot & Restore API that should make the backup process much easier.

In v1.0, backup is a simple and straightforward process. First, Elasticsearch needs to know where to back up data, which is done by registering a backup repository:

$ curl -XPUT 'http://localhost:9200/_snapshot/my_backup' -d '{
  "type": "fs",
  "settings": {
    "location": "/mount/backups/my_backup",
    "compress": true
  }
}'
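To confirm that the repository was registered, its definition can be read back with a GET on the same path. The sketch below wraps that call in small shell functions. `ES_URL`, `repo_url` and `show_repo` are names assumed here for illustration; only the `/_snapshot/<repository>` path comes from the API. For the `fs` type, the location is expected to be on a shared file system that the cluster's nodes can reach.

```shell
# Base URL of the cluster; ES_URL is an assumed variable name, override as needed.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the REST endpoint for a repository name.
repo_url() {
  printf '%s/_snapshot/%s\n' "$ES_URL" "$1"
}

# Fetch the repository definition; curl prints the JSON response.
show_repo() {
  curl -sS -XGET "$(repo_url "$1")"
}
```

Calling `show_repo my_backup` against a running cluster should return the type and settings that were registered above.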

Currently, we support file system and HDFS repositories. Support for Azure is coming soon. Once Elasticsearch knows about a repository, it’s possible to make a backup of the entire cluster with a single command:

$ curl -XPUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true"
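A snapshot does not have to cover the whole cluster: the same request accepts a body that limits it to selected indices. The following is a minimal sketch rather than a recommended script. The index names passed in, `ES_URL` and the function names are hypothetical; `indices`, `ignore_unavailable` and `include_global_state` are request options of the snapshot API.

```shell
# Base URL of the cluster; ES_URL is an assumed variable name.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build the request body for a snapshot limited to a comma-separated index list.
# include_global_state=false leaves the cluster-wide state out of the snapshot.
snapshot_body() {
  printf '{"indices": "%s", "ignore_unavailable": true, "include_global_state": false}\n' "$1"
}

# Usage: snapshot_indices REPO SNAPSHOT INDICES
snapshot_indices() {
  # The Content-Type header is not needed by 1.0 but is required by later versions.
  curl -sS -XPUT "$ES_URL/_snapshot/$1/$2?wait_for_completion=true" \
    -H 'Content-Type: application/json' \
    -d "$(snapshot_body "$3")"
}
```

For example, `snapshot_indices my_backup snapshot_2 "index_1,index_2"` would snapshot only those two (hypothetical) indices.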

Snapshots can be created on a live cluster that continues to perform indexing and search operations. A snapshot captures a point-in-time view of each index as of the moment the snapshot process started, which keeps the backup image of the index consistent. Restore is even simpler:

$ curl -XPOST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore?wait_for_completion=true"

It’s possible to restore indices within a live cluster as well. However, indices have to be closed prior to restore. We are planning to make it possible to restore open read-only indices in a future release.
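The close-then-restore sequence for an index that still exists in the cluster can be sketched as follows. `ES_URL` and the function names are assumptions made for this example; the `_close` and `_restore` endpoints and the `indices` option belong to the API.

```shell
# Base URL of the cluster; ES_URL is an assumed variable name.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the endpoint that closes an index.
close_url() {
  printf '%s/%s/_close\n' "$ES_URL" "$1"
}

# Build a body that restricts the restore to the given comma-separated indices.
restore_body() {
  printf '{"indices": "%s"}\n' "$1"
}

# Usage: restore_index REPO SNAPSHOT INDEX
restore_index() {
  # -f makes curl exit non-zero on an HTTP error, so the restore
  # is skipped when the close request fails.
  curl -sS -f -XPOST "$(close_url "$3")" &&
  curl -sS -XPOST "$ES_URL/_snapshot/$1/$2/_restore?wait_for_completion=true" \
    -H 'Content-Type: application/json' \
    -d "$(restore_body "$3")"
}
```

If the index was deleted and no longer exists, there is nothing to close, and the restore request can be issued on its own.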

Both backup and restore operations are incremental, which means that only files that changed since the last snapshot are copied into the repository or restored into an index. Incremental snapshots can therefore be taken as frequently as needed without much disk space overhead. Users can now easily create a snapshot before an upgrade or a risky change in the cluster and quickly roll back to the previous index state if things go wrong. The snapshot/restore mechanism can also be used to synchronize data between a “hot” cluster and a remote, “cold” backup cluster in a different geographic region for fast disaster recovery.
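Because each snapshot needs its own name within a repository, frequent or pre-upgrade snapshots are easier to manage with generated names. The sketch below derives a name from a prefix and a UTC timestamp. The naming scheme, `ES_URL` and the function names are assumptions for illustration, not something the API prescribes.

```shell
# Base URL of the cluster; ES_URL is an assumed variable name.
ES_URL="${ES_URL:-http://localhost:9200}"

# Build a snapshot name from a prefix and a timestamp.
# The second argument overrides the timestamp; by default the current UTC time is used.
snapshot_name() {
  printf '%s_%s\n' "$1" "${2:-$(date -u +%Y%m%d_%H%M%S)}"
}

# Usage: snapshot_now REPO PREFIX
snapshot_now() {
  curl -sS -XPUT "$ES_URL/_snapshot/$1/$(snapshot_name "$2")?wait_for_completion=true"
}
```

Running `snapshot_now my_backup pre_upgrade` just before an upgrade would create a snapshot with a name of the form `pre_upgrade_<date>_<time>`, which can later be passed to the restore call if a rollback is needed.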

We are very excited about this new feature. We like to think of incremental backup as a time machine for your data. We are confident that everyone who relies on Elasticsearch as a critical component in their system and cannot afford downtime for re-indexing will find the new Snapshot & Restore mechanism really helpful.

We welcome your feedback – try out the Snapshot & Restore API and let us know what you think!

Update: January 31, 2014: Fixed information on support for Azure.