Back up a cluster’s data

To back up your cluster’s data, you can use the snapshot API.

A snapshot is a backup taken from a running Elasticsearch cluster. You can take snapshots of an entire cluster, including all its data streams and indices. You can also take snapshots of only specific data streams or indices in the cluster.
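As a rough illustration of the two scopes described above, the following sketch builds (but does not send) the HTTP request for a create-snapshot call. The helper name `build_create_snapshot_request` and the repository, snapshot, and index names are hypothetical; the request shape assumes the `PUT /_snapshot/<repository>/<snapshot>` endpoint and its `indices` body parameter.

```python
import json
from urllib.parse import quote


def build_create_snapshot_request(repository, snapshot, indices=None):
    """Return (method, path, body) for a create-snapshot request.

    indices=None leaves the body empty, which snapshots every data
    stream and index. A list restricts the snapshot to those names.
    """
    path = f"/_snapshot/{quote(repository, safe='')}/{quote(snapshot, safe='')}"
    body = {}
    if indices is not None:
        # The API accepts a comma-separated list of names.
        body["indices"] = ",".join(indices)
    return "PUT", path, json.dumps(body)


# Whole cluster (hypothetical names).
full = build_create_snapshot_request("my_repository", "snapshot_1")

# Only two specific indices (hypothetical names).
partial = build_create_snapshot_request(
    "my_repository", "snapshot_2", indices=["index_1", "index_2"]
)
```

The returned tuple can be passed to any HTTP client; nothing here contacts a cluster.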

You must register a snapshot repository before you can create snapshots.

Snapshots can be stored in either local or remote repositories. Remote repositories can reside on Amazon S3, HDFS, Microsoft Azure, Google Cloud Storage, and other platforms supported by a repository plugin.
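Registering a repository might look roughly like the sketch below, which only constructs the request. It assumes the `PUT /_snapshot/<repository>` endpoint with a `type` and a `settings` object; the helper name, repository names, path, and bucket are hypothetical. A shared file system repository (`fs`) also requires its location to be listed under `path.repo` on every node, and the `s3` type assumes the corresponding repository support is installed.

```python
import json


def build_register_repository_request(name, repo_type, settings):
    """Return (method, path, body) for registering a snapshot repository."""
    body = {"type": repo_type, "settings": settings}
    return "PUT", f"/_snapshot/{name}", json.dumps(body)


# Local (shared file system) repository; the location is a placeholder.
local_repo = build_register_repository_request(
    "my_fs_backup", "fs", {"location": "/mount/backups/my_fs_backup"}
)

# Remote repository on Amazon S3; the bucket name is a placeholder.
remote_repo = build_register_repository_request(
    "my_s3_backup", "s3", {"bucket": "my-backup-bucket"}
)
```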

Elasticsearch takes snapshots incrementally: the snapshot process copies only the data that an earlier snapshot has not already copied to the repository, avoiding unnecessary duplication of work or storage space. This means you can safely take snapshots very frequently with minimal overhead. However, snapshots are also logically independent: deleting a snapshot does not affect the integrity of any other snapshot.
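The combination of incremental copying and logical independence can be pictured with a toy model. This is only a sketch of the idea, not Elasticsearch's actual repository layout: the class `ToyRepository` and the file names are invented for illustration. Each snapshot records the full set of files it needs, only files missing from the repository are copied, and deleting a snapshot removes only the files that no other snapshot references.

```python
class ToyRepository:
    """Toy model of an incremental snapshot repository (illustrative only)."""

    def __init__(self):
        self.blobs = set()     # files physically stored in the repository
        self.snapshots = {}    # snapshot name -> set of files it references

    def create(self, name, live_files):
        """Snapshot the given files; return the files actually copied."""
        live = set(live_files)
        copied = live - self.blobs          # skip files already stored
        self.blobs |= copied
        self.snapshots[name] = live         # snapshot references all its files
        return copied

    def delete(self, name):
        """Delete a snapshot; return the files removed from the repository."""
        removed = self.snapshots.pop(name)
        still_needed = set().union(*self.snapshots.values())
        unreferenced = removed - still_needed
        self.blobs -= unreferenced          # keep files other snapshots use
        return unreferenced


repo = ToyRepository()
first_copy = repo.create("snap_1", ["seg_a", "seg_b"])            # copies both files
second_copy = repo.create("snap_2", ["seg_a", "seg_b", "seg_c"])  # copies only seg_c
freed = repo.delete("snap_1")                                     # frees nothing
```

After `snap_1` is deleted, every file that `snap_2` references is still present, which is the sense in which snapshots remain independent despite sharing data.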

If your cluster has Elasticsearch security features enabled, the snapshot API calls you make to back up your data must be authorized.

The snapshot_user role is a reserved role that can be assigned to the user who calls the snapshot endpoint. It is the only role needed if the user only takes periodic snapshots as part of a backup procedure. This role includes the privileges to list all existing snapshots in any repository, and to list and view the settings of all indices, including the .security index. It does not grant privileges to create repositories, restore snapshots, or search within indices. Hence, the user can view and snapshot all indices, but cannot access or modify any data.
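One way to set up such a dedicated backup user is sketched below. It builds the request only and assumes the native-realm user API at `POST /_security/user/<username>`, which takes a `password` and a `roles` list; the helper name, username, and password are placeholders.

```python
import json


def build_backup_user_request(username, password):
    """Return (method, path, body) for creating a user limited to snapshot_user."""
    body = {
        "password": password,
        # Only the reserved snapshot_user role; no repository-creation,
        # restore, or search privileges are granted.
        "roles": ["snapshot_user"],
    }
    return "POST", f"/_security/user/{username}", json.dumps(body)


# Placeholder credentials for illustration only.
request = build_backup_user_request("backup_operator", "a-long-random-password")
```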

For more information, see Security privileges and Built-in roles.