Migrating your Elasticsearch data

You might have switched to Elastic Cloud Enterprise for any number of reasons and you’re likely wondering how to get your existing Elasticsearch data into your new infrastructure. Along with easily creating as many new deployments with Elasticsearch clusters as you need, you have several options for moving your data over. Choose the option that works best for you:

  • Index your data from the original source, which is the simplest method and provides the greatest flexibility for the Elasticsearch version and ingestion method.
  • Reindex from a remote cluster, which rebuilds the index from scratch.
  • Restore from a snapshot, which copies the existing indices.

One of the many advantages of Elastic Cloud Enterprise is that you can spin up a deployment quickly, try out something, and then delete it if you don’t like it. This flexibility provides the freedom to experiment while your existing production cluster continues to work.

Before you begin

Depending on which option you choose, you might have limitations or need to do some preparation beforehand.

Indexing from the source
The new cluster must be the same size as your old one, or larger, to accommodate the data.
Reindex from a remote cluster
You must use the same size or larger cluster, and the same Elasticsearch version. Depending on your security settings for your old cluster, you might need to temporarily allow TCP traffic on port 9243 for this procedure. If needed, you can upgrade separately.
Restore from a snapshot
You must use the same size or larger cluster, and the same Elasticsearch version. If you have not already done so, you will need to set up snapshots for your old cluster. The repository bucket must be in the same region as your new Elasticsearch cluster. If needed, you can upgrade separately.
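
For the reindex and snapshot options, one quick way to confirm that both clusters run the same Elasticsearch version is to query the root endpoint on each cluster and compare the version.number values:

    GET /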

Before you migrate your Elasticsearch data, define your index mappings on the new cluster. Index mappings cannot be migrated during reindex operations.
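
If it helps, here is a minimal sketch of defining mappings when you create the destination index on the new cluster. The index name and fields are placeholders, and the exact syntax depends on your Elasticsearch version: on 7.x and later the mapping is typeless as shown, while on 6.x and earlier the properties are nested under a mapping type.

    PUT INDEX_NAME
    {
      "mappings": {
        "properties": {
          "user_id": { "type": "keyword" },
          "created_at": { "type": "date" }
        }
      }
    }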

Index from the source

If you still have access to the original data source, outside of your old Elasticsearch cluster, you can load the data from there. This might be the simplest option, allowing you to choose the Elasticsearch version and take advantage of the latest features. You can use any ingestion method that you want, whether that’s Logstash, Beats, the Elasticsearch clients, or whatever works best for you.
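
As one illustration, if you re-ingest documents directly from the original source, a bulk request against the new cluster might look like the following sketch. The index name, fields, and values are placeholders, and on Elasticsearch 6.x and earlier the action metadata also needs a document type.

    POST _bulk
    { "index": { "_index": "INDEX_NAME" } }
    { "user_id": "u1", "created_at": "2018-05-29T18:26:48Z" }
    { "index": { "_index": "INDEX_NAME" } }
    { "user_id": "u2", "created_at": "2018-05-30T09:12:05Z" }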

If the original source isn’t available or has other issues that make it non-viable, there are still two more migration options: reindexing the data from a remote cluster or restoring it from a snapshot.

Reindex from a remote cluster

Through the Elasticsearch reindex API, available in version 5.x and later, you can connect your new Elastic Cloud Enterprise deployment remotely to your old Elasticsearch cluster. This pulls the data from your old cluster and indexes it into your new one. Reindexing essentially rebuilds the index from scratch, so it can be more resource-intensive to run.

  1. Log into the Cloud UI.
  2. Select a deployment or create one.
  3. From the API Console or in the Kibana Console app, create the destination index on Elastic Cloud Enterprise.
  4. Copy the index from the remote cluster:

    POST _reindex
    {
      "source": {
        "remote": {
          "host": "https://REMOTE_ELASTICSEARCH_ENDPOINT:PORT",
          "username": "USER",
          "password": "PASSWORD"
        },
        "index": "INDEX_NAME",
        "query": {
          "match_all": {}
        }
      },
      "dest": {
        "index": "INDEX_NAME"
      }
    }
  5. You can also verify the new index:

    GET INDEX_NAME/_search?pretty
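
As an additional check, you can compare document counts between the old and new clusters. Run the same request against each cluster, using your own index name in place of the placeholder, and confirm that the counts match:

    GET INDEX_NAME/_count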

Restore from a snapshot

If you cannot connect to a remote index for whatever reason, such as if it’s in a non-working state, you can try restoring from the most recent working snapshot.

  1. On your old Elasticsearch cluster, choose an option to get the name of your snapshot repository bucket:

    GET /_snapshot
    GET /_snapshot/_all
  2. Get the snapshot name, using the repository name that you found in the previous step:

    GET /_snapshot/REPOSITORY_NAME/_all

    The output for each entry provides a "snapshot" value, which is the snapshot name:

      {
        "snapshots": [
          {
            "snapshot": "scheduled-1527616008-instance-0000000004",
  3. From the Cloud UI of the new Elasticsearch cluster, add the snapshot repository:

    PUT /_snapshot/NEW_REPOSITORY_NAME
    {
      "type": "s3",
      "settings": {
        "base_path": "/snapshots/[CLUSTER_ID]",
        "compress": "true",
        "region": "DEPLOYMENT_REGION",
        "bucket": "[RANDOM_STRING]"
      }
    }
  4. From the Cloud UI of the new Elasticsearch cluster, restore the snapshot:

    POST /_snapshot/NEW_REPOSITORY_NAME/SNAPSHOT_NAME/_restore?pretty
    {
      "indices": "*",
      "ignore_unavailable": true,
      "include_global_state": true
    }
  5. Verify that the index is restored in your Elastic Cloud Enterprise deployment with this query:

    GET INDEX_NAME/_search?pretty
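
If the restore is large and still in progress, you can also monitor recovery on the new deployment, for example with the cat recovery API:

    GET _cat/recovery?v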