December 19, 2016 Technical

Migrating to Elastic Cloud

By Dara Gies and Tyler Fontaine

Whether you’re trying to reduce operational costs, maintain uptime or generally simplify things, consider Elastic Cloud, Elastic’s very own hosted Elasticsearch and Kibana offering. Elastic Cloud offers ease of use and provides the latest Elasticsearch and Kibana releases (version 5.1 at the time of writing), with the newest features, enhancements and performance improvements. Also included are X-Pack features, such as security, monitoring, alerting & notification, Graph, and reporting. Elastic Cloud is backed by Elastic’s excellent support team and by the engineers who develop Elasticsearch and Kibana.

Interested in trying out Elastic Cloud? Let us show you how you can migrate your data to our hosted offering, regardless of whether you currently self-host Elasticsearch on your own metal, run Elasticsearch in some other cloud offering, such as EC2, or use one of the many Elasticsearch hosting services that exist, such as the Amazon Elasticsearch Service.

Migration Approaches

If you have existing Elasticsearch indexes and are considering migrating to Elastic Cloud, there are a few approaches to consider. Regardless of the approach, keep in mind that you first have to configure your Elastic Cloud target cluster. Creating a new cluster is very easy, but cluster settings (such as scripts) and index settings (such as custom analyzers and mappings) aren’t copied over automatically and must be configured before you migrate index data.
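Because mappings and index settings are not carried over for you, one way to prepare the target is to read the index definition from the old cluster (the response of `GET /bank`) and turn it into a create-index body for the new one. The sketch below is only an illustration under assumptions: the helper name `create_index_body` is hypothetical, and the list of generated settings to drop is our assumption about what describes the old index rather than your configuration.

```python
import copy
import json

# Settings Elasticsearch generates for each index. They describe the old
# index and should not be copied. (Assumed list; adjust for your version.)
GENERATED_SETTINGS = ("creation_date", "uuid", "version", "provided_name")


def create_index_body(index_name, get_index_response):
    """Turn the response of `GET /<index>` into a body for `PUT /<index>`."""
    definition = copy.deepcopy(get_index_response[index_name])
    index_settings = definition.get("settings", {}).get("index", {})
    for key in GENERATED_SETTINGS:
        index_settings.pop(key, None)
    return {
        "settings": {"index": index_settings},
        "mappings": definition.get("mappings", {}),
    }


if __name__ == "__main__":
    # Hypothetical response from the old cluster for the "bank" index.
    source = {
        "bank": {
            "aliases": {},
            "mappings": {"account": {"properties": {"balance": {"type": "long"}}}},
            "settings": {"index": {
                "number_of_shards": "5",
                "number_of_replicas": "1",
                "uuid": "abc123",
                "creation_date": "1477318346906",
                "version": {"created": "2040199"},
            }},
        }
    }
    print(json.dumps(create_index_body("bank", source), indent=2))
```

The printed body can then be sent as the create-index request on the Elastic Cloud cluster before any data is loaded. Custom analyzers live under the index settings, so they travel with the body as long as the target version still supports them.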

Feed From The Source

If you have a copy of the original data some place other than your Elasticsearch cluster, then it might be simplest to load from there. Create an Elastic Cloud cluster, configure it, and feed away. Ingesting into an Elastic Cloud cluster is no different than feeding a cluster running on your own servers. You can use Logstash, Beats, the Elasticsearch Clients or any other means at your disposal.

If the original source isn’t available, is cumbersome to access, has changed, or is no longer up to date, then two other migration approaches are available: reindexing from remote or snapshotting and restoring.

Reindex From A Remote Cluster

Elasticsearch 5.x can index from a remote cluster through the reindex API. This makes it possible to spin up a cluster in Elastic Cloud and pull data from an existing remote cluster. First create the destination index on the cluster in Elastic Cloud, then use reindex with the `remote` option to pull data from your old cluster and index it into the new one. Because the data is indexed afresh, it takes advantage of the improved data structures, resiliency and indexing performance of Elasticsearch 5.1.

Reindexing effectively rebuilds the index from scratch so it is more expensive to run than _snapshot, which simply copies complete indexes. The tradeoff is that restored indexes can't take advantage of those nice data structures because they are not rebuilt.

Restore Snapshots From S3

It’s also possible to restore Elasticsearch snapshots sitting in S3 buckets. With this approach, snapshot to an S3 bucket from an existing Elasticsearch cluster, spin up an Elasticsearch cluster in Elastic Cloud, and then restore the snapshot from S3 into Elastic Cloud.

Bear in mind that you’re not limited to restoring snapshots from S3 buckets. Snapshots that exist on any addressable external storage can be restored into Elastic Cloud.

How to Reindex from Remote

Reindex from Elasticsearch on EC2 into Elastic Cloud

Log in to Elastic Cloud and create a new cluster that is at least the same size as your Elasticsearch on EC2 cluster, or larger if you anticipate growth. Be sure to enable Kibana so you have access to the console.

From your Elastic Cloud cluster, you can issue the reindex command to index data stored remotely in Elasticsearch on EC2. The reindex command requires access to port 9200 or 9243, so you should update the Security Group associated with the EC2 Elasticsearch cluster and add an inbound custom TCP entry for port 9200 or 9243, allowing access to the Elastic Cloud clusters.

Issue the reindex command:

POST _reindex
{
  "source": {
    "remote": {
      "host": "http://[ec2 public hostname]:9200"
    },
    "index": "bank",
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "bank"
  }
}

Verify that your documents were indexed:

GET /bank/accounts/_search?pretty
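Besides running a search, the response returned by the reindex call can be checked directly. The sketch below assumes the response carries the `total`, `created`, `updated`, `timed_out` and `failures` fields; `summarize_reindex` is a hypothetical helper, and the sample numbers are made up for illustration.

```python
def summarize_reindex(response):
    """Condense a reindex response into a small pass/fail summary."""
    failures = response.get("failures", [])
    total = response.get("total", 0)
    # Documents actually written to the destination index.
    written = response.get("created", 0) + response.get("updated", 0)
    ok = (not response.get("timed_out", False)
          and not failures
          and written == total)
    return {"ok": ok, "total": total, "written": written,
            "failures": len(failures)}


if __name__ == "__main__":
    # Hypothetical response for a 1,000-document "bank" index.
    sample = {"took": 639, "timed_out": False, "total": 1000,
              "created": 1000, "updated": 0, "failures": []}
    print(summarize_reindex(sample))
```

A summary with `ok` set to false does not always mean data was lost (version conflicts, for instance, also lower the written count), but it is a signal to inspect the full response before decommissioning the old cluster.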

Reindex from the Amazon Elasticsearch Service into Elastic Cloud

Presently, the most recent Elasticsearch version available on the Amazon Elasticsearch Service is 2.3, a major version behind the current Elasticsearch release. If you would like to take advantage of the most recent Elasticsearch 5.1 features, you can reindex directly into Elastic Cloud from the Amazon Elasticsearch Service.

The reindex command is the same one used in the Elasticsearch on EC2 example, with one minor change. The Amazon Elasticsearch Service proxies requests to port 9200 or 9243, so it’s necessary to specify port 80 (HTTP) or port 443 (HTTPS) with the host. Not providing the port will result in an error.

HTTPS Example

POST _reindex
{
  "source": {
    "remote": {
      "host": "https://[aws es public hostname]:443"
    },
    "index": "bank",
    "query": {
      "match_all": {}
    }
  },
  "dest": {
    "index": "bank"
  }
}
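Since a missing port results in an error, it can help to normalize the remote URL before building the request body. The following is a minimal sketch, assuming the endpoint is given as a plain http:// or https:// URL; `with_explicit_port` and `reindex_body` are hypothetical helpers, and `es.example.com` in the usage note stands in for your real endpoint.

```python
import json
from urllib.parse import urlsplit

# Ports implied by each scheme when the URL does not name one.
DEFAULT_PORTS = {"http": 80, "https": 443}


def with_explicit_port(url):
    """Return scheme://host:port, adding the scheme's default port if absent."""
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS or not parts.hostname:
        raise ValueError("expected an http:// or https:// URL, got %r" % url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return "%s://%s:%d" % (parts.scheme, parts.hostname, port)


def reindex_body(remote_url, source_index, dest_index=None):
    """Build a reindex-from-remote body like the ones shown above."""
    return {
        "source": {
            "remote": {"host": with_explicit_port(remote_url)},
            "index": source_index,
            "query": {"match_all": {}},
        },
        "dest": {"index": dest_index or source_index},
    }


if __name__ == "__main__":
    print(json.dumps(reindex_body("https://es.example.com", "bank"), indent=2))
```

The printed JSON can be pasted into the Kibana console after the reindex request line. A URL that already names a port, such as one ending in `:9200`, is passed through unchanged.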

How to Restore Snapshots to An Elastic Cloud Cluster

In this example, we’ll show you how to snapshot an index from EC2 into an S3 bucket and then restore the snapshot from the S3 bucket into an Elastic Cloud cluster. One thing to note is that S3 buckets shared between EC2 and Elastic Cloud clusters need to reside in the same region.

Step I - Snapshot from Elasticsearch on EC2 into S3

First, you’ll need to create an S3 bucket in AWS. In this example, we’ve created the bucket in the “US Standard” region, which is synonymous with “US-East”.

You’ll need to edit the bucket policy and should use these recommended S3 permissions:

{
    "Id": "Policy1477318346906",
    "Version": "2012-10-17",
    "Statement": [
    {
        "Sid": "Stmt1477318310202",
        "Action": "s3:*",
        "Effect": "Allow",
        "Resource": "arn:aws:s3:::[bucket name]/*",
        "Principal": "*"
    }
    ]
}

Next, we’ll need to create a snapshot repository within the Elasticsearch cluster running on EC2:

curl -XPUT 'localhost:9200/_snapshot/[repository name]?verify=false&pretty' -d'
{
  "type": "s3",
  "settings": {
    "access_key": "[access key]",
    "secret_key": "[secret key]",
    "bucket": "[bucket name]"
  }
}'

Now, let’s go ahead and snapshot the EC2 index into the repository:

curl -XPUT 'localhost:9200/_snapshot/[repository name]/snapshot_1?pretty' -d'
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": true
}'

Now, verify the index snapshot exists in S3 by going into the AWS S3 console and viewing the contents of the bucket.
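The snapshot request returns as soon as the snapshot is accepted unless `wait_for_completion=true` is passed, so files may appear in the bucket while the snapshot is still running. Fetching `GET /_snapshot/[repository name]/snapshot_1` from the EC2 cluster reports the snapshot's state. The sketch below shows one way to interpret that response; the helper names are hypothetical, and the sample response is abbreviated and made up.

```python
def snapshot_state(response, snapshot_name):
    """Return the state of the named snapshot, or None if it is not listed."""
    for snapshot in response.get("snapshots", []):
        if snapshot.get("snapshot") == snapshot_name:
            return snapshot.get("state")
    return None


def snapshot_is_restorable(response, snapshot_name):
    """Only a snapshot that finished with state SUCCESS is treated as safe."""
    return snapshot_state(response, snapshot_name) == "SUCCESS"


if __name__ == "__main__":
    # Abbreviated, hypothetical response from the snapshot GET request.
    sample = {"snapshots": [{
        "snapshot": "snapshot_1",
        "state": "SUCCESS",
        "indices": ["bank"],
        "shards": {"total": 5, "failed": 0, "successful": 5},
    }]}
    print(snapshot_state(sample, "snapshot_1"))
    print(snapshot_is_restorable(sample, "snapshot_1"))
```

A state of `IN_PROGRESS` means the snapshot is still being written, while `PARTIAL` or `FAILED` means at least some shards were not captured; in either case it is worth resolving the problem before moving on to the restore step.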

Step II - Restore Snapshot from S3 into Elastic Cloud

First, we’ll need to create the Snapshot Repository for the Elasticsearch cluster running in Elastic Cloud:

curl -u "elastic:[password]" -XPUT '[elastic cloud instance host]:9200/_snapshot/[repository name]?verify=false&pretty' -d'
{
  "type": "s3",
  "settings": {
    "access_key": "[access key]",
    "secret_key": "[secret key]",
    "bucket": "[bucket name]"
  }
}'

Next, we’ll restore the snapshot on S3 into Elastic Cloud. If an index with the same name as one being restored already exists in the Elastic Cloud cluster, delete that index before restoring the snapshot.

curl -u "elastic:[password]" -XDELETE '[elastic cloud instance host]:9200/bank'

Now, restore the snapshot:

curl -u "elastic:[password]" -XPOST '[elastic cloud instance host]:9200/_snapshot/[repository name]/snapshot_1/_restore?pretty' -d'
{
  "indices": "*",
  "ignore_unavailable": true,
  "include_global_state": true
}'

Finally, verify the index has been restored into Elastic Cloud by running a query:

curl -u "elastic:[password]" -XGET '[elastic cloud instance host]:9200/bank/_search?pretty'

While it’s possible to restore a snapshot from 2.4.1 into 5.1, the restore will fail if the indices rely on features that were changed or removed by breaking changes between the two versions. Also, any indices created in 1.x can’t be restored into 5.1.

The safest approach is to restore into the same version, then upgrade to 5.1.
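To spot indices created in 1.x before attempting a restore, the `index.version.created` setting returned by `GET /_settings` on the old cluster can be inspected. The sketch below rests on an assumption worth verifying against your own cluster: that the setting is a numeric ID whose leading digits encode the major version (for example `2040199` for 2.4.1). The helper names and sample data are hypothetical.

```python
def created_major(version_id):
    """Extract the major version from a version ID such as 2040199.

    Assumes the ID is major * 1,000,000 plus minor, revision and build digits.
    """
    return int(version_id) // 1000000


def indices_too_old_for_5x(settings_response):
    """List indices created before 2.x, which can't be restored into 5.1."""
    too_old = []
    for name, body in settings_response.items():
        created = body["settings"]["index"]["version"]["created"]
        if created_major(created) < 2:
            too_old.append(name)
    return sorted(too_old)


if __name__ == "__main__":
    # Hypothetical, abbreviated response of GET /_settings on the old cluster.
    sample = {
        "bank": {"settings": {"index": {"version": {"created": "2040199"}}}},
        "logs-2014": {"settings": {"index": {"version": {"created": "1070599"}}}},
    }
    print(indices_too_old_for_5x(sample))
```

Any index this reports would need to be reindexed on the old cluster, or left out of the snapshot, before the restore into 5.1 is attempted.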

Summary

Ready to give it a try? A free 14-day trial is available, on us. No credit card required.

Ready to make the switch but not sure about pricing? Pricing is a snap with the pricing calculator. Select cluster size, region and high availability preferences and the price is calculated instantly. Sign up for an annual agreement and receive a 15% discount.