18 June 2018 News

Upgrading Your Hosted Cluster With Our Elasticsearch Service

By Yoav Derazon

There's more to a managed service for Elasticsearch than just good compute and storage infrastructure. Consider upgrading, for example. It's an important part of managing the life cycle of your Elastic Stack deployment. But how easy is upgrading to newer versions on your managed service? In this blog post, we’ll evaluate the upgrade process of our Elasticsearch Service in the context of other offerings out there.

Elastic - the creators of Elasticsearch, Kibana, Beats, and Logstash (together, the Elastic Stack) - is continuously adding features, fixing bugs, and improving the end-user experience, with minor releases shipping every 6-8 weeks and major ones shipping once a year. In addition, Elastic occasionally upgrades deployments to the next minor version in response to newly discovered security vulnerabilities (no action needed on your part!). We do this because we believe organizations should be able to upgrade their production deployments to newer and better versions safely and without disrupting service, so that they can take full advantage of the latest features the Elastic Stack has to offer.

When it comes to ease of upgrading, not all hosted Elastic Stack services out there are created equal. For example, the Amazon Elasticsearch Service, which is not the same as our Elasticsearch Service, requires a manual procedure involving snapshotting of the data, spinning up a new cluster, uploading the data to the new cluster, and then spinning down the previous cluster. This also involves updating all the clients that access Elasticsearch with the new endpoint URL. Such a complex playbook operation, which is time-consuming and error-prone, makes users opt to not upgrade as frequently since the uptime and reliability of the service take precedence over having new features.

Here’s an example of the upgrade procedure documented for the Amazon Elasticsearch Service:

image3.png

A few observations:

  1. You need to manually create a snapshot of the current cluster and restore it into a new one. The new cluster has a different endpoint, so you must either repoint every client to it or update DNS records to account for the new cluster address. In contrast, our Elasticsearch Service takes the snapshot for you and keeps the same endpoint, as you will see below.
  2. To figure out what you need to do, Amazon points you to the documentation (which is provided by Elastic). 
  3. You need to remember to spin down the old cluster to avoid excess charges. Elasticsearch Service will do that automatically, making sure you incur minimal costs due to multiple clusters running in parallel. This can be significant for large clusters.
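
To make the manual playbook concrete, here is a minimal sketch of the REST calls it boils down to, using the standard Elasticsearch snapshot and restore endpoints. The repository name, snapshot name, and bucket are made-up placeholders, and the function only builds the list of calls rather than sending them. A given hosted service may require additional repository settings or request signing that are not shown here.

```python
import json


def manual_upgrade_requests(repo, snapshot, bucket):
    """Return the REST calls of a manual snapshot-and-restore upgrade, in order.

    Each entry is (cluster, method, path, body). Nothing is sent over the
    network; all names passed in are hypothetical placeholders.
    """
    repo_body = {"type": "s3", "settings": {"bucket": bucket}}
    return [
        # Register the same snapshot repository on both clusters.
        ("old", "PUT", f"/_snapshot/{repo}", repo_body),
        ("new", "PUT", f"/_snapshot/{repo}", repo_body),
        # Snapshot all indices on the old cluster and wait for completion.
        ("old", "PUT", f"/_snapshot/{repo}/{snapshot}?wait_for_completion=true", None),
        # Restore that snapshot on the freshly created cluster.
        ("new", "POST", f"/_snapshot/{repo}/{snapshot}/_restore", None),
    ]


if __name__ == "__main__":
    calls = manual_upgrade_requests("upgrade-repo", "pre-upgrade", "my-snapshot-bucket")
    for cluster, method, path, body in calls:
        print(cluster, method, path, json.dumps(body) if body else "")
```

Even in this reduced form, the calls are only part of the work: repointing clients and spinning down the old cluster still have to happen outside of Elasticsearch.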

Making Upgrades Stupid Easy with Elasticsearch Service

Over half of Elastic Cloud customers are on Elastic Stack version 6.x, demonstrating a relatively high adoption rate of the latest software. That is, in part, because we make it easy to upgrade with the click of a button:

image2.gif

While this may seem like an easy task, what happens behind the scenes is a complex set of steps that ensure the cluster is upgraded safely, with no downtime and no data loss (some version-to-version exclusions apply; keep reading). Even if something goes wrong, the system can always roll back to the previous version. With Elasticsearch Service on Elastic Cloud, you never have to worry about your data.


At a high level, a rolling upgrade has the following set of steps: 

  1. Running checks to make sure the source and target versions have a well-tested upgrade path, and to warn users of any index and configuration incompatibilities.
  2. Taking a snapshot of the current indices. This is done to ensure that we always have a safe “savepoint” to return to in case the upgrade fails.
  3. Creating a new set of nodes with the new version and waiting for them to join the cluster.
  4. Migrating all shards to the new nodes. This is all done live, if possible, with no snapshot restoration needed.
  5. Routing traffic to the new nodes until all connections to the old cluster are drained.
  6. Deleting the old nodes.
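
The ordering of these steps, and the role of the snapshot as a fallback, can be illustrated with a small in-memory simulation. This is only a toy model under stated assumptions, not how Elasticsearch Service is implemented: the `Node` class, the `supported_paths` set, and the `fail_at` switch are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    version: str
    shards: list = field(default_factory=list)


def rolling_upgrade(nodes, target_version, supported_paths, fail_at=None):
    """Simulate the six steps on an in-memory cluster; return (nodes, log)."""
    log = []
    source_version = nodes[0].version
    # Step 1: refuse upgrade paths that are not known to be safe.
    if (source_version, target_version) not in supported_paths:
        raise ValueError(f"no tested upgrade path {source_version} -> {target_version}")
    log.append("checked upgrade path")
    # Step 2: keep a copy of the current state as the savepoint.
    snapshot = [Node(n.name, n.version, list(n.shards)) for n in nodes]
    log.append("took snapshot")
    try:
        # Step 3: create one new-version node per existing node.
        new_nodes = [Node(f"{n.name}-new", target_version) for n in nodes]
        log.append("added new nodes")
        # Step 4: move shards from each old node to its replacement.
        for old, new in zip(nodes, new_nodes):
            if fail_at == "migrate":
                raise RuntimeError("simulated migration failure")
            new.shards.extend(old.shards)
            old.shards.clear()
        log.append("migrated shards")
        # Steps 5 and 6: switch traffic over, then drop the old nodes.
        log.append("routed traffic to new nodes")
        log.append("deleted old nodes")
        return new_nodes, log
    except RuntimeError:
        # Any failure after the snapshot falls back to the savepoint.
        log.append("rolled back to snapshot")
        return snapshot, log


if __name__ == "__main__":
    cluster = [Node("a", "6.2.4", ["s0", "s1"]), Node("b", "6.2.4", ["s2"])]
    upgraded, steps = rolling_upgrade(cluster, "6.3.0", {("6.2.4", "6.3.0")})
    print([n.version for n in upgraded], steps)
```

The point of the sketch is the shape of the procedure: the compatibility check and the snapshot come before anything is changed, so every later step has a known state to fall back to.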

This process is one of four upgrade strategies designed for our Elasticsearch Service, and the one found most appropriate for this kind of deployment model. It is safe, and it is easy to roll back at any point. Elastic Cloud Enterprise, the self-managed solution, supports other upgrade strategies that can be selected for the desired trade-off between uptime, risk, available spare infrastructure, and speed.

Note:

The above process applies to minor upgrades. For upgrades to a major version, the following pre-step applies:

    1. Verifying that the upgrade can be made by running checks using our migration assistant APIs 

This is required to check whether there are any breaking changes between the versions, such as indices that need to be reindexed, or flags and toggles that are not backward compatible or need a default value change. It can also be used to learn about deprecated APIs, which may affect integrated systems that consume the indices. For a major version upgrade, a different strategy will be applied, as a full cluster restart may be required. For example, upgrading to 6.x involves a configuration change that mandates TLS, and this change requires a full cluster restart.
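
As an illustration of what such a pre-upgrade check can look like from the outside, the sketch below groups indices by the action a migration check reports for them. It assumes the response shape of the X-Pack migration assistance API (`GET /_xpack/migration/assistance`) available in the 5.6 and 6.x releases, where each index is listed with an `action_required` value; whether that is exactly the API the service calls internally is an assumption. The sample response and the helper name `indices_needing_action` are made up for this example.

```python
def indices_needing_action(assistance_response):
    """Map each required action to the sorted list of indices that need it."""
    actions = {}
    for index, info in assistance_response.get("indices", {}).items():
        actions.setdefault(info["action_required"], []).append(index)
    return {action: sorted(names) for action, names in actions.items()}


if __name__ == "__main__":
    # Hypothetical response body; index names are invented.
    sample = {
        "indices": {
            ".watches": {"action_required": "upgrade"},
            "logs-2016": {"action_required": "reindex"},
        }
    }
    for action, names in indices_needing_action(sample).items():
        print(action, names)
```

An empty result means the check found nothing that blocks the upgrade; any index listed under an action has to be dealt with before the major version upgrade proceeds.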

By using our Elasticsearch Service, you let us take care of these nuances and stop any operation that may result in a malfunctioning cluster or data loss.

A glimpse of the action behind the scenes (recent steps top to bottom):

image1.png

Summary

The benefits of using Elasticsearch Service on Elastic Cloud:

  • Minimal downtime. Zero downtime when upgrading to a minor version, and a faster process when upgrading to a major version
  • Always the latest version. By removing barriers to upgrade, users can upgrade more frequently and enjoy features, enhanced performance, and bug fixes
  • Time-savings. No need for a complex playbook to plan an upgrade. Plus, you get to keep the same endpoint
  • Improved security. Automatic upgrades in case of newly discovered vulnerabilities ensure a safer operation
  • Reduced risk. Automatic rollback and robust snapshotting reduce the risk of data loss or extended downtime

Simplify your upgrade process and stay on the latest version of the stack by switching to Elastic Cloud with a free 14-day trial!