Plan for productionedit

Elasticsearch Service supports a wide range of configurations. With such flexibility comes great freedom, but also the first rule of deployment planning: Your deployment needs to be matched to the workloads that you plan to run on your Elasticsearch clusters and Kibana instances. Specifically, this means two things:

Does your data need to be highly available?edit

With Elasticsearch Service, your deployment can be spread across as many as three separate availability zones, each hosted in its own, separate data center. Why this matters:

  • Data centers can have issues with availability. Internet outages, earthquakes, floods, or other events could affect the availability of a single data center. With a single availability zone, you have a single point of failure that can bring down your deployment.
  • Multiple availability zones help your deployment remain available. This includes your Elasticsearch cluster, provided that your cluster is sized so that it can sustain your workload on the remaining data centers and that your indices are configured to have at least one replica.
  • Multiple availability zones enable you to perform changes to resize your deployment with zero downtime.

We recommend that you use at least two data centers for production and three for mission-critical systems. Just one zone might be sufficient, if your Elasticsearch cluster is mainly used for testing or development, but should not be used for production.

The data in your Elasticsearch clusters is also backed up every 30 minutes, 4 hours, or 24 hours, depending on which snapshot interval you choose. These regular intervals provide an extra level of redundancy. We do support snapshot and restore, regardless of whether you use one, two, or three data centers. However, with only a single data center and in the event of an outage, it might take a while for your cluster come back online. Using a single availability zone also leaves your cluster exposed to the risk of data loss, if the backups you need are not useable (failed or partial snapshots missing the indices to restore) or no longer available by the time that you realize that you might need the data (snapshots have a retention policy).

Clusters that use only one availability zone are not highly available and are at risk of data loss. To safeguard against data loss, you must use at least two data centers.

Do you know when to scale?edit

Knowing how to scale your deployment is critical, especially when unexpected workloads hits. Don’t forget to check your performance metrics to make sure your deployments are healthy and can cope with your workloads.

Scaling with Elasticsearch Service is easy: simply log in to the Elasticsearch Service Console, select your deployment, select edit, and either increase the number of zones or the size per zone.

Memory tends to be the limiting factor for Elasticsearch. If you would like to learn more about why memory is so important for Elasticsearch, we’ve got an in-depth on article on Elasticsearch and memory that explains this topic in detail. We also recommend reading Sizing Elasicsearch: Scaling up and out to identify which questions to ask yourself when determining which cluster size is the best fit for your Elasticsearch use case.