Best practices in AWS

This section contains additional information about designing and managing an Elasticsearch cluster on your own AWS infrastructure. If you would prefer to avoid these operational details, you may be interested in the hosted Elasticsearch service, which runs on AWS-based infrastructure and is available from https://www.elastic.co/cloud.

Storage

EC2 instances offer a number of different kinds of storage. Please be aware of the following when selecting the storage for your cluster:

Instance Store is recommended for Elasticsearch clusters as it offers excellent performance and is cheaper than EBS-based storage. Elasticsearch is designed to work well with this kind of ephemeral storage because it replicates each shard across multiple nodes. If a node fails and its Instance Store is lost then Elasticsearch will rebuild any lost shards from other copies.
EBS-based storage may be acceptable for smaller clusters (1-2 nodes). Be sure to use provisioned IOPS to ensure your cluster has satisfactory performance.
EFS-based storage is not recommended or supported as it does not offer satisfactory performance. Historically, shared network filesystems such as EFS have not always offered precisely the behaviour that Elasticsearch requires of its filesystem, and this has been known to lead to index corruption. Although EFS offers durability, shared storage, and the ability to grow and shrink filesystems dynamically, you can achieve the same benefits using Elasticsearch directly.

Choice of AMI

Prefer the Amazon Linux 2 AMIs: these images are lightweight, well supported, and include EC2-specific performance enhancements.

Networking

Smaller instance types have limited network performance, in terms of both bandwidth and number of connections. If networking is a bottleneck, avoid instance types with networking labelled as Moderate or Low.
Distribute your nodes across multiple availability zones and use shard allocation awareness to ensure that each shard has copies in more than one availability zone.
Do not span a cluster across regions. Elasticsearch expects that node-to-node connections within a cluster are reasonably reliable and offer high bandwidth and low latency, and these properties do not hold for connections between regions. Although an Elasticsearch cluster will behave correctly when node-to-node connections are unreliable or slow, it is not optimised for this case and its performance may suffer. If you wish to geographically distribute your data, you should provision multiple clusters and use features such as cross-cluster search and cross-cluster replication.
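The multi-zone recommendation above can be sketched as follows. This is a minimal illustration, not a definitive configuration: it assumes the discovery-ec2 plugin is installed, so that setting cloud.node.auto_attributes populates the aws_availability_zone node attribute. The helper names (awareness_settings, render_yaml, shards_in_single_zone) and the shape of their inputs are hypothetical and exist only for this example.

```python
from collections import defaultdict

# Node attribute populated by the discovery-ec2 plugin when
# cloud.node.auto_attributes is true (assumes the plugin is installed).
ZONE_ATTRIBUTE = "aws_availability_zone"


def awareness_settings():
    """Return elasticsearch.yml settings that enable zone-aware allocation."""
    return {
        "cloud.node.auto_attributes": True,
        "cluster.routing.allocation.awareness.attributes": ZONE_ATTRIBUTE,
    }


def render_yaml(settings):
    """Render a flat settings dict as elasticsearch.yml lines."""
    lines = []
    for key, value in settings.items():
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{key}: {value}")
    return "\n".join(lines)


def shards_in_single_zone(shard_copies, node_zones):
    """Return the ids of shards whose copies all sit in one zone.

    shard_copies: iterable of (shard_id, node_name) pairs. This is a
    hypothetical input, for example derived from the cat shards API.
    node_zones: dict mapping node_name to its availability zone.
    """
    zones = defaultdict(set)
    for shard_id, node_name in shard_copies:
        zones[shard_id].add(node_zones[node_name])
    return sorted(s for s, z in zones.items() if len(z) < 2)
```

The rendered lines would go into elasticsearch.yml on every node. The shards_in_single_zone helper is only a sanity check that could be run against a listing of shard locations to spot shards that would be lost entirely if one zone failed.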

Other recommendations

If you have split your nodes into roles, consider tagging the EC2 instances by role to make it easier to filter and view your EC2 instances in the AWS console.
Consider enabling termination protection for all of your data and master-eligible nodes. This helps to prevent accidental termination of these nodes, which could temporarily reduce the resilience of the cluster and cause a potentially disruptive reallocation of shards.
If you run your cluster using one or more auto-scaling groups, consider protecting your data and master-eligible nodes against termination during scale-in. This helps to prevent automatic termination of these nodes, which could temporarily reduce the resilience of the cluster and cause a potentially disruptive reallocation of shards. When these instances are protected against termination during scale-in, you can use shard allocation filtering to gracefully migrate any data off these nodes before terminating them manually. Refer to Index-level shard allocation settings.
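The protection and graceful-migration steps above might be scripted roughly as follows. This is a sketch under stated assumptions rather than a recommended implementation: the client arguments are assumed to behave like boto3.client("autoscaling") and boto3.client("ec2"), the function names are hypothetical, and the drain step uses the cluster-level exclude._name filter (sent to PUT _cluster/settings) instead of the index-level settings, because it covers every index at once.

```python
import json


def protect_from_scale_in(autoscaling_client, group_name, instance_ids):
    """Mark instances as protected from scale-in in an auto-scaling group.

    autoscaling_client is assumed to behave like boto3.client("autoscaling").
    """
    return autoscaling_client.set_instance_protection(
        AutoScalingGroupName=group_name,
        InstanceIds=list(instance_ids),
        ProtectedFromScaleIn=True,
    )


def enable_termination_protection(ec2_client, instance_id):
    """Enable EC2 termination protection for a single instance.

    ec2_client is assumed to behave like boto3.client("ec2").
    """
    return ec2_client.modify_instance_attribute(
        InstanceId=instance_id,
        DisableApiTermination={"Value": True},
    )


def drain_request_body(node_names):
    """Build a JSON body for PUT _cluster/settings that excludes nodes by name.

    Uses cluster-level shard allocation filtering so that Elasticsearch
    moves shards off the named nodes before they are terminated manually.
    """
    settings = {
        "cluster.routing.allocation.exclude._name": ",".join(node_names),
    }
    return json.dumps({"persistent": settings}, sort_keys=True)
```

After sending the drain body, wait until the excluded nodes hold no shards before terminating them, and then reset the exclude setting to null so that it does not affect replacement nodes that reuse the same names.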