Congratulations on signing up for an Elastic Cloud account!
To help you get the most out of your Elasticsearch cluster, we’ve created this checklist for you. If you follow this preflight checklist you will be well on your way to maximizing your cluster’s performance, reliability, and security. This is a good time to sit down with a cup of tea and make sure you’re ready to deploy Elasticsearch to production.
Knowing how to scale your app is critical, especially when unexpected load hits. Scaling with Elastic Cloud is easy: simply sign in, visit the configuration page, and drag the memory slider to the desired level. Memory tends to be the limiting factor in Elasticsearch databases. CPU resources and disk IO are scaled up proportionally with memory as your cluster is resized. If you would like to learn more about why memory is so important for Elasticsearch, we’ve got an in-depth article on Elasticsearch and memory that explains this topic in detail. We also recommend reading Sizing Elasticsearch: Scaling up and out to identify which questions to ask yourself when determining which cluster size is the best fit for your Elasticsearch use case.
For production apps, we highly recommend running Elasticsearch clusters across at least two availability zones. An individual availability zone can suffer an outage, which is why Amazon recommends splitting applications across multiple availability zones. With Elastic Cloud, your Elasticsearch cluster can be spread across as many as three separate availability zones with only a few clicks. This means that short of events that have concurrent impact on several regions, your search cluster will remain available so long as multiple availability zones are enabled on your Elastic Cloud cluster.
We also want to let you know that in the unlikely event of multiple availability zones completely disappearing, your data is backed up to Amazon S3 every 30 minutes for an extra level of redundancy. We take backups seriously at Elastic Cloud, and we’re careful to have multiple levels of redundancy for your data. This is the case even if only a single availability zone is enabled on your account. Whether you have one, two, or three availability zones turned on, your data is safe, though it may take a while to come back online in the event of a data-center outage with only a single availability zone enabled.
Clusters that use only one availability zone are not highly available and can be at risk of data loss if the backups they rely on become corrupted or are unavailable. To safeguard against data loss, you must use at least two data centers.
Elastic Cloud also supports Elasticsearch’s Snapshot and Restore feature. To learn more, see Snapshot and Restore.
The first step in securing your cluster is to ensure that your app is accessing it via SSL. Always use secure HTTPS connections over port 9243. We still allow HTTP connections over port 9200, but we recommend against them and no longer list the HTTP endpoint on the cluster overview page.
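As a minimal sketch of that rule, an app can refuse to start when its configured endpoint is not HTTPS on port 9243. The function name `check_endpoint` and the example hostname are hypothetical; only the two port numbers come from the text above.

```python
from urllib.parse import urlsplit

HTTPS_PORT = 9243  # secure port named in this checklist
HTTP_PORT = 9200   # plain HTTP port, still allowed but not recommended


def check_endpoint(url: str) -> str:
    """Return the URL unchanged if it is HTTPS on port 9243, else raise ValueError.

    Hypothetical helper for illustration; not part of any Elastic client.
    """
    parts = urlsplit(url)
    if parts.scheme != "https":
        raise ValueError(f"expected an https:// endpoint, got scheme {parts.scheme!r}")
    if parts.port != HTTPS_PORT:
        raise ValueError(f"expected port {HTTPS_PORT}, got {parts.port!r}")
    return url


if __name__ == "__main__":
    # Placeholder hostname; substitute the endpoint shown on your cluster overview page.
    print(check_endpoint("https://my-cluster.example.com:9243"))
```

Running a check like this at startup means a misconfigured HTTP endpoint fails loudly instead of silently sending traffic in the clear.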
For Elasticsearch clusters before version 5.0, don’t forget to enable Shield. You also want to keep your cluster endpoint URLs safe, so that others will not be able to access unsecured clusters.
For a more detailed overview of security settings you must consider before you run Elasticsearch in production, read Securing Your Elasticsearch Cluster.
We provide a large number of conveniences on top of Elasticsearch from an operational standpoint; great stuff that lets companies focus more on code, and less on infrastructure. That is our mission, of course, to help solve infrastructure problems with Elasticsearch. One thing that we can’t do quite as well as you, however, is capacity planning. No one knows your application as well as you do. We can, however, provide ample visibility into your clusters to facilitate smarter capacity planning on your end.
The first tool to check out when planning capacity is the cluster performance metrics, which provide a quick and easy way for you to see how a cluster has been performing over the last 24 hours. There is also the Paramedic tool, accessible on the overview page of your account. Paramedic tracks a lot, but the key metrics to look out for are disk utilization for the various shards, the amount of CPU usage, the heap (memory) usage, and finally the amount of disk space used by your app. It’s good to check in periodically and make sure that there’s enough headroom to handle spikes in traffic and projected growth.
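A periodic headroom check can be as simple as comparing the latest readings against limits you choose. The sketch below assumes you have already copied CPU, heap, and disk percentages out of the metrics view by hand or by some other means. The metric names and the threshold values are illustrative assumptions, not Elastic recommendations.

```python
# Hypothetical limits: tune these to your own traffic and growth projections.
DEFAULT_LIMITS = {
    "cpu_percent": 70.0,
    "heap_percent": 75.0,
    "disk_percent": 80.0,
}


def low_headroom(metrics, limits=DEFAULT_LIMITS):
    """Return the sorted names of metrics that have reached or passed their limit.

    A metric missing from `metrics` is treated as 0.0, so it is never flagged.
    """
    return sorted(
        name
        for name, limit in limits.items()
        if metrics.get(name, 0.0) >= limit
    )


if __name__ == "__main__":
    # Made-up sample readings, in percent.
    sample = {"cpu_percent": 35.0, "heap_percent": 82.0, "disk_percent": 60.0}
    print(low_headroom(sample))
```

Any metric the function returns is a candidate reason to scale up before the next traffic spike.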
If you’re unsure about your current capacity, remember that since it only takes a few clicks to scale your Elastic Cloud cluster up and down with zero downtime, it’s easy to experiment with different capacity levels while evaluating your cluster’s performance. Additionally, Elastic Cloud is billed by the hour, so scaling up for a short period of time is quite inexpensive.
Accessing Elasticsearch is easy since it’s just HTTP plus JSON! Be aware that this implies that it’s also easy to build an Elasticsearch client that performs poorly. There are a number of important considerations when choosing an Elasticsearch client. Is connection pooling with keep-alive being used? If not, setting up and tearing down HTTP connections, especially with SSL, will be expensive. Are bulk operations supported and used? Is your development team aware of and using synchronization best practices? If not, performance may be poor and data may wind up missing.
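To make the bulk question concrete, here is a rough sketch of how a request body for Elasticsearch's `_bulk` endpoint is assembled: one JSON action line followed by one JSON document line per item, each terminated by a newline. The helper name `build_bulk_body` and the sample index and documents are hypothetical, and older Elasticsearch versions also expect a `_type` field in the action line, which this sketch omits.

```python
import json


def build_bulk_body(index, docs):
    """Build a newline-delimited JSON body of `index` actions for the _bulk API.

    `docs` is an iterable of (doc_id, source_dict) pairs. Every line, including
    the last, ends with a newline, as the bulk format requires.
    """
    lines = []
    for doc_id, source in docs:
        # Action line: tells Elasticsearch what to do with the next line.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    body = build_bulk_body("logs", [("1", {"msg": "started"}), ("2", {"msg": "stopped"})])
    print(body, end="")
```

Sending one such body in a single request (with the `Content-Type: application/x-ndjson` header) over a reused keep-alive connection avoids paying the connection and SSL setup cost once per document.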
Congratulations! If you’ve followed this guide, you now know what to look for in a well-built app using Elasticsearch. Be sure to read Elasticsearch in Production and Securing Your Elasticsearch Cluster before migrating to production. These and the aforementioned articles are part of our blog, a collection of articles intended to deepen your knowledge of Elasticsearch. Finally, if you’re brand new to Elasticsearch, you might want to start by checking out Exploring Elasticsearch, a free ebook on Elasticsearch.