Run Elasticsearch in production

Many teams rely on Elasticsearch to run their key services. To ensure these services remain available and responsive under production workloads, you can design your deployment with the appropriate level of resilience, and apply performance optimizations tailored to your environment and use case.

Elasticsearch is built to be always available and to scale with your needs. It does this using a distributed architecture. By distributing your cluster, you can keep Elasticsearch online and responsive to requests.

In cases where built-in resilience mechanisms aren't enough, Elasticsearch offers tools, such as cross-cluster replication and snapshot and restore, to help you fall back or recover quickly. You can also use cross-cluster replication to serve requests based on the geographic location of your users and resources.

Explore the following topics to learn how to build, scale, and optimize your production deployment:

Designing for resilience: Learn the foundations of resilience in Elasticsearch and what it takes to keep your deployment available during hardware failures, outages, or node disruptions. This section covers key concepts like multiple nodes, shards, and replicas, and how to combine them to build resilient architectures.
Scaling considerations: Understand when and how to scale your Elasticsearch deployment effectively. This section explains how to monitor cluster health, optimize performance, and make informed scaling decisions, whether you’re scaling manually in self-managed environments or relying on autoscaling in orchestrated deployments.
Performance optimizations: Learn how to improve Elasticsearch performance across different use cases, including indexing, search, disk usage, and approximate kNN. This section provides targeted recommendations to help you tune your cluster based on workload patterns and resource constraints.
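As a rough illustration of how shards, replicas, and node count combine, the following Python sketch estimates how many shard copies an index creates and what cluster health status you might expect for a given number of data nodes. It relies on one rule: a replica is never allocated on the same node as another copy of the same shard. The `IndexLayout` class and `expected_health` function are hypothetical helpers written for this example, not part of any Elasticsearch client, and the model ignores other allocation factors such as disk watermarks, allocation filtering, and master elections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexLayout:
    """Hypothetical helper describing an index's shard settings."""
    primaries: int  # corresponds to index.number_of_shards
    replicas: int   # corresponds to index.number_of_replicas

    def total_copies(self) -> int:
        # Each primary shard has `replicas` additional copies.
        return self.primaries * (1 + self.replicas)

    def min_nodes_to_assign_all(self) -> int:
        # Copies of the same shard must sit on different nodes,
        # so each shard needs one node per copy.
        return 1 + self.replicas


def expected_health(layout: IndexLayout, data_nodes: int) -> str:
    """Simplified estimate of the cluster health status for one index."""
    if data_nodes < 1:
        return "red"     # no node can hold the primaries
    if data_nodes < layout.min_nodes_to_assign_all():
        return "yellow"  # primaries assigned, some replicas unassigned
    return "green"       # every copy can be assigned


layout = IndexLayout(primaries=3, replicas=1)
print(layout.total_copies())       # 6
print(expected_health(layout, 1))  # yellow
print(expected_health(layout, 2))  # green
```

In this simplified model, an index with one replica on a single-node cluster stays yellow because its replicas have nowhere to go, which is why resilient designs start from multiple nodes.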

Deployment models and operational responsibilities

Your responsibilities when running Elasticsearch in production depend on the deployment type. Some aspects, such as scaling or cluster configuration, might be managed for you, while others require your attention and knowledge:

Self-managed Elasticsearch: You are responsible for setting up and managing nodes, clusters, shards, and replicas. This includes managing the underlying infrastructure, scaling, and ensuring high availability through failover and backup strategies.
Elastic Cloud Hosted: Elastic can autoscale resources in response to workload changes. You can choose from different hardware profiles and deployment architectures to apply sensible defaults for your use case. A good understanding of nodes, shards, and replicas is important, as you are still responsible for managing your data and ensuring cluster performance. For guidance on preparing your deployment, also review Plan for production.
Elastic Cloud Enterprise: Similar to Elastic Cloud Hosted, ECE manages Elasticsearch deployments and automates cluster operations, including scaling and orchestration. However, you are responsible for maintaining the platform itself, including the ECE hosts, operating system updates, and software upgrades. At the deployment level, you must also manage data, monitor performance, and handle shard strategies and capacity planning.
Elastic Cloud on Kubernetes: ECK gives you powerful orchestration capabilities for running Elasticsearch on Kubernetes. It simplifies lifecycle management, component configuration, and upgrades, and it supports autoscaling. However, you're still responsible for the Kubernetes environment and for managing the Elasticsearch deployments themselves. That includes infrastructure sizing, sharding strategies, performance monitoring, and availability planning. Think of ECK as similar to a self-managed environment, but with orchestration and automation benefits.
Elastic Cloud Serverless: You don’t need to worry about nodes, shards, or replicas. These resources are fully automated on the serverless platform, which is designed to scale with your workload. Project performance and data retention are controlled through the Search AI Lake settings.

Note

To understand what Elastic manages and what you're responsible for in Elastic Cloud Hosted and Serverless, refer to Elastic Cloud responsibilities. It outlines the security, availability, and operational responsibilities between Elastic and you.

Additional guidance for production environments

The following topics, covered in other sections of the documentation, offer valuable guidance for running Elasticsearch in production.

Plan your data structure and formatting

Build a data architecture that best fits your needs. Based on your own access and retention policies, you can add warm, cold, and frozen data tiers, and automate deletion of old data.
Normalize event data by adopting the Elastic Common Schema (ECS) so that you can better analyze, visualize, and correlate your events. Elastic integrations use ECS out of the box. If you are writing your own integrations, ECS is recommended.
Consider data streams and index lifecycle management to manage and retain your data efficiently over time.

Tip

Elastic integrations provide default index lifecycle policies, and you can build your own policies for your custom integrations.
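To make the idea of a custom lifecycle policy concrete, here is a minimal Python sketch that builds the JSON body for an index lifecycle management policy with hot, warm, and delete phases. The `build_ilm_policy` function and all of the ages, sizes, and priorities are assumptions chosen for illustration, not recommended values. The sketch only constructs and prints the request body; sending it to a cluster (for example, with a `PUT _ilm/policy/<policy-name>` request) is left out.

```python
import json


def build_ilm_policy(
    rollover_age: str = "30d",
    rollover_size: str = "50gb",
    warm_after: str = "30d",
    delete_after: str = "90d",
) -> dict:
    """Hypothetical helper: build an ILM policy body with illustrative values."""
    return {
        "policy": {
            "phases": {
                # Hot phase: roll over to a new index by age or shard size.
                "hot": {
                    "actions": {
                        "rollover": {
                            "max_age": rollover_age,
                            "max_primary_shard_size": rollover_size,
                        }
                    }
                },
                # Warm phase: lower the recovery priority of older indices.
                "warm": {
                    "min_age": warm_after,
                    "actions": {"set_priority": {"priority": 50}},
                },
                # Delete phase: remove data once it exceeds the retention period.
                "delete": {
                    "min_age": delete_after,
                    "actions": {"delete": {}},
                },
            }
        }
    }


policy = build_ilm_policy()
print(json.dumps(policy, indent=2))
```

Keeping the thresholds as parameters makes it easier to align the policy with your own access and retention requirements before you apply it.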

Security and monitoring

As with any enterprise system, you need tools to secure, manage, and monitor your deployments. Security, monitoring, and administrative features that are integrated into Elasticsearch enable you to use Kibana as a control center for managing a cluster.