Hot-warm architecture template

A template typically used for time-series analytics and log aggregation workloads that benefit from tiered storage and automatic index curation. It includes features to manage resources efficiently when you need greater capacity, such as:

  • A tiered architecture with two different types of data nodes, hot and warm.
  • Time-based indices, with automatic index curation to move indices from hot to warm nodes over time by changing their shard allocation.

The two types of data nodes in a hot-warm architecture each have their own characteristics:

Hot data node
Handles all indexing of new data in the cluster and holds the most recent daily indices, which tend to be queried most frequently. Because indexing is an I/O-intensive activity, these nodes need to run on more powerful hardware with SSD storage.
Warm data node
Handles a large number of read-only indices that are queried infrequently. Because the indices are read-only, warm nodes can use very large spindle drives instead of SSD storage, reducing the overall cost of retaining data over time while keeping it accessible for queries.
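The tiering relies on each node advertising which tier it belongs to. A minimal sketch of how this is typically wired up in `elasticsearch.yml`, assuming a custom node attribute named `data` (the attribute name is an assumption for illustration; the template may use a different one):

```yaml
# Hypothetical elasticsearch.yml fragments; the attribute name "data"
# is an assumption for illustration.

# On each hot node (powerful hardware, SSD storage, handles indexing):
node.attr.data: hot

# On each warm node (large spindle drives, read-only indices):
node.attr.data: warm
```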

Index curation

One of the key features of a hot-warm architecture, time-based index curation automates the task of moving data from hot to warm nodes as it ages. When you deploy a hot-warm architecture, Elasticsearch Service performs regular index curation according to these rules:

  • Index curation moves indices from one Elasticsearch node to another by changing their shard allocation, always from hot to warm.
  • Index curation is always time-based and takes place when an index reaches the age specified, in days, weeks, or months.
  • Index curation always targets indices according to one or more matching patterns. If an index matches a pattern, Elasticsearch Service moves it from a hot node to a warm node.

While you create your deployment, you can define which indices get curated and when. To learn more about index curation, see Configure index management.
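Under the hood, moving an index from hot to warm is a shard allocation change, expressed through Elasticsearch's index-level shard allocation filtering settings. A sketch of that mechanism, assuming nodes are tagged with a custom `data` node attribute (both the index name and the attribute name here are illustrative, not taken from the template):

```shell
# Illustrative only: the index name and the "data" node attribute are assumptions.

# At creation time, a new daily index is pinned to hot nodes:
curl -X PUT "localhost:9200/logs-2018.01.31" \
  -H 'Content-Type: application/json' -d'
{
  "settings": { "index.routing.allocation.require.data": "hot" }
}'

# When the index reaches the configured age, curation effectively flips
# the same setting; Elasticsearch then relocates the shards to warm nodes:
curl -X PUT "localhost:9200/logs-2018.01.31/_settings" \
  -H 'Content-Type: application/json' -d'
{
  "index.routing.allocation.require.data": "warm"
}'
```

The relocation happens online: the index stays searchable while its shards move from hot to warm nodes.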

To learn more about how hot-warm architectures work with Elasticsearch, see “Hot-Warm” Architecture in Elasticsearch 5.x.

In this template

The following features are included with this template:

Amazon Web Services (AWS)
  • Elasticsearch:

    • Data nodes - hot: Starts at 4 GB memory x 2 availability zones. Hosted on AWS i3 instances.
    • Data nodes - warm: Starts at 4 GB memory x 2 availability zones. Data nodes must be at least 4 GB in size. Hosted on AWS d2 instances.
    • Master nodes:

      An additional master-eligible node is added when you choose 2 availability zones (to create a quorum of 3).

      When 1 AZ or 3 AZs are selected, the data nodes act as master-eligible nodes and there is no requirement for an additional master-eligible node.

      Configurations beyond 5 nodes per AZ can also spin up a dedicated set of master-eligible nodes (always across 3 AZs) to offload the data nodes. Hosted on AWS r4 instances.

  • Kibana: Starts at 1 GB memory x 1 availability zone. Hosted on AWS r4 instances.
  • Machine learning (ML): Disabled by default. The functionality is pre-wired into the template, but you must explicitly enable it in the UI. Hosted on AWS m5 instances.
  • APM (application performance monitoring): The functionality is pre-wired into the template and is enabled by default. Hosted on AWS r4 instances.
Google Cloud Platform (GCP)
  • Elasticsearch:

    • Data nodes - hot: Starts at 4 GB memory x 2 availability zones. Hosted on custom I/O-optimized GCP instances.
    • Data nodes - warm: Starts at 4 GB memory x 2 availability zones. Data nodes must be at least 4 GB in size. Hosted on storage-optimized custom GCP instances.
    • Master nodes:

      An additional master-eligible node is added when you choose 2 availability zones (to create a quorum of 3).

      When 1 AZ or 3 AZs are selected, the data nodes act as master-eligible nodes and there is no requirement for an additional master-eligible node.

      Configurations beyond 5 nodes per AZ can also spin up a dedicated set of master-eligible nodes (always across 3 AZs) to offload the data nodes. Hosted on custom memory-optimized GCP instances.

  • Kibana: Starts at 1 GB memory x 1 availability zone. Hosted on custom memory-optimized GCP instances.
  • Machine learning (ML): Disabled by default. The functionality is pre-wired into the template, but you must explicitly enable it in the UI. Hosted on custom CPU-optimized GCP instances.
  • APM (application performance monitoring): The functionality is pre-wired into the template and is enabled by default. Hosted on custom memory-optimized GCP instances.
Microsoft Azure
  • Elasticsearch:

    • Data nodes - hot: Starts at 4 GB memory x 2 availability zones. Hosted on Azure L32sv2 instances.
    • Data nodes - warm: Starts at 4 GB memory x 2 availability zones. Data nodes must be at least 4 GB in size. Hosted on Azure E16sv3 instances with extra persistent storage.
    • Master nodes:

      An additional master-eligible node is added when you choose 2 availability zones (to create a quorum of 3).

      When 1 AZ or 3 AZs are selected, the data nodes act as master-eligible nodes and there is no requirement for an additional master-eligible node.

      Configurations beyond 5 nodes per AZ can also spin up a dedicated set of master-eligible nodes (always across 3 AZs) to offload the data nodes. Hosted on Azure E32sv3 instances.

  • Kibana: Starts at 1 GB memory x 1 availability zone. Hosted on Azure E32sv3 instances.
  • Machine learning (ML): Disabled by default. The functionality is available in the template, but you must explicitly enable it in the UI. Hosted on Azure D64sv3 instances.
  • APM (application performance monitoring): The functionality is available in the template and is enabled by default (free tier, 0.5 GB). Hosted on Azure E32sv3 instances.