For time-series use cases such as logging, metrics, and APM, you typically store data in time-based indexes. As this data ages, practicing good index management ensures that your data is being stored in the most cost-effective way possible. In practical terms this means that, if you customize a deployment that includes more than one data configuration, you must also specify how Elasticsearch Service should manage your indices.
In a hot-warm architecture, you can use index curation to specify where new indices are created initially and where they are moved to later on. However, index curation will soon be deprecated in favor of index lifecycle management (ILM), which offers additional features and more fine-grained control over indices. For instance, with ILM you can enable automatic rollover of index aliases to new indices when existing indices become too large or too old, and you can set indices to be deleted when they are no longer useful.
Before you begin
Configuring index management is part of the larger task of creating deployments and customizing them. Which index management methods are available to you depends on which version of the Elastic Stack you are using. Index lifecycle management (ILM) requires version 6.7 or later and is the new default index management method. For versions before 6.7, index curation is the only available method to manage your indices. Starting with version 7.8, we strongly encourage using ILM to avoid the potential for unpredictable behavior. Elasticsearch 7.8 and later runs with the ILM API always enabled, and other stack products and features provide default policies.
For Elasticsearch Service, there are enough data configurations available to create an index lifecycle policy that covers the hot and warm phases and you can also make use of the delete phase, along with several other features of ILM.
To configure index management when you customize a new deployment:
On the Index Management page, select the index management method that you want to use:
- Index lifecycle management (Elastic Stack 6.7 and later)
Uses the ILM feature of the Elastic Stack that provides an integrated and streamlined way to manage time-based data, making it easier to follow best practices for managing your indices. Compared to index curation, ILM gives you more fine-grained control over the lifecycle of each index.
To configure index lifecycle management:
- Select Index Lifecycle Management (ILM).
Review the available node attributes for each of your data configurations.
Node attributes are part of the deployment template and add defining metadata attributes to each data instance configuration that tell you what they can be used for, such as data: warm. You use these node attributes in Kibana when you configure your index lifecycle management policy, whether that is a hot-warm policy or one that uses any of the other supported lifecycle management features.
- Access Kibana and go to Management > Elasticsearch > Index Lifecycle Policies.
- Click Create policy to start setting up a new index lifecycle policy.
- Enter a name for your policy.
Define the phases of the index lifecycle.
- Hot. The index is actively being queried and written to. You can roll over to a new index when the original index reaches a specified size, document count, or age. When a rollover occurs, a new index is created, added to the index alias, and designated as the new hot index. You can still query the previous indices, but you only ever write to the hot index. See Setting a rollover action.
- Warm. The index is typically searched at a lower rate than when the data is hot. The index is not used for storing new data, but might occasionally receive late-arriving data, for example, from a Beat with a network problem that's now fixed. You can optionally reduce the number of replicas and move the shards to a different set of nodes with smaller or less performant hardware. You can also reduce the number of primary shards and force merge the index into fewer segments.
- Cold. The index is no longer being updated and is seldom queried, but is still searchable. If you have a big deployment, you can move it to even less performant hardware. You might also reduce the number of replicas because you expect the data to be queried less frequently. To keep the index searchable for a longer period, and reduce the hardware requirements, you can use the freeze action. Queries are slower on a frozen index because the index is reloaded from the disk to RAM on demand.
- Delete. The index is no longer relevant. You can define when it is safe to delete it.
The index lifecycle always includes an active hot phase. The warm, cold, and delete phases are optional. For example, you might define all four phases for one policy and only a hot and delete phase for another. See Actions for more information on the actions available in each phase.
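The phases described above can also be written out as a policy body. The following is a minimal sketch, not a recommended configuration: it builds the JSON body that the Elasticsearch put lifecycle policy API (`PUT _ilm/policy/<policy_name>`) accepts. The thresholds (`50gb`, `30d`, `7d`, `90d`) are assumptions chosen for illustration, and the `data: warm` node attribute is assumed to match one of your data configurations.

```python
import json

# Illustrative ILM policy with a hot, warm, and delete phase.
# All sizes and ages below are assumptions for this sketch.
policy = {
    "policy": {
        "phases": {
            # Hot: roll over to a new index once the current one is too large or too old.
            "hot": {
                "actions": {
                    "rollover": {"max_size": "50gb", "max_age": "30d"}
                }
            },
            # Warm: move shards to nodes tagged data: warm, then shrink and force merge.
            "warm": {
                "min_age": "7d",
                "actions": {
                    "allocate": {
                        "number_of_replicas": 1,
                        "require": {"data": "warm"},
                    },
                    "shrink": {"number_of_shards": 1},
                    "forcemerge": {"max_num_segments": 1},
                },
            },
            # Delete: remove the index once it is no longer useful.
            "delete": {
                "min_age": "90d",
                "actions": {"delete": {}},
            },
        }
    }
}

# Print the body so it can be pasted into the Kibana Console or sent with an HTTP client.
print(json.dumps(policy, indent=2))
```

The cold phase is omitted here because it is optional. A policy that only needs the hot and delete phases would drop the `warm` entry in the same way.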
There are additional benefits to ILM, such as integration with cross-cluster replication, which lets you automatically unfollow read-only indices. You can also set the recovery priority action, so that newer indices recover faster than older ones. To learn more about creating lifecycle policies and about all of the features that are available with ILM, see the index lifecycle management documentation for the Elastic Stack.
- Index curation (Elastic Stack 6.6 and earlier)
Creates new indices on hot nodes first and moves them to warm nodes later on, based on the index patterns you specify. Also manages replica counts for you, so that all shards of an index can fit on the right data nodes. Compared to index lifecycle management, index curation for time-based indices supports only one action, to move indices from nodes on one data configuration to another, but it is more straightforward to set up initially and all setup can be done directly from the Elasticsearch Service console.
If you need to delete indices once they are no longer useful, you can run Curator or your own automation script on-premises to manage indices for Elasticsearch clusters hosted on Elasticsearch Service.
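As a rough sketch of what such an automation script might decide, the following helper picks out indices that are older than a given number of days. It assumes that index names end in a `-YYYY.MM.DD` date suffix; the function name `indices_older_than` and the sample index names are hypothetical. The sketch only selects candidates; actually removing them would be a separate call to the Elasticsearch delete index API.

```python
from datetime import date, datetime, timedelta

def indices_older_than(index_names, max_age_days, today):
    """Return the names whose date suffix is more than max_age_days before today."""
    cutoff = today - timedelta(days=max_age_days)
    expired = []
    for name in index_names:
        # Assumption: the date is the part after the last hyphen, e.g. 2019.01.01.
        suffix = name.rsplit("-", 1)[-1]
        try:
            index_day = datetime.strptime(suffix, "%Y.%m.%d").date()
        except ValueError:
            # Not a time-based index name; leave it alone.
            continue
        if index_day < cutoff:
            expired.append(name)
    return expired

names = ["filebeat-2019.01.01", "filebeat-2019.03.30", ".kibana"]
print(indices_older_than(names, 30, date(2019, 4, 1)))
# prints ['filebeat-2019.01.01']
```

Passing `today` explicitly keeps the selection logic easy to test. A scheduled script would typically pass `date.today()`.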
To configure index curation:
- Select Index Curation.
- Select the hot data configuration where new indices get created initially.
- Select the warm nodes where older indices get moved to later on when they get curated.
Specify which indices get curated by including at least one index pattern.
By default, the pattern is *, which means that all indices get curated. For logging use cases, you could specify to curate only the filebeat-* index pattern, for example.
- Specify the time interval after which indices get curated.
- Click Create deployment.
After you have completed these steps, continue with creating your deployment.
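To illustrate how the index patterns from the curation steps narrow down which indices are affected, here is a small sketch that approximates wildcard matching with Python's `fnmatch` module. This is only an approximation for illustration, not how Elasticsearch Service evaluates patterns internally; the function name `curated_indices` and the sample index names are hypothetical.

```python
from fnmatch import fnmatchcase

def curated_indices(index_names, patterns=("*",)):
    """Return the index names that match at least one wildcard pattern."""
    return [
        name
        for name in index_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]

names = ["filebeat-2019.05.01", "metricbeat-2019.05.01", "apm-2019.05.01"]

# The default pattern * matches every index.
print(curated_indices(names))

# Restricting curation to Filebeat indices only.
print(curated_indices(names, ["filebeat-*"]))
```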