Configure index management

For time-series use cases such as logging, metrics, and APM, you typically store data in time-based indices. As this data ages, good index management ensures that it is stored in the most cost-effective way possible. In practical terms, this means that if you customize a deployment that includes more than one data configuration, you must also specify how Elasticsearch Service should manage your indices.

In a hot-warm architecture, configuring index management typically means specifying where new indices get created initially and where they get moved to later on. Additional features are available if you configure index lifecycle management (ILM), including the automatic rollover of index aliases to new indices when existing indices get too large or too old, and the deletion of indices when they are no longer useful to you.

Before you begin

Configuring index management is part of the larger task of creating deployments and customizing them. Which index management methods are available to you depends on which version of the Elastic Stack you are using. Index lifecycle management (ILM) requires version 6.7 or later and is the new default index management method. For versions before 6.7, index curation is the only available method to manage your indices.

For Elasticsearch Service, there are enough data configurations available to create an index lifecycle policy that covers the hot and warm phases and you can also make use of the delete phase, along with several other features of ILM.

Steps

To configure index management when you customize a new deployment:

  1. On the Index Management page, select the index management method that you want to use:

    Index lifecycle management (Elastic Stack 6.7 and later)

    Uses the ILM feature of the Elastic Stack that provides an integrated and streamlined way to manage time-based data, making it easier to follow best practices for managing your indices. Compared to index curation, ILM gives you more fine-grained control over the lifecycle of each index.

    To configure index lifecycle management:

    1. Select Index Lifecycle Management (ILM).
    2. Review the available node attributes for each of your data configurations.

      Node attributes are part of the deployment template and add defining metadata attributes to each data instance configuration that tell you what they can be used for, such as data: hot or data: warm. You use these node attributes in Kibana when you configure your index lifecycle management policy, whether that is a hot-warm policy or one that uses any of the other supported lifecycle management features.
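
As a minimal sketch of how a node attribute such as data: warm is referenced, the following Python builds the two JSON fragments that typically carry it: the ILM allocate action and the index.routing.allocation.require.* index setting. The helper names allocate_action and initial_allocation_setting are hypothetical and exist only for this illustration; the attribute name and values are taken from the examples above and might differ in your deployment template.

```python
import json


def allocate_action(attribute: str, value: str) -> dict:
    """Build an ILM allocate action that requires a node attribute (hypothetical helper)."""
    return {"allocate": {"require": {attribute: value}}}


def initial_allocation_setting(attribute: str, value: str) -> dict:
    """Build the index setting that places a new index on matching nodes (hypothetical helper)."""
    return {f"index.routing.allocation.require.{attribute}": value}


# Assumed attribute values, matching the data: hot and data: warm examples.
warm_allocate = allocate_action("data", "warm")
hot_setting = initial_allocation_setting("data", "hot")

print(json.dumps(warm_allocate))
print(json.dumps(hot_setting))
```

The first fragment belongs in the warm phase of a policy, and the second is the kind of setting an index template could apply so that new indices start on hot nodes.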

      a. Access Kibana and go to Management > Elasticsearch > Index Lifecycle Policies.
      b. Click Create policy to start setting up a new index lifecycle policy.
      c. Enter a name for your policy.
      d. Define the phases of the index lifecycle.

      ILM breaks the lifecycle of an index down into four main phases:

      • Hot. The index is actively being queried and written to. You can roll over to a new index when the original index reaches a specified size, document count, or age. When a rollover occurs, a new index is created, added to the index alias, and designated as the new hot index. You can still query the previous indices, but you only ever write to the hot index. See Setting a rollover action.
      • Warm. The index is typically searched at a lower rate than when the data is hot. The index is not used for storing new data, but might occasionally add late-arriving data, for example, from a Beat with a network problem that’s now fixed. You can optionally reduce the number of replicas and move the shards to a different set of nodes with smaller or less performant hardware. You can also reduce the number of primary shards and force merge the index into fewer segments.
      • Cold. The index is no longer being updated and is seldom queried, but is still searchable. If you have a big deployment, you can move it to even less performant hardware. You might also reduce the number of replicas because you expect the data to be queried less frequently. To keep the index searchable for a longer period, and reduce the hardware requirements, you can use the freeze action. Queries are slower on a frozen index because the index is reloaded from the disk to RAM on demand.
      • Delete. The index is no longer relevant. You can define when it is safe to delete it.

      The index lifecycle always includes an active hot phase. The warm, cold, and delete phases are optional. For example, you might define all four phases for one policy and only a hot and delete phase for another. See Actions for more information on the actions available in each phase.
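
The phases described above map onto the JSON body of a lifecycle policy. The following Python is a sketch that assembles such a body with a hot, warm, and delete phase. The build_policy helper and every threshold in it (50gb, 30d, 7d, 90d) are assumptions chosen for illustration, not recommended values, and the data: warm attribute is assumed to match your deployment template.

```python
import json


def build_policy(rollover_size="50gb", rollover_age="30d",
                 warm_after="7d", delete_after="90d") -> dict:
    """Assemble an ILM policy body (hypothetical helper, illustrative thresholds)."""
    return {
        "policy": {
            "phases": {
                # Hot: roll over when the index gets too large or too old.
                "hot": {
                    "actions": {
                        "rollover": {"max_size": rollover_size, "max_age": rollover_age}
                    }
                },
                # Warm: move to warm nodes, then shrink and force merge.
                "warm": {
                    "min_age": warm_after,
                    "actions": {
                        "allocate": {"require": {"data": "warm"}, "number_of_replicas": 1},
                        "shrink": {"number_of_shards": 1},
                        "forcemerge": {"max_num_segments": 1},
                    },
                },
                # Delete: remove the index once it is no longer relevant.
                "delete": {"min_age": delete_after, "actions": {"delete": {}}},
            }
        }
    }


policy = build_policy()
print(json.dumps(policy, indent=2))
```

The cold phase is left out here because it is optional. A body like this is what Kibana builds for you when you define the phases. It could also be sent to Elasticsearch with a PUT _ilm/policy/<policy-name> request.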

      There are additional benefits to ILM, such as integration with cross-cluster replication, which lets you automatically unfollow read-only indices. You can also use the set priority action, so that newer indices recover faster than older ones. To learn more about creating lifecycle policies and about all of the features that are available with ILM, see the index lifecycle management documentation for the Elastic Stack.

    Index curation

    Creates new indices on hot nodes first and moves them to warm nodes later on, based on the index patterns you specify. Also manages replica counts for you, so that all shards of an index can fit on the right data nodes. Compared to index lifecycle management, index curation for time-based indices supports only one action: moving indices from nodes on one data configuration to another. It is more straightforward to set up initially, however, and all setup can be done directly from the Elasticsearch Service console.

    If you need to delete indices once they are no longer useful, you can run Curator or your own automation script on-premises to manage indices for Elasticsearch clusters hosted on Elasticsearch Service.
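
As a sketch of what the selection logic in such an automation script might look like, the following Python picks the indices that are older than a given number of days, based on a date suffix in the index name. The indices_to_delete function is hypothetical, and the logstash- prefix and YYYY.MM.DD suffix are assumptions about your naming scheme. The script only computes a list and does not contact a cluster.

```python
from datetime import date, datetime, timedelta


def indices_to_delete(index_names, today, max_age_days,
                      prefix="logstash-", date_format="%Y.%m.%d"):
    """Return the names of indices whose date suffix is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    expired = []
    for name in index_names:
        if not name.startswith(prefix):
            continue  # Not one of the indices this script manages.
        try:
            created = datetime.strptime(name[len(prefix):], date_format).date()
        except ValueError:
            continue  # Suffix is not a date, so leave the index alone.
        if created < cutoff:
            expired.append(name)
    return expired


names = ["logstash-2019.01.01", "logstash-2019.03.01",
         "metricbeat-2019.01.01", "logstash-current"]
expired = indices_to_delete(names, today=date(2019, 3, 15), max_age_days=30)
print(expired)
```

Each name in the result could then be removed with a DELETE request for that index. Skipping names that do not parse as dates keeps the script from touching indices it does not recognize.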

    To configure index curation:

    1. Select Index Curation.
    2. Select the hot data configuration where new indices get created initially.
    3. Select the warm nodes that older indices get moved to when they are curated.
    4. Specify which indices get curated by including at least one index pattern.

      By default, the pattern is *, which means that all indices get curated. For logging use cases, you could curate only the indices that match the logstash-*, metricbeat-*, or filebeat-* index patterns, for example.

    5. Specify the time interval after which indices get curated.
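
To check which indices a set of index patterns from step 4 would select before you rely on it, you can approximate the matching locally. The following Python is a rough sketch that uses shell-style wildcard matching from the standard library. The curated_indices function and the sample index names are hypothetical, and the matching is only an approximation of how Elasticsearch Service evaluates patterns.

```python
from fnmatch import fnmatchcase


def curated_indices(index_names, patterns):
    """Return the index names that match at least one wildcard pattern."""
    return [name for name in index_names
            if any(fnmatchcase(name, pattern) for pattern in patterns)]


# Hypothetical index names for illustration.
names = [
    "logstash-2019.03.01",
    "metricbeat-6.7.0-2019.03.01",
    "filebeat-6.7.0-2019.03.01",
    ".kibana_1",
    "apm-6.7.0-span",
]

logging_only = curated_indices(names, ["logstash-*", "metricbeat-*", "filebeat-*"])
everything = curated_indices(names, ["*"])

print(logging_only)
print(everything)
```

In this sketch, the default * pattern selects every name in the list, while the three logging patterns leave the .kibana_1 and apm-6.7.0-span indices alone.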
  2. Click Create deployment.

After you have completed these steps, continue with creating your deployment.