Data stream lifecycle settings in Elasticsearch

These are the settings available for configuring data stream lifecycle.

Cluster level settings

data_streams.lifecycle.retention.max: (Dynamic, time unit value) The maximum retention period that will apply to all user data streams managed by the data stream lifecycle. The max retention will also override the retention of a data stream whose configured retention exceeds the max retention. It should be greater than 10s.

data_streams.lifecycle.retention.default: (Dynamic, time unit value) The retention period that will apply to all user data streams managed by the data stream lifecycle that do not have retention configured. It should be greater than 10s and less than or equal to data_streams.lifecycle.retention.max.
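
The interaction between a data stream's own retention, the default, and the max can be sketched as a small function. This is an illustrative sketch, not the Elasticsearch implementation: the function name and the use of plain seconds are hypothetical, and the assumption that the max applies when neither a configured nor a default retention exists is a reading of the descriptions above.

```python
from typing import Optional


def effective_retention_seconds(
    configured: Optional[int],
    default: Optional[int],
    maximum: Optional[int],
) -> Optional[int]:
    """Return the retention (in seconds) that would apply, or None for 'keep forever'."""
    # Use the data stream's own retention if it has one, otherwise the cluster default.
    retention = configured if configured is not None else default
    if maximum is None:
        return retention
    if retention is None:
        # Assumption: with no configured or default retention, the max applies.
        return maximum
    # The max overrides any retention that exceeds it.
    return min(retention, maximum)


DAY = 24 * 60 * 60
print(effective_retention_seconds(None, 7 * DAY, 30 * DAY) // DAY)      # default applies
print(effective_retention_seconds(90 * DAY, 7 * DAY, 30 * DAY) // DAY)  # capped by max
```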

data_streams.lifecycle.poll_interval: (Dynamic, time unit value) How often Elasticsearch checks what the next action is for all data streams with a built-in lifecycle. Defaults to 5m.

cluster.lifecycle.default.rollover: (Dynamic, string) This property accepts a string of comma-separated key=value pairs and configures the conditions that trigger a data stream to roll over when it has a lifecycle configured. This property is an implementation detail and subject to change. It currently defaults to max_age=auto,max_primary_shard_size=50gb,min_docs=1,max_primary_shard_docs=200000000. This means that your data stream will roll over if any of the following conditions are met:

Either any primary shard reaches a size of 50GB,
or any primary shard contains 200,000,000 documents,
or the index reaches a certain age, which depends on the retention time of your data stream,
and, in every case, the index has at least one document.
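
To make the logic of the default conditions concrete, here is a minimal sketch that parses the key=value string and evaluates it against some index statistics. The function names and the `resolved_max_age_s` parameter are hypothetical. Because `max_age=auto` depends on the data stream's retention, the sketch takes the already-resolved age as an input rather than guessing how Elasticsearch derives it.

```python
DEFAULT_ROLLOVER = (
    "max_age=auto,max_primary_shard_size=50gb,"
    "min_docs=1,max_primary_shard_docs=200000000"
)


def parse_conditions(raw: str) -> dict:
    """Split 'k1=v1,k2=v2' into a dict of strings."""
    return dict(pair.split("=", 1) for pair in raw.split(","))


def gb_to_bytes(value: str) -> int:
    """Hypothetical helper: handles only values such as '50gb'."""
    if not value.endswith("gb"):
        raise ValueError(f"unsupported size unit in {value!r}")
    return int(value[:-2]) * 1024**3


def should_roll_over(conditions, *, docs, shard_bytes, shard_docs, age_s, resolved_max_age_s):
    # min_docs is a precondition: an empty index never rolls over.
    if docs < int(conditions["min_docs"]):
        return False
    # Any one of the max_* conditions is enough.
    return (
        shard_bytes >= gb_to_bytes(conditions["max_primary_shard_size"])
        or shard_docs >= int(conditions["max_primary_shard_docs"])
        or age_s >= resolved_max_age_s
    )


conditions = parse_conditions(DEFAULT_ROLLOVER)
print(conditions)
```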

data_streams.lifecycle.target.merge.policy.merge_factor: (Dynamic, integer) Data stream lifecycle implements tail merging by updating the Lucene merge policy factor for the target backing index. The merge factor is both the number of segments that should be merged together, and the maximum number of segments that we expect to find on a given tier. This setting controls which value data stream lifecycle configures on the target index. It defaults to 16. The value will be visible under the index.merge.policy.merge_factor index setting on the target index.

data_streams.lifecycle.target.merge.policy.floor_segment: (Dynamic) Data stream lifecycle implements tail merging by updating the Lucene merge policy floor segment for the target backing index. This floor segment size is a way to prevent indices from having a long tail of very small segments. This setting controls which value data stream lifecycle configures on the target index. It defaults to 100MB.

data_streams.lifecycle.signalling.error_retry_interval: (Dynamic, integer) The number of retries the data stream lifecycle has to perform for an index in an error step to signal that the index is not progressing (for example, it's stuck in an error step). The current signalling mechanism is a log statement at the error level. However, the signalling mechanism can be extended in the future. Defaults to 10 retries.

data_streams.lifecycle.downsampling.max_indices_in_progress: (Dynamic, integer) The maximum number of indices per data stream that can be submitted for downsampling by data stream lifecycle. Defaults to 10.
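
Because the settings above are dynamic, they can be changed on a running cluster through the cluster update settings API (`PUT /_cluster/settings`). The following sketch only builds the JSON request body and does not send it. The helper name and the `7d` and `90d` retention values are illustrative assumptions, not recommendations.

```python
import json


def cluster_settings_body(settings: dict, persistent: bool = True) -> str:
    """Wrap flat setting names in the 'persistent' or 'transient' scope."""
    scope = "persistent" if persistent else "transient"
    return json.dumps({scope: settings}, indent=2)


body = cluster_settings_body(
    {
        # Illustrative values; default must not exceed max.
        "data_streams.lifecycle.retention.default": "7d",
        "data_streams.lifecycle.retention.max": "90d",
        "data_streams.lifecycle.poll_interval": "5m",
    }
)
print(body)  # send as the body of PUT /_cluster/settings
```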

Frozen tier transition settings

The following settings control the behavior of the frozen tier transition, which automatically converts data stream backing indices to searchable snapshots on the frozen tier. Note that the conversions happen on the currently elected master node.

dlm.frozen.transition.poll_interval: (Static, time unit value) How often the master node checks for data stream backing indices that are ready to be converted to the frozen tier. Must be at least 1m. Defaults to 5m.

dlm.frozen.transition.thread_pool.size: (Static, integer) The maximum number of backing indices that the frozen transition service converts concurrently. Defaults to two times the CPU core count of the node or 100, whichever is smaller.

dlm.frozen.transition.thread_pool.queue_size: (Static, integer) The maximum number of backing indices that can be queued for frozen conversion at any given time. Indices submitted beyond this limit are skipped until the next poll cycle. Defaults to 20 times the CPU core count of the node or 1000, whichever is smaller.
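
The two thread pool defaults are simple functions of the node's CPU core count. A short sketch of the arithmetic described above, with hypothetical function names:

```python
def default_thread_pool_size(cpu_cores: int) -> int:
    # Smaller of 2 x cores and 100.
    return min(2 * cpu_cores, 100)


def default_queue_size(cpu_cores: int) -> int:
    # Smaller of 20 x cores and 1000.
    return min(20 * cpu_cores, 1000)


for cores in (8, 64):
    print(cores, default_thread_pool_size(cores), default_queue_size(cores))
```

On an 8-core master node this gives 16 concurrent conversions and a queue of 160. On a 64-core node both caps apply, giving 100 and 1000.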

dlm.frozen.cleanup.poll_interval: (Static, time unit value) How often the master node scans for and deletes orphaned artifacts (clone indices and snapshots) left behind by interrupted frozen conversions. Must be at least 1h. Defaults to 1d.

Index level settings

Settings supported in Serverless

Elastic Cloud Serverless projects restrict the available Elasticsearch settings to a supported subset, identified with a Serverless badge next to the setting name. For a complete list of available index settings, refer to the Serverless index settings list.

The following index-level settings are typically configured on the backing indices of a data stream.

index.lifecycle.prefer_ilm: (Dynamic, boolean) This setting determines which feature is managing the backing index of a data stream if, and only if, the backing index has an index lifecycle management (ILM) policy and the data stream also has a built-in lifecycle. When true this index is managed by ILM. When false, the backing index is managed by the data stream lifecycle. Defaults to true.
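
The decision described above can be written as a small function. This is a sketch with hypothetical names. The setting only matters when both an ILM policy and a built-in lifecycle are present, and the behavior shown for the other combinations is an assumption made to keep the example complete.

```python
from typing import Optional


def managing_feature(
    has_ilm_policy: bool,
    has_data_stream_lifecycle: bool,
    prefer_ilm: bool = True,  # the setting defaults to true
) -> Optional[str]:
    if has_ilm_policy and has_data_stream_lifecycle:
        # The only case in which index.lifecycle.prefer_ilm is consulted.
        return "ilm" if prefer_ilm else "data stream lifecycle"
    # Assumption: with only one feature configured, that feature manages the index.
    if has_ilm_policy:
        return "ilm"
    if has_data_stream_lifecycle:
        return "data stream lifecycle"
    return None


print(managing_feature(True, True))
print(managing_feature(True, True, prefer_ilm=False))
```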

index.lifecycle.origination_date: (Dynamic, long) If specified, this is the timestamp used to calculate the backing index generation age after this backing index has been rolled over. The generation age is used to determine data retention. Consequently, you can use this setting if you create a backing index that contains older data and want the retention period or other parts of the lifecycle to be applied based on the data's original timestamp rather than the time it was indexed. Specified as a Unix epoch value in milliseconds.
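
Since the value must be a Unix epoch in milliseconds, a common stumbling block is converting a calendar date correctly. The sketch below does the conversion with integer arithmetic and builds a settings body. The helper name and the example date are illustrative, and the body is intended for the update index settings API (`PUT /<index>/_settings`).

```python
import json
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def origination_date_millis(moment: datetime) -> int:
    """Convert a timezone-aware datetime to Unix epoch milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    return (moment - EPOCH) // timedelta(milliseconds=1)


millis = origination_date_millis(datetime(2024, 1, 1, tzinfo=timezone.utc))
settings_body = json.dumps({"index.lifecycle.origination_date": millis})
print(settings_body)
```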

index.dlm.frozen.created: (Static, boolean) An internal marker set on indices created by the data stream lifecycle during frozen tier conversion (clone indices and mounted searchable snapshots). This setting is not user-configurable. It is surfaced here for diagnostics and tooling purposes only.

Reindex settings

You can use the following cluster-level settings to control the behavior of the reindex data stream API:

migrate.max_concurrent_indices_reindexed_per_data_stream: (Dynamic) The number of backing indices within a given data stream that will be reindexed concurrently. Defaults to 1.

migrate.data_stream_reindex_max_request_per_second: (Dynamic) The average maximum number of documents within a given backing index to reindex per second. Defaults to 1000, but can be any decimal number greater than 0. To remove throttling, set to -1. Use this setting to throttle the reindex process and manage resource usage. Consult the reindex throttle docs for more information.
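
As a rough planning aid, the throttle puts a floor on how long reindexing one backing index can take. The sketch below is a back-of-the-envelope estimate only. The function name is hypothetical, and it ignores everything except the documents-per-second limit.

```python
def min_reindex_seconds(doc_count: int, docs_per_second: float = 1000) -> float:
    """Lower bound on reindex time for one backing index under the throttle."""
    if docs_per_second == -1:
        # -1 removes throttling, so the throttle imposes no lower bound.
        return 0.0
    if docs_per_second <= 0:
        raise ValueError("rate must be greater than 0, or -1 to disable throttling")
    return doc_count / docs_per_second


# 3.6 million documents at the default rate take at least an hour.
print(min_reindex_seconds(3_600_000) / 3600)
```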