---
title: Setting retention for Elasticsearch data streams
description: This tutorial shows how to configure retention settings for data streams in Elasticsearch, helping you to control storage costs and compliance. Learn...
url: https://www.elastic.co/docs/manage-data/lifecycle/data-stream/tutorial-data-stream-retention
products:
  - Elasticsearch
applies_to:
  - Elastic Cloud Serverless: Generally available
  - Elastic Stack: Generally available
---

# Setting retention for Elasticsearch data streams
This tutorial shows how to configure retention settings for data streams in Elasticsearch, helping you control storage costs and meet compliance requirements. Learn to adjust lifecycle policies so old data is deleted or moved automatically.
The following options apply only to data streams managed by data stream lifecycle.
1. [What is data stream retention?](#what-is-retention)
2. [How to configure retention?](#retention-configuration)
3. [How is the effective retention calculated?](#effective-retention-calculation)
4. [How is the effective retention applied?](#effective-retention-application)

You can verify whether a data stream is managed by the data stream lifecycle using the [get data stream lifecycle API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-get-data-lifecycle):
```json
GET _data_stream/my-data-stream/_lifecycle
```

The result should look like this:
```json
{
  "data_streams": [
    {
      "name": "my-data-stream",
      "lifecycle": {
        "enabled": true
      }
    }
  ]
}
```

<tip applies-to="Elastic Cloud Serverless: Generally available, Elastic Stack: Generally available since 9.2, Elastic Stack: Preview in 9.1">
  You can also review how a data stream is managed by locating it on the **Streams** page in Kibana. A stream directly corresponds to a data stream. Select a stream to view its details and go to the **Retention** tab.
</tip>
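
As a minimal sketch, a response like the one above could also be checked programmatically. The helper name `is_managed_by_lifecycle` is hypothetical, and the sketch assumes that a data stream not managed by data stream lifecycle either has no `lifecycle` object in the response or has `enabled` set to `false`.

```python
import json


def is_managed_by_lifecycle(response: dict, name: str) -> bool:
    """Return True if the named data stream reports an enabled lifecycle.

    Assumption: an unmanaged data stream has no "lifecycle" object,
    or has "enabled" set to false.
    """
    for stream in response.get("data_streams", []):
        if stream.get("name") == name:
            lifecycle = stream.get("lifecycle") or {}
            return bool(lifecycle.get("enabled", False))
    # The data stream does not appear in the response at all.
    return False


# Sample response copied from the example above.
sample = json.loads("""
{
  "data_streams": [
    {"name": "my-data-stream", "lifecycle": {"enabled": true}}
  ]
}
""")

print(is_managed_by_lifecycle(sample, "my-data-stream"))
```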


## What is data stream retention?

We define retention as the minimum amount of time that the data of a data stream will be kept in Elasticsearch. After this period has passed, Elasticsearch is allowed to remove the data to free up space or manage costs.
<note>
  Retention does not define when the data will be removed, only the minimum period for which it will be kept.
</note>

We define four types of retention:
- The data stream retention, or `data_retention`, which is the retention configured at the data stream level. It can be set using an [index template](https://www.elastic.co/docs/manage-data/data-store/templates) for future data streams or using the [PUT data stream lifecycle API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle) for an existing data stream. When the data stream retention is not set, it implies that the data should be kept forever.
- The global default retention, let's call it `default_retention`, which is configured through the cluster setting [`data_streams.lifecycle.retention.default`](https://www.elastic.co/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings#data-streams-lifecycle-retention-default) and is applied to all data streams managed by data stream lifecycle that do not have `data_retention` configured. Effectively, it ensures that no data stream keeps its data forever. This can be set using the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings).
- The global max retention, let's call it `max_retention`, which is configured through the cluster setting [`data_streams.lifecycle.retention.max`](https://www.elastic.co/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings#data-streams-lifecycle-retention-max) and is applied to all data streams managed by data stream lifecycle. Effectively, it ensures that no data stream has a retention that exceeds this time period. This can be set using the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings).
- The effective retention, or `effective_retention`, which is the retention applied to a data stream at a given moment. Effective retention cannot be set. It is derived from all the configured retentions listed above, as described in [How is the effective retention calculated?](#effective-retention-calculation).

<note>
  Global default and max retention do not apply to data streams internal to Elastic. Internal data streams are recognized either by having the `system` flag set to `true` or by having a name prefixed with a dot (`.`).
</note>
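
The rule in the note above can be written as a small predicate. This is only an illustration of the documented rule, not Elasticsearch's own implementation. The function names and the example data stream names are hypothetical.

```python
def is_internal_data_stream(name: str, system: bool = False) -> bool:
    """Apply the rule from the note: the system flag is true, or the name starts with a dot."""
    return system or name.startswith(".")


def global_retention_applies(name: str, system: bool = False) -> bool:
    """Global default and max retention are skipped for internal data streams."""
    return not is_internal_data_stream(name, system)


# Hypothetical names, used only for illustration.
for stream_name, system_flag in [("my-data-stream", False), (".internal-stream", False)]:
    print(stream_name, global_retention_applies(stream_name, system_flag))
```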


## How to configure retention?

- Configure data retention at the data stream level using the `data_retention` setting. You can do this in two ways:
  - For a new data stream, the `data_retention` setting can be included in the index template that is applied when the data stream is created. You can use the [create index template API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-index-template), for example (the template name `my-index-template` is a placeholder):
  ```json
  PUT _index_template/my-index-template
  {
    "index_patterns": ["my-data-stream*"],
    "data_stream": { },
    "priority": 500,
    "template": {
      "lifecycle": {
        "data_retention": "7d"
      }
    },
    "_meta": {
      "description": "Template with data stream lifecycle"
    }
  }
  ```
  - For an existing data stream, the `data_retention` setting can be configured using the [PUT lifecycle API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-data-lifecycle), for example:
  ```json
  PUT _data_stream/my-data-stream/_lifecycle
  {
    "data_retention": "30d"
  }
  ```
  <tip applies-to="Elastic Cloud Serverless: Generally available, Elastic Stack: Generally available since 9.2, Elastic Stack: Preview in 9.1">
  To adjust the retention period of a data stream in Kibana, locate a data stream on the **Streams** page. A stream maps directly to a data stream. Next, select a stream to view its details and review the **Retention** tab to find out how it's managed before making your adjustments.
  </tip>
- Configure global retention at the cluster level using the `data_streams.lifecycle.retention.default` and `data_streams.lifecycle.retention.max` settings. You can set these using the [update cluster settings API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cluster-put-settings). For example:
  ```json
  PUT _cluster/settings
  {
    "persistent" : {
      "data_streams.lifecycle.retention.default" : "7d",
      "data_streams.lifecycle.retention.max" : "90d"
    }
  }
  ```
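
If you generate the cluster settings body from a script, a sketch like the following could build it. The helper names `parse_time_value` and `global_retention_settings` are hypothetical, the parser handles only a subset of time units (`d`, `h`, `m`, `s`), and the check that the default does not exceed the max is a client-side sanity check assumed for this example rather than documented behavior.

```python
import json
import re
from datetime import timedelta

# Subset of time units handled by this sketch.
_UNITS = {"d": "days", "h": "hours", "m": "minutes", "s": "seconds"}


def parse_time_value(value: str) -> timedelta:
    """Parse a value such as "7d" or "12h" into a timedelta."""
    match = re.fullmatch(r"(\d+)(d|h|m|s)", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    amount, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(amount)})


def global_retention_settings(default: str, maximum: str) -> dict:
    """Build the request body for the update cluster settings API."""
    # Assumed sanity check: a default longer than the max makes little sense.
    if parse_time_value(default) > parse_time_value(maximum):
        raise ValueError("default retention should not exceed max retention")
    return {
        "persistent": {
            "data_streams.lifecycle.retention.default": default,
            "data_streams.lifecycle.retention.max": maximum,
        }
    }


# Reproduces the body of the example request above.
print(json.dumps(global_retention_settings("7d", "90d"), indent=2))
```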


## How is the effective retention calculated?

The effective retention is calculated as follows:
- The `effective_retention` is the `default_retention` when `default_retention` is defined and the data stream does not have `data_retention`.
- The `effective_retention` is the `data_retention` when `data_retention` is defined and, if `max_retention` is also defined, `data_retention` does not exceed `max_retention`.
- The `effective_retention` is the `max_retention` when `max_retention` is defined and the data stream either has no `data_retention` (and no `default_retention` is defined) or has a `data_retention` greater than `max_retention`.

The following examples demonstrate these rules:

| `default_retention` | `max_retention` | `data_retention` | `effective_retention` | Retention determined by |
|---------------------|-----------------|------------------|-----------------------|-------------------------|
| Not set             | Not set         | Not set          | Infinite              | N/A                     |
| Not relevant        | 12 months       | **30 days**      | 30 days               | `data_retention`        |
| Not relevant        | Not set         | **30 days**      | 30 days               | `data_retention`        |
| **30 days**         | 12 months       | Not set          | 30 days               | `default_retention`     |
| **30 days**         | 30 days         | Not set          | 30 days               | `default_retention`     |
| Not relevant        | **30 days**     | 12 months        | 30 days               | `max_retention`         |
| Not set             | **30 days**     | Not set          | 30 days               | `max_retention`         |
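
The rules and the table above can be sketched as a small function. This is an illustration of the documented rules rather than Elasticsearch's implementation. The function name is hypothetical, `None` stands for "Not set" (or an infinite effective retention), and capping `default_retention` at `max_retention` is an assumption for a case the table does not list.

```python
from datetime import timedelta
from typing import Optional, Tuple


def effective_retention(
    data_retention: Optional[timedelta],
    default_retention: Optional[timedelta] = None,
    max_retention: Optional[timedelta] = None,
) -> Tuple[Optional[timedelta], Optional[str]]:
    """Return (effective retention, what determined it). None means infinite."""
    if data_retention is None:
        if default_retention is not None:
            # Assumption: a default above the max is capped by the max.
            if max_retention is not None and default_retention > max_retention:
                return max_retention, "max_retention"
            return default_retention, "default_retention"
        if max_retention is not None:
            return max_retention, "max_retention"
        return None, None
    if max_retention is not None and data_retention > max_retention:
        return max_retention, "max_retention"
    return data_retention, "data_retention"


days_30 = timedelta(days=30)
months_12 = timedelta(days=365)  # "12 months" approximated as 365 days

# Sixth table row: data_retention of 12 months with a max_retention of 30 days.
print(effective_retention(months_12, max_retention=days_30))
```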

Continuing our example, we retrieve the lifecycle of `my-data-stream`:
```json
GET _data_stream/my-data-stream/_lifecycle
```

We see that the effective retention is the same as the retention the user configured:
```json
{
  "global_retention" : {
    "max_retention" : "90d",
    "default_retention" : "7d"
  },
  "data_streams": [
    {
      "name": "my-data-stream",
      "lifecycle": {
        "enabled": true,
        "data_retention": "30d",
        "effective_retention": "30d",
        "retention_determined_by": "data_stream_configuration"
      }
    }
  ]
}
```


## How is the effective retention applied?

Retention is applied to the remaining backing indices of a data stream as the last step of [a data stream lifecycle run](https://www.elastic.co/docs/manage-data/lifecycle/data-stream#data-streams-lifecycle-how-it-works). Data stream lifecycle retrieves the backing indices whose `generation_time` is longer than the effective retention period and deletes them. The `generation_time` applies only to rolled-over backing indices. It is either the time since the backing index was rolled over, or the time since the date optionally configured using the [`index.lifecycle.origination_date`](https://www.elastic.co/docs/reference/elasticsearch/configuration-reference/data-stream-lifecycle-settings#index-data-stream-lifecycle-origination-date) setting.
<important>
  We use the `generation_time` instead of the creation time because this ensures that all data in the backing index has passed the retention period. As a result, the retention period is not the exact time at which data is deleted, but the minimum time for which data will be stored.
</important>
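
The selection step described above can be sketched as follows. This is a simplified illustration, not the actual lifecycle code. The function name and the backing index names are hypothetical, each timestamp stands for the rollover time (or the configured origination date), and `None` marks an index that has not been rolled over yet.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional


def indices_past_retention(
    generation_start: Dict[str, Optional[datetime]],
    effective_retention: Optional[timedelta],
    now: datetime,
) -> List[str]:
    """Return backing indices whose generation time exceeds the retention.

    generation_start maps an index name to its rollover time (or origination
    date); None means the index has not been rolled over and is never selected.
    """
    if effective_retention is None:
        # Infinite retention: nothing is eligible for deletion.
        return []
    return [
        name
        for name, started in generation_start.items()
        if started is not None and now - started > effective_retention
    ]


now = datetime(2024, 3, 1, tzinfo=timezone.utc)
backing_indices = {
    "backing-index-000001": datetime(2024, 1, 1, tzinfo=timezone.utc),   # rolled over 60 days ago
    "backing-index-000002": datetime(2024, 2, 20, tzinfo=timezone.utc),  # rolled over 10 days ago
    "backing-index-000003": None,                                        # current write index
}

print(indices_past_retention(backing_indices, timedelta(days=30), now))
```

Because the clock starts at rollover rather than at index creation, the oldest document in a deleted index is always at least as old as the retention period, which matches the note above.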