Custom index lifecycle management with APM Server

This documentation refers to the standalone (legacy) method of running APM Server. This method of running APM Server will be deprecated and removed in a future release. Please consider upgrading to the Elastic APM integration. If you’ve already upgraded, please see Index lifecycle management instead.

The default ILM policy can be customized to match the needs of your data. APM Server will assist you in creating a custom index lifecycle policy, but you must first ensure that ILM and ILM setup are enabled. Then create a custom policy and map it to APM event types. All of this can be done directly from the apm-server.yml file.

If you do not want APM Server to create policies and templates, or if you want to use unmanaged indices, see the ILM configuration reference for a full list of configuration settings.

Enable ILM

Before enabling ILM, ensure the following conditions are true:

  • The Elasticsearch instance supports ILM.
  • output.elasticsearch is enabled.
  • Custom index or indices settings are not configured.
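
As a rough illustration of the conditions above, an Elasticsearch output that leaves ILM free to be enabled might look like the following sketch. The host value is a placeholder assumption, not something this guide prescribes.

```yaml
output.elasticsearch:
  enabled: true
  # Placeholder host (assumption); replace with your own cluster address.
  hosts: ["localhost:9200"]
  # No custom `index` or `indices` settings are configured here,
  # in line with the third condition above.
```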

Enable ILM by setting ilm.enabled to "auto" (the default):

apm-server:
  ilm:
    enabled: "auto"

Set up ILM with APM Server

To let APM Server create managed indices, along with the customized policies and templates that are written to Elasticsearch, set apm-server.ilm.setup.enabled to true:

apm-server:
  ilm:
    setup:
      enabled: true

ILM can now be customized via the remaining apm-server.ilm.setup.* configuration options.

Create a custom ILM index_suffix

You can define a custom index suffix for each event type. The index suffix may only use variables from observer.*; if other variables are configured, the server will refuse to start. The resulting rollover alias and index name have the form apm-{version}-{event_type}-{custom_index_suffix}, for example, apm-7.9.0-span-foo. The mapped ILM policy can specify a rollover action; for example, when an index reaches a size of 50GB, it rolls over to a new index. The rollover alias abstracts away the concrete write index: it is updated automatically and always tracks the current write index. This allows you to query the rollover alias instead of specific indices.

For example, the default rollover alias for event type transaction would be apm-7.8.0-transaction. This alias points to a write index named apm-7.8.0-transaction-000001. When this index reaches its defined size or age, it will roll over to a new index named apm-7.8.0-transaction-000002. The rollover alias of apm-7.8.0-transaction keeps track of which index is the current write index while ingesting data.

Manually ensure that templates containing customized index information do not conflict with each other or the default templates.

By default, APM Server creates one template per event type, without a custom index suffix. When defining custom index suffixes, always ensure that any previously set up templates are removed or do not conflict. A conflict can occur when an index matches multiple templates with the same order. For example, suppose APM Server was started without any customization, leading to a default index setup. Afterward, the server configuration is customized to add the index suffix production for the event type span. When the server restarts, it sets up a new index template based on the new custom index suffix. A newly created index named apm-7.9.0-span-production would now match both the default template, with the index pattern apm-7.9.0-span*, and the new template, with the index pattern apm-7.9.0-span-production*. In this case, either delete the old template manually, or change the index_pattern or order to avoid the conflict.

If you customize setup.template.pattern, ensure that the configured pattern still matches the rollover aliases. If it doesn’t, the Elasticsearch index template with the predefined mappings will not match against created indices, leading to indexing issues.
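
For reference, a template pattern that still matches rollover aliases with a custom suffix could look like the sketch below. The two values shown are believed to be the stock defaults for the legacy APM Server; treat them as an assumption and check them against your own apm-server.yml.

```yaml
# Assumed defaults (verify against your version's apm-server.yml).
setup.template.name: "apm-%{[observer.version]}"
# The trailing wildcard matches aliases such as apm-7.9.0-span-production,
# so indices created with a custom suffix still pick up the predefined mappings.
setup.template.pattern: "apm-%{[observer.version]}-*"
```

A narrower pattern, for instance one that hard-codes a single event type, would no longer match the aliases of the other event types.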

The example below shows how to change the index_suffix to a custom value.

apm-server:
  ilm:
    enabled: "auto"
    setup:
      mapping:
        - event_type: "error"
          index_suffix: "dev"
        - event_type: "span"
          index_suffix: "dev"
        - event_type: "transaction"
          index_suffix: "dev"
        - event_type: "metric"
          index_suffix: "dev"

Create a custom ILM policy

You can define as many policies as you’d like. Policies only need to be created once and persist through version upgrades. Changes to an existing ILM policy take effect only when an index enters its next phase.

APM Server doesn’t do any validation on policies. Instead, if something is incorrectly defined, Elasticsearch will respond with 400 and APM Server won’t connect.

The default ILM policy can be viewed and edited in two places:

Here’s an example of a custom ILM policy, named apm-error-span-policy, that applies all four phases to its index lifecycle, including a cold phase with frozen indices after 30 days, and a delete phase after 60 days.

apm-server:
  ilm:
    setup:
      policies:
        - name: "apm-error-span-policy"
          policy:
            phases:
              hot:
                actions:
                  rollover:
                    max_size: "50gb"
                    max_age: "1d"
                  set_priority:
                    priority: 100
              warm:
                min_age: "7d"
                actions:
                  set_priority:
                    priority: 50
                  readonly: {}
              cold:
                min_age: "30d"
                actions:
                  set_priority:
                    priority: 0
                  freeze: {}
              delete:
                min_age: "60d"
                actions:
                  delete: {}

Here’s an example of a different policy, named apm-transaction-metric-policy, that keeps data in the hot, warm, and cold phases for a longer period of time, and does not delete any data.

apm-server:
  ilm:
    setup:
      policies:
        - name: "apm-transaction-metric-policy"
          policy:
            phases:
              hot:
                actions:
                  rollover:
                    max_size: "50gb"
                    max_age: "30d"
                  set_priority:
                    priority: 100
              warm:
                min_age: "60d"
                actions:
                  set_priority:
                    priority: 50
                  readonly: {}
              cold:
                min_age: "90d"
                actions:
                  set_priority:
                    priority: 0
                  freeze: {}

Head on over to the Elasticsearch documentation to learn more about all available policy phases and actions.

After starting up APM Server, you can confirm the policy was created by using the GET lifecycle policy API:

GET _ilm/policy

Map ILM policies to an event type

If your policy isn’t mapped to an event type, it will not be sent to Elasticsearch. Policies are mapped to event types using the ilm.setup.mapping configuration.

Using the example from the previous step, we can map the apm-error-span-policy to errors and spans, and the apm-transaction-metric-policy to transactions and metrics.

apm-server:
  ilm:
    enabled: "auto"
    setup:
      mapping:
        - event_type: "error"
          policy_name: "apm-error-span-policy"
        - event_type: "span"
          policy_name: "apm-error-span-policy"
        - event_type: "transaction"
          policy_name: "apm-transaction-metric-policy"
        - event_type: "metric"
          policy_name: "apm-transaction-metric-policy"

Example ILM configuration

Now that we have all of the puzzle pieces, we can put them together to see what a custom ILM configuration might look like.

As a reminder, the example below creates two different policies, one for errors and spans, and another for transactions and metrics.

The apm-error-span-policy applies all four phases to its index lifecycle, including a cold phase with frozen indices after 30 days, and a delete phase after 60 days. The apm-transaction-metric-policy keeps data in the hot, warm, and cold phases for a longer period of time, and does not delete any data.

Additionally, this example shows how to set custom rollover aliases by defining an index_suffix for each event type.

apm-server:
  ilm:
    enabled: "auto"
    setup:
      mapping:
        - event_type: "error"
          policy_name: "apm-error-span-policy"
          index_suffix: "development"
        - event_type: "span"
          policy_name: "apm-error-span-policy"
          index_suffix: "development"
        - event_type: "transaction"
          policy_name: "apm-transaction-metric-policy"
          index_suffix: "development"
        - event_type: "metric"
          policy_name: "apm-transaction-metric-policy"
          index_suffix: "development"
      enabled: true
      policies:
        - name: "apm-error-span-policy"
          policy:
            phases:
              hot:
                actions:
                  rollover:
                    max_size: "50gb"
                    max_age: "1d"
                  set_priority:
                    priority: 100
              warm:
                min_age: "7d"
                actions:
                  set_priority:
                    priority: 50
                  readonly: {}
              cold:
                min_age: "30d"
                actions:
                  set_priority:
                    priority: 0
                  freeze: {}
              delete:
                min_age: "60d"
                actions:
                  delete: {}
        - name: "apm-transaction-metric-policy"
          policy:
            phases:
              hot:
                actions:
                  rollover:
                    max_size: "50gb"
                    max_age: "30d"
                  set_priority:
                    priority: 100
              warm:
                min_age: "60d"
                actions:
                  set_priority:
                    priority: 50
                  readonly: {}
              cold:
                min_age: "90d"
                actions:
                  set_priority:
                    priority: 0
                  freeze: {}