Configure the Logstash output

This documentation refers to configuring the standalone (legacy) APM Server. This method of running APM Server will be deprecated and removed in a future release. Please consider upgrading to Fleet and the APM integration.

The Logstash output sends events directly to Logstash by using the lumberjack protocol, which runs over TCP. Logstash allows for additional processing and routing of generated events.

Prerequisite

To send events to Logstash, you also need to create a Logstash configuration pipeline that listens for incoming Beats connections and indexes the received events into Elasticsearch. For more information, see Getting Started with Logstash. Also see the documentation for the Beats input and Elasticsearch output plugins.

If you want to use Logstash to perform additional processing on the data collected by APM Server, you need to configure APM Server to use Logstash.

To do this, edit the APM Server configuration file to disable the Elasticsearch output by commenting it out and enable the Logstash output by uncommenting the Logstash section:

output.logstash:
  hosts: ["127.0.0.1:5044"]

The hosts option specifies the Logstash server and the port (5044) where Logstash is configured to listen for incoming Beats connections.

For this configuration, you must load the index template into Elasticsearch manually, because the options for automatically loading the template are only available for the Elasticsearch output.

Accessing metadata fields

Every event sent to Logstash contains the following metadata fields that you can use in Logstash for indexing and filtering:

{
    ...
    "@metadata": { 
      "beat": "apm", 
      "pipeline":"apm", 
      "version": "7.17.25" 
    }
}

APM Server uses the @metadata field to send metadata to Logstash. See the Logstash documentation for more about the @metadata field.

The beat field defaults to apm. To change this value, set the index option in the APM Server config file.

The pipeline field defaults to apm. Additional pipelines can be enabled with a Logstash pipeline config.

The version field holds the current version of APM Server.

In addition to metadata, APM Server provides the processor.event field, which can be used to separate event types into different indices.

For example, the following Logstash configuration file tells Logstash to use the index and event types reported by APM Server for indexing events into Elasticsearch:

input {
    beats {
        port => 5044
    }
}

filter {
    if [@metadata][beat] == "apm" {
        if [processor][event] == "sourcemap" {
            mutate {
                add_field => { "[@metadata][index]" => "%{[@metadata][beat]}-%{[@metadata][version]}-%{[processor][event]}" } 
            }
        } else {
            mutate {
                add_field => { "[@metadata][index]" => "%{[@metadata][beat]}-%{[@metadata][version]}-%{[processor][event]}-%{+yyyy.MM.dd}" } 
            }
        }
    }
}

output {
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "%{[@metadata][index]}"
    }
}

The first mutate filter, which handles sourcemap events, creates a new field named @metadata.index. %{[@metadata][beat]} sets the first part of the index name to the value of the @metadata.beat field. %{[@metadata][version]} sets the second part to APM Server’s version. %{[processor][event]} sets the final part based on the APM event type. For example: apm-7.17.25-sourcemap.

The second mutate filter, which handles all other event types, follows the same rules and also appends a date to the index name so that Logstash creates a new index each day. For example: apm-7.17.25-transaction-2019.10.20.

Events indexed into Elasticsearch with the Logstash configuration shown here will be similar to events directly indexed by APM Server into Elasticsearch.

Logstash and ILM

When used with index lifecycle management (ILM), Logstash does not need to create a new index each day. Here’s a sample Logstash configuration file that accomplishes this:

input {
    beats {
        port => 5044
    }
}

output {
    elasticsearch {
        hosts => ["http://localhost:9200"]
        index => "%{[@metadata][beat]}-%{[@metadata][version]}-%{[processor][event]}" 
    }
}

The index setting determines the index that documents are written to. %{[@metadata][beat]} sets the first part of the index name to the value of the @metadata.beat field. %{[@metadata][version]} sets the second part to APM Server’s version. %{[processor][event]} sets the final part based on the APM event type. For example: apm-7.17.25-sourcemap.

Compatibility

This output works with all compatible versions of Logstash. See the Elastic Support Matrix.

Configuration options

You can specify the following options in the logstash section of the apm-server.yml config file:

enabled

The enabled config is a boolean setting to enable or disable the output. If set to false, the output is disabled.

The default value is false.

hosts

The list of known Logstash servers to connect to. If load balancing is disabled, but multiple hosts are configured, one host is selected randomly (there is no precedence). If one host becomes unreachable, another one is selected randomly.

All entries in this list can contain a port number. The default port number 5044 will be used if no number is given.
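
As a minimal sketch of the port rule described above, the following fragment lists one host with an explicit port and one without. The hostnames are hypothetical placeholders, not values from this documentation:

```yaml
output.logstash:
  hosts:
    - "logstash-a.example.internal:5045"  # explicit port is used as given
    - "logstash-b.example.internal"       # no port given, so the default 5044 applies
```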

compression_level

The gzip compression level. Setting this value to 0 disables compression. Otherwise, the compression level must be in the range of 1 (best speed) to 9 (best compression).

Increasing the compression level will reduce the network usage but will increase the CPU usage.

The default value is 3.
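
For illustration, a configuration that trades CPU for lower network usage might look like the following. The value 9 is only an example; the right level depends on your own CPU and bandwidth constraints:

```yaml
output.logstash:
  hosts: ["127.0.0.1:5044"]
  # 0 disables compression; 1 favors speed, 9 favors smaller payloads.
  compression_level: 9
```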

escape_html

Configure escaping of HTML in strings. Set to true to enable escaping.

The default value is false.

worker

The number of workers per configured host publishing events to Logstash. This is best used with load balancing mode enabled. For example, with 2 hosts and 3 workers, a total of 6 workers are started (3 for each host).
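
The 2-host, 3-worker case can be sketched as the following fragment. The host addresses are illustrative only:

```yaml
output.logstash:
  hosts: ["localhost:5044", "localhost:5045"]
  loadbalance: true   # worker is best combined with load balancing
  worker: 3           # 3 workers per host, 6 workers in total
```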

loadbalance

If set to true and multiple Logstash hosts are configured, the output plugin load balances published events onto all Logstash hosts. If set to false, the output plugin sends all events to only one host (determined at random) and will switch to another host if the selected one becomes unresponsive. The default value is false.

output.logstash:
  hosts: ["localhost:5044", "localhost:5045"]
  loadbalance: true
  index: apm-server

ttl

Time to live for a connection to Logstash, after which the connection will be re-established. This is useful when Logstash hosts represent load balancers. Because connections to Logstash hosts are sticky, operating behind load balancers can lead to uneven load distribution between the instances. Specifying a TTL on the connection makes it possible to distribute connections evenly across the instances. Specifying a TTL of 0 disables this feature.

The default value is 0.

The "ttl" option is not yet supported on an async Logstash client (one with the "pipelining" option set).
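
A possible setup behind a load balancer is sketched below. The hostname is hypothetical, and the duration syntax (60s) is an assumption modeled on the duration values used by the backoff settings. Pipelining is set to 0 because ttl is not supported on the async client:

```yaml
output.logstash:
  hosts: ["logstash-lb.example.internal:5044"]  # hypothetical load balancer address
  pipelining: 0   # disable pipelining so that ttl can take effect
  ttl: 60s        # assumed duration format; reconnect after this interval
```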

pipelining

Configures the number of batches to be sent asynchronously to Logstash while waiting for an ACK from Logstash. The output only becomes blocking once the configured number of batches has been written. Pipelining is disabled if a value of 0 is configured. The default value is 2.

proxy_url

The URL of the SOCKS5 proxy to use when connecting to the Logstash servers. The value must be a URL with a scheme of socks5://. The protocol used to communicate with Logstash is not based on HTTP, so a web proxy cannot be used.

If the SOCKS5 proxy server requires client authentication, then a username and password can be embedded in the URL as shown in the example.

When using a proxy, hostnames are resolved on the proxy server instead of on the client. You can change this behavior by setting the proxy_use_local_resolver option.

output.logstash:
  hosts: ["remote-host:5044"]
  proxy_url: socks5://user:password@socks5-proxy:2233

proxy_use_local_resolver

The proxy_use_local_resolver option determines if Logstash hostnames are resolved locally when using a proxy. The default value is false, which means that when a proxy is used the name resolution occurs on the proxy server.
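
To resolve Logstash hostnames on the client instead of on the proxy, the option can be combined with proxy_url. The sketch below uses placeholder host and credential values:

```yaml
output.logstash:
  hosts: ["remote-host:5044"]
  proxy_url: socks5://user:password@socks5-proxy:2233
  proxy_use_local_resolver: true   # resolve "remote-host" locally, not on the proxy
```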

index

The index root name to write events to. The default is the Beat name. For example, "apm" generates "[apm-]7.17.25-YYYY.MM.DD" indices (such as "apm-7.17.25-2017.04.26").

This parameter’s value will be assigned to the @metadata.beat field. It can then be accessed in Logstash’s output section as %{[@metadata][beat]}.
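
As a brief sketch, the fragment below overrides the index root name. The value apm-staging is a hypothetical name chosen for illustration:

```yaml
output.logstash:
  hosts: ["127.0.0.1:5044"]
  # Hypothetical root name; events then carry "@metadata": { "beat": "apm-staging", ... }
  index: apm-staging
```

With this setting, a Logstash index pattern that starts with %{[@metadata][beat]} would produce index names beginning with apm-staging.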

ssl

Configuration options for SSL parameters like the root CA for Logstash connections. See SSL output settings for more information. To use SSL, you must also configure the Beats input plugin for Logstash to use SSL/TLS.
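
A minimal sketch of pointing the output at a root CA is shown below. It assumes the ssl.certificate_authorities setting described in the SSL output settings; the hostname and file path are placeholders:

```yaml
output.logstash:
  hosts: ["logstash.example.internal:5044"]   # hypothetical host
  # Placeholder path to the CA certificate that signed the Logstash server certificate.
  ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
```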

timeout

The number of seconds to wait for responses from the Logstash server before timing out. The default is 30 (seconds).

max_retries

The number of times to retry publishing an event after a publishing failure. After the specified number of retries, the events are typically dropped.

Set max_retries to a value less than 0 to retry until all events are published.

The default is 3.
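
For example, a deployment that prefers to keep retrying rather than drop events, and that tolerates slower Logstash responses, might use something like the following. The specific values are illustrative assumptions:

```yaml
output.logstash:
  hosts: ["127.0.0.1:5044"]
  timeout: 60       # wait up to 60 seconds for a response
  max_retries: -1   # a value below 0 retries until all events are published
```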

bulk_max_size

The maximum number of events to bulk in a single Logstash request. The default is 2048.

If the Beat sends single events, the events are collected into batches. If the Beat publishes a large batch of events (larger than the value specified by bulk_max_size), the batch is split.

Specifying a larger batch size can improve performance by lowering the overhead of sending events. However, large batch sizes can also increase processing times, which might result in API errors, killed connections, timed-out publishing requests, and, ultimately, lower throughput.

Setting bulk_max_size to values less than or equal to 0 disables the splitting of batches. When splitting is disabled, the queue decides on the number of events to be contained in a batch.

slow_start

If enabled, only a subset of events in a batch of events is transferred per transaction. The number of events to be sent increases up to bulk_max_size if no error is encountered. On error, the number of events per transaction is reduced again.

The default is false.
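
The two batching options can be combined, as in the sketch below. The batch size of 1024 is an arbitrary example rather than a recommendation:

```yaml
output.logstash:
  hosts: ["127.0.0.1:5044"]
  bulk_max_size: 1024   # upper bound on events per Logstash request
  slow_start: true      # start with a subset of each batch and grow toward bulk_max_size
```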

backoff.init

The number of seconds to wait before trying to reconnect to Logstash after a network error. After waiting backoff.init seconds, APM Server tries to reconnect. If the attempt fails, the backoff timer is increased exponentially up to backoff.max. After a successful connection, the backoff timer is reset. The default is 1s.

backoff.max

The maximum number of seconds to wait before attempting to connect to Logstash after a network error. The default is 60s.
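
As an illustration of how the two backoff settings work together, the fragment below starts with a 2-second wait and caps the wait at 2 minutes. The values are examples only:

```yaml
output.logstash:
  hosts: ["127.0.0.1:5044"]
  backoff.init: 2s    # first reconnect attempt after 2 seconds
  backoff.max: 120s   # the exponentially growing wait never exceeds 120 seconds
```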