Filter and enhance the exported dataedit

Your use case might require only a subset of the data exported by Filebeat, or you might need to enhance the exported data (for example, by adding metadata). Filebeat provides a couple of options for filtering and enhancing exported data.

You can configure each input to include or exclude specific lines or files. This allows you to specify different filtering criteria for each input. To do this, you use the include_lines, exclude_lines, and exclude_files options under the filebeat.inputs section of the config file (see Configure inputs). The disadvantage of this approach is that you need to implement a configuration option for each filtering criteria that you need.

Another approach (the one described here) is to define processors to configure global processing across all data exported by Filebeat.

Processorsedit

You can define processors in your configuration to process events before they are sent to the configured output. The libbeat library provides processors for:

  • reducing the number of exported fields
  • enhancing events with additional metadata
  • performing additional processing and decoding

Each processor receives an event, applies a defined action to the event, and returns the event. If you define a list of processors, they are executed in the order they are defined in the Filebeat configuration file.

event -> processor 1 -> event1 -> processor 2 -> event2 ...

Drop event exampleedit

The following configuration drops all the DEBUG messages.

processors:
 - drop_event:
     when:
        regexp:
           message: "^DBG:"

To drop all the log messages coming from a certain log file:

processors:
 - drop_event:
     when:
        contains:
           source: "test"

Decode JSON exampleedit

In the following example, the fields exported by Filebeat include a field, inner, whose value is a JSON object encoded as a string:

{ "outer": "value", "inner": "{\"data\": \"value\"}" }

The following configuration decodes the inner JSON object:

filebeat.inputs:
- type: log
  paths:
    - input.json
  json.keys_under_root: true

processors:
  - decode_json_fields:
      fields: ["inner"]

output.console.pretty: true

The resulting output looks something like this:

{
  "@timestamp": "2016-12-06T17:38:11.541Z",
  "beat": {
    "hostname": "host.example.com",
    "name": "host.example.com",
    "version": "6.5.4"
  },
  "inner": {
    "data": "value"
  },
  "input": {
    "type": "log",
  },
  "offset": 55,
  "outer": "value",
  "source": "input.json",
  "type": "log"
}