09 August 2016

Brewing in Beats: Configurable index patterns

By Tudor Golubenco

Welcome to Brewing in Beats! With this series, we're keeping you up to date with all that's new in Beats, from the details of work-in-progress pull requests to releases and learning resources.

New community beat: Burrowbeat

Burrowbeat is a new Beat that can be used to monitor consumer lags in Kafka queues. It is based on the Burrow project.

Configurable index patterns

The Beats now offer a lot more flexibility in how indices are named. They used to enforce daily indices of the form beatname-%{+yyyy.MM.dd}, which is still the default. It's now possible to create weekly or hourly indices using a syntax similar to Logstash, to use a fixed name like beatname to work with the new rollover APIs from Elasticsearch, or even to use beatname-%{type}-%{+yyyy.MM.dd} to automatically split the data into separate indices based on any field from the events.
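To make the pattern syntax more concrete, here is a minimal Python sketch of how such a pattern could be expanded for a single event. It is not the Beats implementation: the function name expand_index, the small set of supported date tokens (yyyy, MM, dd, HH), and the sample event are all assumptions made for illustration.

```python
import re
from datetime import datetime, timezone

# Illustrative subset of Joda-style date tokens mapped to strftime directives.
# Real index patterns support more tokens than this sketch does.
_DATE_TOKENS = [("yyyy", "%Y"), ("MM", "%m"), ("dd", "%d"), ("HH", "%H")]

# Matches %{field} and %{+dateformat}; the leading "+" marks a date format.
_PLACEHOLDER = re.compile(r"%\{(\+?)([^}]+)\}")


def _to_strftime(date_format):
    """Translate the supported date tokens into a strftime format string."""
    for token, directive in _DATE_TOKENS:
        date_format = date_format.replace(token, directive)
    return date_format


def expand_index(pattern, event, timestamp):
    """Expand field and date placeholders in an index name pattern."""
    def substitute(match):
        is_date, body = match.group(1), match.group(2)
        if is_date:
            return timestamp.strftime(_to_strftime(body))
        return str(event[body])  # plain placeholder: take the value from the event

    return _PLACEHOLDER.sub(substitute, pattern)


ts = datetime(2016, 8, 9, 14, 30, tzinfo=timezone.utc)
event = {"type": "log"}  # hypothetical event

daily = expand_index("beatname-%{+yyyy.MM.dd}", event, ts)
hourly = expand_index("beatname-%{+yyyy.MM.dd.HH}", event, ts)
per_type = expand_index("beatname-%{type}-%{+yyyy.MM.dd}", event, ts)
fixed = expand_index("beatname", event, ts)
```

A pattern without placeholders, like the plain beatname used for rollover, passes through unchanged, while a field placeholder sends events with different type values to different indices.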

Use scaled floats for percentages

After adding support for half floats a few weeks back, Adrien did it again and added scaled floats to Elasticsearch 🎉. Scaled floats are stored as longs behind the scenes, which lets them benefit from the compression scheme used for integers in Lucene. This makes them a great fit for the way we store percentages in Beats: a number between 0 and 1 that gets formatted as a percentage by Kibana.

This PR switches our percentages to use scaled_float. We had quite a few of those, so we can expect a significant improvement in the storage footprint.
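As a rough illustration of the idea behind scaled floats, the Python sketch below multiplies a value by a scaling factor and rounds it to an integer, then divides again when reading it back. The function names and the scaling factor of 1000 are assumptions picked for the example, and the exact rounding rules inside Elasticsearch may differ from Python's round.

```python
def encode_scaled_float(value, scaling_factor):
    """Store a float as an integer by scaling and rounding (illustrative only)."""
    return round(value * scaling_factor)


def decode_scaled_float(stored, scaling_factor):
    """Recover the approximate original value from the stored integer."""
    return stored / scaling_factor


SCALING_FACTOR = 1000  # assumed for this example: keeps three decimal places

cpu_pct = 0.4273  # a percentage stored as a number between 0 and 1
stored = encode_scaled_float(cpu_pct, SCALING_FACTOR)
restored = decode_scaled_float(stored, SCALING_FACTOR)
```

The value loses the digits beyond the chosen precision, which is an acceptable trade-off for percentages, and what gets stored is a small integer that compresses well.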

Automatically generate Kibana index patterns

The sample Beats dashboards contain the Kibana index patterns, which allow us to define custom formatting for some of our fields (think of percentages). Providing the index pattern also saves the user a step while getting started. We used to create these index patterns by exporting them from Kibana, in a mostly manual process. To improve on this, Monica created a script to generate the index patterns from our fields.yml files, which are the primary source of data for everything that the Beats export.
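The general approach can be sketched in a few lines of Python. This is not Monica's script: the nested structure below is only an assumed, already-parsed stand-in for a fields.yml file, and the output is a simple mapping from full field name to a formatter id rather than the exact document Kibana expects.

```python
import json


def collect_formats(fields, prefix=""):
    """Walk nested field definitions and collect the fields that declare a format."""
    formats = {}
    for field in fields:
        path = prefix + field["name"]
        if "fields" in field:
            # A group of fields: recurse with the group name as a prefix.
            formats.update(collect_formats(field["fields"], path + "."))
        elif "format" in field:
            formats[path] = {"id": field["format"]}
    return formats


# Hypothetical field definitions, shaped like a parsed fields.yml might be.
sample_fields = [
    {"name": "system", "fields": [
        {"name": "cpu", "fields": [
            {"name": "user.pct", "type": "scaled_float", "format": "percent"},
            {"name": "cores", "type": "long"},
        ]},
    ]},
]

field_format_map = collect_formats(sample_fields)
print(json.dumps(field_format_map, indent=2))
```

Because the formatting information is derived from the same definitions that describe the exported fields, the index pattern cannot drift out of sync with them the way a manual export can.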

Lookup functionality

The lookup feature allows attaching arbitrary metadata to events by calling external scripts. The scripts are called with parameters that can be taken from the original event, and the results are cached for those parameters. This can be used, for example, to attach extra metadata to every file read by Filebeat by calling the Kubernetes APIs. The same could be used to add custom metadata to every process monitored by Metricbeat. Since the results are cached, the external script is called only once per file/process.

This functionality is now merged into an experimental branch called x-exec-lookup, which we plan to merge into master after we add several security checks around calling external scripts.
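The caching behavior described above can be sketched as follows. This is a simplified Python model, not the code from the branch: the class name CachedLookup, the lookup key added to the event, the script path, and the fake runner used in the demo are all hypothetical.

```python
import subprocess


class CachedLookup:
    """Enrich events with the output of an external command, cached per parameters."""

    def __init__(self, command, param_fields, runner=None):
        self.command = list(command)            # e.g. ["/path/to/script"] (hypothetical)
        self.param_fields = list(param_fields)  # event fields passed as arguments
        self.runner = runner or self._run_script
        self.cache = {}

    @staticmethod
    def _run_script(argv):
        # Run the external script and use its trimmed stdout as the metadata.
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
        return result.stdout.strip()

    def enrich(self, event):
        params = tuple(str(event[name]) for name in self.param_fields)
        if params not in self.cache:
            # First event with these parameters: call the script once.
            self.cache[params] = self.runner(self.command + list(params))
        enriched = dict(event)
        enriched["lookup"] = self.cache[params]
        return enriched


# Demo with an in-process stand-in for the external script, so calls can be counted.
calls = []

def fake_runner(argv):
    calls.append(argv)
    return "pod-for-" + argv[-1]

lookup = CachedLookup(["get-metadata"], ["source"], runner=fake_runner)
first = lookup.enrich({"source": "/var/log/a.log", "message": "one"})
second = lookup.enrich({"source": "/var/log/a.log", "message": "two"})
third = lookup.enrich({"source": "/var/log/b.log", "message": "three"})
```

Two events from the same file trigger a single call, and a new file triggers one more. A real implementation would also need the security checks mentioned above, plus some way to expire cache entries.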

Filebeat: shorter close_inactive default

The close_inactive option sets how long Filebeat waits before closing open files that don't receive updates. The old default was 1h, which could mean keeping a lot of files open when files rotate quickly. This PR changes the default to 5m. If a file receives updates after it was closed, it is picked up again by Filebeat, so the lower default doesn't introduce any risk of data loss.
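To see why closing a file early does not lose data, consider this simplified Python model. It is a sketch under assumptions, not Filebeat's implementation: the TrackedFile class, the scan function, and the idea of comparing the file size with a remembered offset are stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TrackedFile:
    offset: int          # bytes already read; remembered even while the file is closed
    last_read: datetime  # when new data was last read
    is_open: bool = True


def scan(state, size, now, close_inactive=timedelta(minutes=5)):
    """Update the state of one file given its current size on disk."""
    if size > state.offset:
        # New data arrived: (re)open the file and continue from the stored offset.
        state.is_open = True
        state.offset = size
        state.last_read = now
    elif state.is_open and now - state.last_read >= close_inactive:
        # Nothing new for the whole interval: release the file handle.
        state.is_open = False
    return state


t0 = datetime(2016, 8, 9, 12, 0)
f = TrackedFile(offset=100, last_read=t0)

scan(f, size=100, now=t0 + timedelta(minutes=4))
open_after_4m = f.is_open

scan(f, size=100, now=t0 + timedelta(minutes=5))
open_after_5m = f.is_open

scan(f, size=150, now=t0 + timedelta(minutes=10))
open_after_update = f.is_open
```

In this model the read position outlives the file handle, so a file that grows after being closed is simply reopened and read from where it left off.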