December 5, 2016

Brewing in Beats: Metricbeat Kafka module improvements

By Tudor Golubenco

Welcome to Brewing in Beats! With this series, we're keeping you up to date with all that's new in Beats, from the details of work in progress pull requests to releases and learning resources.

Pass arrays or dicts via -E and env variables

It is now possible again (this worked in versions before 5.0, but a regression removed it) to specify an array or dictionary via environment variables or the -E CLI flag. This is particularly useful when specifying multiple output hosts. For example, this now works as expected: output.kafka.hosts: ${KAFKA_HOSTS}, where KAFKA_HOSTS is an environment variable in the form "host1:9092","host2:9092". This feature didn't make the cut for the 5.1 release, but will be available in an upcoming minor release.
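As an illustration, here is a hedged sketch of both approaches (the host names are hypothetical, and the exact quoting may vary by shell):

```shell
# Sketch: pass an array-valued setting through an environment variable,
# referenced in the config as output.kafka.hosts: ${KAFKA_HOSTS}
export KAFKA_HOSTS='"host1:9092","host2:9092"'
./filebeat -e

# Or override the same setting directly with the -E flag:
./filebeat -e -E 'output.kafka.hosts=["host1:9092", "host2:9092"]'
```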

Metricbeat Kafka module improvements

In preparation for the first release of the Kafka module, we've made several fixes and improvements to it, including TLS support and SASL authentication (available since Kafka 0.10).
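A minimal configuration sketch, assuming the usual Beats TLS and credential option names (the paths, metricset, and credentials below are hypothetical, not from the post):

```yaml
# Hedged sketch of a Metricbeat Kafka module configuration
# with TLS and SASL authentication enabled.
metricbeat.modules:
  - module: kafka
    metricsets: ["partition"]
    hosts: ["localhost:9092"]
    # TLS: trust the broker's CA (hypothetical path)
    ssl.certificate_authorities: ["/etc/pki/root/ca.pem"]
    # SASL credentials (requires Kafka 0.10+)
    username: "beats"
    password: "changeme"
```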

Winlogbeat fix for “invalid bounds” error message

When reading a batch of large event log records, the Windows function EvtNext returns errno 1734 ("The array bounds are invalid."). This appears to be a bug in Windows, since the behavior is not documented anywhere. The fix handles the error by resetting the event log subscription handle (so events are not lost) and then retrying the EvtNext call with half the batch size (maxHandles/2).
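The retry logic can be sketched as follows. This is a simplified illustration, not the actual Winlogbeat code: readBatch is a hypothetical stand-in for the EvtNext call, and the subscription reset is only indicated by a comment.

```go
package main

import (
	"errors"
	"fmt"
)

// errInvalidBounds stands in for the Windows errno 1734
// ("The array bounds are invalid.") returned by EvtNext.
var errInvalidBounds = errors.New("the array bounds are invalid")

// readBatch is a hypothetical stand-in for EvtNext: it fails with
// errInvalidBounds when asked for too many large records at once.
func readBatch(maxHandles int) ([]int, error) {
	if maxHandles > 512 {
		return nil, errInvalidBounds
	}
	return make([]int, maxHandles), nil
}

// readWithRetry mirrors the fix: on the "invalid bounds" error it
// (conceptually) resets the subscription handle so no events are
// lost, then retries with half the batch size.
func readWithRetry(maxHandles int) ([]int, error) {
	events, err := readBatch(maxHandles)
	if errors.Is(err, errInvalidBounds) {
		// In the real code, the event log subscription handle
		// would be reset here before retrying.
		return readBatch(maxHandles / 2)
	}
	return events, err
}

func main() {
	events, err := readWithRetry(1024)
	fmt.Println(len(events), err) // prints: 512 <nil>
}
```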

Winlogbeat benchmarks and performance improvements

Andrew is working on analyzing the performance of Winlogbeat and fixing some performance issues, such as reducing the number of allocations when converting from UTF-16.

Automatic tests with ES/LS 2.x

We realized that we don't have automatic integration tests against any of the 2.x versions of Elasticsearch and Logstash, even though we list them on our support matrix, so we're fixing that.

Filebeat: restate publish_async as experimental

The publish_async feature allows Filebeat to continue reading batches of log lines while waiting for confirmation that the current batch was processed by the next stage. This option can improve throughput, but it can also increase memory and CPU consumption. We're now making it clear, both in the docs and via a warning, that this feature is considered experimental at the moment.
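For reference, enabling the option is a one-line config change (a minimal sketch, assuming the setting lives under the filebeat section as publish_async):

```yaml
# Experimental: keep reading new batches while the previous batch
# is still in flight. May improve throughput at the cost of higher
# memory and CPU usage.
filebeat:
  publish_async: true
```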