January 27, 2016

Brewing in Beats: Beat generator

By Tudor Golubenco

Welcome to Weekly Beats! With this series, we're keeping you up to date with all that's new in Beats, from the details of work in progress pull requests to releases and learning resources.

Quite a few things are going on while we’re working on the next major release and preparing for Elastic{ON}. This week also brings us a new community Beat.

Beat generator: template for creating a new Beat

We started beat-generator, a new project that makes it a lot easier to start a new Beat. With one command you get all the boilerplate code required, the Makefile to build it, a correct Go dependency setup, and a way to keep up to date with libbeat changes (make update). We hope not only to make it easier to start new Beats, but also to establish a more unified dev process around the community Beats.

If you plan to create a new Beat, we recommend using it already.

New community Beat: Elasticbeat

It was bound to happen: a community Beat to monitor Elasticsearch, called Elasticbeat. We’ll be following this one for sure.

Improved system testing

Also on the theme of making life easier for the creators of community Beats, the system tests were refactored to avoid duplication and to be importable from new Beats. In addition, they now fail automatically on panics in the logs or on a non-zero exit code from the Beat. Special thanks go to community contributor Cyrille Verrier, who proposed several improvements and helped with implementing them.
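To make the two failure conditions concrete, here is a minimal sketch in Go. It is not the actual system test code, which this post does not show; the helper name checkRun and the "panic:" marker it searches for are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// checkRun is a hypothetical helper (not part of libbeat). It returns an
// error when a Beat run should count as a test failure: either the process
// exited with a non-zero code, or its log output contains a Go panic.
func checkRun(exitCode int, logOutput string) error {
	if exitCode != 0 {
		return fmt.Errorf("beat exited with code %d", exitCode)
	}
	// Assumption: a panic shows up in the log as a line containing "panic:".
	if strings.Contains(logOutput, "panic:") {
		return errors.New("panic found in log output")
	}
	return nil
}

func main() {
	fmt.Println(checkRun(0, "INFO started\nINFO stopped"))
	fmt.Println(checkRun(0, "INFO started\npanic: runtime error"))
	fmt.Println(checkRun(2, "INFO started"))
}
```

Running the check after every test, rather than inside individual tests, is what makes the failure automatic.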

Filebeat: introduced close_older setting

The ignore_older setting in Filebeat used to do two things, which caused issues because sometimes no single value was convenient for both. This PR splits them into two settings and also changes the defaults: ignore_older is now disabled by default and close_older is set to one hour.

Topbeat: added support for capturing the full command line

Topbeat used to collect only the process name. Now it captures the full command line, which was an often-requested feature. This also works on Windows!

Performance improvements in the publisher pipeline

We’ve talked in the past about the performance issues we’re having with the “at-least-once” communication between Filebeat and Logstash. It’s a fairly difficult issue because the Beat needs to balance sending large batches fast to increase throughput against being able to reduce its output when Logstash is busy. We refactored the code and made it possible for Filebeat to process the next batch while waiting for the current one to be ACKed.

This helps with the overall throughput by enabling load balancing between multiple output threads, at the cost of memory usage. With the right settings and enough memory and CPU power, we’ve seen Filebeat pushing around 45K events/s, compared to around 18K before this change.
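The idea of working on the next batch while an earlier one is still waiting for its ACK can be sketched in Go as follows. This is a simplified illustration under stated assumptions, not the libbeat publisher: the names batch, publishPipelined, and sendAndWaitACK are hypothetical, and the network round trip is simulated with a sleep.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// batch is a hypothetical group of events sent together.
type batch []string

// sendAndWaitACK stands in for a network round trip: it returns once the
// receiver has acknowledged the batch. Here it only sleeps.
func sendAndWaitACK(b batch) {
	time.Sleep(10 * time.Millisecond)
}

// publishPipelined sends batches concurrently, keeping at most maxInFlight
// batches un-ACKed at any time, and returns the number of events ACKed.
func publishPipelined(batches []batch, maxInFlight int, send func(batch)) int {
	if maxInFlight < 1 {
		maxInFlight = 1
	}
	var (
		wg    sync.WaitGroup
		mu    sync.Mutex
		acked int
	)
	// Each in-flight batch holds one slot. When all slots are taken the
	// producer blocks, which slows it down while the receiver is busy.
	slots := make(chan struct{}, maxInFlight)
	for _, b := range batches {
		slots <- struct{}{}
		wg.Add(1)
		go func(b batch) {
			defer wg.Done()
			defer func() { <-slots }() // free the slot once ACKed
			send(b)
			mu.Lock()
			acked += len(b)
			mu.Unlock()
		}(b)
	}
	wg.Wait()
	return acked
}

func main() {
	batches := []batch{{"a", "b"}, {"c"}, {"d", "e", "f"}}
	fmt.Println("events ACKed:", publishPipelined(batches, 2, sendAndWaitACK))
}
```

The memory cost mentioned above is visible here: up to maxInFlight batches have to be kept around until their ACKs arrive. Note that this sketch does not preserve batch ordering or retry failed sends.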

FreeBSD and Solaris are now part of our CI

Jenkins now runs the tests on these two new platforms, which is the first step in supporting them. All tests from Filebeat and Packetbeat are passing already. Topbeat would require more work.

Work in progress on generic filtering in libbeat

Monica is making progress on the generic filtering feature in libbeat, from which all Beats are expected to benefit. For the moment it is possible to filter fields from a generic event, which already covers a lot of the feature requests that we received.
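As a rough picture of what filtering fields from a generic event means, here is a minimal Go sketch. It assumes that an event is a map of field names to values and that filtering means dropping the named fields. The event type and the dropFields function are hypothetical and do not reflect the actual libbeat API or configuration.

```go
package main

import "fmt"

// event is a hypothetical generic event: field names mapped to values.
type event map[string]interface{}

// dropFields returns a copy of e without the named fields.
// The original event is left untouched.
func dropFields(e event, fields ...string) event {
	out := make(event, len(e))
	for k, v := range e {
		out[k] = v
	}
	for _, f := range fields {
		delete(out, f) // deleting a missing key is a no-op
	}
	return out
}

func main() {
	e := event{"type": "http", "status": 200, "request_body": "large payload"}
	fmt.Println(dropFields(e, "request_body"))
}
```

Because the function only looks at field names, the same filter applies to events from any Beat.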

Work in progress on Packetbeat flows

Steffen started working on a very promising Packetbeat feature: the ability to extract information about TCP/UDP/TLS flows for which we don’t understand the upper layers. This should open a new set of possible use cases for Packetbeat.