Brewing in Beats: SOCKS5 support when sending to Logstash

Welcome to Weekly Beats! With this series, we're keeping you up to date with all that's new in Beats, from the details of work-in-progress pull requests to releases and learning resources.

Merged this week:

DNS eTLD+1 support in Packetbeat

As Andrew wrote, “the effective top level domain plus one more label is really useful for clustering DNS requests (hostnames). For example, the eTLD+1 for foo.bar.golang.org. is golang.org.. This was the basis for the aggregations used in Detecting DNS Tunnels with Packetbeat and Watcher.” The PR adds a dns.question.etld_plus_one field to the Packetbeat DNS transactions.
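To make the eTLD+1 idea concrete, here is a minimal sketch of the lookup. It is not the code from the PR: the suffixes table is a tiny, hypothetical stand-in for the real Public Suffix List, and etldPlusOne is a name made up for this illustration. A real implementation would rely on the full list, for instance through the EffectiveTLDPlusOne function in golang.org/x/net/publicsuffix. Unlike the example in the quote, this sketch drops the trailing dot of the fully qualified name.

```go
package main

import (
	"fmt"
	"strings"
)

// suffixes is a tiny, hypothetical stand-in for the Public Suffix List.
var suffixes = map[string]bool{"org": true, "com": true, "co.uk": true}

// etldPlusOne returns the longest known public suffix plus one more label.
// The second result is false when no suffix matches or when the host is
// itself a bare suffix.
func etldPlusOne(host string) (string, bool) {
	host = strings.TrimSuffix(strings.ToLower(host), ".")
	labels := strings.Split(host, ".")
	// Try the longest candidate suffix first so that "co.uk" is matched
	// before any shorter suffix could be.
	for i := 1; i < len(labels); i++ {
		if suffixes[strings.Join(labels[i:], ".")] {
			return strings.Join(labels[i-1:], "."), true
		}
	}
	return "", false
}

func main() {
	for _, h := range []string{"foo.bar.golang.org.", "www.example.co.uk", "org"} {
		d, ok := etldPlusOne(h)
		fmt.Printf("%q -> %q (found: %v)\n", h, d, ok)
	}
}
```

Grouping DNS questions by this value puts foo.bar.golang.org. and any other golang.org hostname into the same bucket, which is what makes it useful for aggregations.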

SOCKS5 proxy for the connection to Logstash

This makes it possible for the Beats to connect to Logstash via a SOCKS5 proxy. This can be useful when sending data between different Internet domains. It also opens the possibility of using a username and password for authenticating to Logstash.
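As an illustration of the username/password part, the sketch below pulls the proxy address and credentials out of a SOCKS5 URL using only the Go standard library. Everything specific here is an assumption made for the example: the URL form, the host name proxy.example.com, the credentials, and the proxySettings and parseProxyURL names. It does not show how Beats itself reads its configuration or dials the proxy.

```go
package main

import (
	"fmt"
	"net/url"
)

// proxySettings holds the pieces a SOCKS5 dialer would need.
// The type is hypothetical and exists only for this sketch.
type proxySettings struct {
	Address  string
	Username string
	Password string
}

// parseProxyURL accepts URLs of the form socks5://user:password@host:port.
func parseProxyURL(raw string) (proxySettings, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return proxySettings{}, err
	}
	if u.Scheme != "socks5" {
		return proxySettings{}, fmt.Errorf("unsupported proxy scheme %q", u.Scheme)
	}
	s := proxySettings{Address: u.Host}
	if u.User != nil {
		s.Username = u.User.Username()
		s.Password, _ = u.User.Password()
	}
	return s, nil
}

func main() {
	s, err := parseProxyURL("socks5://beats:secret@proxy.example.com:1080")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("proxy=%s user=%s\n", s.Address, s.Username)
}
```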

Cleaner separation between the Go unit tests and integration tests

Our Go _test.go files contain both unit and integration tests (defined as tests that require other services). We used to separate them via the “short” flag on a test-by-test basis. This also meant that in order to run only the unit tests, one had to call go test -short. With this change, the unit and integration tests are strictly separated into different files and we use build tags to select between them. From now on, to run only the unit tests you simply write go test. To execute the integration tests, you now need go test -tags=integration.
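The selection works because each test file can carry a build constraint line in its header, and the Go tool compiles the file only if the constraint holds for the enabled tags. The sketch below uses the standard go/build/constraint package to evaluate such a line with and without the integration tag. The exact header lines used in the Beats repository are an assumption here, and selected is a helper invented for this example.

```go
package main

import (
	"fmt"
	"go/build/constraint"
)

// selected reports whether a file whose header carries the given build
// constraint line would be compiled when the listed tags are enabled.
func selected(line string, tags ...string) (bool, error) {
	expr, err := constraint.Parse(line)
	if err != nil {
		return false, err
	}
	enabled := make(map[string]bool, len(tags))
	for _, t := range tags {
		enabled[t] = true
	}
	return expr.Eval(func(tag string) bool { return enabled[tag] }), nil
}

func main() {
	// Header line an integration test file might carry (assumed).
	const header = "//go:build integration"

	plain, _ := selected(header)                 // like plain `go test`
	tagged, _ := selected(header, "integration") // like `go test -tags=integration`
	fmt.Println("compiled without tag:", plain)
	fmt.Println("compiled with tag:   ", tagged)
}
```

A unit test file that should be left out of integration runs could carry the negated constraint, !integration, which evaluates the other way around.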

Fix Topbeat CPU time computation on Windows

This avoids some floating point arithmetic when converting from a Windows structure to a 64-bit value. We’re hoping that this fixes a bug where huge numbers were reported for CPU usage.
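For readers curious what an integer-only conversion can look like, here is a minimal sketch. It assumes the structure in question is laid out like the Windows FILETIME structure, which stores a 64-bit count of 100-nanosecond intervals as two 32-bit halves. The filetime type and its methods are written for this example and are not taken from the Topbeat fix.

```go
package main

import (
	"fmt"
	"time"
)

// filetime mirrors the layout of the Windows FILETIME structure: a 64-bit
// count of 100-nanosecond intervals split into two 32-bit halves.
type filetime struct {
	LowDateTime  uint32
	HighDateTime uint32
}

// ticks joins the two halves using only integer shifts and ORs,
// so no precision is lost on the way.
func (ft filetime) ticks() uint64 {
	return uint64(ft.HighDateTime)<<32 | uint64(ft.LowDateTime)
}

// duration converts the tick count to a time.Duration
// (one tick is 100 nanoseconds).
func (ft filetime) duration() time.Duration {
	return time.Duration(ft.ticks()) * 100 * time.Nanosecond
}

func main() {
	ft := filetime{LowDateTime: 10000000} // 10,000,000 ticks
	fmt.Println(ft.ticks(), ft.duration())
}
```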

Track unavailability in Metricbeat

If a monitored system is not available at all, an error document is sent to Elasticsearch so it is easier to track downtime.

And from the currently in progress pull requests:

Generic event representation

We currently represent an event internally in Beats as a map[string]interface{}. Translating from Go, that means a map from string to anything (because every type implements the empty interface). The only requirement is that all types inside are JSON serializable, but then again, pretty much everything in Go is. This makes it quite easy to write new Beats (just throw anything you have into a map), but it means that we have to rely on reflection when working with the event in libbeat, for example for the generic filtering. Reflection code is slow and error-prone. Benchmarking also showed that the JSON serialization often dominates the performance of the Beats, and it is slow because it has to do reflection.
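A small sketch of the current “anything goes” representation: an event is just a map, and encoding/json inspects each value through reflection at serialization time. The event type name and the field names below are made up for the illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// event is the "anything goes" representation: any JSON-serializable
// value can be stored under any key.
type event map[string]interface{}

// encode serializes the event; encoding/json walks the values with
// reflection to find out what each one is.
func encode(e event) (string, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	e := event{
		"type":         "dns",
		"responsetime": 12,
		"tags":         []string{"prod", "eu"},
		"dns":          event{"question": "golang.org."},
	}
	out, err := encode(e)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

Building the event is trivial, but any code that later wants to filter or rewrite fields has to discover at runtime whether a value is a string, a number, a slice or another map.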

So we want to move away from these “anything goes” events. For the transition phase (we don’t have control over the community Beats) and for convenience later, we’ve written some code that takes a map[string]interface{} event and uses reflection (no way around that) to transform it into a nested map that only contains a few accepted types. This makes the rest of the libbeat code easier, especially the filtering code.
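The transformation described above could look roughly like the following sketch. It is not the libbeat code: the set of accepted types (string, bool, int64, uint64, float64, []interface{} and map[string]interface{}) and the normalize name are assumptions chosen for the example.

```go
package main

import (
	"fmt"
	"reflect"
)

// normalize converts an arbitrary value into one of a few accepted types:
// string, bool, int64, uint64, float64, []interface{} or
// map[string]interface{}. Anything else is reported as an error.
func normalize(v interface{}) (interface{}, error) {
	if v == nil {
		return nil, nil
	}
	rv := reflect.ValueOf(v)
	switch rv.Kind() {
	case reflect.String:
		return rv.String(), nil
	case reflect.Bool:
		return rv.Bool(), nil
	case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
		return rv.Int(), nil
	case reflect.Uint, reflect.Uint8, reflect.Uint16, reflect.Uint32, reflect.Uint64:
		return rv.Uint(), nil
	case reflect.Float32, reflect.Float64:
		return rv.Float(), nil
	case reflect.Ptr:
		if rv.IsNil() {
			return nil, nil
		}
		return normalize(rv.Elem().Interface())
	case reflect.Slice, reflect.Array:
		out := make([]interface{}, rv.Len())
		for i := range out {
			n, err := normalize(rv.Index(i).Interface())
			if err != nil {
				return nil, err
			}
			out[i] = n
		}
		return out, nil
	case reflect.Map:
		if rv.Type().Key().Kind() != reflect.String {
			return nil, fmt.Errorf("unsupported map key type %s", rv.Type().Key())
		}
		out := make(map[string]interface{}, rv.Len())
		for _, k := range rv.MapKeys() {
			n, err := normalize(rv.MapIndex(k).Interface())
			if err != nil {
				return nil, err
			}
			out[k.String()] = n
		}
		return out, nil
	default:
		return nil, fmt.Errorf("unsupported type %s", rv.Type())
	}
}

func main() {
	in := map[string]interface{}{
		"port": int32(80),
		"tags": []string{"a", "b"},
		"http": map[string]int{"code": 200},
	}
	out, err := normalize(in)
	fmt.Println(out, err)
}
```

After this pass, downstream code such as filtering only needs a plain type switch over a handful of cases instead of reflection.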

JSON support in Filebeat

We now have a second proposal for implementing JSON support in Filebeat. The advantage of this one over the one I linked in last week’s update is that you can combine it with multiline and line filtering in a more meaningful way. This makes it a good way of shipping logs from a Docker host, for example, while still being able to use multiline on the application logs.
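To illustrate why decoding JSON before multiline matters for Docker, here is a rough sketch. Docker's json-file logging driver wraps every application line in an object with log, stream and time fields, so the application text only becomes visible after decoding. The decodeLine and mergeContinuations helpers, and the rule that a line starting with whitespace continues the previous one, are simplifications invented for this example and do not describe the proposed Filebeat implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// dockerLine models one line written by Docker's json-file logging driver.
type dockerLine struct {
	Log    string `json:"log"`
	Stream string `json:"stream"`
	Time   string `json:"time"`
}

// decodeLine unwraps a single JSON-encoded log line.
func decodeLine(raw string) (dockerLine, error) {
	var l dockerLine
	err := json.Unmarshal([]byte(raw), &l)
	return l, err
}

// mergeContinuations is a deliberately simple multiline rule: a message
// that starts with a space or tab is appended to the previous message.
func mergeContinuations(msgs []string) []string {
	var out []string
	for _, m := range msgs {
		if len(out) > 0 && (strings.HasPrefix(m, " ") || strings.HasPrefix(m, "\t")) {
			out[len(out)-1] += m
			continue
		}
		out = append(out, m)
	}
	return out
}

func main() {
	raw := []string{
		`{"log":"Exception in thread main\n","stream":"stderr","time":"2016-03-01T12:00:00Z"}`,
		`{"log":"\tat Foo.bar(Foo.java:1)\n","stream":"stderr","time":"2016-03-01T12:00:00Z"}`,
	}
	var msgs []string
	for _, r := range raw {
		l, err := decodeLine(r)
		if err != nil {
			fmt.Println("skipping line:", err)
			continue
		}
		msgs = append(msgs, l.Log)
	}
	fmt.Printf("%q\n", mergeContinuations(msgs))
}
```

Applying the multiline rule to the raw lines would not work, because every raw line starts with an opening brace; it has to run on the decoded log field.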

Dashboards per Beat & module

We started to move the Kibana configuration for the sample dashboards inside the main repo and organize it per Beat and even per module (in Metricbeat). Nicolas wrote Python scripts to split the dashboards into “snippets” and combine them back.