05 December 2016

Brewing in Beats: Processor for decoding JSON

By Tudor Golubenco

Welcome to Brewing in Beats! With this series, we're keeping you up to date with all that's new in Beats, from the details of work-in-progress pull requests to releases and learning resources.

New community Beat: Redditbeat

Redditbeat, by @voigt, indexes new Reddit submissions from one or more subreddits into Elasticsearch. Because why not.

Libbeat processor for decoding JSON

Filebeat has been able to decode JSON since version 5.0, but until now it had no good way of handling JSON-in-JSON situations. This can happen, for example, when using the Docker JSON logging driver with an application that also outputs JSON. Thanks to a community contribution by @suraj-soni, libbeat has a new processor that can decode JSON from an arbitrary field. This is useful for Filebeat, but it can also be used, for example, by Packetbeat to decode the body of HTTP requests.
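To make the JSON-in-JSON case concrete, here is a minimal Python sketch of the idea. It is not the libbeat implementation: the function name `decode_json_field`, its `target` parameter, and the sample log line are assumptions made up for illustration.

```python
import json


def decode_json_field(event, field, target=None):
    """Return a copy of `event` with the JSON string stored in `field` decoded.

    Hypothetical helper: if the field is missing, is not a string, or does not
    contain valid JSON, the event is returned unchanged.
    """
    raw = event.get(field)
    if not isinstance(raw, str):
        return event
    try:
        decoded = json.loads(raw)
    except json.JSONDecodeError:
        return event
    result = dict(event)
    # Write the decoded value back to the same field unless a target is given.
    result[target or field] = decoded
    return result


# Illustrative line in the style of the Docker JSON logging driver, where the
# application's own output (the "log" value) is itself a JSON document.
docker_line = '{"log": "{\\"level\\": \\"info\\", \\"msg\\": \\"started\\"}", "stream": "stdout"}'

outer = json.loads(docker_line)           # first pass: the Docker wrapper
event = decode_json_field(outer, "log")   # second pass: the application JSON
```

After the second pass, `event["log"]` is a nested object rather than an escaped string, so its keys can be queried individually.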

Metricbeat: raw MySQL fields

The general approach we take with Metricbeat is to carefully curate the available metrics and only export the most actionable ones, which we usually put in the sample dashboards. We also rename these metrics to follow a common naming scheme and make sure to select the correct Elasticsearch type and Kibana format for each of them.

While this approach gives a good out-of-the-box experience that people usually like, it has the disadvantage that a more obscure metric is often not readily available when someone needs it. To compensate, we’re experimenting with adding the option to automatically capture all metrics with their original names (or changed just enough to be usable in Elasticsearch/Kibana) under a `raw` sub-document. The first module to get the `raw` option is MySQL.
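The difference between the curated fields and the `raw` sub-document can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `CURATED` mapping, the curated field names, the `build_event` function, and the sample values are hypothetical and do not reflect the actual MySQL module.

```python
# Hypothetical mapping from original MySQL status names to curated names.
CURATED = {
    "Threads_connected": "threads.connected",
    "Bytes_received": "bytes.received",
}


def build_event(status, include_raw=False):
    """Build an event from a status dictionary.

    Curated metrics are renamed; with include_raw=True, every metric is also
    kept under "raw" with its original name.
    """
    event = {new: status[old] for old, new in CURATED.items() if old in status}
    if include_raw:
        event["raw"] = dict(status)
    return event


# Illustrative values, not real measurements.
status = {
    "Threads_connected": 3,
    "Bytes_received": 1024,
    "Innodb_buffer_pool_pages_free": 500,
}

curated_only = build_event(status)
with_raw = build_event(status, include_raw=True)
```

With the option enabled, a metric that was never curated, such as `Innodb_buffer_pool_pages_free` in this sample, is still reachable through `with_raw["raw"]`.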

Metricbeat: Kafka module merged

The Kafka module is now merged and will be available in the 5.1 release as experimental.

Metricbeat: MongoDB module improvements

The MongoDB module was refreshed to extract the key metrics from the relatively new WiredTiger storage engine and to work well with MongoDB 3.4. We now also have a sample dashboard for MongoDB.

Metricbeat: HAProxy module improvements

The fields created by the HAProxy module were reviewed to match our naming conventions and are now better documented.

Metricbeat: Docker module improvements

The same review was done for the CPU metricset of the Docker module. Also, the Docker labels are now exported as a dictionary rather than an array, which makes them easier to use in Kibana. Additionally, the ports data was removed from the container metricset. Special thanks to @rikatz for testing the Docker module early and suggesting improvements to it.
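As a rough illustration of the labels change, the Python sketch below converts an array of labels into a dictionary. The shape of the old array (a list of key/value objects), the `labels_to_dict` function, and the label names are assumptions for the example, not the module's actual data layout.

```python
def labels_to_dict(labels):
    """Convert a list of {"key": ..., "value": ...} items into one dictionary.

    Hypothetical helper; the input shape is an assumption for illustration.
    """
    return {item["key"]: item["value"] for item in labels}


# Assumed array form: each label is a separate object in a list.
labels_array = [
    {"key": "team", "value": "search"},
    {"key": "env", "value": "staging"},
]

labels = labels_to_dict(labels_array)
```

In the dictionary form each label can be addressed directly by name, for example `labels["env"]`, instead of searching the array for a matching key.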

Metricbeat: parse and sanitize the connection URL

Metricbeat now offers a safer, standardized method for modules to parse and sanitize connection URLs (the contents of the `hosts` field), giving us a reliable way to extract the sensitive information from them.
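One way to picture this is the following Python sketch, which uses the standard `urllib.parse` module to split the credentials from a connection URL. It is not Metricbeat's code: the `parse_host` function, the keys of the returned dictionary, and the sample URL are assumptions for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def parse_host(raw_url):
    """Split a connection URL into its credentials and a sanitized URL.

    Hypothetical helper: the sanitized URL keeps scheme, host, port, path and
    query, but drops the username and password.
    """
    parts = urlsplit(raw_url)
    host = parts.hostname or ""
    if ":" in host:
        # IPv6 literals need their brackets restored.
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    sanitized = urlunsplit(
        (parts.scheme, host, parts.path, parts.query, parts.fragment)
    )
    return {
        "username": parts.username,
        "password": parts.password,
        "sanitized": sanitized,
    }


# Illustrative URL with made-up credentials.
info = parse_host("mongodb://admin:s3cret@localhost:27017/admin")
```

The module can then connect with the separated credentials while only `info["sanitized"]` is used anywhere the URL might be logged or indexed.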

Filebeat: faster shutdown when dealing with a large number of files

Filebeat is now able to interrupt a running scan when receiving the shutdown signal. This makes the shutdown more responsive when the scan has to deal with lots of files.
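The general pattern of an interruptible scan can be sketched in Python as follows. Filebeat itself is written in Go and its implementation differs; the `scan` function, the `handle` callback, and the file paths here are hypothetical.

```python
import threading


def scan(paths, stop_event, handle):
    """Process paths one by one, stopping early once stop_event is set.

    Returns the number of paths handled before the scan ended.
    """
    processed = 0
    for path in paths:
        # Check for shutdown before each file instead of only after the
        # whole scan has finished.
        if stop_event.is_set():
            break
        handle(path)
        processed += 1
    return processed


stop = threading.Event()
seen = []


def handle(path):
    """Record the path; simulate a shutdown signal arriving at the third file."""
    seen.append(path)
    if len(seen) == 3:
        stop.set()


paths = [f"/var/log/app/{i}.log" for i in range(10)]
processed = scan(paths, stop, handle)
```

Because the stop flag is checked on every iteration, the time to react to a shutdown is bounded by the cost of handling one file rather than the whole list.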

Metricbeat: fix service timeout on startup

@maddin2016 fixed a bug where Metricbeat startup took too long and could cause a service timeout when started under Windows.