31 May 2016 Releases

Beats 5.0.0-alpha3 released

By Monica Sarbu

We are over-the-moon excited to announce the 5.0.0-alpha3 release, a major milestone on our road to Beats 5.0.

IMPORTANT: This is an alpha release and it is intended for testing purposes only.

Bye bye Topbeat, hello Metricbeat

We are proud to see that in just 6 months since we launched libbeat 1.0, the community has already created around 25 Beats. If you take a closer look at the community Beats, you will notice that many of them collect metrics from various services. For example, Mysqlbeat queries the MySQL server for metrics, Nginxbeat queries Nginx for metrics, Apachebeat queries Apache for metrics, and so on.

Our own Topbeat works in a similar way: it periodically queries the various operating system APIs for system statistics such as CPU usage, memory usage, and per-process statistics, and indexes them in Elasticsearch. We’ve seen great adoption of Topbeat and have received consistently good feedback about it.

With so many metrics-based use cases, we wanted to provide a framework that makes metrics easy to collect. So with alpha3, we are releasing a more generic Beat that collects not only system statistics but also other statistics by periodically interrogating external systems. This new Beat is called Metricbeat, and its system module replaces Topbeat. Note that Topbeat 1.x versions will continue to work with Elastic Stack versions 5.x.

Metricbeat is designed from the beginning to be modular so you can easily add new modules that collect data from external systems. For now Metricbeat includes the following modules: system, apache, mysql, nginx, redis, and zookeeper. We will keep extending this list during the alpha and beta phases of the 5.0 release.
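To make the modular idea more concrete, here is a minimal Python sketch of a collector that runs whichever modules are enabled. It is not Metricbeat code (the Beats are written in Go), and every name in it (`MODULES`, `register`, `collect_once`, and the placeholder metric fields) is hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical registry: module name -> function returning one metrics event.
MODULES: Dict[str, Callable[[], dict]] = {}


def register(name: str):
    """Decorator that adds a fetch function to the registry under `name`."""
    def wrap(fetch: Callable[[], dict]) -> Callable[[], dict]:
        MODULES[name] = fetch
        return fetch
    return wrap


@register("system")
def fetch_system() -> dict:
    # Placeholder values; a real module would query operating system APIs.
    return {"cpu_pct": 0.0, "mem_used_bytes": 0}


@register("redis")
def fetch_redis() -> dict:
    # Placeholder values; a real module would query the Redis server.
    return {"connected_clients": 0}


def collect_once(enabled: List[str]) -> List[dict]:
    """Run every enabled module once and tag each event with its module name."""
    events = []
    for name in enabled:
        event = dict(MODULES[name]())
        event["module"] = name
        events.append(event)
    return events
```

In this sketch, adding a new data source means registering one more fetch function. A scheduler would then call `collect_once` once per configured period.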

I would like to give special thanks to the community contributors Radovan Ondas (creator of Apachebeat), KS Chan (Nginxbeat), and Chris Black (Redisbeat) for converting their Beats to Metricbeat modules.

In the default Metricbeat configuration, only the system module is enabled, so running Metricbeat with the default configuration is equivalent to running Topbeat. It exports system statistics like CPU usage, memory, swap, per-process statistics, per-core statistics, and filesystem statistics. It also exports IO and network statistics, which were a popular feature request for Topbeat.

This means that if you are currently using Topbeat, migrating to Metricbeat is easy. Here is the default Topbeat 1.x configuration:

input:
  period: 10
  procs: [".*"]
  stats:
    system: true
    process: true
    filesystem: true

And here is the default Metricbeat configuration, exporting the same data:

metricbeat.modules:
 - module: system
   metricsets:
     - cpu
     #- core
     #- diskio
     - filesystem
     #- fsstat
     - memory
     - process
   enabled: true
   period: 10s
   processes: ['.*']

So, if you are running the default Topbeat configuration, all you need to do is upgrade to Metricbeat and use its default configuration. You will get the same data, but in a slightly different format.

Add conditions to filtering

Filtering is a new feature added in 5.0.0-alpha2, and it’s available to all the Beats through libbeat. With filtering, you can reduce the number of fields that are exported by defining a list of filter actions that are applied to each event before it’s sent to the defined output. The filter actions are executed in the order that they are defined in the config file.

Starting with alpha3, a filter action can carry a condition, so the action is applied only to events that fulfill that condition. We also added the drop_event action, which drops entire events.

For example, if you are using Packetbeat to monitor your HTTP transactions, you can decide not to index the successful transactions in Elasticsearch by dropping all events that have the HTTP response 200 OK. The configuration file should include the following filters section:

filters:
- drop_event:
    equals:
      http.code: 200
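The ordered, conditional behavior described above can be sketched in a few lines of Python. This is only an illustration of the semantics, not the libbeat implementation: the helper names (`equals`, `drop_event`, `drop_fields`, `apply_filters`) and the representation of an event as a flat dictionary with dotted keys are assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

Event = dict
Condition = Callable[[Event], bool]
# An action returns the (possibly modified) event, or None to drop it.
Action = Callable[[Event], Optional[Event]]


def equals(field: str, value) -> Condition:
    """Condition that holds when the event's `field` equals `value`."""
    return lambda event: event.get(field) == value


def drop_event(event: Event) -> Optional[Event]:
    """Action that discards the whole event."""
    return None


def drop_fields(*fields: str) -> Action:
    """Action that removes the named fields from the event."""
    def action(event: Event) -> Optional[Event]:
        return {k: v for k, v in event.items() if k not in fields}
    return action


def apply_filters(
    event: Event, filters: List[Tuple[Optional[Condition], Action]]
) -> Optional[Event]:
    """Apply the actions in order; an action without a condition always runs."""
    for condition, action in filters:
        if condition is None or condition(event):
            event = action(event)
            if event is None:  # the event was dropped, so stop processing
                return None
    return event
```

With `filters = [(equals("http.code", 200), drop_event)]`, an event whose `http.code` is 200 comes back as `None`, while any other event passes through unchanged.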

Configuration files, how you like them

The Beats use YAML for configuration and have a tradition of putting all the available options commented out in the configuration file, together with a short description for each of them. This means that the configuration file also acts as a reference, so you almost don’t have to read the manual (you still should; it contains useful guides and more details for some of the options). The downside is that as we add more options to the Beats (notably the Redis and Kafka outputs), the configuration files tend to become very large, which contradicts the lightweight nature of the Beats.

With the alpha3 release, each Beat comes with two versions of the configuration file. A short one, the default, that contains only the most common options. It’s beautifully simple. And a complete one (called, for example, filebeat.full.yml), containing all the non-deprecated options. It’s wonderfully comprehensive.

We recommend that you start with the short configuration and copy over settings from the full version as needed.

These are not the only changes we made to the configuration files. We’ve seen lots of users having trouble with the multiple indentation levels, so we reduced the number of levels by using dots in the field names. Also, YAML is just not strict enough when it comes to the accepted syntax, so we added lots of validators to catch errors early.
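To illustrate how dotted names and nested sections can describe the same settings, here is a small Python sketch that expands dotted keys into nested dictionaries and rejects one kind of conflict early. The function name `expand_dotted` is hypothetical, the setting used in the usage note is only an example, and this is not how the Beats parse their configuration internally.

```python
def expand_dotted(flat: dict) -> dict:
    """Expand keys such as "a.b.c" into nested dictionaries.

    Raises ValueError when a dotted key would nest under an existing
    scalar value.
    """
    nested: dict = {}
    for key, value in flat.items():
        node = nested
        *parents, leaf = key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
            if not isinstance(node, dict):
                raise ValueError(
                    f"{key!r} conflicts with a scalar setting at {part!r}"
                )
        node[leaf] = value
    return nested
```

For example, `expand_dotted({"output.elasticsearch.hosts": ["localhost:9200"]})` yields the same structure as three levels of YAML indentation, but the dotted form fits on a single line.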

Become a Pioneer

A big Thank You to everyone who has tried the alpha1 and alpha2 releases and posted issues or provided feedback. We’d also like to remind you that if you post a valid, non-duplicate bug report during the alpha/beta period against any of the Elastic Stack projects, you are entitled to a special gift package.