Logstash Collectd Input Plugin

A few weeks ago I finished the initial release of the Logstash collectd input plugin. I'm really excited about this new feature!

Some of you may be wondering why we would go to this effort, seeing how collectd data isn't exactly, well, logs. We consider any data that has a corresponding timestamp to be an event in Logstash. The data collectd sends is also timestamp + metric data. It will fit right alongside your other log data as a valuable companion. Now you can see if there's a corresponding change in disk I/O or CPU load when you see certain log entries, or vice versa! The possibilities are vast! I can't wait to hear how you use this plugin to help visualize your data!

collectd configuration

The simplest way to get started is to configure collectd. You may already have collectd running; if not, your configuration file could be as simple as this:

Hostname    "host.example.com"
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin network
<Plugin interface>
    Interface "eth0"
    IgnoreSelected false
</Plugin>
<Plugin network>
    <Server "" "25826">
    </Server>
</Plugin>

These options will send CPU load, memory statistics, and network interface traffic via UDP to the given IP on port 25826. For your own setup, just fill in the Server address with the IP or hostname of your Logstash instance. The collectd plugin will populate the Logstash event host field with whatever is in the "Hostname" directive rather than what reverse lookup finds. If otherwise unset, the default configuration will send values every 10 seconds. You can learn more about collectd configuration here.
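For a sense of what arrives on the Logstash side, here is roughly what one decoded event looks like. This is a sketch: the field names reflect how the plugin maps collectd data (plugin, type, type instance, value) and may vary by version, and the values are made up for illustration:

```json
{
  "host":          "host.example.com",
  "@timestamp":    "2013-12-16T10:00:00.000Z",
  "plugin":        "memory",
  "collectd_type": "memory",
  "type_instance": "free",
  "value":         2218192896
}
```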

This example is only a tiny sample of the kind of plugins and configurations collectd has to offer. A comprehensive list can be found here. As with Logstash, you can write your own plugins for collectd so there are virtually endless possibilities!

Logstash configuration

Now that we have collectd ready to send, let's configure our Logstash instance:

input {
  collectd {}
}

Yep, that's it. Pretty crazy, right? We try to come up with sane defaults for everything. The full configuration explanations are here. Let me explain some important ones.

  • buffer_size The default is 1452, which is what the official collectd server process expects too.
  • port The default is 25826, which is also what the official collectd server process expects.
  • prune_intervals The default is true. This one is a bit trickier to explain, as it is part of the way collectd sends data. To allow consuming applications to properly gauge frequency, collectd sends a special field indicating how often a given metric will update, and re-sends it each time that interval elapses. In Logstash we typically do not care about this field, but if you want to keep it anyway, set prune_intervals to false.
  • typesdb The typesdb is the definition file for collectd data. I included the typesdb from the most recent (5.4.0 at this writing) version of collectd so you won't need to use this option unless you have one or more custom typesdb files. This option is simply an array of paths, e.g. [ "/path/to/typesdb1", "/path/to/typesdb2", … ]
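If the defaults don't fit your deployment, the same options can be set explicitly. A sketch (the port and buffer size shown are just the defaults restated, and the typesdb path is hypothetical):

input {
  collectd {
    port        => 25826
    buffer_size => 1452
    # hypothetical custom typesdb path
    typesdb     => [ "/path/to/my-types.db" ]
  }
}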

So, what is the result of this little example? I ran this test on my 2013 MacBook Air, with the data coming from a nearby computer with 2 Ethernet ports and 16G of memory.

This is traffic from a single box over a one-hour period.

As you can see from this graph the memory is broken down into blocks: Inactive, Free, Wired, Active.
The CPU load histogram should be fairly self-explanatory, as are the network I/O charts. You can clearly see peaks every 5 minutes in network traffic on EN1, with little other traffic the rest of the time.

Caveat: The version of Kibana I am using here (as of 16 Dec 2013) was downloaded straight from the master branch on GitHub and is not an officially released version. I needed this version as it enabled me to do derivative graphing, where each subsequent point is the difference between the current and previous values. This is necessary for network values because collectd simply sends the raw counter values from the kernel (similar to SNMP network data), not rates. If you need this feature now, you too can use the most current development version. The release of Kibana coming in Jan 2014 (coinciding with the ES 1.0 release) will have this feature.
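Derivative graphing boils down to differencing consecutive counter samples and dividing by the elapsed time. A minimal sketch in Python, with made-up sample data:

```python
# Turn cumulative counter samples (as collectd reports for network octets)
# into per-second rates, the way a derivative graph does.

def counter_rates(samples):
    """samples: list of (timestamp_seconds, counter_value) tuples.
    Returns per-second rates between consecutive samples."""
    rates = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        rates.append((v1 - v0) / (t1 - t0))
    return rates

# A bytes counter sampled every 10 seconds:
samples = [(0, 1000), (10, 1500), (20, 2700)]
print(counter_rates(samples))  # [50.0, 120.0]
```

(This sketch ignores counter wraparound, which a real monitoring tool would also have to handle.)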

The stats themselves are unimpressive, coming as they do from a personal server on a home network with limited traffic. So I fired up collectd-tg, the collectd traffic generator. This was the result:

Keeping in mind that this was tested on a MacBook Air, I thought it was a pretty good showing: 3000 events per second. If I had configured servers to send an average of 30 events every 10 seconds (3 events per second), that amounts to my laptop being able to process a continuous stream of events from 1000 servers!

So, there's a brief introduction to the collectd plugin. Happy Logstashing!