Logstash 1.3.1 Released!

Hello friends!

We have released logstash 1.3.1 with lots of new fixes and features. You can view the full changelog, but I’d like to highlight two of the new features, both of which were implemented by Aaron Mildenstein.

First, in our mission to make it easy to integrate with lots of tools, we have a new input plugin, collectd. This plugin lets you receive metrics from collectd agents with logstash and ship them anywhere you want. Want to know more? Check out the documentation for the collectd input.

Second, to give most users the best possible default behavior, logstash now provides its own elasticsearch index template. This index template tries to solve some common problems for users by using better default analyzers and mappings specific to logging use cases.

To demonstrate how this new feature works, let’s look at some apache logs in Kibana. I’ve parsed these apache logs with the following logstash config:

filter {
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }

  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
}

The filter configuration above uses grok with the built-in apache log pattern to parse apache logs into separate fields such as the request path, HTTP response code, bytes sent, user agent, etc. The second filter, date, parses the original timestamp field from the apache log and uses it as the canonical timestamp of the event. This way, time-based searches reflect when each request actually happened rather than when logstash read the line, and you can ingest old log data correctly.
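As a rough illustration of what these two filters do to a single line, here is a minimal Python sketch. It is not how grok or the date filter are implemented: the regular expression is a simplified stand-in for COMBINEDAPACHELOG (only the field names are borrowed from it), the sample log line is made up, and the conversion to UTC is an assumption about how the resulting timestamp is stored.

```python
import re
from datetime import datetime, timezone

# Simplified stand-in for the COMBINEDAPACHELOG grok pattern (assumption: the
# real pattern is more permissive; only the field names are borrowed from it).
COMBINED = re.compile(
    r'(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
    r'\[(?P<timestamp>[^\]]+)\] '
    r'"(?P<verb>\S+) (?P<request>\S+) HTTP/(?P<httpversion>[\d.]+)" '
    r'(?P<response>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_line(line):
    """Split one access-log line into fields and derive the event timestamp."""
    m = COMBINED.match(line)
    if m is None:
        return None  # the line does not look like a combined-format entry
    event = m.groupdict()
    # Python equivalent of the date filter format "dd/MMM/yyyy:HH:mm:ss Z".
    parsed = datetime.strptime(event["timestamp"], "%d/%b/%Y:%H:%M:%S %z")
    event["@timestamp"] = parsed.astimezone(timezone.utc)
    return event

# Made-up sample line, for illustration only.
SAMPLE = ('127.0.0.1 - - [11/Dec/2013:00:01:45 -0800] '
          '"GET /docs/1.3.1/filters/ HTTP/1.1" 200 3891 "-" "Mozilla/5.0"')

event = parse_line(SAMPLE)
print(event["request"], event["response"], event["@timestamp"].isoformat())
```

The point of the second step is that the event carries the time from the log line itself, so a line written last week still lands in last week's data.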

Now, a common search pattern is to ask for the top N of something. In Kibana, you can either use the ‘top N’ query or you can use a pie chart, depending on your goals. In this example, I’ll just use a pie chart. Adding a pie chart with mode ‘terms’ on the ‘request’ field gets me this:

[Figure: pie chart of the top terms in the 'request' field]

Most folks, in this situation, sit and scratch their heads, right? I know I did the first time. I’m pretty certain “docs” and “centralized” aren’t valid paths on the logstash.net website! The problem here is that the pie chart is built from a terms facet. With the default text analyzer in elasticsearch, a path like “/docs/1.3.1/filters/” is split into three terms {docs, 1.3.1, filters}, and the full path is never stored as a term of its own. So when we ask for a terms facet, we only get individual terms back!

Index templates to the rescue! The logstash index template we provide adds a “.raw” field to every string field you index. These “.raw” fields are set by logstash as “not_analyzed” so that no analysis or tokenization takes place – our original value is used as-is! If we update our pie chart above to instead use the “request.raw” field, we get the following:

[Figure: pie chart of the top terms in the 'request.raw' field]

Much better!

And because we still index both the analyzed terms and the not-analyzed “.raw” value for each field, you can still do simple term queries like “request:docs” to find all requests with ‘docs’ in the text.

I hope this helps explain the new feature. Happy logstashing!