The Logstash Lines: 2.0 Feedback

Welcome back to The Logstash Lines! In these weekly posts, we'll share the latest happenings in the world of Logstash and its ecosystem.

2.0 Feedback

Thanks to our users for downloading and trying Logstash 2.0 -- we've received great feedback so far! In particular, a few users ran into issues while upgrading both their Elasticsearch and Logstash versions to 2.0. These were caused by breaking changes introduced in 2.0, which is a major version upgrade.

  1. Dots in field names: Although we fixed the metrics and elapsed filters so they no longer use dots in field names, some filters out in the wild still do. To help with this, we wrote a de_dot filter. By default it replaces dots in field names with underscores (foo.bar becomes foo_bar). It can also convert such fields to actual nested fields like [foo][bar]. Check out this blog for more details.
  2. Mapping issue: A Logstash update will not overwrite an existing template in Elasticsearch, because users may have made custom mapping changes. To force an update, a user has to set a flag (template_overwrite) in the Elasticsearch output.
  3. Log Courier: This is a popular fork of Logstash Forwarder. Changes introduced to the JrJackson library in 2.0 broke one of the APIs this project was using. Jason, the creator of Log Courier, has released a new version that is compatible with 2.0.
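
To make the two de_dot behaviors from the first item concrete, here is a minimal Python sketch of the idea. It is not the plugin's code: the function name and its `nested` and `separator` parameters are assumptions made for illustration, and it only looks at top-level field names.

```python
def de_dot(event, nested=False, separator="_"):
    """Return a copy of `event` (a dict) with dots removed from field names.

    Illustrative only: `nested` and `separator` are hypothetical parameters,
    not a statement about the real filter's options.
    """
    result = {}
    for name, value in event.items():
        if "." not in name:
            # Field names without dots are copied unchanged.
            result[name] = value
        elif nested:
            # foo.bar -> {"foo": {"bar": value}}, i.e. [foo][bar]
            parts = name.split(".")
            target = result
            for part in parts[:-1]:
                target = target.setdefault(part, {})
                if not isinstance(target, dict):
                    # A scalar field already occupies the parent name.
                    raise ValueError("cannot nest under non-object field: " + part)
            target[parts[-1]] = value
        else:
            # foo.bar -> foo_bar
            result[name.replace(".", separator)] = value
    return result


flat = de_dot({"foo.bar": 1, "baz": 2})
tree = de_dot({"foo.bar": 1, "baz": 2}, nested=True)
```

Here `flat` keeps a single level with the renamed key, while `tree` moves the value under a `foo` object. The nested form preserves the original structure, but it can collide with an existing scalar field of the same name, which is one reason a plain rename is the safer default.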

You can find more information about upgrading Logstash and Elasticsearch to 2.0 here.

Java Event: We started an initiative to rewrite the Event class in pure Java. Event is the main object that encapsulates data and provides an API for plugins to process the event content. Implementing Event in pure Java will improve performance and make serialization faster by avoiding costly type conversions between JRuby and Java, which in turn will help with an efficient persistence implementation. These changes are internal and will not introduce any breaking changes for users.

Plugins Land:

  • Twitter Input: Working on adding proxy support to Twitter Input. This has been a popular ask since Elasticsearch rivers were deprecated.
  • Multiline Codec: Adding stream identity to this codec so we can use this to safely collect multiline data from multiple file sources.
  • Elasticsearch Output: Refactored this output to modularize existing methods. Changed the retry logic to be synchronous and removed the extra Stud::Buffer, which will help with the transition to the Persistent Queues data structure. Currently, the output blocks until all failures in a bulk request are retried.
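
The stream identity idea from the Multiline Codec item can be sketched in a few lines of Python. This is only an illustration of why a per-stream buffer matters, not the codec's implementation: the class name, the method names, and the rule that a continuation line starts with whitespace are all assumptions made for the example.

```python
from collections import defaultdict


class MultilineBuffer:
    """Merge continuation lines into events, with one buffer per stream.

    Hypothetical sketch: a line that starts with whitespace is treated as a
    continuation of the previous line from the same stream.
    """

    def __init__(self):
        # stream identity (for example a file path) -> pending lines
        self._pending = defaultdict(list)

    def accept(self, stream_id, line):
        """Feed one line; return a completed event string, or None."""
        pending = self._pending[stream_id]
        if not pending or line[:1].isspace():
            # First line of this stream, or a continuation: keep buffering.
            pending.append(line)
            return None
        # A new event starts, so emit the buffered one for this stream only.
        completed = "\n".join(pending)
        self._pending[stream_id] = [line]
        return completed

    def flush(self, stream_id):
        """Emit whatever is still buffered for one stream, or None."""
        pending = self._pending.pop(stream_id, [])
        return "\n".join(pending) if pending else None


buf = MultilineBuffer()
events = []
interleaved = [
    ("a.log", "Error A"), ("b.log", "Error B"),
    ("a.log", "  at a1"), ("b.log", "  at b1"),
    ("a.log", "Next A"), ("b.log", "Next B"),
]
for stream_id, line in interleaved:
    event = buf.accept(stream_id, line)
    if event is not None:
        events.append((stream_id, event))
```

With a single shared buffer, the interleaved lines from a.log and b.log would be merged into one mixed-up event. Keying the buffer by stream identity keeps each file's stack trace together.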

As you can see, there’s been a ton of activity in the past week. We are gearing up for another feature release, so stay tuned!