
Logstash 5.1.1 released

We are pleased to announce the release of Logstash 5.1.1, loaded with new features. If you can't wait to get your hands on it, head straight to our downloads page. You can also view the release notes here.

A note about Logstash 5.1.0

Version 5.1.0 doesn't exist because, for a short period of time, the Elastic Yum and Apt repositories included unreleased binaries labeled 5.1.0. To avoid confusion and upgrade issues for people who may have installed these binaries without realizing it, we decided to skip the 5.1.0 version and release 5.1.1 instead.

Introducing Persistent Queues Beta

Logstash is a critical component in many data ingestion architectures, shipping millions of events to Elasticsearch and other outputs. Improved data resiliency has been one of the most highly anticipated features for Logstash. Today, we are glad to announce our first iteration of persistent queues, which aims to deliver exactly that.

Please keep in mind that this functionality is in beta and subject to change; deploy it in production at your own risk.

By default, Logstash uses in-memory queues between the pipeline stages (input → filter and filter → output) to transfer events. The size of these queues is fixed and not configurable. If Logstash terminates abruptly, either because of a software failure or because the user forces an unsafe shutdown, all in-flight events are lost. To prevent loss in these scenarios, you can configure Logstash to use persistent queues. With persistent queues enabled, Logstash persists events to disk before processing them.

The queue size is variable, with configurable limits, which means you can buffer events in Logstash instead of at the source or edge node. This also helps manage situations that would otherwise result in backpressure at the source. For a simple, single-instance deployment that requires message buffering, you can now use the built-in persistent queue instead of deploying and managing an external message queue such as Redis, RabbitMQ, or Apache Kafka. If your production deployment already uses a message queue like Kafka, you can continue to use it, and we'll continue to integrate seamlessly with those systems.

Keep in mind that the persistent queue is not replicated and works only with a single Logstash instance, which means you cannot share the queue across multiple instances. For use cases that require such distributed handling, we recommend continuing to use your favorite message queue product.

To enable this feature, set queue.type to persisted in your logstash.yml. For more configuration options and a detailed description, check the documentation for this feature. We have plenty of enhancements in the pipeline for this feature and would love your feedback as we iterate in upcoming releases.
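As a minimal sketch, enabling the queue is a one-line change in logstash.yml. The optional path.queue override shown in the comment is a standard Logstash path setting; check the documentation for your version before relying on it:

queue.type: persisted
# optionally, control where the queue data files are stored:
# path.queue: /var/lib/logstash/queue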

Slowlog

Ever wish you could find out what's taking so long in your Logstash pipeline? Continuing our 5.x theme of making Logstash easier to operate, we are introducing a new slowlog feature. In 5.0, we introduced the Monitoring APIs, which help you peek into costly operations at the thread level, but there was no easy way to see which operations were slow and which events triggered the slowness. In 5.1, filters can be configured to log event data and related context whenever execution exceeds a specified time threshold. These slowlogs are collected in a separate file called logstash-slowlog-plain-YYYY-MM-dd.log.
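As a sketch, the thresholds are set in logstash.yml, one per log level. The setting names and example values below follow the slowlog documentation, so double-check them against your version:

slowlog.threshold.warn: 2s
slowlog.threshold.info: 1s
slowlog.threshold.debug: 500ms
slowlog.threshold.trace: 100ms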

Date Filter: Nitro boost

We've had many reports in the past that the Date filter was unexpectedly slow when using multiple patterns. This filter lets you try one pattern and, if it fails, step through other patterns until one of them succeeds. The filter has been enhanced to handle failing patterns efficiently. While reworking the implementation to solve this problem, we were also able to deliver a general performance boost for all date processing configurations. The verdict? 2.5x faster across the board, and for some pattern sequences, a scorching 16x increase in throughput.
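For context, here is what a multi-pattern Date filter looks like; the field name and formats are placeholders for illustration:

filter {
  date {
    # patterns are tried in order until one matches
    match => [ "timestamp", "ISO8601", "yyyy-MM-dd HH:mm:ss", "UNIX" ]
  }
}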

Introducing the Truncate Filter

In some Logstash deployments, administrators have no control over the type and size of the events that Logstash processes. By default, Logstash starts with a small JVM heap size and does not need a big memory footprint. When a large event (in bytes) flows through Logstash, it can eventually consume the entire allocated heap, crashing Logstash. Truncate is a new filter that lets you truncate fields longer than a given byte length.
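As a sketch, a configuration that caps the message field at 1 MiB might look like the following; the option names are taken from the truncate filter documentation, and the field choice is just an example:

filter {
  truncate {
    # drop any bytes beyond 1 MiB from the message field
    fields => ["message"]
    length_bytes => 1048576
  }
}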

Kafka Enhancements

The Kafka input and output plugins now support the 0.10.1.0 release of Apache Kafka. Unfortunately, the 0.10.1.0 client is not backward compatible with the 0.10.0.1 broker, so we couldn't bundle this update with Logstash 5.1 (to avoid breaking existing upgrades from 5.0.x). If you need 0.10.1.0 support, you can install the latest plugin versions on 5.1 by running:

bin/logstash-plugin update logstash-input-kafka
bin/logstash-plugin update logstash-output-kafka

The bundled Kafka plugins also add support for Kerberos authentication.
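As a rough sketch, a Kerberos-enabled Kafka input could look like the following. The broker address, topic, and file paths are placeholders, and the security option names should be checked against the kafka input documentation for your plugin version:

input {
  kafka {
    bootstrap_servers => "kafka1:9092"           # placeholder broker address
    topics => ["logs"]                           # placeholder topic
    security_protocol => "SASL_PLAINTEXT"        # enable SASL/Kerberos
    sasl_kerberos_service_name => "kafka"
    jaas_path => "/etc/logstash/kafka_jaas.conf" # JAAS login configuration
    kerberos_config => "/etc/krb5.conf"          # Kerberos realm configuration
  }
}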

CEF Codec Improvements

This codec implements the ArcSight Common Event Format (CEF), letting you bring ArcSight data into Logstash and ship it to Elasticsearch and other outputs like S3. A new delimiter setting has been added so this codec can work with the TCP input (or any other stream-of-bytes input) in addition to single-message inputs like Kafka. We've also updated the implementation to use dictionary translation for abbreviated CEF field names, as listed in the CEF specification.
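For example, pairing the codec with the TCP input might look like this; the port is a placeholder, and the delimiter value assumes CRLF-separated CEF messages on the stream:

input {
  tcp {
    port => 5000                       # placeholder port
    # the new delimiter setting splits the byte stream into discrete CEF events
    codec => cef { delimiter => "\r\n" }
  }
}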

Feedback

We are super excited about this release of Logstash and look forward to your feedback. You can reach us on our forum, open issues in our GitHub repo, or find us on Twitter (@elastic). Happy 'stashing!