Logstash Lines: Persistent Queue fixes, a new filter

Welcome back to The Logstash Lines! In these weekly posts, we'll share the latest happenings in the world of Logstash and its ecosystem.

Changes in 5.3.0

  • Added a bin/ruby command-line utility for launching Ruby scripts during development. It runs the script in the Logstash context, which includes the Logstash environment settings, JRuby flags and more.
  • When you attempt to shut down Logstash while it is still processing remaining events, or is stalled (wedged I/O etc.), a detailed shutdown report is logged. This feature regressed when we introduced the Log4j2 changes in 5.0 and is now fixed.
  • $LS_HOME/bin/system-install creates a custom startup script for Logstash and reads a file called startup.options. We fixed a bug so that environment variables defined in startup.options are available to the launched Logstash process.
  • Added support for multiple Elasticsearch hosts in the X-Pack monitoring setting xpack.monitoring.elasticsearch.url. This setting now accepts an array.
  • Elasticsearch Input/Filter: Upgraded the es-ruby client to pick up content-type fixes in preparation for elastic/elasticsearch#22691.

Persistent Queues: A reminder to our readers that Persistent Queues are still in beta. Our plan is to make this feature GA in 5.4. We've added a few important enhancements in 5.3 to ensure data integrity and improve usability:

  • The queue uses two separate files internally: a data file that persists user data flowing through Logstash, and a checkpoint file that tracks state, i.e. how far along in the queue we've processed messages. In 5.3, we added a recovery process that recovers data which was written to the data file but not yet checkpointed. Consider a situation where the input has written a batch of data to the queue, but Logstash crashes before writing to the checkpoint file. On startup, Logstash now scans for data in the queue that lies ahead of the checkpoint location and re-processes it.
  • Added exclusive access to the persistent queue defined in the path.queue setting. Using a file lock, we ensure that only a single Logstash instance can mutate the queue at that path. Without this fix, multiple instances could be started with the same path.queue and step on each other.
  • A user reported that the config reload feature did not work when Persistent Queues were enabled. This bug was related to how we swap pipelines when a config change is introduced. In 5.3, the reload sequence has been changed to reliably shut down the first pipeline before a new one is started with the same settings. The addition of locks described above also helps here by making sure multiple pipelines aren't concurrently modifying the queue.
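
To make the recovery and locking ideas above concrete, here is a minimal Python sketch. It is not the Logstash implementation: the ToyQueue class, the length-prefixed record layout, the checkpoint stored as a plain record count and the ".lock" file name are all assumptions made up for illustration. It also relies on fcntl, so it only runs on Unix-like systems.

```python
import fcntl
import os
import struct

# Assumed record layout: 4-byte big-endian length, then the payload.
RECORD_HEADER = struct.Struct(">I")


class ToyQueue:
    """Hypothetical two-file queue: a data file plus a checkpoint file."""

    def __init__(self, path):
        self.data_path = os.path.join(path, "data")
        self.checkpoint_path = os.path.join(path, "checkpoint")
        # Exclusive, non-blocking lock: a second opener of the same
        # directory fails immediately instead of sharing the queue.
        self._lock_file = open(os.path.join(path, ".lock"), "w")
        fcntl.flock(self._lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)

    def write(self, payload):
        """Append one record to the data file."""
        with open(self.data_path, "ab") as f:
            f.write(RECORD_HEADER.pack(len(payload)) + payload)

    def checkpoint(self, count):
        """Record that the first `count` records are accounted for."""
        tmp = self.checkpoint_path + ".tmp"
        with open(tmp, "w") as f:
            f.write(str(count))
        os.replace(tmp, self.checkpoint_path)  # atomic swap

    def _read_checkpoint(self):
        try:
            with open(self.checkpoint_path) as f:
                return int(f.read())
        except FileNotFoundError:
            return 0

    def recover(self):
        """Return records present in the data file but past the checkpoint."""
        records = []
        if not os.path.exists(self.data_path):
            return records
        with open(self.data_path, "rb") as f:
            while True:
                header = f.read(RECORD_HEADER.size)
                if len(header) < RECORD_HEADER.size:
                    break  # end of file, or a torn header
                (length,) = RECORD_HEADER.unpack(header)
                payload = f.read(length)
                if len(payload) < length:
                    break  # torn write: ignore the partial record
                records.append(payload)
        return records[self._read_checkpoint():]

    def close(self):
        fcntl.flock(self._lock_file, fcntl.LOCK_UN)
        self._lock_file.close()
```

In this sketch, writing three records, checkpointing after the second and then calling recover() returns only the third record, which mirrors the crash scenario described above. Opening a second ToyQueue on the same directory while the first is still open raises an error instead of letting two writers interleave.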

New Plugin:

Logstash Age Filter: A simple filter that calculates an event's age by subtracting the event timestamp from the current time. It can then be combined with the drop filter to drop events whose age exceeds a certain threshold. Thanks to Joshua Spence for the contribution.
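
The age calculation and the drop-on-threshold pattern can be sketched in a few lines of Python. This only illustrates the arithmetic, not the plugin's code or configuration: the function names, the dict-shaped events and the use of an "@timestamp" key holding a timezone-aware datetime are assumptions for the example.

```python
from datetime import datetime, timezone


def event_age_seconds(event_timestamp, now=None):
    """Age = current time minus event timestamp, in seconds."""
    now = now or datetime.now(timezone.utc)
    return (now - event_timestamp).total_seconds()


def drop_aged(events, max_age_seconds, now=None):
    """Keep only events whose age does not exceed the threshold."""
    now = now or datetime.now(timezone.utc)
    return [
        event
        for event in events
        if event_age_seconds(event["@timestamp"], now) <= max_age_seconds
    ]
```

Passing an explicit now value keeps the calculation deterministic, which is handy when testing a threshold such as "drop anything older than one day" (max_age_seconds=86400).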