May 16, 2017

Logstash Lines: Grok debugger, Dead letter queues

By Suyog Rao

Welcome back to The Logstash Lines! In these weekly posts, we'll share the latest happenings in the world of Logstash and its ecosystem.

Feature: Dead Letter Queue (master and 5.5)

The long-awaited dead letter queue (DLQ) feature has been merged to master and 5.5. It adds the ability to shunt poisoned or unsuccessful events in the running pipeline to a local file-based store. Users can then read these dead events using a DLQ input for further processing. This infrastructure is currently used by the ES output. Previously, an event that hit a mapping issue was logged and then dropped. Now, such events are redirected to the DLQ. Users can read them in another LS instance, drop the field that caused the mapping error, and re-index to ES. The DLQ input lets users plug into the entire LS pipeline framework, providing access to hundreds of filters and output destinations.
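
The flow described above can be sketched in a few lines of Python. This is a minimal illustration of the idea, not how Logstash implements the DLQ: the index_event function, the MappingError class, and the JSON-lines file layout are all hypothetical stand-ins.

```python
import json
import os
import tempfile


class MappingError(Exception):
    """Raised by the stand-in indexer when a field has the wrong type."""

    def __init__(self, field):
        super().__init__(f"mapping conflict on field {field!r}")
        self.field = field


def index_event(event):
    # Hypothetical stand-in for an ES index call: 'bytes' must be an integer.
    if "bytes" in event and not isinstance(event["bytes"], int):
        raise MappingError("bytes")
    return event


def run_pipeline(events, dlq_path):
    """Index events; append any that fail to a local file instead of dropping them."""
    indexed = []
    with open(dlq_path, "a", encoding="utf-8") as dlq:
        for event in events:
            try:
                indexed.append(index_event(event))
            except MappingError as err:
                entry = {"event": event, "reason": str(err), "field": err.field}
                dlq.write(json.dumps(entry) + "\n")
    return indexed


def reprocess_dlq(dlq_path):
    """Read dead events back, drop the offending field, and re-index them."""
    recovered = []
    with open(dlq_path, encoding="utf-8") as dlq:
        for line in dlq:
            entry = json.loads(line)
            event = entry["event"]
            event.pop(entry["field"], None)
            recovered.append(index_event(event))
    return recovered


dlq_file = os.path.join(tempfile.mkdtemp(), "dlq.jsonl")
events = [{"message": "ok", "bytes": 120}, {"message": "bad", "bytes": "n/a"}]
indexed = run_pipeline(events, dlq_file)
recovered = reprocess_dlq(dlq_file)
print(indexed)
print(recovered)
```

The second function plays the role of the separate instance that reads dead events, removes the field that caused the mapping error, and re-indexes.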

While this first version provides an immediate solution to the existing data loss in the ES output, we are evaluating other places in the pipeline that can use a DLQ. Once we ship multi-pipeline support, the DLQ pipeline will be able to coexist within the same production LS instance, so dead events can be reprocessed without running a separate instance.

Feature: Grok Debugger (master and 5.5)

An initial version of the Grok Debugger tool in Kibana has been merged. Users can debug or simulate field extraction using their own log lines and grok patterns, à la https://grokdebug.herokuapp.com/.
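
To give a feel for what "simulating field extraction" means, here is a rough Python sketch that expands %{NAME:field} references into named capture groups and runs them against a log line. It is not the debugger's code: the four-entry pattern table and the simulate helper are simplified assumptions, and real grok ships a far larger pattern library.

```python
import re

# A tiny, hypothetical pattern library; real grok ships many more patterns.
PATTERNS = {
    "IP": r"\d{1,3}(?:\.\d{1,3}){3}",
    "WORD": r"\w+",
    "NUMBER": r"\d+(?:\.\d+)?",
    "URIPATH": r"/\S*",
}

# Matches %{NAME} or %{NAME:field}.
GROK_REF = re.compile(r"%\{(\w+)(?::(\w+))?\}")


def grok_to_regex(expression):
    """Expand %{NAME:field} references into named capture groups."""

    def expand(match):
        name, field = match.group(1), match.group(2)
        body = PATTERNS[name]
        return f"(?P<{field}>{body})" if field else f"(?:{body})"

    return re.compile(GROK_REF.sub(expand, expression))


def simulate(expression, line):
    """Return the extracted fields, or None if the line does not match."""
    match = grok_to_regex(expression).search(line)
    return match.groupdict() if match else None


fields = simulate(
    "%{IP:client} %{WORD:method} %{URIPATH:request} %{NUMBER:bytes}",
    "55.3.244.1 GET /index.html 15824",
)
print(fields)
```

A line that does not match the pattern returns None, which is the case a debugger helps to track down.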

Infra: Dockerize and Jenkinize integration tests

Currently, Logstash's integration test suite (known internally as RATS, inspired by bats) provides a Ruby-based framework to stand up services such as ES, Kafka, and Filebeat and run rspec tests against them. We build tar artifacts for every pull request, untar them, and run tests against the various services. We will now use Docker for all the services, which should provide resource isolation and let us test against multiple versions easily. The tests will also move from Travis to our internal Jenkins. This change will let us run tests faster and SSH into the build machine if there are issues.

Other changes in master

  • Make the -e and -f flags mutually exclusive. Previously, -e concatenated the config string supplied on the CLI with configs read from the file source (Breaking).
  • Do not append stdin input and stdout output blocks to the pipeline when the -f option is used to source configs (Breaking).
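
As a generic illustration of mutually exclusive flags (Logstash's own CLI is not written this way), the sketch below uses Python's argparse. The program name, option destinations, and file path are made up for the example.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="pipeline-runner")  # hypothetical CLI name
    source = parser.add_mutually_exclusive_group()
    source.add_argument("-e", dest="config_string", help="inline config string")
    source.add_argument("-f", dest="config_path", help="path to config file(s)")
    return parser


def parse(argv):
    """Return parsed options, or None when argparse rejects the arguments."""
    try:
        return build_parser().parse_args(argv)
    except SystemExit:
        # argparse reports the conflict on stderr and exits; treat that as a rejection.
        return None


both = parse(["-e", "input { stdin {} }", "-f", "/etc/pipeline.conf"])
only_file = parse(["-f", "/etc/pipeline.conf"])
print(both, only_file.config_path)
```

Supplying both flags is rejected outright rather than silently merging the two config sources.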

Other changes in 5.5

  • We now support environment variables in logstash.yml. We already supported env variables in the pipeline config; this extends that support to the settings file. The change brings Logstash in line with the behavior of Beats and ES. It also helps the Docker use case, which relies heavily on env variables for configuration.
  • Perf improvement in persistent queue - stop allocating byte[] to compute CRC32.
  • Tests: Fix portability issues with Kafka broker IPv4 and log path creation.
  • Kafka: Added support for Kafka 0.10.2.1 client and broker. This change makes LS backwards-compatible with older Kafka brokers.
  • Grok: Fixed an issue where a subdirectory under the patterns directory could crash Logstash at startup.
  • User agent: Fixed an issue where a manually specified YAML file path caused LS to error out.
  • CEF Codec: Added an option to keep the original, unparsed data in the event.
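
The environment variable support in logstash.yml mentioned in the 5.5 list can be pictured with a short Python sketch of ${VAR} and ${VAR:default} substitution. This is only an approximation of the behavior, not Logstash's code: the variable names NODE_NAME and WORKERS and the substitute_env helper are assumptions made for illustration.

```python
import re

# Matches ${VAR} or ${VAR:default}.
ENV_REF = re.compile(r"\$\{(\w+)(?::([^}]*))?\}")


def substitute_env(text, env):
    """Replace ${VAR} and ${VAR:default} references using the given mapping."""

    def resolve(match):
        name, default = match.group(1), match.group(2)
        if name in env:
            return env[name]
        if default is not None:
            return default
        raise KeyError(f"environment variable {name} is not set and has no default")

    return ENV_REF.sub(resolve, text)


# Hypothetical settings content; pass os.environ instead of a dict in real use.
settings = "node.name: ${NODE_NAME}\npipeline.workers: ${WORKERS:2}\n"
rendered = substitute_env(settings, {"NODE_NAME": "ls-docker-1"})
print(rendered)
```

In a Docker setup, the same settings file can then be reused across containers, with only the environment changing between them.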