30 August 2017

Logstash Lines: Java execution for filters/outputs

By Suyog Rao

Welcome back to The Logstash Lines! In these weekly posts, we'll share the latest happenings in the world of Logstash and its ecosystem.

Filters/Output execution in Java (In-progress, 6.1)

In 6.0, the LS engine can read existing LS configurations and convert them to a graph representation, which is used by the pipeline visualization. We're now moving on to the second phase of this project, where the graph model is used to execute the filter and output sections. Previously, the filter and output phases were consolidated into one big Ruby function and executed. Now all of this will happen in Java. Here's a good way to summarize the evolution of LS execution:

5.x and before:
[Filter/Output Config] → [AST] → [Evaluate Ruby code and execute]

6.0:
[Filter/Output Config] → [Graph representation] → [Read only, for pipeline viewer]
In parallel,
[Filter/Output Config] → [AST] → [Evaluate Ruby code and execute]

6.1:
[Filter/Output Config] → [Graph LIR] → [Java execution]

The other part of this project is to execute the conditionals in the config natively in Java.
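To make the idea of executing a graph with native conditionals more concrete, here is a minimal sketch in Java. It is not the Logstash implementation: the `Vertex`, `IfVertex`, `AddFieldVertex`, and `OutputVertex` types, the `status` and `severity` fields, and the `alerts`/`archive` output names are all hypothetical, chosen only to illustrate how a conditional can become a plain Java branch between graph vertices instead of generated Ruby code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

public class GraphPipelineSketch {

    // Hypothetical vertex type for this sketch; not a Logstash class.
    interface Vertex {
        void process(Map<String, Object> event, List<String> sink);
    }

    // A conditional from the config, evaluated as an ordinary Java branch.
    static final class IfVertex implements Vertex {
        private final Predicate<Map<String, Object>> condition;
        private final Vertex whenTrue;
        private final Vertex whenFalse;

        IfVertex(Predicate<Map<String, Object>> condition, Vertex whenTrue, Vertex whenFalse) {
            this.condition = condition;
            this.whenTrue = whenTrue;
            this.whenFalse = whenFalse;
        }

        @Override
        public void process(Map<String, Object> event, List<String> sink) {
            (condition.test(event) ? whenTrue : whenFalse).process(event, sink);
        }
    }

    // A stand-in for a filter: sets one field, then passes the event on.
    static final class AddFieldVertex implements Vertex {
        private final String key;
        private final Object value;
        private final Vertex next;

        AddFieldVertex(String key, Object value, Vertex next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }

        @Override
        public void process(Map<String, Object> event, List<String> sink) {
            event.put(key, value);
            next.process(event, sink);
        }
    }

    // A stand-in for an output: records which output received the event.
    static final class OutputVertex implements Vertex {
        private final String name;

        OutputVertex(String name) {
            this.name = name;
        }

        @Override
        public void process(Map<String, Object> event, List<String> sink) {
            sink.add(name + ":" + event.get("message"));
        }
    }

    // Roughly: if [status] >= 500 { add severity; send to alerts } else { send to archive }
    static Vertex buildGraph() {
        Vertex alerts = new OutputVertex("alerts");
        Vertex archive = new OutputVertex("archive");
        Vertex tagError = new AddFieldVertex("severity", "error", alerts);
        return new IfVertex(e -> ((Integer) e.get("status")) >= 500, tagError, archive);
    }

    static Map<String, Object> event(int status, String message) {
        Map<String, Object> e = new HashMap<>();
        e.put("status", status);
        e.put("message", message);
        return e;
    }

    public static void main(String[] args) {
        Vertex root = buildGraph();
        List<String> sink = new ArrayList<>();
        root.process(event(503, "boom"), sink);
        root.process(event(200, "ok"), sink);
        System.out.println(sink); // prints [alerts:boom, archive:ok]
    }
}
```

In this sketch the condition is an ordinary typed Java predicate, so it can be unit tested on its own and is visible to the JVM's optimizer, which mirrors the benefits this work is aiming for.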

Benefits: All this work will directly benefit users with complex configurations that contain plenty of conditionals. Specifically, this will add:
1. Type safety for filters and conditionals
2. Performance improvement because we cross the JRuby/Ruby boundary less often.
3. Performance improvement because the JVM can now inline conditional execution.
4. Better unit testing for conditionals.
5. Better reporting of errors in conditionals (previously left to the Ruby interpreter and exposed to the user).

In early testing, we've seen a ~40% throughput improvement when processing an Apache pipeline.

Changes in master

  • Performance improvements: Moved the memory queue drain loop fully to Java

Changes in 5.6

  • Fixed: pipeline.events.in shows 0 when there are multiple inputs where one has 0 events flowing (#8011)
  • Added missing settings to the multi-pipeline settings whitelist to allow DLQ to be used with multiple pipelines (#8069)
  • HTTP Output: Added a new option `http_compression` for sending compressed payload with the `Content-Encoding: gzip` header to the configured http endpoint (#63)