We’re happy to announce that the new Java execution engine in Logstash has reached the production candidate stage. It features faster performance, reduced memory usage, and lower config startup and reload times. And you can use it now in Logstash 6.3.0 with the
Logstash is built with JRuby and Java, a combination that pairs the elegance and expressiveness of Ruby with the maturity and performance of the Java Virtual Machine. However, mixing the Ruby and Java ecosystems must be done with care in performance-sensitive scenarios. The Java execution engine is the first major deliverable in a series of projects underway to optimize the interaction between Ruby and Java in Logstash, with the goal of providing a first-class experience for Logstash plugin developers who wish to use either Ruby or Java.
Here are some of the performance improvements we’ve seen with the new Java execution engine (higher bars are better):
As always, no single benchmark can fully characterize the performance of Logstash because different workloads can have very different performance profiles. The two different scenarios above are intended to model workloads at opposite ends of the plugin resource demand spectrum. In the Apache log parsing scenario, reading the Apache logs and filtering them through grok and other filters requires lots of CPU and IO resources, so the time spent in plugins tends to be the dominating factor in the overall throughput of the pipeline. In the generated events scenario, very little time is spent in plugins, so the efficiency of the execution engine is the dominating factor in the overall throughput of the pipeline.
The two big takeaways from the performance charts above are:
- Across-the-board performance in both execution engines improved significantly between 6.0 and 6.1, when the alpha version of the Java execution engine was completed. You should upgrade if you’re on 6.0 or an earlier release!
- Both the Java and Ruby execution engines have shown progressive performance improvements from the first version in 6.1 through the current release in 6.3.
There are two aspects of the Java execution engine that contribute to these performance improvements.
The first factor is the all-new config parsing and bytecode generation. The original Ruby execution engine uses Treetop to parse the Logstash config files and convert them to executable Ruby source code, which JRuby then interprets and executes at runtime. The Java execution engine replaces the Ruby source code generation phase with compilation directly to JVM bytecode using Janino. Some execution optimizations can be applied at this stage, and the startup and reload times for some large configurations have been reduced by as much as 20x. These improvements apply only to workloads running in the Java execution engine.
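To make the idea of compiling a config to bytecode more concrete, here is a minimal sketch of the general technique: generate Java source for a single config-style condition, compile it to JVM bytecode at runtime, then load and call it. This is not Logstash's code, and it does not use Janino. It uses the JDK's built-in `javax.tools` compiler as a stand-in, so it needs a full JDK to run. The names `ConditionCompiler`, `generateSource`, `compile`, and `IsApache`, and the modeling of an event as a `Map<String, Object>`, are all assumptions made for illustration.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.function.Predicate;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

public class ConditionCompiler {

    // Hypothetical helper: turns a check like `[field] == "value"` into Java source.
    // No escaping is done, so field and value must not contain quotes or backslashes.
    public static String generateSource(String className, String field, String value) {
        return "import java.util.Map;\n"
             + "import java.util.function.Predicate;\n"
             + "public class " + className + " implements Predicate<Map<String, Object>> {\n"
             + "    public boolean test(Map<String, Object> event) {\n"
             + "        return \"" + value + "\".equals(event.get(\"" + field + "\"));\n"
             + "    }\n"
             + "}\n";
    }

    // Compiles the generated source to a .class file in a temp directory,
    // loads the class, and returns an instance ready to be called per event.
    @SuppressWarnings("unchecked")
    public static Predicate<Map<String, Object>> compile(String className, String field, String value) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system Java compiler found; a JDK is required");
        }
        try {
            Path dir = Files.createTempDirectory("generated");
            Path src = dir.resolve(className + ".java");
            Files.write(src, generateSource(className, field, value).getBytes(StandardCharsets.UTF_8));

            // The compilation cost is paid once here, not on every event.
            int status = compiler.run(null, null, null, "-d", dir.toString(), src.toString());
            if (status != 0) {
                throw new IllegalStateException("Compilation of generated source failed");
            }

            // The loader is intentionally left open for the lifetime of the sketch.
            URLClassLoader loader = new URLClassLoader(new URL[] { dir.toUri().toURL() });
            Class<?> cls = loader.loadClass(className);
            return (Predicate<Map<String, Object>>) cls.getDeclaredConstructor().newInstance();
        } catch (IOException | ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Predicate<Map<String, Object>> isApache = compile("IsApache", "type", "apache");
        System.out.println(isApache.test(Map.of("type", "apache")));
        System.out.println(isApache.test(Map.of("type", "syslog")));
    }
}
```

Once compiled, the condition runs as ordinary JVM bytecode that the JIT can optimize like any other Java class. A compiler embedded in the application, such as Janino, can do the equivalent work in memory without writing temporary files.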
The second factor is optimizing the points of contact between Ruby and Java. Due to the dynamic nature of Ruby, calling from JRuby code into Java code carries a performance cost, and that cost can be significant when the calls are made in tight loops or other performance-sensitive areas. We made a concerted effort to minimize those calls in the pipeline execution engine. Where such calls were unavoidable, we used JRuby extension classes, which also bypass that performance penalty. Although the Ruby execution engine is slated for eventual removal, one nice thing about this work is that many of these performance improvements could be applied to both the Ruby and Java execution engines, so you can benefit from them now.
The Java execution engine also positions us to work more efficiently and deliver new features such as our upcoming pure Java plugin API for Logstash. We expect that, along with these new features, we’ll continue to deliver incremental performance improvements as well. We encourage you to give it a try in Logstash 6.3, see how it works for you, and let us know if you encounter any problems!