August 1, 2017

Logstash Lines: Introducing a benchmarking tool for Logstash

By Suyog Rao

Welcome back to The Logstash Lines! In these weekly posts, we'll share the latest happenings in the world of Logstash and its ecosystem.

Benchmarking tool

Inspired by Rally's success (and design) for Elasticsearch, we've created a similar benchmarking tool for Logstash (LS). In the past, we built tools that collected performance stats for LS, but they weren't based on the monitoring APIs introduced in 5.x. The new benchmarking tool needs only a JVM, is based on APIs built into LS, and offers a richer set of functionality.

Users can run benchmarks on predefined configs and data sets, such as Apache. You can run a benchmark in several ways: against a particular LS distribution version (5.4, 5.6, etc.), against an installed LS, from a GitHub hash/tag, or on a branch. At the end of the run, the tool prints stats such as “throughput (events/sec)” and CPU usage to the terminal. It also has an option to ship performance stats directly to Elasticsearch. Internally, we plan to store nightly benchmark results to catch regressions.
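To give a rough idea of what "based on the monitoring APIs" can look like, here is a minimal Python sketch that derives events/sec from two samples of the node stats API. It is not the benchmarking tool's code: the default address localhost:9600, the /_node/stats/events endpoint, the "events.out" field, and the helper names are assumptions made for illustration.

```python
import json
import time
import urllib.request

# Assumption: Logstash's monitoring API is reachable on its default
# address and exposes cumulative event counters at this endpoint.
STATS_URL = "http://localhost:9600/_node/stats/events"


def fetch_events_out(url=STATS_URL):
    """Return the cumulative number of events emitted so far (hypothetical helper)."""
    with urllib.request.urlopen(url) as resp:
        stats = json.load(resp)
    # Assumption: the response contains an "events" object with an "out" counter.
    return stats["events"]["out"]


def throughput(events_before, events_after, elapsed_seconds):
    """Events per second between two samples of a cumulative counter."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return (events_after - events_before) / elapsed_seconds


def sample_throughput(interval_seconds=10.0, url=STATS_URL):
    """Sample the counter twice, interval_seconds apart, and return events/sec."""
    start = time.monotonic()
    before = fetch_events_out(url)
    time.sleep(interval_seconds)
    after = fetch_events_out(url)
    return throughput(before, after, time.monotonic() - start)
```

With a Logstash instance running locally, calling sample_throughput(10) would report the average output rate over roughly ten seconds. The arithmetic is kept in its own function so it can be checked without a live instance.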

This is just the beginning, though: there's plenty to do and polish up here. Next up, we want to add the ability to run benchmarks on custom configs and datasets, write detailed documentation, and more!

Fixes in master and 6.0

  • Dead Letter Queue input: When using the “seek by timestamp” feature while processing events in the DLQ, we found a bug where events could be duplicated or skipped. This has now been fixed (#7789).
  • Accessors now treat only ConvertedMap and ConvertedList as collections, that is, as non-scalar types.
  • Performance improvements across the board in core:
    • Metrics calculation: Using RubyArray.hash as the key for the metrics fast-lookup method, instead of the array itself, provides a 10% gain (#7772).
    • Optimized throughput when using the in-memory queue by replacing SynchronousQueue with ArrayBlockingQueue (#7690).
    • Correctly size pathcache backing map used in field references.
  • Elasticsearch Filter: Support ca_file setting when using https URI in hosts parameter (#58).
  • Fingerprint filter: Added option to automatically include all field names in fingerprint calculation.