Today we release Logstash 6.4.0! This release delivers features that have been in demand for a long time while continuing the trend of improving core stability and performance.
Plugins have always been crucial to Logstash's success, so we've made several improvements such as introducing a read mode to the file input and rewriting the HTTP input internals to make it much faster. On the core side, Logstash has received over 25 pull requests' worth of Ruby-to-Java refactorings and is now also capable of detecting incorrect or ambiguous field reference notations.
Last but certainly not least, we've introduced the Azure Module (experimental feature): a solution to integrate your Azure activity logs and SQL diagnostic logs with the Elastic Stack. This addition brings into Logstash a new default input plugin to read data from Azure Event Hubs: the logstash-input-azure_event_hubs plugin.
Read on for a dive into these highlights and, for a complete list of changes, check the release notes.
Introducing the Logstash Azure Module (experimental)
This new module brings Azure operational monitoring, metrics and SQL activity into the Elastic Stack. It is now possible to monitor multiple facets of your Azure cloud environment, including:
- Infrastructure activity monitoring like service health, user activity and alerts
- SQL Database monitoring like storage utilization, wait times and performance of queries
All this information is presented in curated Kibana dashboards for easy navigation and exploration. While the data itself is consumed by Logstash through its new Azure Event Hubs input plugin, you won't have to create a Logstash pipeline to process the data, as all the necessary heavy lifting is done in the module, which you only have to configure through the logstash.yml configuration file. See the configuration section of the documentation to learn more about how to set up the Azure module.
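As a rough sketch, the module is enabled through the standard `modules` section of logstash.yml. The specific `var.*` settings below (consumer group, event hub names, connection strings) are illustrative placeholders, not the authoritative list — consult the Azure module documentation for the exact settings your deployment needs:

```yaml
# logstash.yml -- illustrative sketch, see the Azure module docs for exact settings
modules:
  - name: azure
    var.elasticsearch.hosts: "localhost:9200"
    var.kibana.host: "localhost:5601"
    # Placeholder Event Hubs settings:
    var.input.azure_event_hubs.consumer_group: "logstash"
    var.input.azure_event_hubs.event_hubs:
      - name: "insights-operational-logs"
```

With a configuration along these lines in place, starting Logstash with the module enabled sets up the Event Hubs input, the processing pipeline, and the Kibana dashboards without a hand-written pipeline.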
For more information about the Azure Module, please check the documentation page. This is an experimental feature and is subject to change, including breaks in backwards compatibility.
File Input now has a "read" mode
One of the most requested features for Logstash's file input was the ability to not just tail files, but also process files whose content is final and unchanged once they are discovered by the plugin. In the existing tail mode, the plugin would keep monitoring such files indefinitely, waiting for new content and consuming unnecessary resources.
The plugin can now be configured to work in "read" mode, treating an end-of-file (EOF) as a signal that the file won't receive more content and that the resources used to monitor it can be freed. Also, since we don't expect the file to be modified, we can process files that are compressed (GZIP only). Finally, knowing that a file has been completely processed allows the plugin to take an action on each individual file, such as deleting it or writing its full path to an append-only log file for external bookkeeping. For more information on tail vs. read mode and other features, please check the Logstash Input File documentation.
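A read-mode configuration stays close to a regular file input. A minimal sketch (the paths here are placeholders) might look like:

```
input {
  file {
    path => "/var/log/archive/*.log.gz"   # gzipped files can be read in this mode
    mode => "read"                        # default is "tail"
    file_completed_action => "log"        # or "delete" / "log_and_delete"
    file_completed_log_path => "/var/log/archive/completed.log"
  }
}
```

Here each file is read to EOF, released, and its full path appended to `completed.log` for external bookkeeping.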
Faster and more stable HTTP Input
The HTTP Input has been a very successful way to push data into the Elastic Stack, allowing Logstash to accept HTTP requests and pass the data along to Elasticsearch. The goal of its first implementation was to have a working HTTP server within Logstash with minimal effort. While this has been a successful plugin, as our users and customers increasingly rely on it, we found that delegating most of the work to an HTTP server library made it difficult to control certain aspects like backpressure, threading, memory consumption, and error handling. This led to a rewrite of the plugin's internals, while keeping the user-facing options and the schema of the produced events the same so as not to break backwards compatibility.
This rewrite has been shown to process HTTP requests up to ~20% faster, while also bringing the ability to configure the maximum number of pending requests and greater flexibility when setting up SSL/TLS, all of which are documented in the plugin's documentation page.
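For instance, a minimal pipeline that caps the request queue (the port and limits below are arbitrary example values) could look like:

```
input {
  http {
    port => 8080
    threads => 4                 # request-processing threads
    max_pending_requests => 200  # queue size before new requests are rejected
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```

Bounding the pending-request queue lets the plugin apply backpressure to clients instead of buffering requests without limit.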
On removing ambiguity from field references
As you know, in Logstash we use square brackets to reference fields in an event:
```
/tmp/logstash-6.4.0 % bin/logstash -i irb
2.3.0 :001 > e = LogStash::Event.new
 => #<LogStash::Event:0x43937937>
2.3.0 :002 > e.set("a", 1)
 => 1
2.3.0 :003 > e.get("[a]")
 => 1
```
The code responsible for parsing the field references has historically been very permissive, allowing for very ambiguous notations:
```
/tmp/logstash-6.4.0 % bin/logstash -i irb
2.3.0 :001 > e = LogStash::Event.new
 => #<LogStash::Event:0x43937937>
2.3.0 :002 > e.set("a", 1)
 => 1
2.3.0 :003 > e.get("[]][[[[a]][[")
 => 1
```
This is not a desirable feature and only introduces confusion, so we're working towards making field references behave more strictly, where examples like the one above raise an error. To accomplish this without breaking changes, Logstash now has a config.field_reference.parser option which can be set to one of:
LEGACY: Parse with the legacy parser, which is known to handle ambiguous and illegal-syntax in surprising ways; warnings will not be emitted.
COMPAT: Warn once for each distinct ambiguous or illegal input, but continue to expand field references with the legacy parser.
STRICT: Parse in a strict manner; when given ambiguous input, do not expand the reference.
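The parser is selected in logstash.yml; for example, to opt into the strict behavior ahead of 7.0:

```yaml
# logstash.yml
config.field_reference.parser: STRICT
```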
In the 6.x series, starting with 6.4, the parser is set to COMPAT, emitting warnings whenever an ambiguous field reference is used:
```
/tmp/logstash-6.4.0 % bin/logstash -i irb
2.3.0 :001 > e = LogStash::Event.new
 => #<LogStash::Event:0x43937937>
2.3.0 :002 > e.set("a", 1)
 => 1
2.3.0 :011 > e.get("[][[[[[a]]][[")
[2018-08-20T19:09:26,389][WARN ][org.logstash.FieldReference] Detected ambiguous Field Reference `[][[[[[a]]][[`, which we expanded to the path `[a]`; in a future release of Logstash, ambiguous Field References will not be expanded.
 => 1
```
Starting with Logstash 7.0, the option will default to STRICT, raising an exception when such a reference is encountered:
```
2.3.0 :009 > e.get("[][[[[[a]]][[")
RuntimeError: Invalid FieldReference: `[][[[[[a]]][[`
	from org/logstash/ext/JrubyEventExtLibrary.java:85:in `get'
```