Logstash 2.0.0 beta2 released
We are pleased to announce the beta2 release of Logstash 2.0.0. Please note that there are some breaking changes in this release, and we encourage you to read the entire post.
IMPORTANT: This is a beta release and is intended for testing purposes only. There is no guarantee that Logstash 2.0.0-beta2 will be compatible with Logstash 2.0.0 GA.
Breaking Changes in 2.0.0-beta2
Releasing a major version of Logstash creates the opportunity to remove features and configuration options that have been deprecated in previous releases.
The beta2 release changes the default value of the filter_workers setting, which controls the number of filter workers. This release also removes many plugin configuration options that are now obsolete. Both changes are detailed below.
Obsolete Configuration Options
Until the introduction of conditionals, the only way to selectively apply filters and outputs to some events was to set the type, tags and exclude_tags options.
These configuration options have been deprecated for some time now, and this release either removes them or marks them with the new obsolete tag. Using any of them in your configuration file now results in an error.
Feedback when obsolete config options are used
To make the removal of settings less aggressive to the user, an option to tag a plugin configuration setting as obsolete has been introduced.
After a feature has been deprecated for some time, it will then be marked as obsolete. If an obsolete setting is used, a pre-configured message is presented to the user informing them of the obsolete status of the option, then Logstash terminates immediately.
How does this look in real life? Imagine that a plugin has an option called :foo and we wish to remove it. Here is a potential timeline of plugin releases:
v1.0.0 => plugin is shipped, includes the feature and option :foo
config :foo, :validate => :string
v1.1.0 => minor version release, deprecates :foo
config :foo, :validate => :string, :deprecated => "this setting is going to be deprecated, use X instead"
v2.0.0 => major version, removes the code related to :foo and marks it as obsolete
config :foo, :validate => :string, :obsolete => "this setting is no longer available, use X instead"
v2.1.0 => minor version, removes the :foo declaration entirely
As you can see, the :obsolete tag softens the user experience for a removed feature, making the user aware that the feature is gone and presenting the recommended alternative!
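To make the difference between the two tags concrete, here is a minimal, self-contained sketch of how a config check could treat them. It is not Logstash's actual implementation: SketchPlugin, ObsoleteSettingError and the check method are hypothetical names invented for illustration, and only the config declarations mirror the timeline above.

```ruby
# Hypothetical stand-in for a plugin config DSL; not Logstash's real code.
class ObsoleteSettingError < StandardError; end

class SketchPlugin
  # Registry of declared options, keyed by option name.
  def self.options
    @options ||= {}
  end

  # Declare an option, optionally tagged :deprecated or :obsolete with a message.
  def self.config(name, opts = {})
    options[name] = opts
  end

  config :foo, :validate => :string, :obsolete => "this setting is no longer available, use X instead"
  config :bar, :validate => :string, :deprecated => "use X instead"
  config :baz, :validate => :string

  # Returns deprecation warnings; raises as soon as an obsolete option is used.
  def self.check(user_settings)
    warnings = []
    user_settings.each_key do |name|
      opts = options.fetch(name) { raise ArgumentError, "unknown setting #{name}" }
      if opts[:obsolete]
        raise ObsoleteSettingError, "#{name} is obsolete: #{opts[:obsolete]}"
      elsif opts[:deprecated]
        warnings << "#{name} is deprecated: #{opts[:deprecated]}"
      end
    end
    warnings
  end
end
```

In this sketch a deprecated option only produces a warning and lets the caller carry on, while an obsolete one stops the caller with the author's message, which is the behaviour described above.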
A Better Shutdown Strategy
This release improves shutdown handling in Logstash and its plugins. Up to and including 1.5.x, when a pipeline shutdown is initiated, either by SIGTERM or SIGINT, the following events occur inside Logstash:
1. The SIG* signal is trapped in the pipeline thread.
2. Pipeline#shutdown is called from the signal handler.
3. All input plugin threads are injected with an exception using Ruby's Thread#raise.
4. Normally, the plugins exit their run method, at which point the pipeline thread calls the input teardown methods.
5. After input plugins are terminated, filters are injected with a special event to trigger their shutdown.
6. After filter plugins are terminated, outputs are also injected with a special event to trigger their shutdown.
7. When outputs terminate, the pipeline shuts down and Logstash quits.
The way input plugins terminate in step 3 is problematic, since raising exceptions on the input plugin threads from the outside is unpredictable. Calling Thread#raise means that any code executing in that thread must deal with the exception or terminate.
In the context of input plugins, the exception can happen during the execution of code from third party libraries that many input plugins use.
Being unable to predict how that code behaves leads to undesirable outcomes. A plugin may exit normally, exit abruptly and lose buffered data, get stuck in an inconsistent state, or exhibit some other unknown behavior.
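The "lose buffered data" case can be reproduced in a few lines of plain Ruby. The sketch below is an illustration only: demo_thread_raise is a hypothetical helper, the bare sleep stands in for a blocking call inside a third-party library, and the choice of RuntimeError is an assumption rather than what the pipeline actually raised.

```ruby
require "thread"

# Returns what the worker had buffered and what it managed to flush
# after an exception was injected from another thread.
def demo_thread_raise
  buffered = []
  flushed = []
  ready = Queue.new

  worker = Thread.new do
    begin
      3.times { |i| buffered << "event #{i}" }
      ready << :buffered
      sleep                      # stand-in for a blocking third-party call
      flushed.concat(buffered)   # the flush a clean shutdown would reach
    rescue RuntimeError
      # The "library" swallows the error, so the flush above never runs.
    end
  end

  ready.pop                               # wait until the worker holds buffered data
  worker.raise(RuntimeError, "shutdown")  # exception injected from the outside
  worker.join
  { :buffered => buffered, :flushed => flushed }
end
```

The worker thread exits without an error, yet the three buffered events are never flushed. Nothing in the thread had a chance to decide how to shut down.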
The solution
To make input plugin shutdown more reliable, a strategy was proposed (#3210) to avoid Thread#raise and instead signal the plugin, in a thread-safe way, that a shutdown sequence has started. This delegates to each input plugin the responsibility of deciding how and when to shut down.
This proposal was implemented at the core level, so that all input plugins now have three methods for shutdown purposes: stop, stop? and close.
- stop is meant to be called from outside the plugin thread (from the pipeline thread) and its job is to ask/signal the plugin to stop.
- stop? returns true if the stop method was called and can be used within the plugin to verify if a stop was requested.
- close is called once when the plugin is stopped to perform any final bookkeeping.
The way to know a plugin is stopped is simply by waiting for the plugin run method to return. When the run method exits and the plugin thread exits, only then will the close method be called, and that only once.
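As a rough illustration of this contract, here is a self-contained sketch that does not depend on Logstash. SketchInput and run_and_shut_down are hypothetical stand-ins: the real base class lives in Logstash core and its internals may differ; only the method names stop, stop?, close and run come from the description above.

```ruby
require "thread"

# Hypothetical stand-in for an input plugin that follows the new contract.
class SketchInput
  attr_reader :closed_count

  def initialize
    @stop_lock = Mutex.new
    @stop_requested = false
    @closed_count = 0
  end

  # Called from the pipeline thread to request a shutdown.
  def stop
    @stop_lock.synchronize { @stop_requested = true }
  end

  # Checked from inside the plugin thread.
  def stop?
    @stop_lock.synchronize { @stop_requested }
  end

  # Final bookkeeping, called after run has returned.
  def close
    @closed_count += 1
  end

  # Emit events until a stop is requested, then return normally.
  def run(queue)
    n = 0
    until stop?
      queue << "event #{n += 1}"
      sleep 0.01
    end
  end
end

# Plays the role of the pipeline: start the input, signal it, wait, then close.
def run_and_shut_down(input)
  queue = Queue.new
  thread = Thread.new { input.run(queue) }
  first_event = queue.pop   # wait until the input is producing events
  input.stop                # signal the plugin instead of raising into it
  thread.join               # run returning is how we know it has stopped
  input.close               # called once, only after the thread has exited
  first_event
end
```

The plugin decides where in its loop it is safe to check stop?, so it is never interrupted in the middle of handing an event to the queue.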
API Cleanup
Making this change also created an opportunity to review the rest of the shutdown API between the plugins and the pipeline. Several methods in the Plugin Base class were found to be unnecessary and were removed: shutdown, finished, finished?, running? and terminating?.
The teardown method is renamed to close and retains the responsibility of post-termination bookkeeping, as described above.
Plugin Developers
This change of shutdown strategy implies that all input plugins must take ownership of checking when it's time to shut down, either by checking stop? or by implementing their own stop method. Also, filters and outputs that called the obsolete methods noted above needed to be changed.
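Polling stop? works when the run loop comes around regularly, but an input that blocks inside a client call never gets to check it. One way to handle that case, sketched below under assumptions, is a custom stop that unblocks the call. BlockingSketchInput, SHUTDOWN and drain_then_stop are hypothetical names, and the Queue stands in for whatever blocking client a real plugin would wrap.

```ruby
require "thread"

# Hypothetical input whose run method blocks on a source until data arrives.
class BlockingSketchInput
  SHUTDOWN = Object.new   # sentinel used to wake up the blocked reader

  def initialize(source)
    @source = source      # a Queue standing in for a blocking client
  end

  # Custom stop: unblock the pending read instead of waiting for a poll.
  def stop
    @source << SHUTDOWN
  end

  # Forward items until the shutdown sentinel is read, then return normally.
  def run(queue)
    loop do
      item = @source.pop  # blocks while the source is empty
      break if item.equal?(SHUTDOWN)
      queue << item
    end
  end
end

# Feeds the input, asks it to stop, and returns everything it forwarded.
def drain_then_stop(items)
  source = Queue.new
  items.each { |item| source << item }
  input = BlockingSketchInput.new(source)
  out = Queue.new
  thread = Thread.new { input.run(out) }
  input.stop
  thread.join
  Array.new(out.size) { out.pop }
end
```

Because the sentinel is queued behind the pending items, everything received before the stop request is still forwarded before run returns.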
Changing so many plugins required some coordinated effort, but now the pipeline/plugin contract is leaner and input plugins can perform a much safer and more predictable shutdown!
To help new plugin developers, the example plugin repositories (input, filter, output, codec) have been updated to reflect and demonstrate the new shutdown contract.
Setting better defaults
A major release is also a good time to revisit the default values for configuration settings used in Logstash.
Default value for filter_workers
Up to and including 1.5.x, the default value for the filter_workers setting was 1, which severely limited pipeline performance: with a single filter worker, only one event was filtered at a time, since filters are evaluated in sequence.
Now, out of the box, the default value of the filter_workers setting is half the number of CPU cores on the machine. Increasing the number of workers provides parallelism in filter execution, which is crucial for heavier processing such as complex grok patterns or the useragent filter.
Note that, as before, this setting can be changed with the -w flag, e.g. bin/logstash -f test.config -w 1.
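For a rough idea of what the new default works out to on a given machine, the calculation can be sketched in plain Ruby. This is an assumption-laden illustration, not Logstash's code: default_filter_workers is a hypothetical helper, and both the integer division and the floor of one worker are guesses, since the rounding rule is not stated here.

```ruby
require "etc"

# Assumption: round down, but never go below one worker.
def default_filter_workers(cores = Etc.nprocessors)
  [cores / 2, 1].max
end
```

Under these assumptions an 8-core machine would start with 4 filter workers, and a single-core machine would keep 1.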
Showing default settings at start-up
Users should be able to see which values a running instance is actually using, so Logstash now reports at start-up the values in effect for default settings, namely filter_workers.
Beta Feedback
Give 2.0.0-beta2 a spin! If you find any bugs, please report them as issues either on the logstash core repo or on the appropriate logstash-plugins repository. You can also head over to our forum. We're excited to release Logstash 2.0, but can't do it without your help!