Product release

Logstash 2.0.0 beta2 released

We are pleased to announce the beta2 release for Logstash 2.0.0. Please note that there are some breaking changes in this release and we encourage you to read the entire blog.

IMPORTANT: This is a beta release and is intended for testing purposes only. There is no guarantee that Logstash 2.0.0-beta2 will be compatible with Logstash 2.0.0 GA.

Breaking Changes in 2.0.0-beta2

Releasing a major version of Logstash creates the opportunity to remove features and configuration options that have been deprecated in previous releases. The beta2 release changes the default value of the filter_workers setting which is used to control the number of filter workers. This release also removes many plugin configuration options that are now obsolete. They are detailed below.

Obsolete Configuration Options

Until conditionals were introduced, the only way to apply filters and outputs selectively to some events was to set the type, tags and exclude_tags options. These options have been deprecated for some time, and in this release they are either removed outright or marked with the new obsolete tag. Either way, using them in your configuration file now results in an error.

Feedback when obsolete config options are used

To make the removal of settings less abrupt for the user, plugin configuration settings can now be tagged as obsolete.

After a feature has been deprecated for some time, it will then be marked as obsolete. If an obsolete setting is used, a pre-configured message is presented to the user informing them of the obsolete status of the option, then Logstash terminates immediately.

How does this look in real life? Imagine that a plugin has an option called :foo and we wish to remove it. Here is a potential timeline of plugin releases:

v1.0.0 => plugin is shipped, includes the feature and option :foo

config :foo, :validate => :string

v1.1.0 => minor version release, deprecates :foo

config :foo, :validate => :string, :deprecated => "this setting is deprecated and will be removed, use X instead"

v2.0.0 => major version, removes the code related to :foo and marks it as obsolete

config :foo, :validate => :string, :obsolete => "this setting is no longer available, use X instead"

v2.1.0 => minor version, removes :foo

As you can see, the :obsolete tag softens the user experience for a removed feature, making the user aware that the feature is gone and presenting the recommended alternative!
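The timeline above can be mimicked with a small, self-contained sketch. The SketchPlugin class, its config method and ObsoleteSettingError are hypothetical stand-ins written for illustration, not Logstash's actual implementation. Raising an error stands in for Logstash printing the message and terminating.

```ruby
# Hypothetical stand-in for a plugin config DSL; not Logstash's real code.
class ObsoleteSettingError < StandardError; end

class SketchPlugin
  # Declared options, stored per plugin class.
  def self.options
    @options ||= {}
  end

  # Declare an option, optionally flagged :deprecated or :obsolete with a message.
  def self.config(name, opts = {})
    options[name] = opts
  end

  attr_reader :settings, :warnings

  def initialize(params = {})
    @settings = {}
    @warnings = []
    params.each do |name, value|
      opts = self.class.options[name]
      raise ArgumentError, "unknown setting #{name.inspect}" if opts.nil?

      # Obsolete: refuse to start and point the user at the alternative.
      if opts[:obsolete]
        raise ObsoleteSettingError, "#{name.inspect} is obsolete: #{opts[:obsolete]}"
      end

      # Deprecated: keep working, but record a warning.
      @warnings << "#{name.inspect} is deprecated: #{opts[:deprecated]}" if opts[:deprecated]
      @settings[name] = value
    end
  end
end

# v1.1.0 of the imaginary plugin: :foo still works but warns.
class PluginV1_1 < SketchPlugin
  config :foo, :validate => :string, :deprecated => "use X instead"
end

# v2.0.0 of the imaginary plugin: :foo is rejected with an explanation.
class PluginV2_0 < SketchPlugin
  config :foo, :validate => :string, :obsolete => "this setting is no longer available, use X instead"
end
```

In this sketch, PluginV1_1.new(:foo => "bar") succeeds and records one warning, while PluginV2_0.new(:foo => "bar") raises ObsoleteSettingError with the replacement hint in its message.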

A Better Shutdown Strategy

This release improves shutdown handling in Logstash and its plugins. Up to and including 1.5.x, when a pipeline shutdown is initiated, either by SIGTERM or SIGINT, the following events occur inside Logstash:

  1. SIG* signal is trapped in the pipeline thread
  2. Pipeline#shutdown is called from the signal handler
  3. An exception is injected into every input plugin thread using Ruby's Thread#raise
  4. Normally, the plugins then exit their run method, at which point the pipeline thread calls the input teardown methods
  5. After input plugins are terminated, filters are injected with a special event to trigger their shutdown
  6. After filter plugins are terminated, outputs are also injected with a special event to trigger their shutdown
  7. When outputs terminate, the pipeline shuts down and Logstash quits.

The way input plugins terminate in step 3 is problematic, since raising exceptions on the input plugin threads from the outside is unpredictable. Calling Thread#raise means that whatever code is executing in that thread at that moment must either handle the exception or terminate. For input plugins, that code is often inside one of the third-party libraries that many of them use.

Being unable to predict how the code behaves leads to undesirable outcomes. A plugin may exit normally, exit abruptly and lose buffered data, get stuck in an inconsistent state, or behave in some other unknown way.
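To illustrate the data-loss case, here is a minimal sketch in plain Ruby. It is not Logstash code: the ShutdownSignal class and the buffer and flushed arrays are hypothetical, and the bare sleep stands in for a blocking read inside a third-party library.

```ruby
# Hypothetical exception class used only for this illustration.
class ShutdownSignal < StandardError; end

buffer      = []     # events read but not yet handed downstream
flushed     = []     # events safely handed downstream
interrupted = false
ready       = Queue.new

worker = Thread.new do
  begin
    2.times { |i| buffer << "event-#{i}" }
    ready << :buffered
    sleep                    # stands in for a blocking library call
    flushed.concat(buffer)   # the flush that should have happened
  rescue ShutdownSignal
    # The exception lands wherever the thread happens to be,
    # so the flush above is skipped.
    interrupted = true
  end
end

ready.pop                                  # wait until events are buffered
worker.raise(ShutdownSignal, "shutdown")   # inject the exception from outside
worker.join
```

After the join, buffer still holds two events and flushed is empty: the worker was interrupted before it could hand anything downstream.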

The solution

To make input plugin shutdown more reliable, a strategy was proposed (#3210) to avoid Thread#raise and instead signal the plugin, in a thread-safe way, that a shutdown sequence has started. This delegates to each input plugin the responsibility of deciding how and when to shut down. The proposal was implemented at the core level, so all input plugins now have three methods for shutdown purposes: stop, stop? and close.

  • stop is meant to be called from outside the plugin thread (from the pipeline thread) and its job is to ask/signal the plugin to stop.
  • stop? returns true if the stop method was called and can be used within the plugin to verify if a stop was requested.
  • close is called once when the plugin is stopped to perform any final bookkeeping.

The way to know a plugin is stopped is simply by waiting for the plugin run method to return. When the run method exits and the plugin thread exits, only then will the close method be called, and that only once.
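The contract can be sketched in plain Ruby as follows. SketchInput and CounterInput are hypothetical classes that only mimic the three methods described above, and the mutex-guarded flag is an assumption about how a thread-safe stop signal could be built, not a description of Logstash's own base class.

```ruby
# Hypothetical base class mimicking the stop / stop? / close contract.
class SketchInput
  def initialize
    @stop_requested = false
    @mutex = Mutex.new
  end

  # Called from outside the plugin thread to request a shutdown.
  def stop
    @mutex.synchronize { @stop_requested = true }
  end

  # Checked from inside run to see whether a stop was requested.
  def stop?
    @mutex.synchronize { @stop_requested }
  end

  # Final bookkeeping; subclasses override this.
  def close; end
end

# Hypothetical input that emits increasing integers until asked to stop.
class CounterInput < SketchInput
  attr_reader :closed

  def run(queue)
    n = 0
    until stop?
      queue << (n += 1)
      sleep 0.01
    end
  end

  def close
    @closed = true
  end
end

queue  = Queue.new
input  = CounterInput.new
thread = Thread.new { input.run(queue) }

queue.pop      # wait until at least one event has been produced
input.stop     # ask the plugin to stop
thread.join    # run has returned, so the plugin is stopped
input.close    # called once, only after run has returned
```

The plugin decides where it is safe to check stop?, so it never gets interrupted in the middle of a library call or with a half-written buffer.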

API Cleanup

Making this change also created an opportunity to review the rest of the shutdown API between the plugins and the pipeline, and several methods in the Plugin Base class were found to be unnecessary and thus could be removed: shutdown, finished, finished?, running? and terminating?. The teardown method is renamed to close and retains the responsibility of post-termination bookkeeping, as described above.

Plugin Developers

This change of shutdown strategy implies that all input plugins must take ownership of checking when it's time to shut down, either by checking stop? or by implementing their own stop method. Also, filters and outputs that call the obsolete methods noted above needed to be changed.

Changing so many plugins required some coordinated effort, but now the pipeline/plugin contract is leaner and input plugins can perform a much safer and more predictable shutdown!

To help new plugin developers, the plugin examples repositories (input, filter, output, codec) have been updated to reflect and demonstrate the new shutdown contract.

Setting better defaults

A major release is also a good time to revisit the default values for configuration settings used in Logstash.

Default value for filter_workers

Up to and including 1.5.x, the default value of the filter_workers setting was 1, which severely limited pipeline performance: with a single filter worker, only one event was filtered at a time, since filters are evaluated in sequence.

Now, out of the box, the filter_workers setting defaults to half the number of CPU cores on the machine. More workers allow filters to run in parallel, which is crucial for heavier processing such as complex grok patterns or the useragent filter.

Note that, as before, this setting can be changed with the -w flag, e.g. bin/logstash -f test.config -w 1.
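As a rough illustration of the "half of the CPU cores" rule, the calculation could look like the sketch below. The default_filter_workers helper is hypothetical, and the floor of one worker on single-core machines is an assumption rather than something stated in this post.

```ruby
require "etc"

# Hypothetical helper: half the available cores, but never fewer than
# one worker (the lower bound is an assumption made for this sketch).
def default_filter_workers(cores = Etc.nprocessors)
  [cores / 2, 1].max
end

default_filter_workers(8)   # => 4
default_filter_workers(1)   # => 1
```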

Showing default settings at start-up

It's important for users to know which values the running instance is using, so Logstash now reports at start-up the values in effect for settings left at their defaults, namely filter_workers.

Beta Feedback

Give the 2.0.0-beta2 a spin! If you find any bugs please report them as issues either on the logstash core repo or on the appropriate logstash-plugins repository. You can also head over to our forum. We're excited to release Logstash 2.0, but can't do it without your help!