23 January 2017

Brewing in Beats: Optimize matching regular expressions

By Monica Sarbu

Welcome to the weekly Brewing in Beats! With this series, we keep you up to date with all the changes in Beats, including the latest commits, releases, and other learning resources.

Optimize matching regular expressions

Steffen wasn’t happy with the performance of the regular expression implementation in the Go standard library, so he optimized some of the most common use cases by factors of up to 10x, and in some cases 100x. The idea behind the optimization is that Beats accepts regular expressions in places where most people use them for simple substring searches (e.g. exclude_lines, include_lines, or multiline), so we can automatically detect these cases in regexps and switch to faster implementations. See the PR for benchmark results.
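To illustrate the idea, here is a minimal sketch in Go. It is not the code from the PR: the Matcher interface, the substringMatcher type, and the Compile function are hypothetical names chosen for this example. It parses the pattern with the standard regexp/syntax package and, if the whole pattern turns out to be a plain case-sensitive literal, answers matches with strings.Contains instead of running the regexp engine.

```go
package main

import (
	"fmt"
	"regexp"
	"regexp/syntax"
	"strings"
)

// Matcher is a hypothetical interface for this sketch, not the Beats API.
type Matcher interface {
	MatchString(s string) bool
}

// substringMatcher handles patterns that are plain literals.
type substringMatcher string

func (m substringMatcher) MatchString(s string) bool {
	return strings.Contains(s, string(m))
}

// Compile returns a substring matcher when the pattern is a pure,
// case-sensitive literal, and falls back to the regexp package otherwise.
func Compile(pattern string) (Matcher, error) {
	parsed, err := syntax.Parse(pattern, syntax.Perl)
	if err != nil {
		return nil, err
	}
	parsed = parsed.Simplify()

	// A pattern such as "error" parses to a single literal node.
	// Case-insensitive literals, e.g. "(?i)error", carry the FoldCase
	// flag and are left to the regexp engine.
	if parsed.Op == syntax.OpLiteral && parsed.Flags&syntax.FoldCase == 0 {
		return substringMatcher(string(parsed.Rune)), nil
	}

	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	return re, nil
}

func main() {
	for _, p := range []string{"error", "^err.*$"} {
		m, err := Compile(p)
		if err != nil {
			fmt.Println("invalid pattern:", err)
			continue
		}
		// %T shows which implementation was selected for the pattern.
		fmt.Printf("%q -> %T, matches: %v\n", p, m, m.MatchString("an error occurred"))
	}
}
```

This sketch only covers the pure-literal case. Other simple shapes, such as anchored prefixes, could presumably be detected on the parsed tree in a similar way.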

Community Beat: Apexbeat

Apexbeat extracts configurable contextual data and metrics from Java applications via the APEX toolkit and indexes them into Elasticsearch.

Community Beat: Connbeat

Connbeat collects TCP connection metadata and indexes it in Elasticsearch. For each connection, it reports the IP address and port of the endpoints involved, together with details about the local process. You can use it to monitor the connections in Docker instances, in which case you also get extra metadata about your Docker container. The exported data is similar to that provided by the system.socket metricset in Metricbeat. For now, it works only on Linux systems.

Initial setup of Filebeat modules

This change adds the -setup option to the Filebeat command line. It loads the Ingest Node pipeline at startup if the Elasticsearch output is configured. If Elasticsearch is not available, Filebeat fails to start with an error.

For example, the following command includes the -setup option to initialize the Filebeat Nginx module before running Filebeat:

filebeat -e -modules nginx -setup

In the near future, we plan to also load the Kibana dashboards, in addition to the Ingest Node pipeline, when the -setup flag is used.


Other features and fixes


All Beats

  • Pass additional metadata to the outputs. As the first use case, the pipeline is passed as metadata to the Elasticsearch output #3359
  • Clean up the before_build script that is executed before creating Beats packages #3386
  • Because Elasticsearch plans to remove support for types in the future, the _type field is marked as deprecated in 5.3 and will be removed in 6.0. Use the type field instead of _type. #3409
  • Generate the templates before testing to use the latest version #3425

Filebeat

  • Make the module directory containing the Filebeat modules optional #3405

Heartbeat

  • Set default settings in the http monitor check #3402
  • Update the full configuration file and use Elasticsearch as the default monitored HTTP endpoint #3401
  • Fix failure when ICMP is configured and IPv6 support is disabled on boot or not available in the kernel #3414

Metricbeat

  • Update the Docker client to the latest version #3398
  • Add an HTTP helper to be used for HTTP-based metricsets #3413

Generators

  • Set the default period to 10s in the Metricbeat module generator #3377