29 January 2018

This Week in Elasticsearch and Apache Lucene - 2018-01-29

By Clinton Gormley, Adrien Grand

Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.

Faster prefix queries

Text fields will soon have an option to index prefixes so that prefix queries can run as term queries under the hood, which are much faster. In general, the cost of a prefix query depends on the number of terms that match the prefix, which makes queries on short prefixes the most expensive. When enabled, this option will index all prefixes whose length is between 2 and 5 characters (inclusive). We think this is a good trade-off between speed and space: prefix queries on 6 or more characters should generally perform acceptably without help, and the option costs extra disk space because it indexes an additional field with edge ngrams under the hood.
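
To make the idea concrete, here is a minimal Python sketch of the mechanism described above. It is a toy model, not the Lucene or Elasticsearch implementation: the function names and the in-memory dictionaries are hypothetical, and only the 2 to 5 character bounds come from the text.

```python
MIN_CHARS = 2  # shortest indexed prefix, per the description above
MAX_CHARS = 5  # longest indexed prefix, per the description above


def edge_ngrams(term, min_chars=MIN_CHARS, max_chars=MAX_CHARS):
    """Return the leading substrings of `term` with length in [min_chars, max_chars]."""
    upper = min(len(term), max_chars)
    return [term[:n] for n in range(min_chars, upper + 1)]


def build_indexes(terms):
    """Build a sorted term list plus a prefix -> terms lookup (hypothetical structures)."""
    prefix_index = {}
    for term in terms:
        for gram in edge_ngrams(term):
            prefix_index.setdefault(gram, set()).add(term)
    return sorted(set(terms)), prefix_index


def prefix_query(prefix, sorted_terms, prefix_index):
    """Answer a prefix query, using a single exact lookup when the prefix was indexed."""
    if MIN_CHARS <= len(prefix) <= MAX_CHARS:
        # Fast path: one exact lookup, analogous to a term query on the extra field.
        return sorted(prefix_index.get(prefix, set()))
    # Slow path: check every term against the prefix.
    return [t for t in sorted_terms if t.startswith(prefix)]


if __name__ == "__main__":
    terms, prefixes = build_indexes(["elastic", "elasticsearch", "lucene"])
    print(edge_ngrams("elastic"))
    print(prefix_query("ela", terms, prefixes))
    print(prefix_query("elastics", terms, prefixes))
```

The extra dictionary plays the role of the additional edge ngram field: it makes short prefixes cheap to look up, and its size is the space cost mentioned above.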

We are thinking of doing something similar with shingles to speed up phrase queries in the future.

Meltdown Blogpost

After intensive benchmarking we have published a blog post explaining the performance impact of the Meltdown patches on Elasticsearch: https://www.elastic.co/blog/performance-impact-of-meltdown-on-elasticsearch.

Rally 0.9.0

Rally 0.9.0 has been released. It allows users to configure Elasticsearch plugin parameters on the command line. There are also several changes to the "track" file format to allow a more flexible definition of benchmarks. See the migration guide at http://esrally.readthedocs.io/en/stable/migrate.html for a walkthrough of these changes.

Changes in 5.6:

  • StringTerms.Bucket.getKeyAsNumber detection type #28118
  • Ensure we protect Collections obtained from scripts from self-referencing #28335
  • X-Pack:
    • Ensure we protect Collections obtained from scripts from self-referencing #3681
    • fix trailing backslash in datapath deprecation check #3642

Changes in 6.1:

  • Fix setting notification for complex setting (affixMap settings) that could cause transient settings to be ignored #28317
  • Fix peer recovery flushing loop #28350

Changes in 6.2:

  • Plugins: Use one confirmation of all meta plugin permissions #28366
  • Update Netty to 4.1.16.Final #28345
  • Settings: Introduce settings updater for a list of settings #28338
  • Add information when master node left to DiscoveryNodes' shortSummary() #28197
  • X-Pack:
    • [SAML] add security permission to get the classloader #3720
    • Remove production from the message about license installation without TLS #3666
    • [SAML] Find all tokens for a realm, not just the first 10 #3689
    • Elevate privileges fetching metadata for SAML #3671
    • Simplify security manager permissions #3651

Changes in 6.3:

  • Settings: Reimplement keystore format to use FIPS compliant algorithms #28255
  • Replace jvm-example by two plugin examples #28339
  • High level rest client : code clean up #28386
  • REST high-level client: add support for exists alias #28332
  • REST high-level client: move to POST when calling retrieval APIs that support a request body #28342
  • Add Indices Aliases API to the high level REST client #27876
  • Always return the after_key in composite aggregation response #28358
  • Remove Painless Type from MethodWriter in favor of Java Class. #28346
  • Deprecate the update_all_types option. #28284
  • [Plugin] Remove redundant argument for buildConfiguration of s3 plugin #28281
  • Completely Remove Painless Type from AnalyzerCaster in favor of Java Class #28329
  • Added Put Mapping API to high-level Rest client (#27205) #27869
  • Adds the ability to specify a format on composite date_histogram source #28310
  • Provide a better error message for the case when all shards failed #28333
  • Painless: Replace Painless Type with Java Class during Casts #27847
  • Trim down usages of ShardOperationFailedException interface #28312
  • Calculate sum in Kahan summation algorithm in aggregations (#27807) #27848
  • X-Pack:
    • BREAKING: Remove XPackExtension in favor of SecurityExtensions #3734
    • Remove the gradle cheatsheet #3708
    • Remove legacy files from xpack split #3707
    • Expose XPackExtensions via SPI #3530
    • Trim down usages of ShardOperationFailedException interface #3662
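
One of the 6.3 changes above (#27848) moves aggregation sums to the Kahan summation algorithm. As a rough illustration of that technique, here is a minimal Python sketch. It is not the Elasticsearch code; the function names are made up for this example.

```python
def naive_sum(values):
    """Plain left-to-right floating-point accumulation, for comparison."""
    total = 0.0
    for value in values:
        total += value
    return total


def kahan_sum(values):
    """Sum floats while carrying a running compensation for lost low-order bits."""
    total = 0.0
    compensation = 0.0  # estimate of the rounding error accumulated so far
    for value in values:
        adjusted = value - compensation       # re-apply what was lost earlier
        new_total = total + adjusted          # low-order bits of `adjusted` may be lost here
        compensation = (new_total - total) - adjusted  # recover what was just lost
        total = new_total
    return total


if __name__ == "__main__":
    data = [0.1] * 1_000_000
    print(naive_sum(data))
    print(kahan_sum(data))
```

With many small values, the compensated sum typically stays much closer to the exact result than the plain loop does.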

Changes in 7.0:

  • BREAKING: Java api clean up: remove deprecated isShardsAcked #28311
  • BREAKING: Remove the update_all_types option. #28288
  • X-Pack:
    • Fix XPackExtension javadoc #3711

Apache Lucene

Analysis

  • ShingleFilter doesn't work with synonyms. We would like to be able to speed up phrase queries with a simple mapping setting, in the same way that we are doing for prefix queries. However, this will require fixing ShingleFilter so that it works on arbitrary token streams and returns the same results as a phrase query would.
  • HyphenationDecompoundTokenFilter should really be thought of as an extension of a tokenizer rather than as a token filter.
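
To illustrate the shingle idea from the first item above, here is a minimal Python sketch in which a two-word phrase query becomes a single lookup on pre-indexed word pairs. It is a toy model with hypothetical names, it splits on whitespace only, and it ignores synonyms and other stacked tokens, which is exactly the case the ShingleFilter item says still needs fixing.

```python
def shingles(tokens, size=2):
    """Return every run of `size` adjacent tokens, joined with a single space."""
    return [" ".join(tokens[i:i + size]) for i in range(len(tokens) - size + 1)]


def index_documents(docs, size=2):
    """Map each shingle to the set of document ids containing it (toy inverted index)."""
    index = {}
    for doc_id, text in docs.items():
        for shingle in shingles(text.lower().split(), size):
            index.setdefault(shingle, set()).add(doc_id)
    return index


def two_word_phrase_query(phrase, index):
    """Answer a two-word phrase query with one exact lookup instead of a position check."""
    key = " ".join(phrase.lower().split())
    return sorted(index.get(key, set()))


if __name__ == "__main__":
    docs = {1: "the quick brown fox", 2: "brown quick fox"}
    index = index_documents(docs)
    print(two_word_phrase_query("quick brown", index))
    print(two_word_phrase_query("quick fox", index))
```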

Index

Search

Geo