This Week in Elasticsearch and Apache Lucene - 2019-07-19

Elasticsearch Highlights

Snapshot Lifecycle Management

Snapshot Lifecycle Management (SLM) is a set of Elasticsearch APIs that allow users to automatically create backups of their data on a predefined schedule. Historically, new Elasticsearch APIs have tended to ship a release or two ahead of their UI counterparts. We’ve made an effort to develop the UI in tandem with the API so they can ship together in 7.4. By shipping UI and API together, the feature as a whole becomes accessible to a broader set of users upon launch, and our product story becomes more comprehensive. We’ll continue this workflow in the future!

We merged the Snapshot Lifecycle Management feature branch into master and completed the backport to 7.x. Feel free to take it for a spin! Documentation is here and here. Work continues on retention.
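For readers who want to take it for a spin, a policy is a small JSON document sent to the SLM policy endpoint. The sketch below only builds the request in Python and does not contact a cluster. The policy id, repository name, and schedule are illustrative assumptions, and the field names reflect our reading of the SLM documentation, so check the docs for your version before relying on them.

```python
import json

# Hypothetical policy id, chosen for illustration only.
POLICY_ID = "nightly-snapshots"

policy = {
    # Cron expression: every day at 01:30.
    "schedule": "0 30 1 * * ?",
    # Date-math snapshot name, resolved when the snapshot is taken.
    "name": "<nightly-snap-{now/d}>",
    # Assumed name; must refer to an already registered snapshot repository.
    "repository": "my_repository",
    # Options passed through to the snapshot that the policy creates.
    "config": {"indices": ["*"]},
}


def put_policy_request(policy_id, body):
    """Return the HTTP method, path, and JSON body for creating the policy."""
    return "PUT", f"/_slm/policy/{policy_id}", json.dumps(body, indent=2)


method, path, payload = put_policy_request(POLICY_ID, policy)
print(method, path)
print(payload)
```

The printed method, path, and body can be pasted into any HTTP client pointed at a test cluster.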

We submitted a PR that adds the list of Snapshot Lifecycle Management policies and a details panel to the Snapshot and Restore app in Kibana.

Search: Stemmers, Pinners, Cancellers, and their neighbors

We continued work on cancelling searches when the underlying client connection is closed. Complications around multi-search requests have spawned two additional changes: "Associate sub-requests to their parent task in multi search API" and "Make multi search tasks cancellable".

We have a draft implementation of approximate nearest neighbour search based on locality-sensitive hashing.
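As a rough illustration of the general idea (not of the draft itself, whose hashing scheme is not described here), the sketch below uses random-hyperplane hashing, one common locality-sensitive hashing family. Vectors that point in similar directions tend to share a bit signature, so a query only needs to be compared against the documents in its own bucket. All function names are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dims, seed=42):
    """Draw one random hyperplane (normal vector) per signature bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dims)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    bits = []
    for plane in hyperplanes:
        dot = sum(v * p for v, p in zip(vector, plane))
        bits.append(1 if dot >= 0 else 0)
    return tuple(bits)


def build_index(vectors, hyperplanes):
    """Group document ids by signature."""
    buckets = {}
    for doc_id, vec in vectors.items():
        buckets.setdefault(signature(vec, hyperplanes), []).append(doc_id)
    return buckets


def candidates(query, buckets, hyperplanes):
    """Return ids sharing the query's bucket; these are approximate neighbours."""
    return buckets.get(signature(query, hyperplanes), [])


planes = make_hyperplanes(num_bits=8, dims=3)
docs = {"a": [1.0, 0.5, -0.2], "b": [-1.0, -0.5, 0.2]}
index = build_index(docs, planes)
# A query pointing in the same direction as "a" lands in the same bucket.
hits = candidates([2.0, 1.0, -0.4], index, planes)
print(hits)
```

A real implementation would typically use several hash tables and re-rank the candidates by exact distance; both are omitted here to keep the sketch short.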

We have continued work on the result pinning PR to support Swiftype-style promotion of results. While reviewing it, we uncovered some inefficiencies in Lucene's DisMax query.

We have a PR for a new English plural stemmer and published a comparison of results with the existing Lucene implementation.

Geo

We merged a PR that introduces the new Spatial Plugin into master. This is the first building block for the new spatial features (including, but not limited to, geo) we are working on. We are working on an upcoming feature for this plugin, a new XYShape field, which is not strictly Geo.

We are working on refactoring the spatial components in Elasticsearch so the new XYShape field can be integrated easily. We have merged a PR that moves the dateline handling logic, which is specific to Geo, out of ShapeBuilders. The next step is to extract this logic from the query builders.

Pipeline Aggregations

We are working on a community PR that adds a shift parameter to the MovingFunction pipeline agg. This allows the user to adjust where the window is positioned, instead of always trailing the current bucket as is the case today. This is useful if you want to include or exclude the current bucket, "center" the window, etc.
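To make the effect of shifting concrete, here is a minimal sketch of how the window could move. It assumes that with a shift of 0 the window covers the buckets just before the current one, and that each increment of shift moves the window one bucket to the right; the exact offset convention in the PR may differ, and the function name is hypothetical.

```python
def moving_windows(values, window, shift=0):
    """Return, for each bucket, the slice of values the function would see.

    Assumed convention: bucket i sees values[i - window + shift : i + shift],
    clipped to the bounds of the series.
    """
    n = len(values)
    out = []
    for i in range(n):
        start = max(0, i - window + shift)
        end = max(0, min(n, i + shift))
        out.append(values[start:end])
    return out


values = [1, 2, 3, 4, 5]

# Trailing window that excludes the current bucket (today's behaviour).
print(moving_windows(values, window=2))
# → [[], [1], [1, 2], [2, 3], [3, 4]]

# Shift by one: the current bucket is now included.
print(moving_windows(values, window=2, shift=1))
# → [[1], [1, 2], [2, 3], [3, 4], [4, 5]]

# Window of three shifted by two: roughly centred on the current bucket.
print(moving_windows(values, window=3, shift=2))
# → [[1, 2], [1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5]]
```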

We are also working on adding a "none" gap policy to pipeline aggs. Users often want to execute a bucket_script on all buckets, whether they have a value or not. The existing gap policies make this impossible. The "none" policy does nothing itself and lets the aggregation evaluate the null/missing/NaN value for itself. The change also adds a params.doc_count parameter to the script context so that the user can inspect the document count and determine whether the value is NaN because it is missing, or NaN because it was just NaN.
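The difference between the policies can be modelled in a few lines. This is a simplified sketch, not the aggregation code: buckets are plain dictionaries, the script is a Python callable standing in for a bucket_script, and only the params["doc_count"] name is taken from the description above.

```python
import math


def run_bucket_script(buckets, script, gap_policy="skip"):
    """Apply `script` to each bucket under a simplified model of gap policies."""
    results = []
    for bucket in buckets:
        value = bucket["value"]
        missing = value is None or math.isnan(value)
        if missing and gap_policy == "skip":
            results.append(None)  # bucket is skipped; the script never runs
            continue
        if missing and gap_policy == "insert_zeros":
            value = 0.0  # the script runs but cannot tell 0 from "no value"
        elif missing:
            value = float("nan")  # "none": pass the gap through untouched
        results.append(script(value, {"doc_count": bucket["doc_count"]}))
    return results


def classify(value, params):
    """Example script: distinguish a missing value from a genuine NaN."""
    if not math.isnan(value):
        return "value"
    return "missing" if params["doc_count"] == 0 else "nan"


buckets = [
    {"doc_count": 3, "value": 1.5},
    {"doc_count": 0, "value": None},          # empty bucket
    {"doc_count": 2, "value": float("nan")},  # documents exist, metric is NaN
]

print(run_bucket_script(buckets, classify, "skip"))
# → ['value', None, None]
print(run_bucket_script(buckets, classify, "insert_zeros"))
# → ['value', 'value', 'value']
print(run_bucket_script(buckets, classify, "none"))
# → ['value', 'missing', 'nan']
```

Only under "none" does the script both run on every bucket and see enough information to tell the two kinds of NaN apart.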

Async peer recoveries

Over the past few months we have been working on making peer recoveries non-blocking. We merged the most complex work item this week: file chunks are now sent asynchronously, which fixes an issue where setting node_concurrent_recoveries to a large value could lead to deadlocks. The only remaining item is the relocation handoff. Moving more code to async is a massive undertaking, but it will also help tremendously in our quest to reduce the default maximum size of the generic threadpool. We are also applying the same technique to CCR's recovery from remote, allowing us to share the complex file chunk coordination logic between peer recovery and recovery from remote.
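The Elasticsearch code is Java and is not reproduced here. As a loose analogy only, the Python asyncio sketch below shows the general shape of non-blocking chunk sending: no worker sits blocked waiting for an acknowledgement, and a semaphore bounds how many chunks are in flight. Every name in it is hypothetical, and the sleep call stands in for a network round trip.

```python
import asyncio


async def send_chunk(chunk, received):
    """Stand-in for a network call; yields control instead of blocking."""
    await asyncio.sleep(0)
    received.append(chunk)


async def send_file(data, chunk_size, max_in_flight, received):
    """Send `data` in chunks with at most `max_in_flight` outstanding sends."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(chunk):
        async with semaphore:
            await send_chunk(chunk, received)

    # Each chunk carries its offset so the receiver can reassemble the file
    # even if chunks complete out of order.
    chunks = [
        (offset, data[offset:offset + chunk_size])
        for offset in range(0, len(data), chunk_size)
    ]
    await asyncio.gather(*(guarded(c) for c in chunks))


received = []
asyncio.run(send_file(b"0123456789", chunk_size=4, max_in_flight=2, received=received))
reassembled = b"".join(part for _, part in sorted(received))
print(reassembled)
```

Because waiting sends do not hold a thread, raising the number of concurrent transfers in this model increases queued work rather than the number of blocked threads.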

PKI for Kibana

We believe we have reached a consensus on all of the outstanding questions we had about Proxied PKI for Kibana, and are working through the series of pull requests to implement this feature.

Enrich Processor

We merged a PR to ensure that an enrich policy is immutable. Since enrich index names and enriching behaviors are directly associated with a policy, we can eliminate an entire category of potential issues by enforcing immutable policies. We are still working through the details of ensuring that deleting a policy and then immediately re-adding it won't re-introduce the same category of problems we are avoiding with immutable policies. (#43604)

We merged the background cleanup process (#43746) and are in the process of adding the ES version that an enrich policy was created with to the metadata, so that we have it if anything major needs to change across versions and we need compatibility logic.

Apache Lucene

Lucene 8.2

The Elasticsearch benchmarks helped catch a regression in the memory usage of the terms dictionary. This was due to a change to FSTs that enabled direct arc addressing on dense nodes, so that lookups can run with a single random access instead of a binary search. This change only triggered minor size increases when tested against text data, but it has a worst-case scenario of ~4x more memory usage. In our case, it made the terms dictionary use 50% more memory, likely because of the _id field, which is binary and has denser nodes than FSTs made of English text. This change has been reverted from 8.2 and we will work on re-enabling it for 8.3 in a way that avoids worst-case scenarios memory-wise.
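The trade-off between the two lookup strategies can be shown with a toy node. This sketch is not Lucene's encoding: the class names are hypothetical, arcs are simple (label, target) pairs, a node is assumed to have at least one arc, and memory is approximated by counting slots.

```python
from bisect import bisect_left


class BinarySearchNode:
    """Stores only the arcs that exist; lookup is a binary search on labels."""

    def __init__(self, arcs):
        arcs = sorted(arcs)
        self.labels = [label for label, _ in arcs]
        self.targets = [target for _, target in arcs]

    def lookup(self, label):
        i = bisect_left(self.labels, label)
        if i < len(self.labels) and self.labels[i] == label:
            return self.targets[i]
        return None

    def slots(self):
        return len(self.labels)


class DirectAddressingNode:
    """Reserves one slot per label in [min, max]; lookup is a single access."""

    def __init__(self, arcs):
        labels = [label for label, _ in arcs]
        self.first = min(labels)
        self.table = [None] * (max(labels) - self.first + 1)
        for label, target in arcs:
            self.table[label - self.first] = target

    def lookup(self, label):
        idx = label - self.first
        if 0 <= idx < len(self.table):
            return self.table[idx]
        return None

    def slots(self):
        return len(self.table)


# Four arcs spread over a range of sixteen labels.
arcs = [(0x10, "a"), (0x12, "b"), (0x13, "c"), (0x1F, "d")]
sparse = BinarySearchNode(arcs)
direct = DirectAddressingNode(arcs)

print(sparse.lookup(0x13), direct.lookup(0x13))  # → c c
print(sparse.slots(), direct.slots())            # → 4 16
```

Both nodes answer every lookup identically; the direct table pays for its single access with empty slots wherever the label range has holes.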

Other

Changes in Elasticsearch

Changes in 8.0:

  • BREAKING: Fail node containing ancient closed index #44264

Changes in 7.4:

  • Expose index age in ILM explain output #44457
  • add disable_chunked_encoding configuration #44052
  • Associate sub-requests to their parent task in multi search API #44492
  • Defer reroute when starting shards #44433
  • Introduce test issue logging #44477
  • Add Snapshot Lifecycle Management #43934
  • Make peer recovery send file chunks async #44468
  • Do not allow version in Rest Update API #43516
  • Improve build scan metadata #44247
  • Cluster health should await events plus other things #44348
  • Log write failures for watcher history document. #44129
  • Allow RerouteService to reroute at lower priority #44338
  • Throw TranslogCorruptedException in more cases #44217
  • add clarification around TESTSETUP docu and error message #43306
  • HLRC: Fix '+' Not Correctly Encoded in GET Req. #33164
  • Fail engine if hit document failure on replicas #43523
  • [ML][Data Frame] prevent task from attempting to run when failed #44239
  • Support WKT point conversion to geo_point type #44107
  • Make plugin verification FIPS 140 compliant #44224
  • Avoid counting votes from master-ineligible nodes #43688

Changes in 7.3:

  • Fix incorrect calculation of how many buckets will result from a merge #44461
  • Don't use index_phrases on graph queries #44340
  • Fix broken short-circuit in getUnlicensedRealms #44399
  • Ensure field caps doesn't error on rank feature fields. #44370
  • [ML][Data Frame] treat bulk index failures as an indexing failure #44351
  • Improve CryptoService error message on missing secure file #43623
  • Fix AnalyzeAction response serialization #44284
  • [ML][Data Frame] responding with 409 status code when failing _stop #44231
  • Fix port range allocation with large worker IDs #44213

Changes in 6.8:

  • Skip update if leader and follower settings identical #44535
  • Fix parameter value for calling data.advanceExact #44205
  • Avoid stack overflow in auto-follow coordinator #44421
  • Avoid NPE when checking for CCR index privileges #44397
  • Fix varying responses for /_analyze request #44342
  • Do not swallow I/O exception getting authentication #44398
  • Fix swapped variables in error message #44300

Changes in Elasticsearch Hadoop Plugin

Changes in 7.3:

  • [DOCS] Fix broken links for ES API docs move #1317

Changes in Rally

Changes in 1.3.0:

  • Be resilient upon startup #730
  • BREAKING: Drop 1.x support for cluster metadata #729
  • Allow to set distribution version as parameter #728