This Week In Elasticsearch, August 26


Geo Shapes, Circle Support, & the Circle processor

Today, Elasticsearch supports the Geo Shape circle when using the prefix-tree indexing strategy. Over time we have invested significant effort in BKD-based Geo Shapes, as this strategy has delivered significant performance gains (see our blog post from earlier this year, scroll to "Geo Performance"). These BKD-based shapes are defined by GeoJSON/WKT, neither of which supports circles, so we must approximate circles with polygons. It is worth noting that prefix-trees have been deprecated since 6.6.

To make the transition to these approximations easier, we've created the Circle Processor, an ingest processor that takes a circle definition together with an error_distance_in_meters (see below) and automatically replaces the definition with that of a polygon with enough sides to stay within that error.

Circle Ingest Processor Error Distance
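
To make the relationship between error distance and side count concrete, here is a minimal sketch in plain planar geometry. It is not the Circle Processor's implementation: the function names, the `min_sides`/`max_sides` bounds and the use of flat x/y coordinates instead of latitude/longitude are all assumptions made for illustration. It relies on the fact that, for a regular polygon inscribed in a circle of radius r, the largest gap between the circle and a side is r * (1 - cos(pi / n)).

```python
import math


def sides_for_error(radius, error_distance, min_sides=4, max_sides=1000):
    """Smallest side count whose inscribed regular polygon stays within
    error_distance of the circle (planar geometry, same unit for both).

    min_sides and max_sides are hypothetical bounds; if max_sides clamps
    the result, the error bound is no longer guaranteed.
    """
    if radius <= 0 or error_distance <= 0:
        raise ValueError("radius and error_distance must be positive")
    if error_distance >= radius:
        # Even the coarsest allowed polygon is within the error.
        return min_sides
    # Solve r * (1 - cos(pi / n)) <= error_distance for n.
    n = math.ceil(math.pi / math.acos(1.0 - error_distance / radius))
    return max(min_sides, min(max_sides, n))


def circle_to_polygon(cx, cy, radius, error_distance):
    """Return a closed ring of (x, y) vertices approximating the circle."""
    n = sides_for_error(radius, error_distance)
    ring = [
        (
            cx + radius * math.cos(2.0 * math.pi * i / n),
            cy + radius * math.sin(2.0 * math.pi * i / n),
        )
        for i in range(n)
    ]
    ring.append(ring[0])  # polygon rings must repeat the first vertex
    return ring
```

A smaller error distance yields more sides and a closer fit, at the cost of a larger shape to index.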

Automatic Query Cancellation

We merged the PR to cancel search tasks on connection close and added support for cancelling requests to the low-level REST client. However, we found a pretty bad bug in the Apache async client which causes the underlying IO reactor to crash in some cases when requests are cancelled. This was triggered by one of our new tests, which sends many requests and cancels them all. Once the IO reactor shuts down due to a hard error like this, the client becomes completely unusable, as it is no longer possible to send requests through it. The bug has now been fixed upstream, and we are waiting for the fix to be released.

Snapshot Lifecycle Management UI complete!

We submitted a PR that adds the ability for users to create and edit their Snapshot Lifecycle Management (SLM) policies from the UI. This is a continuation of previous work for adding SLM UI to the Snapshot and Restore app in Kibana. With this addition, our first phase of SLM UI work is complete.

Snapshot Lifecycle Management UI

Index Templates UI complete!

This week, we submitted a PR that adds the ability to create, edit and clone Index Templates within the Index Management app.

Index Templates UI

Append-only index privilege

We started an implementation of an "append only" index privilege (the name may change) which would permit writing new documents to an index, without overwriting existing documents.

The proposed implementation requires that the index request be clearly marked as a "new" index, either by using an autogenerated id, by providing an op_type of create, or by using the _create endpoint (which is really just a convenience for setting the op_type).
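
The rule above can be sketched as a small predicate. This is only an illustration of the three conditions listed, not the actual privilege check: the `IndexRequest` class and its fields are hypothetical stand-ins, and the `_create` endpoint is modelled simply as a request whose op_type is already `create`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IndexRequest:
    """Hypothetical, simplified view of an index request."""
    doc_id: Optional[str] = None  # None means the id will be autogenerated
    op_type: str = "index"        # "index" may overwrite, "create" may not


def is_append_only(request: IndexRequest) -> bool:
    """True when the request can only ever create a new document."""
    uses_autogenerated_id = request.doc_id is None
    is_create = request.op_type == "create"
    return uses_autogenerated_id or is_create
```

Under this sketch, a request with an explicit id and the default op_type would be rejected for credentials that only hold the append-only privilege, since it could overwrite an existing document.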

The motivating case is an ingest process (e.g. auditbeat) running on a host, which by necessity has some sort of credentials to send data into Elasticsearch. If those credentials are exposed in some way, we want to limit the set of things that an attacker could do with them. Since many ingest processes only ever write new documents and don't need to update or overwrite existing ones, we can protect existing data by making sure those credentials only have permission to create new docs.

Indexing Performance

An external contributor identified a bottleneck when rolling over translog generations in async durability mode. Two solutions were proposed, one using disruptor and one using a LinkedBlockingQueue, both aiming to avoid blocking other threads while writing to disk. We worked with the contributor and together we came up with a low-risk one-liner change, giving a roughly 20% speed-up on the contributor's benchmark.

Reindex Resilience

Default search behaviour is to skip unavailable (RED) shards. For reindex this could potentially mean data loss since unavailable shards are not copied to destination. We worked on fixing this by forcing allow_partial_search_results=false on the search request, which will make reindex fail rather than silently skip unavailable shards.
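
As a toy model of the difference between the two settings (not Elasticsearch code: the `search_shards` function, the `PartialResultsError` exception and the dictionary-of-shards representation are all invented for illustration), consider a search that either skips unavailable shards or refuses to return anything when one is missing:

```python
class PartialResultsError(Exception):
    """Raised when a shard is unavailable and partial results are not allowed."""


def search_shards(shards, allow_partial_search_results=True):
    """Collect documents from every shard.

    shards maps a shard id to its list of documents, or to None when the
    shard is unavailable (RED).
    """
    unavailable = [sid for sid, docs in shards.items() if docs is None]
    if unavailable and not allow_partial_search_results:
        raise PartialResultsError(f"unavailable shards: {sorted(unavailable)}")
    # Unavailable shards are silently skipped here.
    return [doc for docs in shards.values() if docs is not None for doc in docs]
```

With the default setting, a caller copying the returned documents would miss everything on the unavailable shard without noticing; with the flag set to false, the copy fails loudly instead.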

During this work, we discovered that under extreme conditions (flapping nodes), search would sometimes skip unavailable shards (RED shards) and thus return a partial result even with allow_partial_search_results=false.

We also continued working on reindex coordinator node resiliency, digging into ensuring that only one coordinator node can write data back to the .reindex index in case multiple nodes are assigned a persistent task due to failover.


Faster Nearest Neighbor search for LatLonPoint

The current LatLonPoint nearest neighbor search implementation calculates the real surface distance between two points to decide whether to reject a document or add it to the result. This is not necessary: we can compare the sort key instead and therefore avoid expensive asin() computations.
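
The idea can be sketched as follows. This is an illustration of the general technique, not Lucene's code: the function names and the Earth radius constant are assumptions. The haversine distance is 2R * asin(sqrt(h)), and since asin and sqrt are both increasing on [0, 1], comparing the intermediate value h orders points exactly as the real distance would. The asin() call is then only needed for the few results that are actually returned.

```python
import heapq
import math

EARTH_RADIUS_M = 6_371_008.8  # mean Earth radius in meters (assumed constant)


def haversin_sort_key(lat1, lon1, lat2, lon2):
    """Value that increases with great-circle distance; no asin() needed."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    return (
        math.sin(dphi / 2.0) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2.0) ** 2
    )


def sort_key_to_meters(key):
    """Convert a sort key into the real surface distance in meters."""
    return 2.0 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, key)))


def nearest(query, points, k):
    """Return the k points closest to query as (point, meters) pairs."""
    lat, lon = query
    keyed = ((haversin_sort_key(lat, lon, p[0], p[1]), p) for p in points)
    best = heapq.nsmallest(k, keyed)  # comparisons use the cheap key only
    return [(p, sort_key_to_meters(key)) for key, p in best]
```

For example, `nearest((48.8566, 2.3522), points, 2)` ranks every candidate by sort key and calls asin() only twice, once per returned hit.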

In addition, the implementation is missing an optimisation we added a few months ago: shrink-wrapping leaf cells to get tighter bounds (similar to what we already did for FloatPointNearestNeighbor). Initial benchmark results show a 30% performance improvement.

The gradle train is moving

There is an ongoing effort to move the Lucene/Solr project from ant to gradle. The latest discussions seem to target September / October for the switch. There is no side effect on Elasticsearch, as we get the dependencies from maven, but we will need to update our Lucene CI jobs to reflect this change.


Changes in Elasticsearch

Changes in 8.0:

  • CLI tools: write errors to stderr instead of stdout #45586
  • BREAKING: Remove support for string in unmapped_type. #45675

Changes in 7.4:

  • Add node.processors setting in favor of processors #45855
  • Fix TransportSnapshotsStatusAction ThreadPool Use #45824
  • Do sync before closeIntoReader when rolling generation to improve index performance #45765
  • Cancel search task on connection close #43332
  • Never release store using CancellableThreads #45409
  • Add destructiveDistroTest meta task #45762
  • Add is_write_index column to cat.aliases #44772
  • Repository Cleanup Endpoint #43900
  • Ignore translog retention policy if soft-deletes enabled #45473
  • [Closes #44045] Added 'slices' parameter when submitting reindex request via Java high level REST client #45690
  • Stop Executing SLM Policy Transport Action on Snapshot Pool #45727
  • Add support for inlined user dictionary in the Kuromoji plugin #45489
  • [ML][Data Frame] fixing _start?force=true bug #45660
  • Add input and output tracking of built bwc versions #45694
  • Ingest Attachment: Upgrade tika to v1.22 #45575
  • [ML][Data Frame] moves failure state transition for MT safety #45676
  • Set security index refresh interval to 1s #45434
  • Lift the restrictions that uppercase is not allowed in Setting Name. #45222

Changes in 6.8:

  • Enable testing against JDK 14 #45178
  • Include leases in error message when operations no longer available #45681
  • Ensure AsyncTask#isScheduled remain false after close #45687

Changes in Rally

Changes in 1.x:

  • Capture team and track revisions in metrics metadata #725

Changes in 1.3.0:

  • Retrieve timestamped commit hash separately #750

  • Log git output #747