This Week in Elasticsearch and Apache Lucene - 2019-06-21

Elasticsearch

Voting-only master nodes

A voting-only master-eligible node is a node that can participate in master elections but will not act as master in the cluster. Such a node can help elect another master-eligible node as master, and can serve as a tiebreaker in elections. High availability clusters require at least three master-eligible nodes, so that if one of the three nodes is down, the remaining two can still elect a master amongst themselves. Note that this only requires one of the two remaining nodes to have the capability to act as master, but both need to have voting powers for the election to go through. This means that one of the three dedicated master-eligible nodes can be made voting-only and allocated to a less powerful machine with a smaller heap size, as its sole purpose is to safely store the cluster state and provide votes during elections. An alternative is to take an existing data or coordinating-only node and give it the master-eligible voting-only role as well, which allows it to serve as a tiebreaker in a configuration with only two dedicated master nodes.
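To make the quorum argument concrete, here is a minimal Python sketch of the election rule described above. It is a deliberately simplified model, not how Elasticsearch implements cluster coordination: the `Node` class and `can_elect_master` function are hypothetical names, and the model only checks that a majority of master-eligible nodes is up and that at least one live node is able to act as master.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    voting_only: bool = False  # may vote, but never becomes master
    up: bool = True


def can_elect_master(nodes):
    """Return True if the live nodes can still elect a master.

    Simplified model: a strict majority of all master-eligible nodes must be
    up to vote, and at least one live node must be able to act as master.
    """
    live = [n for n in nodes if n.up]
    has_quorum = len(live) > len(nodes) // 2
    has_candidate = any(not n.voting_only for n in live)
    return has_quorum and has_candidate


healthy = [Node("m1"), Node("m2"), Node("tiebreaker", voting_only=True)]
one_master_down = [Node("m1", up=False), Node("m2"),
                   Node("tiebreaker", voting_only=True)]
both_masters_down = [Node("m1", up=False), Node("m2", up=False),
                     Node("tiebreaker", voting_only=True)]
```

In this model, `one_master_down` can still elect `m2` because the tiebreaker supplies the second vote, while `both_masters_down` cannot elect anyone: the only live node is not allowed to act as master.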

Voting-only master nodes have been a long-requested feature. With the introduction of the new cluster coordination subsystem in 7.0, implementing this feature was finally within reach. We have opened a PR that will add voting-only master nodes to Elasticsearch 7.3.

Automatic cancellation of Search Requests

We worked on a draft PR to cancel search tasks automatically when the user closes the connection. The idea behind this work is that we could simplify the cancellation of search requests by checking the status of the connection. Our network layer can already notify us when a channel is closed, so the PR uses this functionality to cancel in-flight requests that reference closed channels.
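The general shape of this idea can be sketched in a few lines of Python. This is an illustration only, not the code in the PR: `SearchTask`, `ChannelTaskRegistry`, and the channel identifiers are all hypothetical, and the close notification is modelled as a plain method call.

```python
from collections import defaultdict


class SearchTask:
    def __init__(self, task_id):
        self.task_id = task_id
        self.cancelled = False

    def cancel(self):
        self.cancelled = True


class ChannelTaskRegistry:
    """Tracks which in-flight tasks belong to which client channel."""

    def __init__(self):
        self._tasks_by_channel = defaultdict(list)

    def register(self, channel_id, task):
        self._tasks_by_channel[channel_id].append(task)

    def unregister(self, channel_id, task):
        # Called when a task finishes normally.
        tasks = self._tasks_by_channel.get(channel_id, [])
        if task in tasks:
            tasks.remove(task)
        if not tasks:
            self._tasks_by_channel.pop(channel_id, None)

    def on_channel_closed(self, channel_id):
        # Close listener: cancel everything still tied to the closed channel.
        tasks = self._tasks_by_channel.pop(channel_id, [])
        for task in tasks:
            task.cancel()
        return len(tasks)


registry = ChannelTaskRegistry()
a, b, c = SearchTask(1), SearchTask(2), SearchTask(3)
registry.register("chan-1", a)
registry.register("chan-1", b)
registry.register("chan-2", c)
registry.unregister("chan-1", b)  # b completed before the disconnect
cancelled_count = registry.on_channel_closed("chan-1")
```

Only task `a` is cancelled here: `b` had already finished, and `c` belongs to a channel that is still open.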

Analytics

While investigating an issue with doc value iterators for deeply-nested aggregations (see: #43091), we noticed that the memory used to keep track of the doc ids and buckets in the BestBucketsDeferringCollector showed up as one of the main contributors. In our tests, half of the memory held in the BestBucketsDeferringCollector is associated with segments that don't have matching docs in the selected buckets. This is expected on fields that have a high cardinality, since each bucket can appear in very few segments. We opened a PR to allocate the builders lazily, which reduces the memory consumption by a factor of 2 (from 1GB to 512MB), and thus reduces the impact on garbage collections for these volatile allocations.
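The lazy-allocation pattern itself is simple, and a rough Python sketch may help. The class below is hypothetical and much simpler than the real collector: it only shows that a per-segment buffer is created when the first document for that segment is collected, so segments without matches cost nothing.

```python
class LazyPerSegmentBuffers:
    """Allocates a per-segment buffer only when the first doc is collected."""

    def __init__(self, num_segments):
        self.num_segments = num_segments
        self._buffers = {}  # segment ordinal -> list of (doc_id, bucket)

    def collect(self, segment, doc_id, bucket):
        buf = self._buffers.get(segment)
        if buf is None:
            # First match in this segment: allocate its buffer now.
            buf = self._buffers[segment] = []
        buf.append((doc_id, bucket))

    def allocated_segments(self):
        return len(self._buffers)


collector = LazyPerSegmentBuffers(num_segments=8)
collector.collect(segment=2, doc_id=17, bucket=0)
collector.collect(segment=2, doc_id=40, bucket=1)
collector.collect(segment=5, doc_id=3, bucket=0)
```

With eight segments but matches in only two of them, just two buffers are ever allocated.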

We recently added a min_interval option to the auto_date_histogram aggregation. This lets the user specify the smallest interval they want when automatically generating buckets. This is needed because most data is collected at a specific sampling rate, and it is nonsensical to try to zoom "past" that sampling rate (go go Nyquist limit). It will also probably let Rollup support auto_date_histogram.
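As a rough illustration of what a minimum interval changes, the Python sketch below picks a bucket width from a small ladder of candidates. The ladder, the function name `pick_interval`, and the selection rule are assumptions made for this example; the real aggregation uses a richer set of roundings.

```python
# Candidate bucket widths in seconds, smallest first (illustrative ladder).
INTERVALS = [("second", 1), ("minute", 60), ("hour", 3600), ("day", 86400)]


def pick_interval(range_seconds, target_buckets, min_interval="second"):
    """Pick the smallest interval that is >= min_interval and keeps the
    bucket count at or below target_buckets; fall back to the largest."""
    names = [name for name, _ in INTERVALS]
    start = names.index(min_interval)
    for name, width in INTERVALS[start:]:
        buckets = -(-range_seconds // width)  # ceiling division
        if buckets <= target_buckets:
            return name
    return INTERVALS[-1][0]
```

For a five-second time range and a target of ten buckets, the sketch would choose second-wide buckets by default, but minute-wide buckets if the data is only sampled once a minute and `min_interval="minute"` is passed.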

We opened an ES PR to iterate on a new Principal Component Analysis aggregation.

Geo

We are working on an improvement for BKD performance when the tree contains duplicated points. Results are very encouraging and we are already working on the first change in Lucene.

We opened a PR which adds support for MultiPoint and Point shapes to be stored in GeometryTree. To represent the collection of points, a KDbush is used, which is an array sorted recursively by alternating x and y dimensions.
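To show what "sorted recursively by alternating dimensions" means in practice, here is a small Python sketch. It is an assumption-laden illustration rather than the GeometryTree code: `kd_sort` and `range_search` are hypothetical names, and the sketch uses a full sort at each level where a real implementation would more likely partition in place.

```python
def kd_sort(points, axis=0):
    """Order points so the median along `axis` sits in the middle, with each
    half ordered the same way along the other axis."""
    if len(points) <= 1:
        return list(points)
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    other = 1 - axis  # alternate between x (0) and y (1)
    return (kd_sort(ordered[:mid], other)
            + [ordered[mid]]
            + kd_sort(ordered[mid + 1:], other))


def range_search(pts, box, lo=0, hi=None, axis=0):
    """Collect points inside box=(min_x, min_y, max_x, max_y) by walking the
    implicit tree stored in the kd-sorted array."""
    if hi is None:
        hi = len(pts) - 1
    if lo > hi:
        return []
    mid = lo + (hi - lo + 1) // 2  # same midpoint rule as kd_sort
    x, y = pts[mid]
    min_x, min_y, max_x, max_y = box
    found = [(x, y)] if min_x <= x <= max_x and min_y <= y <= max_y else []
    value = (x, y)[axis]
    lower, upper = (min_x, max_x) if axis == 0 else (min_y, max_y)
    if value >= lower:  # left half holds values <= value on this axis
        found += range_search(pts, box, lo, mid - 1, 1 - axis)
    if value <= upper:  # right half holds values >= value on this axis
        found += range_search(pts, box, mid + 1, hi, 1 - axis)
    return found


points = [(5, 5), (1, 9), (3, 2), (8, 1), (7, 7), (2, 4), (9, 6)]
kd = kd_sort(points)
hits = range_search(kd, (2, 2, 7, 7))
```

The array needs no extra node objects: the tree structure is implied by the midpoints, which is what keeps this representation compact.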

We opened a Lucene PR to merge XYShape support. The PR has complete feature parity with LatLonShape and works on Cartesian geometry. Next steps are to add Cartesian geometry support to the Shape builders and parsers in the ES geo library and add the field mapper to the xpack geo plugin.

Rally 1.2.1 released

The team released Rally 1.2.1. There is a lot of goodness in the release. Most notably, Rally now validates track parameters (visualized below) and has a Docker image!

Rally Track Validation

Snapshot resiliency

We implemented a way to efficiently delete (virtual) directories in the blob store repositories, which will power the auto-clean up logic for orphaned indices.
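Blob stores usually have no real directories, only keys that share a prefix, so deleting a "directory" amounts to deleting every blob under that prefix. The Python sketch below illustrates that idea against an in-memory dict; `delete_directory` is a hypothetical helper and says nothing about how the repository plugins batch or parallelize the actual deletes.

```python
def delete_directory(blobs, path):
    """Delete every blob whose key lives under the virtual directory `path`."""
    # The trailing slash keeps "indices/a" from matching "indices/ab/...".
    prefix = path.rstrip("/") + "/"
    doomed = [key for key in blobs if key.startswith(prefix)]
    for key in doomed:
        del blobs[key]
    return len(doomed)


store = {
    "indices/a/0/seg1": b"",
    "indices/a/1/seg2": b"",
    "indices/ab/0/seg1": b"",
    "meta": b"",
}
deleted = delete_directory(store, "indices/a")
```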

We're also looking at upgrading the Azure repository plugin to the newer Azure v11 SDK. The legacy Azure SDK, which we're currently using, does not have a notion of bulk operations and is fully blocking on all API calls, which required us to implement bulk deletes via a private thread pool. The newer SDK is based on non-blocking IO and would allow for massive parallelism in operations without the cost of a private thread pool. There are some hurdles we have to overcome before we can upgrade, though. We're raising issues on the Azure SDK's GitHub repository to get these sorted out.

PKI in Kibana work

We've reached a general consensus on the approach we want to take for PKI in Kibana. There are still a few outstanding issues that we need to work through, but we've agreed on enough that we can start development work.

ILM

We fixed a couple of ILM issues found in the wild.

The first is where force merging could wait forever for an expected segment count. Force merging via ILM is now best effort, which prevents policies from being stuck forever in this step. (#43246)

The second is that stopping ILM could be blocked for a long period around the shrink action. This period has now been narrowed so that we only prevent ILM from stopping when absolutely necessary. (#43254)

UI Goodness

The UI team is working on roadmaps for some new management interfaces.

We're working on an Index Templates UI, which will help users manage their index templates as well as understand how those templates interact: Index Templates

We're also working on the roadmap for an Index Mappings Editor, which will allow users to configure index mappings using a UI. The work is still in its early stages and subject to change, but hopefully this mockup helps convey the power of what the team is looking at: Index Mappings

Apache Lucene

Geo-land

We're working on optimizing the BKD tree for leaves that contain just a few distinct point values. Currently leaves are treated (almost) the same regardless of whether they have two different point values or all unique points. If we can detect low cardinality leaves, we can reduce the index size by adding a new storage strategy and boost query performance by making sure that equal points are only tested against the query once.
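One way to picture such a storage strategy is run-length encoding of a leaf: store each distinct point once together with a count. The Python sketch below is only an illustration of that idea under this assumption, not the encoding Lucene will use; `encode_leaf` and `count_matches` are hypothetical names.

```python
def encode_leaf(points):
    """Run-length encode a leaf as [(value, count), ...] in sorted order."""
    runs = []
    for p in sorted(points):
        if runs and runs[-1][0] == p:
            runs[-1] = (p, runs[-1][1] + 1)
        else:
            runs.append((p, 1))
    return runs


def count_matches(runs, predicate):
    """Count matching points, evaluating the predicate once per distinct
    value instead of once per point."""
    evaluations = 0
    matches = 0
    for value, count in runs:
        evaluations += 1
        if predicate(value):
            matches += count
    return matches, evaluations


leaf = [(1, 1)] * 500 + [(2, 3)] * 12
runs = encode_leaf(leaf)
matches, evaluations = count_matches(runs, lambda p: p[0] >= 2)
```

A leaf of 512 points with two distinct values shrinks to two runs, and a query predicate is evaluated twice rather than 512 times.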

Performance

Our proposed change to add a range query that takes advantage of index sorting has been approved and will land in the very near future.
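The intuition for why index sorting helps is that, when documents are stored in order of the queried field, the matching documents form one contiguous run of doc IDs that can be found with two binary searches. The Python sketch below models this with a plain list; `range_query_sorted` is a hypothetical name, and the real query has to deal with segments, missing values, and sort order, which this ignores.

```python
from bisect import bisect_left, bisect_right


def range_query_sorted(values, lower, upper):
    """Return the doc ID range whose values fall in [lower, upper].

    `values[doc_id]` must be non-decreasing in doc ID order, which is what
    sorting the index on that field provides in this simplified model.
    """
    start = bisect_left(values, lower)
    end = bisect_right(values, upper)
    return range(start, end)


values = [1, 3, 3, 5, 8, 13, 21]  # field value per doc ID, already sorted
matching_docs = list(range_query_sorted(values, 3, 8))
```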

We worked on improving IndexSearcher's parallel capabilities when an executor is supplied. Lucene now also uses the incoming thread for search execution, whereas before it sat idle waiting for the spawned threads on the executor to finish.
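The pattern can be sketched with Python's `concurrent.futures`. This is an analogy, not Lucene code: `search_slices` and `count_hits` are hypothetical, and running the last slice on the caller is simply the choice made for this sketch.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def search_slices(slices, search_one, executor):
    """Search all slices, running the last one on the calling thread."""
    if not slices:
        return []
    futures = [executor.submit(search_one, s) for s in slices[:-1]]
    last = search_one(slices[-1])  # the caller works instead of idling
    return [f.result() for f in futures] + [last]


def count_hits(segment):
    # Stand-in for per-slice search work: also record which thread ran it.
    return (len(segment), threading.current_thread().name)


caller = threading.current_thread().name
with ThreadPoolExecutor(max_workers=2) as pool:
    results = search_slices([[1, 2], [3], [4, 5, 6]], count_hits, pool)
```

The results come back in slice order, and the thread name recorded for the final slice is the caller's own, showing that it did real work rather than only waiting on the futures.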

Misc

A release wizard tool that guides the release manager through the release process step by step, helping to run the right commands in the right order, generating e-mail templates with the correct text, versions, paths, etc., obeying the voting rules, and much more.

Efforts to make queries accountable for their RAM usage.

We have some interesting follow-ups on moving computation to GPUs. This is far-future work but fun to follow.

Changes

Changes in Elasticsearch

Changes in 8.0:

  • BREAKING: Get snapshots support for multiple repositories #42090
  • Remove indices exists action #43164

Changes in 7.3:

  • Added parsing of erroneous field value #42321
  • Add painless method getByPath, get value from nested collections with dotted path #43170
  • Testclusters: convert remaining x-pack #43335
  • Advance checkpoints only after persisting ops #43205
  • Reduce the number of docvalues iterator created in the global ordinals fielddata #43091
  • Allocate memory lazily in BestBucketsDeferringCollector #43339
  • [ML][Data Frame] adds new pipeline field to dest config #43124
  • Do not use soft-deletes to resolve indexing strategy #43336
  • Add kerberos grant_type to get token in exchange for Kerberos ticket #42847
  • Properly format reproduction lines for test methods that contain periods #43255
  • Deprecate native code info in xpack info api #43297
  • Geo: Add coerce support to libs/geo WKT parser #43273
  • Move dense_vector and sparse_vector to module #43280
  • Recursive Delete on BlobContainer #43281
  • Local node is discovered when cluster fails #43316
  • Rebuild version map when opening internal engine #43202

Changes in 7.2:

  • Fix round up of date range without rounding #43303
  • Prevent NullPointerException in TransportRolloverAction #43353

Changes in 7.0:

  • Make ILM force merging best effort #43246

Changes in 6.8:

  • Reconnect remote cluster when seeds are changed #43379
  • SQL: fix NPE in case of subsequent scrolled requests for a CSV/TSV formatted response #43365
  • Return 0 for negative "free" and "total" memory reported by the OS #42725
  • Narrow period of Shrink action in which ILM prevents stopping #43254

Changes in Elasticsearch Hadoop Plugin

Changes in 6.8:

  • [DOCS] Updates documentation version #1305

Changes in Rally

Changes in 1.3.0:

  • Implement ES daemon-mode in process launcher #701

Changes in 1.2.1:

  • Check tags in track and team repos #713