Snapshot Lifecycle Management (SLM)
When creating a snapshot, SLM previously invoked the create snapshot API and returned immediately. This was fine, but it made it possible to mistake starting a snapshot for successfully finishing one. SLM now waits internally for snapshots to complete, so we can more accurately reflect whether a snapshot succeeded or failed (such as when it was successfully initiated but cancelled halfway through). (#47051)
We submitted a PR that adds the ability to configure retention for a Snapshot Lifecycle Management (SLM) policy from the UI. Snapshot retention will kick in based on a cron schedule configured with the newly introduced cluster setting, slm.retention_schedule. This setting is surfaced through the UI, where users can edit its value.
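As a sketch of how this setting might be applied directly through the cluster settings API, the snippet below builds the request body; the cron expression is illustrative, not a default value:

```python
import json

# Build the body for a PUT _cluster/settings request that enables SLM
# retention on a cron schedule. The cron value below is an illustrative
# example (daily at 01:30), not a default.
def retention_settings_body(cron_schedule: str) -> str:
    return json.dumps({"persistent": {"slm.retention_schedule": cron_schedule}})

body = retention_settings_body("0 30 1 * * ?")
```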
Index Lifecycle Management (ILM)
We previously introduced a way for users to configure an origination date that is used to calculate the index age for phase transitions in the context of ILM (#46561). We now allow the origination date to be parsed from the index name (#46755), which resolves enhancement #42449. A user can set index.lifecycle.parse_origination_date to true and the origination date will be parsed from the index name. The origination date is the base for all min_age calculations.
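The general idea can be sketched as follows. This is not Elasticsearch's parser, and the exact name pattern Elasticsearch accepts may differ; the sketch assumes a name like logs-2019.09.12-000001 with the date embedded in yyyy.MM.dd form:

```python
import re
from datetime import datetime, timezone

# Illustrative sketch: extract an origination date embedded in an index
# name such as "logs-2019.09.12-000001". Not Elasticsearch's actual code.
NAME_RE = re.compile(r"^.*-(\d{4}\.\d{2}\.\d{2})(?:-\d+)?$")

def parse_origination_date(index_name: str) -> datetime:
    m = NAME_RE.match(index_name)
    if not m:
        raise ValueError(f"no date found in index name: {index_name}")
    return datetime.strptime(m.group(1), "%Y.%m.%d").replace(tzinfo=timezone.utc)
```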
We have been working through the details necessary to re-introduce the Hash processor (#34085).
Last year, we wrote the Hash processor to help anonymize data before it is indexed. This can be helpful for some GDPR workflows, or any type of analysis where you don't want the person doing the analysis to see the actual value. For example, for a human analyst looking for anomalies across health records, the patient's name is not relevant, but a unique identifier for that patient is.
The Hash processor works by using keyed cryptographic hashes to anonymize the data. In order to ensure that resultant hashes are consistent for the same data, the same key must be used. When the Hash processor was first introduced there was no way to enforce that the same key was used by each node (the key is sensitive like a password and lives in the keystore, not cluster state). A bad configuration (e.g. different keys on different hosts) would result in no technical failures but incorrect results that are very difficult to distinguish from correct results. Due to this possibility, the Hash processor was pulled until we had a way to ensure that all keys across the cluster were consistent.
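The keyed-hashing idea (not the processor's actual implementation) can be illustrated in a few lines: the same key yields the same pseudonym for the same value, so joins and aggregations still work, while a node with a different key silently produces different, valid-looking hashes, which is exactly the misconfiguration described above:

```python
import hashlib
import hmac

# Keyed cryptographic hash (HMAC-SHA256) as a pseudonymization sketch.
def anonymize(value: str, key: bytes) -> str:
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

key_a = b"shared-cluster-key"   # hypothetical key names for illustration
key_b = b"different-key"

# Consistent for the same value under the same key...
assert anonymize("patient-42", key_a) == anonymize("patient-42", key_a)
# ...but a node configured with a different key yields a different hash
# that is impossible to tell apart from a correct one by inspection.
assert anonymize("patient-42", key_a) != anonymize("patient-42", key_b)
```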
Since the time the Hash processor was pulled, we have introduced a new kind of setting: "consistent secure settings" (#40416). We have a work in progress PR that dusts off the original Hash processor, updates it to accommodate recent refactorings, and makes use of consistent secure settings. Since this is the first implementation of consistent secure settings, there are a few items that still need to be ironed out.
Sequence-number aware replication
We are making the replica allocator sequence-number aware. Shard allocation will prefer allocating replicas on nodes where they can perform an operation-based recovery, all powered by peer recovery retention leases. This is a big step in simplifying rolling restarts as well as full-cluster restarts, reducing the time the cluster takes to go back to green and reducing the number of operational steps required to perform such a restart, aiming to remove the need for a synced flush and allowing quicker recoveries even if indexing is ongoing during the rolling restart. An important factor here is that the master, which is in charge of allocating shards, needs recent information about the content of the various shard copies. Today, information about the primary shard is not refreshed when a new node joins, so the master risks making bad choices during replica allocation based on stale information about the primary. In the worst case, this could result in cancelling ongoing recoveries that might have advanced further than the new allocation target.
We completed the work on cloud-based repository testing by adding block support to the Azure tests and multipart support to the S3 tests. Together with the resumable upload support added to the GCS tests two weeks ago, we now have a generalized test for uploading large segment files. These tests paid off right away by catching a bug in the snapshot deletion process, which was promptly fixed. We also added low-level tests that check the retry logic of the Google client as well as the Azure client.
A test failure around rethrottle and slicing prompted us to stay more backwards compatible and not respond to a reindex request before all slice subtasks are created, ensuring that a rethrottle hits all subtasks/slices. We also changed the reindex checkpoint mechanism, which allows a reindex to automatically resume after a crash, to throttle writes and to complete cleanly when the reindex operation finishes.
We have added support for custom log messages in JSON. This will allow us to log additional fields for some events like cluster status changes. Additionally, we have started scoping out changes necessary to support emitting ECS compatible logs from Elasticsearch.
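The structured-logging idea can be illustrated with a minimal JSON formatter; this is a generic Python sketch, not Elasticsearch's log4j2 layout, and the field names are hypothetical. Each record is a single JSON object, so extra event fields, such as the old and new cluster status, can be attached without breaking the line format:

```python
import io
import json
import logging

# Minimal JSON log formatter: serializes each record as one JSON object
# and merges in any structured fields attached to the record.
class JsonFormatter(logging.Formatter):
    def format(self, record):
        doc = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        doc.update(getattr(record, "fields", {}))
        return json.dumps(doc)

buf = io.StringIO()
handler = logging.StreamHandler(buf)
handler.setFormatter(JsonFormatter())
log = logging.getLogger("cluster.health")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

# Attach event-specific fields alongside the message.
log.info("status changed", extra={"fields": {"previous": "yellow", "current": "green"}})
line = json.loads(buf.getvalue())
```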
Faster queries in high dimensions
We opened an issue to improve query performance on the BKD tree by computing the exact bounds after each split. Currently we only adjust the bound of the splitting dimension after each split, so changes in the other dimensions are not reflected. The first results showed a big improvement in query performance, but also incurred a big penalty in terms of indexing throughput.
We wondered if we really need to compute the exact bounds after each split as this is an expensive operation, so we modified the original approach to only compute bounds after N splits (currently N=4). This showed a similar query performance to the previous approach, but a significantly lower penalty in indexing throughput.
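The trade-off can be sketched with a toy recursive split; this is not Lucene's BKD code, just an illustration of the idea. Exact per-dimension bounds are recomputed only every N splits; in between, only the split dimension's bound is adjusted, so the other dimensions' bounds may stay loose:

```python
# Toy sketch: median-split points on the widest dimension, recomputing
# exact bounds only every N splits (the expensive step).
N = 4

def exact_bounds(points):
    dims = len(points[0])
    return [(min(p[d] for p in points), max(p[d] for p in points))
            for d in range(dims)]

def split(points, bounds, depth=0, leaf_size=4):
    if len(points) <= leaf_size:
        return [(points, bounds)]
    if depth % N == 0:
        bounds = exact_bounds(points)          # expensive recomputation
    # split on the widest dimension at the median
    dim = max(range(len(bounds)), key=lambda d: bounds[d][1] - bounds[d][0])
    pts = sorted(points, key=lambda p: p[dim])
    mid = len(pts) // 2
    lo_b, hi_b = list(bounds), list(bounds)
    lo_b[dim] = (bounds[dim][0], pts[mid - 1][dim])  # only this dim adjusted
    hi_b[dim] = (pts[mid][dim], bounds[dim][1])
    return (split(pts[:mid], lo_b, depth + 1, leaf_size)
            + split(pts[mid:], hi_b, depth + 1, leaf_size))
```

Larger N means fewer expensive bound recomputations (better indexing throughput) at the cost of looser bounds between recomputations (slightly worse query pruning).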
Rounding optimization for fixed offset timezones
Some time ago there was a performance regression that we spotted in our benchmarks regarding UTC "calendar" intervals. This was subsequently fixed (#38221), but as it turns out it also affects fixed intervals. This manifested as a large performance regression, reported by users. We implemented the rounding optimization for fixed offset timezones on any interval (#46670), which fixed issue #45702. We also backported it to 7.x and 7.4.0.
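The arithmetic behind the optimization can be sketched as follows; this is not the actual Elasticsearch implementation, only an illustration of why fixed-offset timezones are cheap: with no DST transitions, rounding a UTC timestamp down to an interval boundary in the local zone is pure integer arithmetic.

```python
# Round a UTC timestamp (millis) down to the start of its interval bucket,
# where bucket boundaries are aligned to a fixed UTC offset (millis).
def round_down(ts_millis: int, interval_millis: int, offset_millis: int) -> int:
    local = ts_millis + offset_millis
    return (local // interval_millis) * interval_millis - offset_millis

HOUR = 3_600_000
MINUTE = 60_000

# 10:45 UTC in a UTC+05:30 zone is 16:15 local; the 1-hour bucket starts
# at 16:00 local, i.e. 10:30 UTC.
assert round_down(10 * HOUR + 45 * MINUTE, HOUR, 5 * HOUR + 30 * MINUTE) == 10 * HOUR + 30 * MINUTE
```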
Changes in Elasticsearch
Changes in 8.0:
Changes in 7.5:
- SQL: Check case where the pivot limit is reached #47121
- Wait for snapshot completion in SLM snapshot invocation #47051
- Track enabled test task candidate class files as task input #47054
- SQL: Add support for shape type #46464
- Add support for POST requests to SLM Execute API #47061
- Warn on slow metadata persistence #47005
- ILM: parse origination date from index name #46755
- Remove the use of ClusterFormationTasks form RestTestTask #47022
- Remove isRecovering method from Engine #47039
- queryfield when creating roles #46275
- Add migration tool checks for _field_names disabling #46972
- Reject regexp queries on the _index field. #46945
- Validate index and cluster privilege names when creating a role #46361
- Improve LeaderCheck rejection messages #46998
- Update AWS SDK for repository-s3 plugin to support IAM Roles for Service Accounts #46969
- add function submitDeleteByQueryTask in class RestHighLevelClient #46833
- SQL: Add PIVOT support #46489
- max_children exist only in top level nested sort #46731
- Testfixtures allow a single service only #46780
- Change HLRC count request to accept a QueryBuilder #46904
- Use composition instead of inheritance for extending Gradle plugins #46888
- BREAKING: Add support for aliases in queries on _index. #46640
Changes in 7.4:
- Use 'should' clause instead of 'filter' when querying native privileges #47019
- Assert no exceptions during state application #47090
- Emit log message when parent circuit breaker trips #47000
- Do not rewrite aliases on remove-index from aliases requests #46989
- Fix G1 GC default IHOP #46169
- Fix Bug in Snapshot Status Response Timestamps #46919
- [HLRC] Send min_score as query string parameter to the count API #46829
- GCS deleteBlobsIgnoringIfNotExists should catch StorageException #46832