This Week in Elasticsearch and Apache Lucene - 2019-06-07

Elasticsearch Highlights

Transport Client

We have removed the transport client from the codebase, meaning Elasticsearch 8.0.0 will not include the Java TransportClient. Client jar support has been removed from the build; client jars were jar files that needed to be published so that a component (e.g. reindex, percolator, mustache) could be used with the transport client. We have also removed the transport client from our documentation.
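For anyone migrating away from it, here is a minimal sketch (our own illustration, with placeholder host and port) of wiring up the replacement, the Java high-level REST client:

    // Minimal sketch of the TransportClient replacement: the high-level REST client.
    import org.apache.http.HttpHost;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;

    public class ClientMigration {
        public static void main(String[] args) throws Exception {
            try (RestHighLevelClient client = new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
                // Requests now go through client.index(...), client.search(...), etc.
            }
        }
    }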

Snapshot Restore UI

We started work on the delete snapshots functionality and opened a PR that fixes some i18n issues.

Better storage of _source

We revived an old issue about alternative ways to store our _source field in Lucene indices and wrote a quick proof of concept that stores each top-level field of the _source document in a separate stored field. The main benefit of this approach is that Lucene assigns a numeric identifier to each field, so we can refer to fields by these numbers rather than repeating field names over and over again. This gave a 10-20% reduction in disk usage for stored fields on geonames, depending on the codec. This is one of several ideas currently being explored that could help reduce disk usage, alongside dropping the _id field on documents that are below the global checkpoint for append-only use cases, and possibly not indexing the @timestamp field at all, instead using index sorting plus binary search on doc values to search on it.
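For illustration, a minimal sketch (our own, not the actual prototype) of what "one stored field per top-level _source field" looks like at the Lucene level, with the source pre-flattened into a string map for simplicity:

    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.StoredField;

    import java.util.Map;

    public class PerFieldSource {
        // flatSource is the parsed top-level view of a _source document,
        // e.g. {"name" -> "Berlin", "country" -> "DE"}.
        static Document toDocument(Map<String, String> flatSource) {
            Document doc = new Document();
            // Lucene maps each distinct field name to a small numeric id in its field
            // infos, so the name does not have to be repeated for every document on disk.
            flatSource.forEach((field, value) -> doc.add(new StoredField(field, value)));
            return doc;
        }
    }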

Faster range query on sorted index

We are looking into optimizing range queries when the index is sorted (https://issues.apache.org/jira/browse/LUCENE-7714). We wrote a first prototype that shows an improvement over standard range queries. Another interesting aspect of this optimization is that it enables searching on a numeric field without using the KD tree, so users who take disk usage seriously could be interested in this feature regardless of whether it performs faster or slower than range queries on KD trees.
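To make the idea concrete, here is a simplified sketch (assumptions ours, not the actual prototype) of finding the lower edge of a range via binary search over doc IDs, which works because an ascending index sort puts doc IDs in value order:

    import org.apache.lucene.index.LeafReader;
    import org.apache.lucene.index.NumericDocValues;

    import java.io.IOException;

    public class SortedRangeSearch {
        // Returns the first doc ID whose value is >= lower, assuming the index is sorted
        // ascending on `field` and every document has a value for it.
        static int firstDocAtLeast(LeafReader reader, String field, long lower) throws IOException {
            int lo = 0, hi = reader.maxDoc() - 1, result = reader.maxDoc();
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;
                // Doc values iterators only move forward, so each probe uses a fresh iterator.
                NumericDocValues values = reader.getNumericDocValues(field);
                long value = values.advanceExact(mid) ? values.longValue() : Long.MIN_VALUE;
                if (value >= lower) {
                    result = mid;
                    hi = mid - 1;
                } else {
                    lo = mid + 1;
                }
            }
            return result;
        }
    }

The upper edge can be found the same way, and the matching documents are then simply the contiguous doc ID range in between, with no points (KD tree) structure needed on disk.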

Encrypted Snapshots

As part of our ongoing plan to make it easier and safer to snapshot the security index, we want to have a well-supported option for storing encrypted snapshots. We currently support server-side encryption at rest (when the cloud provider can do so transparently), but we don’t offer anything client-side, and we don’t have a solution for file-system-based repositories.

We’re a good way through our thinking about how this would look, but we need to make sure it will be a good fit for our users’ needs, and that we can build something that will last.
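To give a flavour of what "client-side" means here, a purely illustrative sketch (none of this reflects an agreed design) of encrypting a snapshot blob with AES-GCM before it is uploaded, so the repository only ever sees ciphertext:

    import javax.crypto.Cipher;
    import javax.crypto.SecretKey;
    import javax.crypto.spec.GCMParameterSpec;

    import java.security.SecureRandom;

    public class BlobEncryption {
        // Encrypts a blob locally before it is written to the repository.
        static byte[] encryptBlob(byte[] plaintext, SecretKey key) throws Exception {
            byte[] iv = new byte[12];
            new SecureRandom().nextBytes(iv);
            Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
            cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
            byte[] ciphertext = cipher.doFinal(plaintext);
            // Prepend the IV so the blob is self-contained for later decryption.
            byte[] out = new byte[iv.length + ciphertext.length];
            System.arraycopy(iv, 0, out, 0, iv.length);
            System.arraycopy(ciphertext, 0, out, iv.length, ciphertext.length);
            return out;
        }
    }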

Rollup GA

We opened an issue detailing what we need for Rollup GA. GA work has commenced on one of the more straightforward items: allowing the DeleteJob API to also delete the rolled-up data. Today this requires manual user involvement and is not user-friendly: a delete-by-query followed by updating the _meta mapping field. This will instead be handled by the DeleteJob API with an optional flag to also delete the data when the job is deleted. The rest of the Rollup GA items are of a similar nature: improvements to management ergonomics.
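For reference, a hedged sketch of the manual step users have to perform today; the "_rollup.id" field name is our assumption based on how rollup documents are tagged:

    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.index.reindex.DeleteByQueryRequest;

    public class RollupCleanup {
        // Deletes the rolled-up documents belonging to one job from the rollup index.
        static void deleteRolledUpData(RestHighLevelClient client, String rollupIndex,
                                       String jobId) throws Exception {
            DeleteByQueryRequest request = new DeleteByQueryRequest(rollupIndex);
            request.setQuery(QueryBuilders.termQuery("_rollup.id", jobId));
            client.deleteByQuery(request, RequestOptions.DEFAULT);
            // ...followed by manually updating the _meta mapping field of the rollup index,
            // which is exactly the chore the DeleteJob API flag is meant to remove.
        }
    }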

Snapshot Resiliency - it's all about hygiene

The snapshot resiliency effort currently focuses on the problem of left-over snapshot files not being cleaned up, resulting in large amounts of left-over files for repositories that have been actively used over longer periods of time. Left-over files can be caused both by snapshot deletions that failed mid-way through and by snapshots that did not complete successfully. There can be plenty of reasons for these operations to fail, from those that are under Elasticsearch's control, such as coordinating repository access among the nodes in the cluster, to those that are outside of ES’s control, e.g. machines crashing or connectivity issues with the service where snapshots are stored.

Many of the resiliency-related issues that were under Elasticsearch's control have been fixed in recent versions (6.4+). ES will also have to clean up cases where things went wrong outside of ES's control, as well as clean up after historical ES versions that might have accumulated a lot of left-over files. We are focusing on adding clean-up functionality for left-over files that will be performed automatically by future ES versions during regular snapshot operations, and working on a tool that our Cloud team can run periodically to clean up repositories that are actively being snapshotted to by older ES versions without the automatic clean-up support.

Last week's snapshot resiliency sync focused on the performance characteristics of the auto-clean-up functionality that will be coming in newer ES versions, given that existing repositories might have accumulated a large number of left-over files. We did some preliminary benchmarking of the functionality, and it appeared that the main issue would be with Azure, since the Azure blob store does not allow bulk deleting multiple blobs in a single request. Given that Azure does not provide a native API for bulk deletes (unlike S3 and GCS), we decided to add a thread pool to the Azure plugin to parallelize delete operations, which we implemented here.
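Stripped of the repository plumbing, the general idea looks roughly like this (pool size and method names are made up for the sketch):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;
    import java.util.function.Consumer;

    public class ParallelDelete {
        // Issues one single-blob delete per task, since Azure has no bulk-delete API.
        static void deleteAll(List<String> blobNames, Consumer<String> deleteOneBlob) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(8); // hypothetical pool size
            try {
                List<Future<?>> futures = new ArrayList<>();
                for (String name : blobNames) {
                    futures.add(pool.submit(() -> deleteOneBlob.accept(name)));
                }
                for (Future<?> future : futures) {
                    future.get(); // surface any per-blob failure
                }
            } finally {
                pool.shutdown();
            }
        }
    }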

We also added support for listing subdirectories on all our repository implementations, which is a prerequisite for efficiently finding left-over index folders. While blob stores, in contrast to file systems, generally do not provide a hierarchical view of files, most implementations (S3, GCS, Azure) provide virtual directory support by allowing certain operations to treat files that share a common prefix, up to a configurable delimiter, as belonging to the same virtual directory. We have also beefed up the testing for this, which we need to run against the actual cloud services in order to be confident that these operations are implemented correctly.
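As an example of the common-prefix trick, listing child folders on S3 with the AWS SDK looks roughly like this (the other providers expose an equivalent delimiter-based listing):

    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.model.ListObjectsV2Request;
    import com.amazonaws.services.s3.model.ListObjectsV2Result;

    import java.util.List;

    public class ListChildren {
        // Returns the "virtual subdirectories" directly under basePath.
        static List<String> childFolders(AmazonS3 s3, String bucket, String basePath) {
            ListObjectsV2Request request = new ListObjectsV2Request()
                    .withBucketName(bucket)
                    .withPrefix(basePath)  // e.g. "indices/"
                    .withDelimiter("/");   // group keys sharing a prefix into one "directory"
            ListObjectsV2Result result = s3.listObjectsV2(request);
            return result.getCommonPrefixes(); // one entry per virtual subdirectory
        }
    }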

We prepared a document that outlines the options for implementing a tool for removing left-over snapshot files on Cloud. In contrast to the automated clean-up that is being implemented in Elasticsearch and that will run as part of the regular snapshot lifecycle, the Cloud clean-up tool cannot assume that no snapshots are running concurrently: blocking snapshots while the tool runs would interfere with existing snapshot operations on Cloud and require complex coordination between the snapshot process triggered by clusters on Cloud and the clean-up tool, which will run on different infrastructure. This means, however, that the tool needs a different algorithmic approach for detecting left-over files, as it might otherwise fail to distinguish between files that are truly left over and those that are in the process of being written by a concurrently running snapshot.

We've explored two options. The first relies on file timestamps being somewhat accurate, which is an option when limiting ourselves to the three cloud services (S3, GCS, Azure). The second is a multi-pass approach, where the tool requires a second run after a certain period of time has passed and accounts for the time elapsed between the runs. We are looking at the operational impact of both approaches, the testing aspects, and how to package the clean-up tool so that it can easily be run on Cloud infrastructure.
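As a very rough illustration of the multi-pass idea (all details hypothetical, not taken from the design document): only files that were already flagged on an earlier run, and are still unreferenced a safe interval later, become eligible for deletion, so anything written by a snapshot running between the two passes is never touched:

    import java.util.HashSet;
    import java.util.Set;

    public class TwoPassCleanup {
        private final Set<String> candidatesFromPreviousRun = new HashSet<>();

        // First run: record suspected left-over files, but delete nothing yet.
        void firstPass(Set<String> suspectedLeftOvers) {
            candidatesFromPreviousRun.addAll(suspectedLeftOvers);
        }

        // Second run, after a safety interval: only files that were already suspected
        // last time AND are still unreferenced now may be deleted.
        Set<String> secondPass(Set<String> suspectedLeftOvers) {
            Set<String> toDelete = new HashSet<>(suspectedLeftOvers);
            toDelete.retainAll(candidatesFromPreviousRun);
            return toDelete;
        }
    }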

Apache Lucene

Change of score for fuzzy queries

Fuzzy queries are currently rewritten as a disjunction, which has the downside that documents that contain multiple terms from the rewritten query will score better than documents that contain multiple occurrences of a single term. For instance, foobar~1 could be rewritten as fobar OR foobaz, and a document that has one occurrence of fobar and one of foobaz will likely get a better score than a document that has two occurrences of fobar. We argue that a better way to score fuzzy queries would be to use a SynonymQuery, which would sum up term frequencies across all matched terms in order to compute the score.
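The difference is easy to see in Lucene terms; here is a small sketch (ours, using the example terms above) contrasting the two rewrites:

    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.search.SynonymQuery;
    import org.apache.lucene.search.TermQuery;

    public class FuzzyRewrite {
        // Current behaviour: a disjunction, where each term is scored independently,
        // so matching two distinct terms tends to beat matching one term twice.
        static Query asDisjunction(String field) {
            return new BooleanQuery.Builder()
                    .add(new TermQuery(new Term(field, "fobar")), BooleanClause.Occur.SHOULD)
                    .add(new TermQuery(new Term(field, "foobaz")), BooleanClause.Occur.SHOULD)
                    .build();
        }

        // Proposed behaviour: a SynonymQuery, which sums term frequencies across the
        // matched terms before scoring, treating them as a single pseudo-term.
        static Query asSynonym(String field) {
            return new SynonymQuery(new Term(field, "fobar"), new Term(field, "foobaz"));
        }
    }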

Other

Changes in Elasticsearch

Changes in 8.0:

  • Remove the transport client #42538
  • Skip shadow jar logic for javadoc and sources jars #42904
  • BREAKING: Removes type from TermVectors APIs #42198
  • Make high level rest client a fat jar #42771
  • BREAKING: RollupStart endpoint should return OK if job already started #41502

Changes in 7.3:

  • Reindex max_docs parameter name #41894
  • Deprecation info for joda-java migration on 7.x #42659
  • Add a merge policy that prunes ID postings for soft-deleted but retained documents #40741
  • Omit JDK sources archive from bundled JDK #42821
  • Add custom metadata to snapshots #41281
  • Fix Infinite Loops in ExceptionsHelper#unwrap #42716
  • Add Ability to List Child Containers to BlobContainer #42653
  • Enable console audit logs for docker #42671
  • Enable Parallel Deletes in Azure Repository #42783
  • Deduplicate alias and concrete fields in query field expansion #42328
  • Permit API Keys on Basic License #42787
  • Replicate aliases in cross-cluster replication #41815
  • Eclipse libs projects setup fix #42852
  • Remove unnecessary usage of Gradle dependency substitution rules #42773
  • Fix error with test conventions on tasks that require Docker #42719
  • Remove "template" field in IndexTemplateMetaData #42099
  • Read the default pipeline for bulk upsert through an alias #41963

Changes in 7.2:

  • Skip installation of pre-bundled integ-test modules #42900
  • Fix NPE when rejecting bulk updates #42923
  • Use reader attributes to control term dict memory usage #42838
  • Avoid clobbering shared testcluster JAR files when installing modules #42879
  • NullPointerException when creating a watch with Jira action (#41922) #42081
  • Don't require TLS for single node clusters #42826

Changes in 6.8:

  • Fix concurrent search and index delete #42621
  • Wire query cache into sorting nested-filter computation #42906
  • Enable testing against JDK 13 EA builds #40829
  • Fixes a bug in AnalyzeRequest.toXContent() #42795

Changes in Elasticsearch Management UI

Changes in 7.3:

  • Add repository-azure autocompletion settings #37935

Changes in Rally

Changes in 1.2.0:

  • Add Rally Docker image to release process #702
  • Add download subcommand #704
  • Provide default for datastore.secure in all cases #705