This Week in Elasticsearch and Apache Lucene - 2020-01-25
Index Lifecycle Management
The ILM Rollover Action is now fully retryable. This is one of our top issues, and a big win for resiliency.
We have extracted a new step from the ILM RolloverAction that waits for the active shards of the newly created index to become available, rather than failing the entire rollover step if that wait times out or cannot be satisfied. We also made the ILM update-settings step retryable. With this, we closed the issue of adding automatic retries to the RolloverAction, as all of its steps are now retryable.
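The idea of re-checking an unsatisfied step instead of failing the whole action can be modelled in a few lines. This is only a sketch under stated assumptions: the step names, the `run_policy` helper and the `max_attempts` limit are hypothetical and are not ILM's actual classes or settings.

```python
from typing import Callable, List, Tuple

# Hypothetical model: each step is a (name, check) pair where check() returns
# True once the step's condition is satisfied. This is not ILM's real API.
Step = Tuple[str, Callable[[], bool]]


def run_policy(steps: List[Step], max_attempts: int = 5) -> List[Tuple[str, int]]:
    """Run steps in order, re-checking an unsatisfied step instead of failing."""
    history = []
    for name, check in steps:
        for attempt in range(1, max_attempts + 1):
            if check():
                history.append((name, attempt))
                break
        else:
            # Only give up once the retry budget for this step is exhausted.
            raise RuntimeError(f"step {name!r} not satisfied after {max_attempts} attempts")
    return history


# Simulate shards of the new index becoming active on the third poll.
shard_polls = iter([False, False, True])
steps = [
    ("rollover", lambda: True),
    ("wait-for-active-shards", lambda: next(shard_polls)),
    ("update-settings", lambda: True),
]
history = run_policy(steps)
print(history)
```

Because the wait is its own step, a slow shard allocation only delays that step; the rollover that already happened is not repeated.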
Snapshot resilience and BWC
In 7.6+ we’re changing the way information is stored in a snapshot repository, so that we are resilient when snapshotting to eventually consistent blob stores such as S3. In previous versions, there was a risk of repositories becoming corrupted if a cluster was doing multiple snapshot operations in quick succession, as an eventually consistent repository would not allow the cluster to see the correct base state to update (and would have it incorrectly overwrite data, corrupting snapshots in the worst case).
We still need to support a scenario where a cluster is, for example, upgraded from 7.5 to 7.6, and if something is wrong with the cluster in 7.6, allow a snapshot that was taken of the cluster in 7.5 to be restored to a new 7.5 cluster (i.e. reverting to an older state of the cluster). However, the new 7.6+ repository format is fundamentally different, not allowing older Elasticsearch versions to read from the repository. As a solution for this use case, we made it so that Elasticsearch 7.6+ will only write in the new 7.6+ repository format if all snapshots in the repository are on 7.6+. This means that as long as 7.5 snapshots are in the repository, ES 7.6+ will still write to the repository in the old way, allowing a 7.5 snapshot to be restored to a 7.5+ cluster.
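The format choice described above boils down to a single check over the versions of the snapshots already in the repository. The following is a minimal sketch, assuming versions are available as three-part `major.minor.patch` strings; the function and constant names are hypothetical and do not mirror the Elasticsearch source.

```python
from typing import Iterable, Tuple

# Assumption for illustration: the new repository format is only written once
# every snapshot in the repository was taken by 7.6.0 or later.
NEW_FORMAT_MIN_VERSION = (7, 6, 0)


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn a 'major.minor.patch' string into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def can_write_new_format(snapshot_versions: Iterable[str]) -> bool:
    """True only if no snapshot in the repository predates the new format."""
    return all(parse_version(v) >= NEW_FORMAT_MIN_VERSION for v in snapshot_versions)


mixed_repo = ["7.5.2", "7.6.0"]     # still contains a 7.5 snapshot
upgraded_repo = ["7.6.0", "7.6.1"]  # only 7.6+ snapshots remain

print(can_write_new_format(mixed_repo))     # False: keep writing the old format
print(can_write_new_format(upgraded_repo))  # True: safe to switch formats
```

Once the last pre-7.6 snapshot is deleted from the repository, the check starts returning true and the resilient format can be used.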
This in turn means that 7.6+ clusters are still at risk of corrupting snapshots in their repository if it contains older snapshots, because the new resilient format can’t be used to write to the repository during that time. So, just as we thought we had things under control, a new challenge came up: 7.6+ clusters can be converted to use SLM, which does not have a cool-down period and can take back-to-back snapshots. A 7.6+ cluster could therefore corrupt a repository using SLM if it still had older snapshots in the repo. After evaluating various solutions, we opted to introduce a cool-down period into ES proper. This cool-down period only applies to S3-based repositories (other blob stores provide much higher consistency guarantees), and only when there are older snapshots in the repository (i.e. when we have to write to the repository in the pre-7.6 format). As a result, S3 snapshots take a little longer because of the cool-down period, but ES 7.6+ clusters should avoid repository corruption regardless of the format in which they write.
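The two conditions for the cool-down (an S3 repository, and older snapshots still present) can be expressed as a small decision function. This is a sketch only: the function name, the `"s3"` type string and the 180-second default are illustrative assumptions, not values taken from Elasticsearch.

```python
def remaining_cooldown(
    repo_type: str,
    has_pre_76_snapshots: bool,
    last_operation_finished_at: float,
    now: float,
    cooldown_seconds: float = 180.0,  # illustrative value, not the real default
) -> float:
    """Seconds to wait before the next snapshot operation may start."""
    # The cool-down only matters for eventually consistent S3 repositories
    # that are still being written in the pre-7.6 format.
    if repo_type != "s3" or not has_pre_76_snapshots:
        return 0.0
    return max(0.0, last_operation_finished_at + cooldown_seconds - now)


# Previous operation finished at t=1000s; the next one is requested at t=1060s.
print(remaining_cooldown("s3", True, 1000.0, 1060.0))   # 120.0
print(remaining_cooldown("s3", False, 1000.0, 1060.0))  # 0.0
print(remaining_cooldown("gcs", True, 1000.0, 1060.0))  # 0.0
```

A scheduler that takes back-to-back snapshots would simply sleep for the returned duration before starting the next operation.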
This work concludes the effort to make snapshots resilient and safe on all blob stores we support, and there are currently no known snapshot resiliency bugs left. During the snapshot resiliency sync we confirmed this with Cloud as well. The next step in this project is to enhance snapshot repositories to allow for concurrent operations. The first step towards that is simplifying the state machine for snapshot creation and deletion in a way that is compatible with concurrent snapshot operations.
TLS and Authentication Examples for the High Level Rest Client
In the days of the Transport client, we provided a special version of the client that included the client-side plugin for X-Pack security. This meant that the same sorts of TLS configuration that worked for X-Pack security on the server were also available in the client. With the move to the High Level Rest Client, we have intentionally not copied all of that support into the new client, but users have sometimes struggled to work out how to configure the Rest Clients to connect to secured clusters. SSL in Java is versatile and robust, but it can be hard to get your head around the APIs if you’re not familiar with them.
We had some discussions about this last year, and decided that the best step we could take right now was to expand the rest client docs with clearer examples of configuring the High and Low Level Rest Clients to connect to clusters with a variety of different TLS configurations, and to add examples for authenticating with Tokens and API Keys.
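To give a feel for what such examples need to cover, here is a minimal sketch of the pieces involved at the HTTP level, written in Python with only the standard library rather than with the Java Rest Clients. The helper names are hypothetical; the `ApiKey` and `Bearer` header schemes are the ones Elasticsearch accepts for API keys and tokens.

```python
import base64
import ssl
from typing import Dict, Optional


def build_tls_context(ca_file: Optional[str] = None) -> ssl.SSLContext:
    """TLS context that verifies the server certificate and hostname.

    Pass the path to the cluster's CA certificate when it is not signed by a
    CA in the system trust store.
    """
    return ssl.create_default_context(cafile=ca_file)


def api_key_header(key_id: str, api_key: str) -> Dict[str, str]:
    """Authorization header for an API key: base64 of 'id:api_key'."""
    token = base64.b64encode(f"{key_id}:{api_key}".encode("utf-8")).decode("ascii")
    return {"Authorization": f"ApiKey {token}"}


def bearer_token_header(access_token: str) -> Dict[str, str]:
    """Authorization header for a token issued by the token service."""
    return {"Authorization": f"Bearer {access_token}"}


# Placeholder credentials for illustration only.
context = build_tls_context()
headers = api_key_header("my-key-id", "my-key-secret")
print(headers["Authorization"].split(" ", 1)[0])  # ApiKey
```

Whichever client is used, the configuration has the same shape: a trust configuration for the TLS handshake plus a default `Authorization` header on each request.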
Improved docs will be coming to a website near you.
Apache Lucene
We opened a PR to increase the data dimension limit for BKD in order to support indexing 3D triangles. There has been a hard limit of 8 dimensions for indexing points since the BKD tree was first introduced, but work since then has allowed this information to be split into ‘indexing’ dimensions, used to build the tree, and ‘data’ dimensions, which are stored in the leaves. This gives us some freedom to increase the number of dimensions stored while preserving index size and performance. This improvement will enable us to encode higher-dimension data within a lower-dimension index (e.g. 3D tessellated triangles as a 10-dimension point using only the first 6 dimensions for index construction).
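The split between indexing and data dimensions can be illustrated with a toy encoding. This is a sketch under loud assumptions: it uses the triangle's 3D bounding box as the 6 indexing dimensions and treats the remaining 4 values as an opaque payload. The real Lucene encoding is not described here, and all names below are hypothetical.

```python
from typing import Sequence, Tuple

NUM_INDEX_DIMS = 6   # from the text: only the first 6 dimensions build the tree
NUM_DATA_DIMS = 10   # from the text: 10 dimensions are stored per point

Vertex = Tuple[int, int, int]


def bounding_box(a: Vertex, b: Vertex, c: Vertex) -> Tuple[int, ...]:
    """(minX, minY, minZ, maxX, maxY, maxZ) of a triangle."""
    xs, ys, zs = zip(a, b, c)
    return (min(xs), min(ys), min(zs), max(xs), max(ys), max(zs))


def encode(a: Vertex, b: Vertex, c: Vertex, payload: Sequence[int]) -> Tuple[int, ...]:
    """Indexing dimensions first, then data-only dimensions.

    `payload` is a hypothetical stand-in for whatever extra values are needed
    to rebuild the triangle from its bounding box.
    """
    if len(payload) != NUM_DATA_DIMS - NUM_INDEX_DIMS:
        raise ValueError("payload must fill the data-only dimensions")
    return bounding_box(a, b, c) + tuple(payload)


def box_intersects(point: Sequence[int], query_min: Vertex, query_max: Vertex) -> bool:
    """Prune using the indexing dimensions only; the payload is never read."""
    mins, maxs = point[:3], point[3:NUM_INDEX_DIMS]
    return all(
        lo <= q_max and hi >= q_min
        for lo, hi, q_min, q_max in zip(mins, maxs, query_min, query_max)
    )


point = encode((0, 0, 0), (4, 0, 0), (0, 3, 2), payload=(7, 7, 7, 7))
print(len(point))                                    # 10
print(box_intersects(point, (1, 1, 1), (5, 5, 5)))   # True
print(box_intersects(point, (5, 5, 5), (9, 9, 9)))   # False
```

The tree only ever needs to compare the first 6 values, so adding data dimensions grows the stored point without making the tree itself deeper or wider.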
We also resolved some long-standing open issues around the organisation of spatial code within Lucene, moving LatLonShape and XYShape from the Lucene sandbox to core and removing the long-obsolete (and mostly empty) spatial module.
Now that the parallel Gradle build is on master, there have been a number of changes to improve the performance of the Lucene test suite. In particular, adding Java options to turn off background HotSpot compilation has cut execution time massively (on my MBP, running the full Lucene test suite has gone from 18 minutes to 6 minutes).
The community is continuing to iterate on some implementations of approximate-nearest-neighbour search, in particular Hierarchical Navigable Small-Worlds and IVFFlat. We’re doing our own investigations into building ANN search on top of existing Lucene index structures, and it’s fascinating to follow these different strands of research all feeding back on each other. Open development in action!