Snapshot Lifecycle Management
Snapshot Lifecycle Management (SLM) is a set of Elasticsearch APIs that allow users to automatically create backups of their data on a predefined schedule. Historically, new Elasticsearch APIs have tended to ship a release or two ahead of their UI counterparts. We’ve made an effort to develop the UI in tandem with the API so they can ship together in 7.4. By shipping UI and API together, the feature as a whole becomes accessible to a broader set of users upon launch, and our product story becomes more comprehensive. We’ll continue this workflow in the future!
We submitted a PR that adds the list of Snapshot Lifecycle Management policies and a details panel to the Snapshot and Restore app in Kibana.
Search: Stemmers, Pinners, Cancellers, and their neighbors
We continued work on cancelling searches when the underlying client connection is closed. Complications around multi-search requests have spawned two additional changes: "Associate sub-requests to their parent task in multi search API" and "Make multi search tasks cancellable".
We have a draft implementation of approximate nearest neighbour search based on locality-sensitive hashing.
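To give a feel for the idea, here is a minimal sketch of one common locality-sensitive hashing scheme, random hyperplane hashing. This is an assumption made for illustration: the draft's actual hashing scheme is not described here, and all function names below are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (as normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vector, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h * v for h, v in zip(plane, vector)) >= 0 else 0
        for plane in hyperplanes
    )


def build_index(vectors, hyperplanes):
    """Group document ids into buckets keyed by their signature."""
    buckets = {}
    for doc_id, vec in vectors.items():
        buckets.setdefault(signature(vec, hyperplanes), []).append(doc_id)
    return buckets


def candidates(query, buckets, hyperplanes):
    """Return only the documents that hash to the same bucket as the query."""
    return buckets.get(signature(query, hyperplanes), [])
```

Vectors pointing in similar directions tend to share a signature, so a query only needs to be compared against the documents in its own bucket rather than against every document. The result is approximate because a true neighbour can fall on the other side of a hyperplane.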
We merged a PR that introduces the new Spatial Plugin into master. This is the first building block for the new spatial features (including, but not limited to, geo) we are working on. We are working on an upcoming feature for this plugin, a new XYShape field, which is not strictly Geo.
We are working on refactoring the spatial components in Elasticsearch so that the new XYShape field can be integrated easily. We have merged a PR that moves the dateline handling logic, which is specific to Geo, out of ShapeBuilders. Next is to extract this logic from the query builders.
We are working on a community PR that adds a shift parameter to the MovingFunction pipeline agg. This allows the user to adjust where the window is positioned, instead of it always trailing the current bucket as is the case today. This is useful if you want to include or exclude the current bucket, "center" the window, etc.
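The effect of a shift on window placement can be sketched as follows. This is only an illustration: the exact bounds (a window of the previous `window` buckets, moved right by `shift`) are an assumption, and the helper names are hypothetical rather than taken from the PR.

```python
def window_bounds(i, window, shift, n):
    """Half-open [start, end) slice of buckets feeding bucket i, clipped to [0, n]."""
    start = i - window + shift
    end = i + shift
    return max(0, start), max(0, min(n, end))


def moving_fn(values, window, shift=0, fn=sum):
    """Apply fn to the shifted window of every bucket."""
    out = []
    for i in range(len(values)):
        lo, hi = window_bounds(i, window, shift, len(values))
        out.append(fn(values[lo:hi]))
    return out
```

Under these assumptions, `moving_fn([1, 2, 3, 4, 5], window=2)` trails the current bucket and yields `[0, 1, 3, 5, 7]`, while `shift=1` pulls the current bucket into the window and yields `[1, 3, 5, 7, 9]`.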
We are also working on adding a "none" gap policy to pipeline aggs. Users often want to execute a bucket_script on all buckets, whether they have a value or not, and the existing gap policies make this impossible. The "none" policy does nothing and lets the aggregation evaluate the null/missing/NaN value for itself. The change also adds a params.doc_count parameter to the script context so that the user can inspect the document count and determine whether the value is NaN because it is missing or because it was genuinely NaN.
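A rough model of the difference, assuming a simplified "skip" policy that never runs the script on empty or NaN buckets. The bucket layout, the runner, and the script below are hypothetical and only mimic the behaviour described above.

```python
import math


def run_bucket_script(buckets, script, gap_policy="skip"):
    """Run script once per bucket, honouring a simplified gap policy."""
    results = []
    for bucket in buckets:
        value = bucket["value"]
        missing = value is None or (isinstance(value, float) and math.isnan(value))
        if gap_policy == "skip" and missing:
            results.append(None)  # the script never sees this bucket
            continue
        # With "none", the raw value (NaN for a gap) is handed to the script,
        # together with the bucket's document count.
        params = {
            "value": float("nan") if value is None else value,
            "doc_count": bucket["doc_count"],
        }
        results.append(script(params))
    return results


def classify(params):
    """Example script: tell an empty bucket apart from a metric that was NaN."""
    if math.isnan(params["value"]):
        return "empty bucket" if params["doc_count"] == 0 else "metric was NaN"
    return params["value"] * 2
```

With "none", the script decides for itself how to treat a gap, and the document count is what lets it separate "no documents" from "documents, but a NaN metric".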
Async peer recoveries
Over the past months we have been working on making peer recoveries non-blocking. We merged the most complex work item this week: file chunks are now sent asynchronously, which fixes an issue where setting the node_concurrent_recoveries setting to a large value could lead to deadlocks. The only remaining item is the relocation handoff. Moving more code to async is a massive undertaking, but it will also help tremendously in our quest to reduce the default maximum size of the generic threadpool. We are also applying the same technique to CCR's recovery from remote, which allows us to share the complex file chunk coordination logic between peer recovery and recovery from remote.
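The general shape of non-blocking chunk sending can be sketched with Python's asyncio. This is not the Elasticsearch implementation: the cap on in-flight chunks, the function names, and the simulated network call are all assumptions made for illustration.

```python
import asyncio


async def send_chunk(chunk_id, stats):
    """Pretend to send one file chunk and wait for its acknowledgement."""
    stats["in_flight"] += 1
    stats["max_in_flight"] = max(stats["max_in_flight"], stats["in_flight"])
    await asyncio.sleep(0)  # stand-in for the network round trip; no thread blocks here
    stats["in_flight"] -= 1
    return chunk_id


async def send_file_chunks(num_chunks, max_concurrent):
    """Send all chunks with at most max_concurrent requests outstanding."""
    stats = {"in_flight": 0, "max_in_flight": 0}
    limit = asyncio.Semaphore(max_concurrent)

    async def bounded(chunk_id):
        async with limit:
            return await send_chunk(chunk_id, stats)

    acked = await asyncio.gather(*(bounded(i) for i in range(num_chunks)))
    return acked, stats["max_in_flight"]
```

Because a waiting sender yields instead of holding a thread, raising the number of concurrent recoveries does not tie up a pool of blocked threads, which is the kind of situation that can otherwise end in a deadlock.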
PKI for Kibana
We merged a PR to ensure that an enrich policy is immutable. Since enrich index names and enriching behaviors are directly associated with a policy, we can eliminate an entire category of potential issues by enforcing immutable policies. We are still working through the details of ensuring that deleting a policy and immediately re-adding it won't re-introduce the same category of problems we are avoiding with immutable policies. (#43604)
We merged the background cleanup process (#43746) and are in the process of adding the ES version that an enrich policy was created with to the metadata, so that we have it if anything major needs to change across versions and we need compatibility logic.
The Elasticsearch benchmarks helped catch a regression in the memory usage of the terms dictionary. This was due to a change to FSTs that enabled direct arc addressing on dense nodes in order to be able to run lookups with a single random access instead of a binary search. This change only triggered minor size increases when tested against text data, but it has a worst-case scenario of ~4x more memory usage. In our case, it made the terms dictionary use 50% more memory, likely because of the _id field, which is binary and has denser nodes than FSTs made of English text. This change has been reverted from 8.2 and we will work on re-enabling it for 8.3 in a way that avoids worst-case scenarios memory-wise.
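The trade-off can be illustrated with a toy model of a single node's outgoing arcs. This is a sketch only, not Lucene's FST encoding: the data layout and function names are hypothetical.

```python
from bisect import bisect_left


def lookup_binary_search(labels, targets, label):
    """Compact layout: one entry per arc, found in O(log n) by binary search."""
    i = bisect_left(labels, label)
    if i < len(labels) and labels[i] == label:
        return targets[i]
    return None


def build_direct_table(labels, targets):
    """Direct addressing: one slot per label in [first, last], including gaps."""
    first = labels[0]
    table = [None] * (labels[-1] - first + 1)
    for label, target in zip(labels, targets):
        table[label - first] = target
    return first, table


def lookup_direct(first, table, label):
    """A single random access, at the cost of storing the empty slots."""
    idx = label - first
    if 0 <= idx < len(table):
        return table[idx]
    return None
```

For a node with arcs labelled 97, 98, 100 and 120, the compact layout stores 4 entries while the direct table stores 24 slots, 20 of them empty. How much is wasted depends on how tightly the labels are packed in their range, which is why the size impact differs so much between data sets.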
- Can we leverage top-hits retrieval optimizations across multiple slices?
- BKDWriter could make better splitting decisions by recomputing the range of each dimension on each recursion level.
- We could clean things up by leveraging Set.copyOf and Set.of.
- Per-query I/O counters can be useful for benchmarking.
- Nearest neighbor search was missing an optimization we added a couple months ago that consists of shrink wrapping leaf cells to have better bounds.
- DisjunctionMaxQuery can be optimized for top-hits retrieval.
- IndexSearcher#termStatistics should not require creating a TermStates, which is expensive as it requires seeking in the terms dictionary.
- We are considering removing the "Direct" doc-value format.
Changes in Elasticsearch
Changes in 8.0:
- BREAKING: Fail node containing ancient closed index #44264
Changes in 7.4:
- Expose index age in ILM explain output #44457
- add disable_chunked_encoding configuration #44052
- Associate sub-requests to their parent task in multi search API #44492
- Defer reroute when starting shards #44433
- Introduce test issue logging #44477
- Add Snapshot Lifecycle Management #43934
- Make peer recovery send file chunks async #44468
- Do not allow version in Rest Update API #43516
- Improve build scan metadata #44247
- Cluster health should await events plus other things #44348
- Log write failures for watcher history document. #44129
- Allow RerouteService to reroute at lower priority #44338
- Throw TranslogCorruptedException in more cases #44217
- add clarification around TESTSETUP docu and error message #43306
- HLRC: Fix '+' Not Correctly Encoded in GET Req. #33164
- Fail engine if hit document failure on replicas #43523
- [ML][Data Frame] prevent task from attempting to run when failed #44239
- Support WKT point conversion to geo_point type #44107
- Make plugin verification FIPS 140 compliant #44224
- Avoid counting votes from master-ineligible nodes #43688
Changes in 7.3:
- Fix incorrect calculation of how many buckets will result from a merge #44461
- Don't use index_phrases on graph queries #44340
- Fix broken short-circuit in getUnlicensedRealms #44399
- Ensure field caps doesn't error on rank feature fields. #44370
- [ML][Data Frame] treat bulk index failures as an indexing failure #44351
- Improve CryptoService error message on missing secure file #43623
- Fix AnalyzeAction response serialization #44284
- [ML][Data Frame] responding with 409 status code when failing _stop #44231
- Fix port range allocation with large worker IDs #44213
Changes in 6.8:
- Skip update if leader and follower settings identical #44535
- Fix parameter value for calling data.advanceExact #44205
- Avoid stack overflow in auto-follow coordinator #44421
- Avoid NPE when checking for CCR index privileges #44397
- Fix varying responses for /_analyze request #44342
- Do not swallow I/O exception getting authentication #44398
- Fix swapped variables in error message #44300
Changes in Elasticsearch Hadoop Plugin
Changes in 7.3:
- [DOCS] Fix broken links for ES API docs move #1317
Changes in Rally
Changes in 1.3.0: