This Week in Elasticsearch and Apache Lucene - 2018-06-22
Using dots in aggregation names will be deprecated
We will begin deprecating the use of dots in aggregation names. Dots are used as a separator in buckets_path and in ordering syntax, and aggregations with dots in their names have caused many bugs. We therefore intend to require that all aggregation names be free of dots. As this may affect a large number of users, we plan to deprecate this ability in 6.4 but not prohibit dots in aggregation names until 8.0, to give users enough time to modify their applications.
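To find out whether an application is affected, it can help to scan its search request bodies for aggregation names that contain a dot. The following is a minimal sketch of such a check, not part of Elasticsearch or its clients. The function name `dotted_agg_names` and the sample aggregation and field names are hypothetical, and the sketch assumes the request body is already available as a Python dict.

```python
from typing import Any, Dict, Iterator


def dotted_agg_names(body: Dict[str, Any]) -> Iterator[str]:
    """Yield the path of every aggregation whose name contains a dot.

    Hypothetical helper: walks the "aggs" / "aggregations" sections of a
    search request body and reports offending names, joined with ">".
    """

    def walk(aggs: Dict[str, Any], prefix: str) -> Iterator[str]:
        for name, definition in aggs.items():
            path = f"{prefix}>{name}" if prefix else name
            if "." in name:
                yield path
            if isinstance(definition, dict):
                # Sub-aggregations are declared under either key.
                for key in ("aggs", "aggregations"):
                    sub = definition.get(key)
                    if isinstance(sub, dict):
                        yield from walk(sub, path)

    for key in ("aggs", "aggregations"):
        top = body.get(key)
        if isinstance(top, dict):
            yield from walk(top, "")


# Sample request body with made-up aggregation and field names.
request = {
    "aggs": {
        "sales.per_month": {
            "date_histogram": {"field": "date", "interval": "month"},
            "aggs": {"total": {"sum": {"field": "price"}}},
        },
        "by_region": {
            "terms": {"field": "region"},
            "aggs": {"avg.price": {"avg": {"field": "price"}}},
        },
    }
}

offenders = list(dotted_agg_names(request))
print(offenders)
```

Renaming the reported aggregations, for example replacing the dot with an underscore, is enough to avoid the deprecation warning, as long as any buckets_path or order clause that refers to them is updated as well.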
Rollups will use the “missing_bucket” option from the composite aggregation
We opened a PR to make rollups use the new “missing_bucket” option on composite aggregations. This removes an important limitation where rollups returned incorrect doc counts when an index contains multiple non-overlapping schemas. One situation where this can occur is with different Beats each having a different schema but residing in the same index. With this enhancement, the user should be able to configure a single "combined" job, which is more convenient and returns correct doc counts.
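For readers unfamiliar with the option, the sketch below builds a composite aggregation request body in which every terms source sets `missing_bucket`, so documents lacking one of the fields still land in a bucket instead of being dropped. It only illustrates the shape of the request and is not the rollup implementation. The function name, the aggregation name `combined`, and the field names are hypothetical.

```python
import json
from typing import Any, Dict, List


def composite_with_missing_buckets(fields: List[str], size: int = 100) -> Dict[str, Any]:
    """Build a search body with one terms source per field.

    Each source sets "missing_bucket": true so that documents without
    a value for that field are still counted.
    """
    sources = [
        {field: {"terms": {"field": field, "missing_bucket": True}}}
        for field in fields
    ]
    return {
        "size": 0,  # only the aggregation result is of interest
        "aggs": {
            "combined": {  # hypothetical aggregation name
                "composite": {"size": size, "sources": sources}
            }
        },
    }


# Hypothetical fields coming from two different schemas in one index.
body = composite_with_missing_buckets(["hostname", "service"])
print(json.dumps(body, indent=2))
```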
Reloadable Secure Settings
We merged and backported the work for reloadable secure settings. This work allows for re-reading the secure settings stored in the keystore on each node. During this reload, if a component has registered to be notified about updates to these settings, it will be notified. This initial work allows the discovery-ec2, repository-s3, repository-azure, and repository-gcs plugins to update the clients they use that depend on secure settings.
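The register-then-notify flow described above can be pictured with a small, language-neutral sketch. This is not Elasticsearch's actual implementation (which is written in Java). The class names `SecureSettingsReloader` and `S3ClientHolder` are hypothetical, and an in-memory dict stands in for the keystore.

```python
from typing import Callable, Dict, List, Optional

Settings = Dict[str, str]
Listener = Callable[[Settings], None]


class SecureSettingsReloader:
    """Hypothetical sketch: re-read settings and notify registered components."""

    def __init__(self, load: Callable[[], Settings]) -> None:
        self._load = load  # stands in for re-reading the keystore
        self._listeners: List[Listener] = []

    def register(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def reload(self) -> Settings:
        settings = self._load()
        for listener in self._listeners:
            # Each listener receives its own copy of the fresh settings.
            listener(dict(settings))
        return settings


class S3ClientHolder:
    """Hypothetical component that rebuilds its client state on reload."""

    def __init__(self) -> None:
        self.access_key: Optional[str] = None

    def on_reload(self, settings: Settings) -> None:
        self.access_key = settings.get("s3.client.default.access_key")


# An in-memory dict plays the role of the keystore in this sketch.
keystore = {"s3.client.default.access_key": "old-key"}
reloader = SecureSettingsReloader(lambda: dict(keystore))
holder = S3ClientHolder()
reloader.register(holder.on_reload)

reloader.reload()
first = holder.access_key

keystore["s3.client.default.access_key"] = "new-key"
reloader.reload()
second = holder.access_key
print(first, second)
```

Components that never register are simply not notified, which matches the opt-in behaviour described above.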
Changes in 5.6:
- Ensure we don’t use a remote profile if cluster name matches #31331
Changes in 6.3:
- [DOCS] Omit shard failures assertion for incompatible responses #31430
- Security: fix joining cluster with production license #31341
Changes in 6.4:
- In NumberFieldType equals and hashCode, make sure that NumberType is taken into account. #31514
- Remove QueryCachingPolicy#ALWAYS_CACHE #31451
- Add Delete Snapshot High Level REST API #31393
- Preserve response headers on cluster update task #31421
- Multiplexing token filter #31208
- Add get stored script and delete stored script to high level REST API #31355
- Skip get_alias tests for 5.x #31397
- Avoid sending duplicate remote failed shard requests #31313
- Fix defaults in GeoShapeFieldMapper output #31302
- RestAPI: Reject forcemerge requests with a body #30792
- Use system context for cluster state update tasks #31241
- REST high-level client: add validate query API #31077
- Expose lucene’s RemoveDuplicatesTokenFilter #31275
- Add ingest-attachment support for per document indexed_chars limit #31352
Changes in 7.0:
- Reload secure settings for plugins #31383
- lower rollover-info version bound to 6.4 #31414
- extend is-write-index serialization support to 6.4 #31415
- Choose JVM options ergonomically #30684
- BREAKING: Packaging: Remove windows bin files from the tar distribution #30596
- BREAKING: Percentile/Ranks should return null instead of NaN when empty #30460
The vote has passed and the release will be announced shortly. Elasticsearch has already been upgraded to this final release.
Spatial code organization
There is an ongoing discussion about whether spatial code should live entirely in the lucene/spatial module, or whether the most commonly used bits should live in lucene/core. The current situation is not good, since what we consider the best way to index geo-points is in neither lucene/core nor lucene/spatial but in lucene/sandbox.
- ICU4J was upgraded to version 62.1
- Should we make per-dimension drill-down of facets optional?
- We are exploring building a highlighter based on the matches API.
- Can we support doc values on range fields?
- An AIOOBE in the UnifiedHighlighter was due to an off-by-1 error.
- We fixed the (edge) ngram filters to correctly set the position increment on end().
- We’re collaborating on a refactoring of the geo API.