As part of an effort to keep the deprecation info API up to date as we go (as opposed to a last minute push),we opened an issue to track breaking changes in 8.0 that need additions to this API (#42404).

Enrich

We have merged the PR that adds a force merge step to the enrich policy execution (#41969).

Work has also started on integrating security with the enrich feature as is described here.

We've started looking into how processor/pipeline can indicate that it needs to run on a different node. This is needed, because in the case when a write request with a pipeline is sent to a node that doesn't have an enrich index shard or that shard isn't ready yet. This should work similarly to how we forward write requests with a pipeline from a non ingest node to an ingest node.

Watcher UI

Work has continued on making improvements to Watcher. Recently this focused on the watch list view) and threshold alert.

watcher ui alert threshold improvements

Snapshot Repositories UI

We have instrumented Snapshot Repositories with UI metrics, and also worked on adding logic to warn users and disable delete for a cloud-managed repository.

cloud snapshot repositories UI warning

Bye Bye Transport Client

From 7.0 the high level REST Client has feature parity with the transport client, so we have started on the daunting task to remove the transport client. This will also move us closer to the removal of the Guice library. As part of this, the migrate tool was deprecated in 7.x and removed in master. This tool was originally created to allow early users of security to move from the file realm to the native realm when it was introduced. At this point, the tool does not add much value and its removal simplified removing the transport client from x-pack. We have started by removing the TestXPackTransportClient and then moved on to removing transport client usage in x-pack.

Performance

We merged Rally#688 resolving a major usability issue (Rally#478) where wrong user specified track parameters, e.g. due to a typo, don't show a warning and may end up running a wrong benchmark. We took the extra mile and display a list of the wrong parameters, the entire list of user configurable parameters as well as possible correct alternatives using Levenshtein style approximate string matching.

Types Removal

Work has begun on types removal for 8.0. We have started to remove typed URL paths in the REST layer and have also done some cleanup to completely remove internal references to types in the search APIs.

Multi-fields

We have deprecated+removed support for chained multi-fields. As of 8.0 it won't be possible to define a multi-field within a multi-field. This closes a long standing issue that was opened to limit multi-fields to relevant settings only.

Search low-level cancellation

A PR was merged that activates low-level cancellation for search by default. The change is minimal but it required a lot of benchmarking in order to ensure that it does not impact search performance. Low-level cancellation is required to improve the responsiveness of search cancellation for long running queries and will benefit to systems (Kibana, ...) that cannot change the default settings of an Elasticsearch cluster.

Removal of node.max_local_storage_nodes setting

We have deprecated and removed the node.max_local_storage_nodes setting. This setting, which prior to Elasticsearch 5 was enabled by default and caused all kinds of confusion, has since been disabled by default and is not recommended for production use. The prefered way going forward is for users to explicitly specify separate data folders for each started node to ensure that each node is consistently assigned to the same data path.

The main step in getting rid of this setting was to move the test infrastructure away. This also changes the behavior of ESIntegTestCases so that starting up nodes will not automatically reuse data folders of previously stopped nodes.

Finally, this also opens up the possibility to remove the "nodes/$nodeOrdinal" folder prefix from the data path in 8.0.

Geo

We've merged the initial implementation of the major serialized tree structure for storing polygons as doc-values (#40810 and #42331).

We revisited the tessellation logic after a user reported a failure in a valid polygon in the discuss forum. This new issue showed clearly that the tessellator logic was failing when a hole had a shared vertex with the polygon. We decided to change the logic when eliminating holes to detect if a hole has a shared vertices and use that vertex as the hole bridge. In order to do that efficiently the algorithm uses the bounding box of the hole to filter out potential candidates. This changed allowed us to remove most of the complex logic we had previously added.

Lucene

Lucene 8.1.1

A 8.1.1 release has been called in order to fix a NPE when a Solr collection is called via an alias. This release has no changes on the Lucene side however.

How to enforce a maximum clause count?

Currently Lucene enforces that boolean queries have at most 1024 clauses. Some users have been working around this limit by using multiple levels of boolean queries so that they can have more than 1024 clauses in total but each boolean query has no more than 1024 clauses so Lucene doesn't complain about the clause count. However we recently introduced a new rewrite rule that flattens disjunctions within disjunctions, effectively defeating this workaround, which is surprising some of our users. We are considering deactivating this rewrite rule when it would hit the clause count limit, but this also triggered a separate discussion about whether we could count clauses across the entire query tree as opposed to on a per-BooleanQuery basis.

Other

There is now a way to sort based on the value of a FeatureField. This means that in the future Elasticsearch users will be able to sort on a rank_feature field without having to add a float field in a multi-field as well.
There is a lot of work ongoing around getting a working Gradle build.
A bug was identified around concurrent rollbacks and reopens causing leakage of segment readers.
Javadocs of our analysis components are getting improved.
Range fields are getting doc-value support. Elasticsearch has its own, slightly different implementation of doc-value support for range fields due to the fact that it needs to support multiple ranges per document, which the Lucene implementation doesn't support.
StoredFieldsVisitor#stringField will now take a String instead of the UTF8 representation of the string as a byte[].
The conjunction scorer that leverages maximum per-clause scores was enhanced so that it works better when sub-clauses have two-phase iterators, such as phrase queries.
There are ongoing discussions regarding the introduction of a Collector-based rescorer.
Can we make codecs more efficiently handle boolean doc values?

Changes

Changes in Elasticsearch

Changes in 8.0:

Remove transport client from xpack #42202
BREAKING: Remove deprecated search.remote settings #42381
BREAKING: Remove node.max_local_storage_nodes #42428
Remove deprecated Repository methods #42359
Fix TopHitsAggregationBuilder adding duplicate _score sort clauses #42179
add 7.1.1 and 6.8.1 versions #42253
BREAKING: Remove the migrate tool #42174
Add ChaCha20 TLS ciphers on Java 12+ #42155

Changes in 7.3:

Bug fix to allow access to top level params in reduce script #42096
Avoid HashMap construction on Grok non-match #42444
Improve Close Index Response #39687
Allow fields to be set to * #42301
Bulk processor concurrent requests #41451
Rename SearchRequest#crossClusterSearch #42363
Fix alpha build error message when generate version object from version string #40406
Search - enable low_level_cancellation by default. #42291
Deprecate support for chained multi-fields. #41926

Changes in 7.2:

Fix settings prefix for realm truststore password #42336
Implement XContentParser.genericMap and XContentParser.genericMapOrdered methods #42059
Upgrade to Lucene 8.1.0 #42214
Add .code_internal-* index pattern to kibana user #42247
Remove IndexShard dependency from Repository #42213
Merge claims from userinfo and ID Token correctly #42277
Peer recovery should flush at the end #41660
Use translog to estimate number of operations in recovery #42211
[ML Data Frame] Persist and restore checkpoint and position #41942
Add experimental and warnings to vector functions #42205
Prevent in-place downgrades and invalid upgrades #41731
Validate non-secure settings are not in keystore #42209
Update to joda time 2.10.2 #42199
Fix FiltersAggregation NPE when filters is empty #41459
Update ciphers for TLSv1.3 and JDK11 if available #42082
Hash token values for storage #41792
Do not refresh realm cache unless required #42169
Protect logged exec spooling from no output #42177
Use local outputstream reference #42180
Deprecate the native realm migration tool #42142
Hide bwc build output on success #42102
Cacheability improvements for thirdparty audit task #42085
SQL: Add initial geo support #42031
Don't create tempdir for cli scripts #41913
Cleanup plugin bin directories #41907
Prevent order being lost for _nodes API filters #42045
[ML] Complete the Data Frame task on stop #41752

Changes in 7.1:

Move the FIPS configuration back to the build plugin #41989
7.1.0 release notes initial commit #42208
Implement factory methods for ValidationException #41993

Changes in 7.0:

Remove deprecated _source_exclude and _source_include from get API spec #42188
Build local year inside DateFormat lambda #42120

Changes in 6.8:

Avoid bubbling up failures from a shard that is recovering #42287
Recovery with syncId should verify seqno infos #41265
Execute actions under permit in primary mode only #42241
Avoid unnecessary persistence of retention leases #42299
6.8 release notes #41802
Fix max boundary for rollup jobs that use a delay #42158
SQL: Fix issue regarding INTERVAL * number #42014
Enforce transport TLS on Basic with Security #42150

Changes in Elasticsearch Hadoop Plugin

Changes in 7.3:

[DOCS] Updates version attribute file #1296

Changes in 7.1:

Fix the mapping client call on ES 5.x and earlier. #1284

Changes in 6.8:

Fix incorrect scroll termination for ES 2.x #1287

Changes in Elasticsearch Management UI

Changes in 7.2:

Snapshot Repositories UI #34407

Changes in Elasticsearch SQL ODBC Driver

Changes in 7.2:

Fix: add pyodbc as instance member to testing class #156
DSN parameter: support for "index_include_frozen" #151
Introduce geo support #154

Changes in Rally

Changes in 1.2.0:

Assume UTC timezone if not specified #693
Measure execution time of bulk ingest pipeline #694
Fail Rally early if there are unused variables in track-params #688

The Search AI Company

Generative AI

Search

Security

Observability

By solution

Industries

This Week in Elasticsearch and Apache Lucene - 2019-05-24

Elasticsearch

Highlights

Deprecation info API

Enrich

Watcher UI

Snapshot Repositories UI

Bye Bye Transport Client

Performance

Types Removal

Multi-fields

Search low-level cancellation

Removal of node.max_local_storage_nodes setting

Geo

Lucene

Lucene 8.1.1

How to enforce a maximum clause count?

Other

Changes

Changes in Elasticsearch

Changes in Elasticsearch Hadoop Plugin

Changes in Elasticsearch Management UI

Changes in Elasticsearch SQL ODBC Driver

Changes in Rally

Follow us

About us

Join us

Press

Partners

Trust & Security

Investor relations

EXCELLENCE AWARDS