This Week in Elasticsearch and Apache Lucene - 2018-06-15
Elasticsearch weekly highlights
Rolling Upgrade to 6.3 Issue
After the release of 6.3.0, we received reports of nodes not rejoining their cluster after a rolling upgrade, for clients that have both a gold/platinum license installed and security enabled. The workaround is to explicitly set xpack.security.enabled: true in the elasticsearch.yml file; we are adding this to the docs as a known issue. The root cause is a change in 6.3.0 where security is disabled for trial licenses unless it is explicitly enabled; for existing gold and platinum licenses it should be enabled by default. The problem occurs because a newly started node does not yet know the current license type and defaults to a trial license, so it does not send authentication with its ping requests, leading the nodes in the existing cluster to reject them. A fix is being worked on.
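The failure mode described above can be sketched as a small decision function. This is only an illustrative model of the behaviour described in this post, not Elasticsearch code: the function name `security_enabled` and its arguments are hypothetical, and only the gold, platinum and trial cases mentioned above are modelled.

```python
from typing import Optional


def security_enabled(explicit_setting: Optional[bool],
                     license_type: Optional[str]) -> bool:
    """Illustrative model: does a node treat security as enabled?

    explicit_setting: value of xpack.security.enabled in elasticsearch.yml,
                      or None if the setting is absent.
    license_type:     license the node currently knows about, or None if it
                      has not learned the cluster's license yet.
    """
    if explicit_setting is not None:
        # An explicit setting always wins over any license-based default.
        return explicit_setting
    # A node that does not know the license yet falls back to "trial".
    effective_license = license_type if license_type is not None else "trial"
    # Assumed defaults: on for gold/platinum, off for trial.
    return effective_license in ("gold", "platinum")


# Node already running in a platinum cluster: security is on by default.
existing_node = security_enabled(None, "platinum")
# Freshly upgraded node that has not seen the license yet: security is off,
# so its pings carry no authentication and get rejected.
new_node = security_enabled(None, None)
# Same node with the workaround applied in elasticsearch.yml.
new_node_with_workaround = security_enabled(True, None)

print(existing_node, new_node, new_node_with_workaround)
```

In this model the mismatch between `existing_node` and `new_node` is what keeps the upgraded node out of the cluster, and the explicit setting removes the dependency on knowing the license at startup.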
SAML - RequestedAuthnContext
We have added support for RequestedAuthnContext, a feature that has been requested by a number of users. This element allows Service Providers, in our case the Elastic Stack, to request that the Identity Provider authenticate the user in a certain way, for example with multi-factor authentication.
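For readers unfamiliar with the SAML side, the following sketch builds the RequestedAuthnContext element that a Service Provider would embed in its authentication request. It uses only the Python standard library and the element and namespace names from the SAML 2.0 specification; the helper `build_requested_authn_context` is hypothetical, the chosen class reference is just one example from the specification, and none of this shows how Elasticsearch itself is configured.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the SAML 2.0 specification.
SAMLP_NS = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"


def build_requested_authn_context(class_refs, comparison="exact"):
    """Build a <samlp:RequestedAuthnContext> element (hypothetical helper)."""
    ET.register_namespace("samlp", SAMLP_NS)
    ET.register_namespace("saml", SAML_NS)
    element = ET.Element(
        f"{{{SAMLP_NS}}}RequestedAuthnContext", {"Comparison": comparison}
    )
    for ref in class_refs:
        # Each class reference names an acceptable authentication method.
        child = ET.SubElement(element, f"{{{SAML_NS}}}AuthnContextClassRef")
        child.text = ref
    return element


# Example: ask the Identity Provider for a two-factor authentication context.
element = build_requested_authn_context(
    ["urn:oasis:names:tc:SAML:2.0:ac:classes:MobileTwoFactorContract"]
)
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```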
Zen2
With the basic components in place, we have started to work on one of the main user-facing improvements in Zen2 compared to ZenDiscovery: removing the need for a minimum_master_nodes setting. Zen2 extends the notion of cluster membership with a weight. Each node in the cluster is assigned a weight (currently 0 or 1), and only the votes of nodes with a non-zero weight count in an election. To become master, a node needs to win the votes of a majority of the weighted nodes. When a master-eligible node joins an existing cluster, it receives a weight so it can participate in elections and support the master if other nodes fail. However, when a node leaves the cluster, the master can remove its weight, provided the master still has a majority of the nodes supporting it. Once the weight of that node is removed, it cannot participate in future elections (until it rejoins). Similarly, if the cluster is partitioned, the majority side (if one exists) will be able to remove the weights of all the nodes on the minority side. Once those weights are removed, no node on the majority side will vote for a node on the minority side. At the same time, the minority side can't form a cluster because it doesn't have enough weights based on the last known cluster configuration. This prevents a split brain from happening. The only way for the minority side to form a cluster is to rejoin the majority side and have weights reassigned to it.
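The weight mechanism can be illustrated with a toy simulation of a five-node cluster that splits three to two. This is a simplified sketch of the idea described above, not the Zen2 implementation; the function names `has_quorum` and `remove_weights` and the node names are made up for the example.

```python
from typing import Dict, Set


def has_quorum(voters: Set[str], weights: Dict[str, int]) -> bool:
    """True if `voters` hold a strict majority of the total weight."""
    total = sum(weights.values())
    votes = sum(weights.get(node, 0) for node in voters)
    return votes * 2 > total


def remove_weights(master_side: Set[str], departed: Set[str],
                   weights: Dict[str, int]) -> Dict[str, int]:
    """Zero the weights of departed nodes, but only if the master's side
    still holds a quorum under the current weights."""
    if not has_quorum(master_side, weights):
        return dict(weights)  # no quorum: the configuration cannot change
    return {node: (0 if node in departed else w) for node, w in weights.items()}


# Five master-eligible nodes, each with weight 1.
weights = {"a": 1, "b": 1, "c": 1, "d": 1, "e": 1}
majority, minority = {"a", "b", "c"}, {"d", "e"}

# After a partition, only the majority side can win an election.
majority_can_elect = has_quorum(majority, weights)
# The minority still sees the last known configuration (all five weights).
minority_can_elect = has_quorum(minority, weights)

# The majority side removes the weights of the unreachable nodes.
after = remove_weights(majority, minority, weights)
# With only three weighted nodes left, two of them are now enough.
shrunk_quorum = has_quorum({"a", "b"}, after)

print(majority_can_elect, minority_can_elect, after, shrunk_quorum)
```

The point of the sketch is that the quorum size follows the weights automatically, which is what makes a hand-maintained minimum_master_nodes value unnecessary.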
The scripted_metric aggregation: experimental no more!
The scripted metric aggregation was added back in 1.4.0 as a way to use scripts to provide aggregation functions that are not available in the product without having to write a custom aggregation. We marked it experimental as we weren't sure how useful it was.
Since then the aggregation has remained largely unchanged but has been used by the community. Because it has been in this state for so long, we don't feel we could defend making large changes to it or removing it without providing a clear deprecation and migration path, which means the experimental status has lost its value.
So, as of 6.4.0, the scripted_metric aggregation will no longer be marked experimental. It should still be used as a last resort, but we recognize there is a long-term place for this aggregation, at least until adding a custom aggregation is a much friendlier experience and part of a supported public API. It should also be noted that we still plan to move it to a plugin.
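As a reminder of how the aggregation is structured, it runs four script phases: init and map on each shard for every matching document, combine once per shard, and reduce on the coordinating node. The Python sketch below mimics that flow on in-memory data. It is a simulation for illustration only; the document fields (`type`, `amount`) and all function names are hypothetical, and real scripts would be written in a scripting language such as Painless.

```python
from typing import Dict, List


def init_script() -> Dict[str, List[float]]:
    # Per-shard state, created once before any document is visited.
    return {"transactions": []}


def map_script(state: Dict[str, List[float]], doc: Dict) -> None:
    # Called once per matching document on the shard.
    amount = doc["amount"]
    state["transactions"].append(amount if doc["type"] == "sale" else -amount)


def combine_script(state: Dict[str, List[float]]) -> float:
    # Called once per shard to shrink the state before it is sent back.
    return sum(state["transactions"])


def reduce_script(shard_results: List[float]) -> float:
    # Called once on the coordinating node over all shard results.
    return sum(shard_results)


def run_scripted_metric(shards: List[List[Dict]]) -> float:
    shard_results = []
    for docs in shards:
        state = init_script()
        for doc in docs:
            map_script(state, doc)
        shard_results.append(combine_script(state))
    return reduce_script(shard_results)


# Two simulated shards holding sales and costs.
shards = [
    [{"type": "sale", "amount": 80}, {"type": "cost", "amount": 10}],
    [{"type": "sale", "amount": 130}, {"type": "cost", "amount": 30}],
]
profit = run_scripted_metric(shards)
print(profit)  # 170
```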
NIO
We added CORS support to the transport-nio networking implementation. With this addition, transport-nio has reached feature parity with the transport-netty4 networking implementation. Additionally, benchmarking using transport-nio is now in place, which will enable us to closely monitor how future changes impact performance. While a lot of work remains on transport-nio, this progress marks a milestone for the effort.
Changes
Changes in 5.6:
- Fix race in clear scroll #31259
Changes in 6.3:
- Compliant SAML Response destination check #31175
- [Rollup] Metric config parser must use builder so validation runs #31159
- Security: fix token bwc with pre 6.0.0-beta2 #31254
- Don’t swallow exceptions on replication #31179
Changes in 6.4:
- BREAKING (java): LLClient: Support host selection #30523
- CCS: don’t proxy requests for already connected node #31273
- Remove RestGetAllAliasesAction #31308
- Add Get Aliases API to the high-level REST client #28799
- Treat ack timeout more like a publish timeout #31303
- SQL: Whitelist SQL utility class for better scripting #30681
- Add notion of internal index settings #31286
- REST high-level client: add Cluster Health API #29331
- HLRest: Add get index templates API #31161
- Ignore numeric shard count if waiting for ALL #31265
- Revert upgrade to Netty 4.1.25.Final #31282
- Support RequestedAuthnContext #31238
- Validate xContentType in PutWatchRequest. #31088
- [INGEST] Interrupt the current thread if evaluation grok expressions take too long #31024
- Upgrade to Netty 4.1.25.Final #31232
- Use stronger write-once semantics for Azure repository #30437
- SQL: Make a single JDBC driver jar #31012
- Allow to trim all ops above a certain seq# with a term lower than X #30176
Changes in 7.0:
- BREAKING: REST hl client: cluster health to default to cluster level #31268
- BREAKING: REST high-level Client: remove deprecated API methods #31200
- Limit the number of concurrent requests per node #31206
- Allow to trim all ops above a certain seq# with a term lower than X, … #31211
- Default max concurrent search req. numNodes * 5 #31171
Apache Lucene
Lucene 7.4
Blockers are resolved; we expect to build a first release candidate on June 18th.
Other
- We had to relax an assertion since it might not always hold due to concurrency.
- We want to remove StandardFilter since it doesn't do anything.
- We fixed explanations of some function-score queries.
- We fixed StandardAnalyzer to no longer remove English stopwords, making it a reasonable default regardless of the language.
- We would like to remove the ability to iterate over indexed fields, but this can't be done yet: term vectors can be enabled for some documents and disabled for others within the same index, and that would have to be fixed first.
- We are proposing to improve field lookup so that it runs in constant time.
- We noticed that AnalyzerWrapper fails to propagate calls to TokenStreamComponents.setReader, which is problematic for some analyzers.
- FrenchLightStemmer removes most diacritical marks but not diaereses.