15 June 2018

This Week in Elasticsearch and Apache Lucene - 2018-06-15

By Colin Goodheart-Smithe, Jason Tedor, Paul Sanwald, Adrien Grand, Jay Modi, Boaz Leskes

Elasticsearch weekly highlights

Rolling Upgrade to 6.3 Issue

After the release of 6.3.0, we received reports of nodes failing to rejoin their cluster after a rolling upgrade, on clusters that have both a gold or platinum license installed and security enabled. The workaround is to explicitly set xpack.security.enabled: true in the elasticsearch.yml file; we are adding this to the docs as a known issue. The root cause is a change in 6.3.0 where security is disabled for trial licenses unless it is explicitly enabled; for existing gold and platinum licenses it should be enabled by default. The problem occurs because a newly started node does not yet know the current license type and defaults to a trial license, so it does not send authentication with its ping requests, leading the nodes in the existing cluster to reject those requests. A fix is being worked on.
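
As a minimal sketch of the workaround described above, the line to add to elasticsearch.yml on each node would look like this. Only the setting named in the text is shown; the rest of the file is assumed to stay unchanged.

```yaml
# Explicitly enable security so an upgraded node does not fall back to the
# trial-license default (security disabled) before it learns the real license.
xpack.security.enabled: true
```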

SAML - RequestedAuthnContext

We have added support for RequestedAuthnContext, a feature that has been requested by a number of users. This attribute allows Service Providers, in our case the Elastic Stack, to request that the Identity Provider authenticate the user in a certain way, such as with multi-factor authentication.

Zen2

With the basic components in place, we have started to work on one of the main user-facing improvements in Zen2 compared to ZenDiscovery: removing the need for a minimum_master_nodes setting. Zen2 extends the notion of cluster membership with a weight. Each node in the cluster is assigned a weight (currently 0 or 1), and only the votes of nodes with a weight count in an election. For a node to become master, it needs to win the votes of a majority of the nodes with weights. When a master-eligible node joins an existing cluster, it receives a weight so that it can participate in elections and support the master if other nodes fail. However, when a node leaves the cluster, the master can remove its weight, provided the master still has a majority of the nodes supporting it. Once the weight of that node is removed, it cannot participate in future elections (until it rejoins). Similarly, if the cluster is partitioned, the majority side (if one exists) will be able to remove the weights of all the nodes on the minority side. Once those weights are removed, no node on the majority side will vote for a node on the minority side. At the same time, the minority side can't form a cluster because it doesn't have enough weights based on the last known cluster configuration. This prevents a split brain from happening. The only way for the minority side to form a cluster is to rejoin the majority side and get weights re-assigned to it.
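
The partition scenario above can be sketched as a toy model. This is a simplified illustration, not the Zen2 implementation: the `VotingConfig` class, its method names, and the five node names are all hypothetical, and the model ignores terms, publication, and the details of how a configuration change is committed.

```python
from dataclasses import dataclass, field


@dataclass
class VotingConfig:
    """Hypothetical model: the set of node names that currently hold weight 1."""

    weighted: set = field(default_factory=set)

    def has_majority(self, supporters):
        """True if `supporters` contain a strict majority of the weighted nodes."""
        votes = len(self.weighted & set(supporters))
        return votes * 2 > len(self.weighted)

    def remove_weights(self, departed, reachable):
        """Drop the weights of `departed` nodes, but only if the nodes that are
        still reachable form a majority of the current configuration."""
        if not self.has_majority(reachable):
            return False
        self.weighted -= set(departed)
        return True


# A five-node cluster splits into a three-node side and a two-node side.
all_nodes = {"a", "b", "c", "d", "e"}
majority_side = {"a", "b", "c"}
minority_side = {"d", "e"}

# Each side starts from the same last known configuration.
majority_view = VotingConfig(set(all_nodes))
minority_view = VotingConfig(set(all_nodes))

# The majority side may shrink the configuration; the minority side may not.
majority_removed = majority_view.remove_weights(minority_side, majority_side)
minority_removed = minority_view.remove_weights(majority_side, minority_side)

# Only the majority side can still win an election.
majority_can_elect = majority_view.has_majority(majority_side)
minority_can_elect = minority_view.has_majority(minority_side)
```

In this sketch the minority side still believes all five nodes hold a weight, so its two votes never reach a majority. That is the property that replaces a manually configured minimum_master_nodes.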

The scripted_metric aggregation: experimental no more!

The scripted metric aggregation was added back in 1.4.0 as a way to use scripts to provide aggregation functions that are not available in the product without having to write a custom aggregation. We marked it experimental as we weren't sure how useful it was.

Since then, the aggregation has remained largely unchanged but has been used by the community. Because it's been in this state for so long, we don't feel we would be able to defend making large changes or removing it without providing a clear deprecation/migration path, which means the experimental status has lost its value.

So, as of 6.4.0, the scripted_metric aggregation is no longer marked experimental. It should still be used as a last resort, but we recognize there is a long-term place for this aggregation, at least until adding a custom aggregation is a much friendlier experience and part of a supported public API. It should also be noted that we still have plans to move it to a plugin.
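
For readers unfamiliar with the aggregation, it runs user scripts in four phases: init, map, combine, and reduce. The following is a plain-Python simulation of that flow, not Elasticsearch code. The `scripted_metric` function, the sample documents, and the profit calculation are illustrative assumptions.

```python
def scripted_metric(shards, init, map_doc, combine, reduce):
    """Simulate the four script phases over `shards`, a list of per-shard
    document lists."""
    partials = []
    for docs in shards:
        state = init()                   # init: once per shard, before collection
        for doc in docs:
            map_doc(state, doc)          # map: once per collected document
        partials.append(combine(state))  # combine: once per shard, after collection
    return reduce(partials)              # reduce: once, over all shard results


# Two hypothetical shards holding sales and costs.
shards = [
    [{"type": "sale", "amount": 80}, {"type": "cost", "amount": 10}],
    [{"type": "sale", "amount": 130}, {"type": "cost", "amount": 30}],
]

profit = scripted_metric(
    shards,
    init=lambda: {"transactions": []},
    map_doc=lambda state, doc: state["transactions"].append(
        doc["amount"] if doc["type"] == "sale" else -doc["amount"]
    ),
    combine=lambda state: sum(state["transactions"]),
    reduce=lambda partials: sum(partials),
)
print(profit)
```

Because each shard returns only its combined partial result, the reduce phase works on one value per shard rather than on every document.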

NIO

We added CORS support to the transport-nio networking implementation. With this addition, transport-nio has reached feature parity with the transport-netty4 networking implementation. Additionally, there is now benchmarking in place using transport-nio, which will enable us to closely monitor how future changes impact performance. While there is still a lot of work remaining on transport-nio, this progress marks a milestone for the effort.

Changes

Changes in 5.6:

Changes in 6.3:

  • Compliant SAML Response destination check #31175
  • [Rollup] Metric config parser must use builder so validation runs #31159
  • Security: fix token bwc with pre 6.0.0-beta2 #31254
  • Don’t swallow exceptions on replication #31179

Changes in 6.4:

  • BREAKING (java): LLClient: Support host selection #30523
  • CCS: don’t proxy requests for already connected node #31273
  • Remove RestGetAllAliasesAction #31308
  • Add Get Aliases API to the high-level REST client #28799
  • Treat ack timeout more like a publish timeout #31303
  • SQL: Whitelist SQL utility class for better scripting #30681
  • Add notion of internal index settings #31286
  • REST high-level client: add Cluster Health API #29331
  • HLRest: Add get index templates API #31161
  • Ignore numeric shard count if waiting for ALL #31265
  • Revert upgrade to Netty 4.1.25.Final #31282
  • Support RequestedAuthnContext #31238
  • Validate xContentType in PutWatchRequest. #31088
  • [INGEST] Interrupt the current thread if evaluation grok expressions take too long #31024
  • Upgrade to Netty 4.1.25.Final #31232
  • Use stronger write-once semantics for Azure repository #30437
  • SQL: Make a single JDBC driver jar #31012
  • Allow to trim all ops above a certain seq# with a term lower than X #30176

Changes in 7.0:

  • BREAKING: REST hl client: cluster health to default to cluster level #31268
  • BREAKING: REST high-level Client: remove deprecated API methods #31200
  • Limit the number of concurrent requests per node #31206
  • Allow to trim all ops above a certain seq# with a term lower than X, … #31211
  • Default max concurrent search req. numNodes * 5 #31171

Apache Lucene

Lucene 7.4

Blockers are resolved, and we expect to build a first release candidate on June 18th.

Other

- We had to relax an assertion since it might not always hold due to concurrency.

- We want to remove StandardFilter since it doesn't do anything.

- We fixed explanations of some function-score queries.

- We fixed StandardAnalyzer to no longer remove English stopwords, making it a reasonable default regardless of the language.

- We would like to remove the ability to iterate over indexed fields, but this can't be done at the moment: term vectors can be enabled for some documents and disabled for others in the same index, so this would have to be fixed first.

- We are proposing to improve field lookup so that it runs in constant time.

- We noticed that AnalyzerWrapper fails to propagate calls to TokenStreamComponents.setReader, which is problematic for some analyzers.

- FrenchLightStemmer removes most diacritical marks but not diaereses.