This Week in Elasticsearch and Apache Lucene - 2017-05-08
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
A first look at Elastic's new Machine Learning Technology https://t.co/jHf5orWooF - David Pilato🇫🇷🇪🇺 (@dadoonet) May 6, 2017
Changes in 5.4:
- The KERBEROS security mode in the HDFS repository plugin did not play well with the security manager.
- Upgraded to Lucene 6.5.1.
- With Netty upgraded to 4.1.10, we can now customise how many processors are used by Netty to do resource sizing.
- Async responses from stale nodes should not be confused with responses from more recent requests.
- The nodes info API now includes JVM arguments, which are now also logged on startup.
- Netty included a patch from us to suppress warnings about unsafe not being available, but that change was lost during an upgrade, so we hack around it until Netty puts it back.
- The _field_caps API should handle a field that exists in only some indices.
- Plugin installation could fail with bad directory permissions if the user has a restrictive
Changes in 5.x:
- Added an
- The _type field can now be disabled on an index, preventing the use of more than one type in an index, and removing the _type field from the _uid. This option is off by default in 5.x and will be on by default in master.
- The RemoteClusterAware base class allows services to listen to remote cluster config updates.
- Preserve the cluster alias throughout search execution to ensure that node lookup happens in the correct cluster.
- The _search_shards API now returns all aliases, not just filtered aliases, to aid with cross cluster search alias and index resolution.
- Cross cluster searches should expand wildcards to concrete indices and aliases so that shard-level search requests don't see wildcards.
- UpgraderPlugin allows plugins to upgrade index templates and index metadata on startup.
- The _field_caps API now works cross cluster.
- Terms.Bucket is now an interface instead of an abstract class to make it easier for the Java REST client to provide its own implementation.
Changes in master:
- The open/close-index APIs, unlike the delete-index API, threw an exception when wildcard resolution resulted in no matching indices.
- The file store can be simplified (removing heroics and old bug handling) by simply trying to write to a disk and throwing an exception.
- The JVM option -XX:-OmitStackTraceInFastThrow is now set to prevent the JVM from losing stack traces.
- The Java High Level REST client continues to add support for bucket aggregation responses.
- With sequence numbers, the global checkpoint should not advance during shard recovery.
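To check whether a flag such as -XX:-OmitStackTraceInFastThrow was actually passed to a running JVM, one option is to read the JVM's input arguments through the standard RuntimeMXBean. The sketch below is only an illustration and is not taken from the Elasticsearch code base; the class name JvmArgsSketch and the helper hasFlag are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

public class JvmArgsSketch {
    /** Hypothetical helper: true if the exact flag appears among the JVM arguments. */
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        // Standard JDK API: the arguments passed to the JVM itself (not to main()).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("JVM arguments: " + jvmArgs);
        System.out.println("OmitStackTraceInFastThrow disabled: "
                + hasFlag(jvmArgs, "-XX:-OmitStackTraceInFastThrow"));
    }
}
```

Running it with `java -XX:-OmitStackTraceInFastThrow JvmArgsSketch` should list the flag among the printed arguments.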
The discussion about releasing Lucene 7 has started again and suggests that we target sometime in June. It's also an opportunity for everyone to mention changes they would like to see included, since it will soon be too late to get new changes into that release!
We are about to cut a branch for Lucene 6.6, so you can expect to see it released within a couple of weeks.
Better compression for doc-values
Having an iterator API would allow us to perform more aggressive compression, e.g. by compressing multiple values together. This could be especially effective for time series, since values are expected to be correlated with time, which itself is correlated with index order. This needs to be explored, as it could also have a performance impact, so we need to find a good trade-off.
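To make the time-series intuition concrete, here is a minimal sketch of one possible way of compressing values together: delta encoding, where each value is stored relative to its predecessor. This is not Lucene's doc-values format; the class DeltaBitsSketch and its methods are hypothetical and only estimate how many bits per value each scheme would need. It assumes at least two values and no overflow when subtracting.

```java
import java.util.Arrays;

public class DeltaBitsSketch {
    /** Bits needed to store a non-negative value (at least 1). */
    static int bitsRequired(long value) {
        return Math.max(1, 64 - Long.numberOfLeadingZeros(value));
    }

    /** Bits per value when each value is stored as an offset from the block minimum. */
    static int bitsPerValueRaw(long[] values) {
        long min = Arrays.stream(values).min().getAsLong();
        long max = Arrays.stream(values).max().getAsLong();
        return bitsRequired(max - min);
    }

    /** Bits per value when each value is stored as (delta to previous value - smallest delta). */
    static int bitsPerValueDelta(long[] values) {
        long minDelta = Long.MAX_VALUE;
        long maxDelta = Long.MIN_VALUE;
        for (int i = 1; i < values.length; i++) {
            long delta = values[i] - values[i - 1];
            minDelta = Math.min(minDelta, delta);
            maxDelta = Math.max(maxDelta, delta);
        }
        return bitsRequired(maxDelta - minDelta);
    }

    public static void main(String[] args) {
        long base = 1494201600000L; // made-up millisecond timestamps, roughly one per second
        long[] timestamps = {base, base + 1000, base + 2003, base + 2998, base + 4001, base + 5000};
        System.out.println("offset from minimum: " + bitsPerValueRaw(timestamps) + " bits/value");
        System.out.println("delta to previous:   " + bitsPerValueDelta(timestamps) + " bits/value");
    }
}
```

For these six timestamps the offsets from the minimum span 5000 and need 13 bits each, while the deltas only vary between 995 and 1003 and fit in 4 bits each. Decoding a delta-encoded value requires the values before it, which is cheap with sequential iteration but costly with random access, hence the possible performance trade-off.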
- The query cache always passes a null query to the
- Can we make it easier to generate point queries with query parsers?
- Join queries have equals/hashcode bugs.
- We want to add a way for DoubleValues (think scripts) to return an explanation but it is not clear whether it is best to put it on the actual values or on their source.
- Facets gained the ability to be computed concurrently.
- Should we remove the postings highlighter now that we have the unified highlighter?
- Making IOUtils.rethrow return the thrown exception proved controversial but was finally committed.
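For readers unfamiliar with the rethrow idiom from the last item, the sketch below shows why a rethrow helper with a return type can be convenient. It assumes the goal is to let callers write `throw rethrow(t)` so the compiler can see that control flow ends there. The class RethrowSketch and its methods are hypothetical and are not Lucene's IOUtils.

```java
public class RethrowSketch {
    /**
     * Hypothetical helper: always throws and never actually returns.
     * The declared return type only exists so callers can write `throw rethrow(t)`.
     */
    static Error rethrow(Throwable t) {
        if (t instanceof RuntimeException) {
            throw (RuntimeException) t;
        }
        if (t instanceof Error) {
            throw (Error) t;
        }
        // Checked exceptions are wrapped so this sketch needs no throws clause.
        throw new RuntimeException(t);
    }

    static int parseOrRethrow(String s) {
        try {
            return Integer.parseInt(s);
        } catch (Throwable t) {
            // With a void helper the compiler would demand a return statement after this call.
            throw rethrow(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrRethrow("42"));
    }
}
```

The cost of this design is a signature that looks as if it returns a value when it never does, which may be part of why the change was debated.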
Watch This Space
Stay tuned to this blog, where we'll share more news on the whole Elastic ecosystem including news, learning resources and cool use cases!