This Week in Elasticsearch and Apache Lucene - 2017-05-08
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
A first look at Elastic's new Machine Learning Technology https://t.co/jHf5orWooF
— David Pilato🇫🇷🇪🇺 (@dadoonet) May 6, 2017
Changes in 5.4:
- The KERBEROS security mode in the HDFS repository plugin did not play well with the security manager.
- Upgraded to Lucene 6.5.1.
- With Netty upgraded to 4.1.10, we can now customise how many processors are used by Netty to do resource sizing.
- Async responses from stale nodes should not be confused with responses from more recent requests.
- The nodes info API now includes JVM arguments, which are now also logged on startup.
- Netty included a patch from us to suppress warnings about unsafe not being available, but that change was lost during an upgrade, so we hack around it until Netty puts it back.
- The _field_caps API should handle a field that exists in only some indices.
- Plugin installation could fail with bad directory permissions if the user has a restrictive umask.
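To make the JVM-arguments item above concrete, here is a minimal sketch of how a Java process can read the arguments it was started with, using the JDK's standard RuntimeMXBean. It is not Elasticsearch's own code; the class name JvmArgs and its methods are made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper for illustration; not part of Elasticsearch.
class JvmArgs {
    // Returns the arguments passed to the JVM itself (for example -Xmx or -XX flags),
    // not the arguments passed to main().
    static List<String> inputArguments() {
        return ManagementFactory.getRuntimeMXBean().getInputArguments();
    }

    // True if the given flag appears verbatim among the supplied arguments.
    static boolean hasFlag(List<String> jvmArgs, String flag) {
        return jvmArgs.contains(flag);
    }

    public static void main(String[] args) {
        System.out.println("JVM arguments: " + inputArguments());
    }
}
```

Logging this list at startup makes it possible to check from the logs alone which flags a node was really started with.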
Changes in 5.x:
- Added an ip_range field type.
- The _type field can now be disabled on an index, preventing the use of more than one type in an index, and removing the _type field from the _uid. This restriction is off by default in 5.x and will be on by default in master.
- The RemoteClusterAware base class allows services to listen to remote cluster config updates.
- Preserve the cluster alias throughout search execution to ensure that node lookup happens in the correct cluster.
- The _search_shards API now returns all aliases, not just filtered aliases, to aid with cross cluster search alias and index resolution.
- Cross cluster searches should expand wildcards to concrete indices and aliases so that shard-level search requests don't see wildcards.
- The UpgraderPlugin allows plugins to upgrade index templates and index metadata on startup.
- The _field_caps API now works cross cluster.
- Terms.Bucket is now an interface instead of an abstract class to make it easier for the Java REST client to provide its own implementation.
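The last item is easier to see with a small sketch. Java classes can extend only one class but implement any number of interfaces, so a client-side bucket that already has its own base class can only satisfy the contract if that contract is an interface. All names below (Bucket, ParsedFragment, ParsedBucket, BucketDemo) are simplified, hypothetical stand-ins, not the real Elasticsearch types.

```java
import java.util.Arrays;
import java.util.List;

// Simplified, hypothetical stand-in for a bucket contract; not the real Terms.Bucket.
interface Bucket {
    String getKeyAsString();
    long getDocCount();
}

// Hypothetical base class that a client might already need for response parsing.
abstract class ParsedFragment {
    abstract String origin();
}

// Because Bucket is an interface, this class can keep its own base class
// and still be used wherever a Bucket is expected.
class ParsedBucket extends ParsedFragment implements Bucket {
    private final String key;
    private final long docCount;

    ParsedBucket(String key, long docCount) {
        this.key = key;
        this.docCount = docCount;
    }

    @Override
    public String getKeyAsString() {
        return key;
    }

    @Override
    public long getDocCount() {
        return docCount;
    }

    @Override
    String origin() {
        return "rest-response";
    }
}

class BucketDemo {
    // Works against the interface only, so any implementation is accepted.
    static long totalDocCount(List<? extends Bucket> buckets) {
        long total = 0;
        for (Bucket bucket : buckets) {
            total += bucket.getDocCount();
        }
        return total;
    }

    public static void main(String[] args) {
        List<ParsedBucket> buckets = Arrays.asList(new ParsedBucket("a", 3), new ParsedBucket("b", 4));
        System.out.println(totalDocCount(buckets));
    }
}
```

Had Bucket been an abstract class, ParsedBucket would have had to give up ParsedFragment in order to extend it.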
Changes in master:
- The open/close-index APIs, unlike the delete-index API, threw an exception when wildcard resolution resulted in no matching indices.
- The file store can be simplified (removing heroics and old bug handling) by simply trying to write to a disk and throwing an exception.
- Set -XX:-OmitStackTraceInFastThrow to prevent the JVM from losing stack traces.
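For context on that last flag: once HotSpot has JIT-compiled a hot code path, it may throw a preallocated exception without a stack trace for some implicit exceptions such as NullPointerException. The sketch below tries to provoke that; FastThrowDemo is a hypothetical class, and whether or when the trace disappears depends on the JVM and its JIT, so no particular output is promised.

```java
// Hypothetical demo class; not part of Elasticsearch.
class FastThrowDemo {
    // Dereferences its argument, so a null argument triggers an implicit NullPointerException.
    static int length(String s) {
        return s.length();
    }

    // Calls length(null) repeatedly and returns the index of the first iteration whose
    // NullPointerException carried no stack frames, or -1 if every exception kept its trace.
    static int firstEmptyTrace(int iterations) {
        for (int i = 0; i < iterations; i++) {
            try {
                length(null);
            } catch (NullPointerException e) {
                if (e.getStackTrace().length == 0) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int hit = firstEmptyTrace(1_000_000);
        System.out.println(hit < 0
            ? "every exception kept its stack trace"
            : "stack trace first missing at iteration " + hit);
    }
}
```

Running the same class with -XX:-OmitStackTraceInFastThrow should keep the stack traces throughout, which is the point of setting the flag.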
Coming soon:
- The Java High Level REST client continues to add support for bucket aggregation responses.
- With sequence numbers, the global checkpoint should not advance during shard recovery.
Apache Lucene
Lucene 7.0
The discussion about releasing Lucene 7 has started again, and it suggests that we target sometime in June. It's also an opportunity for everyone to mention changes they would like to see included, since it will soon be too late to get new changes into that release!
Lucene 6.6
We are about to cut a branch for Lucene 6.6, so you can expect to see it released within a couple of weeks.
Better compression for doc-values
Having an iterator API would allow us to perform more aggressive compression, e.g. by compressing multiple values together. This could be especially effective for time series, since values are expected to be correlated with time, which itself is correlated with index order. This still needs to be explored: it could also have a performance impact, so we need to find a good trade-off.
Other changes:
- The query cache always passes a null query to the onQueryCache listener.
- Can we make it easier to generate point queries with query parsers?
- Join queries have equals/hashcode bugs.
- We want to add a way for DoubleValues (think scripts) to return an explanation but it is not clear whether it is best to put it on the actual values or on their source.
- Facets gained the ability to be computed concurrently.
- Should we remove the postings highlighter now that we have the unified highlighter?
- Making IOUtils.rethrow return the thrown exception proved controversial but was finally committed.
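The idea behind the last item can be sketched as follows. A helper that always throws cannot tell the compiler so, but if it declares a return type the caller can write "throw helper(t)" and the compiler then knows the code path ends there. The Rethrow class below is a hypothetical illustration of that idiom, not Lucene's actual IOUtils signature.

```java
import java.io.IOException;

// Hypothetical helper for illustration; not Lucene's IOUtils.
class Rethrow {
    // Always throws. The Error return type exists only so that callers can write
    // "throw Rethrow.always(t);" and let the compiler see that the path ends here.
    static Error always(Throwable t) throws IOException {
        if (t instanceof IOException) throw (IOException) t;
        if (t instanceof RuntimeException) throw (RuntimeException) t;
        if (t instanceof Error) throw (Error) t;
        throw new RuntimeException(t);
    }

    // Without the "throw" keyword in the catch block, javac would reject this
    // method with a "missing return statement" error.
    static int firstByte(byte[] data) throws IOException {
        try {
            return data[0];
        } catch (Throwable t) {
            throw always(t);
        }
    }
}
```

The cost of the idiom is a return type that is never actually returned, which may be part of why the change was debated.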
Watch This Space
Stay tuned to this blog, where we'll share more news on the whole Elastic ecosystem including news, learning resources and cool use cases!