This Week in Elasticsearch and Apache Lucene - 2016-07-04
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
I love the new "How To... Tune for indexing speed / search speed / disk usage" section in the #elasticsearch docs https://t.co/czuVXJGKxp
— Clinton Gormley (@clintongormley) June 27, 2016
Elasticsearch Core
Changes in 2.x:
- Extended bounds on a histogram agg should still be applied even if no docs match.
- Bool queries should complain when more than one query is passed in a map.
- The "key_as_string" value for "epoch_millis" should not take the time zone into account.
- The {{#toJson}} and {{#join}} Mustache helpers make rendering arrays and maps easier.
- Azure discovery now works with the security manager.
- The S3 repository plugin needs an extra permission to work around a bug in the AWS library.
- The GCE cloud plugin no longer throws an NPE if the `region` is empty.
- Time zones are hard.
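To make the Mustache helpers item above a little more concrete, here is a rough Python emulation of what `{{#toJson}}` and `{{#join}}` are meant to render inside a search template. The function names and sample parameters are hypothetical, and the comma delimiter for `join` is an assumption about its default. This is a sketch of the idea, not the Elasticsearch implementation.

```python
import json


def render_to_json(params, name):
    """Emulate {{#toJson}}name{{/toJson}}: serialize the parameter as JSON."""
    return json.dumps(params[name])


def render_join(params, name, delimiter=","):
    """Emulate {{#join}}name{{/join}}: concatenate array items into one string.

    The comma default is an assumption made for this sketch.
    """
    return delimiter.join(str(item) for item in params[name])


# Hypothetical template parameters, invented for illustration.
params = {
    "statuses": ["pending", "published"],
    "emails": ["a@example.com", "b@example.com"],
}

# Template fragment: {"terms": {"status": {{#toJson}}statuses{{/toJson}}}}
terms_clause = '{"terms": {"status": ' + render_to_json(params, "statuses") + "}}"

# Template fragment: {"match": {"emails": "{{#join}}emails{{/join}}"}}
match_clause = '{"match": {"emails": "' + render_join(params, "emails") + '"}}'
```

Without such helpers, a template author would have to iterate over the array by hand and manage separators and quoting, which is where most malformed template output comes from.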
Changes in master:
- Settings whose validators depend on other settings should be checked by the coordinating node when updated.
- Ingest gains a `user-agent` processor.
- Empty ingest pipelines skip the conversion to map-of-maps for efficiency.
- AWS plugins support the new Asia Pacific (Mumbai) region.
- The percolator can extract terms from SynonymQuery, and uses RAMDirectory for querying nested docs.
- Painless gains replaceAll() and replaceFirst() functions.
- The plugin manager shows a progress bar during downloads.
- Time units hold onto the original unit for round-tripping, which means removing support for "w" (weeks) and for fractional time values.
- ExceptionsHelper.detailedMessage() is deprecated as it loses the stack trace.
- The `discovery-azure` plugin has been renamed to `discovery-azure-classic` to make way for the new `discovery-azure-arm` plugin.
- Now that index folders are named after the UUID, not the index name, the index UUID should be exposed in the cat API.
- NodeClient should be available in the HTTP service instead of being passed to all RestHandler constructors.
- Snapshot/restore's "index" file (which records the snapshots held in the repository) is now written atomically.
- Handoff during primary relocation was subject to a bug which could produce deadlock. A queued approach fixes this bug.
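The time units item above can be illustrated with a small sketch. The `TimeValue` class, its unit table, and its parsing rules below are assumptions made for illustration, not the actual Elasticsearch code. The idea shown is that a value stored as a whole number plus its original unit prints back exactly as it was written, and that a fractional input such as "1.5h" cannot be stored that way without silently changing its unit.

```python
import re
from dataclasses import dataclass

# Units assumed for this sketch; note that there is no "w" (weeks).
UNIT_TO_NANOS = {
    "nanos": 1,
    "micros": 1_000,
    "ms": 1_000_000,
    "s": 1_000_000_000,
    "m": 60 * 1_000_000_000,
    "h": 3_600 * 1_000_000_000,
    "d": 86_400 * 1_000_000_000,
}

# Whole numbers only: a fractional value would need a different unit to be exact.
_PATTERN = re.compile(r"(\d+)(nanos|micros|ms|s|m|h|d)")


@dataclass(frozen=True)
class TimeValue:
    duration: int  # whole number of units
    unit: str      # the unit exactly as the user wrote it

    @classmethod
    def parse(cls, text):
        match = _PATTERN.fullmatch(text.strip())
        if match is None:
            raise ValueError(f"unsupported time value: {text!r}")
        return cls(int(match.group(1)), match.group(2))

    def nanos(self):
        # Normalized duration, for comparisons and arithmetic.
        return self.duration * UNIT_TO_NANOS[self.unit]

    def __str__(self):
        # Round trip: print the value back in its original unit.
        return f"{self.duration}{self.unit}"
```

In this sketch, `TimeValue.parse("90m")` prints back as `90m` rather than being normalized to another unit, while `"1.5h"` and `"2w"` are rejected with a `ValueError`.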
Ongoing changes:
- The effort to remove Guice continues: it has been removed from mapper plugins, plugins should implement ActionPlugin instead of onModule, and RestHandler registration now happens in ActionModule and ActionPlugin.
- Networking is being moved to a module to lock down networking permissions, and Netty3 is being upgraded to the latest version.
- Script compilation no longer requires the cluster state. Instead, StoredScripts should be a cluster state listener, and scripts should be resolved by the coordinating node and stored in the QueryShardContext to ensure they don't change during long scrolls.
- Aggregation streams are being replaced by NamedWritable.
- Similarities should be dynamically updatable.
- The query_string's `lowercase_expanded_terms` and `locale` shouldn't be required, as this should be handled automatically by the analysis pipeline.
Apache Lucene
The Apache Lucene update will be back next week.
Watch This Space
Stay tuned to this blog, where we'll share more news on the whole Elastic ecosystem including news, learning resources and cool use cases!