April 7, 2016

This Week in Elasticsearch and Apache Lucene - 2016-04-05


Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.

Top News

#Elasticsearch 5.0.0-alpha1 release w/#Lucene 6, ingest node, & more!
Details: go.es.io/1MQddXD

— elastic (@elastic) April 5, 2016

Elasticsearch Core

Changes in 2.3:

Check that a translog is still open when asking for a new view on it.
Some columns in the cat APIs had duplicate column aliases.
Fixed an ArrayOutOfBounds exception when running aggregations on shards without values.

Changes in master:

The new /_cluster/allocation/explain API explains why a shard can or cannot be allocated to nodes in the cluster.
Type filters no longer impact query time when there is only one type.
Dynamically added string fields now add a main "text" field and a sub "keyword" field. Text fields have fielddata disabled by default.
New dynamically settable soft limits added to protect unaware users from dangerous practices:
- Limit the number of fields that can be added to an index.
- Limit the maximum depth of mappings.
- Limit the number of shards that can be searched.
Node attributes must now be specified with `node.attr.xxx` instead of `node.xxx`.
The node.client setting has been removed in favour of node.master|data|ingest.
Throttling of an in-flight reindex request can now be updated dynamically.
The task management API can now return tasks grouped by parent task.
Explain on percolator queries now only runs on queries which could match.
The percolator query now supports scoring.
Fixed a bug allowing OOMs when recovering from the translog.
Removed the deprecated "reverse" option from sorting.
Don't hide stack traces when throwing exceptions.
Translog configuration is now immutable.
Cluster health checks should wait for the state to be applied, not ignore in-flight requests.
Inner hits has been refactored, which means that the search refactoring is now complete, bar some minor cleanups.
The convert ingest processor now supports an auto option to auto-detect date, boolean, and numeric types.
The IndexOperationListener now reports whether a document was created or not.
The Painless code has been cleaned up, moving all Java code out of the ANTLR grammars, improving error messages, and optimizing access to _score.
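
To make the dynamic string mapping change listed above more concrete, here is a minimal Python sketch of the mapping shape a dynamically added string field would get: a main "text" field with a "keyword" sub field. The helper names and the sample document are hypothetical, and the sketch only builds the mapping as a dictionary. It does not talk to a cluster, and the exact defaults may differ in the final release.

```python
import json


def dynamic_string_mapping():
    """Return the mapping shape described above for a new string field."""
    return {
        "type": "text",        # main field, analyzed for full-text search
        "fielddata": False,    # the stated default, spelled out for clarity
        "fields": {
            # sub field for exact matching, sorting and aggregations
            "keyword": {"type": "keyword"},
        },
    }


def infer_string_mappings(doc):
    """Hypothetical helper: map every top-level string value in `doc`.

    Non-string values are skipped; real dynamic mapping handles them
    with other rules that this sketch does not model.
    """
    return {
        name: dynamic_string_mapping()
        for name, value in doc.items()
        if isinstance(value, str)
    }


if __name__ == "__main__":
    sample = {"title": "Hello world", "views": 42}
    print(json.dumps({"properties": infer_string_mappings(sample)}, indent=2))
```

With this shape, a field such as `title` would be searched as full text, while `title.keyword` would be the one to use for sorting and aggregations, since fielddata is disabled on the text field.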

Ongoing changes:

Work continues on removing PROTOTYPE from our code base.
Adding index deletion tombstones to the cluster state to prevent old indices from popping back into existence.
The task management API should indicate which tasks can be cancelled.
The function_score query will learn how to combine scores from multiple queries.

Apache Lucene

It looks like we will release Lucene 6.1.0 before 6.0.0!
The second release candidate for 6.0.0 is out! Go test it and vote!
Distance queries get much faster with a better test for whether a BKD cell overlaps a circle on the earth's surface, but required this cool whole-earth debugger to help understand the tricky cases
We now have much better Polygon support, including multi-polygons, optionally containing holes, such that we can run real-world polygons, like Russia, without exhausting a 10 GB java heap
The newly created GeoTestUtil now has useful APIs for making random surprise-me polygons like these exotic nuclear-warfare-like shapes, and the base test class is now simpler
Spatial tests now use SerialMergeScheduler for better reproducibility
The bare essential geo spatial utility APIs are moving to core and being consolidated so all spatial modules can share them
LatLonPoint and GeoPointField should quantize in exactly the same way
We now use precisely the same constant for the mean radius of the earth when it's modeled (approximately) as a sphere
It's tricky to get javadocs working across our spatial modules
Our release tools still have remnants of subversion, and struggle with how we name our release branches
Geo3d gets easy-to-use APIs matching our geo2d APIs
The document classifier confusion matrix had buggy accuracy and precision calculations
The spatial-extras module has cutover to points
OfflineSorter more efficiently handles fixed-width values used by dimensional points
We are struggling with query-time quantization issues with LatLonPoint and GeoPointField, including NaNs
Reduce the number of polygon utility methods
We now sometimes test triangle shapes in our geo tests
GeoPointRangeDistanceQuery does not work with multi-valued documents
MoreLikeThisQuery should keep track of which terms came from which fields, but this seems to cause at least one test failure
Improve testing for long ordinals in BKDWriter without having to index 2.1 billion points
OfflineSorter should not always merge down to one segment in the end
GeoPointField should use the same full 64 bit encoding as LatLonPoint
Geo3d will also support polygons with holes, but handling "sideness" of a polygon is somewhat tricky for geo3d
We can optimize polygon queries with faster checks for whether BKD cells overlap the query polygon
Sometimes, BooleanQuery's explain method can lie about its score
Document classifier should also look at numeric fields
Should Lucene support boolean subset matching?
The legacy UninvertingReader class won't get multi-valued points support
SpanNearQuery can assign the wrong score when inner clauses overlap
Our web site still embarrassingly shows the latest subversion commits!
Another randomized geo test failure, this time on a tiny radius (14.3 cm!)
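
The shared earth-radius constant mentioned above matters because every spherical distance scales linearly with it, so two modules using slightly different radii would disagree on whether a point falls inside a distance query. The following Python sketch shows this with a plain haversine formula. It is an illustration only, not Lucene's code: the function name is hypothetical, and the radius value (the IUGG mean radius, 6,371,008.7714 m) is an assumption about which constant is meant.

```python
import math

# Assumed value: IUGG mean earth radius in meters (illustrative only).
EARTH_MEAN_RADIUS_METERS = 6_371_008.7714


def haversine_meters(lat1, lon1, lat2, lon2, radius=EARTH_MEAN_RADIUS_METERS):
    """Great-circle distance between two lat/lon points given in degrees.

    The earth is modeled as a sphere of the given radius, so the result
    is an approximation of the true distance on the ellipsoid.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * radius * math.asin(min(1.0, math.sqrt(a)))


if __name__ == "__main__":
    # Same two points, two slightly different radius constants.
    d_mean = haversine_meters(48.85, 2.35, 40.71, -74.01)
    d_other = haversine_meters(48.85, 2.35, 40.71, -74.01, radius=6_378_137.0)
    print(f"difference between constants: {d_other - d_mean:.1f} m")
```

Running the example prints how far apart the two results are for one long-distance pair, which is the kind of discrepancy a single shared constant avoids.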

Watch This Space

Stay tuned to this blog, where we'll share more news on the whole Elastic ecosystem including news, learning resources and cool use cases!
