Product release

Elasticsearch 5.3.1 released

Today we are pleased to announce the bugfix release of Elasticsearch 5.3.1, based on Lucene 6.4.2. It is already available for deployment on Elastic Cloud, our Elasticsearch-as-a-service platform. All users of 5.3.0 should upgrade.

Latest stable release in 5.x: Elasticsearch 5.3.1

Multi Data Path Bug in Elasticsearch 5.3.0

Elasticsearch 5.3.0 contained a bug that was triggered by using default.path.data (configured by default in the RPM and Debian packages) along with an array setting for path.data. The bug resulted in the configured paths being added to the default path instead of overwriting it. This release contains a fix, as well as a check to see if there is still data sitting in the default path. See Multi data path bug in Elasticsearch 5.3.0 for more information about how to recover from this bug.
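
To make the failure mode concrete, the following Python sketch models the difference between the 5.3.0 behaviour and the intended one. It is an illustration only, not Elasticsearch's actual code: the function names, the ordering of the merged list, and the example paths are assumptions, and has_leftover_data is a simple stand-in for the new check (it only tests whether a directory exists and is non-empty).

```python
import os


def effective_paths_5_3_0(default_path, configured_paths):
    """Model of the bug: configured paths are added to the default path."""
    # Assumption: the order of the merged list is illustrative only.
    return [default_path] + list(configured_paths)


def effective_paths_intended(default_path, configured_paths):
    """Intended behaviour: an explicit path.data overrides the default."""
    return list(configured_paths) if configured_paths else [default_path]


def has_leftover_data(default_path):
    """Stand-in for the new check: is anything still in the default path?"""
    return os.path.isdir(default_path) and bool(os.listdir(default_path))


if __name__ == "__main__":
    default = "/var/lib/elasticsearch"         # hypothetical package default
    configured = ["/mnt/data1", "/mnt/data2"]  # hypothetical path.data array
    print(effective_paths_5_3_0(default, configured))
    print(effective_paths_intended(default, configured))
    print(has_leftover_data(default))
```

In this model, the buggy variant returns three paths, so shards could end up in the default path even though only two paths were configured.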

Misconfigured Shingle/CJK Filters Can Cause OOM

Much recent work has gone into improving Lucene’s handling of graph token streams, in which the analysis of text (from a document at index time or from a query at search time) produces multiple overlapping paths or interpretations for the tokens. Most token filters (e.g. synonyms, shingles, CJK, word-delimiter) now use graph analysis to calculate the correct order of tokens for accurate phrase queries.

However, a misconfigured token filter can generate too many paths, which can consume your entire heap space. For example, a shingle token filter with max_shingle_size and min_shingle_size set to different values or with output_unigrams set to true can easily result in an explosion of paths.
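
A rough way to see the explosion is to count paths in a simplified model of the shingle token graph. The sketch below assumes that a shingle of k words is an edge spanning k positions, so the number of paths through an n-word text equals the number of ways to split n into allowed shingle lengths. The function name count_graph_paths and the model itself are illustrative assumptions, not Lucene's implementation.

```python
def count_graph_paths(num_tokens, min_shingle_size=2, max_shingle_size=2,
                      output_unigrams=True):
    """Count paths through a simplified shingle graph over num_tokens words."""
    # Allowed edge lengths: every shingle size, plus 1 if unigrams are emitted.
    sizes = set(range(min_shingle_size, max_shingle_size + 1))
    if output_unigrams:
        sizes.add(1)

    # paths[pos] = number of distinct ways to reach position pos from 0.
    paths = [0] * (num_tokens + 1)
    paths[0] = 1
    for pos in range(1, num_tokens + 1):
        paths[pos] = sum(paths[pos - k] for k in sizes if k <= pos)
    return paths[num_tokens]


if __name__ == "__main__":
    for n in (10, 20, 50):
        print(n, count_graph_paths(n))
```

Under these assumptions, 2-word shingles with unigrams give 89 paths for 10 tokens and 10946 for 20, while a single fixed shingle size without unigrams never yields more than one path.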

This release contains a protection mechanism that turns off graph analysis in the shingle and CJK token filters when they are configured to use different shingle lengths.
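
Index settings can also be screened before they are deployed. The following sketch is a hypothetical helper, not part of Elasticsearch: it walks the analysis.filter section of an index settings dictionary and flags shingle filters whose min_shingle_size and max_shingle_size differ or whose output_unigrams is true. It assumes the shingle filter defaults of 2 for both sizes and true for output_unigrams; the helper name and the example filter names are made up for illustration.

```python
def _as_bool(value):
    """Accept both JSON booleans and the strings "true"/"false"."""
    return str(value).lower() == "true"


def risky_shingle_filters(index_settings):
    """Return the sorted names of shingle filters that may explode in paths."""
    filters = (index_settings.get("settings", {})
                             .get("analysis", {})
                             .get("filter", {}))
    risky = []
    for name, conf in filters.items():
        if conf.get("type") != "shingle":
            continue
        # Assumed defaults: sizes of 2, output_unigrams true.
        min_size = int(conf.get("min_shingle_size", 2))
        max_size = int(conf.get("max_shingle_size", 2))
        unigrams = _as_bool(conf.get("output_unigrams", True))
        if min_size != max_size or unigrams:
            risky.append(name)
    return sorted(risky)


if __name__ == "__main__":
    example = {"settings": {"analysis": {"filter": {
        "fixed_bigrams": {"type": "shingle", "min_shingle_size": 2,
                          "max_shingle_size": 2, "output_unigrams": False},
        "mixed_sizes": {"type": "shingle", "min_shingle_size": 2,
                        "max_shingle_size": 3, "output_unigrams": False},
    }}}}
    print(risky_shingle_filters(example))
```

A filter flagged here is not necessarily wrong; the check only highlights configurations worth reviewing.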

Cross-Cluster Search and Security

Cross-cluster search now works with clusters that have X-Pack Security enabled. See Configuring Cross-Cluster Search with Security.

Other Changes

Other changes worthy of mention include:

  • The filter element in the filter and significant_terms aggregations was being executed as a query instead of a filter. It is now executed as a filter.
  • The Netty receive predictor size has been changed to 64 kB to increase throughput and decrease garbage collection.
  • Reindex-from-remote was unable to pull from clusters running a version before 2.0.0. This now works.
  • Sliced scrolls could return incomplete or duplicate results if the local time differed between nodes. This has been fixed.

Conclusion

Please download Elasticsearch 5.3.1, try it out, and let us know what you think on Twitter (@elastic) or in our forum. You can report any problems on the GitHub issues page.