Product release

Elasticsearch for Apache Hadoop 2.3.0 and 2.2.1 released

Joining the release train this week, Elasticsearch for Apache Hadoop 2.3.0 and 2.2.1 are now out, bringing compatibility improvements and bug fixes. Users are encouraged to upgrade as soon as possible to take advantage of them.

As always, the artifacts are available from the downloads page or Maven.

Important fixes

HDFS repository compatibility with Elasticsearch 2.3.0

For those who missed it, Elasticsearch 5.0.0 alpha1 was released a few days back and, among its bundle of features, ships with the repository-hdfs plugin out of the box. As such, barring any unforeseen events, ES-Hadoop 2.3 will be the last release cycle containing the HDFS repository plugin.

Optimized network transfer for fixed routing

When a fixed or predefined routing is used, the connector optimizes the network requests to hit only the target shards, for both reads and writes.
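To see why a fixed routing lets a request skip most shards, here is a minimal sketch of the general idea: a routing value is hashed and reduced modulo the number of primary shards, so one routing value always lands on one shard. The function names, the `customer-42` routing value and the shard count are hypothetical, and CRC32 is only a stand-in for the hash Elasticsearch actually uses. This is not the connector's code.

```python
import zlib


def shard_for(routing: str, num_primary_shards: int) -> int:
    """Map a routing value to a shard number.

    CRC32 is a stand-in hash chosen for illustration only; it is an
    assumption, not the hash function Elasticsearch uses internally.
    """
    return zlib.crc32(routing.encode("utf-8")) % num_primary_shards


def target_shards(routing_values, num_primary_shards):
    """Return the sorted, de-duplicated shards a set of routing values maps to."""
    return sorted({shard_for(r, num_primary_shards) for r in routing_values})


NUM_SHARDS = 5  # hypothetical index with five primary shards

# Without routing, a request has to consider every shard.
all_shards = list(range(NUM_SHARDS))

# With a single fixed routing value, exactly one shard needs to be contacted.
fixed = target_shards(["customer-42"], NUM_SHARDS)

print(f"without routing: {len(all_shards)} shards, with fixed routing: {len(fixed)} shard")
```

With one routing value the target set always has a single entry, which is the property that makes it possible to narrow a read or write to the relevant shards.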

Improved indexing of Spark RDDs

The check for empty Spark RDDs has been tweaked to avoid triggering loading of the RDD content, which is especially important when using disk persistence or no caching.
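The general principle behind a cheap emptiness check can be sketched in plain Python: instead of materializing or counting a lazily computed dataset, pull at most one element. This is similar in spirit to Spark's `RDD.isEmpty()`. Whether the connector does exactly this is an assumption, and `is_empty` and `expensive_source` are hypothetical names used only for illustration.

```python
from itertools import islice


def is_empty(records) -> bool:
    """Return True if the iterable yields nothing, pulling at most one element."""
    return not list(islice(iter(records), 1))


def expensive_source(n, log):
    """Simulate a lazily computed dataset; `log` records every element computed."""
    for i in range(n):
        log.append(i)
        yield i


log = []
empty = is_empty(expensive_source(1000, log))

# Only the first element was computed; the remaining 999 were never touched.
print(empty, len(log))
```

Note that a Python generator loses the element that was peeked, whereas an RDD can be recomputed, so the sketch only illustrates the cost difference between checking one element and loading everything.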

Better detection of shards overlap

The algorithm for checking overlapping shards has been improved (thanks to a user contribution) to use significantly less memory, thus increasing the number of indices it can work on.

Last 2.2.x release

Alongside 2.3, ES-Hadoop 2.2.1 is released as the last planned maintenance release in the 2.2.x line. It contains a series of backported bug fixes for those with conservative upgrade paths. However, even if you are on ES 1.x, upgrading to ES-Hadoop 2.3 is highly recommended.

Feedback

Looking forward to hearing your feedback on ES-Hadoop 2.3! You can find us on GitHub, Twitter (@elastic) or the forums. IRC works too.