23 August 2018 Releases

Elasticsearch for Apache Hadoop 6.4.0 Released

By James Baiera

We are excited to announce the release of Elasticsearch for Apache Hadoop (aka ES-Hadoop) 6.4.0, built against Elasticsearch 6.4.0. Let’s take a look at what’s new.

Error Handler API is now GA

In 6.2.0 we introduced an Error Handler API that allows users to be notified of and handle errors that are encountered when submitting bulk requests to Elasticsearch. Since then we have been working on improving this feature and opening it up to different parts of the connector. I am pleased to announce that as of version 6.4.0, the Error Handler API is considered GA!

Serialization Error Handlers

One of the larger improvements we have made to the Error Handler API is the ability to handle more classes of errors. Are you plagued by malformed records tripping up your processing jobs? We have just the set of error handlers for you! Serialization Error Handler APIs have been added to ES-Hadoop that allow you to intercept, inspect, and handle errors that occur when reading and writing JSON data.
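As a rough illustration of how the handler chains might be switched on, here is a minimal sketch that builds the settings for the built-in `log` handler on the bulk, write-serialization, and read-serialization chains. The property prefixes are recalled from the ES-Hadoop reference documentation and should be checked against your version; the helper `log_handler_settings` and the logger name `EsHadoopErrors` are hypothetical and exist only for this example.

```python
def log_handler_settings(logger_prefix):
    """Build ES-Hadoop settings that send each class of error to the 'log' handler."""
    # Assumed setting prefixes for the three handler chains; verify them
    # against the ES-Hadoop documentation for the version you run.
    chains = {
        "bulk": "es.write.rest.error",   # bulk request failures
        "write": "es.write.data.error",  # serialization errors when writing
        "read": "es.read.data.error",    # deserialization errors when reading
    }
    settings = {}
    for label, prefix in chains.items():
        # Comma-separated list of handler names, called in order.
        settings[prefix + ".handlers"] = "log"
        # The log handler needs the name of a logger to write to.
        settings[prefix + ".handler.log.logger.name"] = f"{logger_prefix}.{label}"
    return settings


settings = log_handler_settings("EsHadoopErrors")
for key in sorted(settings):
    print(f"{key} = {settings[key]}")
```

The resulting dictionary can be passed to whichever integration you use, for example as job configuration entries or as Spark write options.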

Elasticsearch Generic Error Handler

In keeping with the theme of boosting our ability to handle errors, 6.4.0 includes a brand new built-in error handler implementation: the Elasticsearch Handler. What better way to collect application errors than to send them off to an Elasticsearch index for searching or reporting? The Elasticsearch Handler uses all the same configurations that already exist in ES-Hadoop, picking up defaults where it makes sense, while allowing you to configure it to send your error information exactly where you need it. The errors that are sent to Elasticsearch follow the Elastic Common Schema (ECS), which is designed to make them as easy as possible to read with your own eyes or to integrate with other ECS-enabled tools and technologies.
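To make the configuration idea concrete, the sketch below assembles settings that route bulk write failures to the Elasticsearch Handler. The property names are assumptions recalled from the ES-Hadoop reference documentation, so confirm them for your version; the function `es_handler_settings`, the `hadoop-errors/doc` resource, the host name, and the tag are all hypothetical values chosen for illustration.

```python
def es_handler_settings(resource, nodes=None, tags=()):
    """Build settings that send bulk write errors to an Elasticsearch index."""
    prefix = "es.write.rest.error.handler.es"  # assumed prefix for the 'es' handler
    settings = {
        # Register the Elasticsearch Handler on the bulk write chain.
        "es.write.rest.error.handlers": "es",
        # Index (and type, in 6.x) that receives the error documents.
        prefix + ".client.resource": resource,
    }
    if nodes:
        # Optional: a different cluster for errors; when omitted, the handler
        # is expected to fall back to the job's existing connection settings.
        settings[prefix + ".client.nodes"] = nodes
    if tags:
        # Optional: tags attached to every error document.
        settings[prefix + ".tags"] = ",".join(tags)
    return settings


handler_conf = es_handler_settings(
    "hadoop-errors/doc",           # hypothetical target resource
    nodes="es-logs.example.com",   # hypothetical error-collection cluster
    tags=("nightly-load",),        # hypothetical tag
)
for key in sorted(handler_conf):
    print(f"{key} = {handler_conf[key]}")
```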

Support for Secure Settings

ES-Hadoop relies on a set of configuration values to operate. These configurations are often served up in plaintext via reporting or status tools in the Hadoop ecosystem to any users who request them. Understandably, it feels pretty strange to specify sensitive configuration values, like a user’s password, to the connector in this way. We care deeply about our users’ security and privacy needs. Starting in 6.4.0, you can package up the most sensitive settings and provide them via a keystore format. Take a look through our documentation on how to use this new route for providing secure settings.
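One way to picture the split between plaintext and secure settings is the sketch below, which separates password-style settings from the rest of a job configuration and points the job at a keystore instead. It assumes the keystore is referenced through an `es.keystore.location` setting, as recalled from the ES-Hadoop documentation; the `SENSITIVE` list, the helper `split_settings`, the sample values, and the keystore path are hypothetical. Creating the keystore itself is done with the tooling described in the documentation and is not shown here.

```python
# Settings treated as sensitive in this sketch (an illustrative subset).
SENSITIVE = {
    "es.net.http.auth.pass",
    "es.net.ssl.keystore.pass",
    "es.net.ssl.truststore.pass",
}


def split_settings(settings, keystore_location):
    """Return (plaintext settings, names of settings to store in the keystore)."""
    public = {k: v for k, v in settings.items() if k not in SENSITIVE}
    secret_names = sorted(k for k in settings if k in SENSITIVE)
    if secret_names:
        # Assumed setting name that tells the connector where the keystore lives.
        public["es.keystore.location"] = keystore_location
    return public, secret_names


job_settings = {
    "es.nodes": "es.example.com",        # hypothetical host
    "es.net.http.auth.user": "loader",   # hypothetical user
    "es.net.http.auth.pass": "changeme", # must not travel in plaintext
}
public, secret_names = split_settings(job_settings, "file:///path/to/esh.keystore")
print(public)
print(secret_names)
```

Only the `public` dictionary would go into the job configuration; the names in `secret_names` are the entries to add to the keystore.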

As always, we love hearing your feedback and suggestions. If you have a great idea for a new feature or enhancement, or if you have any questions, stop by our forums or submit an issue!