Elasticsearch for Apache Hadoop 5.0.0-alpha5
I am excited to announce the release of Elasticsearch for Apache Hadoop (aka ES-Hadoop) 5.0.0-alpha5, built against Elasticsearch 5.0.0-alpha5.
IMPORTANT: This is an alpha release and is intended for testing purposes only. Crazy things might happen when running this code and indices created with this version will certainly not be compatible with Elasticsearch 5.0.0 GA. For the sake of your own sanity, we do not advise using this version in production.
Spark 2.0 !!!
It’s here! It’s here! It’s finally here! It’s exciting enough to warrant three exclamation points in the header! This version has added preliminary support for Spark 2.0! Give it a try and let us know what needs improving! Every sentence in this section has an exclamation point at the end of it! Hurray!
(Hadoop/Spark) + Slice API = More Parallel
This release includes a substantial change to support Elasticsearch’s new Scroll Slicing functionality. You can now state the maximum number of documents you wish to see per input task, and the framework will attempt to subdivide input splits to increase your computing parallelism. Isn’t sharing beautiful?
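To make the idea a bit more concrete, here is a rough sketch of the arithmetic behind subdividing one shard's input split. It is not ES-Hadoop's actual implementation: the helper names `slices_per_shard` and `slice_requests` are made up for illustration, and the ceiling-division rule is an assumption about how a per-task document cap could translate into a slice count. The only part taken from Elasticsearch itself is the shape of the sliced scroll request body (`slice.id` and `slice.max`).

```python
def slices_per_shard(shard_doc_count, max_docs_per_partition):
    """Hypothetical helper: how many slices keep each task at or under the cap."""
    if max_docs_per_partition <= 0:
        raise ValueError("max_docs_per_partition must be positive")
    # Ceiling division without floats; an empty shard still gets one task.
    needed = (shard_doc_count + max_docs_per_partition - 1) // max_docs_per_partition
    return max(1, needed)


def slice_requests(shard_doc_count, max_docs_per_partition, query=None):
    """Hypothetical helper: build one scroll request body per input task."""
    query = query or {"match_all": {}}
    total = slices_per_shard(shard_doc_count, max_docs_per_partition)
    if total == 1:
        # Sliced scroll needs max > 1, so a single task uses a plain scroll.
        return [{"query": query}]
    return [
        {"slice": {"id": slice_id, "max": total}, "query": query}
        for slice_id in range(total)
    ]


if __name__ == "__main__":
    # A 250,000-document shard capped at 100,000 documents per task.
    for body in slice_requests(250_000, 100_000):
        print(body)
```

Each body would be sent as its own scroll request, so every slice can be read by a separate task. Slices are not guaranteed to be perfectly even, which is why the cap is best read as a target rather than a hard limit. The connector setting that carries the cap should be `es.input.max.docs.per.partition`, but check the configuration docs for your version before relying on that name.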
Have sub-fields in your mapping named “properties”? Fixed. Don’t like DataFrames saving null values? Fixed. Tired of bulk indexing requests that don’t tell you why they failed? Double fixed. Take a look at all of the items that have been spruced up in this release!
Now you might be wondering, “Why would I want to try an Alpha Release? Aren’t these things normally riddled with bugs?” Well, yeah, sometimes. We’re tracking a few things that we already know we’ve broken (like update scripts in 1.x), but we’re only human. That’s why we need help from all of you awesome early adopters!
So, please, DO try this at home! You can download ES-Hadoop 5.0.0-alpha5, try it out, find out how it breaks, and let us know what you did on Twitter, GitHub, or in the forum. A crisp high five is waiting for all who participate! Not a huge fan of high fives? There’s always the Elasticsearch Pioneer Program instead!