I am excited to announce the release of Elasticsearch for Apache Hadoop (aka ES-Hadoop) 6.0.0-alpha2 built against Elasticsearch 6.0.0-alpha2.
IMPORTANT: This is an alpha release and is intended for testing purposes only. Crazy things might happen when running this code and indices created with this version will certainly not be compatible with Elasticsearch 6.0.0 GA. For the sake of your own sanity, we do not advise using this version in production.
Alpha Support for Spark Structured Streaming
Spark 2.0 saw the alpha release of Structured Streaming, a new streaming framework that combines the query planning facilities from Spark SQL with the potential for exactly-once processing of stream messages. Now, with ES-Hadoop 6.0.0-alpha2, we are proud to present our own Elasticsearch structured streaming sink for Apache Spark. It comes complete with all the features you've come to love from the connector and is backed by an internal HDFS-based commit log to help ensure correct message delivery. All of this is detailed further in our documentation. Please remember though: Both Structured Streaming in Spark and this sink in ES-Hadoop are alpha features! There could still be bugs lurking out there. We advise against rolling this out to production without careful consideration.
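To give a feel for what wiring up the sink might look like, here is a minimal PySpark sketch. It assumes the ES-Hadoop jar is on the Spark classpath, that the sink is registered under the short format name "es", and that a checkpoint location is required for the commit log. The helper names (es_sink_options, start_es_stream, run_example), the socket source, the "spark/lines" index/type resource, and the /tmp path are all made up for illustration, so please check the documentation for the authoritative settings.

```python
def es_sink_options(nodes, checkpoint_dir, port=9200):
    """Collect the options handed to the Elasticsearch sink (hypothetical helper)."""
    return {
        # Spark keeps streaming progress under this directory
        "checkpointLocation": checkpoint_dir,
        # Standard ES-Hadoop connection settings
        "es.nodes": nodes,
        "es.port": str(port),
    }


def start_es_stream(stream_df, resource, options):
    """Start a streaming query that writes stream_df to Elasticsearch."""
    # "es" as the format name is an assumption, see the note above
    writer = stream_df.writeStream.format("es")
    for key, value in options.items():
        writer = writer.option(key, value)
    # The resource is the target "index/type", e.g. "spark/lines"
    return writer.start(resource)


def run_example():
    """Illustrative only: needs pyspark, the ES-Hadoop jar, and a running cluster."""
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("es-streaming-sketch").getOrCreate()
    # Read lines of text from a local socket as a toy streaming source
    lines = (spark.readStream.format("socket")
             .option("host", "localhost")
             .option("port", 9999)
             .load())
    options = es_sink_options("localhost", "/tmp/es-checkpoint")
    query = start_es_stream(lines, "spark/lines", options)
    query.awaitTermination()
```

Calling run_example() on a machine with Spark and a local Elasticsearch node should start the query. The option building is kept in its own function so the settings can be inspected or adjusted without touching Spark at all.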
This release fixes parsing errors from index auto-creation, backwards compatibility errors with scroll IDs, missing support for timestamps in params, and much more. Take a look at all of the items that have been spruced up in this release!
Now you might be wondering, "Why would I want to try an Alpha Release? Aren't these things normally riddled with bugs?" Well, yeah, sometimes. That's why we need the help from all of you awesome early adopters!
So, please, DO try this at home! You can download ES-Hadoop 6.0.0-alpha2, try it out, find out how it breaks, and let us know what you did on Twitter, GitHub, or in the forum. A crisp high five is waiting for all who participate! Not a huge fan of high fives? There's always the Elasticsearch Pioneer Program instead!