Product release

Elasticsearch for Apache Hadoop 5.0.0-beta1

I am excited to announce the release of Elasticsearch for Apache Hadoop (aka ES-Hadoop) 5.0.0-beta1 built against Elasticsearch 5.0.0-beta1.

IMPORTANT: This is a beta release and is intended for testing purposes only. Beta releases are normally more stable than Alpha releases, but crazy things can still happen when running this code. Indices created with this version are not guaranteed to be compatible with Elasticsearch 5.0.0 GA. For the sake of your own sanity, we do not advise using this version in production. Think of the hamsters.

What's new?

Spark 1.3-1.6 Streaming Support

Spark is pretty fast, but sometimes you need your data even faster. We loved hearing that some of you were using ES-Hadoop with Spark Streaming, but we also felt the same heartache about the limitations that you were running into. We decided to do something about it. ES-Hadoop now natively supports consuming DStreams from Spark Streaming 1.3-1.6! We've also included fixes for the most commonly reported Spark Streaming issue: running out of connection resources during small processing windows. May your TIME_WAITs be few, and your Spark Streaming Jobs live long and prosper.

Ingest Node

We heard about this cool new feature called the Ingest Node that was available in the alpha releases and coming out in Elasticsearch v5.0.0. We thought "Oh man, we ingest stuff, this node ingests stuff. We need to schedule a brunch with it immediately to trade gossip." Starting in ES-Hadoop 5.0.0-beta1 you can now specify an ingest pipeline to send your data to, as well as target only ingest nodes to cut down on unnecessary traffic. We're still waiting to hear back from you about brunch, Ingest Node. Call us!
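To make the two knobs a little more concrete, here is a minimal sketch of the settings involved, assuming the `es.ingest.pipeline` and `es.nodes.ingest.only` configuration keys. The helper name `ingest_write_conf`, the `logs/event` resource, and the `my-pipeline` pipeline name are made up for illustration. The snippet only builds the settings dictionary; how you hand it to your job (Spark conf, Hadoop job conf, and so on) depends on your setup.

```python
def ingest_write_conf(resource, pipeline, nodes="localhost", port="9200",
                      ingest_only=True):
    """Build an ES-Hadoop settings dict that sends writes through an ingest pipeline.

    Hypothetical helper: only the "es.*" keys are ES-Hadoop settings;
    every name and default value here is an assumption for illustration.
    """
    if not pipeline:
        raise ValueError("pipeline name must be non-empty")
    return {
        "es.nodes": nodes,                  # where to find the cluster
        "es.port": port,
        "es.resource": resource,            # target index/type
        "es.ingest.pipeline": pipeline,     # pipeline that pre-processes each document
        # Talk only to ingest nodes to avoid an extra network hop.
        "es.nodes.ingest.only": "true" if ingest_only else "false",
    }


conf = ingest_write_conf("logs/event", "my-pipeline")
for key in sorted(conf):
    print(f"{key}={conf[key]}")
```

The pipeline itself has to exist in Elasticsearch already; the connector just names it on each write.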

Fast Acting Bug Repellant

Computers are hard. We thank our lucky stars every day that our friends in the community are so helpful when it comes to reporting issues. When you open up your copy of ES-Hadoop, you'll find a fresh batch of bug fixes already applied. These fixes cover issues with overwriting data with SparkSQL, memory leaks in the network code, false warnings about version compatibility, and a bunch more. You can check out all those fixes (and their gritty details) here. Cheers to the bug hunters!

Feedback

Anticipation is one of those cool things in life, but we understand that sometimes waiting is a real bummer. That's one of the reasons why we work so hard to make these early access releases for you all. It's like getting to open your birthday presents a few weeks in advance! You're happy, we're happy, and if anything ends up being broken, we have some time to get it fixed before the big special day.

So, please, DO try this at home! You can download ES-Hadoop 5.0.0-beta1, try it out, find out how it breaks, and let us know what you did on Twitter, GitHub, or in the forum. We are forever indebted to our early adopters, so much so that we created the Elasticsearch Pioneer Program! Now that's the sound of sweet, sweet gratitude!