What's the Scoop on ES-Hadoop? Spark, Streaming & More
Elasticsearch is an industry-leading solution for search and real-time analytics at scale. Apache Spark has grown into a powerhouse for processing massive data sets, in both batch and streaming contexts. Elasticsearch for Apache Hadoop (ES-Hadoop) is a two-way connector that provides the tools needed to marry the two in perfect data harmony.
This talk aims to introduce the audience to the basics of ES-Hadoop’s native Spark integration, touch upon the other features that the connector brings to the table (including native integrations with Hive, Storm, Pig, Cascading, and MapReduce), shed some light on the internals of how it works, and highlight what’s to come.
James Baiera is a software engineer at Elastic focusing on Elasticsearch for Apache Hadoop. Born and raised in Cleveland, Ohio, he made his debut in the tech industry building massively parallel analytics engines. When he’s not working, he doesn’t know what to do with himself.
Anoop Sunke is a data nerd, coffee lover, and Solutions Architect at Elastic. As an SA, his primary job is to help customers solve tough problems like distributed search, analytics, metrics, and exploding unicorns using the Elastic Stack. Prior to Elastic, Anoop worked at Hortonworks building large-scale Hadoop architectures and at Microsoft implementing SQL Server solutions.