Elasticsearch for Apache Hadoop 2.2.0 beta1


See issues on GitHub

Release Notes

New features

  • Cloud/WAN support #577

Enhancements

  • es hadoop query all type #583
  • Unable to save from Spark (on Mesos) to Elasticsearch #579
  • [SPARK] Support the filter operator of null-safe equality comparison. #572 (issue: #569)
  • GenericRowWithSchema not supported #557
  • Allow esRDD to specify what fields to load #555
  • Upgrade to Spark 1.5 #547
  • elasticsearch spark size option to limit the number of documents returned #546
  • Do not route MapReduce reads and writes through non-data nodes #513 (issue: #512)
  • Cannot restrict DataFrame to certain mapping field #497
  • java.util.NoSuchElementException: None.get on 2.1.0.rc1 when Array of objects in mapping #484
  • Integration with Amazon Elastic Map Reduce #96

Bug fixes

  • Improve handling of secured resources #582
  • es.mapping.id populates wrong value #581
  • InitializationUtils throws an unexpected(?) exception while running against a 1.5.0 ElasticSearch cluster: Unsupported/Unknown Elasticsearch version 1.5.0 #571
  • SparkSQL to ES invalid "IN" query #556
  • What version of elasticsearch-spark or elasticsearch-hadoop should I use for Spark 1.4.1 #551
  • Incomplete source jar #542
  • ArrayIndexOutOfBoundsException on Spark SQL with 2.1.0.rc1 #482
  • Multiple indexes setting for 'es.resource' #289
  • Consistency of search queries while running MR jobs #124
  • Thousands of SearchContextMissingException: No search context found for id #576
  • Spark throwing an exception as "org.elasticsearch.hadoop.rest.EsHadoopTransportException: java.net.BindException" #570
  • Is it possible to write to Amazon Elasticsearch Service using elasticsearch-hadoop? #565
  • org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException #562
  • ES Newbie #561
  • Run time error with Spark 1.5 #558
  • Upgrade dependency on jackson #554
  • read huge data from es speculation #552
  • Spark not able to find ES server on AWS #550
  • Multi search query with spark #549
  • NoSuchMethodError: org.elasticsearch.spark.sql.package$.sparkSchemaRDDFunctions #544
  • [Hive] NULL structure value if key name is also a column name #543
  • ClassCastException when trying to access int field #541
  • Aggregation running very slow #540
  • EsRecordWriter causes Bad Request(400) on geo_point fields in some cases #530
  • Spark 1.3 -- Error in retrieving an array element from Elastic Search #529
  • Hive queries are running very slow with where clause #527
  • Ver. 2.1.0 throws java.util.NoSuchElementException: None.get with nested type #504
  • CLOSE_WAIT with hdfs repository plugin #492
  • Saving field with list of values triggers exception #464
  • es.nodes and es.nodes.discovery question #399
  • hive integration, autocreate index, shouldn't es.mapping.ttl enable timestamp field #345
  • .cache() on an RDD causes RDDBlockId class not found #344

Regression

  • Problem in setting the id---in mapreduce with elasticsearch #548
  • repository-hdfs does not work with elasticsearch 2.0.0-beta1 #545

Docs

  • Missing 2.1.2 release #584
  • SPARK_CLASSPATH is deprecated in Spark 1.0+. #580
  • Fixes for broken links. #578
  • Section on performance #567
  • Page on networking / AWS support #566
  • Fixed typo #493
  • Make the project compilable when it's added as a git submodule #491
  • Update configuration.adoc #489
  • Using the EsSpark.esRDD method to read from Elasticsearch does not honor size parameters in either URI or DSL #469