Elasticsearch for Apache Hadoop 2.2.0


See issues on GitHub

Release Notes

Enhancements

  • ES Hadoop does not retry on HTTP 429 #655
  • No data node with id found #563
  • Allow mixed case indices #684
  • can only handle one dynamic portion of an index pattern for lowercase validation #679
  • Can EsSpark.saveToEs report which docs fail to save to ES #675
  • Script settings #673 (issue: #16197)
  • NumberFormatException when reading from ES for no noticeable reason #663
  • Generalize field include/exclude array #649
  • Ignore _source field not mapped #648
  • SparkSQL schema not inferred correctly when using Elasticsearch "nested" type #616
  • nested field extraction #605
  • Which column causes issue in MapperParsingException #395

Bug fixes

  • When "es.mapping.date.rich" is set to false, DataFrame schema does not change to String or Long #672
  • Configuration Options - field include and field include as array #671
  • ArrayWritable cannot be serialized #668
  • Position for 'field' not found in row; typically this is caused by a mapping inconsistency #664
  • Fails to shutdown elasticsearch on YARN #658
  • RestService: missing index log message displays wrong setting #656
  • Hostname cannot be resolved if the uri schema is specified #652
  • Multiple errors using Spark SQL with elasticsearch-spark_2.11 with version 2.2.0-m1 #644
  • multi index data frame causes OOM #634
  • scala.MatchError for list type #617
  • EsHadoopIllegalStateException reading Geo-Shape into DataFrame - SparkSQL #607
  • es.mapping.exclude doesn't work for Hive #595
  • es.read.field.exclude behaves strangely with arrays #590

Docs

  • ElasticSearch+Spark in Java: error: package org.elasticsearch.spark.java.api does not exist #678
  • fixed grammar #666
  • Fix typos #629
  • Custom DateTimeFormat "yyyy-MM-dd HH:mm:ss" in the mapping cannot be parsed #624
  • Update spark.adoc #612

Reports

  • Error is thrown when multiple instances of the same es-hadoop library are deployed #685
  • SQL query never gets translated to ES search query with pushdown enabled #681
  • ElasticSearch Bailing Out #680
  • Somehow elasticsearch-spark_2.10 depends on 2.11 version of scala-library #674
  • An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD #670
  • unable to pass DataFrame or RDD to elasticsearch using Spark !! #665
  • Compressed snapshot for backing up #662
  • Spark-ES Schema problem #661
  • No data node found when publish hosts are private IPs #657
  • Caused by: java.io.IOException: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[10.XXX.XXX.XX:9200]] #654
  • ClassNotFoundException EsPartition on spark_2.10-2.2.0-rc1 #653
  • Allow one to specify the array depth #650
  • Allow 'prefixed' Elasticsearch installs #642
  • es-hadoop hive date map to timestamp field error #639
  • org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: [GET] on [_nodes/http] failed; server[hostname/XXX.XXX.XXX.XXX:Ports] returned [400|Bad Request:] #638
  • Spark ElasticsearchIllegalArgumentException[No data node with id[...] found] #637
  • ES hadoop problem finding the correct cluster nodes #636
  • ES Hadoop with reverse proxy #633
  • Hadoop-Spark2Elasticsearch data ingestion problem: Elasticsearch index docs count is greater than Hive table rows count #628
  • ES 2.0 SSL problem #608
  • Dynamic index writer doesn't work with dataframes (spark sql) #593