Elasticsearch for Apache Hadoop 2.2.0
See issues on GitHub
Release Notes
Enhancements
- ES Hadoop does not retry on HTTP 429 #655
- No data node with id found #563
- Allow mixed case indices #684
- Can only handle one dynamic portion of an index pattern for lowercase validation #679
- Can EsSpark.saveToEs report which docs fail to save to ES #675
- Script settings #673 (issue: #16197)
- NumberFormatException when reading from ES for no noticeable reason #663
- Generalize field include/exclude array #649
- Ignore _source field not mapped #648
- SparkSQL schema not inferred correctly when using Elasticsearch "nested" type #616
- Nested field extraction #605
- Which column causes issue in MapperParsingException #395
Bug fixes
- When "es.mapping.date.rich" is set to false, DataFrame schema does not change to String or Long #672
- Configuration Options - field include and field include as array #671
- ArrayWritable cannot be serialized #668
- Position for 'field' not found in row; typically this is caused by a mapping inconsistency #664
- Fails to shutdown elasticsearch on YARN #658
- RestService: missing index log message displays wrong setting #656
- Hostname cannot be resolved if the uri schema is specified #652
- Multiple errors using Spark SQL with elasticsearch-spark_2.11 with version 2.2.0-m1 #644
- Multi-index DataFrame causes OOM #634
- scala.MatchError for list type #617
- EsHadoopIllegalStateException reading Geo-Shape into DataFrame - SparkSQL #607
- es.mapping.exclude doesn't work for Hive #595
- es.read.field.exclude behaves strangely with arrays #590
Docs
- ElasticSearch+Spark in Java: error: package org.elasticsearch.spark.java.api does not exist #678
- Fixed grammar #666
- Fix typos #629
- Custom DateTimeFormat "yyyy-MM-dd HH:mm:ss" in the mapping cannot be parsed #624
- Update spark.adoc #612
Reports
- Error is thrown when multiple instances of the same es-hadoop library are deployed #685
- SQL query never gets translated to ES search query with pushdown enabled #681
- ElasticSearch Bailing Out #680
- Somehow elasticsearch-spark_2.10 depends on 2.11 version of scala-library #674
- An error occurred while calling z:org.apache.spark.api.python.PythonRDD.newAPIHadoopRDD #670
- Unable to pass DataFrame or RDD to Elasticsearch using Spark #665
- Compressed snapshot for backing up #662
- Spark-ES Schema problem #661
- No data node found when publish hosts are private IPs #657
- Caused by: java.io.IOException: org.elasticsearch.hadoop.rest.EsHadoopNoNodesLeftException: Connection error (check network and/or proxy settings)- all nodes failed; tried [[10.XXX.XXX.XX:9200]] #654
- ClassNotFoundException EsPartition on spark_2.10-2.2.0-rc1 #653
- Allow one to specify the array depth #650
- Allow 'prefixed' Elasticsearch installs #642
- es-hadoop Hive date mapped to timestamp field error #639
- org.elasticsearch.hadoop.rest.EsHadoopInvalidRequest: [GET] on [_nodes/http] failed; server[hostname/XXX.XXX.XXX.XXX:Ports] returned [400|Bad Request:] #638
- Spark ElasticsearchIllegalArgumentException[No data node with id[...] found] #637
- ES hadoop problem finding the correct cluster nodes #636
- ES Hadoop with reverse proxy #633
- Hadoop-Spark2Elasticsearch data ingestion problem: Elasticsearch index docs count is greater than Hive table rows count #628
- ES 2.0 SSL problem #608
- Dynamic index writer doesn't work with DataFrames (Spark SQL) #593