See issues on GitHub
Breaking changes:
- Upgrade integration with Spark 1.3.0 #400
New features:
- Introduce configuration for returning `Date` objects as primitives #422
- Create Scala 2.11 artifact in addition to Scala 2.10 #376
- Allow client-only routing #375
- [Spark] Is there a way to make elasticsearch-hadoop stick to the load balancer (client) instead of trying to ping all the data nodes? #373
- Support serialization of case classes and java beans #365
- Allow per-doc metadata to be specified at runtime #358
- Add support for the newly introduced `sources` in Spark 1.2 #350
- elasticsearch-hive0.14 issue #333
- Feature: setting IDs in Spark without using a Map #255
Enhancements:
- Cannot set es-resource properties from the command-line #434
- Add warning when running Spark 1.2 jar against Spark 1.3 and vice versa #415
- Add warnings when invoking saveToEs on SchemaRDD but not using the sql package #414
- Upgrade to Hive 1.1 #413
- Exception in SparkSQL when es.read.metadata=true #408
- Keep date formatting behavior aligned with ES API for spark serialization #397
- Added ability to exclude fields from the es mapping via 'es.mapping.exclude' config to ScalaValueWriter and SchemaRDDValueWriter #391
- Saving case class with mapping id error #384
- Precise message for ValueWriter #370
- Hadoop tries to connect to a non-data Elasticsearch node and fails #368
- Upgrade to Spark 1.2 #347
- TaskAttemptId string is not properly formed #346
- Spark SQL can't INSERT into an ES table #330
- Hive Column Comments causing Hive Query to fail #322
- Allow the source document to be returned as is #232
- Allow fields in a doc to be excluded/included #230
- Exclude master/client nodes from data requests #214
- org.apache.calcite#calcite-core;0.9.2-incubating-SNAPSHOT: not found #337
Bug fixes:
- Enforce time zone for index formatting when none is specified #435
- Failed unit test dateindexformattertest #433
- HeartBeat vs. mapreduce.task.timeout doesn't consider "0 == infinite" case #426
- SchemaRDD seems to be lost when loading parquet files #403
- [Spark] esRDD doesn't respect config setting "es.field.read.empty.as.null" #402
- Elasticsearch and Hive integration on YARN #393
- Invalid position given exception #386
- EsSpark.esJsonRDD error #385
- Unable to index JSON from HDFS using SchemaRDD.saveToEs() #382
- Excluding fields when writing JSON documents from Spark to Elasticsearch doesn't work #381
- [Spark] Case Class example failed after compile #378
- Duplicate documents returned on alias scan #363
- Dynamic es.resource.write fails to find nested field #362
- Elasticsearch Hive integration issues #359
- Constant values should not be quoted by default #353
- When using nested objects with MR, the returned array does not use the correct type #342
- java.util.Date cannot be cast to org.apache.hadoop.io.Writable #340
- java.lang.UnsupportedOperationException caused by org.elasticsearch.hadoop.mr.EsInputFormat? #338
- group by error #331
- elasticsearch-spark_2.10-2.1.0.Beta2 exception when joining with parquetFile #323
- JSON serialization error #311
- Document Count in ES Different from Number of Entries Pushed #283
- Spark: UpdateScriptParams: JSON serialization error #351
- Cannot Find Node #436
- Hive Runtime Error while Writing into ES table #432
- saveToEs can't use predefined mapping #424
- Not able to transfer data from Hive to Elasticsearch #417
- Not able to insert data into ES using elasticsearch-hadoop-2.1.0.Beta3.jar #416
- internal.es.yarn.file default configuration not present in YARN/cfg.properties #411
- Hive-Elasticsearch Write Operation #409
- SparkSQL fails inserting data into Elasticsearch index #406
- Es-Hadoop ingestion through Pig is missing the mappings #405
- How can I be sure of data colocation on my Spark/ES cluster? #383
- HashMap[String,String] and Elasticsearch type mapping is not kicking in to map String to Integer #372
- Indexing ES using RDD over HTTPS connection #371
- Serialization Issue from scala.collection.immutable.HashMap$HashTrieMap #369
- Support external versioning of documents #343
- Can one increase number of partitions and hence spark nodes used? #339
- Fix nested type serialization #327
- [2.1.0.Beta2] [ES 1.3.2] [Spark 1.1.0] EsHadoopNoNodesLeftException #303
- Issue while joining two hive tables stored on ES #293
- Any way to silence 'WARN EsInputFormat: Cannot determine task id...'? #427
- Bug in Elasticsearch and Spark SQL: Using SQL to query out data from JSON documents is totally wrong! #377
Docs:
- Incorrect parameter names es.update.params and es.update.params.json in config examples #430
- Mistakes in documentation #421
- Update Spark doc for new API InputFormat #401
- Fixed include/exclude examples #392
- Not able to locate scala.XML while adding 2.1.0.Beta3 version in dependency #374
- Fix typo, cleanup paragraph #367
- Better document the date formatting feature for dynamic writing #360
- Document needs correction #329
- Demonstrate Storm's tick feature #312
- Improve Pig file size to increase parallelism according to shard size #294