Elasticsearch for Apache Hadoop 2.0.3


See issues on GitHub

Release Notes

Bug fixes

  • JSON value extraction fails with mixed nested objects #455
  • Enforce time zone for index formatting when none is specified #435
  • Failed unit test DateIndexFormatterTest #433
  • HeartBeat vs. mapreduce.task.timeout doesn't consider "0 == infinite" case #426
  • Elasticsearch and Hive integration on YARN #393
  • Duplicate documents returned on alias scan #363
  • Dynamic es.resource.write fails to find nested field #362
  • Elasticsearch Hive integration issues #359
  • When using nested objects with MR the returned array does not use the correct type #342
  • java.util.Date cannot be cast to org.apache.hadoop.io.Writable #340
  • java.lang.UnsupportedOperationException caused by org.elasticsearch.hadoop.mr.EsInputFormat ? #338
  • Group by error #331
  • Fix writable serialization of date type with long values #320
  • JSON serialization error #311
  • Missing commons-cli dependency? #288
  • Document Count in ES Different from Number of Entries Pushed #283
  • Support external versioning of documents #343
  • Can one increase number of partitions and hence spark nodes used? #339
  • Fix nested type serialization #327
  • java.io.NotSerializableException: org.apache.spark.SparkContext #298
  • TaskAttemptId string is not properly formed #346
  • Spark SQL can't INSERT into an ES table #330
  • Hive Column Comments causing Hive Query to fail #322

Docs

  • Incorrect parameter names es.update.params and es.update.params.json in config examples #430
  • Update SOCKS proxy configuration in configuration.adoc #419
  • Update Spark doc for new API InputFormat #401 (issue: #390)
  • Correcting Scala Code #389
  • Better document the date formatting feature for dynamic writing #360
  • Update index.adoc #318
  • It is not clear if the plugin must be installed on every node #306
  • Documentation on using raw JSON in Pig #299
  • Improve Pig file size to increase parallelism according to shard size #294