Elasticsearch for Apache Hadoop 2.1.0.Beta4

Release Notes

Breaking changes:

Upgrade integration with Spark 1.3.0 #400

New features:

Introduce configuration for returning `Date` objects as primitives #422
Create Scala 2.11 artifact in addition to Scala 2.10 #376
Allow client-only routing #375
[Spark] Is there a way to make elasticsearch-hadoop stick to the load balancer (client) instead of trying to ping all the data nodes? #373
Support serialization of case classes and java beans #365
Allow per-doc metadata to be specified at runtime #358
Add support for the newly introduced `sources` in Spark 1.2 #350
elasticsearch-hive0.14 issue #333
Feature: setting IDs in Spark without using a Map #255

Enhancements:

Cannot set es-resource properties from the command-line #434
Add warning when running Spark 1.2 jar against Spark 1.3 and vice versa #415
Add warnings when invoking saveToEs on SchemaRDD but not using the sql package #414
Upgrade to Hive 1.1 #413
Exception in SparkSQL when es.read.metadata=true #408
Keep date formatting behavior aligned with ES API for spark serialization #397
Add ability to exclude fields from the ES mapping via the 'es.mapping.exclude' setting in ScalaValueWriter and SchemaRDDValueWriter #391
Saving case class with mapping id error #384
Precise message for ValueWriter #370
Hadoop tries to connect to a non-data Elasticsearch node and fails #368
Upgrade to Spark 1.2 #347
TaskAttemptId string is not properly formed #346
Spark SQL can't INSERT into an ES table #330
Hive Column Comments causing Hive Query to fail #322
Allow the source document to be returned as is #232
Allow fields in a doc to be excluded/included #230
Exclude master/client nodes from data requests #214
org.apache.calcite#calcite-core;0.9.2-incubating-SNAPSHOT: not found #337
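
Several of the entries above concern `es.*` settings (#434, #408, #391). As a minimal sketch, the snippet below renders such settings as `spark-submit` arguments. It assumes the `spark.` prefix convention described in the elasticsearch-hadoop Spark documentation; whether this release already honors it is not stated in these notes. The helper name `to_spark_submit_args`, the index/type `logs/event`, and the field names are hypothetical.

```python
def to_spark_submit_args(settings):
    """Render es.* settings as spark-submit --conf arguments.

    Assumption: Spark only forwards properties that start with "spark.",
    so each es.* key is emitted as spark.es.*.
    """
    args = []
    for key, value in sorted(settings.items()):
        if not key.startswith("es."):
            raise ValueError(f"not an es.* setting: {key}")
        args += ["--conf", f"spark.{key}={value}"]
    return args


# Illustrative values only; the setting names come from the notes above.
es_settings = {
    "es.resource": "logs/event",           # hypothetical index/type
    "es.mapping.exclude": "debug,secret",  # hypothetical field names
    "es.read.metadata": "true",
}

spark_args = to_spark_submit_args(es_settings)
print(" ".join(spark_args))
```

The resulting list can be appended to a `spark-submit` invocation; keys are sorted only to keep the output stable.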

Bug fixes:

Enforce time zone for index formatting when none is specified #435
Failed unit test DateIndexFormatterTest #433
HeartBeat vs. mapreduce.task.timeout doesn't consider "0 == infinite" case #426
SchemaRDD seems to be lost when loading parquet files #403
[spark] esRDD doesn't respect config setting "es.field.read.empty.as.null" #402
Elasticsearch and Hive integration on YARN #393
Invalid position given exception #386
EsSpark.esJsonRDD error #385
Unable to index JSON from HDFS using SchemaRDD.saveToEs() #382
Excluding fields when writing JSON documents from Spark to Elasticsearch doesn't work #381
[Spark] Case Class example failed after compile #378
Duplicate documents returned on alias scan #363
Dynamic es.resource.write fails to find nested field #362
Elasticsearch Hive integration issues #359
Constant values should not be quoted by default #353
When using nested objects with MR the returned array does not use the correct type #342
java.util.Date cannot be cast to org.apache.hadoop.io.Writable #340
java.lang.UnsupportedOperationException caused by org.elasticsearch.hadoop.mr.EsInputFormat? #338
group by error #331
elasticsearch-spark_2.10-2.1.0.Beta2 exception when joining with parquetFile #323
JSON serialization error #311
Document Count in ES Different from Number of Entries Pushed #283
Spark: UpdateScriptParams: JSON serialization error #351
Cannot Find Node #436
Hive Runtime Error while Writing into ES table #432
saveToEs can't use a predefined mapping #424
Not able to transfer data from Hive to Elasticsearch #417
Not able to insert data into ES using elasticsearch-hadoop-2.1.0.Beta3.jar #416
internal.es.yarn.file default configuration not present in YARN/cfg.properties #411
Hive-Elasticsearch write operation #409
SparkSQL fails inserting data into Elasticsearch index #406
Es-Hadoop ingestion through Pig is missing the mappings #405
How can I be sure of data colocation on my Spark/ES cluster? #383
HashMap[String,String] and Elasticsearch type mapping is not kicking in to map String to Integer #372
Indexing ES using an RDD over an HTTPS connection #371
Serialization Issue from scala.collection.immutable.HashMap$HashTrieMap #369
Support external versioning of documents #343
Can one increase the number of partitions, and hence the Spark nodes used? #339
Fix nested type serialization #327
[2.1.0.Beta2] [ES 1.3.2] [Spark 1.1.0] EsHadoopNoNodesLeftException #303
Issue while joining two hive tables stored on ES #293
Any way to silence 'WARN EsInputFormat: Cannot determine task id...' #427
Bug in Elasticsearch and Spark SQL: Using SQL to query out data from JSON documents is totally wrong! #377

Docs:

Incorrect parameter names es.update.params and es.update.params.json in config examples #430
Mistakes in documentation #421
Update Spark doc for new API InputFormat #401
Fixed include/exclude examples #392
Not able to locate scala.XML while adding 2.1.0.Beta3 version in dependency #374
Fix typo, cleanup paragraph #367
Better document the date formatting feature for dynamic writing #360
Document needs correction #329
Demonstrate Storm's tick feature #312
Improve Pig file size to increase parallelism according to shard size #294
