This Week in Elasticsearch and Apache Lucene - September 1, 2015
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
Elasticsearch 2.0.0-beta1 released, with over 2,500 PRs from 469 contributors, get it while its hot! https://t.co/flVdtjPvYW
— elastic (@elastic) August 26, 2015
Elasticsearch Core
- Internal: Remove usage of Tuple as a method parameter (#13135, 2.1.0, 2.0.0)
- Core: Call beforeIndexShardCreated listener earlier when creating shards (#13153, 2.1.0, 2.0.0)
- Test: Use proper comparison operator ttl test (#13165, 2.1.0)
- Core: bumped version to 2.0.0-beta2-SNAPSHOT (#13166, 2.0.0)
- Snapshot/Restore: Add readonly option for repositories (#13144, 2.1.0)
- Internal: Remove and forbid use of com.google.common.collect.Lists (#13170, 2.1.0)
- Settings: Fix discovery.zen.join_timeout default value logic (#13162, 2.1.0, 2.0.0)
- Core: Add unit test for ShardPath.selectNewPathForShard (#13158, 2.1.0, 2.0.0)
- Allocation: Expand ClusterInfo to provide min / max disk usage for allocation decider (#13163, 2.1.0, 2.0.0)
- CAT API: Default verbose to false (#13180, 2.1.0, 2.0.0)
- Build: Update Apache Maven Enforcer plugin to 1.4.1 (#13181, 2.1.0, 2.0.0)
- Plugins: Installing plugin without checksums ends up downloading from github (#13197, 2.1.0, 2.0.0)
- Mapping: Fix doc parser to still pre/post process metadata fields on disabled type (#13137, 2.1.0, 2.0.0)
- Allocation: Take Shard data path into account in DiskThresholdDecider (#13195, 2.1.0, 2.0.0)
- Tests: Ensure binding on localhost host is consistently ipv4/v6 (#13208, 2.1.0, 2.0.0)
- Network: Convert upgrade action to broadcast by node (#13205, 2.1.0, 2.0.0)
- Internal: Add default impl for resolveIndex() (#13218, 2.1.0, 2.0.0)
- Packaging: Lock vagrant to virtualbox (#13221, 2.1.0, 2.0.0)
- Docs: Update list of available os stats (#13198, 2.1.0, 2.0.0)
- Test: Provide the plugins to transport client communicating with the external cluster (#13222, 2.1.0)
- Packaging: Backport remaining packaging tests work (#13223, 2.0.0)
- Tests: Allow tests to override whether mock modules are used (#13215, 2.1.0)
- Internal: Remove and forbid use of com.google.common.collect.ImmutableList (#13227, 2.1.0)
- Mapping: Fix document parsing to properly ignore entire type when disabled (#13085, 2.0.0)
- Packaging: More portable extraction of short hostname (#13109, 2.1.0, 2.0.0)
- Mapping: Default position_offset_gap to 100 (#12544, 2.1.0, 2.0.0)
- Mapping: Move position_offset_gap default change (#13111, 2.1.0)
- Testing: check with jps that the pid file contains a pid that actually is an elasticsearch process (#12961, 2.1.0, 2.0.0)
- Aggregations: GeoDistance Aggregation now prints field name when it finds an unexpected token (#13033, 2.1.0, 2.0.0)
- Parent/Child: Remove unnecessary usage of extra index searchers (#12864, 2.1.0)
- Internal: Removed the operation_threaded option (#13119, 2.1.0, 2.0.0)
- Mapping: Rename position_offset_gap to position_increment_gap (#13056, 2.1.0, 2.0.0)
- Docs: Add migration guide notes for multicast moving to a plugin (#13091, 2.0.0)
- Stats: Expose shards data and state path via ShardStats (#13118, 2.1.0, 2.0.0)
- Warmers: Warmers delete _all should not throw exception when no warmers (#13058, 2.1.0, 2.0.0)
- Query DSL: deprecate _name and boost in short variants of queries (#12966, 2.1.0, 2.0.0)
- Network: Add mechanism for transporting shard-level actions by node (#12944, 2.1.0, 2.0.0)
- Plugins: Removed plugin.types (#13055, 2.1.0)
- Plugins: Replace HTTP urls with HTTPS in PluginManager (#12824, 2.1.0, 2.0.0)
- Build: cmd /C needs to be quoted as a whole when starting integration tests (#12910, 2.1.0, 2.0.0)
- Stats: Sort thread pools by name in NodesStats (#13121, 2.1.0)
- Internal: Turn DestructiveOperations into a Guice module (#13046, 2.1.0)
- Update API: Default detect_noop to true (#11306, 2.1.0)
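Several of the mapping changes above concern the position gap inserted between multiple values of the same string field. As a rough illustration of why a default of 100 matters, here is a simplified Python model of how token positions could be assigned across values. It is a sketch, not the actual Lucene indexing code: the function names `assign_positions` and `exact_phrase_matches` are hypothetical, tokenization is assumed to be plain whitespace splitting, and the sample names are made up for the example.

```python
def assign_positions(values, gap):
    """Assign a position to every token across several values of one field.

    Simplified model (an assumption, not Lucene's real code): each token
    advances the position by 1, and `gap` extra positions are added
    between consecutive values.
    """
    positions = []
    pos = -1
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # the gap inserted between two values
        for token in value.split():
            pos += 1
            positions.append((token, pos))
    return positions


def exact_phrase_matches(positions, first, second):
    """Return True if `second` occurs at the position right after `first`."""
    first_positions = {p for t, p in positions if t == first}
    second_positions = {p for t, p in positions if t == second}
    return any(p + 1 in second_positions for p in first_positions)


names = ["John Abraham", "Lincoln Smith"]

no_gap = assign_positions(names, gap=0)
with_gap = assign_positions(names, gap=100)

# With no gap, "Abraham Lincoln" looks like an adjacent phrase even though
# the two words come from different values.
print(exact_phrase_matches(no_gap, "Abraham", "Lincoln"))
# With a gap of 100 the two values are far apart, so the phrase no longer matches.
print(exact_phrase_matches(with_gap, "Abraham", "Lincoln"))
```

In this model a gap of 0 lets a phrase query match across the boundary between two values, while a gap of 100 keeps the values apart unless the query allows a very large slop.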
Apache Lucene
- Lucene 5.3.0 is released!
- Java 1.9 uses Unicode 7.0, which changed which characters are defined as whitespace
- FingerprintFilter emits a single token which is a sorted, de-duplicated set of all of its input tokens, to normalize text for use cases such as clustering
- Fix more private access level javadocs
- Ivy has an improved option for file-locking, to sidestep the "leftover lock" problem you can hit when building Lucene/Solr in two different directories
- Make Math.random forbidden
- Remove dead code from one of Lucene's grouping implementations
- Can/should we compress postings payloads?
- Test failures are creeping up on us
- Fix CheckIndex to gracefully handle corrupt or missing *.si files
- SimpleNaiveBayesClassifier seems to be modifying a Query after it's cached, leading to ConcurrentModificationException
- More fast iterations to integrate BKDTree and Geo3D to provide accurate and fast earth-surface "point in shape" queries
- Index time sorting should be simpler to use
- CheckIndex can no longer handle an empty directory?
- Don't use approximations with MatchAllDocsQuery
- Java 1.9 bugs with date parsing affect Tika and Lucene
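The FingerprintFilter item above describes collapsing a token stream into one sorted, de-duplicated token. The idea can be sketched in a few lines of Python. This is only an illustration of the concept, not Lucene's implementation: the function name `fingerprint` is hypothetical, and joining the tokens with a single space is an assumption made for the example.

```python
def fingerprint(tokens, separator=" "):
    """Collapse a list of tokens into one normalized token.

    Duplicates are dropped and the remaining tokens are sorted, so two
    texts with the same words in a different order yield the same result.
    The space separator is an assumption for this sketch.
    """
    return separator.join(sorted(set(tokens)))


a = fingerprint("the quick brown fox the".split())
b = fingerprint("fox brown the quick".split())

print(a)
print(a == b)  # same words, different order and repetition
```

Because word order and repetition no longer matter, texts that use the same vocabulary map to the same fingerprint, which is what makes the output useful as a grouping key for clustering.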
Watch This Space
Stay tuned to this blog, where we'll share more news on the whole ELK ecosystem including news, learning resources and cool use cases!