This Week in Elasticsearch and Apache Lucene - July 22 2015
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Elasticsearch Core
- Build: Deploy artifacts to S3 as well as sonatype when running a release (#12270, 2.0.0)
- Zen Discovery Testing: restrict the test unicast zen discovery to the port range of the JVM (#12271, 2.0.0)
- Testing: Do not assert that all shards were successful in tests (#12284, 2.0.0)
- Dependencies: Upgrade groovy from 2.4.0 to 2.4.4 (#12288, 2.0.0, 1.7.0, 1.6.1)
- Networking: Transport Tracer should exclude cluster:monitor/nodes/liveness by default (#12291, 2.0.0)
- Query DSL: QueryString ignores maxDeterminizedStates when creating a WildcardQuery (#12269, 2.0.0, 1.7.1)
- Aggregations: Add better validation of moving_avg model settings (#12280, 2.0.0)
- Serialization: Fix serialization of IndexFormatTooNewException and IndexFormatTooOldException (#12277, 2.0.0)
- Scripting: Consistently name Groovy scripts with the same content (#12296, 2.0.0, 1.7.1)
- Testing: Use junit4 for running integration tests (#12302, 2.0.0)
- Testing: Fix REPRODUCE WITH for integration tests (#12303, 2.0.0)
- Dependencies: Update Netty to version 3.10.3.Final (#12310, 2.0.0)
- Docs: Correct default date format (#12306, 2.0.0)
- Dependencies: Remove securemock from repository (#12313, 2.0.0)
- Build: Remove broken exec build target, replace with something better (#12307, 2.0.0)
- Allocation: Use recently added allocation ids for shard started/failed (#12299, 2.0.0)
- ThreadPools: schedule a timeout check after adding command to queue (#12319, 2.0.0, 1.7.1)
- Cat API: Add scroll stats to cat API (#12331, 2.0.0)
- Allocation: Initial Refactor Gateway Allocator (#12335, 2.0.0)
- Refactoring: Simplify handling of ignored unassigned shards (#12339, 2.0.0)
- Stats: Use time with nanosecond resolution calculated at the executing node … (#12346, 2.0.0)
- Docs: Update time_zone specification (#12357, 2.0.0)
- Mapping: Remove ability to configure _index (#12356, 2.0.0)
- Mapping: Remove index name from mapping parser (#12352, 2.0.0)
- Build: Add a release profile to the parent pom.xml (#12227, 2.0.0)
- Test: Use enforcer:display-info to print version info (#12243, 2.0.0)
- Exceptions: Render structured exceptions in mget/mpercolate (#12240, 2.0.0)
- Snapshot/Restore: Url repository should respect repo.path for file urls (#11687, 2.0.0, 1.6.1)
- Exceptions: Add index name to the upgrade exception (#12213, 2.0.0)
- Testing: Enable system assertions (#12248, 2.0.0)
- Refactoring: Integration tests (#12252, 2.0.0)
- Discovery: Wait on incoming joins before electing local node as master (#12161, 2.0.0)
- Allow shards to be allocated if leftover shard from different index exists (#12237, 2.0.0)
- Core: Carry over shard exception failure to master node (#12263, 2.0.0)
- Shard: Unique allocation id (#12242, 2.0.0)
- Aggregations: Add a terms aggregator based on ordinals that does not need global ordinals. (#12082, 2.0.0)
- Snapshot/Restore: Add checksum to snapshot metadata files (#12002, 2.0.0)
- Search: Add _replica and _replica_first as search preference. (#12244, 2.0.0)
- Query DSL: RegexpQueryParser takes a String as value like its Builder (#12200, 2.0.0)
- Query DSL: PrefixQueryParser takes a String as value like its Builder (#12204, 2.0.0)
- Geo: Deprecate validate_* and normalize_* (#10248, 2.0.0, 1.7.1)
- Stats: Add an API to locate unrecovered shards and their state (#11545, 2.0.0)
- Search: Add global search timeout setting (#12211, 2.0.0)
- Geo: Update ShapeBuilder and GeoPolygonQueryParser to accept non-closed GeoJSON (#11161, 2.0.0)
- Parent/Child: Enforce _parent field resolution to be strict (#9521, 2.0.0)
Apache Lucene
- SortingMergePolicy now overrides MergePolicy.size
- Cutover the spatial module to use the new, faster DocIdSetBuilder
- BlendedInfixSuggester should preserve suggestions that differ only in their payloads
- QueryParserBase was failing to pass its maxDeterminizedStates when it created WildcardQuery
- Small cleanup to IndexWriter's plumbing, removing ThreadState's isActive boolean
- IndexWriter now requires that a codec not claim it created a file when in fact it did not
- Unexpected exceptions during merges are now considered tragic, causing IndexWriter to preemptively close in self defense
- Geo3D is now its own module, but Maven and IntelliJ needed help understanding that
- Spatial3d javadocs now pass our javadocs linter
- Fixed a tricky concurrency bug while IndexWriter is aborting, caused by removing IndexWriterConfig.set/getMaxThreadStates
- MockFileSystemTestCase.testURI couldn't handle filesystems that don't allow non-ASCII characters
- IndexUpgrader fails to upgrade the index if there are 0 segments
- IndexInput.clone's behavior after the original is closed is confusing
- Add GeoPointDistanceQuery and dateline crossing support to GeoPointInBBoxQuery
- A fix for one of the IBM J9 bugs affecting Lucene is coming
- Lucene's explain method sometimes claimed a document matched when it did not
- Symlinks in the filesystem while running Lucene's tests make the security manager angry
- StandardTokenizer inefficiently copies its token buffer when nothing changed
- OpenJDK 1.9.0 early-access b72 has a number of bugs affecting Lucene, yet we must open our own issue since we cannot comment on OpenJDK issues ourselves
- Add K nearest neighbor and simple naive bayes document classifiers to Lucene's classifier module
- MappingCharFilter produces broken offsets, but fixing the bug is tricky and contentious
- KNearestNeighborClassifier should also use the class ranking
- How can a given query opt out of caching?
Watch This Space
Stay tuned to this blog, where we'll share more on the whole ELK ecosystem, including news, learning resources and cool use cases!