This Week in Elasticsearch and Apache Lucene - July 29 2015
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
Elasticsearch 1.7.1 and 1.6.2 released with fixes for a rare but important bug https://t.co/VkTU9m1gyP
— elastic (@elastic) July 29, 2015
Elasticsearch Core
- Fielddata: Consult field info before fetching field data (#12403, 2.0.0)
- Settings: Copy the classloader from the original settings when checking for prompts (#12419, 2.0.0, 1.7.1, 1.6.2)
- Allocation: Cancel replica recovery on another node when synced copy found (#12421, 2.0.0)
- Internal: ShardUtils#getElasticsearchLeafReader() should use FilterLeafReader#getDelegate() instead of FilterLeafReader#unwrap (#12437, 2.0.0)
- Settings: Medium Interval time for ResourceWatcher should be 30 seconds (#12423, 2.0.0)
- Allocation: ThrottlingAllocationDecider should not count relocating shards (#12409, 2.0.0)
- Plugins: make java.version mandatory for jvm plugins (#12424, 2.0.0)
- Query DSL: don't cache type filter in DocumentMapper (#12447, 2.0.0)
- Settings: Add explicit check that we have reached the end of the settings stream when parsing settings (#12451, 2.0.0)
- Search: Copy headers from the MLT request before calling the multi-termvectors API (#12443, 1.7.1, 1.6.2)
- Allocation: No need to find replica copy when index is created (#12435, 2.0.0)
- Plugins: Skip hidden files (#12465, 2.0.0)
- Aggregations: Fix cidr mask conversion issue for 0.0.0.0/0 and add tests (#12430, 2.0.0, 1.7.1, 1.6.2)
- Cache: Left over from the query_cache/request_cache rename (#12478, 2.0.0)
- Internal: Forbid Files.isHidden (#12484, 2.0.0)
- Packaging: Remove Core Lib directory (#12485, 2.0.0)
- Internal: IndicesStore shouldn't try to delete index after deleting a shard (#12487, 2.0.0, 1.7.1, 1.6.2)
- PluginManager: Adapt pluginmanager to the new world (#12408, 2.0.0)
- Internal: Remove unused QueryParseContext argument in MappedFieldType#rangeQuery() (#12417, 2.0.0)
- Internal: Replace primaryPostAllocated flag and use UnassignedInfo (#12374, 2.0.0)
- Testing: Add unit tests for nodesAndVersions on shared filesystems (#12343, 2.0.0)
- Azure Plugin: Correctly list blobs in Azure Storage to prevent snapshot corruption and do not unnecessarily duplicate Lucene segments in Azure Storage (#12380, 2.0.0)
- Translog: Ignore EngineClosedException during translog fsync (#12384, 2.0.0)
- Query DSL: Fix malformed query generation in field value factor function (#12328, 1.7.1)
- Fielddata: Remove the dependency on IndexFielddataService from MapperService (#12371, 2.0.0)
- Refactoring: Refactor pluginservice (#12367, 2.0.0)
- Internal: Simplify Replica Allocator (#12401, 2.0.0)
- Internal: Remove TransportSingleCustomOperationAction in favour of TransportSingleShardAction (#12350, 2.0.0)
- Logging: Add shadow indicator when using shadow replicas (#12399, 2.0.0)
- Allocation: Adapt IndicesClusterStateService to use allocation ids (#12397, 2.0.0)
- Internal: Add the ability to wrap an index searcher. (#12364, 2.0.0)
- Term Vectors: Make sure filter is correctly parsed for multi-term vectors (#12312, 2.0.0)
- Aggs: Add HDRHistogram as an option in percentiles and percentile_ranks aggregations (#12362, 2.0.0)
- Internal: Cleanup the data structures used in MetaData class for alias and index lookups (#12202, 2.0.0)
- Packaging: Split packages into maven sub-modules (#12286, 2.0.0)
- Build: Update to maven-shade-plugin 2.4.1 (#12324, 2.0.0)
- Logging: Add -XX:+PrintGCDateStamps when using GC Logs (#11735, 2.0.0, 1.7.1)
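
The cidr mask fix for 0.0.0.0/0 in the list above touches a corner case that is easy to get wrong in Java in general. The sketch below is not the Elasticsearch patch and makes no claim about what the original bug was; it only illustrates one common pitfall, under the assumption that the mask is derived by shifting. All class and method names here (CidrMaskSketch, mask, range, brokenIntMask) are hypothetical.

```java
public class CidrMaskSketch {

    /** Naive int version, shown only to illustrate the pitfall. */
    static int brokenIntMask(int prefixLength) {
        // Java takes int shift counts modulo 32, so for a /0 prefix this
        // shifts by 32 -> 0 bits and yields -1 (255.255.255.255), not 0.
        return -1 << (32 - prefixLength);
    }

    /** Netmask for an IPv4 prefix length, as an unsigned 32-bit value held in a long. */
    static long mask(int prefixLength) {
        if (prefixLength < 0 || prefixLength > 32) {
            throw new IllegalArgumentException("prefix length must be in [0, 32]: " + prefixLength);
        }
        // Shifting a long by 32 is well defined, so /0 correctly gives 0.
        return (0xFFFFFFFFL << (32 - prefixLength)) & 0xFFFFFFFFL;
    }

    /** Parses a dotted-quad IPv4 address into an unsigned 32-bit value held in a long. */
    static long ipToLong(String ip) {
        String[] parts = ip.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not an IPv4 address: " + ip);
        }
        long value = 0;
        for (String part : parts) {
            int octet = Integer.parseInt(part);
            if (octet < 0 || octet > 255) {
                throw new IllegalArgumentException("octet out of range in: " + ip);
            }
            value = (value << 8) | octet;
        }
        return value;
    }

    /** Returns {from, to} for a CIDR block, with from inclusive and to exclusive. */
    static long[] range(String cidr) {
        String[] parts = cidr.split("/");
        if (parts.length != 2) {
            throw new IllegalArgumentException("not a CIDR block: " + cidr);
        }
        long m = mask(Integer.parseInt(parts[1]));
        long from = ipToLong(parts[0]) & m;
        long to = from + (0xFFFFFFFFL - m) + 1;
        return new long[] {from, to};
    }

    public static void main(String[] args) {
        long[] all = range("0.0.0.0/0");
        System.out.println(all[0] + " .. " + all[1]); // prints "0 .. 4294967296"
        System.out.println(brokenIntMask(0));          // prints "-1"
    }
}
```

Holding the address in a long keeps the whole IPv4 space, including the exclusive upper bound of 2^32, representable without overflow.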
Apache Lucene
- The Lucene PMC has added two new committers: Christine Poerschke and Mikhail Khludnev
- Rework build.xml files to avoid running out of permgen space
- Add a stemmer and analyzer for Lithuanian
- FilterDirectoryReader.doClose could over-decRef the reader
- A nice speedup to MultiTermsEnum.next makes computing global ordinals 26-36% faster in simple tests
- IgnoreAcceptedDocsQuery is no more
- SortedSet/NumericDocValues now optimize for the use-case where a small number of unique sets are used by all documents
- The javadoc link to Wikipedia's Levenshtein distance was incorrect
- Fix a long standing test bug causing cryptic non-reproducible ArrayIndexOutOfBoundsException in TestRandomSamplingFacetsCollector
- Clarify javadocs for IndexInput.close on a clone
- Pull query boosting out of the Query into a new BoostQuery class
- The new geo point queries can be too costly when applied to large shapes
- BlendedTermQuery scores a set of terms as if they all had the same index statistics
- Upgrade ANTLR (used by the expressions module) from version 3.5 to 4.5
- EarlyTerminatingSortingCollector should track if it did in fact terminate early
- WordDelimiterFilter puts some tokens at the wrong position
- When IndexWriter resolves a deleted Query to document IDs, it should use the Query API not the Filter API
- MoreLikeThisQuery computes the wrong term frequencies
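
To give a feel for the doc values item above (a small number of unique sets shared by all documents), here is a minimal sketch of the general idea: store each distinct set of values once and have every document point at it by a small integer id. This is an illustration only and does not reflect how Lucene actually encodes doc values; the class and method names (UniqueSetEncodingSketch, addDocument, valuesFor, uniqueSetCount) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class UniqueSetEncodingSketch {
    // Each distinct (sorted) group of values is stored exactly once.
    private final List<long[]> uniqueSets = new ArrayList<>();
    // Lookup from a group of values to its id in uniqueSets.
    private final Map<List<Long>, Integer> setIds = new HashMap<>();
    // Per-document pointer into uniqueSets; the index is the doc id.
    private final List<Integer> docToSet = new ArrayList<>();

    /** Appends a document with the given values and returns its doc id. */
    int addDocument(long... values) {
        long[] sorted = values.clone();
        Arrays.sort(sorted);
        List<Long> key = new ArrayList<>();
        for (long v : sorted) {
            key.add(v);
        }
        Integer id = setIds.get(key);
        if (id == null) {
            id = uniqueSets.size();
            uniqueSets.add(sorted);
            setIds.put(key, id);
        }
        docToSet.add(id);
        return docToSet.size() - 1;
    }

    /** Returns the sorted values of a document. */
    long[] valuesFor(int docId) {
        return uniqueSets.get(docToSet.get(docId));
    }

    /** Number of distinct value groups actually stored. */
    int uniqueSetCount() {
        return uniqueSets.size();
    }

    public static void main(String[] args) {
        UniqueSetEncodingSketch dv = new UniqueSetEncodingSketch();
        dv.addDocument(3, 1);
        dv.addDocument(1, 3);
        dv.addDocument(7);
        System.out.println(dv.uniqueSetCount()); // prints "2"
    }
}
```

When many documents share the same few groups of values, the per-document cost shrinks to one small id, which is the kind of saving the changelog entry is hinting at.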
Watch This Space
Stay tuned to this blog, where we'll share more news on the whole ELK ecosystem including news, learning resources and cool use cases!