This Week in Elasticsearch and Apache Lucene - September 22, 2015
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
Elasticsearch requires less disk than you may think; even less in 2.0 with best_compression! https://t.co/ZAB9GxJQiA pic.twitter.com/qQJuyAr0yQ
— elastic (@elastic) September 15, 2015
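For readers who want to try the codec mentioned in the tweet, here is a minimal sketch of a create-index request body that sets `index.codec` to `best_compression`. The helper name `index_settings` and the shard count are illustrative assumptions, not something taken from the linked post; the resulting JSON would be sent as the body of an index creation request.

```python
import json


def index_settings(best_compression=True):
    """Build a create-index body (hypothetical helper, illustrative values)."""
    # "best_compression" trades some stored-field read speed for a smaller
    # index on disk; "default" keeps the standard stored-fields compression.
    codec = "best_compression" if best_compression else "default"
    return {"settings": {"index": {"codec": codec, "number_of_shards": 1}}}


if __name__ == "__main__":
    # Print the JSON body that would accompany the create-index request.
    print(json.dumps(index_settings(), indent=2))
```

The codec is chosen per index, so it can be enabled for cold or archival indices only.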
Elasticsearch Core
- Packaging: Remove some bogus permissions only needed for tests. (#13620, 3.0.0)
- Exceptions: Use a dedicated id to serialize `EsExceptions` instead of its class name. (#13629, 3.0.0, 2.1.0, 2.0.0)
- Tests: Better simulate problematic plugins permissions in unit tests. (#13638, 3.0.0)
- Build: Add Intellij support for plugins with extra permissions. (#13647, 3.0.0)
- Exceptions: Rename `QueryParsingException` to a more generic ParsingException (#13631, 3.0.0)
- Plugin Cloud Azure: Add documentation for setting `network.host` with azure discovery (#13637, 3.0.0, 2.1.0, 2.0.0)
- Plugin Cloud AWS: Update AWS SDK version to `1.10.19` (#13655, 3.0.0, 2.1.0)
- Plugins: Adds a validation for plugins script to check if java is set (#13633, 3.0.0, 2.1.0, 2.0.0)
- Packaging: Get `lang-javascript, lang-python, securemock` ready for script refactoring (#13695, 3.0.0)
- Build: Disable doclint (#13689, 3.0.0)
- Build: Fix all javadocs issues, re-enable compiler warnings (but disable on java 9 where maven is broken) (#13702, 3.0.0)
- Test: Move a couple more rest-api-spec resource dirs into resources (#13704, 3.0.0, 2.1.0)
- Internal: Add a `BaseParser` helper for stream parsing (#13615, 3.0.0)
- Packaging: Fix vagrant plugin tests (#13696, 3.0.0)
- Test: Add SLES-12 to list of tested virtual machines (#13675, 3.0.0, 2.1.0, 2.0.0)
- Test: Remove esoteric apt-get in `Vagrantfile` (#13644, 3.0.0, 2.1.0, 2.0.0)
- Dependencies: Update to randomizedtesting 2.1.17 (#13713, 3.0.0)
- Settings: Remove `index.buffer_size` setting (#13563, 3.0.0, 2.1.0, 2.0.0)
- Exceptions: Prevent losing stacktraces when exceptions occur (#13587, 3.0.0, 2.1.0, 2.0.0)
- SecurityManager: Remove `java.lang.reflect.ReflectPermission "suppressAccessChecks"` (#13603, 3.0.0)
- Core: Fix `InternalEngineTests.testTranslogReplayWithFailure` to expect AssertionError as well (#13609, 3.0.0, 2.1.0, 2.0.0)
- Packaging: Fix `centos-6` vagrant tests (#13594, 3.0.0, 2.1.0, 2.0.0)
- Packaging: Add `opensuse-13` to vagrant packaging tests (#13593, 3.0.0, 2.1.0, 2.0.0)
- Plugin Cloud GCE: Cloud GCE documentation update (#13598, 3.0.0, 2.1.0, 2.0.0)
- Plugins: Move `rest-api-spec` for plugins into test resources (#13611, 3.0.0, 2.1.0)
- Tests: Prevent certain index templates from being wiped in between tests (#13606, 3.0.0, 2.1.0, 2.0.0)
- Plugin Discovery EC2: Move integration tests to unit tests (#13502, 3.0.0)
- Internal: Remove and forbid use of `com.google.common.primitives.Ints` (#13596, 3.0.0)
- Plugin Cloud Azure: Enable SSL for Azure blob storage (#13573, 3.0.0, 2.1.0, 2.0.0)
- Tribe Node: Increment tribe node version on updates (#13566, 3.0.0, 2.1.0, 2.0.0, 1.7.3)
- Packaging: Remove `JAVA_HOME` detection from the debian init script (#13514, 3.0.0, 2.1.0)
- Test: make sure `JAVA_HOME` is set before tests are run (#13124, 2.1.0, 2.0.0)
- Packaging: Packaging test for filesystem scripts (#13262, 3.0.0, 2.1.0, 2.0.0)
- Build: maven assembly, elasticsearch is already excluded (#13387, 3.0.0, 2.1.0, 2.0.0)
- Plugin Cloud Azure: Split azure plugin in 3 plugins (#13330, 3.0.0)
- REST: `RestUtils.decodeQueryString` ignores the URI fragment when parsing a query string (#13365, 3.0.0, 2.1.0)
- Test: fix hanging `testRefreshDoesNotMissShards` (#13343, 2.1.0, 2.0.0)
- Internal: Enable indy (invokedynamic) compile flag for Groovy scripts by default (#8201, 3.0.0, 2.1.0)
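To make the REST query string change (#13365) a little more concrete, here is a rough sketch of what "ignoring the URI fragment" means when parsing parameters. This is not the Elasticsearch implementation (which is Java); `decode_query_string` is a hypothetical helper written only to illustrate the behaviour.

```python
from urllib.parse import unquote_plus


def decode_query_string(uri):
    """Hypothetical helper: parse query parameters, ignoring any #fragment."""
    # Cut the fragment off before decoding, so an encoded '#' (%23) inside a
    # value survives while a literal '#' ends the query string.
    without_fragment = uri.split("#", 1)[0]
    # Everything after the first '?' is the query string; no '?' means no params.
    _, _, query = without_fragment.partition("?")
    params = {}
    for pair in query.split("&"):
        if not pair:
            continue  # skip empty segments such as a trailing '&'
        name, _, value = pair.partition("=")
        params[unquote_plus(name)] = unquote_plus(value)
    return params
```

Without the fragment handling, a request such as `/_search?q=user:kimchy&size=10#top` would end up with a `size` value of `10#top`.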
Apache Lucene
- Votes for the third 5.3.1 release candidate are underway now
- At long last, Lucene moves away from the very old TF/IDF scoring model to the better-performing BM25 by default, but only starting with the next major release (6.0)
- When handling a tragic exception `IndexWriter` fails to wait for any concurrent commits to first finish
- `SpanPayloadCheckQuery` now takes `List` of `BytesRef` instead of `Collection` of `byte[]`
- `PayloadSpanUtil` now lives in Lucene's `sandbox` module, and span payload queries have moved to the `queries` module
- Reduce transient heap required when writing large stored fields values with the default (compressing stored fields) codec
- Fix a couple tests to properly handle tragic merge exceptions
- Early access Jigsaw testing resulted in many small Lucene fixes
- Highlighter would sometimes get the wrong `slop` and `inOrder` when converting `PhraseQuery` to `SpanQuery`
- Improve `FSDirectory` javadocs to explain why symlinks to an index directory are problematic
- Improve German decompounding using `DictionaryCompoundWordTokenFilter` by having `minSubwordSize` apply to all fragments
- `GeoPointDistanceQuery` is buggy with large distances
- Should `TermOrdValComparator.value` be forced to make a deep clone of the returned `BytesRef`?
- `BooleanQuery.equals` is unfortunately sensitive to clause order, but fixing it is controversial
- `AbstractRangeQueryNode.toQueryString` does not always create the right "inverse" query string
- Possible workarounds for a recent change in how `WordDelimiterFilter` sets positions of its word parts
- `SimpleNaiveBayesClassifier` hits a `ConcurrentModificationException` while trying to evict a query from `IndexSearcher`'s cache
- Can we simplify Lucene's search APIs by merging `rewrite` and `createWeight`?
- More iterations trying to get `BKDDistanceQuery` working on large distances
- `FunctionQuery.AllScorer.explain` confusingly overwrites `queryNorm`
- Should we add a read-only interface to `BitSet`?
- The time has come to remove the slow sandbox `RegexpQuery`
- `PhraseQuery` claims it allows more than one term at a single position, but it does not
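For readers curious about the BM25 item above, here is a simplified sketch of the per-term BM25 score in its commonly cited form, with the usual defaults `k1 = 1.2` and `b = 0.75`. It is an illustration only: the function name and parameters are assumptions, and it ignores details of Lucene's actual implementation such as lossy length encoding and boosts.

```python
import math


def bm25_term_score(freq, doc_len, avg_doc_len, doc_count, doc_freq,
                    k1=1.2, b=0.75):
    """Simplified BM25 score of one term in one document (illustrative only)."""
    # Rare terms (small doc_freq relative to doc_count) get a larger idf.
    idf = math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))
    # Length normalization: documents longer than average are penalized,
    # with b controlling how strongly.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    # Term frequency saturates: each extra occurrence adds less than the last.
    return idf * freq * (k1 + 1) / (freq + norm)
```

The saturation in the last line is the main practical difference from classic TF/IDF: repeating a term many times in a document yields diminishing returns instead of an ever-growing score.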
Watch This Space
Stay tuned to this blog, where we'll share more on the whole ELK ecosystem, including news, learning resources and cool use cases!