This Week in Elasticsearch and Apache Lucene - August 18, 2015
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
See how GuideStar, with help from @elastic, aggregates, analyzes, & provides data for grantmakers & benefactors: http://t.co/jDA3MR1SNC
— GuideStar USA (@GuideStarUSA) August 17, 2015
Elasticsearch Core
- Internal: Remove ClassLoader from Settings (#12868, 2.0.0)
- Java API: Enhancement/terms lookup fixes (#12870, 2.0.0-beta1)
- Build: Also deploy top level artifacts to S3 (#12876, 2.0.0-beta1)
- Plugins: PluginManager: Change staging URL to reflect S3 bucket (#12877, 2.0.0-beta1)
- Search: Simplify ContextIndexSearcher. (#12875, 2.0.0)
- Release: Release scripts: Split prepare_release into two scripts (#12878, 2.0.0-beta1)
- Plugins: Fix automatically generated URLs for official plugins in PluginManager (#12885, 2.0.0-beta1)
- Plugins: Validate checksums for plugins if available (#12888, 2.0.0)
- Build: Fix reproduction line to include project filters (#12894, 2.0.0)
- Internal: Remove Environment.resolveConfig (#12872, 2.0.0)
- Index Templates: Validate settings specified in index templates at template creation time (#12892, 2.1.0)
- Packaging: Remove ES_CLEAN_BEFORE_TEST (#12904, 2.1.0)
- Refactor: Shard variable dependency from processFirstPhaseResults as shard is no more needed (#12917, 2.0.0)
- Internal: Flatten ClusterModule and add more tests (#12916, 2.0.0)
- Build: Change qa/vagrant artifactId (#12898, 2.1.0)
- Packaging: Add bin to jvm-example (#12897, 2.1.0)
- Exceptions: Validate class before cast. (#12913, 2.1.0)
- Recovery: Endless recovery loop with indices.recovery.file_chunk_size=0Bytes (#12919, 2.0.0)
- Internal: Allow a plugin to supply its own query cache implementation (#12881, 2.0.0)
- Release: Replace python search/replace with mvn versions:set plugin (#12924, 2.0.0)
- Help: Elasticsearch bootstrap help shouldn't mention plugins (#12933, 2.0.0)
- Stats: Refactor, remove _node/network and _node/stats/network. (#12922, 2.1.0, 2.0.0)
- Stats: Make platform specific assumptions in OS & Process probes tests (#12929, 2.1.0, 2.0.0)
- Packaging: Fix variable substitution for OS's using systemd (#12909, 2.1.0, 2.0.0)
- Release: Release script: Set versions for non inherited projects (#12938, 2.0.0-beta1)
- Packaging: Use jvm-example for testing bin/plugin (#12895, 2.1.0)
- Internal: Flatten IndicesModule and add tests (#12921, 2.0.0)
- Network: Use preferIPv6Addresses for sort order, not preferIPv4Stack (#12951, 2.0.0-beta1)
- Stats: Expose ClassloadingMXBean in Node Stats (#12764, 2.0.0-beta1)
- Settings: Add path.shared_data (#12729, 2.0.0-beta1, 2.0.0)
- Shadow Replicas: Remove the node.enable_custom_paths setting (#12837, 2.0.0-beta1, 2.0.0)
- Aggregations: Removed unused factor parameter in DateHistogramBuilder (#12850, 2.0.0-beta1, 2.0.0)
- Plugins: Introduce a formal ExtensionPoint class to stream line extensions (#12826, 2.0.0-beta1)
- Plugin Cloud AWS: Add support for base_path in elasticsearch.yml (#12761, 2.0.0-beta1)
- Percolator: Don't cache percolator query on loading percolators (#12862, 2.0.0-beta1)
- Testing: Refactor classes only plugged in by tests to use package private extension points (#12863, 2.0.0)
- Release: Create pre release script (#12860, 2.0.0-beta1)
- Exceptions: Improve error message of ClassCastExceptions (#12821, 2.1.0)
- Java API: Prevents users from building a BulkProcessor with a null client (#12497, 2.1.0)
- Packaging: Set prompts to be more consistent for vagrant (#12782, 2.0.0)
- Geo: Refactor geo_point validate* and normalize* for 2.x (#12742, 2.0.0)
- Inner Hits: Reset the ShardTargetType after serializing inner hits. (#12261, 2.0.0)
- Docs: Prepare plugin and integration docs for 2.0 (#12040, 2.0.0)
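
To make the "validate checksums for plugins if available" item above a little more concrete, here is a minimal Python sketch of the general idea: verify a download against a published digest when one exists, and skip the check when none is published. This is not the PluginManager code; the function name, the choice of SHA-1 as the default algorithm, and the "missing checksum means accept" policy are all assumptions made for illustration.

```python
import hashlib
from typing import Optional


def verify_checksum(payload: bytes,
                    expected_hex: Optional[str],
                    algorithm: str = "sha1") -> bool:
    """Hypothetical helper: check downloaded bytes against a published digest.

    Returns True when no digest is available (nothing to validate) or when
    the computed digest matches; returns False on a mismatch.
    """
    if expected_hex is None:
        # Assumption: a missing checksum file is not an error.
        return True
    actual = hashlib.new(algorithm, payload).hexdigest()
    # Published digests often carry trailing whitespace or upper-case hex.
    return actual == expected_hex.strip().lower()
```

A caller would typically abort the install when this returns False, so that a corrupted or tampered archive never gets unpacked.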
Apache Lucene
- First release candidate for Lucene 5.3.0 is out but it looks like there will soon be a second RC
- DecimalFilter folds Unicode numeric digits to basic latin
- MMapDirectory startup check for whether it can unmap pages throws unhandled exceptions when the security policy thwarts it
- Similarity classes should use docCount, if available, not maxDoc
- Hyper-parameter c is ignored by NormalizationH1 similarity model
- Use groovy to detect invalid license headers and other problems such as invalid tabs in CHANGES.txt, XSL files, and JS and XML files
files - Upgrade ASM to 5.0.4 for Lucene's expressions module
- The kNN classifier should take a similarity
- ant eclipse should pick the right Java version according to which branch you are using
- The new table-encoded SortedSetDocValues now accounts for RAM used by its value tables
- Upgrade ANTLR (used by the expressions module) from version 3.5 to 4.5, and make JavascriptCompiler stateless
stateless - More iterations on a private branch to integrate
Geo3D
andBKD
trees toprovide accurate and fast earth-surface "point in shape" queries - Remove the esoteric
get/
methods fromsetIndexingChain IndexWriterConfig - You cannot rebuild
JapaneseTokenizer
's dictionary because the download URL is broken TermAutomatonQuery
, which generalizes on positional queries likePhraseQuery
allowing you to run an arbitrary automaton, will soon implement two-phased andneedsScores
support- IBM's J9 JDK has a fix coming for the broken handling of Unicode filenames, uncovered by Lucene's tests
- Change SpanPayloadCheckQuery from Collection<byte[]> to List<BytesRef>
<BytesRef> - This scary test failure, showing that
IndexWriter
is trying to delete a file that does not exist, only happens on Windows - We still see some builds failing due to inexplicable timeouts at 7200 seconds
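
For readers wondering what "folding Unicode numeric digits to basic latin" in the list above means in practice, the following Python sketch shows the idea on plain strings. It is an illustration only, not Lucene's Java token filter: the function name is hypothetical, and it assumes "numeric digits" means characters in the Unicode general category Nd (decimal digits).

```python
import unicodedata


def fold_decimal_digits(text: str) -> str:
    """Hypothetical helper: map every Unicode decimal digit to ASCII 0-9."""
    folded = []
    for ch in text:
        if unicodedata.category(ch) == "Nd":
            # unicodedata.decimal gives the digit's integer value (0..9).
            folded.append(str(unicodedata.decimal(ch)))
        else:
            folded.append(ch)
    return "".join(folded)
```

With this kind of folding applied at both index and query time, a search for 123 typed with ASCII digits can match text written with, say, Arabic-Indic or fullwidth digits.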
Watch This Space
Stay tuned to this blog, where we'll share more on the whole ELK ecosystem, including news, learning resources and cool use cases!