This Week in Elasticsearch and Apache Lucene - October 6th 2015
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
Stretching Horizons with #Elasticsearch @logmatic_ . Read on to find out how they do it: http://t.co/ebbgILZpSQ. pic.twitter.com/hxZCsJ1VKl
— elastic (@elastic) October 5, 2015
Elasticsearch Core
- Internal: Clean up scripting permissions. (#13844, 3.0.0, 2.2.0)
- Tests: add `lang-groovy` to plugin vagrant test (#13856, 3.0.0, 2.2.0)
- Internal: Add `SpecialPermission` to guard exceptions to security policy. (#13854, 3.0.0, 2.2.0)
- Query refactoring: set `has_parent/has_child` types context properly (#13863, 3.0.0)
- Internal: Remove `ClusterService/IndexSettingsService` dependency from IndexShard (#13853, 3.0.0)
- Internal: Verify actually written checksum in `VerifyingIndexOutput` (#13848, 3.0.0, 2.2.0, 2.1.0)
- Engine: Remove the disabled autogenerated id optimization from `InternalEngine` (#13857, 3.0.0)
- Internal: Start making `RecoverySourceHandler` unit testable (#13840, 3.0.0)
- Dependencies: Update to `forbidden-apis` 2.0 (#13876, 3.0.0, 2.2.0)
- Packaging: Nuke `ES_CLASSPATH` appending, JarHell fail on empty classpath elements (#13880, 3.0.0, 2.2.0, 2.1.0, 2.0.0)
- Plugins: Update version incompatibility message for plugin manager (#13883, 3.0.0, 2.2.0, 2.1.0, 2.0.0)
- Core: Close `TokenStream` in finally clause (#13870, 3.0.0, 2.2.0, 2.1.0)
- Internal: Remove and forbid use of `com.google.common.io.Resources` (#13908, 3.0.0)
- Internal: Remove and forbid use of `com.google.common.collect.ImmutableCollection` (#13909, 3.0.0)
- Core: Verify Checksum once it has been fully written to fail as soon as possible (#13896, 3.0.0, 2.2.0, 2.1.0)
- Allocation: Early terminate high disk watermark checks on single data node cluster (#13882, 3.0.0, 2.2.0, 2.1.0)
- Internal: Remove and forbid use of `com.google.common.hash.*` (#13907, 3.0.0)
- Index creation: Forbid index name `.` & `..` (#13862, 3.0.0, 2.2.0, 2.1.0)
- Logging: settings in log config file should not overwrite custom parameters (#13934, 2.2.0, 2.1.0, 2.0.0)
- Core: Record all bytes of the checksum in `VerifyingIndexOutput` (#13923, 3.0.0, 2.2.0, 2.1.0)
- Internal: Remove shard-level injector (#13881, 3.0.0)
- SecurityManager: lock down javascript and python script engines better (#13924, 3.0.0, 2.2.0)
- Internal: Remove unneeded Module abstractions (#13944, 3.0.0)
- Testing: Move back some tests from groovy plugin to core (#13945, 3.0.0, 2.2.0)
- Plugin Cloud GCE: cloud-gce plugin should check `discovery.type` (#13809, 3.0.0, 2.1.0, 2.0.0)
- Snapshot/Restore: Fix blob size in `writeBlob()` method (#13574, 3.0.0, 2.2.0, 2.1.0)
- CAT API: Adds disk used by indices to `_cat/allocation` (#13783, 3.0.0, 2.2.0, 2.1.0)
- Plugin Cloud AWS: discovery-ec2 plugin should check `discovery.type` (#13814, 3.0.0, 2.2.0, 2.1.0, 2.0.0)
- Snapshot/Restore: Snapshot restore operations throttle more than specified (#13828, 3.0.0, 2.2.0, 2.1.0, 2.0.0, 1.7.3)
Apache Lucene
- Upgrade to `forbiddenapis` 2.0
- Fix test failures in the spatial module geo3d integration
- `SimpleNaiveBayesClassifier` must copy the `BytesRef` term before searching on it, but maybe instead `TermQuery` should clone the incoming term?
- We should not use `System.currentTimeMillis` in the replication module
- `GeoPointDistanceQuery` has false failures due to shape boundary issues
- Build release tools and documentation are perpetually in need of fixing
- Can we simplify Lucene's search APIs by merging `rewrite` and `createWeight`?
- Here's another approach to make distributed joins easier in Lucene
- Disjunctions could advance lazily, speeding up cases when they are MUST'd with another query
- `MemoryIndex` should implement doc values
- Add the Divergence from Independence scoring model
- Maybe we should remove index-time boosts (norms) and just use doc values instead?
- The new `SynonymGraphFilter`, to fix sneaky phrase query bugs in how multi-token synonyms are applied, is blocked on backwards compatibility concerns
- The existing `DocIdSetIterator.cost` API does not provide enough information for two-phased iteration
Watch This Space
Stay tuned to this blog, where we'll share more on the whole ELK ecosystem, including news, learning resources and cool use cases!