This Week in Elasticsearch and Apache Lucene - September 8 2015
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
New from infrastructure guru @leothrix: '#Elasticsearch Command Line Debugging w/The _cat API' http://t.co/iVPjX4QRvC pic.twitter.com/kKlCbzUFHy
— elastic (@elastic) September 2, 2015
Elasticsearch Core
- Tests: Remove stress tests (#13291, 2.1.0)
- Tests: Rename test suffix so we only use `Tests` (#13294, 2.1.0)
- Packaging: Don't surround `-Xloggc` log filename with quotes (#13296, 2.0.0)
- Network: Improve situation when `network.host` is set to wildcard (e.g. 0.0.0.0) (#13299, 2.0.0)
- Search: Optimize scrolls for constant-score queries (#13311, 2.1.0)
- Query DSL: Internal: simplify filtered query conversion to lucene query (#13312, 3.0.0, 2.1.0)
- Internal: Bump master (3.0-snapshot) to java 8 (#13314, 3.0.0)
- Test: Workaround pitfall in Java 8 target-type inference (#13318, 3.0.0)
- Core: Upgrade master to lucene `5.4-snapshot r1701068` (#13324, 3.0.0)
- Network: Remove broadcast address check. (#13328, 2.0.0)
- Cleanup: remove non needed resources (#13321, 3.0.0, 2.1.0, 2.0.0)
- Internal: Fix deprecations introduced by the upgrade to Lucene 5.3 (#13308, 3.0.0, 2.1.0)
- Query DSL: `simple_query_string` overrides boost coming from lucene (#13331, 3.0.0, 2.1.0, 2.0.0)
- Query DSL: `span_containing/within` override default boost coming from lucene (#13339, 3.0.0, 2.1.0, 2.0.0)
- Internal: Remove use of underscore as an identifier (#13353, 3.0.0)
- Shadow Replicas: Allow deleting closed indices with shadow replicas (#13309, 3.0.0, 2.1.0, 2.0.0)
- Packaging: Shade `joda-convert` (#13358, 1.7.2, 1.6.3)
- Internal: Remove and forbid use of `com.google.common.base.Objects` (#13355, 3.0.0)
- Plugin Cloud AWS: Enable `S3SignerType` for repository-s3 (#13360, 3.0.0, 2.1.0)
- Internal: Remove and forbid the use of `com.google.common.base.Predicate(s)` (#13349, 3.0.0)
- Core: fix exception handling for unavailable shards in broadcast replicatio… (#13341, 2.1.0, 2.0.0)
- Build: move heapdump to target/heapdump dir (#13342, 3.0.0, 2.1.0, 2.0.0)
- Search: Remove the `scan`/`count` search types. (#13310, 3.0.0)
- Internal: Remove cyclic dependencies between `IndexService` and `FieldData` / `BitSet` caches (#13381, 3.0.0)
- Release: Add script to validate mvn repositories (#13306, 3.0.0, 2.0.0)
- Packaging: Install all plugins during bats tests (#13076, 2.1.0, 2.0.0)
- Mapping: Fix numerous checks for equality and compatibility in mapper field types (#13206, 2.1.0)
- Analysis: Lithuanian analysis (#13244, 2.1.0)
- Tests: Improve jacoco coverage (#13263, 2.1.0)
- Search: Allow reads on shards that are in `POST_RECOVERY` (#13246, 2.1.0, 2.0.0)
- Build: remove shaded elasticsearch version (#13252, 2.1.0, 2.0.0)
- Packaging: More bats backports (#13255, 2.0.0)
- Packaging: Clean up more bats tests (#13083, 2.1.0, 2.0.0)
- Scripting: Propagate Headers and Context through to `ScriptService` (#12982, 2.1.0, 2.0.0)
- Java API: Add qa smoke test client module (#13271, 2.1.0, 2.0.0)
- Tests: print test start and end of test setup and cleanup (#13268, 2.1.0, 2.0.0)
- Tests: Remove test class exclusion for Abstract prefix and rename classes accordingly (#13282, 2.1.0)
- Internal: Add listeners for `postIndex`, `postCreate`, and `postDelete` (#13203, 2.1.0, 2.0.0)
- Search: Optimize counts on simple queries. (#13037, 2.1.0)
- Plugin Cloud AWS: Split cloud-aws into `repository-s3` and `discovery-ec2` (#13097, 3.0.0)
- Query DSL: `match_phrase_prefix` to take boost into account (#13142, 3.0.0, 2.1.0, 2.0.0)
- Geo: Refactor `ignore_malformed` and coerce GeoPointFieldType to Builder (#13289, 2.0.0)
- Aggregations: Add `percentiles_bucket` pipeline aggregation (#13186, 3.0.0, 2.1.0)
- Core: Manual synchronization when iterating over listeners in `InternalClusterInfoService` (#13270, 3.0.0, 2.1.0, 2.0.0)
- Aggregations: Add `stats_bucket`/`extended_stats_bucket` pipeline aggs (#13128, 3.0.0, 2.1.0)
- Packaging: Test upgrading from an older version (#13287, 3.0.0, 2.1.0, 2.0.0)
- Geo: Refactor geopoint `validate & normalize_` for 2.0 (#12300, 2.0.0)
- Search Templates: Adds template support to `_msearch` resource (#12414, 2.1.0)
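For readers curious about the new pipeline aggregations in the list above, here is a minimal sketch of what a `percentiles_bucket` request body might look like, built as a plain Python dict. The field names (`date`, `price`) and aggregation names (`sales_per_month`, `sales`, `percentiles_monthly_sales`) are hypothetical placeholders, not taken from the pull request; adjust them to your own mapping.

```python
import json


def percentiles_bucket_request(percents):
    """Build a search body with a sibling percentiles_bucket aggregation.

    All field and aggregation names here are hypothetical placeholders.
    """
    return {
        "size": 0,  # we only want aggregation results, not hits
        "aggs": {
            # Parent aggregation: one bucket per month, each with a sum metric.
            "sales_per_month": {
                "date_histogram": {"field": "date", "interval": "month"},
                "aggs": {"sales": {"sum": {"field": "price"}}},
            },
            # Sibling pipeline aggregation: percentiles across the monthly sums.
            "percentiles_monthly_sales": {
                "percentiles_bucket": {
                    "buckets_path": "sales_per_month>sales",
                    "percents": list(percents),
                }
            },
        },
    }


body = percentiles_bucket_request([25.0, 50.0, 75.0])
print(json.dumps(body, indent=2))
```

Swapping `percentiles_bucket` for `stats_bucket` or `extended_stats_bucket` (and dropping `percents`) should give the equivalent request for the other new pipeline aggregations.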
Apache Lucene
- A 5.3.1 bugfix release may be coming soon
- See the confusion matrix for a classifier
- Remove a nasty classloader hack that broke `MorfologikFilter`, and fix it correctly so you can pass the dictionary as a `URI`
- The new `BoostQuery` decouples `Query` from boosting
- How can we compress postings payloads?
- Nested conjunctions should always be flattened
- `BoostingQuery` was missing its `rewrite` method
- `MultiCollector` did not handle early termination properly
- Add a point-within-distance query implemented with BKD trees
- Speed up `IndexSearcher.count` when a query is so simple (match all, single term) that we can use index statistics instead
- The integration of `BKDTree` and `Geo3D` is done (for Lucene 5.4.0), providing accurate and fast earth-surface "point in shape" queries, but we need to make its randomized tests more evil by simulating planets more squashed than earth, requiring some crazy math, including Lagrange multipliers
- `GeoPointDistanceQuery` is buggy with large distances
- Don't use approximations for `MatchAllDocsQuery`, and give it a dedicated `BulkScorer`
- Dodge bugs in Java's collators
- Windows NTFS pending delete state for a file causes assertion failures in Lucene; we should fix Lucene's `WindowsFS` to also simulate this state
- Symlinks to an index directory continue to cause problems for users
- `CheckIndex` cannot handle corrupt `.si` files
- Reduce heap used by `CompressingStoredFieldsWriter` when writing large strings during indexing
- Reduce heap used by the new geo point queries by building the `BytesRef` on demand for sub-ranges
- `GeoPointDistanceRangeQuery` will match points within a min/max distance range
- When you incorrectly index nested documents the resulting error messages are very confusing
- `DisjunctionMaxQuery`, `BoostingQuery` and `BoostedQuery` now use `IndexSearcher` to create sub-weights so caching can apply
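The `IndexSearcher.count` item in the list above rests on a simple idea: for a match-all or single-term query, the answer may already be sitting in index statistics, so there is no need to visit every matching document. The toy inverted index below is a sketch of that idea only, not Lucene's implementation. `ToyIndex` and its methods are made-up names, and the sketch assumes the single-term shortcut is only safe when no documents have been deleted.

```python
from collections import defaultdict


class ToyIndex:
    """A tiny in-memory inverted index (hypothetical, for illustration)."""

    def __init__(self):
        self.postings = defaultdict(list)  # term -> list of doc ids
        self.num_docs = 0                  # total docs ever added
        self.deleted = set()               # ids of deleted docs

    def add(self, text):
        doc_id = self.num_docs
        self.num_docs += 1
        # Record each distinct term once per document.
        for term in set(text.lower().split()):
            self.postings[term].append(doc_id)
        return doc_id

    def delete(self, doc_id):
        if 0 <= doc_id < self.num_docs:
            self.deleted.add(doc_id)

    def count_match_all(self):
        # Pure statistics: live docs = all docs minus deleted docs.
        return self.num_docs - len(self.deleted)

    def count_term(self, term):
        docs = self.postings.get(term, [])
        if not self.deleted:
            # Fast path: document frequency is exact when nothing is deleted.
            return len(docs)
        # Slow path: walk the postings and skip deleted documents.
        return sum(1 for d in docs if d not in self.deleted)


index = ToyIndex()
index.add("fast search")
index.add("fast index")
index.add("slow index")
print(index.count_match_all(), index.count_term("fast"))

index.delete(0)
print(index.count_match_all(), index.count_term("fast"))
```

The point of the fast path is that it does constant work regardless of how many documents match, while the slow path scales with the length of the postings list.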
Watch This Space
Stay tuned to this blog, where we'll share more on the whole ELK ecosystem, including news, learning resources and cool use cases!