This week in Elasticsearch and Apache Lucene: Elasticsearch and the MarsCuriosity analytics cloud
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
@elastic although not mentioned, #elasticsearch is the powerhouse behind our @MarsCuriosity analytics cloud… w00t! http://t.co/SYfd0m1x3I
— Matt Lenda (@mattTheLenda) May 4, 2015
Elasticsearch Core
- Query: query string time zone not working (#10883, 2.0.0, 1.6.0, 1.5.3)
- Mappings: Fix _field_names to not have doc values (#10893, 2.0.0)
- Aggregations: Added min_bucket aggregation (#10900, 2.0.0)
- Aggs: Change the default min_doc_count to 0 on histograms. (#10904, 2.0.0)
- Scripting: Add helper methods for dates (#10890, 2.0.0)
- Logging: Add logging of slow cluster state tasks (#10907, 2.0.0, 1.6.0)
- Queries: Add span within/containing queries (#10913, 2.0.0)
- Dependencies: Exclude jackson-databind dependency (#10924, 2.0.0, 1.6.0)
- Core: Cut over to the Lucene filter cache (#10897, 2.0.0)
- Cleanup: Remove old 0.90 shard allocator (#10889, 2.0.0)
- Clients: Automatically thread client based action listeners (#10940, 2.0.0)
- Startup: Simplify securitymanager init (#10936, 2.0.0)
- Cleanup: Never rely on CWD in paths (#10923, 2.0.0)
- Startup: Remove shutdownHooks permission in security manager (#10953, 2.0.0)
- Startup: Remove JNI permissions, improve JNI testing (#10962, 2.0.0)
- Shutdown: Remove exitVM permissions (#10963, 2.0.0)
- Shadow Replicas: Allow shards on shared filesystems to be recovered on any node (#10960, 2.0.0, 1.6.0)
- Client: Centralize admin implementations and action execution (#10955, 2.0.0)
- Testing: make testing better mimic reality for securitymanager (#10965, 2.0.0)
- Aggs: Fix infinite values returned from geo_bounds with non-zero bucket ordinals (#10917, 2.0.0, 1.6.0, 1.5.3)
- Function score: Add default to field_value_factor (#10845, 2.0.0, 1.6.0)
- Security manager: Remove reflection permission for sun.management. (#10848, 2.0.0)
- Discovery: Unicast Ping should close temporary connections after returning ping results (#10849, 2.0.0, 1.6.0, 1.5.3)
- Cleanup: Remove core delete-by-query implementation (#10859, 2.0.0)
- Cleanup: Remove index/indices replication infra code (#10861, 2.0.0)
- Mappings: Explicitly disallow multi fields from using object or nested fields (#10745, 2.0.0)
- Mappings: Consolidate document parsing logic (#10802, 2.0.0)
- Mappings: Remove includes and excludes from _source (#10814, 2.0.0)
- Internal: Remove the query parser cache. (#10856, 2.0.0)
- Allocation: Add multi data.path to migration guide (#10770, 2.0.0)
- Cleanup: Remove Preconditions class (#10873, 2.0.0)
- Cleanup: Remove global source parameter from individual APIs in REST spec (#10863, 2.0.0, 1.6.0)
- Query enhancement: return positions of parse errors found in JSON (#10837, 2.0.0)
- Serialization: read/writeGenericValue to support BytesRef (#10878, 2.0.0)
- Aggs: Don't use bitset cache for children filters in inner_hits (#10663, 2.0.0, 1.6.0, 1.5.3)
- Aggs: Ignore object fields in inner_hits (#10662, 2.0.0, 1.6.0, 1.5.3)
- Mappings: Remove file based default mappings (#10870, 2.0.0)
- Indices: Wait forever (or one day) for indices to close (#10833, 2.0.0, 1.6.0)
- Recovery: Decouple recoveries from engine flush (#10624, 2.0.0)
- Scripting: remove groovy sandbox (#10480, 2.0.0)
- Aggs: Ability to perform computations on aggregations (#10568, 2.0.0)
- Settings: Trimmed the main elasticsearch.yml configuration file (#5861, 2.0.0)
- ref count write operations on IndexShard (#10610, 2.0.0, 1.6.0)
- Write state also on data nodes if not master eligible (#9952, 2.0.0)
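To make two of the aggregation changes above more concrete, here is a minimal sketch in plain Python. It is not the Elasticsearch implementation, and the function names `histogram` and `min_bucket` are hypothetical helpers chosen for this illustration. It assumes integer values, and it applies the minimum to document counts only. The sketch shows that with `min_doc_count` set to 0 a histogram also returns the empty buckets between the lowest and highest key, and that a min-bucket step then reports the bucket or buckets with the smallest value.

```python
from collections import Counter


def histogram(values, interval, min_doc_count=0):
    """Group integer values into fixed-width buckets (illustrative only).

    Buckets between the lowest and highest key are emitted when their
    document count is at least min_doc_count, so min_doc_count=0 keeps
    the empty buckets and min_doc_count=1 drops them.
    """
    if not values:
        return []
    counts = Counter((v // interval) * interval for v in values)
    buckets = []
    key = min(counts)
    while key <= max(counts):
        doc_count = counts.get(key, 0)
        if doc_count >= min_doc_count:
            buckets.append({"key": key, "doc_count": doc_count})
        key += interval
    return buckets


def min_bucket(buckets):
    """Return the smallest doc_count and the keys of all buckets that have it."""
    if not buckets:
        return None, []
    lowest = min(b["doc_count"] for b in buckets)
    return lowest, [b["key"] for b in buckets if b["doc_count"] == lowest]


prices = [5, 12, 14, 48]
with_gaps = histogram(prices, interval=10, min_doc_count=0)
without_gaps = histogram(prices, interval=10, min_doc_count=1)
```

In this sketch `with_gaps` contains five buckets (keys 0 through 40), two of them empty, while `without_gaps` contains only the three non-empty ones. That difference is what a changed default would make visible to anyone who relied on the old behavior.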
Apache Lucene
- Upgrade Elasticsearch master to latest Lucene 5.2.x snapshot
- A new, simpler spatial API is coming, supporting the common use case of indexing points and intersecting with bounding box and polygons
- Internal simplification to expressions module to support zero-arg methods
- Classifier APIs are now more thread friendly
- Do not decode norms when scores are not needed when scoring a BooleanQuery
- Two new span queries: SpanWithinQuery, where one span is contained in another, and SpanContainingQuery, where one span contains another
- Give a clearer exception when the index contains too-old segments
- Add assertions to detect illegally set ghost bits in FixedBitSet and LongBitSet
- The query cache only kicks in if the index has at least 10K docs by default, so e.g. MemoryIndex won't cache queries
- Immense queries now just skip the query cache instead of evicting all existing entries
- BooleanScorer scores conjunctions exactly the same as BooleanScorer2 (LUCENE-6452: https://issues.apache.org/jira/browse/LUCENE-6452)
- Simplify the highlighter APIs for getting token streams
- Don't throw NullPointerException from PostingsHighlighter when some segments didn't index the field
- Iterations continue on the geo3d package
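The two query cache items above describe admission rules, which can be sketched in a few lines of Python. This is a simplified illustration and not Lucene's code: the class `SketchQueryCache`, its parameters, and the use of a single `size_bytes` number to stand in for an entry's memory cost are all assumptions made for the example. Only the 10K document threshold comes from the item above.

```python
from collections import OrderedDict


class SketchQueryCache:
    """Toy LRU cache with two admission rules (hypothetical, for illustration)."""

    def __init__(self, max_ram_bytes, min_index_docs=10_000):
        self.max_ram_bytes = max_ram_bytes
        self.min_index_docs = min_index_docs
        self.used_bytes = 0
        self.entries = OrderedDict()  # query -> (result, size_bytes)

    def put(self, query, result, size_bytes, index_doc_count):
        # Rule 1: small indices are never cached.
        if index_doc_count < self.min_index_docs:
            return False
        # Rule 2: an entry larger than the whole cache is skipped,
        # so it cannot evict every existing entry on its way in.
        if size_bytes > self.max_ram_bytes:
            return False
        if query in self.entries:
            self.used_bytes -= self.entries.pop(query)[1]
        # Evict least recently used entries until the new one fits.
        while self.used_bytes + size_bytes > self.max_ram_bytes:
            _, (_, evicted_size) = self.entries.popitem(last=False)
            self.used_bytes -= evicted_size
        self.entries[query] = (result, size_bytes)
        self.used_bytes += size_bytes
        return True

    def get(self, query):
        if query not in self.entries:
            return None
        self.entries.move_to_end(query)  # mark as recently used
        return self.entries[query][0]
```

With a 100-byte budget, for example, an entry from a 500-document index is rejected by the first rule, and a 1000-byte entry is rejected by the second without disturbing anything already cached.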
Watch This Space
Stay tuned to this blog, where we'll share more news on the whole ELK ecosystem including news, learning resources and cool use cases!