This Week in Elasticsearch and Apache Lucene: The second release candidate for 5.2.0 is out
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
How @SunFoundation uses #Elasticsearch + #Haystack to improve their criminal justice data inventory http://t.co/yOhFUZ0ZZw #Django #Python
— Leslie Hawthorn (@lhawthorn) June 2, 2015
Elasticsearch Core
- Cat API: Add wildcard support for header names (#11367, 2.0.0)
- Aggregations: Fixed Moving Average prediction to calculate the correct keys (#11375, 2.0.0)
- Aggregations: Sibling Pipeline Aggregations can now be nested in SingleBucketAggregations (#11380, 2.0.0)
- Search bug: `fielddata_fields` query string parameter was ignored (#11368, 2.0.0, 1.6.0, 1.5.3)
- Internal: make JNA optional for tests and move classes to bootstrap package (#11378, 2.0.0, 1.6.0)
- Internal: catch `UnsatisfiedLinkError` on JNA load (#11385, 2.0.0)
- Internal: deduplicate field names returned by simpleMatchToFullName & simpleMatchToIndexNames in `FieldMappersLookup` (#11377, 2.0.0)
- Test: filter out colons in test section names (#11389, 2.0.0)
- Store: Consolidate directory lock obtain code (#11390, 2.0.0, 1.6.0)
- Translog: Acquire index writer lock before renaming translog file (#11396, 1.6.0)
- Engine: Ignore 3x segment upgrade if unneeded (#11383, 1.6.0)
- Packaging: Export hostname as environment variable for plugin manager (#11399, 2.0.0, 1.6.0)
- Store: Fall back to reading `SegmentInfos` from Store if reading from commit fails (#11403, 2.0.0)
- Store: Fix stream version check in `ShardActiveRequest` (#11407, 1.6.0)
- Serialization: Remove old version checks (#11397, 2.0.0)
- Internal: remove unused code. (#11381, 2.0.0)
- Settings: Prevent changing the number of replicas on a closed index (#11410, 2.0.0, 1.6.0)
- Internal: Close lock even if we fail to obtain (#11412, 2.0.0)
- Query API: Better exception if array passed to `term` query. (#11384, 2.0.0)
- Cleanup: Consolidate shard level modules without logic into `IndexShardModule` (#11416, 2.0.0)
- Java API: Fix typed parameters in `IndexRequestBuilder` and `CreateIndexRequestBuilder` (#11382, 2.0.0)
- Snapshot/Restore: Fix check for locations in a repository path (#11426, 1.6.0)
- Recovery: Reduce cluster update reroutes with async fetch (#11421, 2.0.0, 1.6.0)
- Snapshot/Restore: Add support for applying setting filters when displaying repository settings (#11431, 1.6.0)
- Recovery: Synced flush 1.6 backport (#11417, 1.6.0)
- Parent/Child: Fix `_parent.type` validation (#11436, 2.0.0)
- Internal: Better error messages when mlockall fails (#11433, 2.0.0)
- Search: Release search contexts after failed dfs or query phase for d… (#11434, 2.0.0, 1.6.0)
- Core: Read segment info from latest commit whenever possible (#11361, 2.0.0)
- Internal: Simplify `Transport*OperationAction` names (#11349, 2.0.0)
- Build: maven warnings when running assembly (#11354, 2.0.0)
- Highlighting: Wildcard field names in highlighting should only return fields that can be highlighted (#11364, 2.0.0, 1.6.0)
- Bulk: Throw exception if unrecognized parameter in bulk action/metadata line (#11331, 2.0.0, 1.6.0)
- Aggregations: Add Holt-Winters to moving_avg aggregation (#11043, 2.0.0)
- Aggregations fix: queries with `size=0` break aggregations that need scores (#11358, 2.0.0)
- Upgrade API: Refactor upgrade API to use transport and write minimum compatible version that the index was upgraded to (#11333, 2.0.0, 1.6.0)
- Snapshot/Restore: fix `FSRepository` location configuration (#11284, 1.6.0)
- Search: Do not specialize TermQuery vs. TermsQuery (#11308, 2.0.0)
- Recovery: Move index sealing terminology to synced flush (#11336, 2.0.0, 1.6.0)
- TermVector: REST spec for termvector missing paths on 1.x branch (#11087, 1.6.0, 1.5.3)
- Dependencies: Upgrade Jackson to 2.5.3 (#11307, 2.0.0, 1.6.0)
- Internal: tighten up our compression framework (#11279, 2.0.0)
- Aggregations: Unify script and template requests across codebase (#11164, 2.0.0)
- Suggester: Remove filter from `PhraseSuggester` collate (#11195, 2.0.0)
- Snapshot/Restore: Add support for applying setting filters when displaying repository settings (#11270, 2.0.0)
- Allocation: Display low disk watermark to be consistent with documentation (#11313, 2.0.0, 1.6.0)
- Packaging: Add common systemd file for RPM/DEB package (#10725, 2.0.0, 1.6.0)
- Settings: Read configuration file with .yaml suffix (#10909, 2.0.0, 1.6.0, 1.5.3)
- Discovery: Prevent over allocation for multicast ping request (#10896, 2.0.0, 1.6.0)
- Java API: Deprecate `async` replication (#10642, 1.6.0)
- Test: enable scripts on demand via annotation and randomize default script settings (#10303, 2.0.0, 1.6.0)
- Mapping: add an assertion to verify consistent serialization (#10472, 2.0.0, 1.6.0)
- Java API: Adding setters or making them public in `ActionRequests` (#8123, 2.0.0)
- Parent/Child: Refactoring to improve memory consumption and query execution (#6511, 2.0.0)
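To make the Cat API item above (#11367) more concrete, here is a minimal sketch of what wildcard selection of header names could look like. It is not Elasticsearch's implementation: the function name `select_headers` and the sample column names are assumptions for illustration, and Python's `fnmatch` patterns are used as a stand-in (they also accept `?` and `[...]`, which the Cat API may not).

```python
from fnmatch import fnmatchcase


def select_headers(available, requested):
    """Return the available column names matched by the requested patterns.

    Results follow the order of the requested patterns and contain no duplicates.
    This is an illustrative helper, not part of any Elasticsearch API.
    """
    selected = []
    for pattern in requested:
        for name in available:
            # fnmatchcase does a case-sensitive shell-style match; '*' matches any run of characters.
            if fnmatchcase(name, pattern) and name not in selected:
                selected.append(name)
    return selected


# Hypothetical column names, chosen only for the example.
columns = ["index", "docs.count", "docs.deleted", "store.size"]
print(select_headers(columns, ["index", "docs.*"]))
```

With a helper like this, a request for `docs.*` expands to every column whose name starts with `docs.`, so callers do not have to list each header explicitly.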
Apache Lucene
- The second release candidate for 5.2.0 is out
- Norms will be moved off-heap, making use of Lucene's recently added random-access input APIs (`RandomAccessInput`)
- Now you can disable `ant test` from failing if zero tests were run (https://issues.apache.org/jira/browse/LUCENE-5283)
- Lucene's default filesystem locking implementation, `NativeFSLockFactory`, had a nasty bug letting you release locks someone else had acquired
- Simplify Lucene's locking APIs to prevent such bugs and add additional best-effort lock checks before destructive file system operations
- Near-real-time readers should always reflect any past commits or commit user-data
- A new suggest API for the document-based `NRTSuggester` mirrors Lucene's query APIs and adds support for fuzzy suggestions and context filtering
- A new `BreakIterator` splits on an arbitrary character, making it easier to highlight whole values in a multi-valued field (https://issues.apache.org/jira/browse/LUCENE-6485)
- Simplify `ParallelCompositeReader` by flattening any incoming reader hierarchy down to a simple list of parallel leaf readers, preventing future bugs like failing to call reader closed listeners
- `WindowsFS`, a mock filesystem Lucene uses for testing that "acts like Windows" even on unix, had some race conditions when files are concurrently opened and deleted (https://issues.apache.org/jira/browse/LUCENE-6499)
- Let subclasses check `FieldType`'s frozen state so users can implement custom field types (https://issues.apache.org/jira/browse/LUCENE-6497)
- More iterations to improve span queries, including some great cleanups and single pass rewrites for span multi-term queries
- More iterations on a simplified "point in shape" geo spatial API, with sizable performance gains by using prefix terms to visit far fewer terms during searching
- Iterations continue on improving geo3d to model the earth more accurately as an ellipsoid instead of a simple sphere
- BKD trees, also for fast "point in shape" geo spatial queries, get a bit faster, with a new visualization showing the change
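The `BreakIterator` item in the list above is easier to picture with a small example. The sketch below is not Lucene code: it assumes the values of a multi-valued field have been joined with a single separator character, and the names `value_boundaries` and `highlight_values` are hypothetical. It only illustrates why breaking on one chosen character makes it simple to highlight whole values.

```python
def value_boundaries(text, separator):
    """Return (start, end) offsets of each value in text, split on one separator character."""
    if len(separator) != 1:
        raise ValueError("separator must be a single character")
    spans = []
    start = 0
    for i, ch in enumerate(text):
        if ch == separator:
            # A boundary falls exactly at the separator, never inside a value.
            spans.append((start, i))
            start = i + 1
    spans.append((start, len(text)))
    return spans


def highlight_values(text, separator, term):
    """Wrap every whole value that contains term in <em> tags."""
    parts = []
    for start, end in value_boundaries(text, separator):
        value = text[start:end]
        parts.append(f"<em>{value}</em>" if term in value else value)
    return separator.join(parts)


# '|' stands in for whatever separator joins the field's values (an assumption).
print(highlight_values("red apple|green pear|red plum", "|", "red"))
```

Because boundaries fall only at the separator, a match inside a value always highlights that value in full rather than a sentence-like fragment of it.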
Watch This Space
Stay tuned to this blog, where we'll share more news on the whole ELK ecosystem including news, learning resources and cool use cases!