June 9, 2015

This Week in Elasticsearch and Apache Lucene: Algorithms that power Lucene and Elasticsearch


Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.

Top News

Video of my #bbuzz talk is now online: Algorithms that power Lucene and Elasticsearch https://t.co/CVQQWQndAC
— Adrien Grand (@jpountz) June 3, 2015

Elasticsearch Core

Bulk API: Allow null values in the bulk action/metadata line parameters (#11459, 2.0.0, 1.6.0)
Internal: Use the smallest version rather than the default version (#11475, 1.6.0)
Core: Reduce shard inactivity timeout to 5m (#11479, 2.0.0, 1.6.0)
Aggs: Allow aggregations_binary to build and parse (#11473, 2.0.0, 1.6.0, 1.5.3)
Aggs: Fix bug where moving_avg prediction keys are appended to previous prediction (#11465, 2.0.0)
Network: Default to binding to loopback address (#11483, 2.0.0)
Internal: Minimize the usage of guava classes in interfaces, return types, arguments (#11501, 2.0.0)
Transport: ClusterHealth shouldn't fail with "unexpected failure" if master steps down while waiting for events (#11493, 2.0.0, 1.6.0)
Dependencies: update maven-assembly-plugin to 2.5.5 (#11518, 2.0.0)
Snapshot/Restore: Blob store shouldn't try deleting the write.lock file at the end of the restore process (#11517, 2.0.0)
GatewayAllocator: reset rerouting flag after error (#11519, 2.0.0, 1.6.0)
Build: Don't shade core artifacts (#11522, 2.0.0)
Settings: Make prompt placeholders consistent with existing placeholders (#11514, 1.6.0)
Scripting: Execute Scripting Engine before searching for inner templates in template query (#11512, 2.0.0)
Plugins: deprecate addQuery methods that are going to be removed in 2.0 (#11532, 1.6.0)
Plugins: one single global way to register custom query parsers (#11481, 2.0.0)
Dependencies: use released lucene 5.2 jar (#11534, 2.0.0)
Recovery: Fix recovered translog ops stat counting when retrying a batch (#11536, 2.0.0)
Core: Improve exception message when shard has a partial commit (segments_N file) due to prior disk full (#11539, 1.6.0)
Core: Add node setting to send SegmentInfos debug output to System.out (#11546, 2.0.0, 1.6.0)
Suggest API: Deprecate filter option in PhraseSuggester collate (#11445, 1.6.0)
Core: Fail shard if search execution uncovers corruption (#11440, 2.0.0, 1.6.0)
Mapping: Refactor core index/query time properties into FieldType (#11422, 2.0.0)
Mapping: Added epoch date formats to configure parsing of unix dates (#11453, 2.0.0)
Snapshot/Restore: Sync up snapshot shard status on a master restart (#11450, 2.0.0, 1.6.0)
Settings: Require units for time and byte-sized settings, take 2 (#11437, 2.0.0)
Recovery: Restart recovery upon mapping changes during translog replay (#11363, 2.0.0)
IdsQueryBuilder: Allow to add a list in addition to array (#11409, 2.0.0)
Settings: Rename settings to prevent watcher clash (#11359, 2.0.0)
Internal: Allow ActionListener to be called on the network thread (#10573, 2.0.0, 1.6.0)
Snapshot/Restore: Improve snapshot creation and deletion performance on repositories with large number of snapshots (#8969, 2.0.0)
REST: Add all meta fields to the top level json document in search response (#8131, 2.0.0)
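Two of the items above touch on time values: the shard inactivity timeout is now written as 5m, and time and byte-sized settings must carry a unit. As a rough illustration of what "requiring units" means in practice, here is a minimal Python sketch. It is not Elasticsearch's actual parser: the function name `parse_time_setting`, the unit table and the error messages are all assumptions made for this example.

```python
import re

# Hypothetical unit table (milliseconds per unit); an assumption for this
# sketch, not the list of units Elasticsearch itself accepts.
_TIME_UNITS_MS = {
    "ms": 1,
    "s": 1_000,
    "m": 60_000,
    "h": 3_600_000,
    "d": 86_400_000,
}

# A whole number followed directly by a unit suffix, e.g. "5m" or "500ms".
_TIME_RE = re.compile(r"^(\d+)([a-z]+)$")


def parse_time_setting(name, value):
    """Return the setting value in milliseconds, rejecting unitless input."""
    match = _TIME_RE.match(value.strip().lower())
    if match is None:
        raise ValueError(
            f"setting [{name}] with value [{value}] is missing a unit or is malformed"
        )
    number, unit = match.groups()
    if unit not in _TIME_UNITS_MS:
        raise ValueError(f"setting [{name}] has unknown time unit [{unit}]")
    return int(number) * _TIME_UNITS_MS[unit]
```

With this sketch, `parse_time_setting("inactive_time", "5m")` yields 300000 milliseconds, while a bare `"300000"` raises a `ValueError` instead of being silently interpreted in some default unit. The setting name used here is only a placeholder.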

Apache Lucene

5.2.0 is released!
There's a sudden push to find and fix the IBM J9 JVM bugs that Lucene's tests uncover, after this response on the Elasticsearch forums: let cyberneko peek into package-protected APIs; temporarily disable a test because something is wrong with how J9 handles Unicode filenames; ignore the ClassCache reaper thread; and an ArrayIndexOutOfBoundsException in fieldcache (sidestepped if you pass -Xint to J9)
Tests now assert that queries do not compute scores when they were asked not to
Simplify Lucene's file-based lock API to prevent future bugs like this baddie
Properly handle co-linear path segments in geo3d
Fast experimental BKD tree based geo-spatial search has landed, and there's a new issue to speed up its polygon intersection queries, sharing logic from GeoPointField, with a new London, UK video showing the improvement
Some small cleanups to the new document-based suggester APIs
Split up one of Lucene's monster tests so devs with god-like boxes can run them concurrently
Geo3D can now model the earth more accurately as a slightly squashed sphere
Do not load norms if TermQuery won't compute scores
Lucene's segments_N file should directly store the version that wrote it, and the version of the oldest segment in the index
A new expert constructor will let you create IndexWriter from an already opened IndexReader, letting you efficiently upgrade reader to reader+writer
QueryNodeImpl.removeFromParent was doing nothing in a very costly manner
IndexWriter should not accept a lock timeout: such logic should be done at a higher level
More iterations to add a fast point-in-shape geo-spatial API
A commit with no user-data changes should also be reflected in NRT reopen
ant idea was failing to copy code style settings for IDEA
Improve how SpanMultiTermQueryWrapper limits which terms to search when there are too many
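To give a feel for what modelling the earth as a "slightly squashed sphere" changes, the Python sketch below computes the distance from the earth's center to the surface at a given latitude on an ellipsoid. It uses the standard WGS84 constants as an assumption; the list above does not say which model Geo3D adopts, and this is not Geo3D's code. The function name `geocentric_radius_m` is made up for the example.

```python
import math

# WGS84 ellipsoid constants (assumed here for illustration only).
WGS84_A = 6378137.0                  # semi-major (equatorial) axis, meters
WGS84_F = 1.0 / 298.257223563        # flattening
WGS84_B = WGS84_A * (1.0 - WGS84_F)  # semi-minor (polar) axis, meters


def geocentric_radius_m(lat_degrees, a=WGS84_A, b=WGS84_B):
    """Distance from the ellipsoid's center to its surface at a geodetic latitude."""
    phi = math.radians(lat_degrees)
    cos_phi, sin_phi = math.cos(phi), math.sin(phi)
    numerator = (a * a * cos_phi) ** 2 + (b * b * sin_phi) ** 2
    denominator = (a * cos_phi) ** 2 + (b * sin_phi) ** 2
    return math.sqrt(numerator / denominator)


if __name__ == "__main__":
    for lat in (0, 45, 90):
        print(f"latitude {lat:>2} deg: radius {geocentric_radius_m(lat):.1f} m")
```

On a perfect sphere the radius is the same everywhere; on this ellipsoid it shrinks by roughly 21 km between the equator and the poles, which is the kind of difference a more accurate planet model accounts for.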

Watch This Space

Stay tuned to this blog, where we'll share more on the whole ELK ecosystem, including news, learning resources and cool use cases!
