This Week in Elasticsearch and Apache Lucene: Meetup: Scaling Elasticsearch & Autosuggest in Lucene
Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Top News
For our friends in Vancouver, B.C.
Polyglot meetup next Wed night two talks "Scaling Elasticsearch & Autosuggest in Lucene" http://t.co/aR98iTwzpd
— Tavis Rudd (@tavisrudd) May 11, 2015
Elasticsearch Core
- Reducers: Adding Average Bucket Aggregation (#11010, 2.0.0)
- Reducers: Adding Count Bucket Aggregation (#11015, 2.0.0)
- Reducers: Adding Sum Bucket Aggregation (#11013, 2.0.0)
- Dependencies: Remove Codehaus repository (#11014, 2.0.0, 1.6.0, 1.5.3)
- Parent/child: Deprecate the top_children query (#11022, 2.0.0, 1.6.0)
- Translog: Make modifying operations durable by default (#11011, 2.0.0)
- Internal: Fix NullPointerException in PendingDelete#toString (#11032, 2.0.0, 1.6.0, 1.5.3)
- More Like This: Remove percent_terms_to_match (#11030, 2.0.0)
- Mappings: Remove traverse functions from Mapper (#11027, 2.0.0)
- Mappings: Remove mapper listeners (#11045, 2.0.0)
- Transport: read/write support for list of strings (#11056, 2.0.0)
- Logging: Truncate log messages at 10,000 characters by default (#11050, 2.0.0)
- Shadow replica: Add path.shared_data, change index.data_path to be relative (#11065, 2.0.0)
- ThreadPool: Make sure no leaking threads are left behind in case of initialization failure (#11061, 2.0.0, 1.6.0)
- Java api: Unify SearchResponse and BroadcastOperationResponse code around shards header (#11064, 2.0.0, 1.6.0)
- Parent/child: Remove the top_children query (#11028, 2.0.0)
- REST: Unify query_string parameters parsing (#11057, 2.0.0, 1.6.0)
- Aggregations: Fix geo_bounds aggregation when longitude is 0 (#11090, 2.0.0, 1.6.0, 1.5.3)
- Forbidden: Ban PathUtils.get (for now, until we fix the two remaining issues) (#11069, 2.0.0)
- Time measurement: Use System.nanoTime for elapsed time (#11058, 2.0.0, 1.6.0)
- Settings: Remove file based index templates (#11052, 2.0.0)
- Translog: Remove Translog interface (#10988, 2.0.0)
- Startup: Ensure JNA is fully loaded when it's available, but don't fail if it's not (#10989, 2.0.0)
- Internal: Remove double exception handling that causes false replica failures (#10990, 2.0.0)
- Startup: Add pid file to Environment (#10986, 2.0.0)
- Security Manager: Run groovy scripts with no permissions (#10969, 2.0.0)
- Geo: Remove local lucene spatial package (#10966, 2.0.0)
- Translog: Use buffered translog type also when sync is set to 0 (#10993, 2.0.0, 1.6.0, 1.5.3)
- Scripting: Minor TimeZone Fix (#10994, 2.0.0)
- PluginManager: Let HTTPS work correctly (#10983, 2.0.0)
- Startup: bail if ES is run as root (#10970, 2.0.0)
- Scripts: Load fielddata on behalf of scripts (#10997, 2.0.0)
- Mappings: Remove ability to disable _source field (#10915, 2.0.0)
- More Like This: Deprecate the API (#10982, 1.6.0)
- Reducers: Derivative Aggregation x-axis units normalisation (#10898, 2.0.0)
- Reducers: Rename Moving Average models to their "common" names (#10964, 2.0.0)
- Mappings: Wait for mappings to be available on the primary before indexing. (#10949, 2.0.0)
- More Like This: removal of the MLT API (#11003, 2.0.0)
- Mappings: numeric_resolution should only apply to dates provided as numbers. (#11002, 2.0.0, 1.6.0)
- Scripting: Allow script language to be null when parsing (#10976, 2.0.0, 1.6.0, 1.5.3)
- Environment: Only check existence for absolute paths in env.resolveConfig() (#10854, 2.0.0)
- Query DSL: Remove filter parsers (#10985, 2.0.0)
- Docs: Cleanup meta field docs (#10912, 2.0.0)
- REST: Remove (dfs_)query_and_fetch from the REST API (#10864, 2.0.0)
- Security Manager: Tighten up script security more (#10999, 2.0.0)
- Internal: prevent injection of unannotated dynamic settings (#10763, 2.0.0)
- Java api: remove redundant BytesQueryBuilder in favour of using WrapperQueryBuilder internally (#10919, 2.0.0)
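One item in the list above, using System.nanoTime for elapsed time (#11058), reflects a general Java practice: System.currentTimeMillis follows the wall clock and can jump when the clock is adjusted, while System.nanoTime is meant for measuring intervals. The following is only a minimal sketch of that practice, not the Elasticsearch change itself; the class name ElapsedTimer and its methods are hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper illustrating interval measurement with System.nanoTime.
// It is not taken from the Elasticsearch code base.
public class ElapsedTimer {
    // nanoTime values are only meaningful relative to each other,
    // so we store a starting point and subtract from it later.
    private final long startNanos = System.nanoTime();

    /** Nanoseconds elapsed since this timer was created. */
    public long elapsedNanos() {
        return System.nanoTime() - startNanos;
    }

    /** Elapsed time converted to whole milliseconds. */
    public long elapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos());
    }

    public static void main(String[] args) {
        ElapsedTimer timer = new ElapsedTimer();
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += i; // some work to time
        }
        System.out.println("sum=" + sum + ", elapsed ms: " + timer.elapsedMillis());
    }
}
```

The difference of two nanoTime readings is the only value worth reporting; a single reading is not a timestamp.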
Apache Lucene
- TermsQuery improvements: use less memory, especially with many terms, by compressing terms using prefix-coding, and fix all constructors to not create many Term objects
- The geo3d package landed (https://issues.apache.org/jira/browse/LUCENE-6196), adding fast planar 3D shape filtering to Lucene's spatial module
- PackedQuadPrefixTree landed (https://issues.apache.org/jira/browse/LUCENE-6422), providing faster spatial queries, and a more compact index in some cases
- Fix equals and hashCode for span queries
- Cleanup: remove and refactor redundant Scorer code
- Change enumeration of all strings accepted by an automaton to an efficient iterator (https://issues.apache.org/jira/browse/LUCENE-6365)
- Add a common suggest API that mirrors Lucene's Query/IndexSearcher APIs (https://issues.apache.org/jira/browse/LUCENE-6459)
- Allow arbitrary context filtering with AnalyzingInfixSuggester (https://issues.apache.org/jira/browse/LUCENE-6464)
- Add min/max "from" count to query-time join
- Highlighter should eagerly drop 0-score fragments
- Improve how FuzzyQuery computes scores by disregarding document frequency variations between the top matched terms
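The TermsQuery item above mentions compressing terms with prefix-coding. As a rough illustration of the general idea (and not Lucene's actual implementation, which is not shown in this post), sorted terms can be stored as the number of leading characters shared with the previous term plus the remaining suffix. The class PrefixCodingSketch and everything in it are hypothetical names chosen for this sketch, and it works on Java strings rather than on encoded bytes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of prefix-coding over sorted strings.
// It illustrates the idea only and is not Lucene code.
public class PrefixCodingSketch {

    /** One encoded term: length of the prefix shared with the previous term, plus the rest. */
    public static final class Entry {
        public final int shared;
        public final String suffix;

        public Entry(int shared, String suffix) {
            this.shared = shared;
            this.suffix = suffix;
        }
    }

    /** Encodes terms in order; sorting them first makes shared prefixes as long as possible. */
    public static List<Entry> encode(List<String> sortedTerms) {
        List<Entry> out = new ArrayList<>();
        String prev = "";
        for (String term : sortedTerms) {
            int max = Math.min(prev.length(), term.length());
            int shared = 0;
            while (shared < max && prev.charAt(shared) == term.charAt(shared)) {
                shared++;
            }
            out.add(new Entry(shared, term.substring(shared)));
            prev = term;
        }
        return out;
    }

    /** Rebuilds the original terms by reusing the prefix of the previously decoded term. */
    public static List<String> decode(List<Entry> entries) {
        List<String> out = new ArrayList<>();
        String prev = "";
        for (Entry e : entries) {
            String term = prev.substring(0, e.shared) + e.suffix;
            out.add(term);
            prev = term;
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("search", "searcher", "searching", "seat");
        for (Entry e : encode(terms)) {
            System.out.println(e.shared + " + \"" + e.suffix + "\"");
        }
    }
}
```

With many terms that share long prefixes, only the short suffixes are stored in full, which is where the memory saving comes from.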
Watch This Space
Stay tuned to this blog, where we'll share more news on the whole ELK ecosystem including news, learning resources and cool use cases!