November 14, 2016

This Week in Elasticsearch and Apache Lucene - 2016-11-14

Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.

All you need to know about #Elasticsearch 5.0 - Part 1 - Search https://t.co/nPDQk124oR
— Itamar Syn-Hershko (@synhershko) November 3, 2016

Elasticsearch Core

Changes in 2.x:

Reduced memory usage on client nodes by changing ShardActiveResponseHandler from holding on to the entire cluster state to just keeping the cluster state version.
Binary fields indexed in 1.x indices were not readable in 2.x.

Changes in 5.0:

The vm.max_map_count setting on systemd systems required a reboot to take effect. We now apply this setting during package install.
On Debian, the start-stop daemon was being backgrounded, which caused important exceptions to be swallowed.
ES_JVM_OPTIONS were being ignored on SysV systems.
Under failure conditions, a thread's original context could be changed before the thread returned to the pool, so the changed context would persist the next time the thread was used.
The response consumer in the Java REST client is stateful and shouldn't be reused, which we were doing when retrying failed requests.
2.x indices with TTL enabled were failing because the TTL query tries to access now.
Temporary index-* generational blobs created during snapshotting should be cleaned up, otherwise they prevent further snapshots.
Joda time has been updated to v2.9.5.
GET _snapshot/_all was returning duplicate in-progress snapshots.
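
The thread context fix above can be pictured with a small Python sketch. It is not the Elasticsearch implementation: the names stashed_context and get_context are hypothetical, and a thread-local dict stands in for the real thread context. The point it shows is that restoring the original context in a finally block keeps a failed task from leaking its context to the next task that runs on the same pooled thread.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from contextlib import contextmanager

# Hypothetical stand-in for a per-thread request context.
_context = threading.local()


def get_context() -> dict:
    """Return the current thread's context, or an empty one if unset."""
    return getattr(_context, "headers", {})


@contextmanager
def stashed_context(headers: dict):
    """Install a temporary context and always restore the original one."""
    original = get_context()
    _context.headers = dict(headers)
    try:
        yield
    finally:
        # Runs on success and on failure, so the pooled thread is
        # handed back with the context it started with.
        _context.headers = original


def failing_task():
    with stashed_context({"user": "alice"}):
        raise RuntimeError("simulated failure")


def read_context() -> dict:
    return get_context()


if __name__ == "__main__":
    # A single worker guarantees both tasks run on the same thread.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pool.submit(failing_task).exception()
        print(pool.submit(read_context).result())
```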

Changes in 5.x:

Painless now supports d or D to denote decimal constants, and will suggest using a long constant if an integer is too big. Also, it now supports null safe dereferences like foo?.bar()?.baz?.
Every cluster state update causes data nodes to check if there are shards that can be deleted. A cache now makes this process much more efficient.
Task cancellation should wait for all child tasks to receive the cancellation request before returning.
The simple query string query can now run an all-fields query across every field, instead of using the _all field.
Alias names now have the same validation as index names (except for the lowercase restriction).

X-Pack:

Automatons used for FieldPermissions should be minimised in order to be thread safe.
The original ThreadContext should be restored after a preserved context is restored to prevent leaving previous users in the context.
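
To illustrate the null safe dereference mentioned for Painless above, here is a minimal Python sketch of the behaviour an expression such as foo?.bar()?.baz is meant to have: evaluation stops and yields null as soon as any link in the chain is null. The helper null_safe and the Foo and Baz classes are hypothetical and exist only for this illustration; they are not part of Painless.

```python
from typing import Any, Callable, Optional


def null_safe(value: Any, *steps: Callable[[Any], Any]) -> Optional[Any]:
    """Apply each step in turn, short-circuiting to None on the first None."""
    for step in steps:
        if value is None:
            return None
        value = step(value)
    return value


class Baz:
    def __init__(self, baz):
        self.baz = baz


class Foo:
    def __init__(self, inner):
        self._inner = inner

    def bar(self):
        return self._inner


def chain(foo):
    """Python analogue of the expression foo?.bar()?.baz."""
    return null_safe(foo, lambda f: f.bar(), lambda b: b.baz)


if __name__ == "__main__":
    print(chain(Foo(Baz(42))))  # every link is present
    print(chain(Foo(None)))     # bar() returns None, so the chain stops
    print(chain(None))          # foo itself is None
```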

Changes in 6.0:

Index templates can now specify multiple patterns (and the parameter has changed from template to index_patterns).
Index and delete operations should not mutate the version and version type of a request. Instead, this is the caller's responsibility.
The cluster state is now a truly immutable class.
The cat-thread-pool request should return unbounded queues as -1 instead of null.

X-Pack:

Watcher is being deguiced (its use of Guice dependency injection is being removed) and generally cleaned up.
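
As a sketch of the index template change noted above, the following Python snippet rewrites a template body that uses the old template parameter into one that uses index_patterns, then adds a second pattern. Only the two parameter names come from the text; the helper migrate_template_body, the pattern values and the settings are made up for illustration.

```python
import json


def migrate_template_body(body: dict) -> dict:
    """Return a copy of a template body using index_patterns instead of template."""
    migrated = {key: value for key, value in body.items() if key != "template"}
    if "template" in body:
        # The old parameter held a single pattern; the new one holds a list.
        migrated["index_patterns"] = [body["template"]]
    return migrated


# Hypothetical old-style body with a single pattern.
old_body = {"template": "logs-*", "settings": {"number_of_shards": 1}}

new_body = migrate_template_body(old_body)
# With a list, a second pattern can sit alongside the first.
new_body["index_patterns"].append("metrics-*")

if __name__ == "__main__":
    print(json.dumps(new_body, indent=2))
```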

Ongoing changes:

Rank evaluation now treats unranked docs as irrelevant, and provides an option to ignore them.
Adding support for range fields and the corresponding queries (similar to geoshape queries).
REST URL parsing is undergoing a cleanup with the intention of being able to return 405 method not allowed.
Closing a shard when the store was not initialised could cause an NPE.
Painless is getting the Elvis operator.
Index*AlreadyExistsExceptions are to be replaced by ResourceAlreadyExistsException.
Delete requests should not have a body like the clear-scroll API has today.
Should Painless support binary values?
Plugins should have an extension point for adding custom cluster state metadata.
Non-master nodes should not change the cluster state.
Aliases and wildcards should be accepted in the indices_boost query.

X-Pack:

Blocking calls should not be made on the cluster state update thread.
Watch status should be updatable while watches are executing.
Watcher is gaining a JIRA action.
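
The rank evaluation item above can be made concrete with a small precision-at-k sketch. It assumes that "unranked" refers to documents that have no relevance rating. The function precision_at_k, its ignore_unlabeled flag and the sample data are hypothetical and do not reflect the actual rank evaluation API.

```python
from typing import Dict, List


def precision_at_k(hits: List[str], ratings: Dict[str, bool], k: int,
                   ignore_unlabeled: bool = False) -> float:
    """Fraction of the top k hits that are rated relevant.

    Hits without a rating count as irrelevant by default. With
    ignore_unlabeled=True they are dropped before computing the ratio.
    """
    top = hits[:k]
    if ignore_unlabeled:
        top = [doc for doc in top if doc in ratings]
    if not top:
        return 0.0
    relevant = sum(1 for doc in top if ratings.get(doc, False))
    return relevant / len(top)


# Hypothetical search result and ratings; "c" has no rating at all.
hits = ["a", "b", "c", "d"]
ratings = {"a": True, "b": False, "d": True}

if __name__ == "__main__":
    print(precision_at_k(hits, ratings, k=4))
    print(precision_at_k(hits, ratings, k=4, ignore_unlabeled=True))
```

In this sample, treating the unrated document as irrelevant gives 2 relevant hits out of 4, while ignoring it gives 2 out of 3.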

Apache Lucene

A few resources missed the migration from people.apache.org to home.apache.org
Index time sorting should allow sorting on multi-valued fields as well
FST packing is rarely used and adds a lot of complex code so we removed it
Multiple pre/post tags and phrase queries do not work together in FastVectorHighlighter
It is time to upgrade ICU from 56.1 to 58.1
FastVectorHighlighter fails to highlight phrase queries that have stop words that were removed at index time
The default sorting behaviour for missing values is always confusing
Our changes-to-html ant task annoyingly fails when Apache's JIRA instance is not reachable
The classic QueryParser should not parse an explicit query string differently depending on the default operator
JapaneseTokenizerFactory leaks file handles
Our release smoke testing tool does not work under cygwin on Windows
Attempting to index a too-massive single text field now causes an IllegalArgumentException instead of otherwise very confusing int overflow exceptions
TermAutomatonQuery, which is a powerful query that generalizes other proximity queries (SpanQuery, PhraseQuery, MultiPhraseQuery), now rewrites to simpler queries when the word-graph is trivial
UnifiedHighlighter now has extension points for custom queries
SpanNotQuery lets you specify the allowed range of overlap between sub-queries
Similarities now also explain exactly how they compute inverse document frequency
The new BooleanSimilarity bypasses all scoring and uses just the query's boost as the hit score
Fully dense norms and doc values should not waste memory on a fully set bitset
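
The BooleanSimilarity item above describes a scoring rule simple enough to sketch in a few lines of Python. This is only an illustration of the described behaviour, not Lucene code: the function boolean_score and the sample term frequencies are hypothetical. A matching document scores exactly the query boost, however often the term occurs in it.

```python
def boolean_score(term_freq: int, boost: float = 1.0) -> float:
    """Score a document as described: a match earns the boost and nothing else."""
    return boost if term_freq > 0 else 0.0


# Hypothetical term frequencies for one query term in three documents.
term_freqs = {"doc1": 1, "doc2": 50, "doc3": 0}

scores = {doc: boolean_score(tf, boost=2.0) for doc, tf in term_freqs.items()}

if __name__ == "__main__":
    for doc, score in scores.items():
        print(doc, score)
```

Here doc1 and doc2 receive the same score even though the term occurs far more often in doc2, and doc3 does not match at all.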

Watch This Space

Stay tuned to this blog, where we'll share more news on the whole Elastic ecosystem including news, learning resources and cool use cases!
