January 18, 2016

This Week in Elasticsearch and Apache Lucene - 2016-01-18

Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.

Top News

Looking to upgrade your #Elasticsearch deployment from 1.x to 2.x? We’ve got the video for you & it’s OnDemand now! https://t.co/YX5v4H0yaX
— elastic (@elastic) January 15, 2016

Elasticsearch Core

Changes in 2.2:

An unrecognised content type passed to an update request used to throw a NullPointerException.
The transport client now throws an exception when plugin.types is used, to help point users to addPlugin.
Support for secondary accounts on Azure plugins broke setups with only a primary account.
Filter/Filters aggregations were creating weights more often than needed, resulting in a performance regression.
Pending tasks were reporting an incorrect time-in-queue (off by 1000x) because of a bad conversion from nanoseconds to milliseconds.
A circular reference on an AlreadyClosedException could cause a stack overflow during rendering.
ignore_unavailable wasn't being respected when applied to aliases with closed indices.
Multiple types in the search URL were not filtered properly when an unknown type was present.
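
The time-in-queue item above is a unit-conversion slip, and a factor of 1000 is exactly what a wrong divisor produces. The sketch below is not the actual Elasticsearch change: `QueueTime`, `buggyMillis` and `millis` are hypothetical names, and the direction of the original error is an assumption made for illustration. It shows how dividing nanoseconds by 1,000 instead of 1,000,000 inflates the result, and how `TimeUnit` sidesteps the choice of constant.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not Elasticsearch code: converts an elapsed
// System.nanoTime() delta into milliseconds.
final class QueueTime {
    private QueueTime() {}

    // Hand-rolled conversion with the wrong divisor: the result is really
    // microseconds, so it reads 1000x too large when labelled milliseconds.
    static long buggyMillis(long elapsedNanos) {
        return elapsedNanos / 1_000L;
    }

    // Letting TimeUnit do the arithmetic avoids picking the wrong constant.
    static long millis(long elapsedNanos) {
        return TimeUnit.NANOSECONDS.toMillis(elapsedNanos);
    }

    public static void main(String[] args) {
        long elapsedNanos = 2_500_000_000L; // 2.5 seconds
        System.out.println(buggyMillis(elapsedNanos)); // 2500000
        System.out.println(millis(elapsedNanos));      // 2500
    }
}
```

Using `TimeUnit` keeps the source and target units visible at the call site, which makes this kind of mistake easier to spot in review.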

Changes in 2.x:

A URL filter on type could leak the type name into a highlighting request.
Percolate queries which use "now" in a date range were not working with the mpercolate API.
Cross-fields queries on non-string fields were broken.
The disk allocator didn't play nicely with file systems that don't report file system usage.

Changes in master:

5-minute and 15-minute load averages are now available on Linux again, and now on FreeBSD as well, but the format will probably change from an array to an object.
Shards with heavy indexing loads will get a greater share of the indexing buffer.
Master stopped using Java serialization a long time ago and, to guard against reintroduction, Serializable is now banned.
All dynamic index settings have been moved from the shard level to the index level as part of the great settings cleanup.
Get-alias and Cat-alias now return open and closed indices by default.

Ongoing:

Ingest node:
- Pipeline configuration is now stored in the cluster state, instead of in an index, in order to simplify update notifications.
- Ingest requests (which specify a pipeline) will now be forwarded to ingest nodes.
- Proper ingest methods added to the Java API.
- Ingest now uses the indexing threadpool instead of having a dedicated threadpool.
- Added the de-dot processor for converting dots in fieldnames to underscores.
- The simulate API now supports tracking of processor IDs across on_failure/compound processors, for easier tracking on the client side.
Search refactoring:
- Validation of geoshapes now happens in ShapeBuilders.
- All aggregations, highlighters, and rescorers are now refactored!
- Still waiting on suggesters, sort, rescore, and inner hits, the last of which depends on everything else.
The reindex API has been merged into feature/reindex, but still needs to be integrated with the task management API.
The task management API will soon be able to connect parent tasks with their children.
The new scripting language is gaining throw and try/catch functionality, and the ability to detect infinite loops.
Possibly adding a fixed-point mapping type.
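
To make the de-dot item in the ingest list above concrete, here is a minimal sketch of the transformation itself: renaming field names so that dots become a separator such as an underscore. It is not the ingest processor's implementation or API. `DeDot` and `deDot` are hypothetical names, and the sketch assumes a flat map of top-level fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of de-dotting, not the ingest processor itself.
final class DeDot {
    private DeDot() {}

    // Returns a copy of the document in which every "." in a field name
    // is replaced by the given separator. Values are left untouched.
    static Map<String, Object> deDot(Map<String, Object> doc, String separator) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : doc.entrySet()) {
            out.put(e.getKey().replace(".", separator), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("user.name", "kimchy");
        doc.put("status", 200);
        System.out.println(deDot(doc, "_")); // {user_name=kimchy, status=200}
    }
}
```

One caveat of this simple approach: if the document already contains both `user.name` and `user_name`, the two collapse into one key and the later entry wins.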

Apache Lucene

Lucene continues to migrate from Subversion to git, and improvements to the workaround script are still being made in the meantime
A number of improvements to TeeSinkTokenFilter, including removing the confusing SinkFilter
We are simultaneously releasing Lucene 5.3.2 and 5.4.1 and discussing the next major (6.0.0) Lucene release, exposing interesting challenges
A rare corner-case bug in reading 5.4.0 doc values, uncovered by Lucene's randomized testing, is quite nasty, prompting the upcoming 5.4.1 release
Lucene's release smoke tester should not check future versions for backwards compatibility
An invalid long-to-int cast causes an ArrayIndexOutOfBoundsException when loading large (2.1+ GB) field cache entries
The confusion matrix in the classifier module can now give you its overall precision and recall
The SimpleText codec was not writing dimensional values correctly
LuceneTestCase will now use standardized language tags to represent the randomized Locale
Our default BytesTermAttribute implementation hits NullPointerException if the term is null
StemmerOverrideFilter may be buggy
Minimum should match and synonyms struggle to co-exist in query parsers in Lucene 5.x
More tricky geo query test failures
We should add a query to test for precisely equal dimensional values
Remove StoredDocument and friends before releasing Lucene 6.0.0
Missing @Override annotations should fail the build
PrefillTokenStream lets you specify exactly which tokens to iterate
JapaneseAnalyzer's decompounding messes up PhraseQuery matching
JapaneseTokenizer now offers more than two possible tokenizations
A new LSH (locality sensitive hashing) TokenFilter and query offer an alternative to the standard MoreLikeThisQuery
MoreLikeThisQuery should keep track of which terms came from which fields
RAMDirectory sometimes fails to throw EOFException if you try to seek beyond the end of the file
Unordered span queries differ in how they measure the allowed span from ordered span queries
SpanPositionQueue could be specialized to improve JIT performance
Codec level encryption offers fine-grained control over which parts of the index need encryption
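
The long-to-int cast item in the list above reflects a general Java pitfall: a narrowing cast silently wraps once a value exceeds `Integer.MAX_VALUE` (2,147,483,647), and the resulting negative number fails later as an array index. The sketch below is not the Lucene code in question. `NarrowingCast` and its methods are hypothetical, and it assumes Java 8 or later for `Math.toIntExact`.

```java
// Hypothetical illustration of the pitfall, not the Lucene field cache code.
final class NarrowingCast {
    private NarrowingCast() {}

    // A plain cast keeps only the low 32 bits, so large values wrap around.
    static int unchecked(long size) {
        return (int) size;
    }

    // Math.toIntExact throws ArithmeticException instead of wrapping.
    static int checked(long size) {
        return Math.toIntExact(size);
    }

    public static void main(String[] args) {
        long size = 2_200_000_000L; // roughly 2.2 GB in bytes
        System.out.println(unchecked(size)); // -2094967296
        try {
            checked(size);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected: " + e.getMessage());
        }
    }
}
```

Failing fast at the conversion point gives a clearer error than an out-of-bounds access somewhere downstream.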

Watch This Space

Stay tuned to this blog, where we'll share more news on the whole Elastic ecosystem including news, learning resources and cool use cases!
