Welcome to This Week in Elasticsearch and Apache Lucene! With this weekly series, we're bringing you an update on all things Elasticsearch and Apache Lucene at Elastic, including the latest on commits, releases and other learning resources.
Learn how @Walmart uses #Elasticsearch for near real time retail analytics at #Elasticon. Register now: https://t.co/jPA4E4iSgB pic.twitter.com/sjdBcWrbS7
— elastic (@elastic) February 17, 2017
Batched Search Reduce Phases
Over the last two weeks we refactored several parts of our search layer to improve testability and to make room for new features. With the addition of cross cluster search a couple of weeks ago, the ability to search across a very large number of shards became a much higher priority. Batched search reduce phases are the first step towards removing the artificial soft limit of 1000 shards per search request. Shard results can now be reduced in batches (by default 512 shards at once), which frees resources as soon as possible and avoids the high memory consumption that the soft limit was introduced to prevent. At this point only aggregations are reduced in batches, but further work is already under way and is expected to land in version 5.4.
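The idea of reducing in batches can be sketched in a few lines of Python. This is an illustration only, not Elasticsearch's implementation: the `batched_reduce` function, the use of `Counter` as a stand-in for a terms aggregation result, and the buffering strategy are all assumptions made for the example. Only the default batch size of 512 comes from the description above.

```python
from collections import Counter


def batched_reduce(shard_results, batch_size=512):
    """Merge per-shard term counts while buffering at most `batch_size` results.

    `shard_results` is any iterable of Counter objects (a hypothetical stand-in
    for per-shard aggregation results).
    """
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    buffer = []
    for result in shard_results:
        buffer.append(result)
        if len(buffer) == batch_size:
            # Collapse the buffered results into a single partial result so the
            # individual shard results can be garbage collected early.
            buffer = [sum(buffer, Counter())]
    # Final reduce over whatever is left in the buffer.
    return sum(buffer, Counter())
```

Because the partial result goes back into the buffer, memory use is bounded by the batch size rather than by the total number of shards, and the final answer is the same as reducing everything at once.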
Circuit breaker accounting leak
This week we saw a two-day, round-the-clock debugging spree around a circuit breaker accounting bug. We had been chasing reports of clusters rejecting requests because of the request circuit breaker, but couldn't pin them down. Last Monday, Jason Bryan reported that his private cloud cluster displayed the same symptoms. Restarting the cluster made it accessible again, but plotting the request circuit breaker value over time clearly showed a leak: something was incrementing it by a few MB every 2.5 minutes. Once those MBs consumed 70% of the node's memory, the node rejected every request. After some more hours, and late into the night, we correlated the leak with the snapshotting logic in Cloud. That was puzzling, as Cloud only snapshots every 30 minutes. More digging revealed that the Cloud call to list the snapshots timed out, dropped the connection and then immediately retried. That explained the faster cycle, but we still had no clue what was happening on the Elasticsearch side. Since it was our own cluster, Jason Tedor built a custom jar with some debugging logic and the results were surprising: it wasn't snapshot related at all. If the client closed a connection to the REST layer before Elasticsearch could respond, we failed to mark the request's resources as freed. The snapshot listing call's only fault was being slow: the cluster had 426 snapshots in it, each with many indices, and S3 was simply slow to read (sometimes >5m). This issue has been fixed and the fix will be part of the imminent 5.2.2 release. We deployed the build candidate to Jason's cluster and confirmed that the leak is gone. Thanks again to Jason Bryan and the Cloud team for working with us and jumping through hoops to get this resolved.
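The shape of this bug is easy to reproduce in miniature. The following Python sketch is a toy model, not Elasticsearch code: `RequestBreaker`, `handle_leaky` and `handle_fixed` are hypothetical names, and the byte counts are made up. It only illustrates how a reservation that is released solely on the "response sent" path leaks whenever the client disconnects first.

```python
class RequestBreaker:
    """Toy request circuit breaker: tracks reserved bytes against a limit."""

    def __init__(self, limit_bytes):
        self.limit_bytes = limit_bytes
        self.used_bytes = 0

    def reserve(self, n):
        if self.used_bytes + n > self.limit_bytes:
            raise MemoryError("request circuit breaker tripped")
        self.used_bytes += n

    def release(self, n):
        self.used_bytes -= n


def handle_leaky(breaker, size, client_connected):
    """Buggy pattern: the release only happens on the path that sends a response."""
    breaker.reserve(size)
    if not client_connected:
        return  # client went away; the reservation is never released
    breaker.release(size)


def handle_fixed(breaker, size, client_connected):
    """Release on every exit path, whether or not a response is sent."""
    breaker.reserve(size)
    try:
        if not client_connected:
            return
        # ... the response would be sent here ...
    finally:
        breaker.release(size)
```

With the leaky handler, every timed-out call leaves its bytes accounted for, so a slow endpoint that is retried on a fixed schedule eventually trips the breaker for all requests, which matches the steady growth seen in the plot.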
New Rally nested track
Rally got a new track called "Nested", which indexes a subset of a Stack Overflow dump using nested documents. It runs nested queries, nested aggregations and nested sorts, as well as simple queries that do not use the nested structure but still have to mask nested docs. This should keep us better informed about performance improvements and regressions related to nested documents in the future.
Completion suggester learns to deduplicate suggestions
Lucene's near-real-time document-based suggester, exposed as the new completion suggester and context suggester in Elasticsearch 5.x, is a powerful auto-suggest implementation that stands out because it respects deleted documents and can apply filters. It also supports an analyzer to normalize the different ways users type what is in fact the same suggestion. However, it was missing duplicate removal, which is a big limitation for use cases such as suggesting author names from your index, where prolific authors may have written many documents. Under the hood, the suggester builds a per-segment finite-state transducer (FST), where each path consists of the analyzed suggestion string followed by a vInt encoding of the document ID. This means that the FST has already done the hard part of deduplication: all duplicates share a single path up to the document ID, at which point the path branches out to every document with that suggestion. We now take advantage of this to efficiently prune partial paths that can only lead to duplicate suggestions, so that Lucene 6.5.0 and Elasticsearch 5.4.0 will offer the option to remove duplicates.
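To see why the shared prefix makes deduplication cheap, here is a minimal Python sketch. It is not how Lucene is implemented: a sorted list of `(analyzed_suggestion, doc_id)` pairs stands in for the FST paths, the `suggest` function and its parameters are hypothetical, and weights and ranking are ignored entirely. The point is only that once one document has been emitted for a suggestion, all remaining branches under that suggestion can be skipped in a single jump.

```python
from bisect import bisect_left


def suggest(entries, prefix, size, skip_duplicates=True):
    """Return up to `size` (suggestion, doc_id) pairs that start with `prefix`.

    `entries` must be a sorted list of (analyzed_suggestion, doc_id) pairs.
    """
    results = []
    # (prefix,) sorts before every (prefix..., doc_id) pair, so this finds
    # the first candidate entry.
    i = bisect_left(entries, (prefix,))
    while i < len(entries) and len(results) < size:
        text, doc_id = entries[i]
        if not text.startswith(prefix):
            break
        results.append((text, doc_id))
        if skip_duplicates:
            # Jump past every remaining doc ID under the same suggestion,
            # which mirrors pruning the branches below the shared path.
            i = bisect_left(entries, (text, float("inf")), i)
        else:
            i += 1
    return results
```

For example, with three documents by "stephen king" and one by "stephenie meyer", a lookup for "steph" returns each author once when `skip_duplicates` is set, instead of filling the result list with the same name.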
Changes in 5.3:
- Indexing operations delayed by a block on IndexShardOperationsLock were being executed on a new thread without the ThreadContext.
- Lazy load the GeoIp database in the geoip ingest processor.
- The queue for pipelined HTTP requests should be set to small initially in case pipelined requests are never made.
- The secure settings keystore CLI should be able to handle an empty response to a Y/N question.
- The releasing listener should be called even when a client disconnects before the response is sent, to ensure correct circuit breaker accounting. Pipelined requests require special handling.
- Don't set local node on cluster state used for node join validation.
- Various casting bugs in Painless have been fixed.
- In update scripts, `ctx._now` should return epoch time in milliseconds, not a value derived from `System.nanoTime`.
Changes in 5.x:
- To discover the latest snapshot in snapshot/restore, prefer listing index-N blobs over reading index-latest.
- The `WordDelimiterGraphTokenFilter`, like the `SynonymGraphTokenFilter`, can handle phrase queries correctly.
- No need to sort search results before merging them with Lucene's top docs utilities.
- The indices settings API should accept CBOR and SMILE.
- Script and template objects should be converted to JSON internally, regardless of their original encoding.
- Painless scripts called in a context which expects a primitive will now return `0` or `false` if there is a `null` return value.
- Painless implements more interfaces, allowing previously hard coded expectations to be configured at compile time.
- Implemented batched reduction of search responses, and exposed `batched_reduce_size` in the search API.
- Reducing Netty's network receive predictor size to 32kB reduces GC while maintaining throughput, but this may not be the right value for heaps smaller than 4GB.
- Upgraded HDRHistogram to 2.1.9.
- Added support for breaking on word, line, sentence, or character to the Fast Vector Highlighter. Support will also be added to the Unified Highlighter.
- The percolator can now extract terms from the `MultiPhraseQuery`.
- Check for a trailing slash in S3 URLs, which could cause an empty directory to be created.
- Avoid over-allocating memory in StringBuilder.
- Deprecate global `repositories.azure` config settings in favour of specifying settings via the API.
Changes in master:
- The Java High Level REST client now supports named xcontent parsers, which will be used for parsing aggregation and suggester responses. Also added support for BulkRequest, `DeleteRequest`, and `UpdateRequest` (https://github.com/elastic/elasticsearch/pull/23266). Made `BulkRequest` and `UpdateRequest` support `ToXContent`, and `Suggest` and `Suggestion` support `FromXContent`.
- Make the translog aware of the range of sequence numbers that a single contains.
- The S3 plugin needs `getCredentials()` to be wrapped in a `doPrivileged()` block, as it can open a socket with `connect` privileges.
- Adding free file space from multiple mount points can overflow a `long`.
. - Document write requests should be immutable, and should not update the version number internally.
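The `long` overflow mentioned in the list above is worth a small illustration. Python integers do not overflow, so the sketch below emulates Java's 64-bit wrap-around with a hypothetical `wrap_to_long` helper. The saturating sum is shown as one plausible way to guard against the problem and is an assumption of this example, not a description of the actual fix.

```python
LONG_MAX = 2**63 - 1  # java.lang.Long.MAX_VALUE


def wrap_to_long(value):
    """Emulate Java's two's-complement wrap-around for a 64-bit long."""
    return (value + 2**63) % 2**64 - 2**63


def naive_total(free_bytes):
    """Sum per-mount free space the way unchecked Java long addition would."""
    total = 0
    for n in free_bytes:
        total = wrap_to_long(total + n)
    return total


def saturating_total(free_bytes):
    """Sum per-mount free space, clamping at LONG_MAX instead of wrapping."""
    total = 0
    for n in free_bytes:
        total = min(total + n, LONG_MAX)
    return total
```

If any mount point reports a value close to `LONG_MAX`, the naive sum wraps to a negative number, while the saturating version stays at the maximum.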
Coming up:
- Write failures should also be replicated from primary to replica to ensure that there are no gaps in the sequence number history.
Apache Lucene
- The `TopDocs.merge` API, used to merge hits from multiple shards, gets support for multiple cascaded reduce phases
- The near-real-time document suggester can now efficiently remove duplicates when retrieving suggestions
- We long ago removed MergeScheduler.clone but the javadocs were never updated
- Some old code reading future to-become-ancient index formats can now be removed from Lucene `master`
- Lucene will now, finally, record the original Lucene version that created an index
- If a graph token stream is created when parsing a user's query in double quotes we now create `SpanQueries` to prevent possible combinatoric explosion
- The classes that implement merge aborting and IO throttling are very hairy and intertwined and should be simplified
- `CommonGramsQueryFilter`, which removes unigram tokens when bigram tokens are present, was creating a broken, disconnected graph at search time, and both `ShingleFilter` and `CJKBigramFilter` also create disconnected graphs, breaking query parsing
- `FilterCodecReader` was failing to override and delegate all of `super`'s methods
- Our numerous grouping collectors could be simplified if we just factor out how the group value is selected into a new `GroupSelector`
- If we deprecated index time boosts then length normalization factors, taking one byte per field X document by default, could be more accurate
- `BlockPackedReader` will now throw a more descriptive `CorruptIndexException` when an invalid `bitsPerValue` is seen
- We have removed `GraphQuery` since all necessary interpretations of a graph token stream now happen while parsing the query
- Managing our GnuPG keys used for signing release bits is challenging
- `CharTokenizer` derived tokenizers should be able to change the max token length
- `OneMergeWrappingMergePolicy` lets you change each merge the merge policy chose before it's executed
- Lucene has a number of ancient release artifacts that we should prune
- A user trying to understand how to do drill sideways with range faceting led to adding a new example to our demo facets sources
- Our evil `TestRandomChains`, which strings together random tokenizer components and then feeds them random text, uncovered a sneaky corner case bug in the new `SimplePatternTokenizer` and `SimplePatternSplitTokenizer`, but another `TestRandomChains` failure remains unexplained
- Now that `ToParentBlockJoinCollector` is gone we can again try to make it easier to get the per-hit matching scorers in `DisjunctionScorer`
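The first item in the list above, cascaded reduce phases for merging top hits, relies on the fact that merging partial top-k lists gives the same result as merging all shards at once. The Python sketch below illustrates that property. It is not Lucene's `TopDocs.merge`: the function names, the `(score, shard, doc)` tuple layout and the tie-breaking order (shard, then doc) are assumptions made for the example.

```python
import heapq
from itertools import islice


def _rank(hit):
    """Order hits by descending score, then by shard and doc to break ties."""
    score, shard, doc = hit
    return (-score, shard, doc)


def merge_top_hits(hit_lists, k):
    """Merge lists of hits, each already sorted best-first, and keep the top k."""
    return list(islice(heapq.merge(*hit_lists, key=_rank), k))


def cascaded_merge(hit_lists, k, batch_size):
    """Merge in two phases: first per batch of shards, then across the partial results."""
    partials = [
        merge_top_hits(hit_lists[i:i + batch_size], k)
        for i in range(0, len(hit_lists), batch_size)
    ]
    return merge_top_hits(partials, k)
```

Each partial result is itself a sorted top-k list, so it can be fed back into the same merge. That is what allows a coordinating node to discard per-shard hit lists as soon as their batch has been merged.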
Watch This Space
Stay tuned to this blog, where we'll share more news on the whole Elastic ecosystem including news, learning resources and cool use cases!