July 15, 2015

Better query execution coming to Elasticsearch 2.0

It’s time to forget everything you knew about queries and filters: Elasticsearch 2.0 will make much better decisions by itself instead of relying on users to formulate an optimized query. This change will be almost invisible at the API level but let’s dive into the internal changes that made it possible. Most changes that are mentioned in this blog post have been done in Lucene 5.0, 5.1 and 5.2 and will be integrated in Elasticsearch 2.0.

The query/filter merge

Previous versions of Elasticsearch used to handle queries and filters as different citizens: you had queries that could match and score, and filters that could only match and were cacheable. In Elasticsearch 2.0, queries and filters are the same internal object, and these objects can be configured to score documents, or to skip scoring. Then the query DSL makes sure to propagate the information correctly: for instance the must clause of a bool query would need to produce scores if the bool query needs to produce scores too, while a must_not clause would never need to produce scores since it may only be used for filtering documents out.

In order to make the query DSL more consistent with this change, we have deprecated the filtered query in favour of a new filter clause on the bool query: filter clauses are like must clauses except that they do not contribute to the score. This means that the following query:

{
  “filtered” : {
    “query”: { query definition },
    “filter”: { filter definition }
  }
}

should now be replaced with

{
  “bool” : {
    “must”: { query definition },
    “filter”: { filter definition }
  }
}

Note that the query DSL is still backward compatible in spite of this change: if you try to run a filtered query, it will parse as a bool query internally. However, we would encourage you to migrate to the new syntax as the filtered query will be removed in a future release.

While this might look like a minor change, this is actually very useful for us. For instance, we used to have 3 queries/filters that could perform conjunctions: the bool query, the bool filter and the filtered query, and it happens that they all computed conjunctions in a slightly different way. The fact that we only have the bool query now allows us to perform optimizations in a more robust way, and in particular to leverage two-phase execution efficiently.

Two-phase execution

The previous filter API could be consumed in two ways: either using iterators over matching documents, or using an optional random-access API that allowed to check if a particular document matched the filter or not. Everything is good so far, except that the best way to consume a filter depended on which kind of filter you had: for instance the script filter was more efficient when using the random-access API while the bool filter was more efficient using the iterator API. This was quite a nightmare to optimize and was the root cause why the bool filter on the one hand and the and and or filters on the other hand performed differently.

To fix this issue, Lucene introduced a new two-phase iteration API, which consists of an approximation phase that can quickly iterate over a superset of the matching documents, and a verification phase that can be used to check if a document in this superset actually matches the query. A good example of use of this API is the phrase query: if you search for “quick fox”, a good approximation would be to search for documents that contain both “quick” and “fox”, and then as a verification step we could read positions to see if we can find these terms at consecutive positions.

Why is it useful? Imagine that you are searching for “quick fox” and applying a filter as well. The fact that we can dissociate the approximation from the verification phase allows us to first intersect the approximation with the filter, so that we will verify positions on a smaller set of documents. This two-phase iteration pattern also applies to other queries: for instance geo-distance queries can use a bounding box as an approximation and a distance computation as a verification, and filters that only make sense in a random-access fashion such as the script filter can return all documents in the index as an approximation and run the script as a verification.

A nice side-effect of this change is that the bool, and and or filters now behave the same, there is no need to pick one depending on the wrapped filters. By the way, we have deprecated the and and or filters and now encourage you to use bool instead.

Smarter filter caching

One issue that we often saw in previous Elasticsearch releases is that some queries were slow due to over-caching of filters. Take the term filter as an example: it can directly return iterators that are backed by disk-based postings lists, and these postings lists include skip lists that allow for efficient skipping, which is typically useful for conjunctions. However, if you want to cache the results, you need to consume the postings list entirely and load its content into memory, which means you can’t skip over unnecessary documents quickly anymore. In summary, filters should never be cached if they are not going to be reused, otherwise caching them is wasteful.

Elasticsearch changed filter caching to be entirely automatic. The way it works is that Elasticsearch keeps track of the 256 most recently used filters, and only caches those that appear 5 times or more in this history. The assumption is that over caching is more dangerous than under caching, given that uncached filters are already fast. So we would rather make sure they are reused before caching them. Also Elasticsearch does not cache anymore on segments which have less than 10000 documents or 3% of the documents of the index, since such segments are no bottleneck for query execution, on the contrary to the larger segments, and are also likely to be picked for a merge quickly since the merge policy favors small merges. As a consequence of this change, the _cache and _cache_key options are now deprecated and will be ignored if provided.

More efficient filter cache

In Elasticsearch 1.x, filters are cached using actual bit sets. While bit sets have a nice worst-case of 1 bit per existing document, this is quite wasteful when you have only a couple of bits that are sets out of several millions of existing bits. To improve this situation, Elasticsearch switched to a sparse bit set implementation for its filter cache in 2.0 called roaring bitmaps.

Also large terms filters are now prefix-compressed in order to take less memory in the filter cache. These two changes mean that you will now be able to cache more filters with the same amount of memory that you were giving to previous releases of Elasticsearch.

Better execution of multi-term queries

Multi-term queries are a family of queries that match several terms depending on some condition. For instance prefix queries match terms that start with a given sequence of bytes, regexp queries match terms that match a given regular expression, etc. All these queries are executed the same way: we create a bit set, then visit all matching terms and read their postings lists into that bit set. This approach scales well when you have many terms, unlike the bool query. One drawback however, is that it cannot leverage skipping and needs to visit all matching documents for all terms. In order to improve this, we changed multi-term queries to rewrite to a bool query when they only match a few terms in order to be more efficiently intersected with other queries.

Improved spans

Finally, span queries got some interesting improvements, and in particular now fully leverage the new two-phase iteration infrastructure. Additionally, two new span queries are exposed: span-within and span-containing.

All of these changes work together to allow Elasticsearch to make better decisions about how to execute your queries optimally. You no longer need to think about whether to use bool vs and, the order of your query clauses, or whether to cache or not. Elasticsearch now does it all for you.