WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
Queries and Filters
The DSL used by Elasticsearch has a single set of components called queries, which can be mixed and matched in endless combinations. This single set of components can be used in two contexts: filtering context and query context.
When used in filtering context, the query is said to be a "non-scoring" or "filtering" query. That is, the query simply asks the question: "Does this document match?" The answer is always a simple, binary yes or no. For example:
- Is the created date in the range 2013 - 2014?
- Does the status field contain the term published?
- Is the lat_lon field within 10km of a specified point?
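The three yes/no questions above can be sketched as query DSL bodies, built here as plain Python dicts. This is a minimal sketch under stated assumptions: the field names come from the examples, the date bounds are one reading of "2013 - 2014", the coordinates are placeholders, and `as_filter` is a hypothetical helper, not part of any client library.

```python
import json

# Assumption: "in the range 2013 - 2014" is read as the calendar years
# 2013 and 2014, so the upper bound is exclusive at the start of 2015.
created_in_range = {
    "range": {"created": {"gte": "2013-01-01", "lt": "2015-01-01"}}
}

# Exact-value check on the status field.
status_is_published = {"term": {"status": "published"}}

# Placeholder coordinates; any point could be used here.
near_point = {
    "geo_distance": {
        "distance": "10km",
        "lat_lon": {"lat": 40.715, "lon": -73.988},
    }
}


def as_filter(query):
    """Hypothetical helper: wrap a query in constant_score so that it
    runs in filtering context and no relevance score is calculated."""
    return {"query": {"constant_score": {"filter": query}}}


# Print the request body that would be sent to the search endpoint.
print(json.dumps(as_filter(status_is_published), indent=2))
```

Each body only decides whether a document matches; none of them says anything about how well it matches.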
When used in a querying context, the query becomes a "scoring" query. Like its non-scoring sibling, it determines whether a document matches, but it also determines how well the document matches.
A typical use for a query is to find documents:
- Best matching the words full text search
- Containing the word run, but maybe also matching runs, running, jog, or sprint
- Containing the words quick, brown, and fox; the closer together they are, the more relevant the document
- Tagged with lucene, search, or java; the more tags, the more relevant the document
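The scoring examples above might be expressed roughly as follows. This is a sketch, not the only way to write them: the field names `title`, `body`, and `tags` and the `slop` value are assumptions made for illustration, and `search_body` is a hypothetical helper.

```python
# Best matching the words "full text search" (assumed field: title).
best_match = {"match": {"title": "full text search"}}

# Containing "run". Whether runs, running, jog, or sprint also match
# depends on how the field is analyzed (stemming, synonyms), not on
# the query itself.
word_match = {"match": {"body": "run"}}

# All three words must be present; slop allows them to be apart, and
# documents where they sit closer together score higher.
proximity = {"match_phrase": {"title": {"query": "quick brown fox", "slop": 50}}}

# Any one tag is enough to match; each additional matching tag adds
# to the relevance score.
tagged = {
    "bool": {
        "should": [{"term": {"tags": tag}} for tag in ("lucene", "search", "java")]
    }
}


def search_body(query):
    """Hypothetical helper: place a query in scoring (query) context."""
    return {"query": query}
```

Unlike the filtering questions, each of these clauses contributes to the relevance score of the documents it matches.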
A scoring query calculates how relevant each document is to the query, and assigns it a relevance _score, which is later used to sort matching documents by relevance. This concept of relevance is well suited to full-text search, where there is seldom a completely "correct" answer.
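To make the role of `_score` concrete, here is a small sketch that ranks hits by score. The response below is hand-written in the general shape of a search response; the IDs and scores are invented for illustration and deliberately listed out of order.

```python
# Invented sample data, shaped like the "hits" part of a search response.
response = {
    "hits": {
        "total": 3,
        "hits": [
            {"_id": "a", "_score": 0.42},
            {"_id": "b", "_score": 1.37},
            {"_id": "c", "_score": 0.88},
        ],
    }
}

# Sort by relevance: highest _score first.
ranked = sorted(response["hits"]["hits"], key=lambda hit: hit["_score"], reverse=True)

for hit in ranked:
    print(hit["_id"], hit["_score"])
```

By default, Elasticsearch already returns hits ordered by descending `_score`, so client code rarely needs to sort them again; the explicit sort here only shows what that ordering means.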
Historically, queries and filters were separate components in Elasticsearch. Starting in Elasticsearch 2.0, filters were technically eliminated, and all queries gained the ability to become non-scoring.
However, for clarity and simplicity, we will use the term "filter" to mean a query which is used in a non-scoring, filtering context. You can think of the terms "filter", "filtering query" and "non-scoring query" as being identical.
Similarly, if the term "query" is used in isolation without a qualifier, we are referring to a "scoring query".
Performance Differences
Filtering queries are simple checks for set inclusion/exclusion, which make them very fast to compute. There are various optimizations that can be leveraged when at least one of your filtering queries is "sparse" (few matching documents), and frequently used non-scoring queries can be cached in memory for faster access.
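The set inclusion/exclusion idea can be illustrated with a toy model in which each filter is simply the set of document IDs it matches. This is a conceptual sketch only, not how Elasticsearch or Lucene implement filters internally; the function names and sample IDs are hypothetical.

```python
def intersect_filters(filter_matches):
    """Return the doc IDs matched by every filter.

    Starting with the sparsest filter keeps the working set small, and
    an empty intermediate result lets us stop early.
    """
    if not filter_matches:
        return set()
    ordered = sorted(filter_matches, key=len)
    result = set(ordered[0])
    for matches in ordered[1:]:
        result &= matches
        if not result:
            break
    return result


def apply_exclusions(included, excluded_sets):
    """Remove every doc ID matched by an excluding filter."""
    result = set(included)
    for excluded in excluded_sets:
        result -= excluded
    return result


# Made-up doc IDs for three filters; "nearby" is the sparse one.
published = {1, 2, 3, 5, 8}
recent = {2, 3, 4, 8, 9}
nearby = {3}

print(intersect_filters([published, recent, nearby]))
```

Because membership is a yes/no question, no per-document scoring work is needed, and a precomputed set for a frequently used filter can be reused across searches.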
In contrast, scoring queries have to not only find matching documents, but also calculate how relevant each document is, which typically makes them heavier than their non-scoring counterparts. Also, query results are not cacheable.
Thanks to the inverted index, a simple scoring query that matches just a few documents may perform as well as or better than a filter that spans millions of documents. In general, however, a filter will outperform a scoring query, and it will do so consistently.
The goal of filtering is to reduce the number of documents that have to be examined by the scoring queries.
When to Use Which
As a general rule, use query clauses for full-text search or for any condition that should affect the relevance score, and use filters for everything else.
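One way to apply this rule is a `bool` query that puts the full-text condition in `must` (scoring) and the exact conditions in `filter` (non-scoring). The sketch below assumes hypothetical field names (`title`, `status`, `created`), and `build_search` is an illustrative helper rather than a library function.

```python
def build_search(text, status, created_from, created_to):
    """Build a search body: the match clause affects relevance, while
    the filter clauses only narrow the set of candidate documents."""
    return {
        "query": {
            "bool": {
                # Scoring context: how well does the title match?
                "must": {"match": {"title": text}},
                # Filtering context: yes/no conditions, no scoring.
                "filter": [
                    {"term": {"status": status}},
                    {"range": {"created": {"gte": created_from, "lt": created_to}}},
                ],
            }
        }
    }


body = build_search("full text search", "published", "2013-01-01", "2015-01-01")
```

The filters reduce the documents the `match` clause has to score, while the ranking of the remaining documents is decided by the `match` clause alone.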