WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
Boosting Filtered Subsets
Let’s return to the problem that we were dealing with in Ignoring TF/IDF, where we wanted to score vacation homes by the number of features that each home possesses. We ended that section by wishing for a way to use cached filters to affect the score, and with the function_score query we can do just that.
The examples we have shown thus far have used a single function for all documents. Now we want to divide the results into subsets by using filters (one filter per feature), and apply a different function to each subset.
The function that we will use in this example is weight, which is similar to the boost parameter accepted by any query. The difference is that the weight is not normalized by Lucene into some obscure floating-point number; it is used as is.
The structure of the query has to change somewhat to incorporate multiple functions:
GET /_search
{
  "query": {
    "function_score": {
      "filter": {
        "term": { "city": "Barcelona" }
      },
      "functions": [
        {
          "filter": { "term": { "features": "wifi" }},
          "weight": 1
        },
        {
          "filter": { "term": { "features": "garden" }},
          "weight": 1
        },
        {
          "filter": { "term": { "features": "pool" }},
          "weight": 2
        }
      ],
      "score_mode": "sum"
    }
  }
}
Each function is applied only if the document matches its (optional) filter.
The new features to note in this example are explained in the following sections.
filter Versus query
The first thing to note is that we have specified a filter instead of a query. In this example, we do not need full-text search. We just want to return all documents that have Barcelona in the city field, logic that is better expressed as a filter instead of a query. All documents returned by the filter will have a _score of 1. The function_score query accepts either a query or a filter. If neither is specified, it will default to using the match_all query.
functions
The functions key holds an array of functions to apply. Each entry in the array may also optionally specify a filter, in which case the function will be applied only to documents that match that filter. In this example, we apply a weight of 1 (or 2 in the case of pool) to any document that matches the filter.
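Because every entry in the functions array has the same shape, the array can be generated instead of written by hand. The following is a minimal Python sketch of that idea. The FEATURE_WEIGHTS mapping and the build_feature_query helper are hypothetical names introduced for illustration and are not part of Elasticsearch; the code only builds the request body as a plain dictionary and does not send it to a cluster.

```python
import json

# Hypothetical mapping of feature name -> weight, mirroring the example query.
FEATURE_WEIGHTS = {"wifi": 1, "garden": 1, "pool": 2}


def build_feature_query(city, feature_weights):
    """Build a function_score request body with one weighted filter per feature."""
    functions = [
        {"filter": {"term": {"features": feature}}, "weight": weight}
        for feature, weight in feature_weights.items()
    ]
    return {
        "query": {
            "function_score": {
                # Top-level filter: restricts the result set, every hit scores 1.
                "filter": {"term": {"city": city}},
                # One entry per feature; each applies only to matching documents.
                "functions": functions,
                "score_mode": "sum",
            }
        }
    }


body = build_feature_query("Barcelona", FEATURE_WEIGHTS)
print(json.dumps(body, indent=2))
```

Serializing with json.dumps also guarantees that the body is valid JSON, which avoids hand-editing mistakes such as a stray trailing comma.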
score_mode
Each function returns a result, and we need a way of reducing these multiple results to a single value that can be combined with the original _score. This is the role of the score_mode parameter, which accepts the following values:
- multiply: Function results are multiplied together (default).
- sum: Function results are added up.
- avg: The average of all the function results.
- max: The highest function result is used.
- min: The lowest function result is used.
- first: Uses only the result from the first function that either doesn’t have a filter or that has a filter matching the document.
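The reductions listed above can be sketched in a few lines of Python. This is an illustration of the descriptions in the list, not Elasticsearch's implementation: combine is a hypothetical helper, results is assumed to hold only the results of the functions that apply to the document (in declaration order), and avg is computed as a plain arithmetic mean, which may differ from how Elasticsearch treats weights when averaging.

```python
from functools import reduce
import operator


def combine(results, score_mode="multiply"):
    """Reduce the results of the applicable functions to a single value.

    `results` lists the results of the functions that apply to the document,
    in the order in which the functions were declared.
    """
    if not results:
        raise ValueError("no function results to combine")
    if score_mode == "multiply":
        return reduce(operator.mul, results, 1.0)
    if score_mode == "sum":
        return float(sum(results))
    if score_mode == "avg":
        # Plain mean, as described in the list (an assumption of this sketch).
        return sum(results) / len(results)
    if score_mode == "max":
        return float(max(results))
    if score_mode == "min":
        return float(min(results))
    if score_mode == "first":
        # Only the first applicable function contributes.
        return float(results[0])
    raise ValueError("unknown score_mode: %r" % score_mode)


# A home with wifi, garden, and pool yields the function results 1, 1, and 2.
for mode in ("multiply", "sum", "avg", "max", "min", "first"):
    print(mode, combine([1, 1, 2], mode))
```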
In this case, we want to add the weight results from each matching filter together to produce the final score, so we have used the sum score mode.
Documents that don’t match any of the filters will keep their original _score of 1.
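To make the arithmetic concrete, here is a small Python simulation of the sum score mode for the weights used in this section. It is only a sketch under stated assumptions: the sample homes are made up, score_home is a hypothetical helper, and the summed weights are assumed to be multiplied with the original _score of 1, so the sum passes through unchanged.

```python
# Weights from the example query; the sample homes below are made up.
FEATURE_WEIGHTS = {"wifi": 1, "garden": 1, "pool": 2}


def score_home(features, feature_weights=FEATURE_WEIGHTS, base_score=1.0):
    """Simulate score_mode "sum" over one weight function per feature."""
    matched = [w for f, w in feature_weights.items() if f in features]
    if not matched:
        # No filter matched: the document keeps its original score.
        return base_score
    # Assumption: the summed weights are multiplied with the original score,
    # which is 1 for every document returned by the top-level filter.
    return base_score * sum(matched)


homes = {
    "home_a": ["wifi", "garden", "pool"],
    "home_b": ["wifi"],
    "home_c": ["pool"],
    "home_d": [],
}
for name, features in homes.items():
    print(name, score_home(features))
```

Under these assumptions, a home that offers only wifi ends up with the same score as a home with no listed features, because a single weight of 1 equals the untouched _score of 1.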