WARNING: This documentation covers Elasticsearch 2.x. The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.

This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.

« Boosting by Popularity Random Scoring »

› › ›

Boosting Filtered Subsets

edit

IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

Boosting Filtered Subsets

edit

Let’s return to the problem that we were dealing with in Ignoring TF/IDF, where we wanted to score vacation homes by the number of features that each home possesses. We ended that section by wishing for a way to use cached filters to affect the score, and with the function_score query we can do just that.

The examples we have shown thus far have used a single function for all documents. Now we want to divide the results into subsets by using filters (one filter per feature), and apply a different function to each subset.

The function that we will use in this example is the weight, which is similar to the boost parameter accepted by any query. The difference is that the weight is not normalized by Lucene into some obscure floating-point number; it is used as is.

The structure of the query has to change somewhat to incorporate multiple functions:

GET /_search
{
  "query": {
    "function_score": {
      "filter": { 
        "term": { "city": "Barcelona" }
      },
      "functions": [ 
        {
          "filter": { "term": { "features": "wifi" }}, 
          "weight": 1
        },
        {
          "filter": { "term": { "features": "garden" }}, 
          "weight": 1
        },
        {
          "filter": { "term": { "features": "pool" }}, 
          "weight": 2 
        }
      ],
      "score_mode": "sum", 
    }
  }
}

	This `function_score` query has a `filter` instead of a `query`.
	The `functions` key holds a list of functions that should be applied.
	The function is applied only if the document matches the (optional) `filter`.
	The `pool` feature is more important than the others so it has a higher `weight`.
	The `score_mode` specifies how the values from each function should be combined.

The new features to note in this example are explained in the following sections.

filter Versus query

edit

The first thing to note is that we have specified a filter instead of a query. In this example, we do not need full-text search. We just want to return all documents that have Barcelona in the city field, logic that is better expressed as a filter instead of a query. All documents returned by the filter will have a _score of 1. The function_score query accepts either a query or a filter. If neither is specified, it will default to using the match_all query.

functions

edit

The functions key holds an array of functions to apply. Each entry in the array may also optionally specify a filter, in which case the function will be applied only to documents that match that filter. In this example, we apply a weight of 1 (or 2 in the case of pool) to any document that matches the filter.

score_mode

edit

Each function returns a result, and we need a way of reducing these multiple results to a single value that can be combined with the original _score. This is the role of the score_mode parameter, which accepts the following values:

multiply: Function results are multiplied together (default).
sum: Function results are added up.
avg: The average of all the function results.
max: The highest function result is used.
min: The lowest function result is used.
first: Uses only the result from the first function that either doesn’t have a filter or that has a filter matching the document.

In this case, we want to add the weight results from each matching filter together to produce the final score, so we have used the sum score mode.

Documents that don’t match any of the filters will keep their original _score of 1.

« Boosting by Popularity Random Scoring »