Script Score Query

The script_score query allows you to modify the score of documents that are retrieved by a query. This can be useful if, for example, a score function is computationally expensive and it is sufficient to compute the score on a filtered set of documents.

To use script_score, you have to define a query and a script, that is, a function used to compute a new score for each document returned by the query. For more information on scripting, see the scripting documentation.

Here is an example of using script_score to assign each matched document a score equal to the number of likes divided by 10:

GET /_search
{
    "query" : {
        "script_score" : {
            "query" : {
                "match": { "message": "elasticsearch" }
            },
            "script" : {
                "source" : "doc['likes'].value / 10 "
            }
        }
    }
}

Note

The values returned from script_score cannot be negative. In general, Lucene requires the scores produced by queries to be non-negative in order to support certain search optimizations.

Accessing the score of a document within a script

Within a script, you can access the _score variable, which represents the current relevance score of a document.

Predefined functions within a Painless script

You can use any of the available Painless functions in the Painless script. In addition, a number of predefined functions can help you with scoring. We recommend using them instead of writing equivalent functions of your own, because they are designed to be as efficient as possible by relying on internal mechanisms.

saturation

saturation(value,k) = value/(k + value)

"script" : {
    "source" : "saturation(doc['likes'].value, 1)"
}
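
As a rough illustration of the formula above, the following Python sketch mirrors saturation outside of Elasticsearch. The Python function is only a stand-in for the Painless built-in; the sample like counts are made up.

```python
def saturation(value: float, k: float) -> float:
    """Mirror of the documented formula: value / (k + value)."""
    return value / (k + value)


# With k = 1, as in the script above, the score grows with the number of
# likes but stays below 1.
for likes in (0, 1, 9, 99):
    print(likes, saturation(likes, 1))
```

Note that value = k always yields 0.5, so k acts as the pivot at which the score reaches half of its maximum.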

sigmoid

sigmoid(value, k, a) = value^a/ (k^a + value^a)

"script" : {
    "source" : "sigmoid(doc['likes'].value, 2, 1)"
}
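
The sigmoid formula can be checked the same way. This is a minimal Python sketch of the documented expression, not the Painless implementation; the input values are illustrative only.

```python
def sigmoid(value: float, k: float, a: float) -> float:
    """Mirror of the documented formula: value^a / (k^a + value^a)."""
    return value**a / (k**a + value**a)


# value == k gives 0.5 regardless of the exponent a.
print(sigmoid(2, 2, 1))
# A larger exponent makes the curve steeper around the pivot k.
print(sigmoid(4, 2, 1), sigmoid(4, 2, 2))
```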

Functions for vector fields

Warning

This functionality is experimental and may be changed or removed completely in a future release. Elastic will take a best effort approach to fix any issues, but experimental features are not subject to the support SLA of official GA features.

These functions are used for dense_vector and sparse_vector fields.

Note

During the calculation of vector functions, all matched documents are linearly scanned. Thus, expect the query time to grow linearly with the number of matched documents. For this reason, we recommend limiting the number of matched documents with a query parameter.

For dense_vector fields, cosineSimilarity calculates the cosine similarity between a given query vector and document vectors.

{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "cosineSimilarity(params.query_vector, doc['my_dense_vector']) + 1.0", 
        "params": {
          "query_vector": [4, 3.4, -0.2]  
        }
      }
    }
  }
}

The script adds 1.0 to the cosine similarity to prevent the score from being negative.

To take advantage of the script optimizations, provide a query vector as a script parameter.
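
To see why 1.0 is added, recall that cosine similarity ranges from -1 to 1. The Python sketch below computes it by hand; the `opposite` document vector is a hypothetical worst case, not data from the example index.

```python
import math


def cosine_similarity(query_vector, doc_vector):
    """Plain cosine similarity: dot(q, d) / (|q| * |d|)."""
    dot = sum(q * d for q, d in zip(query_vector, doc_vector))
    norm_q = math.sqrt(sum(q * q for q in query_vector))
    norm_d = math.sqrt(sum(d * d for d in doc_vector))
    return dot / (norm_q * norm_d)


query_vector = [4, 3.4, -0.2]
opposite = [-4, -3.4, 0.2]  # hypothetical vector pointing the opposite way

# Roughly -1 for the opposite vector, so adding 1.0 shifts the score to about 0.
print(cosine_similarity(query_vector, opposite) + 1.0)
# Roughly 1 for an identical vector, so the shifted score is about 2.
print(cosine_similarity(query_vector, query_vector) + 1.0)
```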

Similarly, for sparse_vector fields, cosineSimilaritySparse calculates cosine similarity between a given query vector and document vectors.

{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": "cosineSimilaritySparse(params.query_vector, doc['my_sparse_vector']) + 1.0",
        "params": {
          "query_vector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
        }
      }
    }
  }
}

For dense_vector fields, dotProduct calculates the dot product between a given query vector and document vectors.

{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": """
          double value = dotProduct(params.query_vector, doc['my_vector']);
          return sigmoid(1, Math.E, -value); 
        """,
        "params": {
          "query_vector": [4, 3.4, -0.2]
        }
      }
    }
  }
}

Using the standard sigmoid function prevents scores from being negative.
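
The call sigmoid(1, Math.E, -value) reduces to the standard logistic function 1 / (1 + e^(-value)), which always lies between 0 and 1. The Python sketch below checks that identity numerically; both functions are stand-ins written for this illustration.

```python
import math


def sigmoid(value, k, a):
    """Mirror of the documented formula: value^a / (k^a + value^a)."""
    return value**a / (k**a + value**a)


def logistic(x):
    """Standard logistic function."""
    return 1.0 / (1.0 + math.exp(-x))


# 1 raised to any power is 1, so sigmoid(1, e, -x) == 1 / (e^-x + 1).
for dot in (-5.0, 0.0, 2.5):
    assert math.isclose(sigmoid(1, math.e, -dot), logistic(dot))
    assert 0.0 < sigmoid(1, math.e, -dot) < 1.0
```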

Similarly, for sparse_vector fields, dotProductSparse calculates dot product between a given query vector and document vectors.

{
  "query": {
    "script_score": {
      "query": {
        "match_all": {}
      },
      "script": {
        "source": """
          double value = dotProductSparse(params.query_vector, doc['my_sparse_vector']);
          return sigmoid(1, Math.E, -value);
        """,
        "params": {
          "query_vector": {"2": 0.5, "10" : 111.3, "50": -1.3, "113": 14.8, "4545": 156.0}
        }
      }
    }
  }
}

Note

If a document doesn’t have a value for a vector field on which a vector function is executed, 0 is returned as a result for this document.

Note

If a document’s dense vector field has a number of dimensions different from the query’s vector, 0 is used for missing dimensions in the calculations of vector functions.
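
The zero-padding rule described in the note can be pictured with a short Python sketch. It only illustrates the stated behaviour and is not Elasticsearch's code; the two-dimensional document vector is invented for the example.

```python
from itertools import zip_longest


def dot_product_padded(query_vector, doc_vector):
    """Dot product that treats missing dimensions as 0."""
    return sum(
        q * d for q, d in zip_longest(query_vector, doc_vector, fillvalue=0.0)
    )


# The document vector has two dimensions, the query vector three:
# the third dimension contributes -0.2 * 0 = 0 to the result.
print(dot_product_padded([4, 3.4, -0.2], [1, 2]))
```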

Random score function

The randomScore function generates scores that are uniformly distributed from 0 up to but not including 1.

It has the following syntax: randomScore(<seed>, <fieldName>). The seed parameter is required and must be an integer. The fieldName parameter is optional and must be a string.

"script" : {
    "source" : "randomScore(100, '_seq_no')"
}

If the fieldName parameter is omitted, the internal Lucene document ids will be used as a source of randomness. This is very efficient, but unfortunately not reproducible since documents might be renumbered by merges.

"script" : {
    "source" : "randomScore(100)"
}

Note that documents that are within the same shard and have the same value for the field will get the same score, so it is usually desirable to use a field that has unique values for all documents across a shard. A good default choice might be the _seq_no field, whose only drawback is that scores will change if the document is updated, since update operations also update the value of the _seq_no field.
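
The following Python sketch shows why equal field values lead to equal scores: the score is a deterministic function of the seed and the field value. The hash used here (SHA-256) is an assumption chosen for illustration and is not the hash Elasticsearch uses.

```python
import hashlib


def random_score(seed: int, field_value: int) -> float:
    """Deterministic pseudo-random score in [0, 1) from seed and field value."""
    digest = hashlib.sha256(f"{seed}:{field_value}".encode()).digest()
    # Keep the top 53 bits so the quotient is an exact float below 1.
    bits = int.from_bytes(digest[:8], "big") >> 11
    return bits / (1 << 53)


# Two documents with the same _seq_no value (hypothetical) get the same score.
print(random_score(100, 7) == random_score(100, 7))
# Changing the seed reshuffles the scores.
print(random_score(100, 7), random_score(101, 7))
```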

Decay functions for numeric fields

For background on decay functions, see the decay functions section of the Function Score Query documentation.

  • double decayNumericLinear(double origin, double scale, double offset, double decay, double docValue)
  • double decayNumericExp(double origin, double scale, double offset, double decay, double docValue)
  • double decayNumericGauss(double origin, double scale, double offset, double decay, double docValue)

"script" : {
    "source" : "decayNumericLinear(params.origin, params.scale, params.offset, params.decay, doc['dval'].value)",
    "params": { 
        "origin": 20,
        "scale": 10,
        "decay" : 0.5,
        "offset" : 0
    }
}

Using params allows the script to be compiled only once, even if the parameter values change.
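
To give a feel for what the three numeric decay functions compute, here is a Python sketch. It assumes the decay formulas documented for the Function Score query (linear, exponential, and Gaussian decay); the Python function names are made up for this illustration.

```python
import math


def _distance(origin, offset, doc_value):
    """Distance from origin, with everything inside the offset counted as 0."""
    return max(0.0, abs(doc_value - origin) - offset)


def decay_numeric_linear(origin, scale, offset, decay, doc_value):
    s = scale / (1.0 - decay)
    return max(0.0, (s - _distance(origin, offset, doc_value)) / s)


def decay_numeric_exp(origin, scale, offset, decay, doc_value):
    lam = math.log(decay) / scale
    return math.exp(lam * _distance(origin, offset, doc_value))


def decay_numeric_gauss(origin, scale, offset, decay, doc_value):
    sigma_sq = -(scale**2) / (2.0 * math.log(decay))
    return math.exp(-(_distance(origin, offset, doc_value) ** 2) / (2.0 * sigma_sq))


# Same parameters as the script above: origin 20, scale 10, offset 0, decay 0.5.
for fn in (decay_numeric_linear, decay_numeric_exp, decay_numeric_gauss):
    print(fn.__name__, fn(20, 10, 0, 0.5, 20), fn(20, 10, 0, 0.5, 30))
```

Under these formulas, a document at the origin scores 1, and a document exactly scale away from the origin (here dval = 30) scores decay.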

Decay functions for geo fields

  • double decayGeoLinear(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)
  • double decayGeoExp(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)
  • double decayGeoGauss(String originStr, String scaleStr, String offsetStr, double decay, GeoPoint docValue)

"script" : {
    "source" : "decayGeoExp(params.origin, params.scale, params.offset, params.decay, doc['location'].value)",
    "params": {
        "origin": "40, -70.12",
        "scale": "200km",
        "offset": "0km",
        "decay" : 0.2
    }
}
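
A geo decay can be sketched in Python as a distance computation followed by an exponential decay. Two assumptions apply: the distance is approximated with the haversine formula on a sphere of radius 6371 km, and the decay follows the exponential formula documented for the Function Score query. Parsing of strings such as "200km" is skipped, so distances are passed as plain kilometres.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def decay_geo_exp(origin, scale_km, offset_km, decay, doc_point):
    """Exponential decay on the distance between origin and doc_point."""
    distance = max(0.0, haversine_km(*origin, *doc_point) - offset_km)
    return math.exp(math.log(decay) / scale_km * distance)


origin = (40.0, -70.12)  # same origin as the script above
# A hypothetical document one degree of latitude (about 111 km) to the north.
print(decay_geo_exp(origin, 200.0, 0.0, 0.2, (41.0, -70.12)))
```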

Decay functions for date fields

  • double decayDateLinear(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)
  • double decayDateExp(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)
  • double decayDateGauss(String originStr, String scaleStr, String offsetStr, double decay, JodaCompatibleZonedDateTime docValueDate)

"script" : {
    "source" : "decayDateGauss(params.origin, params.scale, params.offset, params.decay, doc['date'].value)",
    "params": {
        "origin": "2008-01-01T01:00:00Z",
        "scale": "1h",
        "offset" : "0",
        "decay" : 0.5
    }
}

Note

Decay functions on dates are limited to dates in the default format and default time zone. Also, calculations with now are not supported.
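
For dates, the same idea applies with time differences in place of numeric distances. The Python sketch below assumes the Gaussian decay formula documented for the Function Score query and takes already-parsed datetime and timedelta values instead of strings such as "1h"; the function name is hypothetical.

```python
import math
from datetime import datetime, timedelta, timezone


def decay_date_gauss(origin, scale, offset, decay, doc_date):
    """Gaussian decay on the time difference between origin and doc_date."""
    distance = max(0.0, abs((doc_date - origin).total_seconds()) - offset.total_seconds())
    scale_s = scale.total_seconds()
    sigma_sq = -(scale_s**2) / (2.0 * math.log(decay))
    return math.exp(-(distance**2) / (2.0 * sigma_sq))


# Same parameters as the script above.
origin = datetime(2008, 1, 1, 1, 0, 0, tzinfo=timezone.utc)
scale = timedelta(hours=1)
offset = timedelta(0)

# One hour (one scale) away from the origin, the score equals decay.
print(decay_date_gauss(origin, scale, offset, 0.5, origin + timedelta(hours=1)))
```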

Faster alternatives

Script Score Query calculates the score for every hit (matching document). There are faster alternative query types that can efficiently skip non-competitive hits:

Transition from Function Score Query

We are deprecating Function Score, and Script Score Query will be a substitute for it.

Here we describe how Function Score Query’s functions can be equivalently implemented in Script Score Query:

script_score

What you used in script_score of the Function Score query, you can copy into the Script Score query. No changes here.

weight

The weight function can be implemented in the Script Score query through the following script:

"script" : {
    "source" : "params.weight * _score",
    "params": {
        "weight": 2
    }
}

random_score

Use the randomScore function, as described in the Random score function section.

field_value_factor

The field_value_factor function can be implemented through a script:

"script" : {
    "source" : "Math.log10(doc['field'].value * params.factor)",
    "params" : {
        "factor" : 5
    }
}

To check whether a document is missing a value, you can use doc['field'].size() == 0. For example, this script uses the value 1 if a document doesn’t have a field named field:

"script" : {
    "source" : "Math.log10((doc['field'].size() == 0 ? 1 : doc['field'].value) * params.factor)",
    "params" : {
        "factor" : 5
    }
}
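
The arithmetic of the script above can be traced with a small Python sketch. The list argument stands in for the document's field values (an empty list means the field is missing); the function name and sample values are assumptions made for illustration.

```python
import math


def field_value_factor_log(values, factor, missing=1):
    """log10((missing if the field is empty else its first value) * factor)."""
    value = values[0] if values else missing
    return math.log10(value * factor)


print(field_value_factor_log([20], 5))  # field present: log10(20 * 5)
print(field_value_factor_log([], 5))    # field missing: log10(1 * 5)
```

Keep in mind that the logarithm is negative whenever the product is below 1, and script_score does not accept negative scores.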

This table lists how field_value_factor modifiers can be implemented through a script:

Modifier      Implementation in Script Score
none          -
log           Math.log10(doc['f'].value)
log1p         Math.log10(doc['f'].value + 1)
log2p         Math.log10(doc['f'].value + 2)
ln            Math.log(doc['f'].value)
ln1p          Math.log(doc['f'].value + 1)
ln2p          Math.log(doc['f'].value + 2)
square        Math.pow(doc['f'].value, 2)
sqrt          Math.sqrt(doc['f'].value)
reciprocal    1.0 / doc['f'].value
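
For quick experimentation outside Elasticsearch, the modifier table can be mirrored as a Python mapping. This is only a sketch: each entry reproduces the corresponding expression from the table, and "none" is assumed to mean that the field value is used unchanged.

```python
import math

# Modifier name -> function of the field value v (doc['f'].value in the table).
MODIFIERS = {
    "none": lambda v: v,  # assumed identity: no modifier applied
    "log": lambda v: math.log10(v),
    "log1p": lambda v: math.log10(v + 1),
    "log2p": lambda v: math.log10(v + 2),
    "ln": lambda v: math.log(v),
    "ln1p": lambda v: math.log(v + 1),
    "ln2p": lambda v: math.log(v + 2),
    "square": lambda v: math.pow(v, 2),
    "sqrt": lambda v: math.sqrt(v),
    "reciprocal": lambda v: 1.0 / v,
}

for name, fn in MODIFIERS.items():
    print(name, fn(9))
```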

decay functions

The Script Score query has equivalent decay functions that can be used in a script.