Search and Query DSL changes

search_type

search_type=count removed

The count search type was deprecated since version 2.0.0 and is now removed. In order to get the same benefits, you just need to set the value of the size parameter to 0.

For instance, the following request:

GET /my_index/_search?search_type=count
{
  "aggs": {
    "my_terms": {
       "terms": {
         "field": "foo"
       }
     }
  }
}

can be replaced with:

GET /my_index/_search
{
  "size": 0,
  "aggs": {
    "my_terms": {
       "terms": {
         "field": "foo"
       }
     }
  }
}

search_type=scan removed

The scan search type was deprecated since version 2.1.0 and is now removed. All benefits from this search type can now be achieved by doing a scroll request that sorts documents in _doc order, for instance:

GET /my_index/_search?scroll=2m
{
  "sort": [
    "_doc"
  ]
}

Scroll requests sorted by _doc have been optimized to more efficiently resume from where the previous request stopped, so this will have the same performance characteristics as the former scan search type.

Search shard limit

In 5.0, Elasticsearch rejects requests that would query more than 1000 shard copies (primaries or replicas). The reason is that such large numbers of shards make the job of the coordinating node very CPU and memory intensive. It is usually a better idea to organize data in such a way that there are fewer larger shards. In case you would like to bypass this limit, which is discouraged, you can update the action.search.shard_count.limit cluster setting to a greater value.

fields parameter

The fields parameter has been replaced by stored_fields. The stored_fields parameter will only return stored fields — it will no longer extract values from the _source.

fielddata_fields parameter

The fielddata_fields has been deprecated, use parameter docvalue_fields instead.

search-exists API removed

The search exists api has been removed in favour of using the search api with size set to 0 and terminate_after set to 1.

Deprecated queries removed

The following deprecated queries have been removed:

filtered
Use bool query instead, which supports filter clauses too.
and
Use must clauses in a bool query instead.
or
Use should clauses in a bool query instead.
missing
Use a negated exists query instead. (Also removed _missing_ from the query_string query)
limit
Use the terminate_after parameter instead.
fquery
Is obsolete after filters and queries have been merged.
query
Is obsolete after filters and queries have been merged.
query_binary
Was undocumented and has been removed.
filter_binary
Was undocumented and has been removed.

Changes to queries

  • Unsupported queries such as term queries on geo_point fields will now fail rather than returning no hits.
  • Removed support for fuzzy queries on numeric, date and ip fields, use range queries instead.
  • Removed support for range and prefix queries on _uid and _id fields.
  • Querying an unindexed field will now fail rather than returning no hits.
  • Removed support for the deprecated min_similarity parameter in fuzzy query, in favour of fuzziness.
  • Removed support for the deprecated fuzzy_min_sim parameter in query_string query, in favour of fuzziness.
  • Removed support for the deprecated edit_distance parameter in completion suggester, in favour of fuzziness.
  • Removed support for the deprecated filter and no_match_filter fields in indices query, in favour of query and no_match_query.
  • Removed support for the deprecated filter fields in nested query, in favour of query.
  • Removed support for the deprecated minimum_should_match and disable_coord in terms query, use bool query instead. Also removed support for the deprecated execution parameter.
  • Removed support for the top level filter element in function_score query, replaced by query.
  • The collect_payloads parameter of the span_near query has been deprecated. Payloads will be loaded when needed.
  • The score_type parameter to the nested and has_child queries has been removed in favour of score_mode. The score_mode parameter to has_parent has been deprecated in favour of the score boolean parameter. Also, the total score mode has been removed in favour of the sum mode.
  • When the max_children parameter was set to 0 on the has_child query then there was no upper limit on how many child documents were allowed to match. Now, 0 really means that zero child documents are allowed. If no upper limit is needed then the max_children parameter shouldn’t be specified at all.
  • The exists query will now fail if the _field_names field is disabled.
  • The multi_match query will fail if fuzziness is used for cross_fields, phrase or phrase_prefix type. This parameter was undocumented and silently ignored before for these types of multi_match.
  • Deprecated support for the coerce, normalize, ignore_malformed parameters in GeoPolygonQuery. Use parameter validation_method instead.
  • Deprecated support for the coerce, normalize, ignore_malformed parameters in GeoDistanceQuery. Use parameter validation_method instead.
  • Deprecated support for the coerce, normalize, ignore_malformed parameters in GeoBoundingBoxQuery. Use parameter validation_method instead.
  • The geo_distance_range query is deprecated and should be replaced by either the geo_distance bucket aggregation, or geo_distance sort.

Top level filter parameter

Removed support for the deprecated top level filter in the search api, replaced by post_filter.

Highlighters

Removed support for multiple highlighter names, the only supported ones are: plain, fvh and postings.

Term vectors API

The term vectors APIs no longer persist unmapped fields in the mappings.

The dfs parameter to the term vectors API has been removed completely. Term vectors don’t support distributed document frequencies anymore.

Sort

The reverse parameter has been removed, in favour of explicitly specifying the sort order with the order option.

The coerce and ignore_malformed parameters were deprecated in favour of validation_method.

Inner hits

  • Top level inner hits syntax has been removed. Inner hits can now only be specified as part of the nested, has_child and has_parent queries. Use cases previously only possible with top level inner hits can now be done with inner hits defined inside the query dsl.
  • Source filtering for inner hits inside nested queries requires full field names instead of relative field names. This is now consistent for source filtering on other places in the search API.
  • Nested inner hits will now no longer include _index, _type and _id keys. For nested inner hits these values are always the same as the _index, _type and _id keys of the root search hit.
  • Parent/child inner hits will now no longer include the _index key. For parent/child inner hits the _index key is always the same as the the parent search hit.

Query Profiler

In the response for profiling queries, the query_type has been renamed to type and lucene has been renamed to description. These changes have been made so the response format is more friendly to supporting other types of profiling in the future.

Search preferences

The search preference _only_node has been removed. The same behavior can be achieved by using _only_nodes and specifying a single node ID.

The search preference _prefer_node has been superseded by _prefer_nodes. By specifying a single node, _prefer_nodes provides the same functionality as _prefer_node but also supports specifying multiple nodes.

The search preference _shards accepts a secondary preference, for example _primary to specify the primary copy of the specified shards. The separator previously used to separate the _shards portion of the parameter from the secondary preference was ;. However, this is also an acceptable separator between query string parameters which means that unless the ; was escaped, the secondary preference was never observed. The separator has been changed to | and does not need to be escaped.

Scoring changes

Default similarity

The default similarity has been changed to BM25.

DF formula

Document frequency (which is for instance used to compute inverse document frequency - IDF) is now based on the number of documents that have a value for the considered field rather than the total number of documents in the index. This change affects most similarities. See LUCENE-6711 for more information.

explain API

The fields field has been renamed to stored_fields