WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
The custom _all approach is a good solution, as long as you thought
about setting it up before you indexed your documents. However, Elasticsearch
also provides a search-time solution to the problem: the multi_match query
with type cross_fields.
The cross_fields type takes a term-centric approach, quite different from the
field-centric approach taken by best_fields and most_fields. It treats all
of the fields as one big field, and looks for each term in any field.
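As a rough illustration, the request bodies for a field-centric and a term-centric multi_match query differ only in their type. The Python sketch below builds both as plain dictionaries; the helper name multi_match_body is an assumption made for this example, and sending the request is deliberately left out.

```python
import json

def multi_match_body(query, match_type, fields, operator="and"):
    # Build a multi_match request body as a plain dict (hypothetical helper).
    return {
        "query": {
            "multi_match": {
                "query": query,
                "type": match_type,
                "operator": operator,
                "fields": fields,
            }
        }
    }

fields = ["first_name", "last_name"]
field_centric = multi_match_body("peter smith", "most_fields", fields)
term_centric = multi_match_body("peter smith", "cross_fields", fields)

# Either body could be sent to GET /_validate/query?explain to see how
# Elasticsearch rewrites it; no client code is included in this sketch.
print(json.dumps(term_centric, indent=2))
```

Keeping the body as a dictionary makes it easy to switch between the two types and compare the explanations side by side.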
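To illustrate the difference between field-centric and term-centric queries,
look at the explanation for a field-centric most_fields query that searches the
first_name and last_name fields for peter smith with the and operator, so that
all terms are required.
For a document to match, both peter and smith must appear in the same
field, either the first_name field or the last_name field: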
(+first_name:peter +first_name:smith)
(+last_name:peter  +last_name:smith)
A term-centric approach would use this logic instead:
+(first_name:peter last_name:peter)
+(first_name:smith last_name:smith)
In other words, the term
peter must appear in either field, and the term
smith must appear in either field.
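The two kinds of logic can be contrasted with a small simulation. This is a minimal Python sketch, not Elasticsearch code: the sample document, the helper names field_centric_match and term_centric_match, and the whitespace tokenization are all assumptions made for illustration.

```python
def tokens(text):
    # Naive stand-in for an analyzer: lowercase and split on whitespace.
    return text.lower().split()

def field_centric_match(doc, terms, fields):
    # (+f1:t1 +f1:t2) (+f2:t1 +f2:t2): all terms must be in the same field.
    return any(all(t in tokens(doc.get(f, "")) for t in terms) for f in fields)

def term_centric_match(doc, terms, fields):
    # +(f1:t1 f2:t1) +(f1:t2 f2:t2): each term must be in at least one field.
    return all(any(t in tokens(doc.get(f, "")) for f in fields) for t in terms)

# Hypothetical document with the name split across two fields.
doc = {"first_name": "Peter", "last_name": "Smith"}
terms = tokens("peter smith")
fields = ["first_name", "last_name"]

print(field_centric_match(doc, terms, fields))  # False
print(term_centric_match(doc, terms, fields))   # True
```

The field-centric check fails because no single field contains both terms, while the term-centric check succeeds because each term is found in some field.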
The cross_fields type first analyzes the query string to produce a list of
terms, and then it searches for each term in any field. That difference alone
solves two of the three problems that we listed in Field-Centric Queries, leaving
us just with the issue of differing inverse document frequencies.
Fortunately, the cross_fields type solves this too, as can be seen from the
explanation of the same query with its type set to cross_fields.
It solves the problem of differing IDFs by blending inverse document frequencies across fields:
+blended("peter", fields: [first_name, last_name])
+blended("smith", fields: [first_name, last_name])
In other words, it looks up the IDF of smith in both the first_name and
last_name fields and uses the minimum of the two as the IDF for both
fields. The fact that
smith is a common last name means that it will be
treated as a common first name too.
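The blending described above can be sketched numerically. The Python example below is a simplification that follows the description in the text rather than the actual Lucene implementation: the index statistics are invented, and the idf helper uses the classic TF/IDF formula 1 + ln(numDocs / (docFreq + 1)).

```python
import math

def idf(doc_freq, num_docs):
    # Classic Lucene TF/IDF formula: 1 + ln(numDocs / (docFreq + 1)).
    return 1.0 + math.log(num_docs / (doc_freq + 1))

# Hypothetical index statistics: 1,000 documents; "smith" is rare as a
# first name but common as a last name.
num_docs = 1000
doc_freq = {"first_name": 2, "last_name": 200}

# IDF of "smith" in each field on its own.
per_field = {field: idf(df, num_docs) for field, df in doc_freq.items()}

# Blending as described in the text: use the minimum IDF for both fields.
blended = min(per_field.values())

for field, value in per_field.items():
    print(f"{field}: own IDF {value:.2f}, blended IDF {blended:.2f}")
```

With these made-up numbers, the blended value equals the IDF from last_name, so a rare occurrence of smith in first_name no longer receives an outsized weight.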
For the cross_fields query type to work optimally, all fields should have
the same analyzer. Fields that share an analyzer are grouped together as
blended fields.
If you include fields with a different analysis chain, they will be added to
the query in the same way as for
best_fields. For instance, if we added the
title field to the preceding query (assuming it uses a different analyzer), the
explanation would be as follows:
(+title:peter +title:smith)
(
  +blended("peter", fields: [first_name, last_name])
  +blended("smith", fields: [first_name, last_name])
)
This is particularly important when using the minimum_should_match and
operator parameters.
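The grouping behavior can be pictured with a short sketch. The Python code below only mimics the shape of the explanation shown above; the field-to-analyzer mapping and the helper names group_by_analyzer and explain are assumptions for illustration, and the real query rewriting happens inside Elasticsearch.

```python
from collections import defaultdict

def group_by_analyzer(field_analyzers):
    # Map each analyzer name to the list of fields that use it.
    groups = defaultdict(list)
    for field, analyzer in field_analyzers.items():
        groups[analyzer].append(field)
    return dict(groups)

def explain(groups, terms):
    # Each group requires every term; a multi-field group blends its fields.
    clauses = []
    for fields in groups.values():
        if len(fields) == 1:
            parts = [f"+{fields[0]}:{term}" for term in terms]
        else:
            joined = ", ".join(fields)
            parts = [f'+blended("{term}", fields: [{joined}])' for term in terms]
        clauses.append("(" + " ".join(parts) + ")")
    return " ".join(clauses)

# Hypothetical mapping: title uses a different analyzer from the name fields.
field_analyzers = {
    "title": "english",
    "first_name": "standard",
    "last_name": "standard",
}
groups = group_by_analyzer(field_analyzers)
print(explain(groups, ["peter", "smith"]))
```

Because every group requires all terms on its own, a document matches only if all terms are found within a single group of fields, which is why mixing analyzers changes the effect of the operator setting.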
One of the advantages of using the cross_fields query over custom _all fields
is that you can boost individual fields at query time.
For fields of equal value like first_name and last_name, this generally
isn't required, but if you were searching for books using the title and
description fields, you might want to give more weight to the title field.
This can be done as described before with the caret (^) syntax: for example,
title^2 gives the title field a boost of 2, while description keeps the
default boost of 1.
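A boosted cross_fields request might look like the following Python sketch. The request body is written as a dictionary, and parse_boost is a hypothetical helper added only to make the meaning of the caret syntax explicit; it is not part of any Elasticsearch client.

```python
def parse_boost(field_spec):
    # Split "title^2" into ("title", 2.0); a bare name gets the default 1.0.
    name, sep, boost = field_spec.partition("^")
    return name, (float(boost) if sep else 1.0)

# Request body for a cross_fields query that favors the title field.
body = {
    "query": {
        "multi_match": {
            "query": "peter smith",
            "type": "cross_fields",
            "fields": ["title^2", "description"],
        }
    }
}

for spec in body["query"]["multi_match"]["fields"]:
    print(parse_boost(spec))
```

Here title carries a boost of 2 and description falls back to the default boost of 1.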
The advantage of being able to boost individual fields should be weighed
against the cost of querying multiple fields instead of querying a single
custom _all field. Use whichever of the two solutions delivers the most
bang for your buck.