WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.

This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.

› › ›

Phrase Matching

edit

IMPORTANT: This documentation is no longer updated. Refer to Elastic's version policy and the latest documentation.

Phrase Matching

edit

In the same way that the match query is the go-to query for standard full-text search, the match_phrase query is the one you should reach for when you want to find words that are near each other:

GET /my_index/my_type/_search
{
    "query": {
        "match_phrase": {
            "title": "quick brown fox"
        }
    }
}

Copy as curl View in Sense

Like the match query, the match_phrase query first analyzes the query string to produce a list of terms. It then searches for all the terms, but keeps only documents that contain all of the search terms, in the same positions relative to each other. A query for the phrase quick fox would not match any of our documents, because no document contains the word quick immediately followed by fox.

The match_phrase query can also be written as a match query with type phrase:

"match": {
    "title": {
        "query": "quick brown fox",
        "type":  "phrase"
    }
}

Copy as curl View in Sense

Term Positions

edit

When a string is analyzed, the analyzer returns not only a list of terms, but also the position, or order, of each term in the original string:

GET /_analyze?analyzer=standard
Quick brown fox

Copy as curl View in Sense

This returns the following:

{
   "tokens": [
      {
         "token": "quick",
         "start_offset": 0,
         "end_offset": 5,
         "type": "<ALPHANUM>",
         "position": 1 
      },
      {
         "token": "brown",
         "start_offset": 6,
         "end_offset": 11,
         "type": "<ALPHANUM>",
         "position": 2 
      },
      {
         "token": "fox",
         "start_offset": 12,
         "end_offset": 15,
         "type": "<ALPHANUM>",
         "position": 3 
      }
   ]
}

The position of each term in the original string.

Positions can be stored in the inverted index, and position-aware queries like the match_phrase query can use them to match only documents that contain all the words in exactly the order specified, with no words in-between.

What Is a Phrase

edit

For a document to be considered a match for the phrase “quick brown fox”, the following must be true:

quick, brown, and fox must all appear in the field.
The position of brown must be 1 greater than the position of quick.
The position of fox must be 2 greater than the position of quick.

If any of these conditions is not met, the document is not considered a match.

Internally, the match_phrase query uses the low-level span query family to do position-aware matching. Span queries are term-level queries, so they have no analysis phase; they search for the exact term specified.

Thankfully, most people never need to use the span queries directly, as the match_phrase query is usually good enough. However, certain specialized fields, like patent searches, use these low-level queries to perform very specific, carefully constructed positional searches.

« Proximity Matching Mixing It Up »

On this page

Term Positions
What Is a Phrase

Was this helpful?

Feedback

The Search AI Company

ELK Stack

Elastic Cloud

Generative AI

Search

Security

Observability

By solution

Industries

Customer spotlight

Research

Build

Learn

Connect

Phrase Matching

Phrase Matching

Term Positions

What Is a Phrase

Follow us

About us

Join us

Partners

Trust & Security

Investor relations

Excellence Awards

About us

Join us

Partners

Trust & Security

Investor relations

Excellence Awards