Synonyms can replace existing tokens or be added to the token stream by using the
synonym token filter.
First, we define a token filter of type synonym.
We discuss synonym formats in Formatting Synonyms.
Then we create a custom analyzer that uses that token filter.
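As a minimal sketch of these two steps, the index settings might look like the following, written here as a Python dictionary and printed as the JSON body of a request such as PUT /my_index. The filter name my_synonym_filter and the two synonym rules are assumptions chosen for illustration; my_synonyms is the analyzer name used in the analyze request in this section.

```python
import json

# Hypothetical names: "my_synonym_filter" and the two synonym rules are
# illustrative assumptions. "my_synonyms" matches the analyzer name used
# in the analyze request in this section.
index_settings = {
    "settings": {
        "analysis": {
            "filter": {
                # Step 1: a token filter of type "synonym".
                "my_synonym_filter": {
                    "type": "synonym",
                    "synonyms": [
                        "british,english",
                        "queen,monarch",
                    ],
                }
            },
            "analyzer": {
                # Step 2: a custom analyzer that applies the filter
                # after lowercasing.
                "my_synonyms": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "my_synonym_filter"],
                }
            },
        }
    }
}

# JSON body that could be sent when creating the index.
print(json.dumps(index_settings, indent=2))
```

The lowercase filter is placed before the synonym filter so that the rules only need to be written in lowercase.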
Synonyms can be specified inline with the
synonyms parameter, or in a
synonyms file that must be present on every node in the cluster. The path to
the synonyms file should be specified with the
synonyms_path parameter, and
should be either absolute or relative to the Elasticsearch config directory.
See Updating Stopwords for techniques that can be used to refresh the synonyms list.
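For comparison, a file-based variant of the filter definition could be sketched as below. The filter name and the path analysis/synonyms.txt are hypothetical; the only parameters taken from the text are type, synonyms, and synonyms_path.

```python
import json

# Assumption: "analysis/synonyms.txt" is a hypothetical relative path.
# The file would have to exist at that location on every node in the cluster.
file_based_filter = {
    "my_synonym_filter": {
        "type": "synonym",
        # Replaces the inline "synonyms" list.
        "synonyms_path": "analysis/synonyms.txt",
    }
}

print(json.dumps(file_based_filter, indent=2))
```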
Testing our analyzer with the
analyze API shows the following:
GET /my_index/_analyze?analyzer=my_synonyms
Elizabeth is the English queen
A document like this will match queries for any of the following:
English queen, British queen, English monarch, or British monarch.
Even a phrase query will work, because the position of
each term has been preserved.
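To illustrate why phrase queries keep working, here is a toy simulation in plain Python. It is not Elasticsearch code: the synonym groups, the analyze function, and the phrase_matches helper are all assumptions made for this sketch. Each synonym is emitted at the same position as the original term, so a phrase only has to find its words at consecutive positions.

```python
import re

# Assumed synonym groups: every member of a group expands to the whole group.
SYNONYM_GROUPS = [("british", "english"), ("queen", "monarch")]
EXPANSIONS = {term: sorted(group) for group in SYNONYM_GROUPS for term in group}


def analyze(text):
    """Lowercase, split into words, and expand synonyms in place.

    Returns a list of (position, terms) pairs with positions starting at 1.
    """
    words = re.findall(r"\w+", text.lower())
    return [(pos, EXPANSIONS.get(word, [word]))
            for pos, word in enumerate(words, start=1)]


def phrase_matches(tokens, phrase):
    """Return True if the phrase's words occur at consecutive positions."""
    terms_at = {pos: set(terms) for pos, terms in tokens}
    wanted = phrase.lower().split()
    return any(
        all(word in terms_at.get(start + offset, ())
            for offset, word in enumerate(wanted))
        for start in terms_at
    )


tokens = analyze("Elizabeth is the English queen")
for pos, terms in tokens:
    print(f"Pos {pos}: ({','.join(terms)})")

print(phrase_matches(tokens, "British monarch"))
```

In this sketch, english and british share position 4 and queen and monarch share position 5, which is why any combination of the two pairs matches as a phrase.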
Using the same
synonym token filter at both index time and search time is
redundant. If, at index time, we replace
English with the two terms
english and british, then at search time we need to search for only one of
those terms. Alternatively, if we don’t use synonyms at index time, then at
search time, we would need to convert a query for
English into a query for
english OR british.
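The equivalence of the two approaches can be sketched with a toy inverted index. This is an illustrative model only, not how Elasticsearch is implemented; the SYNONYMS table, build_index, and search are hypothetical helpers, and the two sample documents are made up for the example.

```python
# Assumed synonym table for the sketch.
SYNONYMS = {
    "english": ["english", "british"],
    "british": ["english", "british"],
}


def expand_term(word, expand):
    """Return the word itself, or its synonym group when expansion is on."""
    return SYNONYMS.get(word, [word]) if expand else [word]


def build_index(docs, expand):
    """Map each term to the set of ids of the documents that contain it."""
    index = {}
    for doc_id, text in docs.items():
        for word in text.lower().split():
            for term in expand_term(word, expand):
                index.setdefault(term, set()).add(doc_id)
    return index


def search(index, word, expand):
    """OR together the postings of every term the query expands to."""
    hits = set()
    for term in expand_term(word.lower(), expand):
        hits |= index.get(term, set())
    return hits


docs = {1: "the English queen", 2: "a British monarch"}

# Synonyms applied at index time only: a single-term query is enough.
index_time = search(build_index(docs, expand=True), "English", expand=False)

# Synonyms applied at search time only: the query becomes english OR british.
search_time = search(build_index(docs, expand=False), "English", expand=True)

print(index_time == search_time)
```

Both strategies return the same documents here, which is why applying the filter on both sides adds work without changing the result.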
Whether to do synonym expansion at search or index time can be a difficult choice. We will explore the options more in Expand or contract.