WARNING: The 2.x versions of Elasticsearch have passed their EOL dates. If you are running a 2.x version, we strongly advise you to upgrade.
This documentation is no longer maintained and may be removed. For the latest information, see the current Elasticsearch documentation.
Using Language Analyzers
The built-in language analyzers are available globally and don’t need to be configured before being used. They can be specified directly in the field mapping:
PUT /my_index
{
  "mappings": {
    "blog": {
      "properties": {
        "title": {
          "type": "string",
          "analyzer": "english"
        }
      }
    }
  }
}
Of course, by passing text through the english analyzer, we lose information. We can’t tell if the document mentions one fox or many foxes; the word not is a stopword and is removed, so we can’t tell whether the document is happy about foxes or not. By using the english analyzer, we have increased recall as we can match more loosely, but we have reduced our ability to rank documents accurately.
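The information loss described above can be illustrated with a toy sketch. The functions below (toy_english, toy_standard, crude_stem) and the small stopword set are hypothetical stand-ins written for this illustration; they are not the real english or standard analyzers, which use a proper tokenizer, a full stopword list, and a real stemmer.

```python
# Toy stand-ins for the analyzers: NOT the real Lucene implementations.
STOPWORDS = {"a", "an", "and", "for", "not", "the", "this"}  # illustrative subset only


def crude_stem(token):
    """Strip a plural ending; a deliberately naive substitute for a real stemmer."""
    if len(token) > 3 and token.endswith("es"):
        return token[:-2]
    if len(token) > 3 and token.endswith("s"):
        return token[:-1]
    return token


def toy_english(text):
    """Lowercase, split on whitespace, drop stopwords, then stem what is left."""
    tokens = text.lower().split()
    return [crude_stem(t) for t in tokens if t not in STOPWORDS]


def toy_standard(text):
    """Lowercase and split on whitespace; no stopword removal, no stemming."""
    return text.lower().split()


print(toy_english("happy fox"))         # ['happy', 'fox']
print(toy_english("not happy foxes"))   # ['happy', 'fox']
print(toy_standard("not happy foxes"))  # ['not', 'happy', 'foxes']
```

After the toy english analysis, "happy fox" and "not happy foxes" are indistinguishable, which is the loss of precision the text describes. The toy standard analysis keeps both the negation and the plural.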
To get the best of both worlds, we can use multifields to index the title field twice: once with the english analyzer and once with the standard analyzer:
PUT /my_index
{
  "mappings": {
    "blog": {
      "properties": {
        "title": {
          "type": "string",
          "fields": {
            "english": {
              "type": "string",
              "analyzer": "english"
            }
          }
        }
      }
    }
  }
}
The main title field uses the standard analyzer.
The title.english subfield uses the english analyzer.
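If the mapping body is generated by a script rather than written by hand, it could be assembled along the following lines. This is a minimal sketch: multifield_mapping is a hypothetical helper invented for this example, not part of any Elasticsearch client, and it only builds the JSON body without sending it anywhere.

```python
import json


def multifield_mapping(doc_type, field, sub_analyzers):
    """Build a 2.x-style mapping body with one analyzed subfield per entry.

    The main field is declared without an explicit analyzer, so it falls back
    to the default. Each key in sub_analyzers becomes a subfield name and each
    value the analyzer assigned to it.
    """
    sub_fields = {
        name: {"type": "string", "analyzer": analyzer}
        for name, analyzer in sub_analyzers.items()
    }
    return {
        "mappings": {
            doc_type: {
                "properties": {
                    field: {"type": "string", "fields": sub_fields}
                }
            }
        }
    }


# Reproduces the request body for PUT /my_index shown above.
body = multifield_mapping("blog", "title", {"english": "english"})
print(json.dumps(body, indent=2))
```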
With this mapping in place, we can index some test documents to demonstrate how to use both fields at query time:
PUT /my_index/blog/1
{ "title": "I'm happy for this fox" }

PUT /my_index/blog/2
{ "title": "I'm not happy about my fox problem" }

GET /_search
{
  "query": {
    "multi_match": {
      "type": "most_fields",
      "query": "not happy foxes",
      "fields": [ "title", "title.english" ]
    }
  }
}
Use the most_fields query type to match the same text in as many fields as possible.
Even though neither of our documents contains the word foxes, both documents are returned as results thanks to the word stemming on the title.english field. The second document is ranked as more relevant, because the word not matches on the title field.
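The ranking argument can be traced with a toy model that counts matching terms per field and adds the counts together. Everything here is an assumption made for illustration: the analyzers are crude stand-ins, the stopword set is a small subset, and toy_most_fields_score is a hypothetical function. Real Elasticsearch scoring is based on relevance statistics, not raw match counts.

```python
# Toy model of most_fields: sum the per-field match counts. NOT real scoring.
STOPWORDS = {"for", "not", "this"}  # illustrative subset only


def crude_stem(token):
    """Strip a plural ending; a naive substitute for a real stemmer."""
    if len(token) > 3 and token.endswith("es"):
        return token[:-2]
    if len(token) > 3 and token.endswith("s"):
        return token[:-1]
    return token


def toy_standard(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


def toy_english(text):
    """Lowercase, split, drop stopwords, then stem."""
    return [crude_stem(t) for t in text.lower().split() if t not in STOPWORDS]


# Each field name is paired with the toy analyzer applied to it.
ANALYZERS = {"title": toy_standard, "title.english": toy_english}

DOCS = {
    1: "I'm happy for this fox",
    2: "I'm not happy about my fox problem",
}


def toy_most_fields_score(query, title):
    """Count distinct query terms found in each field and add the counts."""
    score = 0
    for analyze in ANALYZERS.values():
        doc_terms = set(analyze(title))
        score += sum(1 for term in set(analyze(query)) if term in doc_terms)
    return score


scores = {
    doc_id: toy_most_fields_score("not happy foxes", title)
    for doc_id, title in DOCS.items()
}
ranked = sorted(scores, key=scores.get, reverse=True)
print(scores)  # {1: 3, 2: 4}
print(ranked)  # [2, 1]
```

In this toy model both documents match happy and fox on title.english, so both are returned. Document 2 gains one extra match because not survives on the title field, which mirrors the ordering described above.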