IMPORTANT: No additional bug fixes or documentation updates will be released for this version. For the latest information, see the current release documentation.

Keyword datatype


A field to index structured content such as IDs, email addresses, hostnames, status codes, zip codes or tags.

Keyword fields are typically used for filtering (for example, finding all blog posts where status is published), for sorting, and for aggregations. They are only searchable by their exact value.

If you need to index full-text content such as email bodies or product descriptions, you should probably use a text field instead.

Below is an example of a mapping for a keyword field:

PUT my_index
{
  "mappings": {
    "properties": {
      "tags": {
        "type":  "keyword"
      }
    }
  }
}
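
As a rough sketch of how such a field might then be used, the requests below index one document into the `tags` field mapped above, filter on an exact value, and bucket documents by tag. The document ID, the tag values, and the aggregation name `by_tag` are illustrative assumptions and do not come from the original example.

```console
# Index a document; each array element is indexed as its own exact term
PUT my_index/_doc/1?refresh=true
{
  "tags": ["published", "elasticsearch"]
}

# Filter on the exact value and count documents per tag
GET my_index/_search
{
  "query": {
    "bool": {
      "filter": { "term": { "tags": "published" } }
    }
  },
  "aggs": {
    "by_tag": { "terms": { "field": "tags" } }
  }
}
```

Because keyword values are not analyzed, a `term` query for `Published` would not match this document unless a normalizer is configured on the field.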

Mapping numeric identifiers

Not all numeric data should be mapped as a numeric field datatype. Elasticsearch optimizes numeric fields, such as integer or long, for range queries. However, keyword fields are better for term and other term-level queries.

Identifiers, such as an ISBN or a product ID, are rarely used in range queries. However, they are often retrieved using term-level queries.

Consider mapping a numeric identifier as a keyword if:

You don’t plan to search for the identifier data using range queries.
Fast retrieval is important. `term` query searches on keyword fields are often faster than `term` searches on numeric fields.

If you’re unsure which to use, you can use a multi-field to map the data as both a keyword and a numeric datatype.
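
One possible shape for such a multi-field mapping is sketched below. The index name `products`, the field name `product_id`, and the sub-field name `numeric` are hypothetical, and the sketch assumes every identifier value fits in a `long`.

```console
# Map the identifier as a keyword, with a numeric sub-field for range queries
PUT products
{
  "mappings": {
    "properties": {
      "product_id": {
        "type": "keyword",
        "fields": {
          "numeric": { "type": "long" }
        }
      }
    }
  }
}

# Exact lookup against the keyword field
GET products/_search
{
  "query": { "term": { "product_id": "1234" } }
}

# Range query against the numeric sub-field
GET products/_search
{
  "query": {
    "range": { "product_id.numeric": { "gte": 1000, "lt": 2000 } }
  }
}
```

With this layout, exact lookups go through `product_id`, while the occasional range query can still use `product_id.numeric`, at the cost of indexing each value twice.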

Parameters for keyword fields

The following parameters are accepted by keyword fields:

`boost`	Mapping field-level query time boosting. Accepts a floating point number, defaults to `1.0`.
`doc_values`	Should the field be stored on disk in a column-stride fashion, so that it can later be used for sorting, aggregations, or scripting? Accepts `true` (default) or `false`.
`eager_global_ordinals`	Should global ordinals be loaded eagerly on refresh? Accepts `true` or `false` (default). Enabling this is a good idea on fields that are frequently used for terms aggregations.
`fields`	Multi-fields allow the same string value to be indexed in multiple ways for different purposes, such as one field for search and a multi-field for sorting and aggregations.
`ignore_above`	Do not index any string longer than this value. Defaults to `2147483647` so that all values are accepted. Note, however, that the default dynamic mapping rules create a `keyword` sub-field that overrides this default by setting `ignore_above: 256`.
`index`	Should the field be searchable? Accepts `true` (default) or `false`.
`index_options`	What information should be stored in the index, for scoring purposes. Defaults to `docs` but can also be set to `freqs` to take term frequency into account when computing scores.
`norms`	Whether field-length should be taken into account when scoring queries. Accepts `true` or `false` (default).
`null_value`	Accepts a string value which is substituted for any explicit `null` values. Defaults to `null`, which means the field is treated as missing.
`store`	Whether the field value should be stored and retrievable separately from the `_source` field. Accepts `true` or `false` (default).
`similarity`	Which scoring algorithm or similarity should be used. Defaults to `BM25`.
`normalizer`	How to pre-process the keyword prior to indexing. Defaults to `null`, meaning the keyword is kept as-is.
`split_queries_on_whitespace`	Whether full text queries should split the input on whitespace when building a query for this field. Accepts `true` or `false` (default).
`meta`	Metadata about the field.
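
To illustrate how a few of these parameters might be combined, here is a minimal sketch of a mapping that sets `normalizer`, `ignore_above`, and `null_value` on one keyword field. The index name `users`, the field name `email`, and the normalizer name `lowercase_normalizer` are hypothetical; the custom normalizer must be defined in the index settings before the mapping can refer to it.

```console
PUT users
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "email": {
        "type": "keyword",
        "normalizer": "lowercase_normalizer",
        "ignore_above": 256,
        "null_value": "NULL"
      }
    }
  }
}
```

In this sketch, values are lowercased before indexing, strings longer than 256 characters are not indexed, and an explicit `null` is indexed as the string `NULL` so that it can still be searched.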

Indexes imported from 2.x do not support `keyword`. Instead, they attempt to downgrade `keyword` into `string`, which allows you to merge modern mappings with legacy mappings. Long-lived indexes must be recreated before upgrading to 6.x, but the mapping downgrade lets you do the recreation on your own schedule.
