Get tokens from text analysis
The analyze API performs analysis on a text string and returns the resulting tokens.
Generating an excessive amount of tokens may cause a node to run out of memory. The index.analyze.max_token_count setting enables you to limit the number of tokens that can be produced. If more tokens than this limit are generated, an error occurs. The _analyze endpoint without a specified index always uses 10000 as its limit.
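As a minimal sketch, the limit can be raised for a specific index by supplying the setting when the index is created. The index name my-index and the value 20000 are placeholders for illustration:

curl \
--request PUT http://api.example.com/my-index \
--header "Content-Type: application/json" \
--data '{"settings":{"index.analyze.max_token_count":20000}}'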
Body

- analyzer (string): The name of the analyzer that should be applied to the provided text. This can be a built-in analyzer, or an analyzer that has been configured in the index.
- attributes (array[string]): Array of token attributes used to filter the output of the explain parameter. See the sketch after this list.
- char_filter (array): Array of character filters used to preprocess characters before the tokenizer.
- explain (boolean): If true, the response includes token attributes and additional details.
- field (string): Path to a field, or an array of paths. Some APIs support wildcards in the path to select multiple fields.
- filter (array): Array of token filters to apply after the tokenizer.
- normalizer (string): Normalizer to use to convert text into a single token.
- text (string | array[string]): The text to analyze.
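For example, a request that combines explain with an attribute filter might look like the following sketch. The attribute name keyword is illustrative; which attributes actually appear in the output depends on the tokenizer and filters involved:

curl \
--request GET http://api.example.com/_analyze \
--header "Content-Type: application/json" \
--data '{"text":"this is a test","analyzer":"standard","explain":true,"attributes":["keyword"]}'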
curl \
--request GET http://api.example.com/_analyze \
--header "Content-Type: application/json" \
--data '{"text":"this is a test","analyzer":"standard"}'
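Character and token filters can also be supplied without a named analyzer to test a transient, normalizer-like chain. A sketch, using the built-in html_strip character filter and lowercase token filter:

curl \
--request GET http://api.example.com/_analyze \
--header "Content-Type: application/json" \
--data '{"text":"<b>BaR</b>","char_filter":["html_strip"],"filter":["lowercase"]}'

To let a field mapping determine the analyzer instead, the request is run against a specific index. The index name my-index and the title field below are hypothetical:

curl \
--request GET http://api.example.com/my-index/_analyze \
--header "Content-Type: application/json" \
--data '{"text":"this is a test","field":"title"}'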