All methods and paths for this operation: GET /_analyze, POST /_analyze, GET /{index}/_analyze, and POST /{index}/_analyze.
The analyze API performs analysis on a text string and returns the resulting tokens.
Generating an excessive amount of tokens may cause a node to run out of memory.
The index.analyze.max_token_count setting enables you to limit the number of tokens that can be produced.
If more tokens than this limit are generated, an error occurs.
The _analyze endpoint without a specified index will always use 10000 as its limit.
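For example, a minimal sketch of setting this limit at index creation time (the index name my-index-000001 and the value 20000 are assumptions for illustration):

PUT /my-index-000001
{
  "settings": {
    "index.analyze.max_token_count": 20000
  }
}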
The API supports the following parameters:
index: Index used to derive the analyzer. If specified, the analyzer or field parameter overrides this value. If no index is specified or the index does not have a default analyzer, the analyze API uses the standard analyzer. (See the example following this parameter list.)
analyzer: The name of the analyzer that should be applied to the provided text. This could be a built-in analyzer, or an analyzer that has been configured in the index.
attributes: Array of token attributes used to filter the output of the explain parameter.
char_filter: Array of character filters used to preprocess characters before the tokenizer.
explain: If true, the response includes token attributes and additional details. Default value is false.
field: Field used to derive the analyzer. To use this parameter, you must specify an index. If specified, the analyzer parameter overrides this value.
filter: Array of token filters to apply after the tokenizer.
normalizer: Normalizer to use to convert text into a single token.
text: Text to analyze. If an array of strings is provided, it is analyzed as a multi-value field.
tokenizer: Tokenizer to use to convert text into tokens.
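For instance, when an index is supplied in the request path, the analyzer is derived from that index (its default analyzer if one is configured, otherwise the standard analyzer). A minimal sketch, assuming an index named my-index-000001:

GET /my-index-000001/_analyze
{
  "text": "this is a test"
}

Without an index, a built-in analyzer can be named explicitly: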
GET /_analyze
{
"analyzer": "standard",
"text": "this is a test"
}
Python:
resp = client.indices.analyze(
    analyzer="standard",
    text="this is a test",
)

JavaScript:
const response = await client.indices.analyze({
  analyzer: "standard",
  text: "this is a test",
});

Ruby:
response = client.indices.analyze(
  body: {
    "analyzer": "standard",
    "text": "this is a test"
  }
)

PHP:
$resp = $client->indices()->analyze([
    "body" => [
        "analyzer" => "standard",
        "text" => "this is a test",
    ],
]);

curl:
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"analyzer":"standard","text":"this is a test"}' "$ELASTICSEARCH_URL/_analyze"

Java:
client.indices().analyze(a -> a
    .analyzer("standard")
    .text("this is a test")
);
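If an array of strings is provided as the text value, it is analyzed as a multi-value field: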
{
"analyzer": "standard",
"text": [
"this is a test",
"the second text"
]
}
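You can also build a custom transient analyzer from a tokenizer, token filters, and character filters directly in the request. Here the html_strip character filter removes HTML markup before the keyword tokenizer and lowercase filter run: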
{
"tokenizer": "keyword",
"filter": [
"lowercase"
],
"char_filter": [
"html_strip"
],
"text": "this is a <b>test</b>"
}
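Custom token filters can be defined inline as well. This request combines the whitespace tokenizer with the lowercase filter and a custom stop filter: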
{
"tokenizer": "whitespace",
"filter": [
"lowercase",
{
"type": "stop",
"stopwords": [
"a",
"is",
"this"
]
}
],
"text": "this is a test"
}
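The analyzer can also be derived from a field mapping by using the field parameter, which requires an index in the request path: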
{
"field": "obj1.field1",
"text": "this is a test"
}
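For example, a minimal sketch of the full request, assuming an index named my-index-000001 whose mapping defines obj1.field1:

GET /my-index-000001/_analyze
{
  "field": "obj1.field1",
  "text": "this is a test"
}

Similarly, a named normalizer defined in the index settings can be used to convert the text into a single token (this also requires an index in the request path):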
{
"normalizer": "my_normalizer",
"text": "BaR"
}
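To get more detailed output, set explain to true. The attributes parameter restricts which token attributes appear in the output; here only the keyword attribute is requested: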
{
"tokenizer": "standard",
"filter": [
"snowball"
],
"text": "detailed output",
"explain": true,
"attributes": [
"keyword"
]
}
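The request returns the following detailed response: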
{
"detail": {
"custom_analyzer": true,
"charfilters": [],
"tokenizer": {
"name": "standard",
"tokens": [
{
"token": "detailed",
"start_offset": 0,
"end_offset": 8,
"type": "<ALPHANUM>",
"position": 0
},
{
"token": "output",
"start_offset": 9,
"end_offset": 15,
"type": "<ALPHANUM>",
"position": 1
}
]
},
"tokenfilters": [
{
"name": "snowball",
"tokens": [
{
"token": "detail",
"start_offset": 0,
"end_offset": 8,
"type": "<ALPHANUM>",
"position": 0,
"keyword": false
},
{
"token": "output",
"start_offset": 9,
"end_offset": 15,
"type": "<ALPHANUM>",
"position": 1,
"keyword": false
}
]
}
]
}
}