Analyze API

Analyze Request

An AnalyzeRequest contains the text to analyze, and one of several options to specify how the analysis should be performed.

The simplest version uses a built-in analyzer:

AnalyzeRequest request = AnalyzeRequest.withGlobalAnalyzer("english", 
    "Some text to analyze", "Some more text to analyze");       

Here "english" is a built-in analyzer, and the remaining arguments are the text to analyze. Multiple strings are treated as a multi-valued field.

You can configure a custom analyzer:

Map<String, Object> stopFilter = new HashMap<>();
stopFilter.put("type", "stop");
stopFilter.put("stopwords", new String[]{ "to" });                       // configuration for a custom token filter
AnalyzeRequest request = AnalyzeRequest.buildCustomAnalyzer("standard")  // configure the tokenizer
    .addCharFilter("html_strip")                                         // configure char filters
    .addTokenFilter("lowercase")                                         // add a built-in token filter
    .addTokenFilter(stopFilter)                                          // add the custom token filter
    .build("<b>Some text to analyze</b>");

You can also build a custom normalizer by including only char filters and token filters:

AnalyzeRequest request = AnalyzeRequest.buildCustomNormalizer()
    .addTokenFilter("lowercase")
    .build("<b>BaR</b>");

You can analyze text using an analyzer defined in an existing index:

AnalyzeRequest request = AnalyzeRequest.withIndexAnalyzer(
    "my_index",              // the index containing the mappings
    "my_analyzer",           // the analyzer defined on this index to use
    "some text to analyze"
);

Or you can use a normalizer:

AnalyzeRequest request = AnalyzeRequest.withNormalizer(
    "my_index",              // the index containing the mappings
    "my_normalizer",         // the normalizer defined on this index to use
    "some text to analyze"
);

You can analyze text using the mappings for a particular field in an index:

AnalyzeRequest request = AnalyzeRequest.withField("my_index", "my_field", "some text to analyze");

Optional arguments

The following arguments can also optionally be provided:

request.explain(true);                      // setting explain to true adds further details to the response
request.attributes("keyword", "type");      // setting attributes returns only the token attributes you are interested in

Synchronous execution

When an AnalyzeRequest is executed in the following manner, the client waits for the AnalyzeResponse to be returned before continuing with code execution:

AnalyzeResponse response = client.indices().analyze(request, RequestOptions.DEFAULT);

Synchronous calls may throw an IOException if the high-level REST client fails to parse the REST response, if the request times out, or in similar cases where no response comes back from the server.

When the server returns a 4xx or 5xx error code, the high-level client instead tries to parse the error details in the response body. It then throws a generic ElasticsearchException and adds the original ResponseException to it as a suppressed exception.
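A suppressed exception is attached with the standard Throwable.addSuppressed method and read back with Throwable.getSuppressed. The following is a minimal sketch of that mechanism only. The exception classes, the wrap method, and the findSuppressed helper are hypothetical stand-ins declared here so the example runs without the Elasticsearch client; they are not the client's own types.

```java
class SuppressedSketch {
    // Hypothetical stand-in for the low-level ResponseException.
    static class SketchResponseException extends Exception {
        SketchResponseException(String message) { super(message); }
    }

    // Hypothetical stand-in for the generic ElasticsearchException.
    static class SketchElasticsearchException extends RuntimeException {
        SketchElasticsearchException(String message) { super(message); }
    }

    // Mimics the behavior described above: build an exception from the parsed
    // error details and attach the original one as a suppressed exception.
    static SketchElasticsearchException wrap(SketchResponseException original) {
        SketchElasticsearchException parsed =
            new SketchElasticsearchException("parsed error details");
        parsed.addSuppressed(original);
        return parsed;
    }

    // Returns the first suppressed exception of the requested type, or null if none exists.
    static <T extends Throwable> T findSuppressed(Throwable e, Class<T> type) {
        for (Throwable suppressed : e.getSuppressed()) {
            if (type.isInstance(suppressed)) {
                return type.cast(suppressed);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        try {
            throw wrap(new SketchResponseException("HTTP 400"));
        } catch (SketchElasticsearchException e) {
            SketchResponseException original =
                findSuppressed(e, SketchResponseException.class);
            System.out.println(original.getMessage());
        }
    }
}
```

With the real client, the same getSuppressed loop could be applied to the caught exception, assuming the suppressed entry is the ResponseException described above.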

Asynchronous execution

An AnalyzeRequest can also be executed asynchronously so that the client returns immediately. Specify how the response or potential failures are handled by passing the request and a listener to the asynchronous analyze method:

client.indices().analyzeAsync(request, RequestOptions.DEFAULT, listener); 

The AnalyzeRequest to execute and the ActionListener to use when the execution completes

The asynchronous method does not block and returns immediately. Once the execution completes, the ActionListener is called back through its onResponse method if the execution succeeded, or through its onFailure method if it failed. Failure scenarios and expected exceptions are the same as in the synchronous execution case.

A typical listener for analyze looks like:

ActionListener<AnalyzeResponse> listener = new ActionListener<AnalyzeResponse>() {
    @Override
    public void onResponse(AnalyzeResponse analyzeTokens) {
        // called when the execution is successfully completed
    }

    @Override
    public void onFailure(Exception e) {
        // called when the whole AnalyzeRequest fails
    }
};
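If callback-style code is inconvenient, the two listener callbacks can be bridged to a CompletableFuture. The following is a minimal sketch of that idea, not part of the client. SketchListener is a hypothetical stand-in with the same two methods as ActionListener, declared here so the example runs without the Elasticsearch client, and completing is an illustrative helper name.

```java
import java.util.concurrent.CompletableFuture;

class ListenerBridgeSketch {
    // Hypothetical stand-in with the same two callbacks as ActionListener.
    interface SketchListener<T> {
        void onResponse(T response);
        void onFailure(Exception e);
    }

    // Returns a listener that completes the given future from either callback.
    static <T> SketchListener<T> completing(CompletableFuture<T> future) {
        return new SketchListener<T>() {
            @Override
            public void onResponse(T response) {
                future.complete(response);
            }

            @Override
            public void onFailure(Exception e) {
                future.completeExceptionally(e);
            }
        };
    }

    public static void main(String[] args) {
        CompletableFuture<String> future = new CompletableFuture<>();
        SketchListener<String> listener = completing(future);
        // In real code the client would invoke the listener; here it is called by hand.
        listener.onResponse("tokens");
        System.out.println(future.join());
    }
}
```

With the real client, an ActionListener<AnalyzeResponse> written the same way could be passed to analyzeAsync, and the caller could then compose or wait on the future.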

Analyze Response

The returned AnalyzeResponse allows you to retrieve details of the analysis as follows:

List<AnalyzeResponse.AnalyzeToken> tokens = response.getTokens();   

AnalyzeToken holds information about the individual tokens produced by analysis

If explain was set to true, then information is instead returned from the detail() method:

DetailAnalyzeResponse detail = response.detail();                   

DetailAnalyzeResponse holds more detailed information about tokens produced by the various substeps in the analysis chain.