Put data frame analytics jobs API

Creates a new data frame analytics job. The API accepts a PutDataFrameAnalyticsRequest object as a request and returns a PutDataFrameAnalyticsResponse.

Put data frame analytics jobs request

A PutDataFrameAnalyticsRequest requires the following argument:

PutDataFrameAnalyticsRequest request = new PutDataFrameAnalyticsRequest(config); 

The configuration of the data frame analytics job to create

Data frame analytics configuration

The DataFrameAnalyticsConfig object holds the full configuration of the data frame analytics job and is built from the following arguments:

DataFrameAnalyticsConfig config = DataFrameAnalyticsConfig.builder("my-analytics-config") 
    .setSource(sourceConfig) 
    .setDest(destConfig) 
    .setAnalysis(outlierDetection) 
    .setAnalyzedFields(analyzedFields) 
    .setModelMemoryLimit(new ByteSizeValue(5, ByteSizeUnit.MB)) 
    .build();

The data frame analytics job ID

The source index and query from which to gather data

The destination index

The analysis to be performed

The fields to be included in / excluded from the analysis

The memory limit for the model created as part of the analysis process

SourceConfig

The index and the query from which to collect data.

DataFrameAnalyticsSource sourceConfig = DataFrameAnalyticsSource.builder() 
    .setIndex("put-test-source-index") 
    .setQueryConfig(queryConfig) 
    .build();

Constructing a new DataFrameAnalyticsSource

The source index

The query from which to gather the data. If query is not set, a match_all query is used by default.

QueryConfig

The query with which to select data from the source.

QueryConfig queryConfig = new QueryConfig(new MatchAllQueryBuilder());

DestinationConfig

The index to which data should be written by the data frame analytics job.

DataFrameAnalyticsDest destConfig = DataFrameAnalyticsDest.builder() 
    .setIndex("put-test-dest-index") 
    .build();

Constructing a new DataFrameAnalyticsDest

The destination index

Analysis

The analysis to be performed. Currently, only one analysis is supported: OutlierDetection.

OutlierDetection analysis can be created in one of two ways:

DataFrameAnalysis outlierDetection = OutlierDetection.createDefault(); 

Constructing a new OutlierDetection object with default strategy to determine outliers

or

DataFrameAnalysis outlierDetectionCustomized = OutlierDetection.builder() 
    .setMethod(OutlierDetection.Method.DISTANCE_KNN) 
    .setNNeighbors(5) 
    .build();

Constructing a new OutlierDetection object

The method used to perform the analysis

Number of neighbors taken into account during analysis
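To give a rough feel for what a neighbor-distance method with a neighbor count of 5 measures, here is a minimal standalone sketch. It assumes that DISTANCE_KNN means "average distance to the k nearest neighbors"; the class KnnDistanceSketch and its method are hypothetical illustrations and are not part of the client or of the server-side implementation.

```java
import java.util.Arrays;

// Illustration only: a hypothetical helper, not part of the Elasticsearch client.
final class KnnDistanceSketch {

    // Returns the average Euclidean distance from points[i] to its k nearest
    // other points. A larger value suggests the point is more of an outlier.
    static double averageKnnDistance(double[][] points, int i, int k) {
        if (points.length < 2 || k < 1) {
            throw new IllegalArgumentException("need at least 2 points and k >= 1");
        }
        double[] dists = new double[points.length - 1];
        int n = 0;
        for (int j = 0; j < points.length; j++) {
            if (j == i) {
                continue; // skip the point itself
            }
            double sum = 0;
            for (int d = 0; d < points[i].length; d++) {
                double diff = points[i][d] - points[j][d];
                sum += diff * diff;
            }
            dists[n++] = Math.sqrt(sum);
        }
        Arrays.sort(dists);
        int used = Math.min(k, dists.length); // cap k at the number of neighbors
        double total = 0;
        for (int m = 0; m < used; m++) {
            total += dists[m];
        }
        return total / used;
    }

    public static void main(String[] args) {
        double[][] points = { {0, 0}, {1, 0}, {0, 1}, {10, 10} };
        System.out.println(averageKnnDistance(points, 0, 2));
        System.out.println(averageKnnDistance(points, 3, 2));
    }
}
```

In this toy data set the point far from the cluster gets a much larger score than the clustered points, which is the intuition behind choosing the number of neighbors.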

Analyzed fields

A FetchSourceContext object containing the fields to be included in or excluded from the analysis

FetchSourceContext analyzedFields =
    new FetchSourceContext(
        true,
        new String[] { "included_field_1", "included_field_2" },
        new String[] { "excluded_field" });
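The interplay of the include and exclude lists can be sketched in plain Java. This is only an illustration under stated assumptions: it matches exact field names (no wildcard patterns), treats an empty include list as "include everything", and lets excludes take precedence. The class AnalyzedFieldsSketch and its select method are hypothetical and do not exist in the client.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustration only: a hypothetical helper, not part of the Elasticsearch client.
final class AnalyzedFieldsSketch {

    // Keeps a field if it is listed in includes (or includes is empty)
    // and it is not listed in excludes. Exact-name matching only.
    static List<String> select(List<String> available, String[] includes, String[] excludes) {
        List<String> inc = Arrays.asList(includes);
        List<String> exc = Arrays.asList(excludes);
        List<String> selected = new ArrayList<>();
        for (String field : available) {
            boolean included = inc.isEmpty() || inc.contains(field);
            if (included && !exc.contains(field)) {
                selected.add(field);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        List<String> available = Arrays.asList(
            "included_field_1", "included_field_2", "excluded_field", "other_field");
        System.out.println(select(
            available,
            new String[] { "included_field_1", "included_field_2" },
            new String[] { "excluded_field" }));
    }
}
```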

Synchronous execution

When executing a PutDataFrameAnalyticsRequest in the following manner, the client waits for the PutDataFrameAnalyticsResponse to be returned before continuing with code execution:

PutDataFrameAnalyticsResponse response = client.machineLearning().putDataFrameAnalytics(request, RequestOptions.DEFAULT);

Synchronous calls may throw an IOException if the high-level REST client fails to parse the REST response, if the request times out, or in similar cases where no response comes back from the server.

When the server returns a 4xx or 5xx error code, the high-level client instead tries to parse the error details from the response body. It then throws a generic ElasticsearchException with the original ResponseException attached as a suppressed exception.
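The suppressed-exception mechanism mentioned here is the standard Throwable.addSuppressed / getSuppressed facility of the JDK. The following standalone sketch shows how a caller might dig the original exception out of the wrapper. The two exception classes are hypothetical stand-ins for ElasticsearchException and ResponseException so that the example runs without the client on the classpath; the status field is likewise an assumption made for illustration.

```java
import java.io.IOException;

// Illustration only: stand-in types, not the real client exceptions.
final class SuppressedExceptionSketch {

    // Hypothetical stand-in for the generic exception thrown to the caller.
    static final class GenericClientException extends RuntimeException {
        GenericClientException(String message) {
            super(message);
        }
    }

    // Hypothetical stand-in for the low-level exception that carries the HTTP status.
    static final class LowLevelResponseException extends IOException {
        final int status;

        LowLevelResponseException(int status, String message) {
            super(message);
            this.status = status;
        }
    }

    // Simulates the wrapping described above: the generic exception is thrown
    // and the original low-level exception is attached as suppressed.
    static void simulateFailedCall(int status) {
        GenericClientException wrapped = new GenericClientException("request failed");
        wrapped.addSuppressed(new LowLevelResponseException(status, "HTTP " + status));
        throw wrapped;
    }

    // Returns the HTTP status found among the suppressed exceptions, or -1 if none.
    static int statusOf(Throwable t) {
        for (Throwable suppressed : t.getSuppressed()) {
            if (suppressed instanceof LowLevelResponseException) {
                return ((LowLevelResponseException) suppressed).status;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        try {
            simulateFailedCall(409);
        } catch (GenericClientException e) {
            System.out.println("status = " + statusOf(e));
        }
    }
}
```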

Asynchronous execution

Executing a PutDataFrameAnalyticsRequest can also be done in an asynchronous fashion so that the client can return directly. Users need to specify how the response or potential failures will be handled by passing the request and a listener to the asynchronous put-data-frame-analytics method:

client.machineLearning().putDataFrameAnalyticsAsync(request, RequestOptions.DEFAULT, listener); 

The PutDataFrameAnalyticsRequest to execute and the ActionListener to use when the execution completes

The asynchronous method does not block and returns immediately. Once the request completes, the ActionListener is called back through its onResponse method if the execution succeeded, or through its onFailure method if it failed. Failure scenarios and expected exceptions are the same as in the synchronous execution case.

A typical listener for put-data-frame-analytics looks like:

ActionListener<PutDataFrameAnalyticsResponse> listener = new ActionListener<PutDataFrameAnalyticsResponse>() {
    @Override
    public void onResponse(PutDataFrameAnalyticsResponse response) {
        
    }

    @Override
    public void onFailure(Exception e) {
        
    }
};

Called when the execution is successfully completed.

Called when the whole PutDataFrameAnalyticsRequest fails.
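One common way to consume such a callback pair is to bridge it to a CompletableFuture, so that calling code can compose or wait on the result. The sketch below is a standalone illustration: the nested Listener interface is a hypothetical minimal mirror of the onResponse/onFailure contract, used here so that the example runs without the client library, and the completing helper is not a client method.

```java
import java.util.concurrent.CompletableFuture;

// Illustration only: a hypothetical adapter, not part of the Elasticsearch client.
final class ListenerFutureSketch {

    // Hypothetical minimal mirror of the two-callback listener contract.
    interface Listener<T> {
        void onResponse(T response);

        void onFailure(Exception e);
    }

    // Returns a listener that completes the given future from either callback.
    static <T> Listener<T> completing(CompletableFuture<T> future) {
        return new Listener<T>() {
            @Override
            public void onResponse(T response) {
                future.complete(response);
            }

            @Override
            public void onFailure(Exception e) {
                future.completeExceptionally(e);
            }
        };
    }

    public static void main(String[] args) {
        CompletableFuture<String> future = new CompletableFuture<>();
        Listener<String> listener = completing(future);
        // In real code the client would invoke the callback; here it is called directly.
        listener.onResponse("created");
        System.out.println(future.join());
    }
}
```

With the real client, the same idea would apply to an ActionListener<PutDataFrameAnalyticsResponse>, with the future typed to the response class.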

Response

The returned PutDataFrameAnalyticsResponse contains the newly created data frame analytics job.

DataFrameAnalyticsConfig createdConfig = response.getConfig();