This API explains a data frame analytics config that either already exists or has not been created yet. It reports which fields are included in or excluded from the analysis and why, and it estimates how much memory the analysis is expected to require.
Required cluster privilege: monitor_ml
Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
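The identifier rules above can be checked on the client side before a request is sent. The following is a minimal sketch, not the server's own validation: the helper name `is_valid_job_id` is hypothetical, and it encodes only the character rules stated here (any further limits the server may enforce are not covered).

```python
import re

# Encodes only the rules stated in this document:
# lowercase a-z, 0-9, hyphens, and underscores, with an
# alphanumeric character at both the start and the end.
_JOB_ID_PATTERN = re.compile(r"[a-z0-9](?:[a-z0-9_-]*[a-z0-9])?")


def is_valid_job_id(job_id: str) -> bool:
    """Hypothetical helper: return True if job_id satisfies the documented rules."""
    return _JOB_ID_PATTERN.fullmatch(job_id) is not None


if __name__ == "__main__":
    for candidate in ["house-prices_2024", "_houses", "Houses", "houses-"]:
        print(candidate, is_valid_job_id(candidate))
```

Using `fullmatch` rather than `match` ensures the whole string is checked, so a trailing hyphen or underscore is rejected.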
The configuration of how to source the analysis data. It requires an index. Optionally, query and _source may be specified.
The destination configuration, consisting of index and optionally results_field (ml by default).
The analysis configuration, which contains the information necessary to perform one of the following types of analysis: classification, outlier detection, or regression.
A description of the job.
The approximate maximum amount of memory resources that are permitted for analytical processing. If your elasticsearch.yml file contains an xpack.ml.max_model_memory_limit setting, an error occurs when you try to create data frame analytics jobs that have model_memory_limit values greater than that setting.
Default value is 1gb.
The maximum number of threads to be used by the analysis. Using more threads may decrease the time necessary to complete the analysis at the cost of using more CPU. Note that the process may use additional threads for operational functionality other than the analysis itself.
Default value is 1.
Specify includes and/or excludes patterns to select which fields are included in the analysis. The patterns specified in excludes are applied last, so excludes takes precedence: if the same field is specified in both includes and excludes, the field is not included in the analysis.
Specifies whether this job can start when there is insufficient machine learning node capacity for it to be immediately assigned to a node.
Default value is false.
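The precedence between includes and excludes described above can be illustrated with a small sketch. This is not how Elasticsearch implements field selection: `select_fields` is a hypothetical function, shell-style wildcard matching via `fnmatch` is an assumption standing in for the server's pattern handling, and treating a missing includes list as "all fields" is also an assumption.

```python
from fnmatch import fnmatchcase


def select_fields(fields, includes=None, excludes=None):
    """Hypothetical illustration of 'excludes are applied last'.

    Assumptions: patterns are shell-style wildcards, and an empty
    or missing includes list means every field is a candidate.
    """
    includes = includes or ["*"]
    excludes = excludes or []
    selected = []
    for field in fields:
        is_included = any(fnmatchcase(field, p) for p in includes)
        is_excluded = any(fnmatchcase(field, p) for p in excludes)
        # Excludes win: a field matched by both lists is dropped.
        if is_included and not is_excluded:
            selected.append(field)
    return selected


if __name__ == "__main__":
    candidates = ["price", "postcode", "number_of_bedrooms"]
    print(select_fields(candidates, includes=["p*"], excludes=["postcode"]))
```

The real outcome for a given config is what the explain response reports in field_selection, so this sketch is only a way to reason about the rule, not a substitute for calling the API.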
POST _ml/data_frame/analytics/_explain
{
  "source": {
    "index": "houses_sold_last_10_yrs"
  },
  "analysis": {
    "regression": {
      "dependent_variable": "price"
    }
  }
}
resp = client.ml.explain_data_frame_analytics(
    source={
        "index": "houses_sold_last_10_yrs"
    },
    analysis={
        "regression": {
            "dependent_variable": "price"
        }
    },
)
const response = await client.ml.explainDataFrameAnalytics({
  source: {
    index: "houses_sold_last_10_yrs",
  },
  analysis: {
    regression: {
      dependent_variable: "price",
    },
  },
});
response = client.ml.explain_data_frame_analytics(
  body: {
    "source": {
      "index": "houses_sold_last_10_yrs"
    },
    "analysis": {
      "regression": {
        "dependent_variable": "price"
      }
    }
  }
)
$resp = $client->ml()->explainDataFrameAnalytics([
    "body" => [
        "source" => [
            "index" => "houses_sold_last_10_yrs",
        ],
        "analysis" => [
            "regression" => [
                "dependent_variable" => "price",
            ],
        ],
    ],
]);
curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"source":{"index":"houses_sold_last_10_yrs"},"analysis":{"regression":{"dependent_variable":"price"}}}' "$ELASTICSEARCH_URL/_ml/data_frame/analytics/_explain"
{
  "field_selection": [
    {
      "field": "number_of_bedrooms",
      "mappings_types": [
        "integer"
      ],
      "is_included": true,
      "is_required": false,
      "feature_type": "numerical"
    },
    {
      "field": "postcode",
      "mappings_types": [
        "text"
      ],
      "is_included": false,
      "is_required": false,
      "reason": "[postcode.keyword] is preferred because it is aggregatable"
    },
    {
      "field": "postcode.keyword",
      "mappings_types": [
        "keyword"
      ],
      "is_included": true,
      "is_required": false,
      "feature_type": "categorical"
    },
    {
      "field": "price",
      "mappings_types": [
        "float"
      ],
      "is_included": true,
      "is_required": true,
      "feature_type": "numerical"
    }
  ],
  "memory_estimation": {
    "expected_memory_without_disk": "128MB",
    "expected_memory_with_disk": "32MB"
  }
}
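A response like the one above can be condensed into a short report of included, required, and excluded fields plus the memory estimate. The sketch below works on a plain dictionary with the same keys as the example response; `summarize_explanation` is a hypothetical helper, and the sample data is a shortened copy of the example, not output from a live cluster.

```python
def summarize_explanation(explanation: dict) -> dict:
    """Hypothetical helper: group field_selection entries by outcome."""
    included, required, excluded = [], [], {}
    for entry in explanation.get("field_selection", []):
        name = entry["field"]
        if entry.get("is_included"):
            included.append(name)
            if entry.get("is_required"):
                required.append(name)
        else:
            # Excluded entries in the example carry a "reason" string.
            excluded[name] = entry.get("reason", "no reason given")
    return {
        "included": included,
        "required": required,
        "excluded": excluded,
        "memory_estimation": explanation.get("memory_estimation", {}),
    }


# Shortened copy of the example response shown in this document.
sample = {
    "field_selection": [
        {"field": "postcode", "is_included": False, "is_required": False,
         "reason": "[postcode.keyword] is preferred because it is aggregatable"},
        {"field": "postcode.keyword", "is_included": True, "is_required": False},
        {"field": "price", "is_included": True, "is_required": True},
    ],
    "memory_estimation": {
        "expected_memory_without_disk": "128MB",
        "expected_memory_with_disk": "32MB",
    },
}

summary = summarize_explanation(sample)

if __name__ == "__main__":
    print(summary)
```

The memory_estimation values can then inform the choice of model_memory_limit when the job is created.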