Evaluate data frame analytics API

Evaluates the data frame analytics for an annotated index.

This functionality is experimental and may be changed or removed completely in a future release. Elastic will take a best effort approach to fix any issues, but experimental features are not subject to the support SLA of official GA features.

Request

POST _ml/data_frame/_evaluate

Prerequisites

Description

The API packages together commonly used evaluation metrics for various types of machine learning features. It is designed for use on indexes created by data frame analytics. Evaluation requires both a ground truth field and an analytics result field to be present in the index.

Request body

index
(Required, object) Defines the index in which the evaluation will be performed.
query
(Optional, object) A query clause that retrieves a subset of data from the source index. See Query DSL.
evaluation
(Required, object) Defines the type of evaluation you want to perform. See Data frame analytics evaluation resources.

Available evaluation types:

  • binary_soft_classification
  • regression
  • classification
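As a rough illustration of how the three request body fields fit together, the following Python sketch assembles a request body as a plain dictionary. The helper name build_evaluate_body is hypothetical and is not part of any Elastic client; it only mirrors the field names and evaluation types listed above.

```python
import json

# Evaluation types listed in this section.
EVALUATION_TYPES = {"binary_soft_classification", "regression", "classification"}


def build_evaluate_body(index, evaluation_type, evaluation_config, query=None):
    """Assemble a body for POST _ml/data_frame/_evaluate (hypothetical helper)."""
    if evaluation_type not in EVALUATION_TYPES:
        raise ValueError(f"unknown evaluation type: {evaluation_type!r}")
    body = {"index": index, "evaluation": {evaluation_type: evaluation_config}}
    if query is not None:
        # "query" is optional; it is only added when a subset of the data is wanted.
        body["query"] = query
    return body


body = build_evaluate_body(
    index="my_analytics_dest_index",
    evaluation_type="binary_soft_classification",
    evaluation_config={
        "actual_field": "is_outlier",
        "predicted_probability_field": "ml.outlier_score",
    },
)
print(json.dumps(body, indent=2))
```

The printed JSON has the same shape as the request bodies in the examples on this page and could be sent with any HTTP client.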

Examples

Binary soft classification

POST _ml/data_frame/_evaluate
{
  "index": "my_analytics_dest_index",
  "evaluation": {
    "binary_soft_classification": {
      "actual_field": "is_outlier",
      "predicted_probability_field": "ml.outlier_score"
    }
  }
}

The API returns the following results:

{
  "binary_soft_classification": {
    "auc_roc": {
      "score": 0.92584757746414444
    },
    "confusion_matrix": {
      "0.25": {
          "tp": 5,
          "fp": 9,
          "tn": 204,
          "fn": 5
      },
      "0.5": {
          "tp": 1,
          "fp": 5,
          "tn": 208,
          "fn": 9
      },
      "0.75": {
          "tp": 0,
          "fp": 4,
          "tn": 209,
          "fn": 10
      }
    },
    "precision": {
        "0.25": 0.35714285714285715,
        "0.5": 0.16666666666666666,
        "0.75": 0
    },
    "recall": {
        "0.25": 0.5,
        "0.5": 0.1,
        "0.75": 0
    }
  }
}
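The precision and recall values in this response are consistent with the textbook definitions, precision = tp / (tp + fp) and recall = tp / (tp + fn), applied to the confusion matrix at each threshold. The Python sketch below recomputes them from the numbers in the response. Treating a zero denominator as 0 is an assumption made for this sketch, not documented API behavior.

```python
# Confusion matrix copied from the example response, keyed by threshold.
confusion_matrix = {
    "0.25": {"tp": 5, "fp": 9, "tn": 204, "fn": 5},
    "0.5": {"tp": 1, "fp": 5, "tn": 208, "fn": 9},
    "0.75": {"tp": 0, "fp": 4, "tn": 209, "fn": 10},
}


def safe_ratio(numerator, denominator):
    # Assumption for this sketch: return 0.0 when the denominator is 0.
    return numerator / denominator if denominator else 0.0


precision = {
    threshold: safe_ratio(cell["tp"], cell["tp"] + cell["fp"])
    for threshold, cell in confusion_matrix.items()
}
recall = {
    threshold: safe_ratio(cell["tp"], cell["tp"] + cell["fn"])
    for threshold, cell in confusion_matrix.items()
}

for threshold in confusion_matrix:
    print(threshold, precision[threshold], recall[threshold])
```

Raising the threshold from 0.25 to 0.75 reduces false positives (9 to 4) but also drops true positives to 0, which is why both metrics fall to 0 at the highest threshold.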

Regression

POST _ml/data_frame/_evaluate
{
  "index": "house_price_predictions", 
  "query": {
      "bool": {
        "filter": [
          { "term":  { "ml.is_training": false } } 
        ]
      }
  },
  "evaluation": {
    "regression": {
      "actual_field": "price", 
      "predicted_field": "ml.price_prediction", 
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}

In this request:

  • index: The output destination index from a data frame analytics regression analysis.
  • query: In this example, a test/train split (training_percent) was defined for the regression analysis. This query limits the evaluation to the test split only.
  • actual_field: The ground truth value for the actual house price. This is required in order to evaluate results.
  • predicted_field: The predicted value for house price calculated by the regression analysis.
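To make the two requested metrics concrete, here is a minimal Python sketch of the textbook definitions of mean squared error and R-squared. It assumes the API reports the standard formulas; the source does not spell them out. The sample values are made up for illustration and do not come from any real index.

```python
def mean_squared_error(actual, predicted):
    # Average of the squared differences between actual and predicted values.
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    # 1 minus the ratio of the residual sum of squares to the total sum of squares.
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up values standing in for the actual_field and predicted_field columns.
actual_prices = [3.0, 5.0, 7.0]
predicted_prices = [2.5, 5.0, 7.5]

mse = mean_squared_error(actual_prices, predicted_prices)
r2 = r_squared(actual_prices, predicted_prices)
print(mse, r2)
```

A lower mean squared error and an R-squared closer to 1 both indicate predictions that track the ground truth more closely.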

The following example calculates the training error:

POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": true 
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", 
      "predicted_field": "ml.G3_prediction", 
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}

In this request:

  • query (ml.is_training set to true): In this example, a test/train split (training_percent) was defined for the regression analysis. This query limits the evaluation to the train split only, so a training error is calculated.
  • actual_field: The field that contains the ground truth value for the actual student performance. This is required in order to evaluate results.
  • predicted_field: The field that contains the predicted value for student performance calculated by the regression analysis.

The next example calculates the testing error. The only difference compared with the previous example is that ml.is_training is set to false this time, so the query excludes the train split from the evaluation.

POST _ml/data_frame/_evaluate
{
  "index": "student_performance_mathematics_reg",
  "query": {
    "term": {
      "ml.is_training": {
        "value": false 
      }
    }
  },
  "evaluation": {
    "regression": {
      "actual_field": "G3", 
      "predicted_field": "ml.G3_prediction", 
      "metrics": {
        "r_squared": {},
        "mean_squared_error": {}
      }
    }
  }
}

In this request:

  • query (ml.is_training set to false): In this example, a test/train split (training_percent) was defined for the regression analysis. This query limits the evaluation to the test split only, so a testing error is calculated.
  • actual_field: The field that contains the ground truth value for the actual student performance. This is required in order to evaluate results.
  • predicted_field: The field that contains the predicted value for student performance calculated by the regression analysis.
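Because the training and testing requests differ only in the ml.is_training value, both can be derived from one shared body. The Python sketch below shows one way to do that. The helper name with_split is hypothetical; the index and field names are copied from the examples above.

```python
import copy

# Parts of the request shared by the training and testing evaluations.
base_body = {
    "index": "student_performance_mathematics_reg",
    "evaluation": {
        "regression": {
            "actual_field": "G3",
            "predicted_field": "ml.G3_prediction",
            "metrics": {"r_squared": {}, "mean_squared_error": {}},
        }
    },
}


def with_split(body, is_training):
    """Return a copy of body restricted to the train or test split (hypothetical helper)."""
    request = copy.deepcopy(body)
    request["query"] = {"term": {"ml.is_training": {"value": is_training}}}
    return request


training_request = with_split(base_body, True)   # evaluates the training error
testing_request = with_split(base_body, False)   # evaluates the testing error
```

Copying the body before adding the query keeps base_body unchanged, so the same template can be reused for both requests.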

Classification

POST _ml/data_frame/_evaluate
{
   "index": "animal_classification",
   "evaluation": {
      "classification": { 
         "actual_field": "animal_class", 
         "predicted_field": "ml.animal_class_prediction.keyword", 
         "metrics": {
           "multiclass_confusion_matrix" : {} 
         }
      }
   }
}

In this request:

  • classification: The evaluation type.
  • actual_field: The field that contains the ground truth value for the actual animal classification. This is required in order to evaluate results.
  • predicted_field: The field that contains the predicted value for animal classification by the classification analysis. Since the field storing the predicted class is dynamically mapped as both text and keyword, you need to add the .keyword suffix to the name.
  • metrics: Specifies the metric for the evaluation, here multiclass_confusion_matrix.

The API returns the following result:

{
  "classification" : {
    "multiclass_confusion_matrix" : {
      "confusion_matrix" : [
        {
          "actual_class" : "cat",
          "actual_class_doc_count" : 12,
          "predicted_classes" : [
            {
              "predicted_class" : "cat",
              "count" : 12
            },
            {
              "predicted_class" : "dog",
              "count" : 0
            }
          ],
          "other_predicted_class_doc_count" : 0
        },
        {
          "actual_class" : "dog",
          "actual_class_doc_count" : 11,
          "predicted_classes" : [
            {
              "predicted_class" : "dog",
              "count" : 7
            },
            {
              "predicted_class" : "cat",
              "count" : 4
            }
          ],
          "other_predicted_class_doc_count" : 0
        }
      ],
      "other_actual_class_count" : 0
    }
  }
}

In this response (the notes refer to the first entry, for the actual class cat):

  • actual_class: The name of the actual class that the analysis tried to predict.
  • actual_class_doc_count: The number of documents in the index that belong to the actual_class.
  • predicted_classes: The list of predicted classes and the number of predictions associated with each class.
  • count (predicted_class cat): The number of cats in the dataset that are correctly identified as cats.
  • count (predicted_class dog): The number of cats in the dataset that are incorrectly classified as dogs.
  • other_predicted_class_doc_count: The number of documents that are classified as a class that is not listed as a predicted_class.
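The confusion matrix can be summarized on the client side. As a sketch, the Python snippet below walks the example response and derives per-class recall and overall accuracy. These derived values are not part of the response shown above; they are computed here only to illustrate how to read the matrix, and the helper name correct_count is hypothetical.

```python
# Confusion matrix copied from the example response.
response = {
    "classification": {
        "multiclass_confusion_matrix": {
            "confusion_matrix": [
                {
                    "actual_class": "cat",
                    "actual_class_doc_count": 12,
                    "predicted_classes": [
                        {"predicted_class": "cat", "count": 12},
                        {"predicted_class": "dog", "count": 0},
                    ],
                    "other_predicted_class_doc_count": 0,
                },
                {
                    "actual_class": "dog",
                    "actual_class_doc_count": 11,
                    "predicted_classes": [
                        {"predicted_class": "dog", "count": 7},
                        {"predicted_class": "cat", "count": 4},
                    ],
                    "other_predicted_class_doc_count": 0,
                },
            ],
            "other_actual_class_count": 0,
        }
    }
}

matrix = response["classification"]["multiclass_confusion_matrix"]["confusion_matrix"]


def correct_count(row):
    # Number of documents whose predicted class equals the actual class.
    for predicted in row["predicted_classes"]:
        if predicted["predicted_class"] == row["actual_class"]:
            return predicted["count"]
    return 0


per_class_recall = {
    row["actual_class"]: correct_count(row) / row["actual_class_doc_count"]
    for row in matrix
}
total_docs = sum(row["actual_class_doc_count"] for row in matrix)
accuracy = sum(correct_count(row) for row in matrix) / total_docs

print(per_class_recall, accuracy)
```

In this example every cat is classified correctly, while 4 of the 11 dogs are classified as cats, so the dog class pulls the overall accuracy down to 19 out of 23 documents.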