Geographic functionsedit

The geographic functions detect anomalies in the geographic location of the input data.

The machine learning features include the following geographic function: lat_long.

You cannot create forecasts for anomaly detection jobs that contain geographic functions. You also cannot add rules with conditions to detectors that use geographic functions.

Lat_longedit

The lat_long function detects anomalies in the geographic location of the input data.

This function supports the following properties:

  • field_name (required)
  • by_field_name (optional)
  • over_field_name (optional)
  • partition_field_name (optional)

For more information about those properties, see Detector configuration objects.

Example 1: Analyzing transactions with the lat_long function.

PUT _ml/anomaly_detectors/example1
{
  "analysis_config": {
    "detectors": [{
      "function" : "lat_long",
      "field_name" : "transactionCoordinates",
      "by_field_name" : "creditCardNumber"
    }]
  },
  "data_description": {
    "time_field":"timestamp",
    "time_format": "epoch_ms"
  }
}

If you use this lat_long function in a detector in your anomaly detection job, it detects anomalies where the geographic location of a credit card transaction is unusual for a particular customer’s credit card. An anomaly might indicate fraud.

The field_name that you supply must be a single string that contains two comma-separated numbers of the form latitude,longitude. The latitude and longitude must be in the range -180 to 180 and represent a point on the surface of the Earth.

For example, JSON data might contain the following transaction coordinates:

{
  "time": 1460464275,
  "transactionCoordinates": "40.7,-74.0",
  "creditCardNumber": "1234123412341234"
}

In Elasticsearch, location data is likely to be stored in geo_point fields. For more information, see Geo-point datatype. This data type is not supported natively in machine learning features. You can, however, use Painless scripts in script_fields in your datafeed to transform the data into an appropriate format. For example, the following Painless script transforms "coords": {"lat" : 41.44, "lon":90.5} into "lat-lon": "41.44,90.5":

PUT _ml/datafeeds/datafeed-test2
{
  "job_id": "farequote",
  "indices": ["farequote"],
  "query": {
    "match_all": {
          "boost": 1
    }
  },
  "script_fields": {
    "lat-lon": {
      "script": {
        "source": "doc['coords'].lat + ',' + doc['coords'].lon",
        "lang": "painless"
      }
    }
  }
}

For more information, see Transforming data with script fields.