As described in Analyzing the past and present, Elastic Stack machine learning features can calculate baselines of normal behavior then extrapolate anomalous events. These baselines are accomplished by generating models of your data.
To ensure resilience in the event of a system failure, snapshots of the machine learning
model for each anomaly detection job are saved to an internal index within the Elasticsearch
cluster. The amount of time necessary to save these snapshots is proportional to
the size of the model in memory. By default, snapshots are captured
approximately every 3 to 4 hours. You can change this interval
background_persist_interval) when you create or update a job.
To reduce the number of snapshots consuming space on your cluster, at the end of
each day, old snapshots are automatically deleted. The age of each snapshot is
calculated relative to the timestamp of the most recent snapshot. By default, if
there are snapshots over one day older than the newest snapshot, they are
deleted except for the first snapshot each day. As well, all snapshots over ten
days older than the newest snapshot are deleted. You can change these retention
model_snapshot_retention_days) when you create or update a job. If you want to
exempt a specific snapshot from this clean up, use Kibana or the
update model snapshots API to set
You can see the list of model snapshots for each job with the get model snapshots API or in the Model snapshots tab on the Job Management page in Kibana:
There are situations other than system failures where you might want to revert to using a specific model snapshot. The machine learning features react quickly to anomalous input and new behaviors in data. Highly anomalous input increases the variance in the models and machine learning analytics must determine whether it is a new step-change in behavior or a one-off event. In the case where you know this anomalous input is a one-off, it might be appropriate to reset the model state to a time before this event. For example, after a Black Friday sales day you might consider reverting to a saved snapshot. If you know about such events in advance, however, you can use calendars and scheduled events to avoid impacting your model.