November 10, 2016 Technical

Artificial Intelligence Dreams, Machine Learning Promises, Behavioral Analytics? - How Elastic and Prelert Fit Together

By Steve Dodson

At Salesforce’s recent Dreamforce conference, artificial intelligence (AI) dominated as a theme. Perhaps the most prominent announcement was Salesforce's Einstein ("everyone’s data scientist"), which signaled the company’s entry into the AI space and promises to make the lives of their users easier. It’s an exciting move and a great demonstration of “AI for everyone”.


Investment and acquisitions in AI have also increased significantly recently, and even AI veterans such as Andrew Ng are describing AI as “the new electricity. Just as 100 years ago electricity transformed industry after industry, AI will now do the same.”


This is all great validation for the direction Elastic is headed with Prelert, a recently acquired machine learning company that builds machine learning technologies on top of the Elastic Stack. However, there is a lot of noise and terminology in the space, so I’d like to describe how Prelert fits in and hint at where the product will go with Elastic.

Terminology

The market is full of terms such as Artificial Intelligence, Machine Intelligence, Machine Learning, Advanced Machine Learning, Deep Machine Learning, Unsupervised Machine Learning, Supervised Machine Learning, Semi-Supervised Machine Learning, Reinforcement Learning, Deep Learning, Parametric Machine Learning, Non-Parametric Machine Learning, Behavioral Analytics, User Behavioral Analytics, User Entity Behavioral Analytics, Predictive Analytics, Prescriptive Analytics, Advanced Analytics, Statistics, Cognitive Computing, Smart Machines and so on. These terms are used both technically and in product marketing material, so it can be difficult to understand exactly what they mean and what a product actually does. For example, after the recent Elastic{ON} Tour event in NYC, I was asked the following questions:


  • Does Prelert do Artificial Intelligence or Machine Learning?
  • Does Prelert do Machine Learning or Statistics?
  • Does Prelert do Deep Learning? Does Prelert do Supervised or Unsupervised Machine Learning?
  • What is Behavioral Analytics?

To cut through this noise, I’ll try to explain what Prelert (currently) does and what terms we use to describe it.

What Prelert (currently) does and how we describe it

Technically, Prelert automatically learns a predictive model for the distribution of feature values at a given time, based on the historical values we have seen to date. This predictive model can be used to compute the probability of current behaviour given historical behaviour. If feature values are unpredictable (low probability), we classify them as anomalous.
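
As a rough illustration of this idea (not Prelert's actual model), the Python sketch below fits a plain Gaussian to a short history and asks how probable a new value is under it. The Gaussian, the function names, the sample data and the 0.001 threshold are all assumptions made for the example.

```python
from statistics import NormalDist, mean, stdev


def tail_probability(history, value):
    """Two-sided probability of seeing a value at least as far from the
    historical mean as `value`, under a Gaussian fitted to `history`."""
    model = NormalDist(mean(history), stdev(history))
    distance = abs(value - model.mean)
    # Probability mass in both tails beyond `distance` from the mean.
    return 2.0 * (1.0 - model.cdf(model.mean + distance))


def is_anomalous(history, value, threshold=0.001):
    # `threshold` is an illustrative cut-off, not a Prelert default.
    return tail_probability(history, value) < threshold


# Hypothetical transactions-per-minute history.
history = [98, 102, 101, 97, 100, 103, 99, 100, 101, 99]
print(is_anomalous(history, 101))  # a value close to what has been seen
print(is_anomalous(history, 150))  # a value far outside what has been seen
```

A value near the historical mean gets a high probability and is left alone, while a value many standard deviations away gets a probability close to zero and is flagged.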


For example, given a time series data stream (e.g. transactions per minute), Prelert will automatically learn the typical behaviour of the time series as data is being streamed into the Prelert engine. As Prelert sees more data, the models become more accurate. Prelert also ages out old models, because systems typically change over time.
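
One simple way to picture learning from a stream while letting old behaviour fade is a running mean and variance with exponential decay. The sketch below is only a stand-in for the idea of ageing out old models: the class name, the decay factor and the data are hypothetical, and Prelert's own mechanism may work quite differently.

```python
import math


class DecayingModel:
    """Running mean and variance in which older points count for less."""

    def __init__(self, decay=0.99):
        self.decay = decay   # multiplier applied to the weight of past data on each update
        self.weight = 0.0    # effective number of points remembered
        self.mean = 0.0
        self.m2 = 0.0        # weighted sum of squared deviations from the mean

    def update(self, x):
        # Fade everything learned so far, then fold in the new point.
        self.weight = self.decay * self.weight + 1.0
        self.m2 *= self.decay
        delta = x - self.mean
        self.mean += delta / self.weight
        self.m2 += delta * (x - self.mean)

    @property
    def std(self):
        return math.sqrt(self.m2 / self.weight) if self.weight > 0 else 0.0


# A system that runs at 100 for a while and then shifts to 200.
model = DecayingModel(decay=0.99)
for _ in range(500):
    model.update(100.0)
for _ in range(500):
    model.update(200.0)
print(round(model.mean, 1), round(model.std, 1))
```

After the shift, the learned mean ends up close to 200 rather than at the all-time average of 150, because the earlier points have been almost entirely aged out.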


This diagram shows how Prelert automatically learns the behaviour of a time series as the data is streamed in (left to right). The diagram also shows Prelert models becoming more accurate over time - Prelert automatically learns the periodicity in the data and then as more data is seen the variance in the models decreases, and the models fit the raw data more accurately.
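
The behaviour described here can be imitated with a toy seasonal model that keeps a separate running mean for each position in the cycle. In this sketch the period is supplied by hand (Prelert learns it automatically), and the class, the synthetic daily cycle and the deterministic pseudo-noise are assumptions made purely for illustration.

```python
import math


class SeasonalProfile:
    """Per-phase running mean and variance for a series with a known period."""

    def __init__(self, period):
        self.period = period
        self.count = [0] * period
        self.mean = [0.0] * period
        self.m2 = [0.0] * period   # per-phase sum of squared deviations

    def update(self, t, x):
        k = t % self.period        # position within the cycle
        self.count[k] += 1
        delta = x - self.mean[k]
        self.mean[k] += delta / self.count[k]
        self.m2[k] += delta * (x - self.mean[k])

    def standard_error(self, k):
        """Uncertainty in the learned mean for phase k."""
        n = self.count[k]
        if n < 2:
            return math.inf
        return math.sqrt(self.m2[k] / (n - 1) / n)


def observe(t):
    # Hypothetical hourly metric: a daily cycle plus small deterministic
    # pseudo-noise in [-5, 5], so the example is reproducible.
    return 100.0 + 50.0 * math.sin(2.0 * math.pi * t / 24) + ((t * 7919) % 11 - 5)


profile = SeasonalProfile(period=24)
for t in range(24 * 3):            # first three days
    profile.update(t, observe(t))
early = profile.standard_error(0)
for t in range(24 * 3, 24 * 30):   # up to thirty days
    profile.update(t, observe(t))
late = profile.standard_error(0)
print(early, late)
```

The per-phase means trace out the daily shape of the data, and the standard error for a given hour is smaller after thirty days than after three, which mirrors the narrowing of the model as more data is seen.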


This technology allows users to answer questions such as “Are there unusual log messages in my log file? Has the performance of my instance changed? Is this user behaving unusually?”, which are difficult to answer using simple keyword searches, simple aggregations or visualisation.

Behavioral Analytics

Behavioral Analytics is less a technical term than a way the market describes the types of analysis performed by this technology. Originally coined to describe the analysis of consumer behaviors when interacting with web sites, the term now refers to the analysis of various types of entities, including users, systems, applications, and other IT and business-related entities.


Prelert is used for these use cases, so at a high level we use the term ‘behavioral analytics’ to describe Prelert’s technology.

Artificial Intelligence? Machine Learning?

Firstly, we don’t really define a hard line between Artificial Intelligence (AI) and Machine Learning (ML). If intelligence is the ability to acquire and apply knowledge and skills, it is very difficult to describe intelligence without talking about learning.


Therefore, whilst Machine Learning is often seen as a subset of Artificial Intelligence, we use both terms to describe Prelert’s technology, as we gain knowledge about a system's behaviour by learning the characteristics of the data.

Machine Learning? Statistics?

We recently presented a deep dive into Prelert and talked mainly about Bayesian frameworks and Statistical Distributions. This led to some confusion about whether Prelert uses Machine Learning.


Machine Learning is a much younger discipline than Statistics, and a lot of Machine Learning is based on Statistical techniques and foundations. A technical description is ‘Statistics is more focussed on statistical inference, that is, explaining and testing properties of a population from which we see a random sample, machine learning is more concerned with making predictions, even if the prediction can not be explained very well (a.k.a. “a black box prediction”).’


At the core of Prelert there are a lot of statistical techniques and methods that create semi-parametric models of the data. However, the overall objective is to learn a model that can make predictions - so we describe Prelert’s technology as Machine Learning in the sense that we have “the ability to learn without being explicitly programmed”.

Supervised Machine Learning? Unsupervised Machine Learning? Deep Learning?

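These terms are more technical. In summary:

  • Supervised Machine Learning infers a model from labeled training data.
  • Unsupervised Machine Learning infers structure from unlabeled data.
  • Deep Learning is machine learning based on multi-layer neural networks, and can be supervised or unsupervised.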

Currently, a primary use case for Prelert is anomaly detection, and the time series data we analyse generally do not have labeled anomalies. As there are no labels, Prelert uses unsupervised learning to model the input data and calculate the anomalousness of a feature based on this model. Prelert’s implementation is not a neural network and so we don’t classify it as deep learning. More details on deep learning’s applicability to this problem space will be provided in a later blog.

Summary

With so many new startups and so much buzz around AI and machine learning, it can be difficult to understand what these technologies actually do and what value they offer. For Prelert and Elastic, the goal is to enhance the current Elastic search experience by initially allowing users to automatically use unsupervised machine learning to find anomalies in time series data, and to understand via counterfactual reasoning what the probable causes are. This is just the start though, and we’re planning to expose more of the value of our models and extend the insights we can give to time series and non-time series data. Exciting times!