The following limitations and known problems apply to the 6.2.4 release of X-Pack:
Categorization identifies static parts of unstructured logs and groups similar
messages together. The default categorization tokenizer assumes English language
log messages. For other languages you must define a different
categorization_analyzer for your job. For more information, see
Categorizing log messages.
Additionally, a dictionary used to influence the categorization process contains only English words. This means categorization might work better in English than in other languages. The ability to customize the dictionary will be added in a future release.
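As a rough sketch, a job configuration that overrides the default tokenizer might be assembled as shown below. The field names `message` and `timestamp`, the bucket span, and the choice of the `whitespace` tokenizer with a `lowercase` filter are illustrative assumptions only; pick an analyzer that suits the language of your logs.

```python
import json

# Hypothetical job body. The field names and analyzer settings are
# assumptions for illustration, not recommendations.
job_config = {
    "analysis_config": {
        "bucket_span": "30m",
        "categorization_field_name": "message",
        # Replaces the default English-oriented categorization tokenizer.
        "categorization_analyzer": {
            "tokenizer": "whitespace",
            "filter": ["lowercase"],
        },
        "detectors": [
            {"function": "count", "by_field_name": "mlcategory"}
        ],
    },
    "data_description": {"time_field": "timestamp"},
}

# Serialize the configuration so it can be used as a request body.
print(json.dumps(job_config, indent=2))
```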
The X-Pack machine learning features in Kibana use pop-ups. You must configure your web browser so that it does not block pop-up windows or create an exception for your Kibana URL.
At this time, you cannot use cross cluster search in either the machine learning APIs or the machine learning features in Kibana.
For more information about cross cluster search, see Cross Cluster Search.
You cannot use machine learning features on tribe nodes. For more information about that type of node, see Tribe node.
In Kibana, Anomaly Explorer charts are not displayed for anomalies
that were due to categorization,
time_of_day functions, or
functions. Those particular results do not display well as time series charts.
The charts are also not displayed for detectors that use script fields. In that case, the original source data cannot be easily searched because it has been somewhat transformed by the script.
The Anomaly Explorer charts can also look odd in circumstances where there is very little data to plot. For example, if there is only one data point, it is represented as a single dot. If there are only two data points, they are joined by a line.
If you start a datafeed and specify an end date, it will close the job when the datafeed stops. This behavior avoids having numerous open one-time jobs.
If you do not specify an end date when you start a datafeed, the job remains open when you stop the datafeed. This behavior avoids the overhead of closing and re-opening large jobs when there are pauses in the datafeed.
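The two behaviors can be summarized in a small sketch. The helper names below are hypothetical; the sketch only builds the body of a start request with `start` and optional `end` times and restates the rule described above. It does not call any API.

```python
from typing import Optional


def start_datafeed_body(start: str, end: Optional[str] = None) -> dict:
    """Build a request body for starting a datafeed (hypothetical helper)."""
    body = {"start": start}
    if end is not None:
        # With an end date, the datafeed stops on its own at that time.
        body["end"] = end
    return body


def job_closes_when_datafeed_stops(body: dict) -> bool:
    """Restate the rule above: the job is closed only if an end date was given."""
    return "end" in body


one_time = start_datafeed_body("2018-04-01T00:00:00Z", "2018-04-02T00:00:00Z")
ongoing = start_datafeed_body("2018-04-01T00:00:00Z")

print(job_closes_when_datafeed_stops(one_time))
print(job_closes_when_datafeed_stops(ongoing))
```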
If you create jobs in Kibana, you must use datafeeds. If the data that you want to analyze is not stored in Elasticsearch, you cannot use datafeeds and therefore you cannot create your jobs in Kibana. You can, however, use the machine learning APIs to create jobs and to send batches of data directly to the jobs. For more information, see Datafeeds and API Quick Reference.
The post data API enables you to send data to a job for analysis. The data that you send to the job must use the JSON format.
For more information about this API, see Post Data to Jobs.
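A minimal sketch of preparing a batch in JSON format follows. The record fields (`timestamp`, `responsetime`) and the helper name are assumptions made for illustration. The records are written one JSON document per line, which is one common way to send several documents in a single request.

```python
import json


def to_post_data_payload(records):
    """Serialize records as newline-separated JSON documents."""
    return "\n".join(json.dumps(record) for record in records) + "\n"


# Made-up sample records; the field names are assumptions.
records = [
    {"timestamp": 1522540800000, "responsetime": 132.2},
    {"timestamp": 1522540860000, "responsetime": 141.7},
]

payload = to_post_data_payload(records)
print(payload)
```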
One of the counts associated with a machine learning job is missing_field_count,
which indicates the number of records that are missing a configured field.
Since jobs analyze JSON data, the
missing_field_count might be misleading.
Missing fields might be expected due to the structure of the data and therefore
do not generate poor results.
For more information about missing_field_count,
see Data Counts Objects.
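To see why the count can be non-zero for perfectly healthy data, consider the sketch below. It is not how the product computes the value; it simply counts records that lack a given field in made-up, sparsely populated JSON documents. The field names are assumptions.

```python
def count_missing(records, field):
    """Count records that do not contain the given field."""
    return sum(1 for record in records if field not in record)


# Made-up records: "error_code" is only present when an error occurred,
# so its absence is expected rather than a sign of bad data.
records = [
    {"timestamp": 1, "status": 200},
    {"timestamp": 2, "status": 500, "error_code": "E42"},
    {"timestamp": 3, "status": 200},
]

print(count_missing(records, "error_code"))
```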
By default, the
terms aggregation returns the buckets for the top ten terms.
You can change this default behavior by setting the size parameter.
If you send pre-aggregated data to a job for analysis, you must ensure
that size is configured correctly. Otherwise, some data might not be analyzed.
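The effect of the default can be illustrated with a small simulation. The helper names and the `host` field are hypothetical, and `Counter.most_common` merely stands in for the top-terms behavior described above; it is not the Elasticsearch implementation.

```python
from collections import Counter


def terms_aggregation(field, size=10):
    """Build a terms aggregation body with an explicit size."""
    return {"terms": {"field": field, "size": size}}


def top_terms(values, size=10):
    """Stand-in for a terms aggregation: keep only the top `size` terms."""
    return Counter(values).most_common(size)


# 25 distinct made-up host names, one document each.
values = [f"host-{i}" for i in range(25)]

kept = top_terms(values)                 # default size of ten
dropped = len(set(values)) - len(kept)   # terms that never reach the job
print(dropped)

# Raising size to cover every distinct term avoids the loss.
print(terms_aggregation("host", size=25))
```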
It is not possible to create an X-Pack machine learning analysis job that uses time-based
index patterns, for example
This applies to the single metric or multi metric job creation wizards in Kibana.
You cannot use the following field names in the
over_field_name properties in a job:
over. This limitation
also applies to those properties when you create advanced jobs in Kibana.
If you create single or multi-metric jobs in Kibana, it might enable some options behind the scenes that you might want to reconsider for large or long-running jobs.
For example, when you create a single metric job in Kibana, it generally enables the
model_plot_config advanced configuration option. That configuration
option causes model information to be stored along with the results and provides
a more detailed view into anomaly detection. It is specifically used by the
Single Metric Viewer in Kibana. When this option is enabled, however, it can
add considerable overhead to the performance of the system. If you have jobs
with many entities, for example data from tens of thousands of servers, storing
this additional model information for every bucket might be problematic. If you
are not certain that you need this option or if you experience performance
issues, edit your job configuration to disable this option.
For more information, see Model Plot Config.
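A sketch of the change described above: the snippet builds a job update body that turns model plot off. The helper name is hypothetical, and how you submit the body (for example, through the job update API or by editing the job in Kibana) is left as an assumption.

```python
import json


def disable_model_plot_update() -> dict:
    """Build a job update body that disables model plot (hypothetical helper)."""
    return {"model_plot_config": {"enabled": False}}


update_body = json.dumps(disable_model_plot_update())
print(update_body)
```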
Likewise, when you create a single or multi-metric job in Kibana, in some cases
it uses aggregations on the data that it retrieves from Elasticsearch. One of the
benefits of summarizing data this way is that Elasticsearch automatically distributes
these calculations across your cluster. This summarized data is then fed into
X-Pack machine learning instead of raw results, which reduces the volume of data that must
be considered while detecting anomalies. However, if you have two jobs, one of
which uses pre-aggregated data and another that does not, their results might
differ. This difference is due to the difference in precision of the input data.
The machine learning analytics are designed to be aggregation-aware and the likely increase
in performance that is gained by pre-aggregating the data makes the potentially
poorer precision worthwhile. If you want to view or change the aggregations
that are used in your job, refer to the
aggregations property in your datafeed.
For more information, see Datafeed Resources.
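The loss of precision can be illustrated with made-up numbers. In the sketch below, six raw values are summarized into two bucket averages, and a spike that is obvious in the raw data is flattened in the summary. The values and bucket size are assumptions chosen for illustration.

```python
from statistics import mean

# Made-up response times with one obvious spike.
raw = [10, 12, 11, 95, 9, 10]

# Summarize every three values into one average, similar in spirit
# to feeding pre-aggregated data into a job.
bucket_size = 3
buckets = [raw[i:i + bucket_size] for i in range(0, len(raw), bucket_size)]
pre_aggregated = [mean(bucket) for bucket in buckets]

print(max(raw))             # highest raw value
print(max(pre_aggregated))  # highest value after summarizing
```

Both inputs describe the same events, but a job that sees only the averages works with a smoother signal, which is why its results can differ from those of a job that sees every raw value.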
When X-Pack security is enabled, a datafeed stores the roles that the user who created or updated it held at that time. If the permissions of those roles are later changed, the datafeed subsequently runs with the new permissions. However, if different roles are assigned to the user after the datafeed is created or updated, the datafeed continues to run with the permissions associated with the original roles. For more information, see Datafeeds.
If you use an
over_field_name property in your job (that is to say, it’s a
population job), you cannot create a forecast. If you try to create a forecast
for this type of job, an error occurs. For more information about forecasts,
see Forecasting the Future.
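Before requesting a forecast, you could check a job configuration for this condition yourself. The helper below is a hypothetical sketch that covers only the population job rule described above; it does not check any other restriction. The sample field names are assumptions.

```python
def is_population_job(job: dict) -> bool:
    """Return True if any detector in the job uses over_field_name."""
    detectors = job.get("analysis_config", {}).get("detectors", [])
    return any("over_field_name" in detector for detector in detectors)


# Made-up job configurations.
population_job = {
    "analysis_config": {
        "detectors": [{"function": "count", "over_field_name": "clientip"}]
    }
}
simple_job = {
    "analysis_config": {
        "detectors": [{"function": "mean", "field_name": "responsetime"}]
    }
}

print(is_population_job(population_job))
print(is_population_job(simple_job))
```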
If you use any of the following analytical functions in your job, you cannot create a forecast:
If you try to create a forecast for this type of job, an error occurs. For more information about any of these functions, see Function Reference.