A new node.voting-only role has been
introduced that allows nodes to participate in elections even though they are
not eligible to become the master.
The benefit is that these nodes still help with high availability while
requiring less CPU and heap than master nodes.
The node.voting-only role is only available with the default
distribution of Elasticsearch.
A new Analyzer reload API lets you reload the definitions of search-time analyzers and their associated resources. A common use case for this API is reloading search-time synonyms. In earlier versions of Elasticsearch, users could force synonyms to be reloaded only by closing the index and then opening it again. With this new API, synonyms can be updated without closing the index.
The Analyzer reload API is only available with the default distribution of Elasticsearch.
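As a rough illustration of the synonym use case, the sketch below builds the two requests involved: creating an index whose search analyzer uses a synonym filter marked as updateable, and then calling the reload endpoint after the synonyms file changes. The index, filter, analyzer, field, and file names are hypothetical, and nothing is sent over the network; the code only prints the requests you would issue with your client of choice.

```python
import json

# Hypothetical names, chosen only for this illustration.
INDEX = "my_index"

index_settings = {
    "settings": {
        "analysis": {
            "filter": {
                "my_synonyms": {
                    "type": "synonym_graph",
                    "synonyms_path": "analysis/synonyms.txt",
                    # Marks the filter as reloadable; such filters are
                    # meant for search-time analyzers only.
                    "updateable": True,
                }
            },
            "analyzer": {
                "my_search_analyzer": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "my_synonyms"],
                }
            },
        }
    },
    "mappings": {
        "properties": {
            "title": {
                "type": "text",
                "analyzer": "standard",
                "search_analyzer": "my_search_analyzer",
            }
        }
    },
}


def create_index_request(index):
    """Return (method, path, body) for creating the index."""
    return ("PUT", f"/{index}", index_settings)


def reload_request(index):
    """Return (method, path, body) for reloading search analyzers."""
    return ("POST", f"/{index}/_reload_search_analyzers", None)


for method, path, body in (create_index_request(INDEX), reload_request(INDEX)):
    print(method, path)
    if body is not None:
        print(json.dumps(body, indent=2))
```

The reload request takes no body: after editing the synonyms file on each node, issuing it against the open index is intended to pick up the new definitions.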
A new flattened field type has been added, which can index
JSON objects into a single field. This helps avoid issues caused by
having too many fields in mappings, at the cost of more limited search
functionality. The flattened field type is only available with the
default distribution of Elasticsearch.
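A minimal sketch of how this might look in practice: a mapping that declares one field of type flattened, a sample document, and a small helper that lists the leaf values of the object. The field name `labels`, the sample document, and the `leaf_values` helper are hypothetical. The helper is only a conceptual illustration of collapsing an object into one field, under the assumption that leaf values are treated as keyword-like strings; it is not Elasticsearch's implementation.

```python
import json

# Mapping body that would be sent when creating the index.
mapping = {"mappings": {"properties": {"labels": {"type": "flattened"}}}}

# Sample document: the whole "labels" object goes into one mapped field,
# so new keys under it do not add new fields to the mapping.
document = {
    "title": "Results are not sorted correctly",
    "labels": {
        "priority": "urgent",
        "release": ["v1.2.5", "v1.3.0"],
        "timestamp": {"created": 1541458026, "closed": 1541457010},
    },
}


def leaf_values(obj, prefix=""):
    """Yield (dotted_key, value_as_string) for every leaf of a JSON object."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield from leaf_values(value, f"{prefix}.{key}" if prefix else key)
    elif isinstance(obj, list):
        for item in obj:
            yield from leaf_values(item, prefix)
    else:
        yield prefix, str(obj)


leaves = list(leaf_values(document["labels"]))
print(json.dumps(mapping))
for key, value in leaves:
    print(key, "=", value)
```

Because every leaf ends up as a string in this sketch, numeric values such as the timestamps lose their numeric type, which is one way to picture the more limited search mentioned above.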
These functions are only available with the default distribution of Elasticsearch.
Read aliases are now replicated via cross-cluster replication. Note that write aliases are still not replicated, since they only make sense for indices that are being written to, and follower indices do not receive direct writes.
Document-level security was using an unbounded cache for the set of visible documents. This could lead to a memory leak when using a templated query as a role query. The cache has been fixed to evict based on memory usage and has a limit of 50MB.
Terms aggregations generally need to build global ordinals in order to run. Unfortunately, this operation became more memory-intensive in 6.0 due to the move to doc-value iterators, which was made to improve the handling of sparse fields. The memory pressure of global ordinals is now back to a level similar to that of pre-6.0 releases.
[beta] This functionality is in beta and is subject to change. The design and code is less mature than official GA features and is being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
Transforms are a core new feature in Elasticsearch that enable you to transform an existing index to a secondary, summarized index. Transforms enable you to pivot your data and create entity-centric indices that can summarize the behavior of an entity. This organizes the data into an analysis-friendly format.
Transforms were originally available in 7.2. With 7.3, they can run either as a single batch transform or continuously, incorporating new data as it is ingested.
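To make the idea of pivoting into an entity-centric summary concrete, here is a small pure-Python sketch. It groups event-style documents by an entity key and computes a few aggregates per entity. The sample data, field names, and the `pivot` helper are all hypothetical, and the "continuous" step is approximated by simply recomputing the summary after new data arrives, which is a simplification and not how Elasticsearch does it.

```python
from collections import defaultdict

# Hypothetical source "index": one document per order.
orders = [
    {"customer_id": "c1", "total": 20.0},
    {"customer_id": "c2", "total": 5.0},
    {"customer_id": "c1", "total": 30.0},
]


def pivot(docs, group_by, value_field):
    """Summarize docs into one entry per distinct value of `group_by`."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[group_by]].append(doc[value_field])
    return {
        key: {
            "order_count": len(values),
            "total_spent": sum(values),
            "max_order": max(values),
        }
        for key, values in groups.items()
    }


# Batch mode: summarize the data once.
summary = pivot(orders, "customer_id", "total")
print(summary)

# Continuous mode (simplified): new data is ingested and the summary
# is refreshed so that it reflects the new documents.
orders.append({"customer_id": "c2", "total": 15.0})
updated = pivot(orders, "customer_id", "total")
print(updated)
```

Each key of the result corresponds to one entity (here, a customer), which is the shape a summarized destination index would have.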
Data frames enable new possibilities for machine learning analysis (such as outlier detection), but they can also be useful for other types of visualizations and custom types of analysis.
The goal of outlier detection is to find the most unusual data points in an index. We analyse the numerical fields of each data point (document in an index) and annotate them with how unusual they are.
We use unsupervised outlier detection, which means there is no need to provide a training data set to teach outlier detection to recognize outliers. In practice, this is achieved by using an ensemble of distance-based and density-based techniques to identify the data points that are most different from the bulk of the data in the index. We assign each analysed data point an outlier score, which captures how different the entity is from other entities in the index.
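The following sketch shows one simple distance-based technique of the kind described above: scoring each point by its mean distance to its k nearest neighbours, then normalizing the scores. This is a deliberately simplified stand-in and not the ensemble Elasticsearch uses; the function name, the choice of k, the normalization, and the sample points are all assumptions made for illustration.

```python
import math


def knn_outlier_scores(points, k=2):
    """Score each point by its mean distance to its k nearest neighbours.

    Scores are divided by the largest score, so they fall in [0, 1] and
    the most unusual point gets 1.0. No labels or training set are needed.
    """
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    top = max(scores)
    return [s / top if top > 0 else 0.0 for s in scores]


# Four points cluster together; the last one sits far from the bulk.
points = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (8.0, 8.0)]
scores = knn_outlier_scores(points)
most_unusual = max(range(len(points)), key=scores.__getitem__)

for point, score in zip(points, scores):
    print(point, round(score, 3))
print("most unusual:", points[most_unusual])
```

A density-based technique would instead compare how crowded each point's neighbourhood is relative to its neighbours' neighbourhoods; an ensemble combines several such scores.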
In addition to the new outlier detection functionality, we are introducing the evaluate data frame analytics API, which enables you to compute a range of performance metrics such as confusion matrices, precision, recall, the receiver operating characteristic (ROC) curve, and the area under the ROC curve. If you are running outlier detection on a source index that has already been labeled to indicate which points are truly outliers and which are normal, you can use the evaluate data frame analytics API to assess the performance of the outlier detection analytics on your dataset.
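As a sketch of what some of these metrics measure, the code below computes a confusion matrix, precision, and recall from ground-truth labels and outlier scores at a chosen threshold. The function names, the sample labels and scores, and the threshold are hypothetical; this only illustrates the definitions and does not call the evaluate API itself.

```python
def confusion_matrix(labels, scores, threshold):
    """Count tp/fp/tn/fn, predicting 'outlier' when score >= threshold."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for is_outlier, score in zip(labels, scores):
        predicted = score >= threshold
        if predicted and is_outlier:
            counts["tp"] += 1
        elif predicted and not is_outlier:
            counts["fp"] += 1
        elif not predicted and is_outlier:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def precision(cm):
    """Fraction of predicted outliers that are true outliers."""
    predicted = cm["tp"] + cm["fp"]
    return cm["tp"] / predicted if predicted else 0.0


def recall(cm):
    """Fraction of true outliers that were predicted as outliers."""
    actual = cm["tp"] + cm["fn"]
    return cm["tp"] / actual if actual else 0.0


# Hypothetical labeled data: True marks a genuine outlier.
labels = [False, False, True, False, True]
scores = [0.1, 0.6, 0.9, 0.2, 0.4]

cm = confusion_matrix(labels, scores, threshold=0.5)
print(cm, precision(cm), recall(cm))
```

Sweeping the threshold from 1 down to 0 and plotting the true positive rate against the false positive rate at each step gives the ROC curve mentioned above.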