Here are the highlights of what’s new and improved in Elasticsearch 8.0!
7.x REST API compatibility
8.0 introduces several breaking changes to the Elasticsearch REST APIs. While it’s important to update your application to account for these changes, finding and updating every API call in a single upgrade can be painful and error-prone. To make this process easier, we’ve added support for 7.x compatibility headers to our REST APIs. In many cases, these optional headers let you make 7.x-compatible requests to an 8.0 cluster and receive 7.x-compatible responses.
While we still recommend you update your application to use native 8.0 requests and responses, the 7.x API compatibility headers let you safely make these changes over a longer period of time.
For more information about the headers and how to use them, see REST API compatibility.
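As a rough illustration of what a 7.x-compatible request might look like from client code, the sketch below builds (but does not send) a search request that carries the compatibility media type in its `Accept` and `Content-Type` headers. The base URL, index name, and helper names are hypothetical and only serve the example; check the REST API compatibility documentation for the authoritative header format.

```python
from urllib.request import Request

# Media type carried by the compatibility headers; `major` is the API
# version the client wants the cluster to emulate.
COMPAT_MEDIA_TYPE = "application/vnd.elasticsearch+json;compatible-with={major}"


def compat_headers(major: int = 7, has_body: bool = True) -> dict:
    """Return headers asking for `major`.x-compatible requests and responses."""
    media_type = COMPAT_MEDIA_TYPE.format(major=major)
    headers = {"Accept": media_type}
    if has_body:
        # Requests with a body declare the same media type for that body.
        headers["Content-Type"] = media_type
    return headers


def build_search_request(base_url: str, index: str, body: bytes) -> Request:
    """Build a search request with compatibility headers (nothing is sent)."""
    return Request(
        f"{base_url}/{index}/_search",
        data=body,
        headers=compat_headers(),
        method="POST",
    )


# Hypothetical cluster address and index name, for illustration only.
request = build_search_request(
    "https://localhost:9200", "my-index", b'{"query": {"match_all": {}}}'
)
```

Keeping the headers in one helper means they can be removed in a single place once the application has moved to native 8.0 requests and responses.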
Security features are enabled and configured by default
Running Elasticsearch without security leaves your cluster exposed to anyone who can send network traffic to Elasticsearch. In previous versions, you had to explicitly enable the Elasticsearch security features such as authentication, authorization, and network encryption (TLS). Starting in Elasticsearch 8.0, security is enabled and configured by default when you start Elasticsearch for the first time.
At startup, we generate enrollment tokens that you use to connect a Kibana instance or enroll additional nodes in your secured Elasticsearch cluster, without having to generate security certificates or update YAML configuration files. Just use the generated enrollment token when starting new nodes or Kibana instances, and the Elastic Stack handles all of the security configuration for you. Out of the box, you’ll get:
- User authentication
- User authorization
- Encrypted internode communication with TLS
- Encrypted communication between Elasticsearch and Kibana with TLS
Need a new enrollment token? Use the enrollment token tool to create enrollment tokens for Elasticsearch nodes and Kibana instances.
Better protection for system indices
System indices store configurations and internal data for Elastic features. Generally, system indices are reserved only for internal use by these features. While possible, directly accessing or changing a system index can cause instability and other issues.
In 8.0, we’ve made several changes to protect system indices from direct access.
To access a system index, you must now be explicitly granted permission to do so. The
superuser role also no longer gives write access to system indices. As a
result, the built-in elastic superuser can’t change system indices by default.
If available, use Kibana or the associated Elasticsearch APIs to manage data for a feature rather than accessing a system index. If you attempt to directly access a system index, Elasticsearch will return a warning in the header of API responses and in the deprecation logs.
New kNN search API
With 8.0, we’re introducing a technical preview of the kNN search API.
Using dense_vector fields, a k-nearest neighbor (kNN)
search finds the k nearest vectors to a query vector, as measured by a
similarity metric. kNN is commonly used to power recommendation engines and to rank
relevance based on natural language processing (NLP) algorithms.
Previously, Elasticsearch only supported exact kNN searches using a query
with a vector function. While this method guarantees accurate results, it often
results in slow searches and doesn’t scale well with large datasets. In exchange
for slower indexing and imperfect accuracy, the new kNN search API lets you run
approximate kNN searches on larger datasets and at faster speeds.
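To make the trade-off concrete, here is a minimal sketch of an exact kNN search in plain Python. It is not how Elasticsearch implements either the exact or the approximate search; it only illustrates the idea of scoring every stored vector against the query vector with a similarity metric (cosine similarity is assumed here) and keeping the top k. The document IDs and vectors are made up, and all vectors are assumed to be non-zero.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_knn(query, vectors, k):
    """Return the k (doc_id, score) pairs most similar to `query`.

    Every stored vector is scored, so the cost grows linearly with the
    number of documents. That is why exact search is accurate but slow
    on large datasets.
    """
    scored = (
        (cosine_similarity(query, vec), doc_id)
        for doc_id, vec in vectors.items()
    )
    return [(doc_id, score) for score, doc_id in heapq.nlargest(k, scored)]


# Hypothetical two-dimensional vectors, for illustration only.
vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
result = exact_knn([1.0, 0.1], vectors, k=2)
```

An approximate search avoids scoring every vector by consulting an index structure built at indexing time, which is where the slower indexing and imperfect accuracy mentioned above come from.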
Storage savings for match_only_text and text fields
We’ve updated inverted indices, an internal data structure, to use a more
space-efficient encoding. This change will benefit
match_only_text fields and, to a lesser extent,
text fields. In our
benchmarks using application logs, this translated into a 14.4% reduction in
the index size of the
message field and
an overall 3.5% reduction of the on-disk footprint.
New indices pick up this change automatically, as do existing indices for every new segment they write.
Faster indexing of geo_shape and range fields
We’ve optimized indexing speeds for multi-dimensional points, an internal data
structure used for
geo_shape and range fields. Lucene-level
benchmarks reported 10-15% faster indexing for these field types. Elasticsearch indices
and data streams that mostly consist of these fields may see noticeable
improvements to indexing speed.
PyTorch model support for natural language processing (NLP)
You can now upload PyTorch models that were trained outside Elasticsearch and use them for inference at ingest time. Third-party model support brings modern natural language processing (NLP) and search use cases to the Elastic Stack, such as:
- Named entity recognition (NER)
- Text classification
- Text embedding
- Zero-shot classification