December 23, 2020

An introduction to the Elastic data stream naming scheme

With Elastic 7.9, the Elastic Agent and Fleet were released, along with a new way to structure indices and data streams in Elasticsearch for time series data. In this blog post, we'll give an overview of the Elastic data stream naming scheme and how it works. This is the first in a series of blog posts about the scheme.

Elastic data stream naming scheme

The Elastic data stream naming scheme is designed for time series data. It splits datasets into separate data streams, each named from the following three parts:

type: Generic type describing the data
dataset: Describes the data ingested and its structure
namespace: User-configurable arbitrary grouping

These three parts are joined with a “-”, resulting in data stream names like logs-nginx.access-production. Because “-” is the separator, none of the three parts may itself contain a “-” character. This means all data streams are named in the following way:

{type}-{dataset}-{namespace}

Both dataset and namespace have default values: dataset=generic and namespace=default. With Elastic Agent, if a user simply starts ingesting a log file, the data ends up in logs-generic-default.
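The naming rule and its defaults can be sketched in a few lines of Python. This is only an illustration of the convention described above; the helper names data_stream_name and split_data_stream_name are hypothetical and not part of any Elastic client library.

```python
DEFAULT_DATASET = "generic"
DEFAULT_NAMESPACE = "default"


def data_stream_name(type_, dataset=DEFAULT_DATASET, namespace=DEFAULT_NAMESPACE):
    """Return "{type}-{dataset}-{namespace}", rejecting parts that contain "-"."""
    for label, value in (("type", type_), ("dataset", dataset), ("namespace", namespace)):
        if not value:
            raise ValueError(f"{label} must not be empty")
        if "-" in value:
            raise ValueError(f"{label} must not contain '-': {value!r}")
    return f"{type_}-{dataset}-{namespace}"


def split_data_stream_name(name):
    """Split a name back into (type, dataset, namespace).

    This only works because "-" is forbidden inside each part.
    """
    parts = name.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a {{type}}-{{dataset}}-{{namespace}} name: {name!r}")
    return tuple(parts)


print(data_stream_name("logs", "nginx.access", "production"))
print(data_stream_name("logs"))  # falls back to dataset=generic, namespace=default
```

The round trip through split_data_stream_name shows why the “-” character is reserved: without that restriction, a name could not be split back into its three parts unambiguously.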

To get the full benefit of the Elastic data stream naming scheme, each document must contain the following three fields:

data_stream.type
data_stream.dataset
data_stream.namespace

More details about these fields can be found in the Elastic Common Schema (ECS). The fields are mapped as constant_keyword fields, which makes filtering on them efficient: shards whose constant value cannot match are skipped, reducing the number of shards that have to be queried.
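As a rough sketch, a document carrying the three fields, and a mapping fragment that declares them as constant_keyword, could look like the following. The helper names and the sample log line are made up for illustration, and the mapping fragment is an assumption about shape rather than a copy of the templates Elasticsearch ships.

```python
import json


def add_data_stream_fields(doc, type_, dataset="generic", namespace="default"):
    """Return a copy of doc that carries the three data_stream.* fields."""
    enriched = dict(doc)
    enriched["data_stream"] = {
        "type": type_,
        "dataset": dataset,
        "namespace": namespace,
    }
    return enriched


def target_data_stream(doc):
    """Derive the data stream name a document belongs to from its fields."""
    ds = doc["data_stream"]
    return f'{ds["type"]}-{ds["dataset"]}-{ds["namespace"]}'


# Illustrative mapping fragment: every document in one data stream shares the
# same three values, which is what makes constant_keyword a good fit.
DATA_STREAM_MAPPING = {
    "properties": {
        "data_stream": {
            "properties": {
                "type": {"type": "constant_keyword"},
                "dataset": {"type": "constant_keyword"},
                "namespace": {"type": "constant_keyword"},
            }
        }
    }
}

doc = add_data_stream_fields(
    {"message": "GET /index.html 200"}, "logs", "nginx.access", "production"
)
print(json.dumps(doc, indent=2))
print(target_data_stream(doc))
```

Keeping the field values and the data stream name in agreement is the point of the exercise: the name can always be derived from the document, and the document can always be filtered by the parts of the name.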

Benefits of the Elastic data stream naming scheme

The Elastic data stream naming scheme has a few benefits over previous indexing strategies used by Beats and Logstash. Instead of very few large indices, many smaller but denser data streams are used. A short summary of the benefits:

Reduced number of fields per index: As the data is split up per dataset across multiple data streams, each data stream contains a minimal set of fields. This leads to better space efficiency and faster queries.
More granular control of the data: Having the data split up by dataset and namespace allows granular control over rollover, retention, and security permissions.
Flexibility: Users can use the namespace to divide and organize data in any way they want.
Better curated experiences: Due to the common structure of the Elastic data stream naming scheme, it is possible to build a better curated experience on top of the data streams.
Fewer ingest permissions needed: Before, the setup of templates and ingest pipelines was performed by the ingestion tool itself, such as Beats. As this now happens in a centralized way, the ingestion tool only needs permissions to append data.
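To make the granular-control and flexibility points concrete, the sketch below shows how wildcard patterns over the three name parts select groups of data streams, for example when scoping a retention policy or a security role. The data stream names are invented, and Python's fnmatchcase is used only as a local stand-in for wildcard matching on these simple names.

```python
from fnmatch import fnmatchcase

# Hypothetical data streams that follow {type}-{dataset}-{namespace}.
DATA_STREAMS = [
    "logs-nginx.access-production",
    "logs-nginx.error-production",
    "logs-nginx.access-staging",
    "metrics-system.cpu-production",
    "logs-generic-default",
]


def select(pattern, names=DATA_STREAMS):
    """Return the data stream names that match a wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# All logs in the production namespace, regardless of dataset.
print(select("logs-*-production"))
# All nginx log datasets in every namespace.
print(select("logs-nginx.*-*"))
# Everything, logs and metrics alike, in the staging namespace.
print(select("*-*-staging"))
```

Because each part occupies a fixed position in the name, one pattern can target a type, a dataset family, or a namespace without touching the others.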

Usage of the Elastic data stream naming scheme

The Elastic data stream naming scheme is supported in Elastic Stack 7.9 and later, as it requires support for data streams, the new Elasticsearch component templates, and the constant_keyword field type. Index templates for logs-*-* and metrics-*-* ship with Elasticsearch >=7.9. All data shipped with the Elastic Agent uses the Elastic data stream naming scheme. To use it with any other data shipper, follow the naming structure and add the data_stream fields to each document.
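For a custom shipper, the two requirements above (follow the naming structure, add the data_stream fields) might be sketched as a function that builds the newline-delimited body of a bulk request. This is an assumption-laden illustration, not an official client: the function name bulk_create_body and the myapp.access dataset are hypothetical, nothing is sent over the network, and it relies on the understanding that data streams accept appended documents through the create action and expect an @timestamp field.

```python
import json


def bulk_create_body(docs):
    """Build an NDJSON bulk body that appends each doc to its data stream.

    The target name is derived from the doc's own data_stream.* fields, so the
    name and the fields cannot drift apart.
    """
    lines = []
    for doc in docs:
        ds = doc["data_stream"]
        target = f'{ds["type"]}-{ds["dataset"]}-{ds["namespace"]}'
        # Action line: "create" appends a new document to the target.
        lines.append(json.dumps({"create": {"_index": target}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # Bulk bodies are newline-delimited and end with a trailing newline.
    return "\n".join(lines) + "\n"


sample = {
    "@timestamp": "2020-12-23T00:00:00Z",
    "message": "user logged in",
    "data_stream": {
        "type": "logs",
        "dataset": "myapp.access",  # hypothetical dataset name
        "namespace": "default",
    },
}

print(bulk_create_body([sample]), end="")
```

Since the shipper only appends documents and never creates templates or pipelines, it fits the reduced-permissions model described in the benefits section.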

Summary

This post gave a short overview of the Elastic data stream naming scheme. In follow-up blog posts, we'll dive into the technical details of how it works behind the scenes, how the Elastic Agent uses it, and how you can use it for your own data. For additional insight, watch the deep dive into the new Elastic indexing strategy on the Elastic Community YouTube channel.
