
December 23, 2020

An introduction to the Elastic data stream naming scheme

Nicolas Ruflin

With Elastic 7.9, the Elastic Agent and Fleet were released, along with a new way to structure indices and data streams in Elasticsearch for time series data. In this blog post, we'll give an overview of the Elastic data stream naming scheme and how it works. This is the first in a series of blog posts around the Elastic data stream naming scheme.

Elastic data stream naming scheme

The Elastic data stream naming scheme is made for time series data and consists of splitting datasets into different data streams using the following naming convention.

type: Generic type describing the data
dataset: Describes the data ingested and its structure
namespace: User-configurable arbitrary grouping

These three parts are joined with a “-”, resulting in data stream names like logs-nginx.access-production. The “-” character is not allowed inside any of the three parts. This means all data streams are named in the following way:

{type}-{dataset}-{namespace}

Both dataset and namespace have a default value: dataset=generic and namespace=default. In the case of Elastic Agent, if a user simply starts ingesting a log file, the data ends up in logs-generic-default.

To get the full benefit of the Elastic data stream naming scheme, each document must contain the following three fields:
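The naming rule and its defaults can be sketched in a few lines of Python. This is only an illustration: the helper names data_stream_name and split_data_stream_name are hypothetical, and the only validation applied is the "-" restriction described above. Elasticsearch enforces further index naming rules that are not modeled here.

```python
def data_stream_name(type_: str, dataset: str = "generic", namespace: str = "default") -> str:
    """Build '{type}-{dataset}-{namespace}' (hypothetical helper, not an Elastic API)."""
    parts = {"type": type_, "dataset": dataset, "namespace": namespace}
    for label, value in parts.items():
        if not value:
            raise ValueError(f"{label} must not be empty")
        # The scheme reserves '-' as the separator between the three parts.
        if "-" in value:
            raise ValueError(f"{label} must not contain '-': {value!r}")
    return "-".join(parts.values())


def split_data_stream_name(name: str) -> dict:
    """Split a name back into its parts; unambiguous because parts cannot contain '-'."""
    parts = name.split("-")
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a {{type}}-{{dataset}}-{{namespace}} name: {name!r}")
    return dict(zip(("type", "dataset", "namespace"), parts))
```

Because "-" cannot appear inside a part, splitting on it always recovers exactly the three components, which is what makes the names easy to parse and match.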

data_stream.type
data_stream.dataset
data_stream.namespace

More details about these fields can be found in the Elastic Common Schema (ECS). The above fields are mapped as constant_keyword fields, which makes filtering on them efficient because it reduces the number of shards that have to be queried.
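As a rough illustration of what such a document looks like, the sketch below attaches the three data_stream fields to an event and derives the matching data stream name from them. The function names are hypothetical. Adding @timestamp is an assumption beyond the text above, included because data streams are built around a timestamp field.

```python
from datetime import datetime, timezone


def with_data_stream_fields(event: dict, type_: str,
                            dataset: str = "generic",
                            namespace: str = "default") -> dict:
    """Return a copy of `event` carrying the three data_stream.* fields."""
    doc = dict(event)
    # Assumption: time series documents carry an @timestamp; set one if missing.
    doc.setdefault("@timestamp", datetime.now(timezone.utc).isoformat())
    doc["data_stream"] = {"type": type_, "dataset": dataset, "namespace": namespace}
    return doc


def target_data_stream(doc: dict) -> str:
    """Derive '{type}-{dataset}-{namespace}' from the document's own fields."""
    ds = doc["data_stream"]
    return f'{ds["type"]}-{ds["dataset"]}-{ds["namespace"]}'


doc = with_data_stream_fields({"message": "GET /index.html 200"},
                              "logs", "nginx.access", "production")
print(target_data_stream(doc))
```

Keeping the fields in the document and the name of the data stream in sync is the point of the scheme: the same three values identify both.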

Benefits of the Elastic data stream naming scheme

The Elastic data stream naming scheme has a few benefits over previous indexing strategies used by Beats and Logstash. Instead of very few large indices, many smaller but denser data streams are used. A short summary of the benefits:

Reduced number of fields per index: As the data is split up per dataset across multiple data streams, each data stream contains a minimal set of fields. This leads to better space efficiency and faster queries.
More granular control of the data: Having the data split up by dataset and namespace allows granular control over rollover, retention, and security permissions.
Flexibility: Users can use the namespace to divide and organize data in any way they want.
Better curated experiences: Due to the common structure of the Elastic data stream naming scheme, it is possible to build a better curated experience on top of the data streams.
Fewer ingest permissions needed: Previously, the data shipper itself had to set up templates and ingest pipelines. As this now happens in a centralized way, the ingestion tool only needs permissions to append data.
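The granular control described above follows from the predictable names: a wildcard pattern can address exactly one type, dataset, or namespace. The sketch below mimics that selection locally. It uses Python's fnmatch as a stand-in for Elasticsearch's own wildcard matching, which is an approximation, and the data stream names listed are made up for the example.

```python
from fnmatch import fnmatchcase


def select_data_streams(names: list, pattern: str) -> list:
    """Return the names matching a wildcard pattern such as 'logs-*-production'."""
    return [name for name in names if fnmatchcase(name, pattern)]


# Hypothetical data streams, for illustration only.
streams = [
    "logs-nginx.access-production",
    "logs-nginx.access-staging",
    "logs-generic-default",
    "metrics-system.cpu-production",
]

print(select_data_streams(streams, "logs-*-production"))     # all production logs
print(select_data_streams(streams, "*-*-production"))        # everything in production
print(select_data_streams(streams, "logs-nginx.access-*"))   # one dataset, any namespace
```

A retention policy or a security role scoped to a pattern like logs-*-production would then cover only the production logs, without touching staging data or metrics.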

Usage of the Elastic data stream naming scheme

The Elastic data stream naming scheme is supported in Elastic Stack 7.9 and later, as it requires support for data streams, the new Elasticsearch component templates, and the constant_keyword field type. Index templates for logs-*-* and metrics-*-* ship with Elasticsearch 7.9 and later. All data shipped with the Elastic Agent uses the Elastic data stream naming scheme. To use it with any other data shipper, follow the naming structure and add the data_stream fields to each document.
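For a custom shipper, one possible approach is to send documents through the Elasticsearch bulk API, using the create action that data streams require for new documents. The sketch below only builds the newline-delimited request body and sends nothing. The function name and the sample document are hypothetical, and each document is assumed to already carry its data_stream fields.

```python
import json


def bulk_create_body(docs: list) -> str:
    """Build an NDJSON body for POST /_bulk, one 'create' action per document."""
    lines = []
    for doc in docs:
        ds = doc["data_stream"]
        # Route each document to the data stream named by its own fields.
        target = f'{ds["type"]}-{ds["dataset"]}-{ds["namespace"]}'
        lines.append(json.dumps({"create": {"_index": target}}))
        lines.append(json.dumps(doc))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n" if lines else ""


body = bulk_create_body([
    {
        "@timestamp": "2020-12-23T10:00:00Z",
        "message": "service started",
        "data_stream": {"type": "logs", "dataset": "myapp", "namespace": "default"},
    }
])
print(body)
```

With the index templates for logs-*-* in place, a document routed to logs-myapp-default this way should land in a data stream of that name, provided the shipper's credentials allow appending to it.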

Summary

This is a short summary of the Elastic data stream naming scheme. In follow-up blog posts, we'll dive into the technical details on how it works behind the scenes, how it is used by the Elastic Agent in detail, and how you can use it for your own benefit. For additional insight, watch the deep dive into the new Elastic indexing strategy on the Elastic Community YouTube channel.
