
How to implement Prometheus long-term storage using Elasticsearch

Prometheus plays a significant role in the observability area. An increasing number of applications use Prometheus exporters to expose performance and monitoring data, which is later scraped by a Prometheus server.

However, when it comes to storage, Prometheus has limited scalability and durability because its local storage is confined to a single node. Rather than solving clustered storage within Prometheus itself, Prometheus provides interfaces that allow integrating with remote storage systems. In this blog, we're going to look at how simple and effective it is to integrate Prometheus with Elasticsearch for long-term storage.

Elasticsearch is a powerful, scalable, and efficient storage system that can store various types of data (text, numerics, geo, structured, unstructured, etc.) in the long term. The scaling capabilities and search performance of Elasticsearch make it a prime option when it comes to storing monitoring data. Plus, combining Elasticsearch with Kibana gives users dynamic insights into their data.

Integrating Prometheus with Elasticsearch for long-term storage

Prometheus integrates with remote storage systems in two ways:

  • Prometheus can write samples that it ingests to a remote URL in a standardized format
  • Prometheus can read (back) sample data from a remote URL in a standardized format

The read and write protocols both use a snappy-compressed protocol buffer encoding over HTTP. For details on the request and response messages, see the remote storage protocol buffer definitions.

To use Elasticsearch as a remote storage system for Prometheus, the user can employ the official Metricbeat Prometheus module, and more specifically its remote_write metricset. The user can configure the module in Metricbeat like this:

# Metrics sent by a Prometheus server using remote_write option
- module: prometheus
  metricsets: ["remote_write"]
  host: "localhost"
  port: "9201"

And Prometheus, in its configuration file, to send samples to that endpoint:

remote_write:
  - url: "http://localhost:9201/write"

In addition, the user can set up secure settings for the server using TLS/SSL.
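As a rough sketch of what a TLS setup could look like, the snippet below adds server-side SSL settings to the Metricbeat module and points Prometheus at an https URL. All certificate and key paths are hypothetical placeholders, not values from this post, and should be replaced with your own files.

```yaml
# --- Metricbeat module section: serve the remote_write endpoint over TLS ---
# The paths below are placeholders (assumptions), not real files.
- module: prometheus
  metricsets: ["remote_write"]
  host: "localhost"
  port: "9201"
  ssl.certificate: "/etc/pki/server/cert.pem"
  ssl.key: "/etc/pki/server/cert.key"
---
# --- prometheus.yml: write to the TLS endpoint ---
remote_write:
  - url: "https://localhost:9201/write"
    tls_config:
      # CA used to verify the Metricbeat server certificate (placeholder path)
      ca_file: "/etc/prometheus/ca.pem"
```

The two sections belong in separate files (the Metricbeat configuration and prometheus.yml); they are shown together here only to make the pairing of the two sides visible.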

With this configuration, Metricbeat starts an HTTP server on startup that waits for metrics from a Prometheus server. When metrics are received, Metricbeat processes them and stores them in Elasticsearch. The user can then retrieve and analyze the metrics using Kibana's visualizations and dashboards.
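To make the flow from Prometheus through Metricbeat into Elasticsearch concrete, here is a minimal sketch of a complete metricbeat.yml. It assumes Elasticsearch and Kibana are running locally on their default ports (9200 and 5601); those hosts are assumptions for illustration and will differ in most real deployments.

```yaml
# metricbeat.yml: minimal sketch, assuming local Elasticsearch and Kibana
metricbeat.modules:
  # Listen for samples pushed by Prometheus via remote_write
  - module: prometheus
    metricsets: ["remote_write"]
    host: "localhost"
    port: "9201"

# Where the received metrics are stored (assumed local cluster)
output.elasticsearch:
  hosts: ["localhost:9200"]

# Kibana endpoint used when loading dashboards (assumed local instance)
setup.kibana:
  host: "localhost:5601"
```

If the cluster requires authentication, credentials would also need to be added under output.elasticsearch; they are left out here to keep the sketch short.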

It’s also worth mentioning that Metricbeat, Elasticsearch, and Kibana can run out of the box in Kubernetes environments. This can be achieved by using Elastic Cloud on Kubernetes (ECK), the official Elasticsearch and Kibana operator, along with the predefined Metricbeat manifests. Putting all these together gives the user the opportunity to quickly set up an Elastic-based remote storage integration for their Prometheus servers that run on Kubernetes.
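For orientation, the manifests below sketch what a small ECK-managed Elasticsearch and Kibana pair might look like. The resource name "quickstart", the single-node sizing, and the 7.7.0 version are illustrative assumptions, and the example presumes the ECK operator is already installed in the cluster.

```yaml
# Elasticsearch cluster managed by the ECK operator (illustrative sizing)
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart          # hypothetical name
spec:
  version: 7.7.0
  nodeSets:
    - name: default
      count: 1              # single node, for experimentation only
      config:
        node.store.allow_mmap: false
---
# Kibana instance connected to the cluster above
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: quickstart          # hypothetical name
spec:
  version: 7.7.0
  count: 1
  elasticsearchRef:
    name: quickstart        # must match the Elasticsearch resource name
```

Metricbeat would be deployed separately from its own manifests, with its remote_write port exposed inside the cluster so that Prometheus can reach it.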

As a result, users gain deeper observability into their cloud-native applications and workloads. By creating a long-term storage solution for their data with Elasticsearch, they’ll also be able to efficiently query it back and analyze it further at any time. Moreover, users can leverage Kibana to build visualizations.

Below is an example of how we can build a dashboard visualizing Prometheus metrics from node exporters and cAdvisor:

Figure: Node statistics

We would also like to monitor how remote storage statistics evolve, for example by visualizing how many samples have been successfully sent to remote storage and how many have been retried, failed, or dropped. Another significant metric to monitor here is the number of shards, which indicates the degree of parallelism with which the Prometheus server sends metrics to the remote storage endpoint.
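The shard count is something Prometheus adjusts on its own, within bounds set under queue_config in the remote_write section. The sketch below shows one way those bounds might be set, together with a self-scrape job so that Prometheus's own remote storage metrics are collected and forwarded like any other series. The numeric values are illustrative assumptions rather than tuning recommendations, and the exact names of the exported remote storage metrics vary between Prometheus versions.

```yaml
# prometheus.yml: sketch with illustrative values (assumptions)
scrape_configs:
  # Scrape Prometheus itself so its remote storage metrics are available
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

remote_write:
  - url: "http://localhost:9201/write"
    queue_config:
      min_shards: 1             # lower bound on parallel senders
      max_shards: 10            # upper bound on parallel senders
      capacity: 500             # samples buffered per shard
      max_samples_per_send: 100 # batch size per request
```

A shard count that stays pinned at max_shards while retried or failed samples keep growing may suggest that the remote endpoint is not keeping up with the write volume.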

Figure: Prometheus remote storage statistics

Implement long-term storage for Prometheus today

Are you already considering a long-term storage solution for your Prometheus metrics? Download Metricbeat 7.7 and start storing your data in Elasticsearch without worrying about scalability and query efficiency. The fastest way to get your cluster deployed is to spin up a free trial of Elasticsearch Service. Of course, if you have any questions, remember that we are always happy to help on the Discuss forums.