Elastic's metrics analytics gets 5x faster

Explore Elastic's metrics analytics enhancements, including faster ES|QL queries, TSDS updates and OpenTelemetry exponential histogram support.

In our previous blog in this series, we explored the fundamentals of analyzing metrics using the Elasticsearch Query Language (ES|QL) and the interactive power of Discover. Building on that foundation, we are excited to announce a suite of powerful enhancements to Time Series Data Streams (Elastic’s TSDB) and ES|QL designed to provide even more comprehensive and blazingly faster metrics analytics capabilities!

These latest updates, available in v9.3 and in Serverless, introduce significant performance gains, sophisticated time series functions, and native OpenTelemetry exponential histogram support that directly benefit SREs and Observability practitioners.

Query Performance and Storage Optimizations

Speed is paramount when diagnosing incidents. Compared to prior releases, we have achieved a 5x+ improvement in query latency when wildcarding or filtering by dimensions. Additionally, storage efficiency for OpenTelemetry metrics data has improved by approximately 2x, significantly reducing the infrastructure footprint required to retain high-volume observability data. If you’re hungry to learn more about what architectural updates are driving these optimizations, stay tuned… Tech blogs are on their way! 

Expanded Time Series Analytics in ES|QL

The ES|QL TS source command, which targets time series indices and enables time series aggregation functions, has been significantly enhanced to support more complex analytics.

We have expanded the library of time series functions to include essential tools for identifying anomalies and trends.

  • PERCENTILE_OVER_TIME, STDDEV_OVER_TIME, VARIANCE_OVER_TIME: Calculate the percentile, standard deviation, or variance of a field over time, which is critical for understanding distribution and variability in service latency or resource usage.

Example: Seeing the worst-case latency in 5-minute intervals.

TS metrics*  | STATS MAX(PERCENTILE_OVER_TIME(kafka.consumer.fetch_latency_avg, 99))
  BY TBUCKET(5m)
  • DERIV: This function calculates the derivative of a numeric field over time using linear regression, useful for analyzing the rate of change in system metrics.

Example: Trending gauge values over time.

TS metrics*  | STATS AVG(DERIV(container.memory.available))
  BY TBUCKET(1 hour)
  • CLAMP: To handle noisy data or outliers, this function limits sample values to a specified lower and upper bound.

Example: Handling saturation metrics (like CPU or memory utilization) where spikes or measurement errors can occasionally report values over 100%, making the rest of the data look like a flat line at the bottom of the chart.

TS metrics*  | STATS AVG(CLAMP(k8s.pod.memory.node.utilization, 0, 100))
  BY k8s.pod.name
  • TRANGE: This new filter function allows you to filter data for a specific time range using the @timestamp attribute, simplifying query syntax for time-bound investigations.

Example: Filtering and showing metrics for the last 4 hours.

TS metrics*  | WHERE TRANGE(4h) | STATS AVG(host.cpu.pct)
  BY TBUCKET(5m)
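For intuition, the semantics of DERIV (a least-squares slope) and CLAMP can be sketched in plain Python. This is a rough illustration of the behavior described above, not Elastic's implementation, and the sample data is invented:

```python
# Sketches of DERIV (least-squares slope) and CLAMP semantics.
# Illustrative only; not Elastic's implementation.

def deriv(timestamps, values):
    """Per-second rate of change via ordinary least-squares regression."""
    n = len(values)
    mean_t = sum(timestamps) / n
    mean_v = sum(values) / n
    cov = sum((t - mean_t) * (v - mean_v) for t, v in zip(timestamps, values))
    var = sum((t - mean_t) ** 2 for t in timestamps)
    return cov / var

def clamp(value, lower, upper):
    """Limit a sample to the [lower, upper] range."""
    return max(lower, min(value, upper))

# A gauge dropping by 2 units per second: the derivative is -2.0.
print(deriv([0, 1, 2, 3], [10.0, 8.0, 6.0, 4.0]))  # -2.0

# A utilization spike above 100% is capped before averaging.
print(clamp(117.5, 0, 100))  # 100
```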

Window Functions

To smooth results over specific periods, ES|QL now introduces window functions. Most time series aggregation functions now accept an optional second argument that specifies a sliding time window. For example, you can calculate a rate over a 10-minute sliding window while bucketing results by minute.

Example: Calculating the average rate of requests per host for every minute, using values over a sliding window of 5 minutes.

TS metrics*  | STATS AVG(RATE(app.frontend.requests, 5m))
  BY TBUCKET(1m)

Accepted window values are currently limited to multiples of the time bucket interval in the BY clause. Windows that are smaller than the time bucket interval, or larger but not a multiple of it, will be supported in future releases.
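To illustrate what the sliding window changes, here is a small Python sketch: each output bucket at time t summarizes all samples from the preceding window, not just the samples inside that bucket. The helper and data are our own illustration, not Elastic's implementation:

```python
# Sliding-window bucketing sketch: each bucket ending at time t averages
# samples from (t - window_s, t]. Illustrative only; invented data.

def windowed_avg(samples, bucket_s, window_s):
    """samples: list of (epoch_seconds, value), sorted by time."""
    start = samples[0][0] - samples[0][0] % bucket_s
    end = samples[-1][0]
    out = {}
    t = start + bucket_s
    while t <= end + bucket_s:
        in_window = [v for (ts, v) in samples if t - window_s < ts <= t]
        if in_window:
            out[t] = sum(in_window) / len(in_window)
        t += bucket_s
    return out

# 1-minute buckets, 5-minute sliding window: later buckets still "see"
# earlier samples, which smooths the series.
samples = [(30, 1.0), (90, 3.0), (150, 5.0)]
print(windowed_avg(samples, 60, 300))  # {60: 1.0, 120: 2.0, 180: 3.0}
```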

Native OpenTelemetry Exponential Histograms

Elastic now provides native support for OpenTelemetry exponential histograms, enabling efficient ingest, querying, and downsampling of high-fidelity distribution data.

We have introduced a new exponential_histogram field type designed to capture distributions with fixed, exponentially spaced bucket boundaries. Because these fields are primarily intended for aggregations, the histogram is stored as compact doc values and is not indexed, optimizing storage efficiency. These fields are fully supported in ES|QL aggregation functions such as PERCENTILES, AVG, MIN, MAX, and SUM.

You can index documents with exponential histograms automatically through our OTLP endpoint or manually. For example, let’s create an index with an exponential histogram field and a keyword field:

PUT my-index-000001
{
  "settings": {
    "index": {
      "mode": "time_series",
      "routing_path": ["http.path"],
      "time_series": {
        "start_time": "2026-01-21T00:00:00Z",
        "end_time": "2026-01-25T00:00:00Z"
      }
    }
  },
  "mappings": {
    "properties": {
      "@timestamp": {
        "type": "date"
      },
      "http.path": {
        "type": "keyword",
        "time_series_dimension": true
      },
      "responseTime": {
        "type": "exponential_histogram",
        "time_series_metric": "histogram"
      }
    }
  }
}

Index a document with a full exponential histogram payload:

POST my-index-000001/_doc
{
  "@timestamp": "2026-01-22T21:25:00.000Z",
  "http.path": "/foo",
  "responseTime": {
    "scale":3,
    "sum":73.2,
    "min":3.12,
    "max":7.02,
    "positive": {
      "indices":[13,14,15,16,17,18,19,20,21,22],
      "counts":[1,1,2,2,1,2,1,3,1,1]
    }
  }
}

POST my-index-000001/_doc
{
  "@timestamp": "2026-01-22T21:26:00.000Z",
  "http.path": "/bar",
  "responseTime": {
    "scale":3,
    "sum":45.86,
    "min":2.15,
    "max":5.1,
    "positive": {
      "indices":[8,9,10,11,12,13,14,15,16,17,18],
      "counts":[1,1,1,1,1,1,1,2,1,1,2]
    }
  }
}
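To make the payloads above concrete: in the OpenTelemetry exponential histogram model, a histogram at scale s uses base = 2^(2^-s), and bucket index i covers the interval (base^i, base^(i+1)]. A small Python sketch decoding the first document's buckets (our own illustration of the encoding, not Elastic's code):

```python
# Decode exponential histogram bucket boundaries for the first document.
# At scale s, base = 2 ** (2 ** -s); bucket index i spans (base**i, base**(i+1)].
# Illustrative only, following the OpenTelemetry exponential histogram model.
scale = 3
base = 2 ** (2 ** -scale)           # ~1.0905 at scale 3

indices = [13, 14, 15, 16, 17, 18, 19, 20, 21, 22]
counts = [1, 1, 2, 2, 1, 2, 1, 3, 1, 1]

total = sum(counts)                  # 15 samples recorded in this histogram
lowest = base ** indices[0]          # lower bound of the first bucket, ~3.08
highest = base ** (indices[-1] + 1)  # upper bound of the last bucket, ~7.34

# The document's min (3.12) and max (7.02) fall inside these bounds.
print(total, round(lowest, 2), round(highest, 2))
```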

And finally, query the time series index using ES|QL and the TS source command:

TS my-index-000001  | STATS MIN(responseTime), MAX(responseTime),
        AVG(responseTime), MEDIAN(responseTime),
        PERCENTILE(responseTime, 90)
  BY http.path

Enhanced Downsampling

Downsampling is essential for long-term data retention. We have introduced a new "last value" downsampling mode. This method exchanges accuracy for storage efficiency and performance by keeping only the last sample value, providing a lightweight alternative to calculating aggregate metrics.

You can configure a time series data stream for last value downsampling in the same way as regular downsampling, simply by setting downsampling_method to last_value. For example, using a data stream lifecycle:

PUT _data_stream/my-data-stream/_lifecycle
{
  "data_retention": "7d",
  "downsampling_method": "last_value",
  "downsampling": [
     {
       "after": "1m",
       "fixed_interval": "10m"
      },
      {
        "after": "1d",
        "fixed_interval": "1h"
      }
   ]
}
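Conceptually, last-value downsampling keeps a single sample per time series per interval and discards the rest, rather than computing aggregate rollups. A tiny Python sketch of the idea (our own illustration with invented data, not the actual rollup code):

```python
# Sketch of last-value downsampling: for each fixed interval, keep only the
# most recent sample. Illustrative only; not Elastic's implementation.

def downsample_last_value(samples, interval_s):
    """samples: list of (epoch_seconds, value), sorted by time."""
    kept = {}
    for ts, value in samples:
        bucket = ts - ts % interval_s
        kept[bucket] = (ts, value)  # later samples overwrite earlier ones
    return sorted(kept.values())

# Five raw samples collapse to one per 10-minute (600 s) interval.
raw = [(0, 10), (200, 12), (550, 11), (700, 9), (1150, 14)]
print(downsample_last_value(raw, 600))  # [(550, 11), (1150, 14)]
```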

In Conclusion

These enhancements mark a significant step forward in Elastic's metrics analytics capabilities, delivering 5x+ faster query latency, 2x storage efficiency, and specialized functions like DERIV, CLAMP, and PERCENTILE_OVER_TIME. With native support for OpenTelemetry exponential histograms and expanded downsampling options, SREs can now perform richer, more cost-effective analysis on their observability data. This release empowers teams to detect anomalies faster and manage long-term metrics retention with greater efficiency.

We welcome you to try the new features today!