<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Elastic Observability Labs - Articles by Vinay Chandrasekhar</title>
        <link>https://www.elastic.co/observability-labs</link>
        <description>Trusted security news &amp; research from the team at Elastic.</description>
        <lastBuildDate>Thu, 04 Jun 2026 17:54:35 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>Elastic Observability Labs - Articles by Vinay Chandrasekhar</title>
            <url>https://www.elastic.co/observability-labs/assets/observability-labs-thumbnail.png</url>
            <link>https://www.elastic.co/observability-labs</link>
        </image>
        <copyright>© 2026. Elasticsearch B.V. All Rights Reserved</copyright>
        <item>
            <title><![CDATA[Elastic's metrics analytics gets 5x faster]]></title>
            <link>https://www.elastic.co/observability-labs/blog/elastic-metrics-analytics</link>
            <guid isPermaLink="false">elastic-metrics-analytics</guid>
            <pubDate>Wed, 28 Jan 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Explore Elastic's metrics analytics enhancements, including faster ES|QL queries, TSDS updates and OpenTelemetry exponential histogram support.]]></description>
            <content:encoded><![CDATA[<p>In our <a href="https://www.elastic.co/observability-labs/blog/metrics-explore-analyze-with-esql-discover">previous blog in this series</a>, we explored the fundamentals of analyzing metrics using the Elasticsearch Query Language (ES|QL) and the interactive power of Discover. Building on that foundation, we are excited to announce a suite of powerful enhancements to Time Series Data Streams (Elastic’s TSDB) and ES|QL designed to provide more comprehensive and significantly faster metrics analytics capabilities!</p>
<p>These latest updates, available in v9.3 and in Serverless, introduce significant performance gains, sophisticated time series functions, and native OpenTelemetry exponential histogram support that directly benefit SREs and Observability practitioners.</p>
<h2>Query Performance and Storage Optimizations</h2>
<p>Speed is paramount when diagnosing incidents. Compared to prior releases, we have achieved a 5x+ improvement in query latency when wildcarding or filtering by dimensions. Additionally, storage efficiency for OpenTelemetry metrics data has improved by approximately 2x, significantly reducing the infrastructure footprint required to retain high-volume observability data. If you’re hungry to learn more about what architectural updates are driving these optimizations, stay tuned… Tech blogs are on their way! </p>
<h2>Expanded Time Series Analytics in ES|QL</h2>
<p>The <a href="https://www.elastic.co/docs/reference/query-languages/esql/commands/ts">ES|QL TS source command</a>, which targets time series indices and enables <a href="https://www.elastic.co/docs/reference/query-languages/esql/functions-operators/time-series-aggregation-functions">time series aggregation functions</a>, has been significantly enhanced to support complex analytics capabilities.</p>
<p>We have expanded the <a href="https://www.elastic.co/docs/reference/query-languages/esql/esql-functions-operators">library of time series functions</a> to include essential tools for identifying anomalies and trends.</p>
<ul>
<li><code>PERCENTILE_OVER_TIME</code>, <code>STDDEV_OVER_TIME</code>, <code>VARIANCE_OVER_TIME</code>: Calculate the percentile, standard deviation, or variance of a field over time, which is critical for understanding distribution and variability in service latency or resource usage.</li>
</ul>
<p>Example: Seeing the worst-case latency in 5-minute intervals.</p>
<pre><code class="language-bash">TS metrics*  | STATS MAX(PERCENTILE_OVER_TIME(kafka.consumer.fetch_latency_avg, 99))
  BY TBUCKET(5m)
</code></pre>
<ul>
<li><code>DERIV</code>: This function calculates the derivative of a numeric field over time using linear regression, useful for analyzing the rate of change in system metrics.</li>
</ul>
<p>Example: trending gauge values over time.</p>
<pre><code class="language-bash">TS metrics*  | STATS AVG(DERIV(container.memory.available))
  BY TBUCKET(1 hour)
</code></pre>
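<p>To make the linear-regression idea concrete, here is a minimal Python sketch of an ordinary least-squares slope over the samples of a single time series in a single bucket. The helper name <code>slope_per_second</code>, the sample values, and the per-second unit are assumptions made for illustration; this is not Elasticsearch's implementation of <code>DERIV</code>.</p>
<pre><code class="language-python"># Hypothetical helper: least-squares slope of (seconds, value) samples.
def slope_per_second(samples):
    # samples: list of (seconds, value) pairs for ONE time series in ONE bucket
    n = len(samples)
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    # A single sample (or identical timestamps) has no defined slope.
    return num / den if den else None


# Illustrative gauge: available memory falling by 2 units every 60 seconds.
samples = [(0, 1000.0), (60, 998.0), (120, 996.0), (180, 994.0)]
print(slope_per_second(samples))
</code></pre>
<p>A negative slope, as in this made-up series, indicates a gauge that is trending downward across the bucket rather than merely fluctuating.</p>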
<ul>
<li><code>CLAMP</code>: To handle noisy data or outliers, this function limits sample values to a specified lower and upper bound.</li>
</ul>
<p>Example: handling saturation metrics (like CPU or Memory utilization) where spikes or measurement errors can occasionally report values over 100%, making the rest of the data look like a flat line at the bottom of the chart.</p>
<pre><code class="language-bash">TS metrics*  | STATS AVG(CLAMP(k8s.pod.memory.node.utilization, 0, 100))
  BY k8s.pod.name
</code></pre>
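<p>The effect of clamping on an average can be sketched in a few lines of Python. The <code>clamp</code> helper and the readings below are invented for illustration and only mirror the idea behind <code>CLAMP</code>, not its implementation.</p>
<pre><code class="language-python"># Hypothetical helper: limit a value to the range [lower, upper].
def clamp(value, lower, upper):
    return max(lower, min(value, upper))


# Made-up utilization readings in percent, including two bad measurements.
readings = [42.0, 97.5, 1250.0, -3.0]
clamped = [clamp(r, 0.0, 100.0) for r in readings]

print(clamped)                        # [42.0, 97.5, 100.0, 0.0]
print(sum(readings) / len(readings))  # 346.625
print(sum(clamped) / len(clamped))    # 59.875
</code></pre>
<p>Without clamping, a single bad reading pulls the average far outside the valid 0 to 100 range; with clamping, the average stays on the same scale as the rest of the data.</p>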
<ul>
<li><code>TRANGE</code>: This new filter function allows you to filter data for a specific time range using the <code>@timestamp</code> attribute, simplifying query syntax for time-bound investigations.</li>
</ul>
<p>Example: Filtering and showing metrics for the last 4 hours.</p>
<pre><code class="language-bash">TS metrics*  | WHERE TRANGE(4h) | STATS AVG(host.cpu.pct)
  BY TBUCKET(5m)
</code></pre>
<p><strong>Window Functions</strong> To smooth results over specific periods, ES|QL introduces window functions. Most time series aggregation functions now accept an optional second argument that specifies a sliding time window. For example, you can calculate a rate over a 10-minute sliding window while bucketing results by minute.</p>
<p>Example: Calculating the average request rate across all time series for every minute, using values over a sliding window of 5 minutes.</p>
<pre><code class="language-bash">TS metrics*  | STATS AVG(RATE(app.frontend.requests, 5m))
  BY TBUCKET(1m)
</code></pre>
<p>Accepted window values are currently limited to multiples of the time bucket interval in the BY clause. Windows that are smaller than the time bucket interval, or larger but not a multiple of it, will be supported in future releases.</p>
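<p>The smoothing effect of a sliding window can be illustrated with a small Python sketch. It assumes that the window covers the seconds ending at the end of each bucket and it ignores counter resets; both are simplifications chosen for this example, and <code>windowed_rate</code> is a hypothetical helper rather than anything provided by ES|QL.</p>
<pre><code class="language-python"># Counter samples for one time series: (seconds, cumulative request count).
# Most minutes add 60 requests; the third minute adds a burst of 660.
samples = [(0, 0), (60, 60), (120, 120), (180, 780),
           (240, 840), (300, 900), (360, 960)]


def windowed_rate(samples, bucket_end, window):
    # Assumption: the window is the `window` seconds ending at bucket_end.
    start = bucket_end - window
    inside = [(t, v) for t, v in samples if t in range(start, bucket_end + 1)]
    if len(inside) in (0, 1):
        return None  # a rate needs at least two samples
    (t0, v0), (t1, v1) = inside[0], inside[-1]
    return (v1 - v0) / (t1 - t0)


for bucket_end in range(60, 420, 60):
    print(bucket_end,
          windowed_rate(samples, bucket_end, 60),    # window equals the bucket
          windowed_rate(samples, bucket_end, 300))   # 5-minute sliding window
</code></pre>
<p>In this made-up series, the one-minute rate jumps to 11 requests per second for the bucket containing the burst, while the five-minute window spreads the same burst across several buckets at a much lower value.</p>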
<h2>Native OpenTelemetry Exponential Histograms</h2>
<p>Elastic now provides native support for OpenTelemetry exponential histograms, enabling efficient ingest, querying, and downsampling of high-fidelity distribution data.</p>
<p>We have introduced a new <a href="https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/exponential-histogram">exponential_histogram</a> field type designed to capture distributions with fixed, exponentially spaced bucket boundaries. Because these fields are primarily intended for aggregations, the histogram is stored as compact doc values and is not indexed, optimizing storage efficiency. These fields are fully supported in ES|QL aggregation functions such as <code>PERCENTILE</code>, <code>AVG</code>, <code>MIN</code>, <code>MAX</code>, and <code>SUM</code>.</p>
<p>You can index documents with exponential histograms automatically through our <a href="https://www.elastic.co/docs/manage-data/data-store/data-streams/tsds-ingest-otlp#configure-histogram-handling">OTLP endpoint</a> or manually. For example, let’s create an index with an exponential histogram field and a keyword field:</p>
<pre><code class="language-bash">PUT my-index-000001
{
  &quot;settings&quot;: {
    &quot;index&quot;: {
      &quot;mode&quot;: &quot;time_series&quot;,
      &quot;routing_path&quot;: [&quot;http.path&quot;],
      &quot;time_series&quot;: {
        &quot;start_time&quot;: &quot;2026-01-21T00:00:00Z&quot;,
        &quot;end_time&quot;: &quot;2026-01-25T00:00:00Z&quot;
     }
    }
  },
  &quot;mappings&quot;: {
    &quot;properties&quot;: {
      &quot;@timestamp&quot;: {
        &quot;type&quot;: &quot;date&quot;
      },
      &quot;http.path&quot;: {
        &quot;type&quot;: &quot;keyword&quot;,
        &quot;time_series_dimension&quot;: true
      },
      &quot;responseTime&quot;: {
        &quot;type&quot;: &quot;exponential_histogram&quot;,
        &quot;time_series_metric&quot;: &quot;histogram&quot;
      }
    }
  }
}
</code></pre>
<p>Index a document with a full exponential histogram payload:</p>
<pre><code class="language-bash">POST my-index-000001/_doc
{
  &quot;@timestamp&quot;: &quot;2026-01-22T21:25:00.000Z&quot;,
  &quot;http.path&quot;: &quot;/foo&quot;,
  &quot;responseTime&quot;: {
    &quot;scale&quot;:3,
    &quot;sum&quot;:73.2,
    &quot;min&quot;:3.12,
    &quot;max&quot;:7.02,
    &quot;positive&quot;: {
      &quot;indices&quot;:[13,14,15,16,17,18,19,20,21,22],
      &quot;counts&quot;:[1,1,2,2,1,2,1,3,1,1]
    }
  }
}

POST my-index-000001/_doc
{
  &quot;@timestamp&quot;: &quot;2026-01-22T21:26:00.000Z&quot;,
  &quot;http.path&quot;: &quot;/bar&quot;,
  &quot;responseTime&quot;: {
    &quot;scale&quot;:3,
    &quot;sum&quot;:45.86,
    &quot;min&quot;:2.15,
    &quot;max&quot;:5.1,
    &quot;positive&quot;: {
      &quot;indices&quot;:[8,9,10,11,12,13,14,15,16,17,18],
      &quot;counts&quot;:[1,1,1,1,1,1,1,2,1,1,2]
    }
  }
}
</code></pre>
<p>And finally, query the time series index using ES|QL and the TS source command:</p>
<pre><code class="language-bash">TS my-index-000001  | STATS MIN(responseTime), MAX(responseTime),
        AVG(responseTime), MEDIAN(responseTime),
        PERCENTILE(responseTime, 90)
  BY http.path
</code></pre>
<p><img src="https://www.elastic.co/observability-labs/assets/images/elastic-metrics-analytics/exponential_histogram_esql_example.png" alt="Exponential histogram ES|QL example" /></p>
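<p>To build intuition for the payload used above, the following Python sketch decodes the first sample document by hand. It assumes the field follows the OpenTelemetry convention, in which the bucket base is 2 raised to the power 2<sup>-scale</sup> and positive bucket <code>i</code> covers values above base<sup>i</sup> up to and including base<sup>i+1</sup>. The helper <code>bucket_bounds</code> is hypothetical, and computing the mean as sum divided by count is an assumption about how an average can be derived, not a description of the ES|QL internals.</p>
<pre><code class="language-python"># Hypothetical helper: value range covered by one positive bucket.
def bucket_bounds(scale, index):
    # OpenTelemetry convention: base = 2 ** (2 ** -scale)
    base = 2 ** (2 ** -scale)
    return base ** index, base ** (index + 1)


# The responseTime object from the first document indexed above.
doc = {
    'scale': 3, 'sum': 73.2, 'min': 3.12, 'max': 7.02,
    'positive': {
        'indices': [13, 14, 15, 16, 17, 18, 19, 20, 21, 22],
        'counts': [1, 1, 2, 2, 1, 2, 1, 3, 1, 1],
    },
}

count = sum(doc['positive']['counts'])
print(count, round(doc['sum'] / count, 2))  # 15 4.88

# Lower edge of the first bucket and upper edge of the last bucket.
lowest, _ = bucket_bounds(doc['scale'], doc['positive']['indices'][0])
_, highest = bucket_bounds(doc['scale'], doc['positive']['indices'][-1])
print(lowest, highest)
</code></pre>
<p>Under these assumptions, the recorded <code>min</code> of 3.12 and <code>max</code> of 7.02 both fall inside the range spanned by the first and last populated buckets, which is a quick sanity check when building payloads manually.</p>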
<h2>Enhanced Downsampling</h2>
<p>Downsampling is essential for long-term data retention. We have introduced a new <a href="https://www.elastic.co/docs/manage-data/data-store/data-streams/downsampling-concepts#downsampling-methods">&quot;last value&quot; downsampling mode</a>. This method exchanges accuracy for storage efficiency and performance by keeping only the last sample value, providing a lightweight alternative to calculating aggregate metrics.</p>
<p>You can <a href="https://www.elastic.co/docs/manage-data/data-store/data-streams/run-downsampling">configure a time series data stream</a> for last value downsampling in a similar way as regular downsampling, just by setting the <code>downsampling_method</code> to <code>last_value</code>. For example, by using a data stream lifecycle:</p>
<pre><code class="language-bash">PUT _data_stream/my-data-stream/_lifecycle
{
  &quot;data_retention&quot;: &quot;7d&quot;,
  &quot;downsampling_method&quot;: &quot;last_value&quot;,
  &quot;downsampling&quot;: [
     {
       &quot;after&quot;: &quot;1m&quot;,
       &quot;fixed_interval&quot;: &quot;10m&quot;
      },
      {
        &quot;after&quot;: &quot;1d&quot;,
        &quot;fixed_interval&quot;: &quot;1h&quot;
      }
   ]
}
</code></pre>
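<p>The trade-off behind last value downsampling can be seen in a short Python sketch. It keeps only the latest sample in each fixed interval for one time series; the helper name, the sample data, and the choice to key each bucket by its interval start are assumptions made for illustration.</p>
<pre><code class="language-python"># Hypothetical helper: keep the last sample of each fixed interval.
def downsample_last_value(samples, interval):
    # samples: (seconds, value) pairs for one time series, in any order
    last = {}
    for t, v in sorted(samples):
        # Later samples overwrite earlier ones within the same interval.
        last[t - t % interval] = v
    return sorted(last.items())


samples = [(0, 10.0), (120, 14.0), (540, 11.0), (600, 30.0), (1190, 12.0)]
print(downsample_last_value(samples, 600))  # [(0, 11.0), (600, 12.0)]
</code></pre>
<p>Five samples shrink to two, but the short spike to 30.0 at the start of the second interval is no longer visible. That is the accuracy being exchanged for storage efficiency.</p>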
<h2>In Conclusion</h2>
<p>These enhancements mark a significant step forward in Elastic's metrics analytics capabilities, delivering 5x+ faster query latency, approximately 2x better storage efficiency for OpenTelemetry metrics, and specialized functions like <code>DERIV</code>, <code>CLAMP</code>, and <code>PERCENTILE_OVER_TIME</code>. With native support for OpenTelemetry exponential histograms and expanded downsampling options, SREs can now perform richer, more cost-effective analysis on their observability data. This release empowers teams to detect anomalies faster and manage long-term metrics retention with greater efficiency.</p>
<p>We welcome you to <a href="https://cloud.elastic.co/serverless-registration?onboarding_token=observability">try the new features</a> today!</p>
]]></content:encoded>
            <category>observability-labs</category>
            <enclosure url="https://www.elastic.co/observability-labs/assets/images/elastic-metrics-analytics/elastic_metrics_leaner_blog_image.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[LLM Observability: Azure OpenAI]]></title>
            <link>https://www.elastic.co/observability-labs/blog/llm-observability-azure-openai</link>
            <guid isPermaLink="false">llm-observability-azure-openai</guid>
            <pubDate>Mon, 24 Jun 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[We are excited to announce the general availability of the Azure OpenAI Integration that provides comprehensive Observability into the performance and usage of the Azure OpenAI Service!]]></description>
            <content:encoded><![CDATA[<p>We are excited to announce the general availability of the <a href="https://www.elastic.co/integrations/data-integrations?solution=all-solutions&amp;category=azure">Azure OpenAI Integration</a> that provides comprehensive Observability into the performance and usage of the <a href="https://azure.microsoft.com/en-us/products/ai-services/openai-service">Azure OpenAI Service</a>! Also see <a href="https://www.elastic.co/observability-labs/blog/llm-observability-azure-openai-v2">Part 2 of this blog</a>.</p>
<p>While we have offered <a href="https://www.elastic.co/observability-labs/blog/monitor-openai-api-gpt-models-opentelemetry">visibility into LLM environments</a> for a while now, the addition of our Azure OpenAI integration enables richer out-of-the-box visibility into the performance and usage of your Azure OpenAI based applications, further enhancing LLM Observability.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/llm-observability-azure-openai/llm-observability-azure-openai-monitoring.png" alt="LLM Observability: Azure OpenAI Monitoring" /></p>
<p>The Azure OpenAI integration leverages <a href="https://www.elastic.co/elastic-agent">Elastic Agent</a>’s Azure integration capabilities to collect both logs (using <a href="https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/stream-monitoring-data-event-hubs">Azure EventHub</a>) and metrics (using <a href="https://learn.microsoft.com/en-us/azure/azure-monitor/reference/supported-metrics/metrics-index">Azure Monitor</a>) to provide deep visibility on the usage of the <a href="https://azure.microsoft.com/en-us/products/ai-services/openai-service">Azure OpenAI Service</a>.</p>
<p>The integration includes an out-of-the-box dashboard that summarizes the most relevant aspects of the service usage, including request and error rates, token usage and chat completion latency.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/llm-observability-azure-openai/llm-observability-azure-openai-monitoring-overview.png" alt="LLM Observability: Azure OpenAI Monitoring Overview" /></p>
<h2>Creating Alerts and SLOs to monitor Azure OpenAI</h2>
<p>As with every other Elastic integration, all the <a href="https://www.elastic.co/docs/current/integrations/azure_openai#logs">logs</a> and <a href="https://www.elastic.co/docs/current/integrations/azure_openai#metrics">metrics</a> information is fully available to leverage in every capability in <a href="https://www.elastic.co/observability">Elastic Observability</a>, including <a href="https://www.elastic.co/guide/en/observability/current/slo.html">SLOs</a>, <a href="https://www.elastic.co/guide/en/observability/current/create-alerts.html">alerting</a>, custom <a href="https://www.elastic.co/guide/en/kibana/current/dashboard.html">dashboards</a>, in-depth <a href="https://www.elastic.co/guide/en/observability/current/monitor-logs.html">logs exploration</a>, etc.</p>
<p>To create an alert to monitor token usage, for example, start with the Custom Threshold rule on the Azure OpenAI data stream and set an aggregation condition to track and report violations of token usage past a certain threshold.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/llm-observability-azure-openai/llm-observability-azure-openai-create-alert.png" alt="LLM Observability: Azure OpenAI Monitoring Alert Creation" /></p>
<p>When a violation occurs, the Alert Details view linked in the alert notification for that alert provides rich context surrounding the violation, such as when the violation started, its current status, and any previous history of such violations, enabling quick triaging, investigation and root cause analysis.</p>
<p>Similarly, to create an SLO to monitor error rates in Azure OpenAI calls, start with the custom query SLI definition, defining good events as any response with a result signature below 400 and total events as all responses. Then, by setting an appropriate SLO target such as 99%, start monitoring your Azure OpenAI error rate SLO over a period of 7, 30, or 90 days to track degradation and take action before it becomes a pervasive problem.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/llm-observability-azure-openai/llm-observability-azure-openai-create-slo.png" alt="LLM Observability: Azure OpenAI Monitoring SLO Creation" /></p>
<p>Please refer to the <a href="https://www.elastic.co/guide/en/observability/current/monitor-azure-openai.html">User Guide</a> to learn more and to get started!</p>
]]></content:encoded>
            <category>observability-labs</category>
            <enclosure url="https://www.elastic.co/observability-labs/assets/images/llm-observability-azure-openai/AI_fingertip_touching_human_fingertip.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Explore and Analyze Metrics with Ease in Elastic Observability]]></title>
            <link>https://www.elastic.co/observability-labs/blog/metrics-explore-analyze-with-esql-discover</link>
            <guid isPermaLink="false">metrics-explore-analyze-with-esql-discover</guid>
            <pubDate>Thu, 23 Oct 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[The latest enhancements to ES|QL and Discover based metrics exploration unleash a potent set of tools for quick and effective metrics analytics.]]></description>
            <content:encoded><![CDATA[<h2>Metrics are critical in identifying the “what”</h2>
<p>As a core pillar of Observability, metrics offer a highly structured, quantitative view of system performance and health. They provide a crucial symptomatic perspective—revealing <em>what</em> is happening, such as high application latency, increasing service errors, or spiking container CPU utilization, which is essential for initiating alerting and triaging efforts. This capability for effective monitoring, alerting, and triaging is paramount to ensuring robust service delivery and achieving successful business outcomes.</p>
<p>Elastic Observability provides a comprehensive, end-to-end experience for metrics data. Elastic ensures that metrics data can be collected from numerous sources, enriched as needed and shipped to the Elastic Stack. Elastic efficiently stores this time series data, including high-cardinality metrics, utilizing the <a href="https://www.elastic.co/observability-labs/blog/time-series-data-streams-observability-metrics">TSDS index mode</a> (Time Series Data Stream), introduced in <a href="https://www.elastic.co/blog/whats-new-elasticsearch-8-7-0#efficient-storage-of-metrics-with-tsdb,-now-generally-available">prior versions</a> and used across Elastic time series <a href="https://www.elastic.co/blog/70-percent-storage-savings-for-metrics-with-elastic-observability">integrations</a>. This foundation ensures comprehensive observability through out-of-the-box dashboards, alerts, SLOs, and streamlined data management.</p>
<p>Elastic Observability 9.2 provides enhancements to metrics exploration and analysis through powerful query language extensions and expanded UI capabilities. These enhancements focus on making analysis on TSDS data via counter rates and common aggregations over time easier and faster than ever before.</p>
<p>The main metrics enhancements center on these key features, offered as Tech Preview:</p>
<ol>
<li>Metrics analytics with TSDS and ES|QL</li>
<li>Interactive metrics exploration in Discover</li>
<li>OTLP endpoint for metrics</li>
</ol>
<h2>Metrics analytics with TSDS and ES|QL</h2>
<p>The introduction of the new <a href="https://www.elastic.co/docs/reference/query-languages/esql/commands/ts"><code>TS</code> source command</a> in <a href="https://www.elastic.co/docs/reference/query-languages/esql">ES|QL</a> (Elasticsearch Query Language) on TSDS metrics dramatically simplifies time series analysis.</p>
<p>The <code>TS</code> command is specifically designed to target only time series indices, differentiating it from the general <code>FROM</code> command. Its core power lies in enabling a dedicated suite of time series aggregation functions within the <code>STATS</code> command.</p>
<p>This mechanism utilizes a dual aggregation paradigm, which is standard for time series querying. These queries involve two aggregation functions:</p>
<ul>
<li>
<p><strong>Inner (Time Series) function:</strong> Applied implicitly per time series, often over bucketed time intervals.</p>
</li>
<li>
<p><strong>Outer (Regular) function:</strong> Used to aggregate the results of the inner function across groups. For instance, if you use <code>STATS SUM(RATE(search_requests)) BY TBUCKET(1 hour), host</code>, the <code>RATE()</code> function is the inner function applied per time series in hourly buckets, and <code>SUM()</code> is the outer function, summing these rates for each host and hourly bucket.</p>
</li>
</ul>
<p>If an ES|QL query using the <code>TS</code> command is missing an inner (time series) aggregation function, <code>LAST_OVER_TIME()</code> is implicitly assumed and used. For example, <code>TS metrics | STATS AVG(memory_usage)</code> is equivalent to <code>TS metrics | STATS AVG(LAST_OVER_TIME(memory_usage))</code>.</p>
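<p>The dual aggregation can be mimicked in plain Python to show the order of operations. The sketch below uses made-up counter samples for a single hourly bucket, a simplified first-to-last rate as the inner function, and a sum per host as the outer function. The host names, series identifiers, and values are all illustrative.</p>
<pre><code class="language-python">from collections import defaultdict

# (host, series id, seconds, cumulative search_requests); illustrative data.
samples = [
    ('host-a', 's1', 0, 0), ('host-a', 's1', 3600, 7200),
    ('host-a', 's2', 0, 100), ('host-a', 's2', 3600, 3700),
    ('host-b', 's3', 0, 50), ('host-b', 's3', 3600, 1850),
]

# Inner function: one rate per time series (increase per second).
by_series = defaultdict(list)
for host, series, t, v in samples:
    by_series[(host, series)].append((t, v))

rates = {}
for key, points in by_series.items():
    points.sort()
    (t0, v0), (t1, v1) = points[0], points[-1]
    rates[key] = (v1 - v0) / (t1 - t0)

# Outer function: sum the per-series rates for each host.
totals = defaultdict(float)
for (host, _), rate in rates.items():
    totals[host] += rate

print(dict(totals))  # {'host-a': 3.0, 'host-b': 0.5}
</code></pre>
<p>Computing the rate per series first matters: summing raw counter values across series before taking a rate would mix counters that reset and grow independently.</p>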
<h3>Key time series aggregation functions available in ES|QL via <code>TS</code> command</h3>
<p>These functions allow for powerful analysis on time-series data:</p>
<table>
<thead>
<tr>
<th align="center">Function</th>
<th align="center">Description</th>
<th align="center">Example Use Case</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center"><code>RATE()</code> <strong>/</strong> <code>IRATE()</code></td>
<td align="center">Calculates the per-second average rate of increase of a counter (<code>RATE</code>), accounting for non-monotonic breaks like counter resets, making it the most appropriate function for counters, or the per-second rate of increase between the last two data points (<code>IRATE</code>), ignoring all but the last two points for high responsiveness.</td>
<td align="center">Calculating request per second (RPS) or throughput.</td>
</tr>
<tr>
<td align="center"><code>AVG_OVER_TIME()</code></td>
<td align="center">Calculates the average of a numeric field over the defined time range.</td>
<td align="center">Determining average resource usage over an hour.</td>
</tr>
<tr>
<td align="center"><code>SUM_OVER_TIME()</code></td>
<td align="center">Calculates the sum of a field over the time range.</td>
<td align="center">Total errors over a specific time window.</td>
</tr>
<tr>
<td align="center"><code>MAX_OVER_TIME()</code> <strong>/</strong> <code>MIN_OVER_TIME()</code></td>
<td align="center">Calculates the maximum or minimum value of a field over time.</td>
<td align="center">Identifying peak resource consumption.</td>
</tr>
<tr>
<td align="center"><code>DELTA()</code> <strong>/</strong> <code>IDELTA()</code></td>
<td align="center">Calculates the absolute change of a gauge field over a time window (<code>DELTA</code>) or specifically between the last two data points (<code>IDELTA</code>), making <code>IDELTA</code> more responsive to recent changes.</td>
<td align="center">Tracking changes in system gauge metrics (e.g., buffer size).</td>
</tr>
<tr>
<td align="center"><code>INCREASE()</code></td>
<td align="center">Calculates the absolute increase of a counter over a time window.</td>
<td align="center">Counting the total requests served during a time window.</td>
</tr>
<tr>
<td align="center"><code>FIRST_OVER_TIME()</code> <strong>/</strong> <code>LAST_OVER_TIME()</code></td>
<td align="center">Calculates the earliest or latest recorded value of a field, determined by the <code>@timestamp</code> field.</td>
<td align="center">Inspecting initial and final metric states within a bucket.</td>
</tr>
<tr>
<td align="center"><code>ABSENT_OVER_TIME()</code> <strong>/</strong> <code>PRESENT_OVER_TIME()</code></td>
<td align="center">Calculates the absence or presence of a field in the result over the time range.</td>
<td align="center">Identifying monitoring coverage gaps.</td>
</tr>
<tr>
<td align="center"><code>COUNT_OVER_TIME()</code> <strong>/</strong> <code>COUNT_DISTINCT_OVER_TIME()</code></td>
<td align="center">Calculates the total count or the count of distinct values of a field over time.</td>
<td align="center">Measuring frequency or cardinality changes.</td>
</tr>
</tbody>
</table>
<p>These functions, available with the <code>TS</code> command, allow SREs and Ops teams to easily perform rate calculations and other common aggregations, enabling efficient metrics analysis as a routine part of observability workflows. And it’s much faster, too! Internal performance testing has revealed that TS commands outperform other ways of querying metrics data by an order of magnitude or more, and consistently! </p>
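<p>The difference between <code>RATE</code> and <code>IRATE</code> described in the table can be sketched in Python. This is a simplified model under stated assumptions: a counter reset is detected whenever a value drops, the post-reset value is counted from zero, and no extrapolation is applied. The helper name and the sample data are hypothetical.</p>
<pre><code class="language-python"># Hypothetical helper: total counter increase, tolerating resets.
def increase_with_resets(values):
    total = 0.0
    for prev, cur in zip(values, values[1:]):
        # A drop means the counter restarted, so count the new value from zero.
        total += cur - prev if cur == max(prev, cur) else cur
    return total


# (seconds, counter value); the counter resets between 60s and 120s.
samples = [(0, 100), (60, 160), (120, 20), (180, 80)]
values = [v for _, v in samples]

# RATE-style: average per-second increase over the whole range.
span = samples[-1][0] - samples[0][0]
rate = increase_with_resets(values) / span

# IRATE-style: per-second increase between the last two points only.
(t0, v0), (t1, v1) = samples[-2], samples[-1]
irate = (v1 - v0) / (t1 - t0)

print(round(rate, 3), irate)  # 0.778 1.0
</code></pre>
<p>A naive calculation using only the first and last values would report a negative rate for this series, which is why reset handling matters for counters.</p>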
<h2>Interactive metrics exploration in Discover</h2>
<p>The 9.2 release introduces the capability to explore and analyze metrics directly and interactively within the Discover interface. In addition to exploring and analyzing logs and raw events, Discover now provides a dedicated environment for metrics exploration:</p>
<ul>
<li>
<p><strong>Easy start:</strong> Begin exploration simply by querying metrics ingested via <code>TS metrics-*</code>.</p>
</li>
<li>
<p><strong>Grid view and pre-applied aggregations:</strong> This command displays all metrics in a grid format at a glance, immediately applying the appropriate aggregations based on the metric type, such as <code>rate</code> versus <code>avg</code>.</p>
</li>
<li>
<p><strong>Search and group-by:</strong> Quickly search for specific metrics by name. Also easily group and analyze metrics by dimensions (labels) and specific values. This allows narrowing down to metrics and dimensions of choice for targeted analysis.</p>
</li>
<li>
<p><strong>Quick access to details:</strong> Furthermore, the interface provides access to crucial details, including query and response details, the underlying ES|QL commands, the metric field type, and applicable dimensions, for each metric.</p>
</li>
<li>
<p><strong>Easy tweaking and dashboarding:</strong> The system automatically populates ES|QL queries, aiding in making easy tweaks, slicing, and dicing the data. Once analyzed, metrics and resulting analyses can be added to new or existing dashboards with ease.</p>
</li>
</ul>
<p><img src="https://www.elastic.co/observability-labs/assets/images/metrics-explore-analyze-with-esql-discover/metrics-discover-ts-command.png" alt="Interactive metrics exploration in Discover" /></p>
<h2>OTLP endpoint for metrics</h2>
<p>We are also introducing a native OpenTelemetry Protocol (OTLP) endpoint specifically for metrics ingest directly into Elasticsearch. The endpoint especially benefits self-managed customers, and will be integrated into our <a href="https://www.elastic.co/docs/reference/opentelemetry/motlp">Elastic Cloud Managed OTLP Endpoint</a> for Elastic-managed offerings. The native endpoint and related updates improve ingest performance and scalability of OTel metrics, providing up to 60% higher throughput via <code>_otlp</code>, and up to 25% higher throughput when using classic <code>_bulk</code> methods. </p>
<h2>In Conclusion</h2>
<p>By merging the power of ES|QL's new time series aggregations with the familiar interactive experience of Discover, Elastic 9.2 enables a potent set of metrics analytics tools. The tools significantly boost the exploration and analysis phase of any observability workflow. And we’re just getting started on unleashing the full power of metrics in Elastic Observability!</p>
<p>We welcome you to <a href="https://cloud.elastic.co/serverless-registration?onboarding_token=observability">try the new features</a> today!</p>
<p>Also learn more about how we provide metrics analytics for AWS, Azure, GCP, Kubernetes, and LLMs on <a href="https://www.elastic.co/observability-labs">Observability Labs</a>.</p>
]]></content:encoded>
            <category>observability-labs</category>
            <enclosure url="https://www.elastic.co/observability-labs/assets/images/metrics-explore-analyze-with-esql-discover/metrics-blog-image-ts-discover.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Migrating Datadog and Grafana dashboards and alerts to Kibana with the Observability Migration Platform]]></title>
            <link>https://www.elastic.co/observability-labs/blog/migrate-datadog-grafana-dashboards-alerts-to-kibana</link>
            <guid isPermaLink="false">migrate-datadog-grafana-dashboards-alerts-to-kibana</guid>
            <pubDate>Tue, 28 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Learn how to migrate supported Datadog and Grafana dashboards and alerts to Kibana with the Observability Migration Platform.]]></description>
            <content:encoded><![CDATA[<p>The Observability Migration Platform is a CLI-driven workflow that translates supported Grafana and Datadog assets into Kibana-native outputs and produces the evidence needed to review the result. It changes migration from a manual rebuild into a translation-and-verification workflow that gets teams into <a href="https://www.elastic.co/docs/solutions/observability">Elastic Observability</a> faster.</p>
<h2>Migrations covered by the Observability Migration Platform</h2>
<p>The current scope covers Datadog and Grafana. The platform can work from exported assets or live APIs, and it focuses on dashboards and alerting content on the Datadog and Grafana paths it currently covers.</p>
<p>Support is not identical across the two sources. Datadog has end-to-end extraction, validation, compile, upload, smoke, and verification workflows, but it currently covers a narrower slice of widgets and monitors. Grafana coverage is broader. The platform provides a practical translation pipeline for the supported paths.</p>
<p>The screenshots below show examples of dashboards after migration.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/migrate-datadog-grafana-dashboards-alerts-to-kibana/migrated-dashboard-1.jpg" alt="Migrated Node Exporter Full dashboard in Kibana, top of page showing CPU, memory, network, and disk panels" /></p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/migrate-datadog-grafana-dashboards-alerts-to-kibana/migrated-dashboard-2.jpg" alt="Migrated Node Exporter Full dashboard in Kibana, scrolled to the Memory Meminfo section showing detailed memory panels" /></p>
<h2>How the Observability Migration Platform works</h2>
<p>At a high level, the workflow has two halves: source-aware translation on the way in and target-aware validation and delivery on the way out. That split matters because Grafana and Datadog differ not only in JSON shape, but also in query languages, panel types, controls, and alerting models.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/migrate-datadog-grafana-dashboards-alerts-to-kibana/overview.png" alt="End-to-end flow of the Observability Migration Platform: extract from Grafana or Datadog, normalize and plan, translate queries, panels, and alerts, emit Kibana-native output, validate against an Elastic target, then compile and upload to Kibana while producing verification and review artifacts" /></p>
<p>A run starts with exported assets or live source APIs. From there, the workflow normalizes source-specific objects, chooses a translation path for each supported dashboard, panel, and alerting artifact, and emits Kibana-native output. This is where most of the source-specific logic lives: translating queries or Datadog formulas, mapping panel semantics, carrying forward controls and links where possible, and deciding when an exact translation is not the right answer.</p>
<p>The second half is target-aware. The emitted output can be validated against an Elastic target, compiled, and uploaded to Kibana through the shared runtime. In the happy path, that yields a working translated dashboard. In rougher cases, validation may show that a panel cannot run safely as emitted. When that happens, the workflow is designed to fail conservatively: it can mark the panel for manual review or replace it with an upload-safe placeholder instead of shipping a broken runtime panel.</p>
<p>Just as important, the outcome is not simply &quot;a dashboard showed up in Kibana.&quot; The workflow also produces reviewer-facing evidence such as a migration report, manifest, verification packets, and rollout plan so you can see what translated cleanly, what was downgraded or flagged for manual review, and what still needs human judgment. Those artifacts are what make the process operationally credible: they give teams something concrete to inspect, compare, and act on.</p>
<h2>Running the migration</h2>
<p>The platform is CLI-driven, which makes it a good fit for migration work that needs to be repeatable, reviewable, and easy to automate. Users can start with a representative slice of dashboards and alerting content from Grafana or Datadog, point the workflow at an Elastic target, and use that first run to understand translation quality, validation results, and how much follow-up review is required.</p>
<p>To run the full path against Elastic, create an <a href="https://www.elastic.co/docs/solutions/observability/get-started">Elastic Observability Serverless</a> project, generate a <a href="https://www.elastic.co/docs/deploy-manage/api-keys/serverless-project-api-keys">Serverless project API key</a>, and point the CLI at your Elasticsearch and Kibana endpoints:</p>
<pre><code class="language-shell">obs-migrate migrate \
  --source grafana \
  --input-mode files \
  --input-dir ./grafana_exports \
  --output-dir ./migration_output \
  --assets all \
  --native-promql \
  --data-view &quot;metrics-*&quot; \
  --validate \
  --es-url &quot;$ELASTICSEARCH_ENDPOINT&quot; \
  --es-api-key &quot;$KEY&quot; \
  --kibana-url &quot;$KIBANA_ENDPOINT&quot; \
  --kibana-api-key &quot;$KEY&quot; \
  --upload
</code></pre>
<p>The run validates the emitted queries against Elastic, compiles the generated dashboards, uploads them to Kibana, and produces the standard migration artifacts for review.</p>
<p>A typical run looks like this:</p>
<ol>
<li>Start with exported assets or live source APIs from Grafana or Datadog.</li>
<li>Choose the asset scope with <code>--assets dashboards</code>, <code>--assets alerts</code>, or <code>--assets all</code>.</li>
<li>Translate the supported dashboards, queries, controls, and alerting artifacts into Kibana-native output.</li>
<li>Validate the emitted content against an Elastic target (if configured), then compile and upload the translated dashboards for dashboard-capable runs.</li>
<li>Review the migration evidence, such as <code>migration_report.json</code>, <code>verification_packets.json</code>, and <code>run_summary.json</code>, to understand what translated cleanly, where semantic gaps remain, and which dashboards, panels, or alert rules still require human review.</li>
<li>If alert rule creation is enabled, review the migrated rules (which are disabled by default) in Kibana before deciding which ones to enable or redesign.</li>
</ol>
<h2>What's next</h2>
<p>The platform is still evolving and will continue to gain depth and self-service capabilities. The biggest open areas are stronger, measured verification of source-to-target semantics, further coverage for Datadog, deeper coverage of harder query families and non-dashboard surfaces, and cleaner shared runtime contracts across the workflow.</p>
<p>It is also built to grow over time. The source and target boundaries are explicit by design, which gives the platform room to expand coverage and support additional source paths in the future.</p>
<h2>In conclusion</h2>
<p>If you are planning a move into Elastic, a good starting point is to create an <a href="https://www.elastic.co/docs/solutions/observability/get-started">Elastic Observability Serverless</a> project. That gives you the target environment where translated dashboards and alerting content can be validated and reviewed.</p>
<p>To learn more about the migration workflow, talk to your Elastic representative about current access, supported coverage, and how it can help with your migration needs.</p>
]]></content:encoded>
            <category>observability-labs</category>
            <enclosure url="https://www.elastic.co/observability-labs/assets/images/migrate-datadog-grafana-dashboards-alerts-to-kibana/header.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Your PromQL queries now run in Kibana!]]></title>
            <link>https://www.elastic.co/observability-labs/blog/promql-queries-run-in-kibana</link>
            <guid isPermaLink="false">promql-queries-run-in-kibana</guid>
            <pubDate>Wed, 15 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[With PromQL now natively supported in Kibana, write and execute PromQL for analyzing metrics in Discover, in Dashboards visualizations, in alerting rules and wherever else ES|QL is supported. PromQL is currently available in Tech Preview for common metrics analytics use cases.]]></description>
            <content:encoded><![CDATA[<p>Developed in 2012 alongside Prometheus, PromQL has been a cornerstone of time-series monitoring for over a decade.
While Kibana already comprehensively supports time-series analysis via the ES|QL TS command, we are thrilled to introduce native PromQL support for common metrics analytics use cases.
For teams already fluent in PromQL, this support means a near-zero learning curve and significantly easier onboarding directly into the Elastic ecosystem.</p>
<h2>Running PromQL queries in Kibana</h2>
<p>In the ES|QL editor in Kibana, start the query with the <code>PROMQL</code> command and type your PromQL expression after it.
The <code>PROMQL</code> keyword marks that segment so that Elasticsearch parses it as PromQL inside the wider ES|QL request that Kibana sends.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/promql-queries-run-in-kibana/promql-first-look.png" alt="Discover in ES|QL mode with a PROMQL query in the bar" /></p>
<h2>What you can query</h2>
<p>Here are a few patterns to get started.</p>
<p><strong>Raw metric</strong></p>
<pre><code class="language-esql">PROMQL container.cpu.usage
</code></pre>
<p><strong>Average across all containers</strong></p>
<pre><code class="language-esql">PROMQL avg(container.cpu.usage)
</code></pre>
<p><strong><code>rate()</code> on a counter</strong></p>
<pre><code class="language-esql">PROMQL rate(docker.network.inbound.bytes)
</code></pre>
<p><strong>Aggregated rate</strong></p>
<pre><code class="language-esql">PROMQL sum(rate(docker.network.inbound.bytes))
</code></pre>
<p><strong>Group by a label</strong></p>
<pre><code class="language-esql">PROMQL sum by (agent.id) (rate(docker.network.inbound.bytes))
</code></pre>
<p>You may notice that none of these examples sets <code>start</code>, <code>end</code>, or <code>step</code>, and that the <code>rate()</code> calls carry no lookback window.
Those parameters are optional: the time picker and Kibana defaults handle most of it for you.</p>
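<p>If you would rather state the lookback window yourself, standard PromQL range-selector syntax is the natural way to do it. The sketch below assumes the tech preview accepts a <code>[5m]</code> range selector inside <code>rate()</code>; the five-minute window is an illustrative value, not a recommended default.</p>
<pre><code class="language-esql">PROMQL sum by (agent.id) (rate(docker.network.inbound.bytes[5m]))
</code></pre>
<p>Leaving the selector out, as in the earlier examples, lets Kibana choose the window for you.</p>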
<p>Optionally, you can include the data stream name using the <code>index=</code> parameter.
For example: <code>PROMQL index=metrics-docker.cpu-default container.cpu.usage</code>.
Adding the parameter helps narrow down the scope of what data the query scans.</p>
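<p>The <code>index=</code> parameter should combine with the aggregation patterns shown earlier. The following is a minimal sketch that assumes the <code>metrics-docker.cpu-default</code> data stream carries an <code>agent.id</code> label, which is not confirmed here; substitute a label that exists in your own data.</p>
<pre><code class="language-esql">PROMQL index=metrics-docker.cpu-default avg by (agent.id) (container.cpu.usage)
</code></pre>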
<p>The current tech preview of PromQL supports over 80% of the queries in a benchmark built from top Grafana dashboards.
Advanced modifiers and specific functions are under consideration for future releases.</p>
<h2>Find your streams and metric names</h2>
<p>If you have existing PromQL queries, you can use them directly in the <code>PROMQL</code> command without changes.
If you are writing a query from scratch and need to find the exact field names, run <code>TS metrics-*</code> in Discover to see every metrics data stream.
Each metric appears as a small chart so you can tell at a glance what is active.
Hover over a metric and click the &quot;View details&quot; action to see the field name and the data stream it belongs to.</p>
<p>For a deeper walkthrough, see <a href="https://www.elastic.co/docs/solutions/observability/infra-and-hosts/discover-metrics">Explore metrics data with Discover in Kibana</a>.</p>
<h2>Time picker and query time handling</h2>
<p>The time picker in Kibana sets the time window for the query.
Dashboard panels and Alerting rules work the same way using their own time range, so you do not need to write <code>start=</code> or <code>end=</code> in the query itself.</p>
<p>Step is the gap between two consecutive data points on the chart.
A smaller step means more data points across the same span.
If you do not set <code>step=</code> or <code>buckets=</code>, the default is <code>buckets=100</code>.
You can set <code>step=</code> to a fixed width such as <code>1m</code>, or set <code>buckets=</code> to a different target maximum number of data points.</p>
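<p>As a sketch of how these settings might look in practice, the query below pins the step to one minute. It assumes that <code>step=</code> is written before the PromQL expression, in the same position as the <code>index=</code> parameter; check the placement against the documentation for your release.</p>
<pre><code class="language-esql">PROMQL step=1m avg(container.cpu.usage)
</code></pre>
<p>To cap the number of data points instead of fixing their width, the same assumption gives the following form, where <code>50</code> is an arbitrary illustrative value:</p>
<pre><code class="language-esql">PROMQL buckets=50 avg(container.cpu.usage)
</code></pre>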
<h2>Discover and Dashboards</h2>
<p>In Discover, switch to ES|QL mode and run your <code>PROMQL</code> query to see, as a time-series chart, how the metric behaves over the range you pick.
When you want to save that visualization, choose &quot;Save visualization to dashboard&quot; and add it to a new or existing dashboard.</p>
<p>Or go to Dashboards directly: add a panel, choose ES|QL, and write your <code>PROMQL</code> query.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/promql-queries-run-in-kibana/dashboard-promql.png" alt="Dashboard: ES|QL visualization with PromQL" /></p>
<h2>Alerting</h2>
<p>You can create alert rules using PromQL.
Go to Alerts, open Manage rules, and create a rule.
Search for Elasticsearch query and select it.
Choose ES|QL as the query type.</p>
<p>Write your <code>PROMQL</code> query, but assign the metric to a variable so you can use it in a <code>WHERE</code> clause for the alert condition:</p>
<pre><code class="language-esql">PROMQL metric_value=(sum by (agent.id) (rate(docker.network.inbound.bytes)))
| WHERE metric_value &gt;= 500
</code></pre>
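<p>The same pattern should carry over to gauge-style metrics. In the sketch below, the column name <code>cpu_avg</code> and the threshold <code>0.8</code> are hypothetical placeholders; the right threshold depends on the unit in which your data reports <code>container.cpu.usage</code>.</p>
<pre><code class="language-esql">PROMQL cpu_avg=(avg by (agent.id) (container.cpu.usage))
| WHERE cpu_avg &gt; 0.8
</code></pre>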
<p>Select <code>@timestamp</code> for the time field and continue defining the rest of the rule configuration.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/promql-queries-run-in-kibana/alert-rule-promql.png" alt="Alert rule: Elasticsearch query with a PROMQL condition" /></p>
<h2>Try it</h2>
<ol>
<li>Open an <a href="https://cloud.elastic.co/serverless-registration">Observability project on Elastic Cloud Serverless</a>, or use Elastic Stack 9.4.</li>
<li>Write your query: in the ES|QL editor in Kibana, run your PromQL via <code>PROMQL</code>.
You can also go to Dashboards, add a panel, choose ES|QL, and write the query there.</li>
<li>If you are writing from scratch and need to find metric names, run <code>TS metrics-*</code> in Discover (see &quot;Find your streams and metric names&quot; above).</li>
<li>Check the results and adapt the query if needed.</li>
</ol>
<p>PromQL support in Elasticsearch and Kibana will continue to evolve.
Follow the Observability Labs feed for follow-up posts as coverage and ergonomics improve.</p>
]]></content:encoded>
            <category>observability-labs</category>
            <enclosure url="https://www.elastic.co/observability-labs/assets/images/promql-queries-run-in-kibana/cover.png" length="0" type="image/png"/>
        </item>
    </channel>
</rss>