<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Elastic Observability Labs - Articles by Jeffrey Rengifo</title>
        <link>https://www.elastic.co/observability-labs</link>
        <description>Trusted security news &amp; research from the team at Elastic.</description>
        <lastBuildDate>Thu, 09 Apr 2026 17:21:53 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>Elastic Observability Labs - Articles by Jeffrey Rengifo</title>
            <url>https://www.elastic.co/observability-labs/assets/observability-labs-thumbnail.png</url>
            <link>https://www.elastic.co/observability-labs</link>
        </image>
        <copyright>© 2026. Elasticsearch B.V. All Rights Reserved</copyright>
        <item>
            <title><![CDATA[How to cut Elasticsearch log storage costs with LogsDB]]></title>
            <link>https://www.elastic.co/observability-labs/blog/elasticsearch-logsdb-index-mode-storage-savings</link>
            <guid isPermaLink="false">elasticsearch-logsdb-index-mode-storage-savings</guid>
            <pubDate>Thu, 09 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Learn how to enable LogsDB index mode in Elasticsearch and measure real storage savings. We compare a standard index against a LogsDB index using Apache logs and show how much storage you can reclaim.]]></description>
            <content:encoded><![CDATA[<p>LogsDB is a specialized Elasticsearch index mode that gives you full functionality at a fraction of the storage cost. Your Kibana dashboards, searches, alerts, and visualizations all continue to work exactly as before. No data is discarded. No queries need to be updated. No workflows break. It is one setting, and everything else gets cheaper.</p>
<p>In benchmarks, LogsDB brought a dataset from <strong>162.7 GB down to 39.4 GB</strong> — a <strong>76% reduction in storage</strong>. You can explore the full nightly benchmark results at <a href="https://elasticsearch-benchmarks.elastic.co/#tracks/logsdb/nightly/default/90d">elasticsearch-benchmarks.elastic.co</a>.</p>
<p>In this tutorial you'll reproduce the experiment yourself using Kibana Dev Tools and an Apache logs dataset. You'll create two indices with identical mappings, ingest the same documents into both, and measure the storage difference with the <code>_stats</code> API. By the end, you'll have a measured reduction for your own test data (44% in our run) and understand exactly why production numbers push even higher.</p>
<blockquote>
<p><strong>Already on Elasticsearch 9.2+?</strong> Any data stream with a <code>logs-</code> prefix already uses LogsDB by default. Jump to <a href="#what-about-your-existing-logs">What about your existing logs?</a> to verify your setup.</p>
</blockquote>
<blockquote>
<p><strong>Want the full picture?</strong> For the engineering history behind these savings — how Lucene doc values, synthetic <code>_source</code>, index sorting, and ZSTD were developed and stacked over twelve years — see <a href="https://www.elastic.co/observability-labs/articles/elasticsearch-logsdb-storage-evolution"><em>Elasticsearch over the years: how LogsDB cuts index size by up to 75%</em></a>.</p>
</blockquote>
<h2>Prerequisites</h2>
<ul>
<li>Elasticsearch 8.17+ cluster, Elastic Cloud deployment, or Serverless</li>
<li>Kibana with Dev Tools access</li>
<li>A source of logs to ingest; this tutorial uses Apache access logs</li>
<li>Basic familiarity with running API calls in Kibana Dev Tools</li>
</ul>
<h2>How LogsDB saves storage</h2>
<p>LogsDB stacks three mechanisms to achieve its storage reduction:</p>
<ul>
<li><strong>Index sorting</strong> — documents are sorted by <code>host.name</code> then <code>@timestamp</code>, grouping similar log lines so compression codecs find far more repeated patterns. Sorting alone accounts for roughly 30% of the savings.</li>
<li><strong>ZSTD compression with delta/GCD/run-length encoding</strong> — <code>best_compression</code> switches from LZ4 to Zstandard and applies numeric codecs to each doc values column. The standard index in this tutorial uses LZ4, so part of what you're measuring is the full package LogsDB delivers automatically.</li>
<li><strong>Synthetic <code>_source</code></strong> — Elasticsearch skips storing the raw JSON blob entirely and reconstructs <code>_source</code> on demand from doc values, adding another 20–40% of savings on top.</li>
</ul>
<blockquote>
<p><strong>Synthetic <code>_source</code> trade-offs:</strong> Field ordering in returned documents may differ from the original, and some edge cases around multi-value array fields behave differently. For most log analytics workloads these differences are invisible, but check the <a href="#next-steps">synthetic <code>_source</code> documentation</a> before enabling it in latency-sensitive applications.</p>
</blockquote>
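<p>The effect of index sorting described above can be illustrated with a toy experiment. The sketch below is not how Lucene or LogsDB compresses data: it uses Python's <code>zlib</code> as a stand-in codec, and the host names, user agent strings, and log line layout are all made up for illustration. It only shows that grouping lines from the same host next to each other gives a general-purpose compressor more repeated patterns to work with.</p>

```python
import random
import zlib

random.seed(42)  # fixed seed so the run is repeatable

# Hypothetical fleet: every host has its own user agent string, so lines from
# the same host share far more bytes than lines from different hosts.
hosts = [f"web-{i:03d}.example.internal" for i in range(500)]
agents = {host: f"agent/{random.getrandbits(256):064x}" for host in hosts}

# Build log lines in arrival order: timestamps increase, hosts interleave.
rows = []
for second in range(20_000):
    host = random.choice(hosts)
    line = f"{host} {second:08d} GET /index.html 200 {agents[host]}"
    rows.append((host, second, line))

arrival = "\n".join(line for _, _, line in rows).encode()

# Sort by host first, then timestamp, mirroring the LogsDB sort fields.
# (Sort direction is irrelevant for this illustration.)
ordered = sorted(rows, key=lambda row: (row[0], row[1]))
by_host = "\n".join(line for _, _, line in ordered).encode()

arrival_size = len(zlib.compress(arrival))
sorted_size = len(zlib.compress(by_host))
print(f"arrival order: {arrival_size} bytes, sorted by host: {sorted_size} bytes")
```

<p>The sorted copy should compress noticeably smaller because each host's repeated strings sit side by side instead of being scattered across the file. The exact ratio depends entirely on this synthetic data and says nothing about the savings you will measure in Elasticsearch.</p>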
<p>For a deep dive into the architecture behind each mechanism, see <a href="https://www.elastic.co/observability-labs/articles/elasticsearch-logsdb-storage-evolution"><em>Elasticsearch over the years: how LogsDB cuts index size by up to 75%</em></a>.</p>
<p>Let's now walk through the steps you can take to enable LogsDB and measure the storage savings.</p>
<h2>Step 1: Collect logs with Elastic Agent</h2>
<p>The recommended way to ingest Apache logs into Elasticsearch is through Elastic Agent with the Apache integration. It handles collection, parsing, ECS field mapping, and routing automatically.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/elasticsearch-logsdb-index-mode-storage-savings/integration.png" alt="Elastic Agent Apache integration setup in Kibana" /></p>
<p>Browse all available integrations in the <a href="https://www.elastic.co/integrations">Elastic integrations catalog</a>.</p>
<p>Once the Agent is collecting logs and routing them to <code>logs-apache.access-*</code>, move to the next step.</p>
<h2>Step 2: Create the two indices</h2>
<p>All commands in this tutorial are run in <strong>Kibana Dev Tools</strong>.</p>
<p>Create one standard index and one LogsDB index with identical field mappings. The only difference is <code>&quot;index.mode&quot;: &quot;logsdb&quot;</code>.</p>
<p><strong>Standard index:</strong></p>
<pre><code class="language-json">PUT /apache-standard
{
  &quot;mappings&quot;: {
    &quot;properties&quot;: {
      &quot;@timestamp&quot;:                  { &quot;type&quot;: &quot;date&quot; },
      &quot;host.name&quot;:                   { &quot;type&quot;: &quot;keyword&quot; },
      &quot;http.request.method&quot;:         { &quot;type&quot;: &quot;keyword&quot; },
      &quot;url.path&quot;:                    { &quot;type&quot;: &quot;keyword&quot; },
      &quot;http.version&quot;:                { &quot;type&quot;: &quot;keyword&quot; },
      &quot;http.response.status_code&quot;:   { &quot;type&quot;: &quot;integer&quot; },
      &quot;http.response.bytes&quot;:         { &quot;type&quot;: &quot;integer&quot; },
      &quot;http.request.referrer&quot;:       { &quot;type&quot;: &quot;keyword&quot; },
      &quot;user_agent.original&quot;:         { &quot;type&quot;: &quot;keyword&quot; }
    }
  }
}
</code></pre>
<p><strong>LogsDB index:</strong></p>
<pre><code class="language-json">PUT /apache-logsdb
{
  &quot;settings&quot;: {
    &quot;index.mode&quot;: &quot;logsdb&quot;
  },
  &quot;mappings&quot;: {
    &quot;properties&quot;: {
      &quot;@timestamp&quot;:                  { &quot;type&quot;: &quot;date&quot; },
      &quot;host.name&quot;:                   { &quot;type&quot;: &quot;keyword&quot; },
      &quot;http.request.method&quot;:         { &quot;type&quot;: &quot;keyword&quot; },
      &quot;url.path&quot;:                    { &quot;type&quot;: &quot;keyword&quot; },
      &quot;http.version&quot;:                { &quot;type&quot;: &quot;keyword&quot; },
      &quot;http.response.status_code&quot;:   { &quot;type&quot;: &quot;integer&quot; },
      &quot;http.response.bytes&quot;:         { &quot;type&quot;: &quot;integer&quot; },
      &quot;http.request.referrer&quot;:       { &quot;type&quot;: &quot;keyword&quot; },
      &quot;user_agent.original&quot;:         { &quot;type&quot;: &quot;keyword&quot; }
    }
  }
}
</code></pre>
<p>That single <code>&quot;index.mode&quot;: &quot;logsdb&quot;</code> line activates all three storage mechanisms. Elasticsearch enables these additional settings behind the scenes — you don't set any of them manually:</p>
<pre><code class="language-json">{
  &quot;index.sort.field&quot;:              [&quot;host.name&quot;, &quot;@timestamp&quot;],
  &quot;index.sort.order&quot;:              [&quot;asc&quot;, &quot;desc&quot;],
  &quot;index.codec&quot;:                   &quot;best_compression&quot;,
  &quot;index.mapping.ignore_malformed&quot;: true,
  &quot;index.mapping.ignore_above&quot;:    8191
}
</code></pre>
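<p>If you script this step rather than pasting requests into Dev Tools, one way to guarantee that the two indices differ only in <code>index.mode</code> is to generate both request bodies from a single mapping definition. The following is a minimal Python sketch under that assumption; <code>build_index_body</code> is a hypothetical helper, not part of any Elasticsearch client, and it only builds the JSON bodies without sending them.</p>

```python
import copy
import json

# Field mappings copied from the two PUT requests in this step.
SHARED_PROPERTIES = {
    "@timestamp": {"type": "date"},
    "host.name": {"type": "keyword"},
    "http.request.method": {"type": "keyword"},
    "url.path": {"type": "keyword"},
    "http.version": {"type": "keyword"},
    "http.response.status_code": {"type": "integer"},
    "http.response.bytes": {"type": "integer"},
    "http.request.referrer": {"type": "keyword"},
    "user_agent.original": {"type": "keyword"},
}


def build_index_body(index_mode=None):
    """Return a create-index body; add index.mode only when one is requested."""
    body = {"mappings": {"properties": copy.deepcopy(SHARED_PROPERTIES)}}
    if index_mode is not None:
        body["settings"] = {"index.mode": index_mode}
    return body


standard_body = build_index_body()
logsdb_body = build_index_body("logsdb")

# Print the LogsDB body so it can be pasted after "PUT /apache-logsdb".
print(json.dumps(logsdb_body, indent=2))
```

<p>Because both bodies come from the same dictionary, any field you add later for your own logs lands in both indices, which keeps the comparison fair.</p>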
<h2>Step 3: Reindex the logs</h2>
<p>Use the <code>_reindex</code> API to copy the same documents into both test indices:</p>
<pre><code class="language-json">POST /_reindex
{
  &quot;source&quot;: { &quot;index&quot;: &quot;logs-apache.access-*&quot; },
  &quot;dest&quot;:   { &quot;index&quot;: &quot;apache-standard&quot; }
}

POST /_reindex
{
  &quot;source&quot;: { &quot;index&quot;: &quot;logs-apache.access-*&quot; },
  &quot;dest&quot;:   { &quot;index&quot;: &quot;apache-logsdb&quot; }
}
</code></pre>
<p>Both indices now hold identical documents, so the storage comparison in the next step reflects only the index mode difference.</p>
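<p>Before moving on, it can be worth confirming that both reindex calls copied the same number of documents. The sketch below is a minimal Python check, assuming you have saved the two <code>_reindex</code> responses as dictionaries. The sample responses are hypothetical and abridged; they reuse the document count from our test run, and the helper names are illustrative.</p>

```python
def reindex_succeeded(response):
    """True when a _reindex response reports no timeout and no failures."""
    return not response.get("timed_out", False) and not response.get("failures")


def same_documents_copied(first, second):
    """True when both runs succeeded and created the same number of documents."""
    return (
        reindex_succeeded(first)
        and reindex_succeeded(second)
        and first["total"] == second["total"]
        and first["created"] == second["created"]
    )


# Hypothetical, abridged responses for the two reindex calls above.
standard_response = {"timed_out": False, "total": 111818, "created": 111818, "failures": []}
logsdb_response = {"timed_out": False, "total": 111818, "created": 111818, "failures": []}

print(same_documents_copied(standard_response, logsdb_response))
```

<p>If the check fails, inspect the <code>failures</code> array in the response before comparing sizes, since a partial copy would skew the result.</p>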
<h2>Step 4: Force merge for a fair comparison</h2>
<p>Before measuring, force merge both indices to a single segment:</p>
<pre><code>POST /apache-standard/_forcemerge?max_num_segments=1

POST /apache-logsdb/_forcemerge?max_num_segments=1
</code></pre>
<p>These calls block until the merge finishes. Wait for both responses before continuing.</p>
<p><strong>Why this matters:</strong> Elasticsearch writes data into multiple Lucene segments before merging them in the background. Measuring mid-merge gives artificially inflated numbers because each segment is compressed independently. Forcing a single segment shows the real steady-state storage footprint you'd see in a mature production index.</p>
<blockquote>
<p><strong>Only run <code>_forcemerge</code> on indices that are no longer being written to.</strong> Force merging an index that is still receiving writes is resource-intensive and can impact ingestion performance. In production, you can use <a href="https://www.elastic.co/docs/manage-data/lifecycle/index-lifecycle-management">Index Lifecycle Management (ILM)</a> to automate force merges in the hot phase after rollover or in the warm phase, once an index no longer receives writes.</p>
</blockquote>
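<p>To confirm that both merges actually finished, you could request the segment count from the <code>_stats</code> API (for example with <code>filter_path=indices.*.primaries.segments.count</code>) and check it in a script. The Python sketch below assumes each test index has a single primary shard, which is the default; the sample response is hypothetical and the helper names are illustrative.</p>

```python
def segment_counts(stats_response):
    """Map each index name to its primary segment count from an _stats response."""
    return {
        name: data["primaries"]["segments"]["count"]
        for name, data in stats_response.get("indices", {}).items()
    }


def fully_merged(stats_response, primary_shards=1):
    """True when every index has at most one segment per primary shard."""
    counts = segment_counts(stats_response)
    return all(count <= primary_shards for count in counts.values())


# Hypothetical filtered _stats response covering both test indices.
merged_stats = {
    "indices": {
        "apache-standard": {"primaries": {"segments": {"count": 1}}},
        "apache-logsdb": {"primaries": {"segments": {"count": 1}}},
    }
}

print(segment_counts(merged_stats))
print(fully_merged(merged_stats))
```

<p>A count above one per primary shard suggests the merge has not completed, so the size comparison in the next step would not yet reflect the steady state.</p>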
<h2>Step 5: Measure the difference</h2>
<pre><code>GET /apache-standard/_stats?filter_path=indices.*.primaries.store

GET /apache-logsdb/_stats?filter_path=indices.*.primaries.store
</code></pre>
<p>The <code>filter_path</code> parameter keeps the response focused. Look for <code>primaries.store.size_in_bytes</code> in each response.</p>
<p>In our test with Apache log records, the results were:</p>
<table>
<thead>
<tr>
<th>Index</th>
<th>Documents</th>
<th>Size</th>
</tr>
</thead>
<tbody>
<tr>
<td>apache-standard</td>
<td>111,818</td>
<td>15.37 MB</td>
</tr>
<tr>
<td>apache-logsdb</td>
<td>111,818</td>
<td>8.6 MB</td>
</tr>
<tr>
<td><strong>Reduction</strong></td>
<td></td>
<td><strong>44%</strong></td>
</tr>
</tbody>
</table>
<p>To put this in perspective: at 1 TB of log data, a 44% reduction brings that down to around 560 GB. That's roughly 440 GB saved without any changes to your queries. At production scale with billions of documents and synthetic <code>_source</code> enabled, savings push to 76%, taking 162.7 GB down to 39.4 GB in our benchmark.</p>
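<p>The arithmetic behind these figures can be reproduced with a short script. The sketch below is a minimal Python example, assuming the two <code>_stats</code> responses have been saved as dictionaries. The byte counts are approximations converted from the MB values in the table (treating 1 MB as 1024 × 1024 bytes, which is an assumption), and the function names are illustrative.</p>

```python
MB = 1024 * 1024  # assumption: the sizes in the table are binary megabytes


def primary_store_bytes(stats_response, index_name):
    """Read primaries.store.size_in_bytes for one index from an _stats response."""
    return stats_response["indices"][index_name]["primaries"]["store"]["size_in_bytes"]


def reduction_percent(standard_bytes, logsdb_bytes):
    """Percentage by which the LogsDB index is smaller than the standard index."""
    return (1 - logsdb_bytes / standard_bytes) * 100


# Approximate responses rebuilt from the table above (15.37 MB and 8.6 MB).
standard_stats = {
    "indices": {"apache-standard": {"primaries": {"store": {"size_in_bytes": round(15.37 * MB)}}}}
}
logsdb_stats = {
    "indices": {"apache-logsdb": {"primaries": {"store": {"size_in_bytes": round(8.6 * MB)}}}}
}

reduction = reduction_percent(
    primary_store_bytes(standard_stats, "apache-standard"),
    primary_store_bytes(logsdb_stats, "apache-logsdb"),
)

# Project the same ratio onto 1 TB (1000 GB) of standard-index data.
projected_gb = 1000 * (1 - reduction / 100)
print(f"{reduction:.1f}% smaller; 1000 GB would shrink to about {projected_gb:.0f} GB")
```

<p>Swap in the <code>size_in_bytes</code> values from your own responses to get the reduction for your data. The projection is a straight-line extrapolation, so treat it as a rough estimate rather than a guarantee.</p>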
<h2>Visualize in Kibana</h2>
<p>To see the storage difference visually, open Kibana and go to <strong>Management → Stack Management → Index Management</strong>. You'll see both indices listed with their current sizes side by side.</p>
<p><img src="https://www.elastic.co/observability-labs/assets/images/elasticsearch-logsdb-index-mode-storage-savings/index-stats.png" alt="Kibana Index Management showing storage comparison between standard and LogsDB indices" /></p>
<blockquote>
<p><strong>Why Kibana shows larger numbers than <code>_stats</code>:</strong> Kibana Index Management displays the total index size including all replica shards. The <code>_stats</code> query above uses <code>primaries</code> to report primary shards only. The ratio between the two indices remains the same either way.</p>
</blockquote>
<h2>What about your existing logs?</h2>
<h3>Elasticsearch 9.2+ (already enabled by default)</h3>
<p>Since 9.2, any data stream matching the <code>logs-*</code> naming pattern automatically uses LogsDB. You're likely already saving storage without any configuration change.</p>
<p>Verify your existing data streams:</p>
<pre><code>GET /.ds-logs-*/_settings?filter_path=*.settings.index.mode
</code></pre>
<p>If a backing index reports <code>&quot;mode&quot;: &quot;logsdb&quot;</code> under <code>settings.index</code> in the response, it is already getting the savings.</p>
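<p>With many backing indices, scanning the response by eye gets tedious. The sketch below is a small Python helper, assuming the <code>_settings</code> response has been saved as a dictionary in its default nested form. The backing index names in the sample are hypothetical, and <code>logsdb_indices</code> is an illustrative name.</p>

```python
def logsdb_indices(settings_response):
    """Return the sorted names of indices whose index.mode is logsdb."""
    return sorted(
        name
        for name, body in settings_response.items()
        if body.get("settings", {}).get("index", {}).get("mode") == "logsdb"
    )


# Hypothetical response: one backing index uses LogsDB, the other does not.
sample_settings = {
    ".ds-logs-apache.access-default-2026.04.01-000001": {
        "settings": {"index": {"mode": "logsdb"}}
    },
    ".ds-logs-myapp-default-2026.04.01-000001": {
        "settings": {"index": {"mode": "standard"}}
    },
}

for name in logsdb_indices(sample_settings):
    print(name)
```

<p>Because <code>filter_path</code> drops indices that have no <code>index.mode</code> setting at all, any backing index missing from the response is not using LogsDB either.</p>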
<h3>Elasticsearch 8.x or 9.0–9.1 (enable per data stream via index template)</h3>
<p>For earlier versions, enable LogsDB on a data stream by updating its index template. This affects all new indices created from that template — existing indices are not changed, so the transition is safe and gradual.</p>
<p><strong>Option A — Update an existing template:</strong></p>
<pre><code class="language-json">PUT _index_template/logs-myapp-template
{
  &quot;index_patterns&quot;: [&quot;logs-myapp-*&quot;],
  &quot;data_stream&quot;: {},
  &quot;template&quot;: {
    &quot;settings&quot;: {
      &quot;index.mode&quot;: &quot;logsdb&quot;
    }
  },
  &quot;priority&quot;: 200
}
</code></pre>
<p><strong>Option B — Check and patch an existing integration template:</strong></p>
<p>First, find the template managing your data stream:</p>
<pre><code>GET _index_template/logs-apache*
</code></pre>
<p>Then add the <code>index.mode</code> setting to the <code>template.settings</code> block using a <code>PUT _index_template/&lt;name&gt;</code> call with the full template body including your addition.</p>
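<p>Since the <code>PUT</code> call replaces the whole template, the body you send needs everything the existing template already contains plus the new setting. The Python sketch below shows one way to build that body from a saved <code>GET _index_template</code> response. It assumes the settings come back in nested form, and the template contents in the sample are hypothetical and abridged; <code>with_logsdb</code> is an illustrative helper, not a client API.</p>

```python
import copy
import json


def with_logsdb(index_template):
    """Return a copy of an index template body with index.mode set to logsdb."""
    patched = copy.deepcopy(index_template)
    settings = patched.setdefault("template", {}).setdefault("settings", {})
    # Assumes nested settings, i.e. {"index": {...}} rather than "index.mode".
    settings.setdefault("index", {})["mode"] = "logsdb"
    return patched


# Hypothetical, abridged GET _index_template response.
get_response = {
    "index_templates": [
        {
            "name": "logs-apache.access",
            "index_template": {
                "index_patterns": ["logs-apache.access-*"],
                "template": {"settings": {"index": {"number_of_replicas": "1"}}},
                "priority": 200,
                "data_stream": {},
            },
        }
    ]
}

entry = get_response["index_templates"][0]
put_body = with_logsdb(entry["index_template"])

# This is the body to send with PUT _index_template/<name>.
print(json.dumps(put_body, indent=2))
```

<p>The helper works on a deep copy, so the saved response stays untouched and you can diff the two to review exactly what will change before sending the request.</p>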
<p>After updating the template, the next index rollover will use LogsDB. Trigger a rollover immediately if you don't want to wait:</p>
<pre><code>POST /logs-myapp-default/_rollover
</code></pre>
<p><strong>Upgrading from 8.x to 9.0+:</strong> Existing data streams are not changed automatically. Only new rollovers will use LogsDB. There is no data loss and no reindexing required — the savings accumulate as new indices roll over.</p>
<h2>What about query performance?</h2>
<p>LogsDB does not significantly impact query performance for typical log analytics workloads. The index sorting by <code>host.name</code> and <code>@timestamp</code> can actually <em>improve</em> range query and aggregation performance on those fields, since matching documents are stored adjacently. Queries that don't filter on those fields perform comparably to a standard index.</p>
<p>For indexing throughput data across releases, see the <a href="https://www.elastic.co/observability-labs/articles/elasticsearch-logsdb-storage-evolution#performance-not-just-storage">performance section</a> of the companion article.</p>
<h2>Conclusion</h2>
<p>LogsDB activates with a single <code>&quot;index.mode&quot;: &quot;logsdb&quot;</code> setting and delivers measurable storage savings immediately: 44% in our hands-on test, and 76% (162.7 GB → 39.4 GB) in production benchmarks with synthetic <code>_source</code>. On Elasticsearch 9.2+, <code>logs-*</code> data streams already use LogsDB by default. For 8.x or earlier 9.x clusters, a one-line index template change enables it on your next rollover with no data loss and no reindexing required.</p>
<h2>Next steps</h2>
<ul>
<li><a href="https://www.elastic.co/docs/reference/elasticsearch/index-settings/logsdb">LogsDB index mode documentation</a></li>
<li><a href="https://www.elastic.co/docs/reference/elasticsearch/mapping/synthetic-source">Synthetic <code>_source</code> documentation and limitations</a></li>
<li><a href="https://www.elastic.co/docs/manage-data/data-store/data-streams/logs-data-stream">Configuring a logs data stream</a></li>
<li><a href="https://www.elastic.co/blog/logsdb-index-mode-generally-available">LogsDB GA announcement</a></li>
<li><a href="https://www.elastic.co/blog/elasticsearch-logsdb-tsds-benchmarks">LogsDB and TSDS performance benchmarks</a></li>
</ul>
]]></content:encoded>
            <category>observability-labs</category>
            <enclosure url="https://www.elastic.co/observability-labs/assets/images/elasticsearch-logsdb-index-mode-storage-savings/header.png" length="0" type="image/png"/>
        </item>
    </channel>
</rss>