December 13, 2017

Elasticsearch 6.1.0 released

Today we are pleased to announce the release of Elasticsearch 6.1.0, based on Lucene 7.1.0. This is the latest stable release, and is already available for deployment on Elastic Cloud, our Elasticsearch-as-a-service platform.

Latest stable release in 6.x:

You can read about all the changes in the release notes linked above, but there are a few changes which are worth highlighting:

Index Splitting

As a companion to the Shrink Index API, we now have a Split Index API that allows you to split an existing index into a new index, where each original primary shard is split into two or more primary shards in the new index.

The split is done efficiently by hard-linking the data in the source primary shard into multiple primary shards in the new index, then running a fast Lucene Delete-By-Query to mark documents which should belong to a different shard as deleted. These deleted documents will be physically removed over time by the background merge process.

The split API can only be used on indices that had the index.number_of_routing_shards setting specified at index creation time. From 7.0, we plan to set this automatically. Until then, the feature is available only to indices created on Elasticsearch 6.1.0 or later with this setting in place.
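As a rough illustration of the workflow, the sketch below builds the request bodies involved in a split without sending anything over the network. The helper names (build_create_index_body, build_split_request) are hypothetical, and the validation reflects our reading of the constraint rather than the server's exact checks: we assume the target shard count must be a larger multiple of the source shard count and a factor of index.number_of_routing_shards, and that the source index is made read-only first.

```python
import json


def build_create_index_body(primary_shards, routing_shards):
    """Settings for an index that can later be split (hypothetical helper)."""
    return {
        "settings": {
            "index.number_of_shards": primary_shards,
            "index.number_of_routing_shards": routing_shards,
        }
    }


# Assumption: the source index is write-blocked before the split is requested.
WRITE_BLOCK_BODY = {"settings": {"index.blocks.write": True}}


def build_split_request(source, target, source_shards, target_shards, routing_shards):
    """Return (method, path, body) for a split call, or raise ValueError.

    The checks below are a client-side approximation of the shard-count rules.
    """
    if target_shards <= source_shards or target_shards % source_shards != 0:
        raise ValueError("target shards must be a larger multiple of source shards")
    if routing_shards % target_shards != 0:
        raise ValueError("target shards must be a factor of number_of_routing_shards")
    body = {"settings": {"index.number_of_shards": target_shards}}
    return "POST", f"/{source}/_split/{target}", body


if __name__ == "__main__":
    print(json.dumps(build_create_index_body(5, 30), indent=2))
    print(build_split_request("logs", "logs-split", 5, 10, 30))
```

With 5 primary shards and 30 routing shards, this sketch accepts targets of 10, 15, or 30 shards and rejects anything else, such as 12.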

Composite Aggregation with Paging

Elasticsearch is designed to return the top-10 best search results or the top-50 most accessed destination pages in your web logs as fast as possible. This speed is part of the reason why Elasticsearch is so popular for analytics. However, sometimes you need to get back ALL terms and the top-N design of aggregations doesn’t allow this to happen efficiently on high cardinality fields.

The new composite aggregation is designed to make this possible. The composite agg allows you to create terms, histogram, or date_histogram composite buckets on one or more fields, sorted in "natural order", i.e. alphabetically for terms, and numerically or by date for the histograms.

Because these composite buckets are returned in sorted order, results can be paged through efficiently in a similar manner to a scroll request. The first search request could return the first 100 or 1000 buckets, then the next tranche can be requested by passing the values of the last composite bucket in the after parameter, and so on until all buckets have been retrieved.
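The paging loop described above can be sketched as follows. This is a minimal illustration, not a client implementation: search is a hypothetical callable that takes a request body and returns the parsed search response (for example, a thin wrapper around whichever HTTP client you use), and the aggregation name all_buckets is made up for the example.

```python
def composite_request(sources, size, after=None):
    """Build a search body containing a single composite aggregation."""
    composite = {"size": size, "sources": sources}
    if after is not None:
        # Resume after the last composite key seen on the previous page.
        composite["after"] = after
    # "size": 0 suppresses search hits; only aggregation buckets are wanted.
    return {"size": 0, "aggs": {"all_buckets": {"composite": composite}}}


def iter_composite_buckets(search, sources, size=1000):
    """Yield every composite bucket, one page at a time.

    `search` is a hypothetical callable: request body in, response dict out.
    """
    after = None
    while True:
        response = search(composite_request(sources, size, after))
        buckets = response["aggregations"]["all_buckets"]["buckets"]
        if not buckets:
            return  # an empty page means every bucket has been retrieved
        yield from buckets
        after = buckets[-1]["key"]


# Example sources: one composite bucket per (day, product) pair.
EXAMPLE_SOURCES = [
    {"date": {"date_histogram": {"field": "timestamp", "interval": "1d"}}},
    {"product": {"terms": {"field": "product"}}},
]
```

Passing the search function in as a parameter keeps the paging logic independent of any particular client library and makes it easy to exercise with a stub.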

An additional benefit to the composite aggregation is that doc counts and metric aggs directly under the composite aggregation are accurate for the cases where you need non-approximated counts, as we can be sure that we have seen all documents for a particular composite bucket (unlike the top-N model). While you can specify a further terms agg under the composite agg, it will use the standard top-N model and return approximate counts.

Adaptive Replica Selection

Today in Elasticsearch, a series of search requests to the same shard will be forwarded to the primary and each replica in round-robin fashion. This can prove problematic if one node starts a long garbage collection: search requests will still be forwarded to the slow node regardless, which increases search latency.

In 6.1, we have added an experimental feature called Adaptive Replica Selection. Each node tracks and compares how long search requests to other nodes take, and uses this information to adjust how frequently to send requests to shards on particular nodes. In our benchmarks, this results in an overall improvement in search throughput and reduced 99th percentile latencies.

This option is disabled by default as we are still fine-tuning how to compare different search requests and how to account for differences due to caching, but the results we are seeing are very promising. You can enable or disable this feature at runtime by updating a dynamic cluster setting, so it is worth trying this out in your environment. If you do so, we would love to hear about your results.
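For reference, toggling the feature comes down to a cluster settings update. The sketch below only builds the request body, which would be sent with PUT /_cluster/settings. The helper name is hypothetical. The setting key is the one we believe the reference documentation uses, so please confirm it against the docs for your version.

```python
import json


def adaptive_replica_selection_settings(enabled, persistent=False):
    """Build a cluster settings body that turns the feature on or off.

    Transient settings are lost on a full cluster restart; persistent ones survive.
    """
    scope = "persistent" if persistent else "transient"
    return {scope: {"cluster.routing.use_adaptive_replica_selection": enabled}}


if __name__ == "__main__":
    # Body for: PUT /_cluster/settings
    print(json.dumps(adaptive_replica_selection_settings(True), indent=2))
```

Using a transient setting while experimenting means the cluster falls back to the default (disabled) after a full restart.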

Improved Indexing Throughput

Each document indexed in Elasticsearch includes a _field_names metafield, which lists the fields contained in that document. This is needed to support the exists query. It turns out that this simple feature is surprisingly costly. We have reworked the exists query to use doc-values or norms as a proxy for _field_names, which limits the need for the _field_names metafield to only those fields that have neither doc-values nor norms. This simple change has resulted in a 15% increase in indexing throughput in our benchmarks, with no loss of functionality.

Scripted Similarities

Elasticsearch now uses BM25 scoring instead of TF/IDF, which is going to be removed. That said, some people still want to use TF/IDF, and some people would like to have more control over scoring such as disabling term frequency or inverse document frequency. Previously, the only way to have such control was to write an Elasticsearch plugin. This has become much easier thanks to scripted similarities. Now, you can write your own custom similarity using Painless. The linked docs demonstrate how to recreate TF/IDF with two simple scripts.
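To give a feel for what such a similarity looks like, here is a sketch of index settings that define a scripted similarity approximating classic TF/IDF, alongside a plain Python mirror of the same formula for sanity-checking values. The Painless variable names (doc.freq, doc.length, field.docCount, term.docFreq, query.boost) and the formula are recalled from the reference documentation and should be treated as assumptions to verify against the linked docs. The similarity name scripted_tfidf and the Python helpers are purely illustrative. This sketch uses a single script rather than the two-script variant.

```python
import math

# Painless source, assumed to match the variables exposed to scripted similarities.
TFIDF_SCRIPT = (
    "double tf = Math.sqrt(doc.freq); "
    "double idf = Math.log((field.docCount + 1.0) / (term.docFreq + 1.0)) + 1.0; "
    "double norm = 1 / Math.sqrt(doc.length); "
    "return query.boost * tf * idf * norm;"
)


def scripted_similarity_settings(name="scripted_tfidf"):
    """Index-creation settings that register a scripted similarity (illustrative)."""
    return {
        "settings": {
            "similarity": {
                name: {"type": "scripted", "script": {"source": TFIDF_SCRIPT}}
            }
        }
    }


def classic_tfidf(freq, doc_length, doc_count, doc_freq, boost=1.0):
    """Python mirror of the script above, for checking scores by hand."""
    tf = math.sqrt(freq)
    idf = math.log((doc_count + 1.0) / (doc_freq + 1.0)) + 1.0
    norm = 1.0 / math.sqrt(doc_length)
    return boost * tf * idf * norm
```

A field then opts in by naming the similarity in its mapping. Dropping the tf or idf factor from the script is how you would disable term frequency or inverse document frequency.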

Watcher run_as support

Up until now, watches have been executed as an internal X-Pack user when security is enabled, which allowed the watch to access any index that the X-Pack user has access to. Starting in 6.1.0, search inputs, search transforms, and index actions will instead be run as the user who created (or last updated) the watch. This will limit the watch's privileges to those of the user: if the user can't read index foo, then neither can the watch. Elevated permissions can still be requested in a watch by using the run_as privilege. Existing watches will continue to run as the X-Pack user until they are updated.

Conclusion

Please download Elasticsearch 6.1.0, try it out, and let us know what you think on Twitter (@elastic) or in our forum. You can report any problems on the GitHub issues page.
