Product release

Elasticsearch 6.0.0-beta1 released

We are excited to announce the release of Elasticsearch 6.0.0-beta1, based on Lucene 7-SNAPSHOT. This is the third in a series of pre-6.0.0 releases designed to let you test out your application with the features and changes coming in 6.0.0, and to give us feedback about any problems that you encounter. Open a bug report today and become an Elastic Pioneer.

IMPORTANT: This is a beta release and is intended for testing purposes only. Indices created in this version will not be compatible with the Elasticsearch 6.0.0 GA release, and upgrading from 6.0.0-beta1 to any other version is not supported.

DO NOT DEPLOY IN PRODUCTION

You can read about all of the changes in the detailed release notes, but there are some big changes worth calling out below.

Sequence numbers and fast recovery

While synced flush has greatly improved shard recovery times for indices that are not being written to, recovering the shards of an actively written index is still a slow, heavy operation. A replica on a node that leaves the cluster for even a brief period still needs to copy all or most of the primary shard's files in order to bring itself back up to date.

The new Sequence Numbers infrastructure assigns an incremental operation ID to every index, update, and delete operation. This allows a replica to ask the primary for all operations from sequence number X onwards. If those operations are still available in the primary's translog, an out-of-date replica can bring itself up to date by simply replaying them, avoiding the need to copy files.

This release features:

  • fast operation-based recovery for active indices after a node rejoins the cluster or is restarted,
  • a translog retention policy (defaulting to 12 hours or 512 MB) that keeps recent operations around to make fast recovery more likely (see the settings sketch after this list),
  • cleanup of old transaction logs on idle indices, and
  • primary-to-replica sync when a replica is promoted to be the new primary.
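
The retention window is configurable per index. As a minimal sketch (the node address and index name below are placeholders, and the defaults shown are the ones mentioned above), the index.translog.retention.age and index.translog.retention.size settings can be adjusted through the index settings API, here via Python and the requests library:

    import requests

    ES = "http://localhost:9200"     # placeholder node address
    INDEX = "logstash-2017.08.07"    # hypothetical index name

    # Keep more translog around so that a briefly absent replica can catch up
    # by replaying operations instead of copying segment files.
    resp = requests.put(
        f"{ES}/{INDEX}/_settings",
        json={
            "index.translog.retention.age": "24h",   # default: 12h
            "index.translog.retention.size": "1gb",  # default: 512mb
        },
    )
    resp.raise_for_status()
    print(resp.json())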

We also now have the infrastructure we need to start developing cross data centre replication (xDCR) as an X-Pack feature. We will continue to build on this infrastructure to tackle more complex correctness problems, in particular rolling back unacknowledged operations on replicas and using sequence numbers for optimistic locking.

Search scalability

A search against an index pattern like logstash-* can fan out to a huge number of shards. Usually, queries like these include a date range filter, which means that the majority of shards won't contain any matching documents. We already have an optimization that aborts search requests on these shards early, before any real work is done, but it is not enough. Imagine a multi-search request containing 10 search requests, each of which targets 2,000 shards. That's 20,000 shard-level search requests added to the search thread pool queue, which can easily result in rejections, even though the majority of these requests are very quick.

Previously, Kibana used the _field_stats API with a date range filter to figure out which indices might contain matching documents, and then ran the search request against only those indices. We wanted to remove this API because it was much heavier than users expected and open to abuse. Instead, a search request now has a lightweight shard pre-filtering phase, triggered (by default) when the request targets at least 128 shards. These pre-filter requests are not added to the search queue, so they cannot be rejected because the queue is full. A pre-filter request rewrites the query at the shard level and determines whether the query has any chance of matching documents on that shard at all. The full search request is then sent only to those shards that might match.

But what if the user actually does want to search all 2,000 shards, or searches all indices by mistake? These wide-ranging requests should not overwhelm the cluster or get in the way of search requests from other users. To solve this, we have introduced the max_concurrent_shard_requests parameter, whose default value depends on the number of nodes in the cluster and which has a fixed upper limit of 256. This may make a single search request that targets many shards slower, but it makes concurrent searches by many users fairer.
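
As a rough illustration (the node address and index pattern are placeholders, and the parameter surface may still change during the beta), the cap can be passed as a query-string parameter on the _search endpoint; the same request also benefits from the shard pre-filtering described above because of its date range filter:

    import requests

    ES = "http://localhost:9200"  # placeholder node address

    # Search a wide index pattern, but cap how many shard-level requests
    # may execute concurrently for this one search.
    resp = requests.get(
        f"{ES}/logstash-*/_search",
        params={"max_concurrent_shard_requests": 64},
        json={
            "query": {
                "range": {"@timestamp": {"gte": "now-1h"}}
            }
        },
    )
    resp.raise_for_status()
    print(resp.json()["_shards"])  # shard accounting for this request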

Preventing full disks

We have long had the cluster.routing.allocation.disk.watermark.low and cluster.routing.allocation.disk.watermark.high settings, which prevent shards from being assigned to full disks and actively move shards away from disks that are filling up. However, if all of the disks in your cluster are full, there is nowhere to move shards to and eventually you will run out of disk space. We have now added the cluster.routing.allocation.disk.watermark.flood_stage setting: when a disk passes this level, every index with a shard on that node is switched to read-only and no more writes are accepted. To recover, you will need to either delete the index or free up disk space and set the index back to read-write.
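
As a sketch of how this fits together (the node address, index name, and percentage values are placeholders, and the exact name of the index-level block may differ in this beta; later 6.x documentation calls it index.blocks.read_only_allow_delete), the watermarks can be adjusted via the cluster settings API and the block cleared once space has been freed:

    import requests

    ES = "http://localhost:9200"  # placeholder node address

    # Adjust the three disk watermarks cluster-wide (values are examples).
    requests.put(
        f"{ES}/_cluster/settings",
        json={
            "transient": {
                "cluster.routing.allocation.disk.watermark.low": "85%",
                "cluster.routing.allocation.disk.watermark.high": "90%",
                "cluster.routing.allocation.disk.watermark.flood_stage": "95%",
            }
        },
    ).raise_for_status()

    # After freeing disk space, clear the read-only block that the flood stage
    # applied to affected indices (block name as per later 6.x docs).
    requests.put(
        f"{ES}/my-index/_settings",  # hypothetical index name
        json={"index.blocks.read_only_allow_delete": None},  # null resets the setting
    ).raise_for_status()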

To prevent a persistently logged failure from filling up the disk, Elasticsearch is switching to the following out-of-the-box logging config (an abridged excerpt of the corresponding log4j2.properties follows the list):

  • Roll logs every 128MB
  • Compress rolled logs
  • Maintain a sliding window of logs
  • Remove the oldest logs to keep all compressed logs under 2GB
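
The excerpt below uses standard Log4j2 RollingFile properties and is abridged and approximate; the log4j2.properties file that ships with the distribution is the authoritative version:

    # Abridged sketch of the rolling-file policy (see the shipped
    # log4j2.properties for the complete configuration).
    appender.rolling.type = RollingFile
    appender.rolling.name = rolling
    appender.rolling.filePattern = ${sys:es.logs.base_path}${sys:file.separator}${sys:es.logs.cluster_name}-%d{yyyy-MM-dd}-%i.log.gz

    # Roll every 128MB; the .gz file pattern compresses rolled logs.
    appender.rolling.policies.type = Policies
    appender.rolling.policies.size.type = SizeBasedTriggeringPolicy
    appender.rolling.policies.size.size = 128MB

    # Delete the oldest compressed logs once they accumulate past 2GB.
    appender.rolling.strategy.type = DefaultRolloverStrategy
    appender.rolling.strategy.action.type = Delete
    appender.rolling.strategy.action.basepath = ${sys:es.logs.base_path}
    appender.rolling.strategy.action.condition.type = IfFileName
    appender.rolling.strategy.action.condition.glob = ${sys:es.logs.cluster_name}-*
    appender.rolling.strategy.action.condition.nested_condition.type = IfAccumulatedFileSize
    appender.rolling.strategy.action.condition.nested_condition.exceeds = 2GB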

Removal of default passwords

In 5.x, X-Pack Security set the passwords of internal users to changeme by default in order to make the getting-started experience easier, but it is never a good idea to ship with default passwords. In 6.0, we have added a bootstrap.password setting which can be added to the secure keystore before startup. When the cluster starts up, any node with this setting will try to set the password for the elastic user, unless that user already has a password, so that the cluster starts in a secure state. On top of that, we have added a setup-passwords command line tool that generates and sets strong passwords for all of the internal users.
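
As a sketch of the workflow (the bin/ paths below follow the tar.gz layout and may differ per package):

    # Store an initial password for the elastic user in the secure keystore
    # before starting the node for the first time.
    bin/elasticsearch-keystore create
    bin/elasticsearch-keystore add "bootstrap.password"

    # Once the cluster is up, generate and set strong random passwords for
    # all of the internal users.
    bin/x-pack/setup-passwords auto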

Other changes

  • We have reduced the overhead of profiling, search timeouts, and low-level search cancellation. Profiling now has less impact on search requests because it measures the timings of a subset of method calls and assumes the remaining calls have similar costs. Timeouts and low-level cancellation likewise check less often whether the request needs to be aborted, so they add less to the execution time of search requests (see the sketch after this list).
  • Percolator performance has been improved by using range fields to index range queries, so that fewer queries need to be verified at percolation time. In addition, we are looking into making verification faster by using the binary representation of the query rather than its Smile representation, since it should be significantly faster to parse in most cases.
  • The RPM and Debian packages now put the config directory in the appropriate location for those distributions. Scripts that need access to the config directory (such as the plugin script and the secure-settings script) previously required the user to set the CONF_DIR environment variable, which frequently led to confusion. Now, all scripts use the new elasticsearch-env include script, which sets CONF_DIR correctly for each package. Custom locations can still be specified with the CONF_DIR variable. For consistency, the bin/elasticsearch script no longer accepts the --path.conf option, but relies on CONF_DIR instead.
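
As a quick illustration of the profiling and timeout knobs mentioned in the first item (the node address and index name are placeholders):

    import requests

    ES = "http://localhost:9200"  # placeholder node address

    # Profile a query and give it a soft timeout; both now add less overhead
    # to the overall search request.
    resp = requests.get(
        f"{ES}/my-index/_search",  # hypothetical index name
        json={
            "profile": True,
            "timeout": "500ms",
            "query": {"match": {"message": "error"}},
        },
    )
    resp.raise_for_status()
    body = resp.json()
    print(body["timed_out"], "profile" in body)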

Conclusion

Please download Elasticsearch 6.0.0-beta1, try it out, and let us know what you think on Twitter (@elastic) or in our forum. You can report any problems on the GitHub issues page.