Today, we are happy to announce the release of Elasticsearch 1.4.0.Beta1, based on Lucene 4.10.1. You can download them and read the full changes list here: Elasticsearch 1.4.0.Beta1.
This is a Beta release - please do not use in production.
The theme of 1.4.0 is resiliency: making Elasticsearch more stable and reliable than ever before. It is easy to be reliable when everything functions as it should. The difficult part comes when the unexpected happens: nodes run out of memory, their performance is degraded by slow garbage collections or heavy I/O, networks fail or transmit data erratically.
This Beta release includes three major efforts to improve resiliency:
Distributed systems are complex. We have an extensive test suite which creates random scenarios to try to simulate conditions that we could never imagine, but we recognise that there are an infinite number of edge cases. The changes in 1.4.0.Beta1 include all of the improvements that we have made thus far. We ask you to test these changes out in the real world and to tell us about any problems that you see.
Heap space is a limited resource. We recommend limiting the heap to 50% of your available RAM with a maximum of 32GB. Above this limit, the JVM can no longer use compressed pointers, and garbage collections are much slower. A major cause of node instability is slow garbage collection, which can be caused by:
- memory pressure,
- swapping (see memory settings), or
- very large heaps.
This release includes a number of changes to improve memory management and, as a consquence, to improve node stability:
One of the biggest users of memory is fielddata. In order to make aggregations, sorting, and script access to field values fast, we load field values into memory and we keep them there. Heap space is precious, so the data in memory uses complex compression algorithms and micro-optimizations to make every bit count. This works very well, until you have more data than heap space. It’s a problem that can always be solved by adding more nodes, but often heap space reaches capacity long before the CPU or I/O is saturated.
Recent releases have added support for doc values. Essentially, doc values provide the same function as in-memory fielddata, but they are written to disk at index time. The benefit that they provide is that they consume very little heap space. Doc values are read from disk, instead of from memory. While disk access is slow, doc values benefit from the kernel’s filesystem cache. The filesystem cache, unlike the JVM heap, is not constrained by the 32GB limit. By shifting fielddata from the heap to the filesystem cache, you can use smaller heaps which means faster garbage collections and thus more stable nodes.
Before this release, doc values were significantly slower than in-memory fielddata. The changes in this release have improved the performance significantly, making them almost as fast as in-memory fielddata.
All you need to do to use doc values instead of in-memory fielddata is to map new fields as follows:
With this mapping in place, any use of fielddata for this field will automatically use doc values from disk instead of loading fields into memory. Note: currently doc values is not supported on analyzed string fields.
The fielddata circuit breaker was recently added to limit the maximum amount of memory to be used by fielddata, preventing one of the biggest causes of OOME. Now we have extended that concept to provide a circuit-breaker at the request level, which imposes a limit on all of the memory that can be used by a single request.
Bloom filters have provided an important performance optimization during indexing — for checking whether a previous version of a document exists — and during document retrieval by ID — to determine which segment contains the document. But of course, they come with a cost: memory. Recent improvements have removed the need for bloom filters. Currently, Elasticsearch still builds them at index time (just in case real world experience doesn’t match with our test scenarios) but they aren’t loaded into memory by default. Assuming all goes as planned, we will remove them completely in a future version.
The biggest thing we can do to improve cluster stability is to improve node stability. If nodes are stable and respond in a timely manner, it greatly reduces the chances of the cluster becoming unstable. That said, we live in an imperfect world — things go wrong unexpectedly, and the cluster needs to be able to recover from these situations without losing data.
We have spent several months working on improving Elasticsearch’s ability to recover from failure on the improve_zen branch. First, we have added tests to replicate complex network level failures. Then we have added fixes for each test. There is still more to do, but we have solved most of the problems that our users have experienced, including issue #2488 — “minimum_master_nodes does not prevent split-brain if splits are intersecting.”
We take cluster resiliency very seriously. We want you to understand what Elasticsearch will do for you, and also what its weak points are. With this in mind, we have created the Resiliency Status Document. This document tracks the resiliency issues that we (or our users) have encountered, what has already been fixed, and what remains to be fixed. Please read this document carefully and take the appropriate measures to protect your data.
Checksums on the shard recovery over the network helped us to detect a bug in the compression library, which was fixed in version 1.3.2. Since then, we have added more checksums and more checksum verification throughout Elasticsearch:
- During merging, all files in a segment have their checksums verified (#7360).
- When reopening an index, the smaller files in a segment are verified completely, and the larger files have a lightweight truncation check (LUCENE-5842).
- When replaying events from the transaction log, each event has its checksum verified (#6554).
- During shard recovery, or when restoring from a snapshot, Elasticsearch needs to compare a local file with a remote copy to ensure that they are identical. Using just the file length and checksum proved to be insufficient. Instead, we now check the identity of all the files in the segment (#7159).
You can read about the many features, enhancements, and bug fixes in this release in the Elasticsearch 1.4.0.Beta1 changelog, but there are a few changes worthy of special mention:
Groovy is now the new default scripting language. The previous default, MVEL, was showing its age, and the fact that it couldn’t be run in a sandbox was a real security issue. Groovy is sandboxed (which means it can be safely enabled out of the box), well maintained and fast! See our recent blog post about scripting for more.
Elasticsearch in its default configuration was vulnerable to cross site scripting. We have fixed that by disabling CORS by default. Site plugins installed in Elasticsearch will still work as before, but external websites will no longer be able to access a remote cluster, unless CORS is reenabled. We have also added more CORS settings to give you greater control over which websites are allowed to access your cluster. See our security page for more info.
A new experimental shard-level query cache can make aggregations on static indices respond almost instantaneously. Imagine that you have a dashboard showing the number of pageviews per day across your website. These numbers don’t change on older indices, but the aggregation is recalculated every time you refresh the dashboard. With the new query cache, the aggregation results will be returned directly from the cache, unless the data in the shard has changed. You will never get stale results from the cache — it will always return the same results as an uncached request.
We have added three new aggregations:
- This is an extension of the filter aggregation. It allows you to define multiple buckets, with a different filter per bucket.
- The parent-child equivalent of the nested aggregation, the children agg lets you aggregate on the child documents belonging to a parent document.
- This aggregation gives you complete control over the metrics calculated on your data. It provides hooks for the initialization phase, the document collection phase, the shard-level combination phase, and the global reduce phase.
Previously, you have been able to retrieve the aliases, mappings, settings and warmers for an index… but separately. The get-index API now allows you to retrieve some or all of these together, for one index or many indices. This is very useful when you want to create a new index which is identical, or almost identical, to an existing index.
There are a few improvements to document indexing and updating:
- We now use Flake IDs for auto-generated document IDs, which provides a nice performance boost when doing primary key lookups.
- Updates which don’t make any changes to the document are now cheaper if you set detect_noop to true. With this setting enabled, only update requests which change the content of the _source field will write a new version of the document.
- Updates can now be handled entirely from scripts. Previously, the script was only run if the document already existed, otherwise an upsert document would be inserted. The scripted_upsert parameter now allows you to handle document creation directly from the script.
The already very useful function_score query now supports a weight parameter, used to tune the relevance impact of each specified function. This allows you to give more weight to recency than popularity, or more weight to price than to geolocation. Also the random_score function is no longer affected by segment merges, thus providing more consistent ordering.