Today we are happy to announce the release of Elasticsearch 0.90.8, which is based on Lucene 4.6. This is the current stable release in the 0.90 series. You can download it here.
There are not many big new features in this release, but it contains a number of important bug fixes and stability improvements. We highly recommend upgrading, especially if you are currently running 0.90.6 or 0.90.7, or if you are using parent-child relationships.
New Features
Cluster Stats API
In addition to the `nodes-stats`, `nodes-info` and `indices-stats` APIs, we have added the new `cluster-stats` API, which returns useful summary information from a cluster-wide perspective. It includes basic index metrics and important information about the nodes in the cluster.
Read more about the `cluster-stats` API.
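As a rough illustration, the sketch below fetches the cluster-wide summary over HTTP using only the Python standard library. It assumes a node listening on `localhost:9200` and the `/_cluster/stats` endpoint path; the helper names `cluster_stats_url` and `fetch_cluster_stats` are made up for this example.

```python
import json
from urllib.request import urlopen


def cluster_stats_url(host="localhost", port=9200):
    """Build the URL of the cluster-stats endpoint (host and port are assumptions)."""
    return "http://{}:{}/_cluster/stats".format(host, port)


def fetch_cluster_stats(host="localhost", port=9200, timeout=5):
    """Return the parsed JSON summary; requires a reachable node."""
    with urlopen(cluster_stats_url(host, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `fetch_cluster_stats()` against a running node should return a dictionary that can be inspected like any other parsed JSON document.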
Simple Query String Query
The `query_string` query is very powerful but problematic. It supports a complex, dense mini-language for expressing queries, but any syntax error will result in an error message instead of results. It allows users to query any field in your index and potentially run very heavy queries. It is not a suitable query to expose directly to your users.
Enter the new `simple_query_string` query. It supports a much simpler syntax:
- `&`: and
- `|`: or
- `-`: not
- `(...)`: grouping or precedence
- `"quick brown fox"`: phrase query
- `foo*`: prefix query
But the best part about it is that it is immune to syntax errors. If the syntax is not quite right, it will try to do the right thing anyway!
Read more about the `simple_query_string` query here.
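A minimal sketch of how user input might be passed straight into the new query follows. The field names `title` and `body` and the helper `simple_query_string_body` are hypothetical; the example input combines a phrase, an "or", a prefix and a negation from the syntax listed above.

```python
import json


def simple_query_string_body(user_input, fields):
    """Wrap raw user input in a simple_query_string search body."""
    return {
        "query": {
            "simple_query_string": {
                "query": user_input,
                "fields": list(fields),  # restrict which fields users may search
            }
        }
    }


# Hypothetical user input and field names, for illustration only.
body = simple_query_string_body('"quick brown fox" | foo* -lazy', ["title", "body"])
print(json.dumps(body, indent=2))
```

Limiting `fields` to a known list is one way to keep users from querying arbitrary fields in the index.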
Disable Fielddata Loading
When faceting or sorting on field values, Elasticsearch needs instant access to the value for each document in order to perform well. Fielddata is the magic potion that makes these functions blazing fast, by loading field values into memory. However, load the wrong field into memory and you could run out of memory and bring your cluster down.
We are working on circuit-breakers which will prevent you from damaging your cluster, but in the meantime we allow you to disable fielddata loading for specific fields, such as the body of an email:
{ "body": { "type": "string", "fielddata": { "format": "disabled" } } }
Read more about disabling fielddata here.
Geo-Point Compression
A geo-point consists of a latitude and a longitude, and these values need to be loaded into fielddata memory to perform filtering by geo-location or geo-distance. By default, a geo-point takes up 16 bytes of memory and is extremely precise. We can easily sacrifice a little precision for big memory savings:
Precision | Bytes per point | Size reduction |
---|---|---|
1km | 4 | 75% |
3m | 6 | 62.5% |
1cm | 8 | 50% |
1mm | 10 | 37.5% |
Read more about setting geo-point precision here.
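The percentages in the table follow directly from the 16-byte default. The sketch below reproduces that arithmetic and estimates the raw bytes saved for a given number of points; it counts point storage only and ignores any other fielddata overhead, and the function names are made up for this example.

```python
DEFAULT_BYTES_PER_POINT = 16  # uncompressed geo-point size quoted above

# Bytes per point at each precision, copied from the table.
COMPRESSED_BYTES = {"1km": 4, "3m": 6, "1cm": 8, "1mm": 10}


def size_reduction(bytes_per_point):
    """Fraction of memory saved relative to the 16-byte default."""
    return 1.0 - bytes_per_point / DEFAULT_BYTES_PER_POINT


def bytes_saved(num_points, bytes_per_point):
    """Raw bytes saved for num_points geo-points (point storage only)."""
    return num_points * (DEFAULT_BYTES_PER_POINT - bytes_per_point)


for precision, size in COMPRESSED_BYTES.items():
    print(precision, size, "{:.1%}".format(size_reduction(size)))
```

For example, one million points stored at `3m` precision need 6 MB instead of 16 MB of raw point data.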
Token Count
Quite often we not only want to make a field searchable, we also want to know how many words or tokens the field contains. We have added a new field type called `token_count` which will index the number of tokens in the field automatically:
{ "message": { "type": "multi_field", "fields": { "message": { "type": "string" }, "word_count": { "type": "token_count" } } } }
Combined with a query for the tokens `foo`, `bar` and `baz`, a filter requiring `message.word_count` to equal 3 would allow you to find documents that contain those tokens, but no other tokens.
Read more about `token_count` here.
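One way such a search could look is sketched below: a `match` query requiring every token, wrapped in a `filtered` query whose `term` filter pins the token count. This is an illustration under assumptions, not the only way to do it: it assumes the tokens are distinct and are not split further by the analyzer, and the helper `exact_tokens_search` is hypothetical.

```python
import json


def exact_tokens_search(field, tokens):
    """Match documents containing all tokens and nothing else.

    Assumes the tokens are distinct and that a token_count sub-field
    named "<field>.word_count" exists, as in the mapping above.
    """
    tokens = list(tokens)
    return {
        "query": {
            "filtered": {
                # Every token must be present in the field...
                "query": {
                    "match": {
                        field: {"query": " ".join(tokens), "operator": "and"}
                    }
                },
                # ...and the field must hold exactly that many tokens.
                "filter": {"term": {field + ".word_count": len(tokens)}},
            }
        }
    }


search = exact_tokens_search("message", ["foo", "bar", "baz"])
print(json.dumps(search, indent=2))
```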
Bug Fixes and Enhancements
This release contains a number of important optimizations and bug fixes which will improve the stability of your cluster, especially if you have a very large cluster.
- In rare cases, it was possible for shards to be deleted incorrectly and for dead nodes to continue to show up as members of the cluster. This fix alone is sufficient reason to upgrade. See #4503.
- The logic in the shard allocation deciders has been greatly improved — deciding where to allocate thousands of shards can now be completed in seconds instead of minutes. See #4459, #4458 and #4454.
- Recovery of local primary shards (fast) is now done before relocating primary shards from one node to another (slow). See #4237.
- Frequent mapping updates on clusters with very large mappings will now complete much more quickly — only the latest mapping is processed instead of each mapping change. See #4373.
- Cluster state changes now wait for an `ack` response from the nodes in the cluster. Usually these changes complete quickly, but on very large clusters they can take more time. The `ack` mechanism ensures that changes are in place before returning success to the client. See `ack` related issues.
- Various bugs were fixed in the `has_child` and `has_parent` queries, which could occasionally return incorrect results. See #4341, #4313, #4306 and #4291.
- Ensuring that the `bootstrap.mlockall` setting has been applied correctly is both very important and difficult to do. Now you can use the `nodes-info` API to verify, with: `curl localhost:9200/_nodes/process?pretty`
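On a large cluster, reading the `nodes-info` output by eye gets tedious. The sketch below scans a parsed response for nodes where the lock was not applied. The response shape it assumes (a top-level `nodes` object keyed by node id, each with a `name` and a `process.mlockall` boolean) and the sample data are assumptions for illustration, and `nodes_without_mlockall` is a made-up helper.

```python
def nodes_without_mlockall(nodes_info):
    """Return the sorted names of nodes whose process info lacks mlockall=true.

    Assumes nodes_info is the parsed JSON from /_nodes/process, shaped as
    {"nodes": {"<node id>": {"name": ..., "process": {"mlockall": bool}}}}.
    """
    missing = []
    for node_id, node in nodes_info.get("nodes", {}).items():
        if not node.get("process", {}).get("mlockall", False):
            # Fall back to the node id when no name is reported.
            missing.append(node.get("name", node_id))
    return sorted(missing)


# Hypothetical response for a two-node cluster.
sample = {
    "nodes": {
        "abc": {"name": "node-1", "process": {"mlockall": True}},
        "def": {"name": "node-2", "process": {"mlockall": False}},
    }
}
print(nodes_without_mlockall(sample))  # prints ['node-2']
```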
We hope you enjoy this new release. Please download 0.90.8, and let us know what you think.