
0.90.6 Released

Today we are happy to announce the release of Elasticsearch 0.90.6, which is based on Lucene 4.5.1. This is the current stable release in the 0.90 series and we recommend upgrading. You can download it here.

This release brings big improvements to highlighting, including the new postings highlighter. It also adds automatic script reloading, better handling of edge cases associated with cluster-level changes, and many small bug fixes and enhancements, which you can read about in the release notes.

Postings highlighter

The new postings highlighter is faster, requires less disk space than the fast-vector-highlighter, and is “sentence aware”, which should result in more meaningful snippets.

In order to use it, the field mapping must be configured with index_options set to "offsets" instead of the default "positions". For instance:

curl -XPUT localhost:9200/my_index -d '
{
    "mappings": {
        "my_type": {
            "properties": {
                "title": {
                    "type":          "string",
                    "analyzer":      "english",
                    "index_options": "offsets"
                }
            }
        }
    }
}
'

With the mapping set up correctly, we can index some documents:

curl -XPOST localhost:9200/my_index/my_type/_bulk -d '
{"index":{"_id": 1}}
{"title": "The quick brown fox jumped over the lazy dog"}
{"index":{"_id": 2}}
{"title": "Brown foxes do love jumping, especially over dogs"}
'

And search them:

curl -XGET 'localhost:9200/my_index/my_type/_search?pretty' -d '
{
    "query": {
        "match": { "title": "Jumping brown foxes"}
    },
    "highlight": {
        "fields": {
            "title": {}
        }
    }
}
'

Each result from the above request is accompanied by a nicely highlighted snippet:

  • “The quick <em>brown</em> <em>fox</em> <em>jumped</em> over the lazy dog”
  • “<em>Brown</em> <em>foxes</em> do love <em>jumping</em>, especially over dogs”

Other highlighting improvements

Thanks go to Nik Everett, a frequent contributor to Elasticsearch, who has added the ability to specify a separate query just for highlighting (see highlight query) and the ability to return a simple excerpt when there are no words that can be highlighted (see no_match_size in highlighted fragments).
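
As a rough sketch of how these two options might be combined in a single request: the index my_docs and the field body below are hypothetical, and placing highlight_query and no_match_size inside the field's highlight settings reflects our reading of the highlighting documentation, so treat it as an illustration rather than a reference.

```shell
# Hypothetical index (my_docs) and field (body); assumes a node on localhost:9200.
# The main query decides which documents match, while highlight_query decides
# which terms are highlighted in "body". When nothing in the field can be
# highlighted, no_match_size asks for a plain excerpt of up to 100 characters.
BODY='
{
    "query": {
        "match": { "body": "brown foxes" }
    },
    "highlight": {
        "fields": {
            "body": {
                "highlight_query": {
                    "match": { "body": "lazy dogs" }
                },
                "no_match_size": 100
            }
        }
    }
}
'

curl -XGET 'localhost:9200/my_docs/_search?pretty' -d "$BODY" \
    || echo "request failed: is a node running on localhost:9200?"
```

With a request like this, a document that matches “brown foxes” but contains neither “lazy” nor “dogs” should still come back with a short excerpt instead of an empty highlight.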

More accurate terms facet

Because terms facets are calculated by combining the results from multiple shards, it is possible that each shard has a different top-10 list, resulting in inaccurate global counts. This release introduces the shard_size parameter which allows you to fetch more results from each shard, while still returning only size (default 10) results to the user. Pulling more results from each shard reduces the inaccuracy in global counts.
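
A minimal sketch of a terms facet using the new parameter follows. The index my_index, the field tags and the facet name popular_tags are assumptions made for illustration, as is the value of 50.

```shell
# Hypothetical index (my_index), field (tags) and facet name (popular_tags);
# assumes a node on localhost:9200.
# Each shard is asked for its top 50 terms (shard_size), the per-shard counts
# are merged, and only the top 10 (size) are returned to the client.
BODY='
{
    "query": { "match_all": {} },
    "facets": {
        "popular_tags": {
            "terms": {
                "field":      "tags",
                "size":       10,
                "shard_size": 50
            }
        }
    }
}
'

curl -XGET 'localhost:9200/my_index/_search?pretty' -d "$BODY" \
    || echo "request failed: is a node running on localhost:9200?"
```

A larger shard_size means more work per shard and more data sent to the coordinating node, so it is a trade-off between accuracy and cost.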

Reload scripts automatically

Scripts are used in many APIs in Elasticsearch, e.g. for scoring, script fields, and faceting. A script can either be specified in the request itself, or named scripts can be loaded from the config/scripts/ directory on each node. Previously, changing configured scripts was a tiresome process that involved updating the files and restarting all nodes. Now, a new watcher checks the scripts directory for changes every 60 seconds (configurable with watcher.interval) and automatically loads new scripts, reloads changed scripts, and unloads removed scripts. See Automatic script reloading for more.
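
To make this concrete, here is a small sketch of adding a named script to a node. The ES_HOME variable, the file name popularity_boost.mvel and the popularity field are all hypothetical, and the script body is only an example of an MVEL scoring expression.

```shell
# ES_HOME is an assumed variable pointing at the Elasticsearch installation
# directory; it falls back to the current directory in this sketch.
ES_HOME="${ES_HOME:-.}"

# Named scripts live in config/scripts/ on each node.
mkdir -p "$ES_HOME/config/scripts"

# Hypothetical MVEL script that multiplies the score by a numeric
# "popularity" field. The quoted 'EOF' stops the shell from expanding
# anything inside the script body.
cat > "$ES_HOME/config/scripts/popularity_boost.mvel" <<'EOF'
_score * doc['popularity'].value
EOF
```

Repeated on every node, a file like this should be picked up by the watcher within one watcher.interval, with no restart. To our understanding it can then be referenced by its file name without the extension (here popularity_boost) wherever a script is accepted.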

Pretty is prettier

This is a very simple change, but it removes a common source of annoyance. Pretty-printed results now have a newline \n character appended to make console output easier to read.

We hope you enjoy this new release. Please download and use 0.90.6, and let us know what you think.