November 26, 2014

Elasticsearch Upgrade Guide as Seen by the Client

UPDATE: This article refers to our hosted Elasticsearch offering by an older name, Found. Please note that Found is now known as Elastic Cloud.

Upgrading a system is one thing. Usually it involves taking a backup, doing the upgrade, and then verifying that everything is OK. Upgrading a system that your software depends on can be quite a different experience, particularly when the task is long overdue.

Introduction

I have written this guide to give you an overview of the changes to expect (to your system) when getting ready for a new Elasticsearch version. I cannot guarantee that it is complete; for that you will have to consult the release notes. It should, however, provide a good start for most scenarios. As always, no amount of planning can make up for not testing before going into production.

Versions, Big and Small

At the time of this writing, there are 109 different versions of Elasticsearch published. This might seem daunting, but a lot of them are compatible with one another. Elasticsearch uses a versioning scheme with three numbers, A.B.C. A denotes the major version, B denotes the feature version of that major version and finally C denotes the bug fix version of the feature version.

The idea behind this versioning scheme is that upgrading to the latest bug fix version of a given feature version should be safe for most users. A breaking change in a bug fix release will usually have a straightforward workaround. Also, within one major version, Elasticsearch does its best to ensure that different versions are capable of communicating, making it possible to upgrade without stopping the cluster.
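To make the A.B.C scheme concrete, here is a minimal Java sketch that classifies an upgrade by comparing two version strings. The class `EsVersion` and its methods are hypothetical names invented for this illustration; they are not part of any Elasticsearch library, and the sketch assumes plain release numbers without suffixes such as betas or release candidates.

```java
/**
 * Hypothetical helper (not part of any Elasticsearch library): models a plain
 * A.B.C release number so a client can reason about the size of an upgrade.
 */
final class EsVersion {
    final int major;   // A: major version
    final int feature; // B: feature version within the major version
    final int bugFix;  // C: bug fix version within the feature version

    EsVersion(int major, int feature, int bugFix) {
        this.major = major;
        this.feature = feature;
        this.bugFix = bugFix;
    }

    /** Parses "A.B.C"; anything with more or fewer than three parts is rejected. */
    static EsVersion parse(String text) {
        String[] parts = text.trim().split("\\.");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected A.B.C but got: " + text);
        }
        return new EsVersion(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]));
    }

    /** True when the target is a later bug fix release of the same feature version. */
    boolean isBugFixUpgradeTo(EsVersion target) {
        return major == target.major && feature == target.feature && bugFix < target.bugFix;
    }

    /** True when both versions share the same major version. */
    boolean sameMajorAs(EsVersion other) {
        return major == other.major;
    }

    @Override
    public String toString() {
        return major + "." + feature + "." + bugFix;
    }
}
```

For example, `EsVersion.parse("1.3.3").isBugFixUpgradeTo(EsVersion.parse("1.3.5"))` is true, while an upgrade to 1.4.0 only satisfies `sameMajorAs`. Treat the result as a first hint about how much testing to plan for, not as a guarantee of compatibility.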

Get to Know Your Own System First

It might seem pedantic to say this, but people work at different levels of abstraction, and most people who use old Elasticsearch versions have had to prioritize other things for a while.

Client Library and Connection Method

Most people connect to Elasticsearch over HTTP - or the REST interface as it is often referred to - but if your client is written in Java or another language running on the Java Virtual Machine, you may very well use the transport protocol. There are also people connecting with Thrift or even Memcached.

HTTP

One of the big advantages of using HTTP is the technology independence of the protocol. What this means when upgrading to a newer Elasticsearch version is that your old client will still be able to connect to the new Elasticsearch version. That said, there might still be differences in URLs, the query DSL and other features that require changes to your client, but you’re safe to move on to the next section.

Java Transport

If you’re using the Java Transport, you should upgrade the client library to match the new Elasticsearch version. Even if it’s just a minor bug fix release, some of the bugs fixed could be on the client side. For larger version upgrades there is also a risk that the old client will refuse to connect to a newer cluster.

Most of the time, upgrading the client is as simple as changing a jar dependency, but sometimes refactoring will be required. That refactoring is usually straightforward, like updating package names in your imports.

Other Connection Methods

For other connection methods like Thrift and Memcached you will have to consult the documentation of the plugin enabling that connection.

Queries Used

After making sure your client is able to connect to the new Elasticsearch version, it is time to verify that your queries are compatible. Dig through the source of your client and create an example query in DSL syntax for every query type you have. If your client already expresses its queries in the DSL, this will mostly be cut and paste. The transport client, on the other hand, uses query builders whose syntax is rather different from the query DSL. The solution is to call the toString() method on the builder, as in the example below:

System.out.println(org.elasticsearch.index.query.QueryBuilders.matchQuery("myField", "Hello SearchString!").toString());

Having these example queries is beneficial for two reasons: first, it provides good documentation for the features in Elasticsearch that you rely on and second, it makes it easy to test them against the new version.

Plugins

Compile a list of installed plugins in your cluster and check if they also require upgrading. If that is the case, then you should also check their release notes for breaking changes.

bin/plugin.sh --list

Breaking Changes

As the list below shows, there were a great many breaking changes in the 0.90 branch. Versions before 0.90 were probably no better, but they are out of scope for this article. The good news is that starting with version 1.0 the breaking changes are mostly reserved for the larger releases.

This list is based on the release notes included with each release, but to keep things shorter, I’ve tried my best to include only the issues that might require changes to your application. Potential breaking changes in the operational aspect, like changes to the default shard allocation algorithm, have been left out. In other words, use this list to plan the upgrade of your client to a new Elasticsearch version. Don’t use it as an excuse for not testing your entire solution with the new version before rolling out into production.

The 0.90 Branch

In the 0.90 branch, breaking changes big and small are spread across the versions, and there is little notion of stability between bug fix releases, even if the really big changes were postponed until 1.0.

0.90.0
- minimum_should_match applied to wrong query in multi_match #2918
0.90.1
- MatchQueryParser doesn’t allow field boosting on query when included in a _GET request #3024
- Make GetField behavior more consistent for multivalued fields. #3015
0.90.2
- Add a minimum_should_match parameter when common terms query has only high frequent terms #3188
0.90.3
- Java Client: Renamed IndicesAdminClient.existsAliases to IndicesAdminClient.aliasesExist #3330
0.90.4
- Java API: Remove RestActions#splitXXX(String) methods #3680
- Flush API: Removed the refresh flag #3689
- Optimize API: Removed the refresh flag #3690
0.90.6
- Handling of the _parent field: Reject documents without the parent field set, and prohibit adding a parent mapping at runtime #3849
- Reject indexing requests which specify a parent, if no parent type is defined #3848
- Completion Suggester: Reject non-integer weights on indexing to prevent rounding #3977
0.90.7
- Remove Index Reader warmer introduced in 0.90.6 as it is not a good default behavior for all use cases. This will be reimplemented as an opt in feature. #4078 & #4079
0.90.8
- Java client: FilterBuilder and QueryBuilder throw ElasticSearchIllegalArgumentException on similar errors #4199
0.90.10
- Stats/Infos API: JvmStats now have standard names for gc and memory pools #4661
- Cluster Stats API: Expose min/max file descriptors #4681

The 1.0 Branch

Starting with the 1.0 branch Elasticsearch had a new emphasis on stability. All known breaking changes are introduced in the 1.0.0 release.

Stats and info APIs have been changed to be more RESTful. This is an important change for monitoring and operations, but probably not relevant for your client.
The indices API has been cleaned up, an important change if your client creates new indexes or makes changes to mappings or warmers.
Wrapping documents in an object to specify type is disabled by default.
Count, delete-by-query and validate-query requests require the query to be wrapped in a query parameter just like the search request. Check if you use any of these requests.
The filter parameter in search requests has been renamed to post_filter. The old name still works, but it was changed for good reasons. See Optimizing Elasticsearch Searches for more info.
The multi_field mapping type has been replaced by a fields parameter on other types, as explained in the documentation.
The standard and pattern analyzers have been changed to use an empty stopwords list by default. (The default was previously English stopwords.)
All dates without years use 1970 as default.
The default unit in geo queries has been changed to meters (previously miles).
min_similarity, fuzziness and edit_distance parameters are replaced by the single fuzziness parameter.
ignore_missing parameter has been replaced by the parameters: expand_wildcards, ignore_unavailable and allow_no_indices.
Deleting an index requires a name or a pattern.
Return value ok is removed
Return values found, not_found and exists are all changed to found where applicable.
Field values, in response to the fields parameter, are now always returned as arrays.
fields parameter no longer supports _source.field syntax, use source-filtering
text query is replaced by match query
field query is replaced by query_string query
Function score query replaces _boost field for document boost
copy_to parameter replaces the path parameter in mappings.
function_score replaces custom_filters_score, custom_score and custom_boost_score.
.percolator type replaces _percolator index; consult the percolator documentation if you use percolation
Query/Get/Update APIs: Allow to control where single fields should be extracted from (source, stored fields or fielddata) #4492
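Several of the changes above affect the shape of responses rather than requests, for example field values always being returned as arrays, and exists being changed to found. A client that has to talk to both an old and a new cluster during the transition can normalise these differences in one place. The following Java sketch assumes the response has already been parsed into plain java.util maps and lists by whatever JSON library you use; the class `ResponseCompat` and its method names are hypothetical, and which old key appears in a given response depends on the API you call.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that hides pre-1.0 and 1.0+ response differences. */
final class ResponseCompat {
    private ResponseCompat() {}

    /**
     * Returns a field value as a list: 1.0+ always sends arrays,
     * while older versions may send a single scalar value.
     */
    static List<Object> fieldValues(Object rawValue) {
        if (rawValue == null) {
            return Collections.emptyList();
        }
        if (rawValue instanceof List) {
            return new ArrayList<Object>((List<?>) rawValue);
        }
        return Collections.singletonList(rawValue);
    }

    /**
     * Reads the "found" flag, falling back to the older "exists" key
     * when "found" is absent. A missing or non-boolean flag counts as false.
     */
    static boolean isFound(Map<String, ?> response) {
        Object flag = response.containsKey("found")
                ? response.get("found")
                : response.get("exists");
        return Boolean.TRUE.equals(flag);
    }
}
```

Keeping this kind of tolerance in a single helper makes it easy to delete once the old cluster is gone.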

The 1.1 Branch

Query API: Removed custom_score and custom_boost_factor queries #5076
NodesInfo API: Using plugins instead of singular plugin #5072
Mapping API: Binary fields are no longer stored by default, because their data is already available in the _source #4957
Aggregations: aggregation names can now only contain alpha-numeric, hyphen (“-”) and underscore (“_”) characters, due to the enhancement which allows sub-aggregation sorting #5253

No known breaking changes for the fix versions up to and including version 1.0.3.

The 1.2 Branch

No known breaking changes up to and including bug fix release 1.2.4.

If using the Java transport, your client will also have to run on a Java 7 compatible JVM #5421
Scripting: Disable dynamic scripting by default #5943
Snapshot/Restore API: Added PARTIAL snapshot status #5792
Gateways: Removed deprecated gateway functionality (in favor of snapshot/restore) #5520
Versioning: Version types EXTERNAL & EXTERNAL_GTE test for version equality in read operation & disallow them in the Update API #5929
Versioning: A Get request with a version set always validates for equality #5663
Versioning: Calling the Update API using EXTERNAL and EXTERNAL_GTE version type throws a validation error #5661
Aggregations: Changed response structure of percentile aggregations #5870
Cluster State API: Remove index template filtering #4954
Nodes Stats API: Add human readable JVM start_time and process refresh_interval #5280
Java API: Unified IndicesOptions constants to explain intentions #6068

The 1.3 Branch

The 1.3 branch mostly follows a similar pattern, but there is an exception in the 1.3.3 release. There is a good reason for the exception though. The change removes an unintended feature that allowed users to specify experimental postings formats, with the risk of data loss on a future upgrade. There probably aren’t many people who have used this feature, as it was not documented, but those who did need to reindex their data.

1.3.0
- Analysis: Improvements to StemmerTokenFilter, stemmers named english, porter2, light_english, portuguese_rslp and dutch_kp are affected. #6452
- Thread pool rejection status code is changed from 503 to 429. This might be relevant if your client uses circuit breakers #6629 #6627.
- Setting action.wait_on_mapping_change has been removed #6648
- Security: Disable JSONP by default #6795
1.3.3
- Mapping: Remove unsupported postings_format / doc_values_format #7604 (issues: #7238, #7566)

Other than 1.3.3 there are no known breaking changes up to and including version 1.3.5.
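The status code change in 1.3.0 is a typical case where a client can be made to handle both the old and the new behavior. The Java sketch below treats both 429 and 503 as a signal to back off and retry. The class `RejectionPolicy`, its method names and the backoff values are hypothetical, and treating every 503 as a thread pool rejection is an assumption that may be too broad for your system, since a 503 can have other causes.

```java
/** Hypothetical policy for reacting to thread pool rejections across versions. */
final class RejectionPolicy {
    private RejectionPolicy() {}

    /**
     * 429 is the rejection status from 1.3.0 onwards; 503 is what earlier
     * versions returned. Assumption: any 503 is treated as a rejection.
     */
    static boolean isThreadPoolRejection(int httpStatus) {
        return httpStatus == 429 || httpStatus == 503;
    }

    /**
     * Illustrative exponential backoff: 100 ms for attempt 0, doubling with
     * each attempt and capped at 5 seconds. The numbers are arbitrary.
     */
    static long backoffMillis(int attempt) {
        int shift = Math.min(Math.max(attempt, 0), 10);
        long delay = 100L << shift;
        return Math.min(delay, 5000L);
    }
}
```

A retry loop or circuit breaker can then ask `isThreadPoolRejection` instead of comparing against a single hard-coded status code.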

The 1.4 Branch

The 1.4.0 release is one of the biggest releases, but it is still nowhere near as big a change as 1.0.

Percolation queries can only refer to fields that already exist in the mappings.
Aliases with filters can only refer to fields that exist in the mappings.
Read operations are served by default, even if there is no master. Writes are still not allowed. Halting read operations can be enabled in configuration.
Groovy has replaced MVEL as the scripting language; MVEL is now only available as a plugin.

Conclusion

This list might seem long if you’re planning to upgrade across many versions, but chances are that a lot of the issues will not affect your client, and in many cases it’s not hard to make your client handle both your current version and the new one. Having a staging or test cluster is the way to go in order to play it safe.

If, on the other hand, you are upgrading to a version incompatible with the previous one and you need to avoid downtime during the actual upgrade of the Elasticsearch cluster, I recommend this article. The approach is to start by encapsulating access to Elasticsearch in your system behind an interface common to both the implementation targeting the old version and the implementation for the new version. The key is then to move the choice of Elasticsearch cluster, and the version of that cluster, into runtime configuration. In many ways this is the silver bullet of upgrade strategies, both in terms of capabilities and implementation cost.
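A minimal Java sketch of this approach could look as follows. All names here (`SearchBackend`, `LegacyBackend`, `V1Backend`, `SearchBackends` and the two property keys) are hypothetical and chosen for illustration only. To keep the example self-contained it only builds request bodies, using the count request change from the 1.0 list as the version-specific detail; a real implementation would also send the request.

```java
import java.util.Properties;

/** Hypothetical interface that hides version-specific details from the rest of the system. */
interface SearchBackend {
    String clusterUrl();

    /** Builds the body of a count request for a query given as query DSL JSON. */
    String countRequestBody(String queryJson);
}

/** Targets pre-1.0 clusters, where the count body is the bare query. */
final class LegacyBackend implements SearchBackend {
    private final String clusterUrl;

    LegacyBackend(String clusterUrl) {
        this.clusterUrl = clusterUrl;
    }

    @Override
    public String clusterUrl() {
        return clusterUrl;
    }

    @Override
    public String countRequestBody(String queryJson) {
        return queryJson;
    }
}

/** Targets 1.0+ clusters, where the query must be wrapped in a query parameter. */
final class V1Backend implements SearchBackend {
    private final String clusterUrl;

    V1Backend(String clusterUrl) {
        this.clusterUrl = clusterUrl;
    }

    @Override
    public String clusterUrl() {
        return clusterUrl;
    }

    @Override
    public String countRequestBody(String queryJson) {
        return "{\"query\":" + queryJson + "}";
    }
}

/** Picks the implementation from runtime configuration (property names are assumptions). */
final class SearchBackends {
    private SearchBackends() {}

    static SearchBackend fromConfig(Properties config) {
        String url = config.getProperty("search.cluster.url");
        String version = config.getProperty("search.cluster.version");
        if (url == null || version == null) {
            throw new IllegalArgumentException("Both cluster URL and version must be configured");
        }
        // Anything starting with "0." is treated as pre-1.0 in this sketch.
        return version.startsWith("0.") ? new LegacyBackend(url) : new V1Backend(url);
    }
}
```

Switching clusters then becomes a configuration change: point the two properties at the new cluster and its version, and the rest of the system keeps calling the same interface.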