0.19.5 is out. You can download it here. The release includes critical bug fixes in elasticsearch cluster management and the peer recovery process, and upgrading is highly recommended for all users.
[Update]: Some users reported a problem when starting elasticsearch with
0.19.5 due to a verification error. This problem has been fixed in
[Update 2]: The deb package seems to be broken in 0.19.6; a fix will be issued later.
[Update 3]: 0.19.7 has been released, fixing the Debian packaging problem and a bug in the new compression support (on reads).
The release also adds several features, including:
elasticsearch has had the ability to compress the _source document for a long time. The problem is that small documents don’t compress well on their own: several documents compressed together in a single compression “block” (we use LZF, but Snappy, which will also be supported in the future, works the same way) give a considerably better compression ratio. This version introduces the ability to compress stored fields using the
index.store.compress.stored setting, as well as term vectors using the
The settings can be set at the index level and are dynamic, so they can be changed using the index update settings API. elasticsearch can handle an index that mixes compressed and uncompressed stored data. This makes it possible, for example, to enable compression at a later stage in the index lifecycle and then optimize the index to make use of it (generating new segments that use compression).
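As a rough illustration of changing the setting on a live index, the sketch below builds (but does not send) a request for the index update settings API. The host `http://localhost:9200`, the index name `my_index`, and the helper `build_index_settings_request` are assumptions made for the example; only the `index.store.compress.stored` setting name comes from the text above.

```python
import json
import urllib.request


def build_index_settings_request(base_url, index, settings):
    """Build (without sending) a PUT request for the index update settings API."""
    body = json.dumps(settings).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host and index name; adjust them to your own cluster.
request = build_index_settings_request(
    "http://localhost:9200",
    "my_index",
    {"index.store.compress.stored": True},
)

# Against a running node, urllib.request.urlopen(request) would send it.
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Existing segments are not rewritten by the settings change itself; as noted above, an optimize is what generates new segments that use compression.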
The best compression gains, compared to _source level compression, will mainly come when indexing smaller documents (less than 64k). The price, on the other hand, is that for each document returned, a block needs to be decompressed (it’s fast, though) in order to extract the document data.
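The effect of compressing many small documents together can be seen with a small experiment. This is only a sketch: it uses Python's `zlib` as a stand-in for LZF, the sample documents are made up, and the 64 KB block size is an assumption chosen to match the document size mentioned above, not a statement about elasticsearch internals.

```python
import json
import zlib

# Two hundred small, similar JSON documents (made-up sample data).
docs = [
    json.dumps(
        {"user": f"user{i}", "status": "active", "message": "hello world"}
    ).encode("utf-8")
    for i in range(200)
]

# Document-level compression: every document is compressed on its own.
per_doc_bytes = sum(len(zlib.compress(doc)) for doc in docs)

# Block-level compression: documents are packed into blocks (assumed 64 KB)
# and each block is compressed as a whole.
block_size = 64 * 1024
blocks, current = [], b""
for doc in docs:
    if current and len(current) + len(doc) > block_size:
        blocks.append(current)
        current = b""
    current += doc
if current:
    blocks.append(current)
per_block_bytes = sum(len(zlib.compress(block)) for block in blocks)

raw_bytes = sum(len(doc) for doc in docs)
print(f"raw: {raw_bytes}, per document: {per_doc_bytes}, per block: {per_block_bytes}")
```

The per-block total comes out far smaller because the compressor can reuse field names and values repeated across documents, which a single tiny document does not offer. The cost is the one described above: reading one document means decompressing its whole block.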
Lucene, the IR library elasticsearch uses under the covers, works by creating immutable segments (up to deletes) and constantly merging them (the merge policy settings control how those merges happen). The merge process happens asynchronously, without directly blocking indexing or search. The problem, though, especially on systems with low IO capacity, is that merging can be expensive and can affect search and index operations simply because the box is now taxed with more IO.
The store module now allows throttling to be configured for merges (or for all store operations), either at the node level or at the index level. Node level throttling makes sure that, across all the shards allocated on that node, the merge process won’t exceed the configured number of bytes per second. It can be set by setting
merge, and setting
indices.store.throttle.max_bytes_per_sec to something like
5mb. The node level settings can be changed dynamically using the cluster update settings API.
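A dynamic change of the node level limit might look like the sketch below, which builds (but does not send) a request for the cluster update settings API. The host, the helper name `build_cluster_throttle_request`, and the choice of a `transient` rather than `persistent` update are assumptions for illustration; only the rate setting named above is included.

```python
import json
import urllib.request


def build_cluster_throttle_request(base_url, max_bytes_per_sec):
    """Build (without sending) a PUT request for the cluster update settings API."""
    body = json.dumps(
        {
            # "transient" is an assumed choice: the change is lost on a full
            # cluster restart. "persistent" is the other option.
            "transient": {
                "indices.store.throttle.max_bytes_per_sec": max_bytes_per_sec
            }
        }
    ).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical host; adjust it to your own cluster.
throttle_request = build_cluster_throttle_request("http://localhost:9200", "5mb")
print(throttle_request.get_method(), throttle_request.full_url)
print(throttle_request.data.decode("utf-8"))
```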
If specific index level configuration is needed, regardless of the node level settings, it can be set as well using the
index.store.throttle.max_bytes_per_sec. The default value for the type is
node, meaning it will throttle based on the node level settings and participate in the global throttling. Both settings can be set dynamically using the index update settings API.
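To picture the difference between a shared node level budget and a separate index level one, here is a toy model of byte-rate throttling. It is not elasticsearch's or Lucene's implementation; the `Throttle` class, its `pause_for` method, and the 1 MB/s index limit are hypothetical and exist only for this sketch.

```python
class Throttle:
    """Toy byte-rate limiter: tells a writer how long to pause before writing."""

    def __init__(self, max_bytes_per_sec):
        self.max_bytes_per_sec = max_bytes_per_sec
        self.next_free_time = 0.0  # earliest time the next write may start

    def pause_for(self, now, num_bytes):
        """Return the seconds to wait before writing num_bytes at time `now`."""
        start = max(now, self.next_free_time)
        # Reserve the time this write takes at the configured rate.
        self.next_free_time = start + num_bytes / self.max_bytes_per_sec
        return start - now


MB = 1024 * 1024

# One budget shared by every shard on the node (5mb, as in the text above).
node_throttle = Throttle(5 * MB)
shard_a_wait = node_throttle.pause_for(now=0.0, num_bytes=5 * MB)
shard_b_wait = node_throttle.pause_for(now=0.0, num_bytes=5 * MB)

# An index with its own limit (assumed 1 MB/s) does not draw on the node budget.
index_throttle = Throttle(1 * MB)
index_wait = index_throttle.pause_for(now=0.0, num_bytes=1 * MB)

print(shard_a_wait, shard_b_wait, index_wait)  # prints: 0.0 1.0 0.0
```

In this model the second shard has to wait a full second because the first shard already used up the node's budget for that second, while the index with its own limit starts immediately.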