This Week in Elasticsearch - July 17, 2013

Welcome to the third issue of This Week in Elasticsearch. In this format we try to inform you about the latest and greatest changes in elasticsearch. We cover what happened in the GitHub repositories and the events happening around elasticsearch, and give you a small peek into the future.

Elasticsearch core

  • Support for Pattern Capture Token Filter (#3340): This filter emits a token for every capture group in a regular expression. See the description in the issue for a great example of how simple it is to index camel-cased source code and make it searchable! This token filter will also be part of Lucene 4.4, and is included in the 0.90 and master branches.
  • The found field is now included for delete operations in a bulk response (#3320, included in 0.90 and master)
  • Open/Close Index API supports multiple indices and wildcards (#3217, included in master)
  • More Like This queries now correctly return an error if used with a numeric field (#3252, included in master and 0.90)
  • Geo shapes can now be set to null (#3310, included in master only)
  • The cat API is being extended further (included in master)
  • Some very internal changes: The CacheRecycler class has become a node-level component instead of being static (commit). The search reduce phase uses atomic reference arrays instead of concurrent hash maps (commit). Last but not least, the jsr166 concurrency classes have been upgraded. More at the JSR 166 interest site.
  • IndicesAdminClient#existsAliases has been renamed to IndicesAdminClient#aliasesExist. Note: this is a breaking change for the Java API (#3330, included in master and 0.90)
  • Full recovery with a large number of shards is now faster and has improved memory usage. The work is spread over several commits and is included in master and 0.90 (1, 2, 3, 4)
  • Important note: Make sure you are using the same Java version on all the nodes of your cluster. A recent Java 7 update changed the serialization of IP addresses. In Elasticsearch we don't use Java serialization except for cases where we propagate exceptions over the wire, so nodes running different Java versions can fail to deserialize such an exception. This failure is logged with a message noting that there was an error in deserializing the response exception message, and it can lead to nodes refusing to join the cluster or to failure to allocate shards on a node.
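
To make the Pattern Capture Token Filter entry above more concrete, here is a rough Python sketch of the idea. It is not the Lucene implementation: the function name pattern_capture, the preserve_original flag as modeled here, the de-duplication step and the ASCII-only camel-case regex are all assumptions made for illustration.

```python
import re

# Hypothetical, ASCII-only camel-case pattern (an assumption for this sketch):
# a lowercase or Capitalized word, an all-caps run, or a run of digits.
CAMEL_CASE = r"([A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+)"


def pattern_capture(token, patterns, preserve_original=True):
    """Return the token plus the text of every capture group that matched.

    This only mimics the behaviour described in the list above: one emitted
    token per capture group, for every match of every pattern.
    """
    emitted = []
    if preserve_original:
        emitted.append(token)
    for pattern in patterns:
        for match in re.finditer(pattern, token):
            for group in match.groups():
                # Skip groups that did not participate and, as a
                # simplification of this sketch, skip duplicates.
                if group and group not in emitted:
                    emitted.append(group)
    return emitted


tokens = pattern_capture("getHTTPResponse2", [CAMEL_CASE])
print(tokens)
```

With this sketch, a search for "response" (after lowercasing, which is not shown here) could match an identifier such as getHTTPResponse2, because "Response" is emitted as its own token next to the original.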

Elasticsearch community

Got an interesting open source project, plugin, driver or anything else for elasticsearch? Now is your time to shine! Just drop us a note and we will list it here (and on the .org website, of course!).

  • ElasticHQ 0.95 has been released. This release lets you manage aliases, shows index health, and supports live polling on all screens
  • A thin Python client called slimes has been released
  • A Node.js client called elastics, written in CoffeeScript, has been released

Meetups

If you are interested in core elasticsearch training, the next locations are San Francisco and Boston in August. For more locations, check the training page.

If you are interested in all this, we are hiring. We are interested in your skills, not in your location. Just drop us a note.