May 14, 2014

This Week in Elasticsearch - May 14, 2014

By Alexander Reelsen

Welcome to This Week in Elasticsearch. In this roundup, we try to inform you about the latest and greatest changes in Elasticsearch. We cover what happened in the GitHub repositories, as well as many Elasticsearch events happening worldwide, and give you a small peek into the future of the project.

Elasticsearch Core

  • Internal: Limit the number of bytes that can be allocated to process requests (#6050, master and 1.x)
  • Aggregations: Support _index field (#5848, master and 1.x)
  • Index Status API: Removed API as the Recovery API now offers this functionality (#6062, master)
  • cat API: node version could be empty in /_cat/nodes endpoint (#5480, master and 1.x)
  • Indices Options: Unify method names (#6068, master and 1.x)
  • Aggregations: Changed response format of percentile agg (#5870, master and 1.x)
  • Percolate API: The percolator needs to take deleted percolator documents into account (#5843, master, 1.x, 1.1 and 1.0)
  • Scroll API: Reduce phase fails if shard failures occur (#6022, master and 1.x)
  • More like this API: Added ability to include the queried document (#6067, master and 1.x)
  • cat API: Added information about fielddata (#4593, master and 1.x)
  • Dependencies: Update shade plugin to support Java 8 (#6015, master and 1.x)
  • REST Clear Scroll API: Return 404 error code if id does not exist (#5865, master and 1.x)
  • More like this API: take into account size and from request body parameters (#5981, master and 1.x)
  • Validate query API: made sure match_all query is used when no body is provided (#6111, master and 1.x)
  • Validate query API: take into account filters eventually associated with aliases (#6112, master and 1.x)
  • Validate query API: take into account type parameter if specified (#6116, master and 1.x)
  • Validate query API: properly detect extra json fields beyond the query (#5685, master and 1.x)
  • Internal: add global ordinals based include/exclude support to terms and significant terms aggregations (#6000, master and 1.x)
  • Segments merging: allow to change concurrent merge scheduling setting dynamically (#6098, master and 1.x)
  • Mappings: improved mappings validation (#6093, master and 1.x)
  • Segments merging: restored previous elasticsearch specific ConcurrentMergeScheduler defaults (commit, master and 1.x)
  • Segments merging: lowered merge throttling defaults back to 20 MB/sec now that LUCENE-5641 is solved (#6081, master and 1.x)
  • Internal: fixed single index resolution to concrete indices when at least one index is required (#6137, master and 1.x)
  • Snapshot & Restore API: fixed hanging aborted snapshot during node shutdown (#5966, master, 1.x and 1.1)
  • Segments merging: removed SerialMergeScheduler (#6120, master and 1.x)
  • Aggregations: added option for background filter to significant terms (#5944, master and 1.x)
  • Scripting: log script change/add and removal at INFO level (#6104, master and 1.x)
  • Aggregations: use t-digest as a dependency for the percentiles aggregation (#6142, master and 1.x)
  • Clear scroll API: made sure that the whole message gets read when freeing the context (#6147, master and 1.x)
  • Internal: fixed NPE when initializing an accepted socket in NettyTransport (#6144, master and 1.x)
  • Recovery api: fixed percent bytes recovered greater than 100% (#6113, master and 1.x)
  • Field data: track the number of times the circuit breaker has tripped (#6130, master and 1.x)
  • Aggregations: Add shard_min_doc_count parameter to terms aggregation (#6143, master and 1.x)
  • Query DSL: allow sorting on nested sub generated field (#6150, master and 1.x)
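
As a rough illustration of the new shard_min_doc_count parameter on the terms aggregation (#6143), the sketch below builds a search request body in Python. The helper name terms_agg_body, the aggregation name top_terms, and the field name tags are made up for this example, and the chosen thresholds are arbitrary; check the aggregation documentation for your version before relying on the exact semantics.

```python
import json


def terms_agg_body(field, min_doc_count=1, shard_min_doc_count=0, size=10):
    """Build a search request body with a single terms aggregation.

    Hypothetical helper for illustration only; it just assembles a dict.
    """
    return {
        # We only care about the aggregation, so ask for no hits.
        "size": 0,
        "aggs": {
            "top_terms": {  # aggregation name is arbitrary
                "terms": {
                    "field": field,
                    "size": size,
                    # Threshold applied to the final, merged buckets.
                    "min_doc_count": min_doc_count,
                    # Threshold applied on each shard before merging.
                    "shard_min_doc_count": shard_min_doc_count,
                }
            }
        },
    }


body = terms_agg_body("tags", min_doc_count=10, shard_min_doc_count=2)
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the body of a search request with whatever HTTP client you prefer.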

Elasticsearch Ecosystem

Here's some more information about what is happening in the ecosystem we are maintaining around the ELK stack - that's Elasticsearch plus Logstash and Kibana - including plugin and driver releases.

  • The Logstash team was happy to announce Logstash 1.4.1 last week. You can also check out the latest release of our Logstash Puppet module.
  • Elasticsearch for Apache Hadoop 2.0 RC1 has been released.
  • Elasticsearch Marvel version 1.1.1 has been released.
  • Nicolas Everett released version 0.0.8 of the experimental highlighter plugin for Elasticsearch.
  • Issac Springer has released a PowerShell module for common maintenance tasks in Elasticsearch.
  • Elasticsearch now powers FBOpen, an open API server, data import tools, and sample apps to help small businesses search for opportunities to work with the U.S. government.
  • Patrick Peschlow created an incredibly useful Elasticsearch Indexing Performance Cheatsheet.
  • Graeme Fowler has a nice series of blog posts on how to monitor exim mail logs with the ELK stack.
  • Sebastian Belczyk shared his work on benchmarking the performance of child document queries.
  • Graham Tackley, Director of Architecture at The Guardian News and Media, will be presenting their Elasticsearch Use Case at the upcoming DataBeat conference. You can read all about his presentation on VentureBeat, including how the VB folks will show you at DataBeat how to use all that data you have to make more money.
  • Somkiat shared a how-to on using Elasticsearch with the Thai analyzer (in Thai).
  • We've started a blog post series to let you know what Elasticsearch events are on each week. Published on Mondays, Where in the World is Elasticsearch helps you find cool talks near you. We'd love your feedback!

Full house last week at the first ever Elasticsearch Rio de Janeiro Meetup (photo by Leslie Hawthorn)

Slides & Videos

Jason Scheller from Thomson Reuters on their use of Elasticsearch, April New York City Elasticsearch Meetup

Leslie Hawthorn's presentation from the Elasticsearch Rio de Janeiro Meetup

Kicking off the May Boston Elasticsearch Meetup (photo by Igor Motov)

Clay Whetung of Brewster on their use of Elasticsearch, April New York City Elasticsearch Meetup

How to Use Elasticsearch Analyzers, presented by Pablo Musa at the first ever Rio de Janeiro Elasticsearch Meetup (BR-PT)

Slides from the May Search Meetup Karlsruhe

Slides from ESPRIT JUG Day Tunisia

David Pilato helps the audience at ESPRIT JUG Day Tunisia 2014 Make Sense of their (Big) Data! (photo by the ESPRIT JUG Day team)

Where to Find Us

We'd love to feature all the great Elasticsearch, Logstash and Kibana presentations and meetups happening worldwide in this section. If you're speaking or hosting a meetup, let our Community Manager, Leslie Hawthorn, know!


The wonderful Mark Walkom, Community Organizer of the Elasticsearch Sydney Meetup, will discuss How to Get Running with Logstash on May 20th. Join Mark at Google Sydney for the May SAGE-AU Meetup at 6 PM.


The Elasticsearch Vienna Meetup Group has scheduled their first meetup for Thursday, June 12th. Please join us at 7 PM to hear from Alexander Reelsen on What's New in Elasticsearch.


Honza Kral will discuss how to Explore Your Data using Elasticsearch at the Bulgarian Web Summit 2014. The conference takes place on May 31st in Sofia.


The fine folks at the Polyglot Unconference in Vancouver, BC are hosting their annual event on May 23-25th. The conference kicks off with Ganesh Swami presenting a half day workshop on Getting Started with Elasticsearch.


  • Honza Kral will be speaking at DjangoConEU on From __icontains to search. The conference takes place May 13-17th on the Île des Embiez in France.
  • David Pilato will be attending dotScale on May 19th in Paris. Don't miss his dotScale workshop, Elasticsearch Overview, on May 17th!
  • David Pilato will host an Elasticsearch workshop at the Solutions Linux Conference on May 20th. The conference runs from May 20-21st in Paris. If you don't have time to attend David's workshop, make sure to stop by the Elasticsearch booth to say hello!
  • David Pilato will run a workshop on Elasticsearch and Kibana at the Breizhcamp 2014. The event runs from May 21-23rd in Rennes.


  • The Berlin Geek 2 Geek Meetup group will get together this Friday, May 16th to talk all about Logging. Jilles van Gurp from Linko will discuss the ELK stack. Doors open at 7 PM.
  • The Elasticsearch Stuttgart meetup will convene their second meeting on May 26th at 7 PM. Topics will include running Elasticsearch on AWS and Google Cloud Engine.
  • The Berlin .Net Users Group will talk Elasticsearch on Monday, May 26th at 7 PM.
  • The Elasticsearch team will be at Berlin Buzzwords. (When we say the team, we mean most of our folks in the EU and several of our employees from the US. :)) We have many talks on the program and look forward to hosting you in the developer chill area, as well. Even better, the Berlin Elasticsearch User Group will convene a hackathon on Wednesday, May 28th. Please join us!


Martijn Laarman will be speaking on from text to full-text search at the NDC Oslo 2014 conference. The show runs from June 2-6th.


The Torun JUG will get together on May 28th at 6 PM to talk about Lucene and all of her friends, with a spotlight on Elasticsearch.


Costin Leau will speak at Topconf Bucharest 2014 on Big data real time search and analytics. Topconf Bucharest runs from June 10-13th and Costin will speak at 3:20 PM on June 12th.


Honza Kral will speak at PyCon Ru. The schedule is still being finalized, but mark your calendars for June 2nd and 3rd. If you're heading to PyCon Ru, make sure to say hello to Honza!


Clinton Gormley has been invited to speak at the Barcelona on Rails Meetup on May 15th. Join him for a presentation on Elasticsearch's Query DSL: Not just for wizards! Doors open at 7 PM, and thanks to the fine folks at XING for hosting us!


  • Alexander Reelsen will be speaking at Mimacom Days Zurich on June 4th. He will cover Elasticsearch - Beyond Full Text Search at 9:45 AM, directly after the conference welcome remarks.
  • Alexander Reelsen and Britta Weber will be speaking at the Zurich Elasticsearch Meetup on June 7th. Alex will discuss What's new in Elasticsearch and Britta will cover the Significant Terms Aggregation. Doors open at 7 PM.

United Kingdom

United States

  • Just announced: The Elasticsearch San Francisco Meetup group will welcome Graham Tackley, Director of Architecture at The Guardian News and Media, on May 20th. Graham will discuss using Elasticsearch for Analytics at The Guardian. Register now, because this one will fill up fast! Thanks to our friends at Salesforce for hosting us!
  • New meetup: The Elasticsearch Silicon Valley Meetup group will convene on May 21st at 7 PM. You'll hear from Kurt Hurtado on the Logstash team on using the ELK Stack in a DevOps Environment, plus engineers at LinkedIn will talk about how they use the ELK stack. Many thanks to LinkedIn for hosting us!
  • The Atlanta Elasticsearch Meetup group will hold their second meeting on May 20th. Drew Raines will cover Improving Resiliency in Elasticsearch, and Rashid Khan will discuss Aggregations and Kibana. Doors open at 6:30 PM.
  • The Miami JVM Group has rescheduled their Introduction to Elasticsearch for May 20th. Doors open at 7 PM, and registration is open!
  • Elasticsearch will have a table and some tasty treats at GOTO Chicago. The conference runs from May 20-21st. Make sure to stop by and say hello!
  • The Elasticsearch Chicago Meetup group will have their fifth meeting on May 22nd. If you're staying in town an extra few days for GOTO Chicago, please join us! (And if you're in town anyway, you should also totally join us!) Doors open at 6:00 PM.
  • Jordan Sissel will be speaking at Gluecon 2014! Make sure to catch his talk and visit the Elasticsearch booth. The conference runs from May 21-22nd in Broomfield, Colorado.
  • The Washington DC Elasticsearch Meetup group will get together on May 28th at 6:30 PM. You'll hear from engineers at AOL about Moloch, their open source network forensics tool built on top of Elasticsearch.
  • Costin Leau will speak at Hadoop Summit North America on Real-time Analytics and Anomalies Detection using Elasticsearch, Hadoop and Storm. The conference runs from June 3-5th in San Jose, California. Costin will take the stage at 4:35 PM on June 3rd.

Where to Find You

Our Community Manager, Leslie Hawthorn, is hard at work to help folks create more Elasticsearch meetup groups and to help meetup organizers find more speakers. If you are interested in either effort, take a moment to let her know.

Oh yeah, we're also hiring. If you'd like us to find you for employment purposes, just drop us a note. We care more about your skill set and passion for Elasticsearch, Kibana and Logstash than where you rest your head.


If you are interested in Elasticsearch training, we have courses taught by our core developers coming up in:

  • Atlanta - May 20, 2014 (core Elasticsearch training)
  • New York - June 4, 2014 (core Elasticsearch training)
  • London - June 4, 2014 (core Elasticsearch training)
  • Zurich - June 5, 2014 (core Elasticsearch training)
  • San Francisco - June 6, 2014 (ELK workshop)
  • Amsterdam - June 27, 2014 (ELK workshop)