This Week in Elasticsearch - February 04, 2015
Welcome to This Week in Elasticsearch. In this roundup, we try to inform you about the latest and greatest changes in Elasticsearch. We cover what happened in the GitHub repositories, as well as many Elasticsearch events happening worldwide, and give you a small peek into the future of the project.
- Scripting: Make groovy sandbox method blacklist dynamically additive (#9470, 2.0.0, 1.5.0, 1.4.3, 1.3.8)
- Scripting: Make `script.groovy.sandbox.method_blacklist_patch` truly append-only (#9473, 2.0.0, 1.5.0, 1.4.3, 1.3.8)
- Cache: Use correct number of bytes in query cache accounting (#9479, 2.0.0, 1.5.0)
- Internal: Disable auto generated id indexing optimization (#9468, 2.0.0, 1.5.0, 1.4.3)
- Cache: Use a smaller expected size when serializing query results (#9485, 2.0.0, 1.5.0)
- Internal: Remove dirty flag and force boolean for refresh (#9484, 2.0.0)
- Internal: Remove
- Aggregations: Remove limitation on field access within aggs to the types provided in the search (#9487, 2.0.0, 1.5.0)
- Test: Add type-unrestricted version of field mapper getter to `SearchContext` (#9490, 2.0.0, 1.5.0)
- Search: Remove query-cache serialization optimization. (#9500, 2.0.0, 1.5.0, 1.4.3)
- Recovery: update access time of ongoing recoveries (#9506, 2.0.0, 1.5.0)
- Internal: Add `beforeIndexAddedToCluster` callback (#9514, 2.0.0, 1.5.0)
- Scripting: Disallow method pointer expressions in Groovy scripting (#9509, 2.0.0, 1.5.0, 1.4.3, 1.3.8)
- Scripting: Update Groovy dependency to 2.4.0 (#9520, 2.0.0, 1.5.0)
- Aggregations: Add 'offset' option to histogram aggregation (#9505, 2.0.0)
- Internal: upgrade to Lucene snapshot r1656366 (#9524, 2.0.0)
- Test: support stashed values within property names in our REST tests (#9533, 2.0.0, 1.5.0)
- Search: Avoid calling
- Aggregations: Unify histogram implementations (#9446, 2.0.0)
- Nodes Stats: Fix open file descriptors count on Windows (#9397, 2.0.0, 1.5.0, 1.4.3)
- Aggregations: Add standard error bounds to extended_stats (#9389, 1.4.3)
- Inner hits: Make sure inner hits on `has_parent` query resolve hits properly (#9384, 2.0.0, 1.5.0)
- Netty Transport: Add profiles to transport infos (#9134, 2.0.0, 1.5.0)
- Internal: Add `checksum` option for `index.shard.check_on_startup` (#9183, 2.0.0)
- Resiliency: Add support for registering custom circuit breaker (#8795, 2.0.0)
- Mappings: Remove type prefix support from field names in queries (#9492, 2.0.0)
- Internal: Reuse Lucene's MultiCollector (#9549, 2.0.0)
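For readers curious about the new 'offset' option on the histogram aggregation listed above, here is a minimal sketch of the bucketing arithmetic, assuming the bucket key is computed as floor((value - offset) / interval) * interval + offset. The function names `bucket_key` and `histogram` and the sample values are purely illustrative and are not part of any Elasticsearch API.

```python
import math
from collections import Counter


def bucket_key(value, interval, offset=0):
    """Return the lower bound of the bucket that contains `value`.

    Assumed formula: floor((value - offset) / interval) * interval + offset.
    """
    return math.floor((value - offset) / interval) * interval + offset


def histogram(values, interval, offset=0):
    """Count values per bucket key and return the buckets sorted by key."""
    counts = Counter(bucket_key(v, interval, offset) for v in values)
    return dict(sorted(counts.items()))


# Hypothetical sample data, not taken from the post.
prices = [3, 7, 12, 15, 21]

# Without an offset, buckets start at multiples of the interval: 0, 10, 20.
print(histogram(prices, interval=10))

# With offset=5, bucket boundaries shift to -5, 5, 15.
print(histogram(prices, interval=10, offset=5))
```

In this sketch the offset only shifts where bucket boundaries fall; the bucket width stays equal to the interval.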
Check out the Q&A w/ Britta Weber @a2tirb at Fyber's Elasticsearch UG (@elasticberlin) meetup. http://t.co/iKBiWhF4Nq pic.twitter.com/R7gNeuynwg
— Fyber (@Fyber) February 4, 2015
In Apache Lucene This Past Week
- Fixed a doc byte doc values reuse bug, and a concurrency bug with doc values updates that could cause false `FileNotFoundException`.
- IndexWriter's infoStream now logs how much time threads were stalled because flushing couldn't keep up
- JapaneseAnalyzer (Kuromoji) will soon normalize Kanji numerals to their Arabic equivalents
- Lucene's randomized tests uncovered a new JDK bug in Lithuanian Collation
- Clean up the codec APIs building compound files (CompoundFileFormat): remove the explicit files() method, and remove the redundant files argument to write()
- Remove per-document Analyzer during indexing for 5.0: indexed tokens may not match what your search-time analyzer produces
Realtime updates from PostgreSQL to Elasticsearch - Atlassian Developers http://t.co/61L72hIFGI #Elasticsearch #PostgreSQL
— Found (@foundsays) February 4, 2015
Here's some more information about what is happening in the ecosystem we are maintaining around the ELK stack - that's Elasticsearch plus Logstash and Kibana - including plugin and driver releases.
- Colin Goodheart-Smithe treated us to a walkthrough of using Numeric Aggregations through the lens of analyzing UK housing data. Colin's post is the latest installment in our ongoing series on Aggregations and, in case you missed the previous articles, you can find them linked in the first paragraph of his post.
- Ever wondered how Apache Lucene handles deleted documents? Mike McCandless shared a deep dive on the topic earlier this week. Useful information for lovers of Lucene and for folks looking to improve indexing performance in Elasticsearch.
- We just released our Elasticsearch Puppet Module version 0.9.0, including out-of-the-box support for Hiera, plus added support for OpenSuse 13.x, for Puppet 3.7, and more. Richard Pijnenburg also shares some thoughts on the future direction of our Puppet Modules in the release blog post.
- Hernan Vivani with the Amazon Web Services team shared a guide to using Elasticsearch and Kibana on Amazon EMR. The post takes you through the install process and gives a few basic examples to confirm it is working. Excellent resource for folks just getting started on their Elasticsearch & AWS journey.
- Steve Elliott with the LateRooms engineering team wrote up a how-to on analyzing URLs to enrich your logs for users of Logstash 1.4.x. It offers great tips on using the Grok and Translate plugins, amongst other useful bits. If you enjoyed Steve's post, you may also want to take a look at how LateRooms uses the full ELK stack to drive data-driven decision making for every department at their company. That's right, Data Driven Managering.
- Nick Canzoneri with Postmark published an article about how they've revisited their Elasticsearch architecture to handle the ever-increasing volumes of data that come with scaling your business. Even better, he's offered to answer your scaling questions if you leave comments on his post. Excellent overview of architectural considerations from a long-time happy user.
- Jorge de la Cruz shared a how-to on centralizing and analyzing your Zimbra logs using the ELK stack. (in Spanish)
Blogged: #Elasticsearch One Tip a Day: Using Dynamic Templates To Avoid Rigorous Mappings http://t.co/3VRRXop26z
— Itamar Syn-Hershko (@synhershko) February 4, 2015
Slides & Videos
Capacity Planning and Custom Setups Suitable for Large Elasticsearch Deployments
Better Decisions Through Better Data (in German)
OH: In the last 12 hours, we have taken in 12GB of #IoT sensor data in @elasticsearch
— Chris Matthieu (@chrismatthieu) February 3, 2015
Where to Find Us
We'd love to feature all the great Elasticsearch, Logstash, and Kibana presentations and meetups happening worldwide in this section. If you're speaking or hosting a meetup, let our Director of Developer Relations, Leslie Hawthorn, know!
- The Vienna Ruby Users Group will convene on Feb 12, with talks on Logstash, Jekyll and Octopress. You can register now to save your seat.
- On tonight: The Search Meetup Munich group will get together on Feb 5 for a special Elasticsearch edition: Oliver Eilhard on his Elasticsearch Go Client, Alexander Reelsen on the Percolator, plus Q&A with our CTO, Shay Banon. Register now to save your seat.
- The Ansible Berlin User Group is holding their inaugural meetup on Feb 11, and they'll be talking Elasticsearch and Kibana along with Jenkins, Rethinkdb and, of course, Ansible. Join them and help celebrate their first ever meetup!
- The Berlin Elasticsearch User Group will be holding their monthly meeting on February 24. Register now to save your seat and, even better, volunteer to speak!
- The Tel Aviv-Yafo ELK meetup group will be talking How to Use ELK to Analyze Logs from a Large Production AWS Environment on February 24. These folks are looking for space to meet, so if you can host, get in touch with the organizers. You can register for the meetup now while they're finding a location.
- Registration is already full for the next Elasticsearch Japan Study Session in Tokyo, but you can add yourself to the waitlist. The user group will get together on February 13 at 7:30 PM.
- The NDC Meetup Group in Oslo will get together on Feb 18 to talk Data Exploration with Elasticsearch. Register now to save your place.
- The Agile Code Camp team are convening a developer and designer hack day on February 23, and attendees will be developing with Elasticsearch. Register now to attend the full day event.
- If you're a star in the Ansible Galaxy, you're no doubt attending AnsibleFest London on Feb 5. Stop by and say hello to Alan Hardy and Samir Bennacer at our table in the exhibits area! Our Developer Advocate, Robyn Bergeron, will also be hanging out in the hallway track if you'd like to say hello!
- New meetup: Join Aaron Mildenstein and Peter Kim at the Monitoring NYC Meetup next week! They'll cover all the new features in Logstash 1.5 and give an overview of the full ELK stack on Feb 10. Register now to save your place.
Heading to Strata in San Jose? We've got several activities planned around the conference. Check them out - you don't have to be attending Strata to enjoy some of the fun!
- Join Costin Leau, creator of Elasticsearch for Apache Hadoop, at the conference. He'll present on Search Evolved: Unraveling Your Data on Friday, Feb 20 at 11:30 AM.
- We're hosting a tutorial, Going Beyond the Needle in the Haystack: Elasticsearch and the ELK Stack, on Feb 18. You can still add a tutorial option to your Strata pass if you'd like to attend, or register for a full conference pass and the tutorial session with a 20% discount using code ES20.
- Whether or not you're headed to Strata, we have a special edition of the Silicon Valley Elasticsearch Meetup happening on Feb 18. You'll hear from Costin Leau, creator of Elasticsearch for Apache Hadoop, and Todd Nine from Apigee on Apache Usergrid & Elasticsearch. Register now to save your place. If you're attending Strata, the venue is a mere 10-minute walk from the convention center.
And for our friends beyond Silicon Valley:
- Robyn Bergeron will be speaking at SCALE 13x on DevOps + Open Source == BFF Practices! Join her to learn more about DevOps (practice, theory, and otherwise!), shared habits of successful open source communities and DevOps practitioners, and tips for how you or your organization can start applying these habits today.
- If you're attending the Linux Foundation Collaboration Summit, make sure to say hello to Leslie Hawthorn. She'll be there to answer any of your questions about the ELK stack in the hallway track. Collab Summit is on Feb 18-20 in Santa Rosa, California.
- The Chicago MySQL Users Group will be getting together on Feb 19 to talk MySQL Audit Logging and the ELK stack. Register now to save your seat.
- For folks in Atlanta who love Elasticsearch and OpenStack, the meetup on Feb 19 will be a great place to be. Sign up now to hear about Managing Your OpenStack logs with ELK.
Where to Find You
PSST! If you're a regular reader of This Week in Elasticsearch, a.k.a. TWIES, you're thinking of skipping this section. You may even be thinking to yourself, yes of course I will drop a note on Twitter when I am giving a talk on all things ELK. That's awesome, because we'd like to showcase every meetup, conference presentation and workshop on Elasticsearch, Logstash, and Kibana happening worldwide. And now, we've made it even easier for you to get support for your meetup!
Head on over to our meetups page! (And we'll still totally send you swag if you're giving a talk on anything ELKy at a conference.)
Oh yeah, we're also hiring. If you'd like us to find you for employment purposes, just drop us a note. We care more about your skill set and passion for Elasticsearch, Kibana, and Logstash than where you rest your head.
If you are interested in Elasticsearch training, we have courses taught by our core developers coming up in:
- Melbourne - February 9, 2015 (Core Elasticsearch Training)
- New York - February 11, 2015 (Hands on Workshop)
- New York - February 12, 2015 (Core Elasticsearch Training)
- Sydney - February 16, 2015 (Hands on Workshop)
- Amsterdam - February 18, 2015 (Hands on Workshop)
- Amsterdam - February 19, 2015 (Core Elasticsearch Training)
- London - February 25, 2015 (Hands on Workshop)
- London - February 26, 2015 (Core Elasticsearch Training)
- Bangalore - February 25, 2015 (Hands on Workshop)
- Bangalore - February 26, 2015 (Core Elasticsearch Training)
- New Delhi - March 17, 2015 (Core Elasticsearch Training)
- San Francisco - March 17, 2015 (Hands on Workshop)
- Mountain View - March 18, 2015 (Core Elasticsearch Training)
- Northern Virginia - March 24, 2015 (Core Elasticsearch Training)
- Munich - March 24, 2015 (Core Elasticsearch Training)
- Munich - March 26, 2015 (Hands on Workshop)
- Stockholm - March 25, 2015 (Core Elasticsearch Training)
- Los Angeles - March 25, 2015 (Core Elasticsearch Training)
- Paris - March 25, 2015 (Hands on Workshop)
- Paris - March 26, 2015 (Core Elasticsearch Training)