This Week in Elasticsearch - February 11, 2015
Welcome to This Week in Elasticsearch. In this roundup, we try to inform you about the latest and greatest changes in Elasticsearch. We cover what happened in the GitHub repositories, as well as many Elasticsearch events happening worldwide, and give you a small peek into the future of the project.
Groovy scripting vulnerability. See the blog and download Elasticsearch v1.4.3 / v1.3.8 https://t.co/BfUEcwek0R
— elasticsearch (@elasticsearch)
February 11, 2015
Elasticsearch Core
- Discovery: check index uuid when merging incoming cluster state into the local one (#9541, 2.0.0, 1.5.0, 1.4.3)
- Internal: Avoid unnecessary utf8 conversion when creating ScriptDocValues for a string field (#9557, 2.0.0, 1.5.0, 1.4.3)
- Discovery: publishing timeout to log at WARN and indicate pending nodes (#9551, 2.0.0, 1.5.0, 1.4.3)
- Core: Remove full flush / FlushType.NEW_WRITER (#9559, 2.0.0, 1.5.0)
- Core: Remove FlushType and make resources final in InternalEngine (#9565, 2.0.0, 1.5.0)
- Gateway: Add logging around gateway shard allocation (#9562, 2.0.0, 1.5.0, 1.4.3)
- Recovery: Add a timeout to local mapping change check (#9575, 2.0.0, 1.5.0, 1.4.3, 1.3.8)
- Logging: improve logging messages added in #9562 (#9603, 2.0.0, 1.5.0, 1.4.3)
- Core: Mapping update task back references already closed index shard (#9607, 2.0.0, 1.5.0, 1.4.3, 1.3.8)
- Aggregations: Add offset to date_histogram, replacing pre_offset and post_offset (#9597, 2.0.0)
- Core: Refactor InternalEngine into abstract Engine and classes (#9585, 2.0.0, 1.5.0)
- Core: Close Engine immediately if a tragic event strikes. (#9616, 2.0.0, 1.5.0)
- Geo: Correct bounding box logic for GeometryCollection type (#9550, 2.0.0, 1.5.0, 1.4.3)
- Core: Factor out settings updates from Engine (#9625, 2.0.0, 1.5.0)
- Internal: Promptly cleanup updateTask timeout handler (#9621, 2.0.0, 1.5.0, 1.4.3, 1.3.8)
- Percolate API: support encoded body as query string param consistently (#9628, 2.0.0, 1.5.0, 1.4.3)
- Upgrade API: Change wait_for_completion to default to true (#9639, 1.4.3)
- Core: Consolidate index / shard deletion in IndicesService (#9605, 2.0.0, 1.5.0)
- Internal: Add AliasesRequest interface to mark requests that manage aliases (#9460, 2.0.0, 1.5.0)
- Query API: field_value_factor query now throws exception on multiple values (#9246, 2.0.0, 1.5.0)
- Engine: back port fix to a potential dead lock when failing engine during COMMIT_TRANSLOG flush (#9501, 1.4.3)
- Recovery: Improved timeout logging on stalled recovery and exception (#9600, 1.4.3)
- Mappings: Remove support for new indexes using path setting in object/nested fields or index_name in any field (#9570, 2.0.0)
- Tests: Add static indexes to fill out 0.20.x coverage. (#9537, 1.4.3)
- Fielddata: Change threshold value of fielddata.filter.frequency.max/min (#9522, 2.0.0)
- Mappings: Remove ability to set path for _id and _routing on 2.0+ indexes (#9623, 2.0.0)
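The date_histogram change listed above (#9597) replaces pre_offset and post_offset with a single offset parameter. As a rough sketch of what a 2.0-style request body might look like, the snippet below only builds the JSON and does not talk to a cluster. The field name "timestamp", the aggregation name "events_over_time", and the helper function are hypothetical, chosen purely for illustration.

```python
import json


def date_histogram_with_offset(field, interval, offset):
    """Build a search body whose date_histogram buckets are shifted by `offset`.

    `field` and the aggregation name are illustrative assumptions, not taken
    from the changelog entry.
    """
    return {
        "size": 0,  # only aggregation results are wanted, no hits
        "aggs": {
            "events_over_time": {
                "date_histogram": {
                    "field": field,
                    "interval": interval,
                    # Single parameter that takes over from pre_offset/post_offset
                    "offset": offset,
                }
            }
        },
    }


# Daily buckets that start at 06:00 instead of midnight
body = date_histogram_with_offset("timestamp", "day", "+6h")
print(json.dumps(body, indent=2))
```

The printed body could then be sent to the search endpoint of an index with whichever client you already use.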
wow! “@mattTheLenda: @elasticsearch is now powering our @MarsCuriosity data analytics platform. Welcome to Mars, @kimchy… #freakyfast"
— Shay Banon (@kimchy)
February 6, 2015
In Apache Lucene This Past Week
- Improvements landed for some important steps towards future merging of Lucene's Filter and Query: do not decode term frequency blocks from the postings when scores are not needed; move needsScores from Scorer to Weight so decisions about scoring can be made higher up in the search stack; remove TermFilter since QueryWrapperFilter (TermQuery) is now the same thing.
- In a step towards the goal of eventually having all query scorers expose positions/offsets, code was merged to trunk and 5.1 to merge DocsEnum and DocsAndPositionsEnum into PostingsEnum. Now, when iterating through postings, you always have methods to access positions/offsets. (N.B.: Most queries don't implement positions.)
- When an exception is hit while one thread commits and another opens a near-real-time reader, IndexWriter will no longer deadlock.
- Further improvements to IndexWriter include adding new methods to see if it's open and to see if a "tragic" exception struck, causing IndexWriter to close itself in self-defense.
- And from the ease of use department: When an index has segments that are too old, the exception thrown is now much clearer for the user.
90-170% indexing throughput boost with new Elasticsearch-php core. Informal test, but encouraging results so far! pic.twitter.com/8he5v7Qa8o
— Zachary Tong (@ZacharyTong)
February 10, 2015
Elasticsearch Ecosystem
Here's some more information about what is happening in the ecosystem we are maintaining around the ELK stack - that's Elasticsearch plus Logstash and Kibana - including plugin and driver releases.
- If you've been a fan of Mark Harwood's previous posts on anomaly detection, you'll love his latest: Spotting Bad Actors: What Your Logs Can Tell You about Protecting Your Business. The post is a very detailed how-to on using Elasticsearch's aggregations to analyze web server log files, with the goal of discovering how to block unwelcome visitors to a site.
- Christoffer Vig from our partner firm, Comperio, shared an awesome walkthrough post on analytics with Elasticsearch and Kibana 4. In his post, you'll learn all about how to use the two together to gain valuable insights, such as "Which Belgian Beer gives you the most value for the money?"
- Chris Simpson has republished his fantastic how-to on Using Elasticsearch on Amazon EC2 - now up to date for current versions of Elasticsearch.
- Florian Hopf authored an article on Fixing Elasticsearch Allocation Issues he encountered while working with 350 Logstash indices on his laptop. It's a nice post on the process of debugging and on how the Cluster Stats API can make your life easier.
- Marco Bonzanini shared an article exploring some options to improve the results of Elasticsearch queries with multiple terms. Amongst other use cases in this post, you'll learn about phrase-based matches and phrase matches with slop for proximity search.
- Bruce Park published a tutorial on Testing Elasticsearch in Your Rails 4 Application. Lots of useful code samples!
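Marco Bonzanini's article above covers phrase matches with slop for proximity search. As a small illustration (not taken from his article), here is one way such a query body could be built with match_phrase. The field name "title", the sample text, and the helper function are hypothetical.

```python
import json


def phrase_query(field, text, slop=0):
    """Build a match_phrase query body.

    With slop=0 the terms must appear adjacent and in order; a larger slop
    lets the terms sit further apart and still match.
    """
    return {
        "query": {
            "match_phrase": {
                field: {
                    "query": text,
                    "slop": slop,
                }
            }
        }
    }


# Exact phrase versus a looser proximity match on a hypothetical "title" field
exact = phrase_query("title", "quick fox")
loose = phrase_query("title", "quick fox", slop=2)
print(json.dumps(loose, indent=2))
```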
Slides & Videos
Shay Banon on the how and why of building Elasticsearch's API
Spotify's engineering team share their Elasticsearch Use Cases at the recent Stockholm Meetup
Alexander Reelsen's quick introduction to Elasticsearch's percolator, showcasing the potential of performing document enrichment before indexing
Where to Find Us
We'd love to feature all the great Elasticsearch, Logstash, and Kibana presentations and meetups happening worldwide in this section. If you're speaking or hosting a meetup, let our Director of Developer Relations, Leslie Hawthorn, know!
Austria
- The Vienna Ruby Users Group will convene on Feb 12, with talks on Logstash, Jekyll and Octopress. You can register now to save your seat.
- The Vienna Elasticsearch Users Group will get together on March 4 to talk Elasticsearch at Cloud Foundry and more. Register now to let the organizers know you're attending.
Germany
Want to know more @logstash and happen to be in Berlin next 24/02, join me @elasticberlin (http://t.co/Kh6SyckgnW) for an all night on logs!
— Pere Urbón-Bayes (@purbon)
February 11, 2015
India
The Configuration Management Magic Meetup will convene on February 21 in Bangalore. Among the many talks on offer, you can hear all about Log Analysis using Elasticsearch, Kibana and Fluentd. Register now to save your seat.
Israel
The Tel Aviv-Yafo ELK meetup group will be talking How to Use ELK to Analyze Logs from a Large Production AWS Environment on February 24. You can register now to attend.
Japan
Registration is already full for the next Elasticsearch Japan Study Session in Tokyo, but you can add yourself to the waitlist. The user group will get together on February 13 at 7:30 PM.
Speaker Update: Peter Vulgaris from @facebook speaks in the #Elasticsearch track at #GOTOams: http://t.co/TcA6fFyHjp
— GOTO Amsterdam (@GOTOamst)
February 11, 2015
Norway
The NDC Meetup Group in Oslo will get together on Feb 18 to talk Data Exploration with Elasticsearch. Register now to save your place.
South Africa
The inaugural Capetown Elasticsearch Meetup will convene on March 5 to talk shop and plan for the future of the group. Register now to let the organizers know you plan to attend.
Taiwan
The Agile Code Camp team are convening a developer and designer hack day on February 23, and attendees will be developing with Elasticsearch. Register now to attend the full day event.
United Kingdom
Elasticsearch will be out in force at QCon London, which returns to the Queen Elizabeth II Conference Center this year. You can visit us at our booth on the show floor, plus we'll be having one of our engineers take the stage for the main program. Those details are in the works, but in the meantime you can take a look at Kristoffer Dyrkorn's talk information. He'll be sharing the story of how Elasticsearch and other technologies are powering the Norwegian Roads Authority's brand new system to provide real-time traffic information to travelers throughout Norway.
United States
Heading to Strata in San Jose? We've got several activities planned around the conference. Check them out - you don't have to be attending Strata to enjoy some of the fun!
- Join Costin Leau, creator of Elasticsearch for Apache Hadoop, at the conference. He'll present on Search Evolved: Unraveling Your Data on Friday, Feb 20 at 11:30 AM.
- We're hosting a tutorial, Going Beyond the Needle in the Haystack: Elasticsearch and the ELK Stack, on Feb 18. You can still add a tutorial option to your Strata pass if you'd like to attend, or register for a full conference pass and the tutorial session with a 20% discount using code ES20.
- Whether or not you're headed to Strata, we have a special edition meetup of the Silicon Valley Elasticsearch Meetup happening on Feb 18. You'll hear from Costin Leau, creator of Elasticsearch for Apache Hadoop, Holden Karau from Databricks on Elasticsearch and Spark, and Todd Nine from Apigee on Apache Usergrid & Elasticsearch. Register now to save your place. If you're attending Strata, the venue is a mere 10 minute walk from the convention center.
Coming up in San Francisco next month, you can join the SF MySQL Meetup to hear all about using MySQL and the ELK stack for audit logging. The user group will get together on March 11, but the event is filling up quickly. Register now to save your seat.
And for our friends beyond Silicon Valley:
- Robyn Bergeron will be speaking at SCALE 13x on DevOps + Open Source == BFF Practices! Join her to learn more about DevOps (practice, theory, and otherwise!), shared habits of successful open source communities and DevOps practitioners, and tips for how you or your organization can start applying these habits today.
- If you're attending the Linux Foundation Collaboration Summit, make sure to say hello to Leslie Hawthorn. She'll be there to answer any of your questions about the ELK stack in the hallway track. Collab Summit is on Feb 18-20 in Santa Rosa, California.
- The Chicago MySQL Users Group will be getting together on Feb 19 to talk MySQL Audit Logging and the ELK stack. Register now to save your seat.
- For folks in Atlanta who love Elasticsearch and OpenStack, the meetup on Feb 19 will be a great place to be. Sign up now to hear about Managing Your OpenStack logs with ELK.
Honza Kral mentors attendees at the recent Django Girls Brno workshop
Photo Credit: Martin Kyral for Django Girls
Where to Find You
PSST! If you're a regular reader of This Week in Elasticsearch, a.k.a TWIES, you're thinking of skipping this section. You may even be thinking to yourself, yes of course I will drop a note on Twitter when I am giving a talk on all things ELK. That's awesome, because we'd like to showcase every meetup, conference presentation and workshop on Elasticsearch, Logstash, and Kibana happening worldwide. And now, we've made it even easier for you to get support for your meetup!
Head on over to our meetups page! (And we'll still totally send you swag if you're giving a talk on anything ELKy at a conference.)
Oh yeah, we're also hiring. If you'd like us to find you for employment purposes, just drop us a note. We care more about your skill set and passion for Elasticsearch, Kibana, and Logstash than where you rest your head.
Trainings
If you are interested in Elasticsearch training we have courses taught by our core developers coming up in:
- Sydney - February 16, 2015 (Hands on Workshop)
- Sydney - February 17, 2015 (Core Elasticsearch Training)
- Amsterdam - February 18, 2015 (Hands on Workshop)
- Amsterdam - February 19, 2015 (Core Elasticsearch Training)
- Los Angeles - February 19, 2015 (Hands on Workshop)
- London - February 25, 2015 (Hands on Workshop)
- London - February 26, 2015 (Core Elasticsearch Training)
- Bangalore - February 25, 2015 (Hands on Workshop)
- Bangalore - February 26, 2015 (Core Elasticsearch Training)
- New Delhi - March 17, 2015 (Core Elasticsearch Training)
- San Francisco - March 17, 2015 (Hands on Workshop)
- Mountain View - March 18, 2015 (Core Elasticsearch Training)
- Northern Virginia - March 24, 2015 (Core Elasticsearch Training)
- Munich - March 24, 2015 (Core Elasticsearch Training)
- Munich - March 26, 2015 (Hands on Workshop)
- Stockholm - March 25, 2015 (Core Elasticsearch Training)
- Los Angeles - March 25, 2015 (Core Elasticsearch Training)
- Paris - March 25, 2015 (Hands on Workshop)
- Paris - March 26, 2015 (Core Elasticsearch Training)