January 8, 2014

This Week in Elasticsearch - January 08, 2014

Welcome to This Week in Elasticsearch. In this roundup, we try to inform you about the latest and greatest changes in Elasticsearch. We cover what happened in the GitHub repositories, as well as many Elasticsearch events happening worldwide, and give you a small peek into the future of the project.

We've been out for a bit due to the end-of-year holidays, so we have even more great information to share with you this week.

Elasticsearch Core

Elasticsearch 0.90.9 has been released
Make parsing strict for geo_shape query & filter and stricter for common query. (#4508, master)
Fix computation of ram bytes used in bloom filter posting format (commit, 0.90 and master)
Snapshot/Restore: Add ability to specify base directory on the repository level (commit, master)
Snapshot/Restore: Update snapshot list when snapshot is deleted (commit, master)
Allow to enable / disable bloom filter loading on an index (#4525, 0.90 and master)
Search with terms lookup might get stuck while doing a get for the terms (#4519, 0.90 and master)
Failed search on a shard tries a local replica on a network thread (#4526, 0.90 and master)
Aggregations: Parsing is more strict now (#4464, master)
Cluster Health API returns wrong shard numbers if one of the indices is in red status (#4528, 0.90 and master)
Make doc lookups in queries/filters consistent (#4486, master)
Updated to netty 3.9.0 (commit, 0.90 and master)
Named filter and query don't work with parent/child queries (#4534, 0.90 and master)
Cat API: Collapse/group column support (#4433, master)
Cat API: Fixed NullPointerException in cat/shards when UNASSIGNED (#4544, master)
Use BINARY doc values instead of SORTED_SET doc values to store numeric data, as those can be better used for computations (#3993, master)
Geo distance calculations now default to sloppy_arc (#4498, master)
Make RangeAggregator a MULTI_BUCKETS aggregator (#4550, master)
Geo points are now stored using doc values (#4207, master)
Merge rest-spec-api into elasticsearch core (#4540, 0.90 and master)
Make all search-related APIs consistently accept a query param (#4074, master)
Expose filtered nodes on TransportClient (#4571, 0.90 and master)
GeoPointFieldMapper.doXContentBody doesn't honor includeDefaults (#4563, 0.90 and master)
Explicit doc values setting (#4560, master)
Single shards APIs should fail if routing is required (#4506, master)
Allow omit_norms on the _all field (#3734, 0.90 and master)
Term statistics are now accessible in scripts (#3772, 0.90 and master)
Allow GetAliasRequest to retrieve all aliases (#4455, 0.90 and master)
Replaced ignore_indices with ignore_unavailable, expand_wildcards and allow_no_indices (#4436, master)
Made parsing of ByteSizeValues case-insensitive, so 12GB is accepted as well as 12gb (#4442, 0.90 and master)
Remove GET _aliases api in favor of GET _alias api (#4539, master)
Term Vector settings should be treated like flags without propagation (#4582, 0.90 and master)
Simulate the entire toXContent() instead of special casing (#4579 and #4581, 0.90 and master)
Add field data circuit breaker to stop field data loading from running out of memory (#4592, master)
Cat API: Support for aliases in column names (commit, master)
Using Haversine for accurate distance measurement (#4596, 0.90 and master)
Fixed NullPointerException in IndexShardRoutingTable.getActiveAttribute (#4589, 0.90 and master)
Refresh the id_cache if a new child type with _parent field has been introduced (#4568, 0.90 and master)
Do not balance shards from nodes with newer version of lucene to nodes with older versions of lucene (#4588, 0.90 and master)
Cat API: Add cache numbers to cat/nodes (#4543, master)
Plugin manager: new timeout option (#4603, 0.90 and master)
The fields option should always return an array for JSON document fields and single valued field for metadata fields (#4542, master)
Deb and RPM packages are no longer started automatically after installation (#3722, master)
Double wildcards in the index name can cause a request to hang (#4610, 0.90 and master)
Indices stats API changes, using URIs instead of parameters (#4054, master)
Nodes stats API changes, using URIs instead of parameters (#4057, master)
Move create index api to new acknowledgement mechanism (#4421, 0.90 and master)
Warmers: Dedicated Norms/Terms warm options in mappings (#4079, 0.90 and master)
Rename score to track_scores in percolate api. (#4624, master)
BalancedShardAllocator might trigger unnecessary relocation under rare circumstances (#4630, 0.90 and master)
Introduced Page-based cache recycling (#4557, master)
Make partial dates without year to be 1970 based instead of 2000 (#4451, master)
Don't schedule a flush if there are no operations in the translog (commit, 0.90 and master)
A GeoHashGrid aggregation that buckets GeoPoints into cells whose dimensions are determined by a choice of GeoHash resolution (commit, master)
Randomize flush interval so multiple shards won't flush at the same time (commit, 0.90 and master)
Cluster State API: Make ClusterStateRequest consistent with others (#4065, master)
Simplify usage of nodes info API (#4055, 0.90 and master)
Rename ElasticSearch to Elasticsearch (including class names, thus breaking) (#4634, master)
Changed get index settings api to use new internal get index settings api instead of relying on the cluster state api. (#4620, master)
Stop FastVectorHighlighter from throwing away some query boosts (#4351, 0.90 and master)
FastVectorHighlighter: Use phraseLimit (#4645, 0.90 and master)
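
For readers curious about the Haversine item above (#4596), here is a minimal Python sketch of the standard haversine great-circle formula. It is not the code from that change; the function name `haversine_km` and the mean Earth radius constant are assumptions made for illustration only.

```python
import math

# Mean Earth radius in kilometers (an assumed constant for this sketch).
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)

    # Haversine term: sin^2(dphi/2) + cos(phi1) * cos(phi2) * sin^2(dlambda/2)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)

    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))
```

Calling `haversine_km(0.0, 0.0, 0.0, 90.0)` gives a quarter of the Earth's circumference under the assumed radius. The formula treats the Earth as a sphere, so it trades a small amount of accuracy for simplicity compared with ellipsoidal models.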

Elasticsearch Ecosystem

Here's some more information about what is happening in the ecosystem we are maintaining around Elasticsearch, including plugin and driver releases, as well as news about Logstash and Kibana.

The biggest news of the week is that Wikipedia and all other Wikimedia sites are moving to Elasticsearch! You can read more on the Wikimedia Blog and see what The Next Web has to say.
High Scalability posted an article on How HipChat Stores and Indexes Billions Of Messages Using Elasticsearch And Redis. You can learn even more about this use case from Zuhaib Siddique, the engineer interviewed for this article, in the video below.
Logstash 1.3.2 has been released.
The Elasticsearch Python client has been released in version 0.4.4.
The Scala client Elastic4s has been released in version 0.90.9.0.
Lalit Kumar Jha has created an Elasticsearch Talend component.
A new milestone version of ElasticHQ has been released; see the changelog for details.
The Sunlight Foundation published a guest blog post from Luke Rosiak on how Elasticsearch is used in CitizenAudit, a free tool for non-profits that helps with reporting financial information.
Bogdan Dumitrescu authored an article on determining how many shards are needed for your Elasticsearch index.
Christiaan Baes wrote up a tutorial on using Elasticsearch with NEST.
Michael Wulf created a tutorial on using Firebase in combination with Elasticsearch.
Thomas Ardal wrote a guest blog post for the Elasticsearch blog on rapid prototyping using the Distributed Percolator.
Chris Simpson shared an overview of Elasticsearch's aggregations feature.
Alex Brasetvik authored an introduction to Elasticsearch's aggregations feature.
Eric VanBergen posted a guide to Getting Started on Centralized Logging with Logstash, Elasticsearch and Kibana.
Olivier January performed a successful experiment using collectd, Logstash, and Kibana as a monitoring solution. (in French)
Sebastien Jarrin shared his experiences on integrating Elasticsearch with Symfony 2. (in French)

Slides & Videos

How HipChat Scaled to 1 Billion Messages per Day Using Elasticsearch

Simeon Simeonov posted his slides Swoop: Revolutionizing Search Advertising with Elasticsearch

How Facebook Uses Elasticsearch

Where to Find Us

Belgium

Leslie Hawthorn and Honza Kral will be attending FOSDEM 2014 on February 1st and 2nd. Stop by the Elasticsearch table to say hello!

Czech Republic

Honza Kral will give two presentations at DevConf.cz: Design for Cloud with Elasticsearch and Centralized Logging with Logstash. Honza's presentations take place on Friday, February 7th, and the conference runs from the 7th through the 9th.

France

David Pilato will tell you how to Make Sense of Your (BIG) Data! as part of the Human Talks series. David will be presenting in Angers on January 14th; the event starts at 7 PM.
Vladislav Pernin will present on using Elasticsearch, Logstash and Kibana in his talk Centralizing Large Volumes of Logs at the Lyon JUG. The event takes place on January 21st and doors open at 7 PM.

Germany

Michael Schneider from Jimdo will talk about Elasticsearch at the Big Data & NoSQL Meetup Hamburg on January 16th.
Alexander Reelsen will talk about Elasticsearch at the E-Commerce Hacktable in Hamburg on January 22nd. The meetup will also feature a talk from Sebastian Betz of Antevorte on their use of Elasticsearch. Doors open at 7 PM.
We will have a booth at OOP Konferenz in Munich from February 4th through the 6th. There will also be a workshop on February 5th, featuring an introduction to Elasticsearch, Logstash and Kibana.

Japan

Thanks to Jun Ohtani, the 3rd Elasticsearch Meetup will be held in Tokyo on February 7th starting at 7 PM. Please remember to register for the meetup.

Netherlands

Boaz Leskes will present From A to JSON - an Overview of Elasticsearch at the 010dev meetup in Rotterdam. Doors open tomorrow night, January 9th, at 6 PM.

United Kingdom

Mark Harwood will talk about What's New in Elasticsearch 1.0 at QCon Night in London on January 15th. Attendance is free of charge, though registration is required. Doors open at 5 PM.

United States

The first ever Elasticsearch Atlanta Meetup will take place on Wednesday, January 15th at 6:30 PM, with talks from two of Elasticsearch's core developers. Boaz Leskes will present on What's New in Elasticsearch 1.0 and Zach Tong will cover Query Optimization.
The second Silicon Valley Elasticsearch Meetup is slated for January 23rd. More details on location and talks will be available by next week - for now, just save the date!
Shay Banon will hold an open format Q&A session at the Elasticsearch Boston Meetup on February 6th. Doors open at 6 PM.
Dates are not yet confirmed, but we're planning meetups in New York City and Washington, D.C. for early February, and one in Denver late in the month. Stay tuned for further details, which we hope to have for you by next week.
We're working on setting dates for our first ever meetup in Portland, Oregon. Sign up for the Portlandia Meetup Group to get regular updates.

Where to Find You

Our Community Manager, Leslie Hawthorn, is hard at work to help folks create more Elasticsearch meetup groups and to help meetup organizers find more speakers. If you are interested in either effort, take a moment to let her know.

Oh yeah, we're also hiring. If you'd like us to find you for employment purposes, just drop us a note. We care more about your skill set and passion for Elasticsearch, Kibana and Logstash than where you rest your head.

Training

If you are interested in Elasticsearch training we have courses taught by our core developers coming up in:

New York - February 3, 2014
Stockholm - February 5, 2014
Paris - February 11, 2014
Boulder - February 24, 2014
London - February 25, 2014
San Francisco - February 27, 2014
