Elasticsearch Data Journey: Life of a Document in Elasticsearch

Ever wondered about the lifecycle of a single document in Elasticsearch? What happens when you index it? How does Elasticsearch ensure a document is reliably replicated and found across the whole cluster? Does deleting a document really remove it from disk? How do we get from a blob of text and keywords to near real-time search and analytics?

This talk will explain the when, where and why of your document's life inside Elasticsearch. Alex and Boaz will take you through the journey of a document across a cluster: taking off with JSON and its curly braces, travelling through the network into memory and through the analysis chains, heading on to storage as it is written to the transaction logs and the Apache Lucene index, being read back when searches are executed, and finally being deleted.

Although this talk covers many different aspects, it is aimed at those who may be less familiar with core search functionality. You do not need to be an Apache Lucene wizard to follow along and find this session useful.

Alexander Reelsen

Alexander is an Elasticsearch and Shield developer who is interested in all things search and scale. Having worked on Elasticsearch for about two years, he enjoys writing code, giving talks and training sessions, and introducing people to all things ELK. He loves hiking and basketball, and enjoys watching or playing whenever possible.

Boaz Leskes

Boaz is lead developer of Elasticsearch Marvel and the author of Sense, the popular front end for Elasticsearch. Boaz's background is diverse, ranging from C++ to C#, Python, Java, and sometimes even JavaScript. Based in Amsterdam, he's a fan of faceting, Lucene, monitoring, and search.