Elasticsearch Data Journey: Life of a Document in Elasticsearch

Ever wondered about the lifecycle of a single document in Elasticsearch? What happens when you index it? How does Elasticsearch ensure a document is replicated and found across the whole cluster reliably? Does deleting a document really physically remove the document from disk? How do we get from a blob of text and keywords to near real-time search and analytics?

This talk will explain the when, where and why of your document's life inside Elasticsearch. Alex and Boaz will take you through the journey of a document across a cluster: taking off with JSON and the curly braces, travelling through the network into memory and all the analysis chains, heading on to storage as it is written to the transaction logs and the Apache Lucene index, being read back by executing searches, all the way until the document is finally deleted.

Although this talk will cover many different aspects, it is aimed at those who may be less familiar with core search functionality. You do not need to be an Apache Lucene wizard to follow along and find this session useful.

Alexander Reelsen

Alexander is an Elasticsearch developer interested in all things search and scale. He enjoys writing code, giving talks and trainings, as well as introducing people to all parts of the Elastic Stack. When offline, he goes hiking, watches basketball, and tries to get online again.

Boaz Leskes

Boaz is lead developer of Elasticsearch Marvel and the author of Sense, the popular front end for Elasticsearch. Boaz's background is diverse, ranging from C++ to C#, Python, Java, and sometimes even JavaScript. Based in Amsterdam, he's a fan of faceting, Lucene, monitoring, and search.