Elasticsearch Data Journey: Life of a Document in Elasticsearch

Ever wondered about the lifecycle of a single document in Elasticsearch? What happens when you index it? How does Elasticsearch ensure a document is replicated and found across the whole cluster reliably? Does deleting a document really physically remove the document from disk? How do we get from a blob of text and keywords to near real-time search and analytics?

This talk will explain the when, where and why of your document's life inside Elasticsearch. Alex and Boaz will take you through the journey of a document across a cluster: taking off with JSON and the curly braces, travelling over the network into memory and through the analysis chains, heading on to storage as it is written to the transaction logs and the Apache Lucene index, being read back by searches, and continuing all the way until the document is finally deleted.

Even though this talk covers a lot of different aspects, it is aimed at those who may be less familiar with core search functionality. You do not need to be an Apache Lucene wizard to follow along and find this session useful.

Alexander Reelsen

Alexander is an Elasticsearch developer interested in all things search and scale. He enjoys writing code, giving talks and trainings, as well as introducing people to all parts of the Elastic Stack. When offline, he goes hiking, watches basketball, and tries to get online again.

Boaz Leskes

Boaz is a core Elasticsearch developer. When not working on consensus algorithms, cluster state changes, data replication, and sequence numbers, you can find him at the ping pong table, playing office DJ, collaborating with colleagues on Zoom, or, if it's Friday, eating hummus.