Introduction to Embeddings

In Machine Learning, an embedding is a vector (an array of numbers) that represents real-world objects such as words, sentences, images or videos. The interesting property that these embeddings have is that two embeddings that represent similar or related real-world entities will share some similarities as well, so embeddings can be compared, and a distance between them can be calculated.

When thinking specifically in terms of an application for searching, performing a search of embeddings in the vector space tends to find results that are more related to concepts, instead of to the exact keywords typed in the search prompt.

In this section of the tutorial you are going to learn how to generate embeddings using freely available machine learning models, then you will use Elasticsearch's vector database support to store and search these embeddings. And towards the end, you will also combine vector and full-text search results and create a powerful hybrid search solution that offers the best of both approaches.