Glossary

This glossary describes essential terms and concepts to help you understand Elasticsearch and its related technologies.

k-Nearest Neighbors (kNN)

A search method that finds the k vectors most similar to a query vector by comparing the query against every item in the dataset using a chosen distance metric. Because the search is exhaustive, it is exact: it always returns the true nearest neighbors under that metric. The cost is O(n) distance computations per query, which makes it impractical for large datasets. In practice it is used for small datasets or as a ground-truth baseline for measuring the recall of approximate nearest neighbor methods.
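A minimal sketch of the exhaustive search described above, assuming cosine similarity as the metric and a small in-memory list of vectors. The function names (`exact_knn`, `recall_at_k`) and the toy dataset are illustrative, not part of any Elasticsearch API.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def exact_knn(query, vectors, k):
    """Return the k (index, similarity) pairs most similar to the query.

    Every vector is scored, so the work grows linearly with len(vectors).
    """
    scored = ((i, cosine_similarity(query, v)) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])


def recall_at_k(approx_ids, exact_ids):
    """Fraction of the exact neighbors that an approximate method also returned."""
    return len(set(approx_ids) & set(exact_ids)) / len(exact_ids)


# Hypothetical toy dataset of 2-dimensional vectors.
dataset = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]]
top = exact_knn([1.0, 0.0], dataset, k=2)
top_ids = [i for i, _ in top]

# Pretend an approximate method returned ids 0 and 3; score it against the exact result.
recall = recall_at_k([0, 3], top_ids)
```

Here the exact result serves as the ground truth: an approximate method that returns only one of the two true neighbors scores a recall of 0.5.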

Knowledge Base

A collection of documents or data that a search or RAG system retrieves from at query time. It can contain structured or unstructured content such as company documents, product manuals, or research papers. In embedding-based systems, the knowledge base is chunked, embedded, and indexed into a vector store. Its quality directly determines the quality of retrieval and generation: incomplete or noisy content yields unreliable results regardless of model quality.
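The chunk, embed, and index steps can be sketched roughly as follows. This is an illustration under stated assumptions: chunks are fixed-size word windows with overlap, the "vector store" is a plain Python list, and `embed` is a hypothetical callable standing in for a real embedding model.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into overlapping chunks of at most `size` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def build_index(documents, embed, size=50, overlap=10):
    """Chunk each document, embed each chunk, and collect the records.

    `documents` maps a document id to its text; `embed` is any function
    that turns a string into a vector (hypothetical placeholder here).
    """
    index = []
    for doc_id, text in documents.items():
        for n, chunk in enumerate(chunk_words(text, size, overlap)):
            index.append(
                {"doc_id": doc_id, "chunk": n, "text": chunk, "vector": embed(chunk)}
            )
    return index


def toy_embed(text):
    """Placeholder embedding: NOT a real model, only keeps the sketch runnable."""
    return [float(len(text))]


docs = {"manual": " ".join(f"w{i}" for i in range(12))}
index = build_index(docs, toy_embed, size=5, overlap=2)
```

Overlapping windows are one common way to avoid cutting a relevant passage in half at a chunk boundary; the right chunk size depends on the content and the embedding model.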

Knowledge Distillation

Training a smaller model (the student) to replicate the behavior of a larger, more capable model (the teacher). The student learns not only from the raw data but also from the teacher's outputs, which contain richer information about relationships between examples than hard labels do. The result is a compact model that performs closer to the teacher than the same small model would if trained on the raw data alone.
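One common way to learn from the teacher's outputs is to soften both models' predictions with a temperature and penalize the divergence between them. The sketch below assumes that formulation (a KL divergence on temperature-scaled softmax outputs); the function names and the default temperature are illustrative choices, not something this glossary prescribes.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) computed on temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical logits for a 3-class problem.
teacher = [1.0, 2.0, 3.0]
matching_student = [1.0, 2.0, 3.0]
disagreeing_student = [3.0, 2.0, 1.0]

loss_match = distillation_loss(matching_student, teacher)
loss_mismatch = distillation_loss(disagreeing_student, teacher)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. In practice this term is usually combined with the ordinary loss on the true labels, and some formulations rescale it by the squared temperature; both are omitted here for brevity.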
