Glossary

This glossary describes essential terms and concepts to help you understand Elasticsearch and its related technologies.

k-Nearest Neighbors (kNN)

A search method that finds the k vectors most similar to a query vector by comparing the query against every item in the dataset using a chosen distance metric, such as cosine or Euclidean distance. This brute-force form of kNN is exhaustive and exact, so it always returns the true nearest neighbors under that metric, but its cost grows linearly with dataset size, O(n) distance computations per query, which makes it impractical for large datasets. In practice it is used for small datasets or as a ground-truth baseline for measuring the recall of approximate nearest neighbor (ANN) methods.
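The exhaustive scan described above can be sketched in a few lines of plain Python. This is a minimal illustration, not how any particular search engine implements it: the function names `cosine_distance` and `knn_search`, the choice of cosine distance, and the tiny dataset are all assumptions made for the example.

```python
import heapq
import math

def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two equal-length vectors.

    Assumes neither vector is all zeros; a zero vector would divide by zero.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def knn_search(query, vectors, k, distance=cosine_distance):
    """Return the k (index, distance) pairs closest to query, nearest first."""
    # Exhaustive scan: one distance computation per stored vector, O(n) per query.
    scored = ((distance(query, v), i) for i, v in enumerate(vectors))
    nearest = heapq.nsmallest(k, scored)
    return [(i, d) for d, i in nearest]

# Hypothetical 2-dimensional dataset, purely for illustration.
dataset = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.9, 0.1],
    [-1.0, 0.0],
]
results = knn_search([1.0, 0.0], dataset, k=2)
print(results)
```

Because every stored vector is scored, the result is exact, which is why a scan like this can serve as the ground truth when measuring the recall of an approximate method.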

Knowledge Base

A collection of documents or data that a search or retrieval-augmented generation (RAG) system retrieves from at query time. It can contain structured or unstructured content such as company documents, product manuals, or research papers. In embedding-based systems, the knowledge base is chunked, embedded, and indexed into a vector store. Its quality puts a ceiling on the quality of retrieval and generation: incomplete or noisy content yields unreliable results no matter how capable the model is.
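The chunk, embed, and index steps mentioned above can be sketched as follows. Everything here is a stand-in chosen for illustration: `toy_embed` is a hashed bag-of-words placeholder rather than a real embedding model, the in-memory list plays the role of a vector store, and the documents and the `max_words` chunk size are hypothetical.

```python
import hashlib
import math

def chunk_text(text, max_words=8):
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def toy_embed(text, dims=64):
    """Placeholder embedding: hashed bag of words, L2-normalized. Not a real model."""
    vec = [0.0] * dims
    for word in text.lower().split():
        bucket = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dims
        vec[bucket] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def build_index(documents, max_words=8):
    """Chunk each document, embed each chunk, and store the result in a list."""
    index = []
    for doc_id, text in documents.items():
        for chunk in chunk_text(text, max_words):
            index.append({"doc_id": doc_id, "text": chunk, "vector": toy_embed(chunk)})
    return index

def retrieve(index, query, k=1):
    """Return the k (similarity, entry) pairs with the highest dot-product similarity."""
    q = toy_embed(query)
    scored = [(sum(a * b for a, b in zip(q, e["vector"])), e) for e in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hypothetical knowledge base content.
documents = {
    "manual": "Press the reset button for five seconds to restore factory settings on the router",
    "policy": "Employees may work remotely up to three days per week with manager approval",
}
index = build_index(documents)
for score, entry in retrieve(index, "reset button", k=2):
    print(round(score, 3), entry["doc_id"], "|", entry["text"])
```

In a real system the placeholder embedding would be replaced by a trained model and the list by a vector index, but the order of the steps stays the same.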

Knowledge Distillation

Training a smaller model (the student) to replicate the behavior of a larger, more capable model (the teacher). The student learns not only from the raw training data but also from the teacher's outputs, such as its predicted probabilities, which carry richer information about the relationships between examples than hard labels do. The result is a compact model that performs closer to the teacher than the same small model would if trained on the raw data alone.
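One common way to make the student learn from the teacher's outputs is to compare temperature-softened probability distributions. The sketch below assumes that formulation (a KL divergence scaled by the squared temperature); the glossary entry does not prescribe a specific loss, and the function names, the temperature value, and the logits are illustrative only.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature; higher values soften the output."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2

# Hypothetical logits for a single 3-class example.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]   # nearly agrees with the teacher
far_student = [0.5, 4.0, 1.0]     # prefers a different class

loss_close = distillation_loss(close_student, teacher)
loss_far = distillation_loss(far_student, teacher)
print(loss_close, loss_far)
```

The loss shrinks as the student's distribution approaches the teacher's, including on the classes the teacher considers unlikely, which is the extra signal a hard label does not provide. In practice this term is usually combined with an ordinary loss on the true labels.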
