Hard Negatives
Negative examples that are challenging because they look similar to the positive but are actually irrelevant. A passage that shares many keywords with a query but doesn't answer it is a hard negative. Training with hard negatives forces the model to learn deeper semantic distinctions rather than relying on surface-level word overlap.
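A minimal sketch of why hard negatives matter during training, using a triplet-style contrastive loss. The vectors, seed, and margin below are made-up illustrations, not from any real model or dataset: the "hard" negative is built to point in nearly the same direction as the query, while the "easy" one is random.

```python
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def triplet_loss(q, pos, neg, margin=0.2):
    # hinge loss: the positive must beat the negative by at least `margin`
    return max(0.0, margin - cosine(q, pos) + cosine(q, neg))

rng = np.random.default_rng(0)
query = rng.normal(size=64)
positive = query + 0.1 * rng.normal(size=64)       # actually answers the query
hard_negative = query + 0.3 * rng.normal(size=64)  # similar wording, wrong answer
easy_negative = rng.normal(size=64)                # clearly unrelated text

loss_easy = triplet_loss(query, positive, easy_negative)
loss_hard = triplet_loss(query, positive, hard_negative)
print(loss_easy, loss_hard)
```

The easy negative is already far from the query, so its loss is zero and it teaches the model nothing; the hard negative sits close to the positive and produces a nonzero loss, forcing the model to learn the finer distinction.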
High-Dimensional Space
A mathematical space with many dimensions — often hundreds or thousands. Humans can visualize two or three dimensions; embedding models work in spaces with far more. Many geometric intuitions break down in high-dimensional spaces, but the core idea holds: points that are close together are similar, and points that are far apart are different.
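One way low-dimensional intuition breaks down can be seen in a small NumPy experiment (the point count and dimensions below are arbitrary choices for illustration): pairwise distances between random points bunch together as the dimension grows, so "near" and "far" become relative rather than absolute.

```python
import numpy as np

rng = np.random.default_rng(0)

def distance_spread(dim, n=100):
    # ratio of std to mean of all pairwise distances among n random points
    pts = rng.normal(size=(n, dim))
    d = np.linalg.norm(pts[:, None, :] - pts[None, :, :], axis=-1)
    d = d[np.triu_indices(n, k=1)]  # keep each pair once
    return d.std() / d.mean()

spread_2d = distance_spread(2)       # distances vary a lot
spread_1000d = distance_spread(1000) # distances look nearly identical
print(spread_2d, spread_1000d)
```

Even so, relative comparisons survive: a point deliberately placed near another will still be measurably closer than the random background, which is all embedding search needs.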
High-Dimensional Vector
A mathematical object consisting of an ordered list of many numerical values (called components or coordinates), ranging from dozens to millions, that specifies a single point within a high-dimensional space. The two are inseparable: the vector is the concrete "resident," while the space is the abstract "container" that defines the rules for distances, angles, and relationships. A vector with 1,000 components lives in a 1,000-dimensional space, with each combination of values mapping to a unique point. In machine learning, high-dimensional vectors encode complex objects such as words or images, with relationships between objects reflected in geometric relationships between their vectors.
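The idea that semantic relationships show up as geometric relationships can be sketched with toy, hand-made vectors (purely illustrative; real embedding models produce vectors with hundreds or thousands of components, not four):

```python
import numpy as np

# each word is an ordered list of components, i.e. one point in 4-D space
emb = {
    "cat": np.array([0.9, 0.8, 0.1, 0.0]),
    "dog": np.array([0.8, 0.9, 0.2, 0.1]),
    "car": np.array([0.1, 0.0, 0.9, 0.8]),
}

def cosine(a, b):
    # the space's "rules": angles between vectors measure similarity
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

sim_cat_dog = cosine(emb["cat"], emb["dog"])
sim_cat_car = cosine(emb["cat"], emb["car"])
print(sim_cat_dog, sim_cat_car)
```

Because "cat" and "dog" were given similar component values, their vectors point in nearly the same direction, while "car" points elsewhere: the semantic grouping is readable directly from the geometry.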
Hybrid Search
A search approach that combines semantic search (using embeddings) with lexical search (using keyword matching, typically BM25). By blending both signals, hybrid search captures the strengths of each: the meaning-awareness of embeddings and the precision of keyword matching.
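One common way to blend the two signals is reciprocal rank fusion (RRF), which merges ranked lists without needing to normalize the raw BM25 and cosine scores against each other. A minimal sketch, with made-up document ids and rankings for illustration:

```python
def rrf(rankings, k=60):
    # rankings: one ranked list of doc ids per retriever, best first;
    # each doc earns 1/(k + rank) from every list it appears in
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical_ranking = ["d3", "d1", "d7"]   # e.g. from BM25 keyword matching
semantic_ranking = ["d3", "d9", "d1"]  # e.g. from embedding similarity
fused = rrf([lexical_ranking, semantic_ranking])
print(fused)
```

Documents ranked well by both retrievers ("d3", "d1") rise to the top, while documents found by only one signal still remain in the result list; the constant `k` damps the influence of any single retriever's top ranks.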