Glossary

This glossary describes essential terms and concepts to help you understand Elasticsearch and its related technologies.

Embedding

A vector produced by a trained neural network to represent the meaning of a piece of content, such as a sentence, a paragraph, an image, or a code snippet. Unlike a vector of hand-picked features, an embedding is learned during training. Two pieces of content with similar meanings will produce embeddings that are close together in vector space.
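"Close together in vector space" is typically measured with cosine similarity. A minimal sketch, using hypothetical 3-dimensional toy vectors (real embeddings usually have hundreds of dimensions):

```python
import math

def cosine_similarity(a, b):
    # Dot product divided by the product of the vector magnitudes.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings standing in for real model outputs.
cat = [0.9, 0.1, 0.2]        # "a cat sat on the mat"
kitten = [0.85, 0.15, 0.25]  # "a kitten rested on the rug"
invoice = [0.1, 0.9, 0.8]    # "quarterly invoice totals"

# Sentences with similar meaning score higher than unrelated ones.
assert cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice)
```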

Embedding Distillation

Knowledge distillation applied specifically to embedding models. A smaller embedding model is trained to produce embeddings that match those of a larger, higher-quality model. The result is a lightweight model that generates embeddings nearly as good as the large model's, at a fraction of the computational cost.
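The training objective is often as simple as minimizing the mean squared error between the student's and the teacher's embeddings for the same input. A minimal sketch (the vectors below are hypothetical stand-ins for model outputs):

```python
def distillation_loss(student_emb, teacher_emb):
    # Mean squared error between student and teacher embeddings.
    n = len(student_emb)
    return sum((s - t) ** 2 for s, t in zip(student_emb, teacher_emb)) / n

# Hypothetical embeddings of the same sentence from both models.
teacher = [0.8, -0.2, 0.5, 0.1]
student = [0.7, -0.1, 0.4, 0.2]

# Training updates the student's weights to drive this loss toward 0.
loss = distillation_loss(student, teacher)
```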

Embedding Model

A model designed to convert input data (text, images, or code) into fixed-size dense vectors optimized for search, similarity, and retrieval. Unlike general language models, embedding models are trained with objectives such as contrastive learning that pull similar inputs closer together and push dissimilar ones apart in vector space, making their representations directly comparable.
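One common form of this "pull together, push apart" objective is a triplet-style loss over similarity scores. A minimal sketch (the margin value and toy vectors are illustrative assumptions, not a specific model's recipe):

```python
import math

def cos(a, b):
    # Cosine similarity between two vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    # Zero loss once the positive is at least `margin` more similar
    # to the anchor than the negative; positive loss otherwise.
    return max(0.0, margin - cos(anchor, positive) + cos(anchor, negative))

# A well-separated triplet incurs no loss; a confused one does.
ok = triplet_loss([1.0, 0.0], [0.9, 0.1], [0.0, 1.0])   # 0.0
bad = triplet_loss([1.0, 0.0], [0.0, 1.0], [1.0, 0.0])  # > 0
```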

Embedding Space

The mathematical space in which all embeddings from a given model live. Think of it as a coordinate system where each point represents a piece of content. Content with similar meaning clusters together; unrelated content sits far apart. How well this space organizes meaning is what separates a good embedding model from a poor one.

Encoder

In its original conception, an encoder converts its input into a machine-usable form. In transformer-based models, the encoder reads and processes input text, producing a rich internal representation. Encoder models process the full input at once and are naturally suited to tasks that require understanding the meaning of text, such as generating embeddings.

Encoder-Only Model

A model built using only the encoder component of the transformer. These models (such as BERT) are optimized for understanding text rather than generating it, making them the traditional choice for embedding and classification tasks. They process the input bidirectionally — each token can attend to every other token.

Endpoint

A specific URL that accepts requests for a particular function. For example, an embeddings endpoint generates vectors, a reranking endpoint reorders results. Each endpoint expects inputs in a defined format.
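As an illustration, Elasticsearch exposes inference endpoints whose URL identifies the task. A hedged sketch in console syntax; the inference ID `my-embedding-model` is a hypothetical placeholder for an endpoint you have configured:

```
POST _inference/text_embedding/my-embedding-model
{
  "input": "What is an embedding?"
}
```

The response contains the generated vector(s); a reranking endpoint would instead expect a query plus a list of documents to reorder.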

Euclidean Distance

The straight-line distance between two points in a multi-dimensional vector space. It’s calculated as the square root of the sum of squared differences across all dimensions (L2 norm). Smaller distances indicate greater similarity. Less commonly used for text embeddings than cosine similarity, as it is sensitive to vector magnitude rather than direction.
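The calculation described above can be sketched in a few lines:

```python
import math

def euclidean_distance(a, b):
    # Square root of the sum of squared per-dimension differences (L2 norm).
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# The classic 3-4-5 right triangle: distance from the origin to (3, 4) is 5.
d = euclidean_distance([0.0, 0.0], [3.0, 4.0])  # 5.0
```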

Evaluation Metric

A measure that quantifies a specific aspect of model performance on a task. Different tasks require different metrics: retrieval tasks use normalized discounted cumulative gain (NDCG) and recall; classification tasks use accuracy or F1, which balances precision and recall. Metric choice is consequential: it determines which behaviors are rewarded during evaluation and can obscure weaknesses that matter in production but are not captured by the chosen measure.
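As an example, NDCG discounts each relevant result by how far down the ranking it appears, then normalizes against the best possible ordering. A minimal sketch with hypothetical graded relevance labels:

```python
import math

def dcg(relevances):
    # Discounted cumulative gain: relevance at 1-based rank r is
    # divided by log2(r + 1).
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(relevances))

def ndcg(relevances):
    # Normalize by the DCG of the ideal (descending-relevance) ordering.
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0

# A perfectly ordered result list scores 1.0; misordering lowers the score.
perfect = ndcg([3, 2, 1])   # 1.0
flawed = ndcg([0, 1, 3])    # < 1.0
```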

Ready to build state-of-the-art search experiences?

Sufficiently advanced search isn't achieved alone. Elasticsearch is powered by data scientists, ML ops, engineers, and many others who are just as passionate about search as you are. Let's connect and work together to build the magical search experience that gets you the results you want.

Try it yourself