Glossary

This glossary describes essential terms and concepts to help you understand Elasticsearch and its related technologies.

Task-Specific Prefix

A short text instruction prepended to the input that tells the embedding model what task is being performed. Prefixing a query with "search_query:" or a document with "search_document:" helps the model produce embeddings optimized for that use case. Jina Embeddings v3 uses task-specific prefixes to tailor embedding behavior.
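
For instance, a retrieval pipeline might embed the two sides of a search with different prefixes. Below is a minimal sketch; embed stands in for a hypothetical embedding model call, and the prefix strings follow the convention quoted above.

```python
def embed(text: str) -> list[float]:
    # Placeholder for a real embedding model call (hypothetical stub).
    raise NotImplementedError

def embed_query(text: str) -> list[float]:
    # The prefix tells the model this input is a search query.
    return embed("search_query: " + text)

def embed_document(text: str) -> list[float]:
    # The prefix tells the model this input is a document to be retrieved.
    return embed("search_document: " + text)
```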

Teacher Model

The larger, more capable model in a knowledge distillation setup. The teacher provides reference outputs that the student model learns to imitate.
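
One common way to express "learns to imitate" is a loss that penalizes the student for straying from the teacher's output. The sketch below uses mean squared error between embeddings; it is one possible objective among several, not the only distillation recipe.

```python
import numpy as np

def distillation_loss(student_emb: np.ndarray, teacher_emb: np.ndarray) -> float:
    # Mean squared error: how far the student's embedding drifts
    # from the teacher's reference embedding.
    return float(np.mean((student_emb - teacher_emb) ** 2))

teacher_out = np.array([0.12, -0.40, 0.33])  # teacher's reference (toy values)
student_out = np.array([0.10, -0.35, 0.30])  # student's attempt
print(distillation_loss(student_out, teacher_out))  # small value = close imitation
```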

Text Embedding

A fixed-size dense vector representation of a piece of text, ranging from a single word to a full paragraph. Text embeddings encode linguistic information into a vector space where geometric relationships reflect meaning: similar texts cluster together, and operations like analogy and search become computable through vector arithmetic. They are the output artifact produced by embedding models.
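
The "geometric relationships reflect meaning" part is usually measured with cosine similarity. Here is a small illustration with made-up three-dimensional vectors; real embeddings have hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # 1.0 means the vectors point the same way; values near 0 mean unrelated.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

cat     = np.array([0.9, 0.1, 0.0])  # toy embedding for "cat"
kitten  = np.array([0.8, 0.2, 0.1])  # toy embedding for "kitten"
invoice = np.array([0.0, 0.1, 0.9])  # toy embedding for "invoice"

print(cosine_similarity(cat, kitten))   # high: similar meanings cluster
print(cosine_similarity(cat, invoice))  # low: unrelated meanings sit apart
```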

Throughput

The number of items a system processes per unit of time, measured in embeddings per second for embedding pipelines. Throughput is maximized by batching inputs, allowing the model to process multiple items in parallel on the same hardware. It exists in tension with latency: batching improves throughput but increases the time any individual request waits, making the tradeoff a central consideration in production system design.
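
A rough sketch of how throughput is typically measured, assuming a hypothetical embed_batch function that embeds a list of texts in one call:

```python
import time

def measure_throughput(items: list[str], embed_batch, batch_size: int) -> float:
    # Returns items processed per second for a given batch size.
    start = time.perf_counter()
    for i in range(0, len(items), batch_size):
        embed_batch(items[i : i + batch_size])
    elapsed = time.perf_counter() - start
    return len(items) / elapsed
```

Raising batch_size usually raises the embeddings-per-second number, but any single request now waits for its whole batch to finish, which is the latency tradeoff described above.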

Token

The smallest unit of text that a model processes. A token might be a whole word, part of a word, or a single character, depending on the tokenizer. The word "embeddings" might be split into "embed" and "dings" as two tokens. Models operate on tokens, not words, which is why token counts differ from word counts.

Tokenization

The process of breaking text into tokens before the text is fed to a model. Different models use different tokenization strategies, and the same sentence can produce different token counts depending on the tokenizer. Tokenization determines what the model actually sees.
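
You can see the effect by running the same sentence through two tokenizers. The sketch below uses the Hugging Face transformers library; the model IDs are just common examples, and the exact splits will vary.

```python
from transformers import AutoTokenizer

sentence = "Tokenization determines what the model actually sees."

# The same sentence, two tokenizers, two different token counts.
for model_id in ["bert-base-uncased", "gpt2"]:
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    tokens = tokenizer.tokenize(sentence)
    print(model_id, len(tokens), tokens)
```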

Training Triplet

A set of three items used in training: an anchor (e.g., a query), a positive (a relevant result), and a negative (an irrelevant result). The model is trained so the anchor's embedding is closer to the positive than to the negative.
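
In code, a triplet is just three pieces of text bundled together. A minimal illustration, with invented example strings:

```python
from typing import NamedTuple

class Triplet(NamedTuple):
    anchor: str    # e.g. a user query
    positive: str  # a relevant result
    negative: str  # an irrelevant result

example = Triplet(
    anchor="how do I reset my password",
    positive="Step-by-step password reset instructions",
    negative="Quarterly earnings report, fiscal year 2023",
)
```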

Transformer

The neural network architecture behind virtually all modern embedding and language models. Introduced in 2017, transformers process all parts of an input simultaneously rather than sequentially, and use attention mechanisms to understand relationships between words regardless of their position in the text.
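
The attention mechanism at the core of the architecture can be written in a few lines. This is a simplified single-head sketch in NumPy, leaving out the learned projections, multiple heads, and layering of a real transformer:

```python
import numpy as np

def attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray) -> np.ndarray:
    # Scaled dot-product attention: every position scores every other
    # position at once, regardless of distance in the text.
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over positions
    return weights @ V  # each position becomes a weighted mix of all values

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))      # 4 positions, 8-dimensional vectors
print(attention(x, x, x).shape)  # (4, 8): same shape, context-mixed content
```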

Triplet Loss

A loss function that operates on training triplets (anchor, positive, negative). It penalizes the model when the distance between the anchor and positive is not sufficiently smaller than the distance between the anchor and negative. The required gap is controlled by a margin parameter.
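
Written out, the function is max(0, d(anchor, positive) - d(anchor, negative) + margin). A small NumPy sketch using Euclidean distance:

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin: float = 0.2) -> float:
    d_pos = np.linalg.norm(anchor - positive)  # anchor-to-positive distance
    d_neg = np.linalg.norm(anchor - negative)  # anchor-to-negative distance
    # Loss hits zero once the negative is at least `margin` farther away.
    return float(max(0.0, d_pos - d_neg + margin))

a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # close to the anchor
n = np.array([1.0, 1.0])   # far from the anchor
print(triplet_loss(a, p, n))  # 0.0: the required gap is already satisfied
```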

Ready to build state-of-the-art search experiences?

Sufficiently advanced search can't be achieved through the efforts of a single person. Elasticsearch is powered by data scientists, ML Ops experts, engineers, and many others who are just as passionate about search as you are. Let's connect and work together to build the magical search experience that gets you the results you want.

Try it out yourself