Glossary

This glossary describes essential terms and concepts to help you understand Elasticsearch and its related technologies.

NDCG (Normalized Discounted Cumulative Gain)

A ranking metric that measures how well a system places relevant results near the top of a ranked list. It accumulates relevance gains across positions, applies a logarithmic discount to results that appear lower in the ranking, and normalizes the total against the score of an ideal ordering, hence the name. Unlike binary metrics such as recall, NDCG supports graded relevance, where some results are more relevant than others. Scores range from 0 to 1, with 1 representing a perfect ranking. NDCG is widely used as a headline metric in retrieval benchmarks such as MTEB.
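The computation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes linear gain (the relevance grade itself, where some variants use 2^rel - 1), a log base 2 discount, and an ideal ordering built from the same list of judged results. The function names `dcg` and `ndcg` are chosen here for illustration.

```python
import math

def dcg(relevances, k=None):
    """Discounted cumulative gain over the top-k relevance grades."""
    rels = relevances[:k] if k is not None else relevances
    # Position i (0-based) is discounted by log2(i + 2), so rank 1 divides by 1.
    return sum(rel / math.log2(i + 2) for i, rel in enumerate(rels))

def ndcg(relevances, k=None):
    """DCG of the given ranking divided by the DCG of the ideal ranking."""
    # Assumption: the ideal ordering is the same grades sorted descending.
    ideal_dcg = dcg(sorted(relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant results at all
    return dcg(relevances, k) / ideal_dcg

# Relevance grades in the order the system returned the results.
perfect = ndcg([3, 2, 1, 0])
swapped = ndcg([0, 1])
```

Here `perfect` is 1.0 because the results are already in ideal order, while `swapped` is below 1 because the only relevant result sits in second place.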

Negative Pair

Two pieces of content that should be treated as dissimilar during training, for example a question paired with an unrelated passage. The model learns to produce embeddings that are far apart for negative pairs.

Negative Sampling

The process of selecting negative examples for contrastive training. The quality of negatives significantly affects model performance. Random negatives are easy for the model to tell apart from positives and provide little learning signal; carefully selected hard negatives, which resemble the positive but are not relevant, drive meaningful improvement.
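One common way to find hard negatives is to rank candidate passages by similarity to the query and keep the highest-scoring ones that are not labeled positive. The sketch below illustrates that idea with plain Python lists. The names `cosine` and `mine_hard_negatives` and the toy vectors are hypothetical, and the code assumes all vectors are non-zero.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def mine_hard_negatives(query_vec, candidate_vecs, positive_ids, k=2):
    """Return indices of the k non-positive candidates most similar to the query."""
    scored = [
        (cosine(query_vec, vec), idx)
        for idx, vec in enumerate(candidate_vecs)
        if idx not in positive_ids  # never pick a known positive as a negative
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [idx for _, idx in scored[:k]]

# Toy data (illustrative only): candidate 0 is the labeled positive.
query = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]]
hard_negatives = mine_hard_negatives(query, candidates, positive_ids={0}, k=2)
```

In this toy example the selected indices are 1 and 2, the non-positive candidates closest to the query, while the clearly dissimilar candidate 3 is left out.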

NLP (Natural Language Processing)

A field of artificial intelligence concerned with enabling machines to understand, process, and generate human language. NLP underpins most tasks in text-based AI systems, from tokenization and parsing to semantic understanding and generation. Modern NLP is dominated by transformer-based models, which have largely replaced earlier rule-based and statistical approaches.

Normalization

Scaling a vector so its length (L2 norm) equals 1, producing a unit vector. When both vectors are normalized, cosine similarity and dot product give the same value, which simplifies computation. Many embedding models output normalized vectors by default, but this varies by model and is worth checking.
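The equivalence between cosine similarity and dot product after normalization can be checked directly. The following is a small sketch using plain Python lists. The helper names `l2_normalize`, `dot`, and `cosine` are illustrative and do not belong to any particular library.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def l2_normalize(vec):
    """Scale vec so that its L2 norm is 1."""
    norm = math.sqrt(dot(vec, vec))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]

def cosine(a, b):
    """Cosine similarity computed from the unnormalized vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a, b = [3.0, 4.0], [1.0, 2.0]
ua, ub = l2_normalize(a), l2_normalize(b)

# For unit vectors, the plain dot product matches cosine similarity
# of the original vectors (up to floating-point rounding).
same = math.isclose(dot(ua, ub), cosine(a, b))
```

Because the division by the norms happens once at normalization time, a similarity search over unit vectors only needs the dot product for each comparison.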
