Author’s articles
September 19, 2024
Evaluating search relevance part 2 - Phi-3 as relevance judge
Using the Phi-3 language model as a relevance judge, with tips & techniques to improve the agreement with human-generated annotation
July 16, 2024
Evaluating search relevance part 1 - The BEIR benchmark
Learn to evaluate your search system in the context of better understanding the BEIR benchmark, with tips & techniques to improve your search evaluation processes.
May 3, 2024
Evaluating scalar quantization in Elasticsearch
Learn how scalar quantization can be used to reduce the memory footprint of vector embeddings in Elasticsearch through an experiment.
December 1, 2023
RAG evaluation metrics: A journey through metrics
Explore RAG evaluation metrics like BLEU score, ROUGE score, PPL, BARTScore, and more. Discover how Elastic is evaluating RAG with UniEval.