Thomas Veasey

Distinguished MLE I | Data and Compute - Elasticsearch

Posts by this author

K-means for building vector indices

June 30, 2025

We discuss optimizing k-means to efficiently create high-quality vector indices.

Robust Optimized Scalar Quantization

May 31, 2025

We discuss a sparse preconditioner that, when applied to vectors, makes quantization performance more stable across data distributions.

Speeding up merging of HNSW graphs

April 7, 2025

Explore the work we’ve been doing to reduce the overhead of building multiple HNSW graphs, particularly reducing the cost of merging graphs.

Improve search results by calibrating model scoring in Elasticsearch

December 23, 2024

Learn how to use annotated data to calibrate semantic model scoring for better search results.

Understanding optimized scalar quantization

December 19, 2024

In this post, we explain a new form of scalar quantization we've developed at Elastic that achieves state-of-the-art accuracy for binary quantization.

Exploring depth in a 'retrieve-and-rerank' pipeline

December 5, 2024

Select an optimal re-ranking depth for your model and dataset.

Introducing Elastic Rerank: Elastic's new semantic re-ranker model

November 25, 2024

Learn how Elastic's new re-ranker model was trained and how it performs.

What is semantic reranking and how to use it?

October 29, 2024

An introduction to semantic reranking. Learn about the trade-offs of using semantic reranking in search and RAG pipelines.

Evaluating search relevance part 2 - Phi-3 as relevance judge

September 19, 2024

Using the Phi-3 language model as a search relevance judge, with tips & techniques to improve the agreement with human-generated annotation.

Evaluating search relevance part 1 - The BEIR benchmark

July 16, 2024

Learn to evaluate your search system in the context of better understanding the BEIR benchmark, with tips & techniques to improve your search evaluation processes.

Evaluating scalar quantization in Elasticsearch

May 3, 2024

Learn how scalar quantization can be used to reduce the memory footprint of vector embeddings in Elasticsearch through an experiment.

Understanding Int4 scalar quantization in Lucene

April 25, 2024

This blog explains how int4 quantization works in Lucene, how it lines up, and the benefits of using int4 quantization.

Scalar quantization optimized for vector databases

April 25, 2024

Optimizing scalar quantization for the vector database use case allows us to achieve significantly better performance for the same retrieval quality at high compression ratios.

Speeding Up Multi-graph Vector Search

March 12, 2024

Explore multi-graph vector search in Lucene and discover how sharing information between segment searches enhances search speed.

RAG evaluation metrics: A journey through metrics

December 1, 2023

Explore RAG evaluation metrics like BLEU score, ROUGE score, PPL, BARTScore, and more. Discover how Elastic is evaluating RAG with UniEval.

Improving information retrieval in the Elastic Stack: Improved inference performance with ELSER v2

October 17, 2023

Learn about the improvements we've made to the inference performance of ELSER v2, achieving a 60% to 120% speed increase over ELSER v1.

Improving information retrieval in the Elastic Stack: Optimizing retrieval with ELSER v2

October 17, 2023

Learn how we are reducing the retrieval costs of the Learned Sparse EncodeR (ELSER) v2.

Improving information retrieval in the Elastic Stack: Hybrid retrieval

July 20, 2023

In this blog we introduce hybrid retrieval and explore two concrete implementations in Elasticsearch. We explore improving Elastic Learned Sparse Encoder’s performance by combining it with BM25 using Reciprocal Rank Fusion and Weighted Sum of Scores.

Improving information retrieval in the Elastic Stack: Benchmarking passage retrieval

July 13, 2023

In this blog post, we'll examine benchmark solutions to compare retrieval methods. We use a collection of data sets to benchmark BM25 against two dense models and illustrate the potential gain using fine-tuning strategies with one of those models.

Improving information retrieval in the Elastic Stack: Steps to improve search relevance

July 13, 2023

In this first blog post, we will list and explain the differences between the primary building blocks available in the Elastic Stack to do information retrieval.

Improving information retrieval in the Elastic Stack: Introducing Elastic Learned Sparse Encoder, our new retrieval model

June 21, 2023

Learn about the Elastic Learned Sparse Encoder (ELSER), its retrieval performance, architecture, and training process.

Aggregate data faster with the new random_sampler aggregation

April 20, 2022

Aggregate billions of documents in milliseconds instead of minutes with Elastic. Learn more about how the new random_sampler aggregation gives you statistically robust results at a lower cost.
