Up to 3x faster stored-vector queries in Elasticsearch


Finding documents similar to a stored vector in Elasticsearch used to require two round trips: fetch the vector with GET, then send it back in a k-nearest neighbor (kNN) query. Elasticsearch 9.4 collapses that flow into one request with query_vector_builder.lookup, simplifying the API and reducing latency by up to 3x in a two-node Google Cloud Platform (GCP) benchmark.

Why stored-vector search used to require two requests

Previously, when you wanted to find documents similar to a stored vector, you needed to:

Call GET to fetch the vector value from Elasticsearch.
Call _search with that vector value, serialized as JSON in the request body.

This means paying serialization and network costs twice:

Serialization and deserialization of the vector for both requests.
Network latency costs in both directions.
Potential egress costs in cloud deployments.
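
To get a rough feel for the payload involved, here is a small sketch that measures how many bytes a vector occupies once serialized as JSON. The 768-dimension vector and its random values are assumptions chosen for illustration, not figures from the benchmark.

```python
import json
import random


def json_payload_bytes(vector):
    """Return the size in bytes of the vector serialized as compact JSON."""
    return len(json.dumps(vector, separators=(",", ":")).encode("utf-8"))


# Hypothetical 768-dimensional embedding; the dimension is an assumption.
random.seed(0)
vector = [random.uniform(-1.0, 1.0) for _ in range(768)]

one_way = json_payload_bytes(vector)
# The get-then-search pattern moves the vector at least twice:
# once in the GET response and once in the _search request.
print(f"one way: {one_way} bytes, both ways: {2 * one_way} bytes")
```

The exact size depends on the dimension count and on how many digits each float prints with, so treat the output as an order-of-magnitude estimate.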

In Python, the pattern would be:
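
A minimal sketch of that two-step pattern follows. It assumes `es` behaves like the official `elasticsearch` Python client (8.x-style keyword arguments) and that the vector field is present in the returned `_source`. The helper name `get_then_knn` and the `num_candidates` value are hypothetical.

```python
from typing import Any


def get_then_knn(es: Any, index: str, doc_id: str, field: str, k: int = 10) -> Any:
    """Two round trips: fetch the stored vector, then send it back in a kNN search.

    `es` is assumed to behave like the official `elasticsearch.Elasticsearch` client.
    """
    # Round trip 1: fetch the document and pull the vector out of _source.
    doc = es.get(index=index, id=doc_id)
    vector = doc["_source"][field]

    # Round trip 2: serialize the same vector back to the cluster as the query vector.
    return es.search(
        index=index,
        knn={
            "field": field,
            "query_vector": vector,
            "k": k,
            "num_candidates": 10 * k,  # assumed value; tune for your recall needs
        },
    )
```

Against a live cluster this might be called as `get_then_knn(Elasticsearch("http://localhost:9200"), "seed-products", "product-123", "product-vector")`, where the URL is a placeholder.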

While these two calls seem cheap, the overhead is unnecessary. Let’s make this better.

How query_vector_builder.lookup works in Elasticsearch 9.4

In Elasticsearch 9.4, we added lookup to simplify the API and eliminate unnecessary costs:
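
A sketch of what such a request body might look like, written as a Python dict. The `query_vector_builder.lookup` name comes from this post, but the keys inside `lookup` (`index`, `id`, `path`) are assumptions, as are the `k` and `num_candidates` values. Check the Elasticsearch 9.4 reference for the exact parameter names.

```python
from typing import Any, Dict


def lookup_knn_body(index: str, doc_id: str, path: str, field: str, k: int = 10) -> Dict[str, Any]:
    """Build a kNN clause whose query vector is resolved server-side by lookup.

    ASSUMPTION: the keys inside "lookup" ("index", "id", "path") are
    illustrative and may not match the released API exactly.
    """
    return {
        "field": field,  # dense_vector field being searched
        "k": k,
        "num_candidates": 10 * k,  # assumed value
        "query_vector_builder": {
            "lookup": {"index": index, "id": doc_id, "path": path}
        },
    }


knn = lookup_knn_body("seed-products", "product-123", "product-vector", "product-vector")
# With the official Python client, this could be sent in a single request, e.g.:
#   es.search(index="seed-products", knn=knn)
# Searching the same index that stores the seed vector is an assumption here.
```

Note that no `query_vector` appears in the body: the client never sees or serializes the vector.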

This request now grabs the dense_vector value stored in the product-vector field, in the document with ID product-123 in the seed-products index. This example is a “more like this” search, finding the nearest vectors to the one with ID product-123. You can refer to any index, effectively using lookup as a query vector store.

How much latency lookup vector search can remove

The goal is to simplify the experience and make it faster. The performance gains aren't just from eliminating the client round trip. Many Elasticsearch deployments run on multiple nodes, and traffic between nodes carries its own serialization and network costs. Elasticsearch actively biases execution toward the local node, which cuts those costs on the server side, too.

To illustrate the potential improvement, here’s a benchmark we ran. We used a modified version of our so_vector benchmark: instead of using its query vectors, one path used the GET-then-_search pattern and the other used lookup. Running on two nodes in the same zone in GCP, lookup cut median (p50) latency by about 3.3x and p99 latency by about 3.4x. Even when nodes are within the same data center and the same availability zone, network and serialization costs can have a real impact.

Percentile	get-then-knn (ms)	lookup-knn (ms)	Reduction	Speedup
p50	10.3796	3.14093	69.74%	3.30x
p90	25.4429	5.89807	76.82%	4.31x
p99	27.7167	8.07109	70.88%	3.43x
max (p100)	28.522	12.6497	55.65%	2.25x
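
The Reduction and Speedup columns follow directly from the two latency columns. A short sketch that recomputes them from the table's values (the function names are illustrative):

```python
def reduction_pct(before_ms: float, after_ms: float) -> float:
    """Percentage of latency removed: (1 - after / before) * 100."""
    return (1.0 - after_ms / before_ms) * 100.0


def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the new path is: before / after."""
    return before_ms / after_ms


# (get-then-knn ms, lookup-knn ms) per percentile, copied from the table above.
rows = {
    "p50": (10.3796, 3.14093),
    "p90": (25.4429, 5.89807),
    "p99": (27.7167, 8.07109),
    "max": (28.522, 12.6497),
}

for name, (before, after) in rows.items():
    print(f"{name}: {reduction_pct(before, after):.2f}% reduction, "
          f"{speedup(before, after):.2f}x speedup")
```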

This benchmark ran against 2M documents, and the improvement you see will depend on your overall search costs: the more time the search itself takes, the smaller the share that the extra round trip represents. Even when the speedup is smaller, lookup still removes the extra client-side request. Less code, fewer round trips.

A simpler path for stored-vector search

Sometimes small changes can have an outsized impact. While this is a simple feature, I hope it removes some unnecessary friction in your Elasticsearch usage and makes us that much more lovable.
