Filtered HNSW search, fast mode

Explore the improvements we have made for HNSW vector search in Apache Lucene through our ACORN-1 algorithm implementation.

Want to get Elastic certified? Find out when the next Elasticsearch Engineer training is running! You can start a free cloud trial or try Elastic on your local machine now.

For years, Apache Lucene and Elasticsearch have supported filtered search with kNN queries, allowing users to retrieve the nearest neighbors that meet a specified metadata filter. However, performance has always suffered when dealing with semi-restrictive filters. In Apache Lucene, we are introducing a variation of ACORN-1—a new approach for filtered kNN search that achieves up to 5x faster searches with little drop in recall.

This blog goes over the challenges of filtered HNSW search, explaining why performance slows as filtering increases, and how we improved HNSW vector search in Apache Lucene with the ACORN-1 algorithm.

Why searching fewer docs is actually slower in HNSW

Counterintuitively, filtering documents—thereby reducing the number of candidates—can actually make kNN searches slower. For traditional lexical search, fewer documents mean fewer scoring operations, meaning faster search. However, in an HNSW graph, the primary cost is the number of vector comparisons needed to identify the k nearest neighbors. At certain filter set sizes, the number of vector comparisons can increase significantly, slowing down search performance.

Because the HNSW graph in Apache Lucene has no knowledge of filtering criteria when built, it constructs purely based on vector similarity. When applying a filter to retrieve the k nearest neighbors, the search process traverses more of the graph. This happens because the natural nearest neighbors within a local graph neighborhood may be filtered out, requiring deeper exploration and increasing the number of vector comparisons.

You may ask, why perform vector comparisons against nodes that don’t match the filter at all? Well, HNSW graphs are already sparsely connected. If we were to consider only matching nodes during exploration, the search process could easily get stuck, unable to traverse the graph efficiently.

We gotta make this faster: Improving HNSW vector search in Lucene

Since the graph doesn’t account for filtering criteria, we have to explore the graph more. Additionally, to avoid getting stuck, we must perform vector comparisons against filtered-out nodes. How can we reduce the number of vector operations without getting stuck? This is the exact problem tackled by Liana Patel et. al. in their ACORN paper.

While the paper discusses multiple graph techniques, the specific algorithm we care about with Apache Lucene is their ACORN-1 algorithm. The main idea is that you only explore nodes that satisfy your filter. To compensate for the increased sparsity, ACORN-1 extends the exploration beyond the immediate neighborhood. Now, instead of exploring just the immediate neighbors, each neighbor’s neighbor is also explored. This means that for a graph with 32 connections, instead of only looking at the nearest 32 neighbors, exploration will attempt to find matching neighbors in 32*32=1024 extended neighborhood.

Within Lucene, we have slightly adapted the ACORN-1 algorithm in the following ways. The extended neighborhoods are only explored if more than 10% of the vectors are filtered out in the immediate neighborhood. Additionally, the extended neighborhood isn’t explored if we have already scored at least neighborCount * 1.0/(1.0 - neighborFilterRatio). This allows the searcher to take advantage of more densely connected neighborhoods where the neighborhood connectedness is highly correlated with the filter.

We also have noticed both in inversely correlated filters (e.g. filters that only match vectors that are far away from the query vector) or exceptionally restrictive filters, only exploring the neighborhood of each neighbor isn’t enough. The searcher will also attempt branching further than the neighbors’ neighbors when no valid vectors passing the filter are found. However, to prevent getting lost in the graph, this additional exploration is bounded.

Numbers don’t lie

Across multiple real-world datasets, this new filtering approach has delivered significant speed improvements. Here is randomly filtering at 0.05% 1M Cohere vectors:

To further investigate this reduction in improvement as more vectors pass the filter, we did another test over an 8M Cohere Wiki document data set. Generally, no matter the number of vectors filtered, you want higher recall, with fewer visited vectors. A simple way to quantify this is by examining the recall-to-visited ratio.

It's clear that near 60%, the improvements level off or disappear. Consequently, in Lucene, this new algorithm will only be utilized when 40% or more of the vectors are filtered out.

Even our nightly Lucene benchmarks saw an impressive improvement with this change.

Gotta go fast

Filtering kNN search over metadata is key for real-world use cases. In Lucene 10.2, we have made it as much as 5x faster, using fewer resources, and keeping high recall. I am so excited about getting this in the hands of our users in a future Elasticsearch v9 release.

자주 묻는 질문

What is ACORN-1?

ACORN-1 is a new approach for filtered kNN search that achieves up to 5x faster searches with little drop in recall.

관련 콘텐츠

최첨단 검색 환경을 구축할 준비가 되셨나요?

충분히 고급화된 검색은 한 사람의 노력만으로는 달성할 수 없습니다. Elasticsearch는 여러분과 마찬가지로 검색에 대한 열정을 가진 데이터 과학자, ML 운영팀, 엔지니어 등 많은 사람들이 지원합니다. 서로 연결하고 협력하여 원하는 결과를 얻을 수 있는 마법 같은 검색 환경을 구축해 보세요.

직접 사용해 보세요