Elastic and NVIDIA help you deploy AI apps faster without draining IT infrastructure

Remove bottlenecks. Scale smarter. Control costs. With Elastic and NVIDIA, you get the power of a GPU-accelerated vector database for high-performance AI.

Elastic is teaming up with NVIDIA to bring GPU power to your search stack. By leveraging the cuVS library and the CAGRA algorithm, Elasticsearch has unlocked massive parallelism to deliver high-throughput, ultra-low-latency indexing for your most demanding retrieval augmented generation (RAG) pipelines and AI applications.

Index on GPUs for maximum throughput. Search on CPUs for cost efficiency. Optimize for both performance and price.

By the numbers

  • 12x
    Boost in indexing throughput
  • 7x
    Reduction in merge latency
  • 5x
    Cost-adjusted throughput vs. CPU-only

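To make the relationship between these figures concrete, here is a rough sketch of how a cost-adjusted speedup can be derived from a raw one. The node price ratio below is an assumed example value chosen so the arithmetic matches the headline numbers, not a published Elastic or NVIDIA figure:

```python
# Illustrative sketch: deriving cost-adjusted throughput from raw speedup.
# The 2.4x price ratio is an assumed example value, not an official figure.
raw_indexing_speedup = 12.0   # GPU vs. CPU-only indexing throughput
node_price_ratio = 2.4        # assumed: a GPU node costs 2.4x a CPU node

# Throughput per unit of spend: raw speedup divided by the price premium.
cost_adjusted_speedup = raw_indexing_speedup / node_price_ratio
print(cost_adjusted_speedup)  # -> 5.0
```

The same shape of calculation applies to any benchmark pair: a raw speedup only translates into savings after dividing by what the faster hardware costs.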
Elasticsearch vector database with NVIDIA cuVS: Better together

  • Accelerate your AI factory

    Launch high-performance search and agentic AI faster with pre-engineered blueprints. Elasticsearch is the recommended vector database for the NVIDIA Enterprise AI Factory validated design, providing a reliable on-premises framework for scale.

  • Turbocharge indexing speed

    Index your data at scale without bottlenecks. Integrating NVIDIA cuVS delivers up to a 12x boost in indexing throughput and 7x faster force merging, allowing you to handle massive data volumes with unprecedented efficiency.

  • Maximize infrastructure value

    Offload math-intensive indexing to GPUs to slash CPU strain and reclaim resources. On a cost-adjusted basis, GPU acceleration delivers 5x higher throughput and 6x faster force merges, giving you superior performance from your existing hardware budget.

  • Enhance query performance

    Handle massive query volumes with near-instant response times. Elastic’s NVIDIA-accelerated search ensures your infrastructure scales alongside the next generation of GenAI, delivering the high-speed retrieval required for complex agentic workflows.

FOR ENTERPRISE

The best of Elastic and NVIDIA, optimized for you

  • Open and enterprise-ready

    Build with confidence on a foundation of open source innovation. GPU acceleration is powered by the Apache 2.0-licensed NVIDIA cuVS library and integrated into Elasticsearch via an ELv2-licensed plugin, combining open flexibility with enterprise-grade support.

  • Limitless indexing scale

Index your data at scale without bottlenecks. Integrating NVIDIA cuVS delivers up to a 12x boost in indexing throughput and 7x faster force merging, allowing you to handle massive data volumes with unprecedented efficiency.

  • Elastic scaling with Kubernetes

    Scale your acceleration as easily as you scale your cluster. By mapping Elasticsearch processes to individual GPUs via Kubernetes orchestration, you can distribute large indexing workloads across multiple servers for maximum parallel throughput.

  • Seamless CPU-GPU synergy

    Get the best of both worlds. Elastic leverages GPUs for what they do best — bulk arithmetic for graph construction — while keeping search on the CPU. This ensures your high-performance HNSW graphs are built in record time but remain accessible for standard retrieval.

Frequently asked questions

Is GPU-accelerated vector indexing for Elasticsearch available as open source?

Yes, the code implementing GPU-accelerated vector indexing is open source (under a dual license: AGPL and ELv2). Elasticsearch exposes the GPU-accelerated vector indexing functionality via a plugin that is licensed under the ELv2 license and is available under the Enterprise subscription tier. NVIDIA cuVS, the library that powers the GPU indexing features in Elasticsearch, is also available as open source under the Apache 2.0 license.

What should I do if I run into issues or have suggestions?

If you run into issues, try our troubleshooting instructions first. If the problem persists, create an issue on the Elasticsearch GitHub repository if it is Elasticsearch-specific, or on the NVIDIA cuVS GitHub repository if it pertains to cuVS and its dependencies. If you have an Enterprise subscription, reach out via Elastic customer support channels for resolution. Use the same channels for suggestions and feature requests.

How do I install NVIDIA cuVS on an Elasticsearch data node to enable GPU vector indexing?

You can install NVIDIA cuVS as a precompiled package, either via tarball from NVIDIA channels for database users or via the pip or conda package managers for data science users. You can also build cuVS from source and maintain the binary yourself. For more information, see the NVIDIA cuVS installation page. For NVIDIA AI Enterprise (NVAIE) subscribers, a cuVS tarball with CVE-fix support guarantees will be available via the NGC catalog in the coming months. Reach out to the NVAIE support team or your NVIDIA sales representative for more information.

Can vector indexing scale across multiple GPUs across one or multiple servers?

Yes, you can use a container orchestration system like Kubernetes to map each Elasticsearch process to one available GPU. A single Elasticsearch process should have exclusive use of a single GPU. In this way, scaling to use multiple GPUs becomes scaling nodes in the cluster.
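A minimal sketch of the one-process-per-GPU pattern described above, expressed as the container spec fragment a Kubernetes pod would carry. It is built here as a plain Python dict; the `nvidia.com/gpu` resource name assumes the standard NVIDIA device plugin is installed, and the image tag is an example, not an official manifest:

```python
# Sketch: a container spec giving one Elasticsearch process exclusive use of
# one GPU via the NVIDIA Kubernetes device plugin ("nvidia.com/gpu").
# The image tag is illustrative; this is not an official Elastic manifest.
def es_gpu_container(image="docker.elastic.co/elasticsearch/elasticsearch:9.0.0"):
    return {
        "name": "elasticsearch",
        "image": image,
        "resources": {
            # Requesting exactly one GPU pins this process to a single device;
            # scaling out then means adding more pods (i.e., more cluster nodes).
            "limits": {"nvidia.com/gpu": 1},
        },
    }

print(es_gpu_container()["resources"]["limits"])
```

Because each pod claims exactly one GPU, the orchestrator handles placement, and "more GPUs" reduces to the familiar operation of "more Elasticsearch nodes."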

Is the vector index size limited by the available GPU memory?

We support building indices that are larger than GPU memory (a.k.a. out-of-core) by building them in batches. Overall, GPU indexing does not introduce any additional limitations beyond those already present with CPU-based indexing.
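The batching idea behind out-of-core building can be sketched as follows: the dataset is streamed through GPU memory in fixed-size chunks rather than loaded whole. This is a conceptual illustration in plain Python, not the actual cuVS code path:

```python
# Conceptual illustration of out-of-core batching: cover a dataset larger than
# device memory by slicing it into GPU-sized chunks. Pure Python stand-in,
# not the cuVS implementation.
def batches(num_vectors, batch_size):
    """Yield (start, end) slices covering all vectors, batch by batch."""
    for start in range(0, num_vectors, batch_size):
        yield start, min(start + batch_size, num_vectors)

# 1M vectors with room for 250k on the device at a time -> 4 passes.
print(list(batches(1_000_000, 250_000)))
```

Each chunk fits in device memory, so the only resource that must scale with total index size is host-side storage, which is the same constraint CPU-based indexing already has.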

Is vector search (querying) also GPU-accelerated?

No, only HNSW index construction is GPU-accelerated today. The resulting HNSW graph is then loaded into host (CPU) memory, and vector retrieval runs on the CPU. This split reflects where GPUs hold their greatest advantage: bulk vector operations during graph construction. Extending GPU use further will be considered as the technology and use cases evolve.

How do I evaluate performance and cost benefits of GPU vector indexing?

You can use Elastic's Rally tool to evaluate the impact of GPUs on indexing throughput, force-merge latency, and vector search accuracy, latency, and throughput. See the instructions and best practices for running end-to-end vector indexing benchmarks on GPUs with Rally.

Which element and index types are supported?

For GPU-accelerated vector indexing, both the hnsw and int8_hnsw values are supported for the index_options.type parameter. For element_type, only float is supported. No other index or element types are supported at this time.
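A minimal sketch of an index mapping using these supported values. The field name and dimension count are placeholders; the dense_vector, element_type, and index_options keys follow the documented Elasticsearch mapping schema:

```python
# Sketch of a mapping that stays within the supported options: a dense_vector
# field with float elements and an int8_hnsw index. Field name, dims, and
# similarity are example values, not requirements.
mapping = {
    "mappings": {
        "properties": {
            "embedding": {
                "type": "dense_vector",
                "element_type": "float",   # only float is supported
                "dims": 768,
                "index": True,
                "similarity": "cosine",
                "index_options": {"type": "int8_hnsw"},  # or "hnsw"
            }
        }
    }
}
print(mapping["mappings"]["properties"]["embedding"]["index_options"])
```

This dict can be passed to an index-creation request, for example via the elasticsearch-py client's `indices.create` call, under the usual mapping rules.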