GPU accelerated vector indexing
Stack
Elasticsearch can use GPU acceleration to significantly speed up the indexing of dense vectors. GPU indexing is based on the NVIDIA cuVS library and uses the parallel processing capabilities of graphics processing units to accelerate the construction of HNSW vector search indexes. GPU accelerated vector indexing is particularly beneficial for large-scale vector datasets and high-throughput indexing scenarios, and it frees up CPU resources for other tasks.
GPU vector indexing requires the following:
- An Enterprise subscription
- A supported NVIDIA GPU (Ampere architecture or newer, compute capability >= 8.0) with at least 8 GB of GPU memory
- GPU driver, CUDA and cuVS runtime libraries installed on the node
- LD_LIBRARY_PATH environment variable configured to include the cuVS libraries path and its dependencies (CUDA, rmm, etc.)
- Supported platform: Linux x86_64 only, Java 22 or higher
- Supported dense vector configurations: hnsw and int8_hnsw; float element type only
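The hardware and platform requirements above can be checked before starting a node. The following is a minimal sketch, not an official tool. It assumes the GPU details come from `nvidia-smi --query-gpu=name,compute_cap,memory.total --format=csv,noheader,nounits` (the `compute_cap` field is only available on recent drivers) and that "8GB" means 8192 MiB. The function names are hypothetical.

```python
import platform

MIN_COMPUTE_CAPABILITY = (8, 0)  # Ampere or newer, per the requirements list
MIN_GPU_MEMORY_MIB = 8 * 1024    # assumption: "8GB" interpreted as 8192 MiB


def parse_gpu_line(line):
    """Parse one CSV line such as 'NVIDIA L4, 8.9, 23034'.

    The expected field order is name, compute capability, total memory in MiB.
    """
    # Split from the right so a comma inside the GPU name cannot break parsing.
    name, compute_cap, memory_mib = [f.strip() for f in line.rsplit(",", 2)]
    major, minor = compute_cap.split(".")
    return {
        "name": name,
        "compute_capability": (int(major), int(minor)),
        "memory_mib": int(memory_mib),
    }


def gpu_meets_requirements(gpu):
    """Return True if the GPU satisfies the capability and memory minimums."""
    return (
        gpu["compute_capability"] >= MIN_COMPUTE_CAPABILITY
        and gpu["memory_mib"] >= MIN_GPU_MEMORY_MIB
    )


def platform_is_supported(system=None, machine=None):
    """Return True on Linux x86_64; arguments default to the current host."""
    system = system or platform.system()
    machine = machine or platform.machine()
    return system == "Linux" and machine == "x86_64"
```

Passing this check does not guarantee that Elasticsearch will use the GPU: the driver, the CUDA and cuVS libraries, and the Java version still have to be in place. Some GPUs marketed with 8 GB may also report slightly less than 8192 MiB, so treat the memory threshold as illustrative.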
GPU vector indexing is controlled by the vectors.indexing.use_gpu node-level setting.
By default, Elasticsearch uses GPU indexing for supported vector types if a compatible GPU and required libraries are detected. Check server logs for messages indicating whether Elasticsearch has detected a GPU.
If you see a message like the following, a GPU was successfully detected and GPU indexing will be used:
[o.e.x.g.GPUSupport ] [elasticsearch-0] Found compatible GPU [NVIDIA L4] (id: [0])
If you don't see this message, look for warning messages explaining why GPU indexing is not being used, such as an unsupported environment, missing libraries, or an incompatible GPU.
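Scanning server logs for the detection message can be automated. The sketch below assumes the message keeps the exact wording shown in the sample line above; the pattern and the function name are illustrative and not part of any Elasticsearch API.

```python
import re

# Matches the detection message from the sample log line, for example:
#   Found compatible GPU [NVIDIA L4] (id: [0])
GPU_FOUND = re.compile(
    r"Found compatible GPU \[(?P<name>[^\]]+)\] \(id: \[(?P<id>\d+)\]\)"
)


def find_detected_gpus(log_lines):
    """Return (name, id) tuples for every GPU detection message in the logs."""
    detected = []
    for line in log_lines:
        match = GPU_FOUND.search(line)
        if match:
            detected.append((match.group("name"), int(match.group("id"))))
    return detected
```

An empty result only means the detection message was not found in the lines supplied. The warning messages mentioned above still need to be read to find the cause.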
To enforce GPU indexing, set vectors.indexing.use_gpu: true in elasticsearch.yml.
The node will then fail to start if GPU indexing is not available, for example if Elasticsearch does not detect a GPU, the runtime is not supported, or the necessary dependencies are not correctly configured.
If the node fails to start, check:
- A supported NVIDIA GPU is present
- CUDA runtime libraries and drivers are installed (check with nvidia-smi)
- LD_LIBRARY_PATH includes paths to the cuVS libraries and to their dependencies (e.g. CUDA)
- Supported platform: Linux x86_64 with Java 22 or higher
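One way to verify the LD_LIBRARY_PATH item is to search each directory on the path for the expected shared libraries. This is a rough sketch: the library name prefix used in the usage note (libcuvs) is an assumption about how the cuVS libraries are named on disk, and the helper is hypothetical.

```python
import os


def find_library(library_prefix, ld_library_path=None):
    """Return files whose name starts with library_prefix on LD_LIBRARY_PATH.

    If ld_library_path is None, the value is read from the environment.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in ld_library_path.split(os.pathsep):
        # Skip empty entries and directories that do not exist.
        if not directory or not os.path.isdir(directory):
            continue
        for entry in sorted(os.listdir(directory)):
            if entry.startswith(library_prefix):
                hits.append(os.path.join(directory, entry))
    return hits
```

For example, `find_library("libcuvs")` returning an empty list suggests that the cuVS path is missing from the environment of the process running the check. Run it as the same user and with the same environment as the Elasticsearch node, since a service may not inherit the variables of an interactive shell.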
If you are sure that GPU indexing is enabled but don't see a performance improvement, check the following:
- Ensure supported vector index types and element type are used
- Ensure the dataset is large enough to benefit from GPU acceleration
- Check whether other bottlenecks are limiting the indexing process. GPU indexing accelerates HNSW graph building, but the overall speedup can be limited by other factors:
  - Indexing throughput depends on how fast you can get data into Elasticsearch. Check network speed and client performance. Use multiple clients if needed.
  - JSON parsing can dominate the computation. Use base64-encoded vectors instead of JSON arrays.
  - Storage speed also matters. Because the GPU can process a lot of data, you need storage that can keep up. Avoid network-attached storage and prefer fast NVMe drives to get the most performance.
- Consider monitoring CPU usage to confirm that work is being offloaded to the GPU
- Consider monitoring GPU usage (e.g. with nvidia-smi)
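The advice on base64-encoded vectors can be illustrated on the client side. The sketch below packs a float vector into 32-bit floats and base64-encodes the bytes. The big-endian byte order is an assumption that should be checked against the dense_vector documentation for your Elasticsearch version, and the function names are hypothetical.

```python
import base64
import struct


def encode_vector_base64(vector):
    """Pack floats as big-endian float32 values and base64-encode the bytes.

    Assumption: the server expects big-endian IEEE 754 float32 values.
    """
    packed = struct.pack(f">{len(vector)}f", *vector)
    return base64.b64encode(packed).decode("ascii")


def decode_vector_base64(encoded):
    """Inverse of encode_vector_base64, useful for round-trip checks."""
    raw = base64.b64decode(encoded)
    return list(struct.unpack(f">{len(raw) // 4}f", raw))
```

The encoded string replaces the JSON array as the field value in the indexed document, so the server decodes one string instead of parsing one JSON number per dimension.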