IBM watsonx & Elastic: Using IBM watsonx Slate text embeddings


We have expanded Elastic’s open inference API by integrating IBM watsonx™ Slate embedding models, an important milestone in our ongoing partnership with the IBM watsonx team. With this integration, Elasticsearch users gain immediate, simplified access to IBM’s Slate family of models, while the IBM watsonx community can take advantage of Elasticsearch’s comprehensive AI search tooling and proven vector database capabilities.

Elastic’s open inference API, now generally available, enables you to create endpoints and use machine learning models from providers like IBM watsonx™. IBM® watsonx™ AI and Data Platform includes core components and AI assistants designed to scale and accelerate AI's impact using trusted data. The platform features open-sourced Slate embedding models (slate-125m, slate-30m) for retrieval-augmented generation, semantic search, and document comparison, and also the Granite family of LLMs trained on trusted enterprise data.

In this blog, we explain how to use IBM watsonx™ Slate text embeddings when building Search AI experiences with the Elasticsearch vector database. Elastic now supports these text embeddings, and the new semantic_text field chunks incoming text by default to fit the token limits of the platform’s models.

Prerequisites & creation of inference endpoint

You will need an IBM Cloud® Databases for Elasticsearch deployment. You can provision one through the catalog, the Cloud Databases CLI plug-in, the Cloud Databases API, or Terraform. Once the account is set up successfully, you should land on the IBM Cloud home page.

You can then provision a Kibana instance and connect it to your Databases for Elasticsearch instance, which runs under the IBM Cloud managed service model, with these steps:

Set the Admin Password for your Elasticsearch deployment.
Install Docker to pull the Kibana container image and connect it to Databases for Elasticsearch.

Alternatively, if you prefer not to run Kibana locally or install Docker, you can deploy Kibana using IBM Cloud® Code Engine. For details, see the documentation on deploying Kibana with Code Engine and connecting it to your Databases for Elasticsearch instance.

Generate an API key

Go to IBM watsonx.ai cloud and log in using your credentials. You will land on the welcome page.

Go to the API keys page.
Create an API key.

Steps in Elasticsearch

Using Dev Tools in Kibana, create an inference endpoint that uses the watsonxai service with the text_embedding task type.
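The endpoint creation step can be sketched as follows. This is a minimal illustration, not the exact request from the original post: the endpoint name watsonx_embedding, the model ID, and the api_version date are assumptions, and the angle-bracket values are placeholders for your own API key, watsonx.ai URL, and project ID. The script only prints a request that you can paste into Dev Tools; it does not contact Elasticsearch.

```python
import json

# Assumed endpoint name; pick any identifier you like.
INFERENCE_ID = "watsonx_embedding"

# Placeholder credentials and an assumed model/API version.
endpoint_config = {
    "service": "watsonxai",
    "service_settings": {
        "api_key": "<api_key>",
        "url": "<url>",
        "model_id": "ibm/slate-30m-english-rtrvr",
        "project_id": "<project_id>",
        "api_version": "2024-03-14",
    },
}


def to_console_request(method: str, path: str, body: dict) -> str:
    """Render a request as a method/path line followed by a JSON body."""
    return f"{method} {path}\n{json.dumps(body, indent=2)}"


request_text = to_console_request(
    "PUT", f"_inference/text_embedding/{INFERENCE_ID}", endpoint_config
)
print(request_text)
```

Replace the placeholders with your own values before running the printed request in Dev Tools.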

On successful creation, Elasticsearch responds with the configuration of the new inference endpoint.

Generate embeddings

You can generate a text_embedding for a single string by sending it to the inference endpoint.
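A minimal sketch of that request is shown here, assuming the endpoint was named watsonx_embedding. The input sentence is invented for illustration, and the helper extract_embedding is hypothetical: it assumes the response nests the vector under text_embedding[0].embedding.

```python
import json

INFERENCE_ID = "watsonx_embedding"  # assumed endpoint name

# Illustrative input text; any single string works.
embedding_request = {"input": "Embed this example sentence"}

print(f"POST _inference/text_embedding/{INFERENCE_ID}")
print(json.dumps(embedding_request, indent=2))


def extract_embedding(response: dict) -> list:
    """Return the first embedding vector from a text_embedding response.

    Assumes the shape {"text_embedding": [{"embedding": [...]}]}.
    """
    return response["text_embedding"][0]["embedding"]
```

The length of the returned vector depends on the Slate model configured for the endpoint.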

The response contains the generated embedding as an array of floating-point values.

Additionally, let's look at a semantic_text mapping example.

Create an index containing a semantic_text field.
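The mapping might look like the sketch below. The index name semantic-embeddings and the field name content are hypothetical, and the inference_id assumes an endpoint called watsonx_embedding. The script prints a Dev Tools request rather than calling Elasticsearch.

```python
import json

INDEX_NAME = "semantic-embeddings"  # hypothetical index name
FIELD_NAME = "content"              # hypothetical field name
INFERENCE_ID = "watsonx_embedding"  # assumed endpoint name

# A semantic_text field delegates chunking and embedding to the
# inference endpoint referenced by inference_id.
index_mapping = {
    "mappings": {
        "properties": {
            FIELD_NAME: {
                "type": "semantic_text",
                "inference_id": INFERENCE_ID,
            }
        }
    }
}

print(f"PUT {INDEX_NAME}")
print(json.dumps(index_mapping, indent=2))
```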

Insert some documents into the created index.
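One way to insert documents is the bulk API, sketched below. The index name, the content field, and the sample sentences are all invented for illustration. The helper to_bulk_ndjson is hypothetical; it builds the newline-delimited body that the bulk API expects, with one action line followed by one document line per entry.

```python
import json

INDEX_NAME = "semantic-embeddings"  # hypothetical index name

# Sample documents invented for this sketch.
documents = [
    {"content": "IBM watsonx Slate models produce text embeddings."},
    {"content": "Elasticsearch stores vectors for semantic search."},
]


def to_bulk_ndjson(index: str, docs: list) -> str:
    """Build a bulk body: an action line, then the document, per entry."""
    lines = []
    for doc_id, doc in enumerate(docs, start=1):
        lines.append(json.dumps({"index": {"_index": index, "_id": str(doc_id)}}))
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


bulk_body = to_bulk_ndjson(INDEX_NAME, documents)
print("POST _bulk")
print(bulk_body, end="")
```

Because the content field is mapped as semantic_text, embeddings are generated at ingest time and do not need to be supplied in the documents.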

Next, run a semantic query against the semantic_text field.
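A semantic query can be sketched as follows, assuming a hypothetical index named semantic-embeddings with a semantic_text field called content. The query text is illustrative only. The script prints the search request for use in Dev Tools.

```python
import json

INDEX_NAME = "semantic-embeddings"  # hypothetical index name
FIELD_NAME = "content"              # hypothetical semantic_text field

# The semantic query embeds the query text with the field's inference
# endpoint and matches it against the stored document embeddings.
search_request = {
    "query": {
        "semantic": {
            "field": FIELD_NAME,
            "query": "How are text embeddings generated?",
        }
    }
}

print(f"GET {INDEX_NAME}/_search")
print(json.dumps(search_request, indent=2))
```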

The response returns the matching documents, ranked by their semantic similarity to the query text.

Conclusion

With the integration of IBM watsonx™ text embeddings, the Elasticsearch Open Inference API continues to empower developers with enhanced capabilities for building powerful and flexible AI-powered search experiences. Explore more supported encoder foundation models available with watsonx.ai.
