Frequently asked questions about ESRE

What is the Elasticsearch Relevance Engine?

The Elasticsearch Relevance Engine (ESRE) is a collection of tools for developing search applications using machine learning (ML) and artificial intelligence (AI).

See Learn for an overview of each of the components that make up ESRE.

What can I build with Elasticsearch Relevance Engine?

Elasticsearch is a leading search technology for websites (think ecommerce product search and discovery) and retrieving information within an organization’s digital environment (think customer success knowledge bases and enterprise search). ESRE is a toolkit for building AI-powered search experiences. Your users can express their queries in natural language, in the form of a question or a description of the information they need. Combine this natural language capability with generative AI to augment models with context from private or proprietary data.

See Examples for links to example applications and implementations.

How are Elasticsearch and Elasticsearch Relevance Engine different?

Elasticsearch Relevance Engine tools are designed to use Elasticsearch as an underlying storage and search technology. Developers can use Elastic APIs or familiar tools, such as Kibana, to interact with this toolkit.

What is Elastic Learned Sparse Encoder?

Elastic Learned Sparse Encoder is a model built by Elastic for high-relevance semantic search across a variety of domains. This text expansion model uses a sparse vector representation of text, as opposed to the dense vector representations used by traditional embedding models. This means you don’t need to generate embeddings for your data (or queries) yourself, and you don’t need to fine-tune the model for your domain.

This model helps capture meaning and intent in natural language queries and, because it doesn’t need to be fine-tuned on your data, it works out of the box.
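
To make the sparse/dense distinction concrete, here is a toy sketch. The tokens and weights are invented for illustration, not actual ELSER output: a sparse representation keeps a weight per vocabulary token (most tokens absent, i.e. zero), while a dense representation is a fixed-length array of floats.

```python
# Illustrative only: toy sparse vs. dense representations of the same text.
# Real models produce learned weights, not these hand-picked values.

# A sparse representation maps vocabulary tokens to weights; absent tokens are zero.
sparse = {"comfortable": 1.8, "sofa": 2.3, "couch": 1.9, "seating": 0.7}

# A dense representation is a fixed-length vector of floats with no zero entries.
dense = [0.12, -0.48, 0.33, 0.91, -0.05, 0.27, -0.61, 0.44]

# Sparse vectors score documents by summing weights of overlapping tokens,
# which keeps them compatible with inverted-index infrastructure.
def sparse_dot(query_vec: dict, doc_vec: dict) -> float:
    return sum(w * doc_vec[t] for t, w in query_vec.items() if t in doc_vec)

query = {"couch": 2.0, "comfortable": 1.0}
score = sparse_dot(query, sparse)  # only overlapping tokens contribute
```

Because only overlapping tokens contribute to the score, a sparse representation can be stored and searched much like ordinary index terms.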

What is hybrid search?

Hybrid search is the combination of vector search and lexical search. Elasticsearch is the industry leader in lexical search, and we’ve been investing in vector search capabilities since 2019. Elastic enables you to combine the best of both worlds. Use RRF to power your hybrid search strategy with Elastic.

What is RRF?

Reciprocal Rank Fusion (RRF) is a state-of-the-art rank fusion algorithm for combining rankings from multiple information retrieval systems, without requiring calibration or fine-tuning.

With Elastic 8.9.0, you can now implement hybrid search strategies by combining ELSER-powered semantic search with classical lexical search using the retrievers option.
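The fusion step itself can be sketched in a few lines of Python. This is a minimal illustration of the RRF formula (score(doc) = Σ 1/(k + rank) across result lists), not Elasticsearch’s implementation; the document IDs are invented and k = 60 is a commonly used default.

```python
# A minimal sketch of Reciprocal Rank Fusion (RRF).
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked result lists: score(doc) = sum over lists of 1 / (k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

lexical = ["doc_b", "doc_a", "doc_c"]   # e.g. a BM25 result list
semantic = ["doc_a", "doc_c", "doc_b"]  # e.g. an ELSER / vector result list
fused = rrf([lexical, semantic])
# doc_a ranks near the top of both lists, so it comes first after fusion.
```

Note that RRF only looks at ranks, never at raw scores, which is why no calibration between the lexical and semantic scoring scales is needed.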

What is a vector database?

A vector database consists of two primary components:

  • Embedding store and index. Embeddings are the vector representations of your unstructured data (text, images, etc.). Each data point is represented by an array (or vector) of floating-point numbers, plotted (or embedded) in a high-dimensional mathematical space. Different models use different techniques to generate embeddings, but the principle is the same: similar data points sit closer together in the vector space.
  • A search algorithm. A vector database uses a search algorithm to find the nearest neighbors to a given query. When a user sends a query, the query text is embedded using the same model that embedded the documents, enabling rapid semantic similarity search. Because the values exist on a continuum, you can find data points that are semantically similar even if they don’t share the same keywords.

Note that traditional vector search uses dense vector representations of data, a different approach from the Elastic Learned Sparse Encoder model’s sparse representation.
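
As a rough sketch of how dense nearest-neighbor retrieval works, here is a brute-force cosine-similarity search. The three-dimensional vectors are invented for illustration; real embeddings have hundreds of dimensions, and production systems use approximate indexes (such as HNSW) rather than a full scan.

```python
# Toy brute-force nearest-neighbor search over dense embeddings.
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made 3-dimensional "embeddings" standing in for model output.
embeddings = {
    "doc_sofa":  [0.9, 0.1, 0.0],
    "doc_couch": [0.8, 0.2, 0.1],
    "doc_stock": [0.0, 0.1, 0.9],
}

def nearest(query_vec: list[float], store: dict, top_k: int = 2) -> list[str]:
    # Rank every stored document by similarity to the query vector.
    return sorted(store, key=lambda doc: cosine(query_vec, store[doc]), reverse=True)[:top_k]

query_embedding = [0.85, 0.15, 0.05]  # embedded with the same model as the documents
results = nearest(query_embedding, embeddings)
```

The "sofa" and "couch" documents surface together even though they share no keywords, which is the continuum property described above.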

Is Elasticsearch a vector database?

Elasticsearch is a vector database and much more! Unlike pure-play vector databases, Elastic combines all the components you need to work with vectors in a single platform:

  1. Embedding store. Store and index your embeddings in Elasticsearch natively using the dense_vector field type.
  2. Nearest neighbor search. Efficiently search your dataset for the nearest neighbors to a given query using Elasticsearch’s kNN search.
  3. Embedding model. Generate embeddings for your data within the Elastic platform.

This approach eliminates the inefficiencies and complexities of making external API calls, a limitation of pure vector databases.
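
For illustration, a minimal index mapping using the dense_vector field type might look like the sketch below. The index layout and field names are hypothetical, and the body would be sent via an Elasticsearch client or the Create Index API; check the documentation for the parameters your version supports.

```python
# Hypothetical index mapping showing the dense_vector field type.
mapping = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},  # regular lexical field
            "title_vector": {
                "type": "dense_vector",
                "dims": 384,             # must match the embedding model's output size
                "index": True,           # enable kNN search over this field
                "similarity": "cosine",  # metric used to compare vectors
            },
        }
    }
}
```

Storing the embedding alongside the original text in the same document is what lets a single query combine lexical and vector retrieval.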

What is a transformer, and is Elastic Learned Sparse Encoder a transformer model?

A transformer is a deep neural network architecture that serves as the basis for large language models (LLMs). Transformers can be composed of encoders, decoders, and many "deep" neural network layers with millions (or even billions) of parameters.

Transformers are typically trained on very large corpora of text, such as data from the internet, and can be fine-tuned to perform a variety of NLP tasks. Elastic Learned Sparse Encoder uses a transformer architecture, but consists of an encoder designed specifically for semantic search across a wide variety of domains.

How do I get started with Elasticsearch Relevance Engine? Do I need to purchase Elasticsearch Relevance Engine separately?

All of Elasticsearch Relevance Engine’s capabilities come with Elastic Enterprise Search Platinum and Enterprise plans. If you have an Elasticsearch license, Elasticsearch Relevance Engine is included as part of your purchase. You can get started with text expansion with ELSER in the Kibana Search UI.

Use our examples for inspiration on how to build your own AI-powered search applications using semantic search, hybrid search, and more.

What is Elastic AI Assistant?

The Elastic AI Assistant is our first domain-specific application of generative AI, powered by ESRE. The Assistant is available in a chat interface, where users can ask questions in natural language and receive tailored answers.