Paving the way for modern search workflows and generative AI apps

Elastic’s innovative investments to support an open ecosystem and a simpler developer experience


In this blog, we want to share the investments that Elastic® is making to simplify your experience as you build AI applications. We know that developers have to stay nimble in today’s fast-evolving AI environment. Yet, common challenges make building generative AI applications needlessly rigid and complicated. To name just a few:

  • Vectors — from how many you need to which embedding models you can use and how to chunk large passages of text

  • Evaluating, swapping, and managing large language models (LLMs)

  • Setting up effective semantic search (particularly if your development team has limited resources or skill gaps) 

  • Leveraging existing investments and current architectures while balancing tech debt

  • Scaling from proof-of-concept to production 

  • Making sure that end-user applications are fast and cost-effective and reflect secure, up-to-date proprietary data in responses to queries

  • Fragmented and complex implementation

Flexible tools help you adapt quickly, respond to changes, and accelerate your projects. This is why Elastic is building on its foundation in Apache Lucene to offer the best open code vector database and search engine available. Elastic is also actively partnering across the ecosystem to expand support for transformer and foundation models. 

Moreover, we’re making it easier to get highly relevant semantic search out of the box with Elastic’s proprietary Learned Sparse EncodeR model, ELSER, which is now generally available. We’re reducing the costs and processing time associated with retrieval augmented generation (RAG), the process of retrieving relevant proprietary data and supplying it to an LLM as context for answering natural language queries, for custom use cases. And we’re streamlining the developer experience across Elasticsearch®, so that implementation is simple and straightforward.

Developers are actively shaping the future of generative AI apps. Elastic’s groundbreaking investments (and many more to come) show why our AI-powered search analytics platform is the best choice for a new generation of search workloads.

All in on Apache Lucene

It all started with Apache Lucene, an open source search engine software library that has stood the test of time and provides the basis for Elasticsearch. Elasticsearch has grown to be recognized as the most downloaded vector database thanks to its innovations in vector search, scalability, and performance, but the strength of our platform originates from the fact that Elastic and the Lucene community invest in these advancements in Apache Lucene first. In fact, Elastic has a history of enhancing Lucene, contributing numeric and geospatial search, WAND (Weak AND) query support, and improved columnar storage. Advancing the Lucene community means everyone goes farther, faster. Driving these investments means Elastic users receive the value first, tailored to their search needs.

At Elastic, we know that Lucene has potential beyond full-text search: developers need a full scope of features to build search apps and generative AI experiences, including aggregations, filtering, and faceting. Ultimately, we are on track to make Lucene the most advanced vector database in the world and to share its capabilities with millions of Elasticsearch users across the globe. That’s why Elastic’s developers regularly commit code to Lucene and leverage its foundational code for new projects.

Since Elasticsearch is built on top of Lucene, when you upgrade to our latest release, you automatically benefit from all of the latest improvements. And we’ve already started to contribute the next foundational investments our customers will need by adding scalar quantization support to Lucene, a key cost-saving capability.
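To make the savings concrete, here is a minimal sketch of the general scalar quantization technique, not Lucene’s exact implementation: each float32 component is mapped into one of 256 int8 buckets, cutting vector memory roughly 4x at the cost of a small reconstruction error.

```python
import numpy as np

def scalar_quantize(v: np.ndarray) -> tuple[np.ndarray, float, float]:
    """Compress a float32 vector to int8 (roughly 4x smaller in memory)."""
    lo, hi = float(v.min()), float(v.max())
    scale = (hi - lo) / 255.0 or 1.0           # width of each of the 256 buckets
    q = np.round((v - lo) / scale) - 128       # shift into the int8 range [-128, 127]
    return q.astype(np.int8), lo, scale

def dequantize(q: np.ndarray, lo: float, scale: float) -> np.ndarray:
    """Approximately recover the original float32 vector."""
    return (q.astype(np.float32) + 128) * scale + lo

v = np.random.rand(768).astype(np.float32)     # a typical embedding dimension
q, lo, scale = scalar_quantize(v)
print(np.abs(dequantize(q, lo, scale) - v).max())  # small reconstruction error
```

Stored this way, a 768-dimension float32 embedding drops from 3,072 bytes to 768 bytes per vector.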

Second to none in semantic search and RAG

Developers are tasked with building search and generative AI applications that are relevant, performant, and cost-effective. Quite simply, you need to be able to retrieve data from all of your proprietary data sources to power RAG and deliver the best, most pertinent results. To that end, we’ve added more native connectors and connector clients for enterprise databases, popular productivity tools, and content sources like OneDrive, Google Drive, GitHub, ServiceNow, SharePoint, Teams, Slack, and plenty of others.

Even more notably, with Elastic’s 8.11 release we announced the general availability of Elastic Learned Sparse EncodeR (ELSER), our proprietary AI model for delivering world-class semantic search. ELSER is a pre-trained text retrieval model that provides highly relevant results across domains and lets you implement semantic search in a few simple steps. Since its technical preview in May, ELSER has seen wide adoption, allowing us to make improvements based on customer feedback. Our GA ELSER model brings increased relevance and reduced ingest and retrieval time. You can upgrade now to take advantage of these enhancements.
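As a rough sketch of those steps: once an ingest pipeline has enriched your documents with ELSER token weights, querying is a single search call. In the minimal Python example below, the index name and the "ml.tokens" field are illustrative stand-ins for whatever your pipeline writes, and ".elser_model_2" is the GA model ID.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumption: local cluster with ELSER deployed

# Semantic search with a text_expansion query over ELSER token weights
resp = es.search(
    index="my-index",  # hypothetical index enriched by an ELSER ingest pipeline
    query={
        "text_expansion": {
            "ml.tokens": {
                "model_id": ".elser_model_2",
                "model_text": "How do I renew my subscription?",
            }
        }
    },
)
for hit in resp["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))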

Another obstacle that comes with generative AI territory: higher compute costs and slower response times. Generative LLM calls incur costs per token and require additional processing, which takes time. However, with the power of embeddings and fast k-nearest neighbor (kNN) search, Elastic can be used as a caching layer for generative AI applications, readily identifying similar queries and responses and providing quicker, more cost-effective answers. On the cost-efficiency front, we now also offer a vector search optimized Elastic Cloud hardware profile on AWS, with a default RAM ratio tuned so you can store more vectors at a lower price.
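Here is a minimal sketch of that caching pattern, with hypothetical index and field names: embed the incoming question, look for a previously answered question whose embedding is close enough, and only call the LLM on a cache miss.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumption: local cluster

def cached_answer(query_embedding: list[float], threshold: float = 0.9):
    """Return a cached LLM answer for a semantically similar past query, or None."""
    resp = es.search(
        index="genai-cache",           # hypothetical index of past Q&A pairs
        knn={
            "field": "query_vector",   # dense_vector field holding past query embeddings
            "query_vector": query_embedding,
            "k": 1,
            "num_candidates": 50,
        },
        source=["answer"],
    )
    hits = resp["hits"]["hits"]
    if hits and hits[0]["_score"] >= threshold:
        return hits[0]["_source"]["answer"]   # cache hit: skip the LLM call
    return None                               # cache miss: caller invokes the LLM
```

The similarity threshold is a tuning knob: set it too low and users get recycled answers to genuinely different questions; set it too high and the cache rarely hits.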

The better Elastic is at making semantic search and RAG simple to use together, the faster developers can make great generative AI experiences for end users. That’s why we’re laser-focused on making the technology easy and practical for developers to use.

Choice and flexibility across the ecosystem

Helping you respond to change quickly in the AI era, with an open platform where you can use a variety of tools and consistent standards, is key to accelerating generative AI projects. That’s why developers have the flexibility to use and host a variety of transformer models within Elasticsearch, including private and public Hugging Face models. You can also store vectors generated by third-party services like Amazon SageMaker, Google Vertex AI, Cohere, OpenAI, and more in Elasticsearch.
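As a minimal sketch of that workflow, the example below creates an index with a dense_vector field and stores an embedding returned by an external service. The embed() helper is hypothetical, and 1536 dimensions is just one common model output size (it matches OpenAI’s text-embedding-ada-002).

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumption: local cluster

# Mapping for externally generated embeddings; dims must match your model
es.indices.create(
    index="docs-with-vectors",
    mappings={
        "properties": {
            "text": {"type": "text"},
            "embedding": {
                "type": "dense_vector",
                "dims": 1536,
                "index": True,
                "similarity": "cosine",
            },
        }
    },
)

text = "Quarterly revenue grew 12%."
vector = embed(text)  # hypothetical call to SageMaker, Vertex AI, Cohere, OpenAI, etc.
es.index(index="docs-with-vectors", document={"text": text, "embedding": vector})
```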

We’re also expanding our support for ecosystem tools so you can easily use Elasticsearch as your vector database with LangChain and LlamaIndex. In fact, we recently collaborated with the LangChain team on LangChain Templates to help developers build production-ready generative AI apps. Thanks to our community, Elastic is already one of the most popular vector stores on LangChain. Now with the new RAG template, you can create production-level capabilities with LangSmith and Elasticsearch.
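For instance, using Elasticsearch as a LangChain vector store takes only a few lines. This sketch assumes a local cluster and uses OpenAI embeddings purely as an example; any LangChain embedding model works.

```python
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import ElasticsearchStore

# Index a few texts into Elasticsearch as a LangChain vector store
vectorstore = ElasticsearchStore.from_texts(
    texts=[
        "Elasticsearch is built on Apache Lucene.",
        "ELSER reached general availability in Elastic 8.11.",
    ],
    embedding=OpenAIEmbeddings(),    # assumes OPENAI_API_KEY is set
    es_url="http://localhost:9200",  # assumption: local cluster
    index_name="langchain-demo",
)

# Plug it into a RAG chain as a retriever
retriever = vectorstore.as_retriever()
docs = retriever.get_relevant_documents("What is Elasticsearch built on?")
```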

A simple developer experience

We’re dedicated to creating a simplified developer experience. We’re releasing streamlined commands that abstract the complexity of inference and model management behind one simple API. We’re also improving default settings for dense vectors and providing automatic mappings. With one call, you can summarize results or embed text as vectors from any model, reducing the time it takes for you to build and learn.
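The exact shape of that API may evolve, but as it appeared in early previews, a single inference call looks roughly like this; the endpoint ID "my-embedding-endpoint" is hypothetical.

```python
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")  # assumption: local cluster

# One call to embed text through a configured inference endpoint; the
# "_inference" path follows the tech-preview API and may change by release
resp = es.perform_request(
    "POST",
    "/_inference/text_embedding/my-embedding-endpoint",
    headers={"content-type": "application/json", "accept": "application/json"},
    body={"input": "What is vector search?"},
)
print(resp)  # the embedding produced by the configured model
```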

Soon, we’ll introduce Elastic’s new serverless architecture, a new deployment option for developers who want to focus on creating innovative experiences rather than managing underlying infrastructure. We’re focused on giving you all of the tools you need, so we’re adding new language clients in our serverless architecture for Python, PHP, JavaScript, Ruby, Java, .NET, and Go.

We’re also well aware that getting started with fast-changing new technologies can be challenging. That’s why we offer simple onboarding with inline guidance and code across every one of Elastic’s deployment options, including real-world examples to help you spin up new projects quickly.

There’s never been a better time to be an Elasticsearch developer. Our recent research and development efforts are making Lucene the best vector database in the world. We’re ensuring that semantic search and RAG are unparalleled when it comes to ease of use, relevance, speed, scale, and cost efficiency. And we’re putting ecosystem openness, flexibility, and simplicity at the heart of developer experience.

Ready to start building next-generation search on Elasticsearch? Try the Elasticsearch Relevance Engine™, our suite of developer tools for building AI search apps.

The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.

In this blog post, we may have used or referred to third party generative AI tools, which are owned and operated by their respective owners. Elastic does not have any control over the third party tools and we have no responsibility or liability for their content, operation or use, nor for any loss or damage that may arise from your use of such tools. Please exercise caution when using AI tools with personal, sensitive or confidential information. Any data you submit may be used for AI training or other purposes. There is no guarantee that information you provide will be kept secure or confidential. You should familiarize yourself with the privacy practices and terms of use of any generative AI tools prior to use. 

Elastic, Elasticsearch, ESRE, Elasticsearch Relevance Engine and associated marks are trademarks, logos or registered trademarks of Elasticsearch N.V. in the United States and other countries. All other company and product names are trademarks, logos or registered trademarks of their respective owners.