Elastic Search: Build a semantic search experience

Overview


Onboard your data

Create an Elastic Cloud account

Get started with a 14-day trial: go to cloud.elastic.co, create an account, and then follow the steps below to launch your first Elastic Stack deployment in any one of our 50+ supported regions globally.

If you click Edit settings, you can choose a cloud provider: AWS, Microsoft Azure, or Google Cloud. Once you select your cloud provider, you'll be able to select the relevant region. Next, you have the option to choose between a few different hardware profiles so you can customize the deployment to suit your needs. The latest version of Elastic is preselected for you.

While your deployment is being created, you'll be given a username and password. Be sure to copy or download these credentials, as you'll need them when you install your integrations.
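Those credentials are what you'll use to authenticate against your deployment's REST API. As a minimal sketch (the endpoint URL and password below are placeholders, not values from this guide), you can attach them to a request with HTTP basic auth:

```python
import base64
import urllib.request

# Placeholder endpoint and credentials -- substitute the values shown
# when your deployment was created.
ES_ENDPOINT = "https://my-deployment.es.us-east-1.aws.found.io:443"
USERNAME = "elastic"
PASSWORD = "<password-from-deployment-creation>"


def make_request(path: str) -> urllib.request.Request:
    """Build an authenticated request against the deployment's REST API."""
    token = base64.b64encode(f"{USERNAME}:{PASSWORD}".encode()).decode()
    req = urllib.request.Request(ES_ENDPOINT + path)
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/json")
    return req


# For example, a cluster-health check would go to /_cluster/health.
req = make_request("/_cluster/health")
```

Official client libraries handle this for you, but the same username/password pair is what they need.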

Ingest data with the Elastic web crawler

Now that you've created your deployment, it's time to get data into Elastic. Let's do this by using Elastic's web crawler. First, select the Build a semantic search experience tile.

Next, to set up semantic search, you'll see a page where you can get started with either of the following:

  • Elastic Learned Sparse Encoder
  • Vector Search
  • NLP Enrichment

All of these capabilities and more are part of the Elasticsearch Relevance Engine (ESRE).

For the purpose of this guide, let's go through setting up semantic search with both the Elastic Learned Sparse Encoder and vector search.

NOTE: If you're getting started with semantic search and want to search text, you should try the Elastic Learned Sparse Encoder guide first. The kNN Vector Search guide may be more suitable for users who meet some of these criteria:

  • Have access to a data science team or data science skill set
  • Have determined the built-in Elastic Learned Sparse Encoder semantic search model will not cover their use cases
  • Are experienced in comparing embedding models, and potentially fine-tuning ML models
  • Are aware that fast kNN search may require significant RAM resources

If you're ready to get started, select your preferred method to build an AI search-powered application.

For both methods, you'll get started by selecting Create an index. From here you can select the web crawler to get started with ingesting your data.

To set up the web crawler, check out this guided tour or follow the instructions below:

Now create an index. For the purpose of this guide, we're ingesting blogs across elastic.co.

Web crawler search index

Once you give your index a name, select Create index. Next, you'll Validate Domain and then select Add domain.

After you add the domain, select Edit in the lower right so you can add a subdomain if needed.

Add a domain to your index

Next, you'll select Crawl rules and add your crawl rules as seen below.*

Manage domains

*Because the page you want to crawl links to other pages, you should add additional rules to disallow those links and any others you don't want crawled.
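Conceptually, crawl rules are an ordered list of allow/disallow policies that the crawler evaluates top to bottom, with the first match winning. A minimal sketch of that behavior (the field names and rules here are illustrative, not the crawler's exact schema):

```python
# Illustrative model of ordered crawl rules: first matching rule wins.
# Field names are for explanation only; the UI manages the real schema.
crawl_rules = [
    {"policy": "allow", "rule": "begins_with", "path": "/blog/"},
    {"policy": "disallow", "rule": "begins_with", "path": "/"},  # everything else
]


def is_allowed(path: str) -> bool:
    """Return True if the first matching rule allows crawling `path`."""
    for rule in crawl_rules:
        if rule["rule"] == "begins_with" and path.startswith(rule["path"]):
            return rule["policy"] == "allow"
    return True  # default when no rule matches: allow
```

This is why rule order matters in the UI: a broad disallow placed above a narrow allow would block everything.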

 

Also, some of the fields you'll select later, such as body_content, exceed the 512-token limit, so you should leverage Extraction rules to capture only the relevant parts of the blogs.

When you select Extraction rules, click Add content extraction rule.

Next, under Rule description give it a name that will help others understand what data this rule will extract. For the purpose of this guide, let's call it "main."

Now, select Apply to all URLs, then Add content fields; a flyout will appear. Fill in the following criteria:

  • Document field:
    • Field name: main
  • Source:
    • Extract content from: HTML element
    • CSS selector or XPath expression: main
  • Content
    • Use content from: Extracted Value
    • Store extracted content as: A string

Once you've filled in these criteria, click Save, then Save rule.
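The flyout fields above map onto a rule definition along these lines (a sketch for orientation only; the crawler manages the actual stored schema, and these field names are illustrative):

```python
# Illustrative representation of the content extraction rule configured above.
# Key names are for explanation; the crawler stores its own schema.
extraction_rule = {
    "description": "main",                      # Rule description
    "url_filters": [],                          # empty == Apply to all URLs
    "rules": [
        {
            "field_name": "main",               # Document field > Field name
            "source_type": "html_element",      # Extract content from: HTML element
            "selector": "main",                 # CSS selector or XPath expression
            "content_from": "extracted_value",  # Use content from: Extracted Value
            "value_type": "string",             # Store extracted content as: A string
        }
    ],
}
```

The net effect: for every crawled page, the text inside the page's `<main>` element is stored as a string in a document field named main.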


Working with Elasticsearch and ESRE

Ingest and search your data using Elastic Learned Sparse Encoder

If you reviewed the recommended criteria above for getting started with vector search and that's your preferred method, navigate to the Search your data using kNN vector search guide on the left and follow the instructions.

Otherwise, if you prefer to use the Elastic Learned Sparse Encoder, Elastic's out-of-the-box semantic search model, check out the instructions below.

To do this, select Pipelines, then unlock your custom pipelines by selecting Copy and customize at the top. Next, under Machine Learning Inference Pipelines, select Deploy to download the model and install it on your Elasticsearch deployment.

Once it deploys, select Start single-threaded, then + Add Inference Pipeline. Next, you'll do the following:

  1. Select a new or existing pipeline
  2. Give it a name
  3. In the Select trained ML Model dropdown, select ELSER Text Expansion, then click Continue

Now, you'll need to select the fields where you're going to apply the ELSER text expansion. Select "title" and "main" as source fields, then Add.

Next, click Continue.

Skip the Test your pipeline results step by clicking Continue, then Create pipeline.
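Under the hood, the wizard registers an ingest pipeline containing an inference processor that runs ELSER over your source fields. A sketch of what that request body looks like (the pipeline name, target field, and some processor options here are assumptions that can vary by Elastic version; the "main" field comes from the extraction rule in this guide):

```python
# Sketch of the ingest pipeline the wizard creates for ELSER text expansion.
# Pipeline/target-field names are illustrative; options may differ by version.
pipeline = {
    "description": "ELSER text expansion for the web crawler index",
    "processors": [
        {
            "inference": {
                "model_id": ".elser_model_1",   # the deployed ELSER model
                "target_field": "ml.inference.main_expanded",
                "field_map": {"main": "text_field"},  # source field -> model input
                "inference_config": {
                    "text_expansion": {"results_field": "predicted_value"}
                },
            }
        }
    ],
}
# Registering it by hand would be roughly:
#   PUT _ingest/pipeline/my-elser-pipeline  (with `pipeline` as the body)
```

Each document indexed through this pipeline gains an expanded-terms field that semantic queries can target.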

Now that you've created your pipeline, select Crawl in the upper right corner, then Crawl all domains on this index.

Now it's time to search for the information you're looking for. There are two recommended ways to do this:

  • Using the Dev Tools
  • Leveraging the Search Application functionality as an endpoint for your application

When to use each:

  • If you're a developer implementing search (e.g., for your web application), you should use the Dev Tools to test and refine search results from your indexed data.
  • If you want a search endpoint that your own application can send search requests to and receive results from, you should use the Search Application functionality.
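In Dev Tools, an ELSER search uses a text_expansion query against the expanded field the inference pipeline wrote. A sketch of the request body (the index name and the expanded-field path below are assumptions that follow the naming used in this guide):

```python
import json

# Sketch of an ELSER semantic search body for Dev Tools.
# The expanded-field path assumes the pipeline wrote to
# ml.inference.main_expanded with results in predicted_value.
search_body = {
    "query": {
        "text_expansion": {
            "ml.inference.main_expanded.predicted_value": {
                "model_id": ".elser_model_1",
                "model_text": "how do I set up semantic search?",
            }
        }
    }
}

# In Dev Tools this would be sent as, e.g.:
#   GET search-elastic-blogs/_search  (with `search_body` as the JSON body)
print(json.dumps(search_body, indent=2))
```

A Search Application wraps the same kind of query behind a stable endpoint, so your application sends only the user's query text.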

Check out the two videos below for a short walkthrough of how to leverage the Dev Tools and the Search Application functionality. You can also learn how by exploring this guided tour.


Next steps

Thanks for taking the time to set up semantic search for your data with Elastic Cloud. As you begin your journey with Elastic, take time to understand the operational, security, and data components you'll manage as a user when you deploy across your environment.