Tutorial: semantic search with semantic_text
This functionality is in beta and is subject to change. The design and code are less mature than official GA features and are being provided as-is with no warranties. Beta features are not subject to the support SLA of official GA features.
This tutorial shows you how to use the semantic text feature to perform semantic search on your data.
Semantic text simplifies the inference workflow by performing inference at ingestion time and providing sensible default values automatically. You don't need to define model-related settings and parameters or create inference ingest pipelines.
The recommended way to use semantic search in the Elastic Stack is to follow the semantic_text workflow. When you need more control over indexing and query settings, you can still use the complete inference workflow (refer to this tutorial to review the process).
This tutorial uses the elser service for demonstration, but you can use any service and its supported models offered by the Inference API.
Requirements
To use the semantic_text field type, you must have an inference endpoint deployed in your cluster using the Create inference API.
Create the inference endpoint
Create an inference endpoint by using the Create inference API:
resp = client.inference.put( task_type="sparse_embedding", inference_id="my-elser-endpoint", inference_config={ "service": "elser", "service_settings": { "num_allocations": 1, "num_threads": 1 } }, ) print(resp)
const response = await client.inference.put({ task_type: "sparse_embedding", inference_id: "my-elser-endpoint", inference_config: { service: "elser", service_settings: { num_allocations: 1, num_threads: 1, }, }, }); console.log(response);
PUT _inference/sparse_embedding/my-elser-endpoint { "service": "elser", "service_settings": { "num_allocations": 1, "num_threads": 1 } }
The task type is sparse_embedding in the path, as the elser service creates sparse vector embeddings.
The inference_id in the path, my-elser-endpoint, is the unique identifier of the inference endpoint.
You might see a 502 Bad Gateway error in the response when using the Kibana Console. This error usually just reflects a timeout while the model downloads in the background. You can check the download progress in the Machine Learning UI. If you are using the Python client, you can set the timeout parameter to a higher value.
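Since the model download may finish after the initial request times out, one option is to poll until the endpoint is ready before moving on. The sketch below is a generic polling helper, not part of the tutorial or the Elasticsearch clients; you would supply your own `check` function (for example, one that queries the trained model statistics through your client) and the name `wait_until` is purely illustrative:

```python
import time

def wait_until(check, timeout=300.0, interval=5.0):
    """Poll `check` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if check():
            return True
        time.sleep(interval)
    return False
```

For example, `wait_until(model_is_deployed, timeout=600)` would retry a hypothetical `model_is_deployed` check every five seconds for up to ten minutes.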
Create the index mapping
You must create the mapping of the destination index: the index that will contain the embeddings that the inference endpoint generates based on your input text. The destination index must have a field with the semantic_text field type to index the output of the used inference endpoint.
resp = client.indices.create( index="semantic-embeddings", mappings={ "properties": { "semantic_text": { "type": "semantic_text", "inference_id": "my-elser-endpoint" }, "content": { "type": "text", "copy_to": "semantic_text" } } }, ) print(resp)
const response = await client.indices.create({ index: "semantic-embeddings", mappings: { properties: { semantic_text: { type: "semantic_text", inference_id: "my-elser-endpoint", }, content: { type: "text", copy_to: "semantic_text", }, }, }, }); console.log(response);
PUT semantic-embeddings { "mappings": { "properties": { "semantic_text": { "type": "semantic_text", "inference_id": "my-elser-endpoint" }, "content": { "type": "text", "copy_to": "semantic_text" } } } }
The name of the field to contain the generated embeddings.
The field to contain the embeddings is a semantic_text field.
The inference_id is the inference endpoint you created in the previous step. It will be used to generate the embeddings based on the input text.
The field to store the text reindexed from a source index in the Reindex the data step.
The textual data stored in the content field will be copied to the semantic_text field and processed by the inference endpoint, because of the copy_to parameter set in the mapping.
Load data
In this step, you load the data that you will later use to create embeddings from.
Use the msmarco-passagetest2019-top1000 data set, which is a subset of the MS MARCO Passage Ranking data set. It consists of 200 queries, each accompanied by a list of relevant text passages. All unique passages, along with their IDs, have been extracted from that data set and compiled into a TSV file.
Download the file and upload it to your cluster using the Data Visualizer in the Machine Learning UI. Assign the name id to the first column and content to the second column. The index name is test-data. Once the upload is complete, you can see an index named test-data with 182469 documents.
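The Data Visualizer is the route this tutorial takes, but you could also parse the TSV yourself and index it with the Python client's bulk helpers. The sketch below shows only the parsing step, which runs without a cluster; `tsv_to_actions` is a hypothetical helper name, and it assumes the file has no header row, with the ID in the first column and the passage text in the second:

```python
import csv
import io

def tsv_to_actions(tsv_text, index="test-data"):
    """Turn `id<TAB>content` rows into bulk-index actions."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [
        {"_index": index, "_id": row[0], "_source": {"id": row[0], "content": row[1]}}
        for row in reader
    ]

# Two made-up rows standing in for the real TSV file:
actions = tsv_to_actions("1\tfirst passage\n2\tsecond passage\n")
```

The resulting list could then be passed to `elasticsearch.helpers.bulk(client, actions)` against a running cluster.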
Reindex the data
Create the embeddings from the text by reindexing the data from the test-data index to the semantic-embeddings index. The data in the content field will be reindexed into the content field of the destination index. The content field data will be copied to the semantic_text field as a result of the copy_to parameter set in the index mapping creation step. The copied data will be processed by the inference endpoint associated with the semantic_text field.
resp = client.reindex( wait_for_completion=False, source={ "index": "test-data", "size": 10 }, dest={ "index": "semantic-embeddings" }, ) print(resp)
const response = await client.reindex({ wait_for_completion: "false", source: { index: "test-data", size: 10, }, dest: { index: "semantic-embeddings", }, }); console.log(response);
POST _reindex?wait_for_completion=false { "source": { "index": "test-data", "size": 10 }, "dest": { "index": "semantic-embeddings" } }
The default batch size for reindexing is 1000. Reducing size to a smaller number makes the reindexing process report progress more frequently, which enables you to follow it closely and detect errors early.
The call returns a task ID to monitor the progress:
resp = client.tasks.get( task_id="<task_id>", ) print(resp)
const response = await client.tasks.get({ task_id: "<task_id>", }); console.log(response);
GET _tasks/<task_id>
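The task response includes a completed flag and a status object with counters for processed documents. As a rough sketch, you could summarize progress from such a response; `reindex_progress` is a hypothetical helper, and the sample payload below is abbreviated and made up for illustration (field names follow the standard reindex task response):

```python
def reindex_progress(task_response):
    """Return (completed, fraction_done) from a _tasks/<task_id> response."""
    status = task_response["task"]["status"]
    done = status["created"] + status.get("updated", 0) + status.get("deleted", 0)
    total = status["total"]
    fraction = done / total if total else 1.0
    return task_response.get("completed", False), fraction

# Hypothetical, abbreviated task response for illustration:
sample = {
    "completed": False,
    "task": {"status": {"created": 500, "updated": 0, "deleted": 0, "total": 182469}},
}
completed, fraction = reindex_progress(sample)
print(f"completed={completed}, {fraction:.2%} done")
```

Combined with a polling loop, this lets a script wait for the reindex task instead of checking the task API by hand.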
It is recommended to cancel the reindexing process if you don't want to wait until it is fully complete, which might take a long time for an inference endpoint with few assigned resources:
resp = client.tasks.cancel( task_id="<task_id>", ) print(resp)
const response = await client.tasks.cancel({ task_id: "<task_id>", }); console.log(response);
POST _tasks/<task_id>/_cancel
Semantic search
After the data set has been enriched with the embeddings, you can query the data
using semantic search. Provide the semantic_text
field name and the query text
in a semantic
query type. The inference endpoint used to generate the embeddings
for the semantic_text
field will be used to process the query text.
resp = client.search( index="semantic-embeddings", query={ "semantic": { "field": "semantic_text", "query": "How to avoid muscle soreness while running?" } }, ) print(resp)
const response = await client.search({ index: "semantic-embeddings", query: { semantic: { field: "semantic_text", query: "How to avoid muscle soreness while running?", }, }, }); console.log(response);
GET semantic-embeddings/_search { "query": { "semantic": { "field": "semantic_text", "query": "How to avoid muscle soreness while running?" } } }
As a result, you receive the top 10 documents that are closest in meaning to the
query from the semantic-embeddings index:
"hits": [ { "_index": "semantic-embeddings", "_id": "6DdEuo8B0vYIvzmhoEtt", "_score": 24.972616, "_source": { "semantic_text": { "inference": { "inference_id": "my-elser-endpoint", "model_settings": { "task_type": "sparse_embedding" }, "chunks": [ { "text": "There are a few foods and food groups that will help to fight inflammation and delayed onset muscle soreness (both things that are inevitable after a long, hard workout) when you incorporate them into your postworkout eats, whether immediately after your run or at a meal later in the day. Advertisement. Advertisement.", "embeddings": { (...) } } ] } }, "id": 1713868, "content": "There are a few foods and food groups that will help to fight inflammation and delayed onset muscle soreness (both things that are inevitable after a long, hard workout) when you incorporate them into your postworkout eats, whether immediately after your run or at a meal later in the day. Advertisement. Advertisement." } }, { "_index": "semantic-embeddings", "_id": "-zdEuo8B0vYIvzmhplLX", "_score": 22.143118, "_source": { "semantic_text": { "inference": { "inference_id": "my-elser-endpoint", "model_settings": { "task_type": "sparse_embedding" }, "chunks": [ { "text": "During Your Workout. There are a few things you can do during your workout to help prevent muscle injury and soreness. According to personal trainer and writer for Iron Magazine, Marc David, doing warm-ups and cool-downs between sets can help keep muscle soreness to a minimum.", "embeddings": { (...) } } ] } }, "id": 3389244, "content": "During Your Workout. There are a few things you can do during your workout to help prevent muscle injury and soreness. According to personal trainer and writer for Iron Magazine, Marc David, doing warm-ups and cool-downs between sets can help keep muscle soreness to a minimum." 
} }, { "_index": "semantic-embeddings", "_id": "77JEuo8BdmhTuQdXtQWt", "_score": 21.506052, "_source": { "semantic_text": { "inference": { "inference_id": "my-elser-endpoint", "model_settings": { "task_type": "sparse_embedding" }, "chunks": [ { "text": "This is especially important if the soreness is due to a weightlifting routine. For this time period, do not exert more than around 50% of the level of effort (weight, distance and speed) that caused the muscle groups to be sore.", "embeddings": { (...) } } ] } }, "id": 363742, "content": "This is especially important if the soreness is due to a weightlifting routine. For this time period, do not exert more than around 50% of the level of effort (weight, distance and speed) that caused the muscle groups to be sore." } }, (...) ]
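In an application, you would typically pull just the score and the original text out of the response body rather than work with the full inference metadata. A minimal sketch, assuming the standard search response shape (`hits.hits`, each with `_score` and `_source`); `top_passages` is a hypothetical helper and the stand-in response below is abbreviated from the output above:

```python
def top_passages(search_response, limit=3):
    """Extract (score, content) pairs from a search response body."""
    return [
        (hit["_score"], hit["_source"]["content"])
        for hit in search_response["hits"]["hits"][:limit]
    ]

# Abbreviated stand-in for the response shown above:
response = {
    "hits": {
        "hits": [
            {"_score": 24.97, "_source": {"content": "There are a few foods..."}},
            {"_score": 22.14, "_source": {"content": "During Your Workout..."}},
        ]
    }
}
for score, content in top_passages(response):
    print(f"{score:.2f}  {content[:40]}")
```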