Building software in any programming language, including Go, is committing to a lifetime of learning. Through her university and working career, Carly has dabbled in many programming languages and technologies, including the latest and greatest implementations of vector search. But that wasn't enough! So recently Carly started playing with Go, too.
Just like animals, programming languages, and your friendly author, search has undergone an evolution of different practices that can be difficult to decide between for your own search use case. In this blog, we'll share an overview of vector search along with examples of each approach using Elasticsearch and the Elasticsearch Go client. These examples will show you how to find gophers and determine what they eat using vector search in Elasticsearch and Go.
Prerequisites
To follow along with this example, ensure the following prerequisites are met:
- Installation of Go version 1.21 or later
- Creation of your own Go repo with the Elasticsearch Go client installed
- Creation of your own Elasticsearch cluster, populated with a set of rodent-based Wikipedia pages, including the page for our friendly Gopher
Connecting to Elasticsearch
In our examples, we shall make use of the Typed API offered by the Go client. Establishing a secure connection for any query requires configuring the client using either:
- Cloud ID and API key if making use of Elastic Cloud.
- Cluster URL, username, password and the certificate.
Connecting to our cluster located on Elastic Cloud would look like this:
func GetElasticsearchClient() (*elasticsearch.TypedClient, error) {
	var cloudID = os.Getenv("ELASTIC_CLOUD_ID")
	var apiKey = os.Getenv("ELASTIC_API_KEY")

	var es, err = elasticsearch.NewTypedClient(elasticsearch.Config{
		CloudID: cloudID,
		APIKey:  apiKey,
		Logger:  &elastictransport.ColorLogger{Output: os.Stdout, EnableRequestBody: true, EnableResponseBody: true},
	})
	if err != nil {
		return nil, fmt.Errorf("unable to connect: %w", err)
	}
	return es, nil
}
The client connection can then be used for vector search, as shown in subsequent sections.
Vector search
Vector search attempts to solve the limitations of keyword matching by converting the search problem into a mathematical comparison using vectors. The document embedding process has an additional stage of converting the document, using a model, into a dense vector representation: essentially an array of numbers. The advantage of this approach is that you can search non-text documents, such as images and audio, by translating them into a vector alongside a query.
In simple terms, vector search is a set of vector distance calculations. In the below illustration, the vector representation of our query Go Gopher is compared against the documents in the vector space, and the closest results (denoted by the constant k) are returned.
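To make the distance idea concrete, here is a minimal, hypothetical Go sketch (not part of the blog's repo, and using made-up three-dimensional vectors rather than real embeddings) that computes cosine similarity, the same metric our index mapping below declares, between a query vector and two document vectors:

```go
package main

import (
	"fmt"
	"math"
)

// cosineSimilarity returns the cosine of the angle between two
// equal-length vectors: 1 means identical direction, 0 means unrelated.
func cosineSimilarity(a, b []float32) float64 {
	var dot, normA, normB float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		normA += float64(a[i]) * float64(a[i])
		normB += float64(b[i]) * float64(b[i])
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

func main() {
	query := []float32{0.9, 0.1, 0.2}    // e.g. "Go Gopher"
	gopher := []float32{0.8, 0.2, 0.1}   // document close to the query
	squirrel := []float32{0.1, 0.9, 0.3} // document further away

	// The gopher document scores higher than the squirrel document
	fmt.Printf("gopher:   %.3f\n", cosineSimilarity(query, gopher))
	fmt.Printf("squirrel: %.3f\n", cosineSimilarity(query, squirrel))
}
```

A k-nearest-neighbor (kNN) search is then just a matter of ranking documents by this score and keeping the top k.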
Depending on the approach used to generate the embeddings for your documents, there are two different ways to find out what gophers eat.
Approach 1: Bring your own model
With a Platinum license, it's possible to generate the embeddings within Elasticsearch by uploading the model and using the inference API. There are six steps involved in setting up the model:
- Select a PyTorch model to upload from a model repository. For this example, we're using the sentence-transformers/msmarco-MiniLM-L-12-v3 from Hugging Face to generate the embeddings.
- Load the model into Elastic using the Eland machine learning client for Python, passing the credentials for our Elasticsearch cluster and the task type text_embedding. If you don't have Eland installed, you can run the import step using Docker, as shown below:
docker run -it --rm --network host \
docker.elastic.co/eland/eland \
eland_import_hub_model \
--cloud-id $ELASTIC_CLOUD_ID \
--es-api-key $ELASTIC_API_KEY \
--hub-model-id sentence-transformers/msmarco-MiniLM-L-12-v3 \
--task-type text_embedding
- Once uploaded, quickly test the model sentence-transformers__msmarco-minilm-l-12-v3 with a sample document to ensure the embeddings are generated as expected:
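One way to run this check (a sketch, not the post's original snippet; the _infer endpoint is available on recent Elasticsearch versions) is from Kibana Dev Tools, passing a sample document in the text_field the model expects:

```
POST _ml/trained_models/sentence-transformers__msmarco-minilm-l-12-v3/_infer
{
  "docs": [
    {
      "text_field": "What do Gophers eat?"
    }
  ]
}
```

A successful response contains a predicted_value array of 384 floating-point numbers, matching the dims we declare in the index mapping below.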
- Create an ingest pipeline containing an inference processor. This will allow the vector representation to be generated using the uploaded model:
PUT _ingest/pipeline/search-rodents-vector-embedding-pipeline
{
  "processors": [
    {
      "inference": {
        "model_id": "sentence-transformers__msmarco-minilm-l-12-v3",
        "target_field": "text_embedding",
        "field_map": {
          "body_content": "text_field"
        }
      }
    }
  ]
}
- Create a new index containing the field text_embedding.predicted_value of type dense_vector to store the vector embeddings generated for each document:
PUT vector-search-rodents
{
  "mappings": {
    "properties": {
      "text_embedding.predicted_value": {
        "type": "dense_vector",
        "dims": 384,
        "index": true,
        "similarity": "cosine"
      },
      "text": {
        "type": "text"
      }
    }
  }
}
- Reindex the documents using the newly created ingest pipeline to generate the text embeddings as the additional field text_embedding.predicted_value on each document:
POST _reindex
{
  "source": {
    "index": "search-rodents"
  },
  "dest": {
    "index": "vector-search-rodents",
    "pipeline": "search-rodents-vector-embedding-pipeline"
  }
}
Now we can use the Knn option on the same search API with the new index vector-search-rodents, as shown in the below example:
func VectorSearch(client *elasticsearch.TypedClient, term string) ([]Rodent, error) {
	var k = 10
	var numCandidates = 10

	res, err := client.Search().
		Index("vector-search-rodents").
		Knn(types.KnnSearch{
			// Field in the document containing the vector
			Field: "text_embedding.predicted_value",
			// Number of neighbors to return
			K: &k,
			// Number of candidates to evaluate per shard
			NumCandidates: &numCandidates,
			// Generate the query vector using the same model used in the inference processor
			QueryVectorBuilder: &types.QueryVectorBuilder{
				TextEmbedding: &types.TextEmbedding{
					ModelId:   "sentence-transformers__msmarco-minilm-l-12-v3",
					ModelText: term,
				},
			},
		}).
		Do(context.Background())
	if err != nil {
		return nil, fmt.Errorf("error in rodents vector search: %w", err)
	}
	return getRodents(res.Hits.Hits)
}
Converting the JSON result object via unmarshalling is done in exactly the same way as in the keyword search example. The options K and NumCandidates allow us to configure the number of neighbor documents to return and the number of candidates to consider per shard. Note that increasing the number of candidates increases the accuracy of the results, but leads to a longer-running query as more comparisons are performed.
When the code is executed using the query What do Gophers eat?, the results returned look similar to the below, highlighting that the Gopher article contains the information requested, unlike the prior keyword search:
[
{ID:64f74ecd4acb3df024d91112 Title:Gopher - Wikipedia Url:https://en.wikipedia.org/wiki/Gopher}
{ID:64f74ed34acb3d71aed91fcd Title:Squirrel - Wikipedia Url:https://en.wikipedia.org/wiki/Squirrel}
//Other results omitted
]
Approach 2: Hugging Face inference API
Another option is to generate these same embeddings outside of Elasticsearch and ingest them as part of your document. As this option does not make use of an Elasticsearch machine learning node, it can be done on the free tier.
Hugging Face exposes a free-to-use, rate-limited inference API that, with an account and API token, can be used to generate the same embeddings manually for experimentation and prototyping to help you get started. It is not recommended for production use. Invoking your own models locally to generate embeddings or using the paid API can also be done using a similar approach.
In the below function GetTextEmbeddingForQuery, we use the inference API against our query string to generate the vector, returned from a POST request to the endpoint:
// HuggingFace text embedding helper
func GetTextEmbeddingForQuery(term string) ([]float32, error) {
	// HTTP endpoint
	model := "sentence-transformers/msmarco-MiniLM-L-12-v3"
	posturl := fmt.Sprintf("https://api-inference.huggingface.co/pipeline/feature-extraction/%s", model)

	// JSON body
	body := []byte(fmt.Sprintf(`{
		"inputs": "%s",
		"options": {"wait_for_model": true}
	}`, term))

	// Create an HTTP POST request
	r, err := http.NewRequest("POST", posturl, bytes.NewBuffer(body))
	if err != nil {
		return nil, err
	}
	token := os.Getenv("HUGGING_FACE_TOKEN")
	r.Header.Add("Authorization", fmt.Sprintf("Bearer %s", token))

	client := &http.Client{}
	res, err := client.Do(r)
	if err != nil {
		return nil, err
	}
	defer res.Body.Close()

	var post []float32
	if derr := json.NewDecoder(res.Body).Decode(&post); derr != nil {
		return nil, derr
	}
	return post, nil
}
The resulting vector, of type []float32, is then passed as a QueryVector instead of using the QueryVectorBuilder option to leverage the model previously uploaded to Elastic.
func VectorSearchWithGeneratedQueryVector(client *elasticsearch.TypedClient, term string) ([]Rodent, error) {
	vector, err := GetTextEmbeddingForQuery(term)
	if err != nil {
		return nil, err
	}
	if vector == nil {
		return nil, fmt.Errorf("unable to generate vector for query %q", term)
	}

	var k = 10
	var numCandidates = 10

	res, err := client.Search().
		Index("vector-search-rodents").
		Knn(types.KnnSearch{
			// Field in the document containing the vector
			Field: "text_embedding.predicted_value",
			// Number of neighbors to return
			K: &k,
			// Number of candidates to evaluate per shard
			NumCandidates: &numCandidates,
			// Query vector returned from the Hugging Face inference API
			QueryVector: vector,
		}).
		Do(context.Background())
	if err != nil {
		return nil, err
	}
	return getRodents(res.Hits.Hits)
}
Note that the K and NumCandidates options remain the same irrespective of which of the two approaches you use, and that the same results are generated, as we are using the same model to generate the embeddings.
Conclusion
Here we've discussed how to perform vector search in Elasticsearch using the Elasticsearch Go client. Check out the GitHub repo for all the code in this series. Follow on to part 3, which covers combining vector search in Go with the keyword search capabilities from part one.
Until then, happy gopher hunting!
Resources
Ready to try this out on your own? Start a free trial.
Elasticsearch has integrations for tools from LangChain, Cohere and more. Join our advanced semantic search webinar to build your next GenAI app!