Perform dense embedding inference on the service (Technical preview)

POST /_inference/embedding/{inference_id}

Path parameters

  • inference_id string Required

    The inference endpoint ID

Query parameters

  • timeout string

    Specifies the amount of time to wait for the inference request to complete.

    Accepts a time duration (for example, 30s); the special values -1 and 0 are also allowed.

Body Required (application/json)

  • input string | array[string] | object | array[object]

    Inference input. Either a string, an array of strings, a content object, or an array of content objects.

    string example:

    "input": "Some text"
    

    string array example:

    "input": ["Some text", "Some more text"]
    

    content object example:

    "input": {
        "content": {
          "type": "image",
          "format": "base64",
          "value": "data:image/jpeg;base64,..."
        }
      }
    

    content object array example:

    "input": [
      {
        "content": {
          "type": "text",
          "format": "text",
          "value": "Some text to generate an embedding"
        }
      },
      {
        "content": {
          "type": "image",
          "format": "base64",
          "value": "data:image/jpeg;base64,..."
        }
      }
    ]
    
  • input_type string

    The input data type for the embedding model. Possible values include:

    • SEARCH
    • INGEST
    • CLASSIFICATION
    • CLUSTERING

    Not all models support all values; unsupported values trigger a validation exception. Accepted values depend on the configured inference service. Refer to the relevant service-specific documentation for more information.


    The input_type parameter specified on the root level of the request body will take precedence over the input_type parameter specified in task_settings.

  • task_settings object

    Task settings for the individual inference request. These settings are specific to the task type you specified and override the task settings specified when initializing the service.
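As a sketch of assembling this request body, the following Python uses only the standard library; the host, endpoint name, and helper function are placeholders, not part of this reference:

```python
import json
from urllib import request

def build_embedding_request(host, inference_id, inputs,
                            input_type=None, task_settings=None, timeout=None):
    """Assemble (but do not send) a POST for /_inference/embedding/{inference_id}.

    `host` and `inference_id` are placeholders; supply your own endpoint.
    """
    body = {"input": inputs}
    if input_type is not None:
        body["input_type"] = input_type  # e.g. "SEARCH" or "INGEST"
    if task_settings is not None:
        body["task_settings"] = task_settings
    url = f"{host}/_inference/embedding/{inference_id}"
    if timeout is not None:
        url += f"?timeout={timeout}"  # e.g. "30s"
    return request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

# A body mixing a text and an image content object, as in the examples above.
req = build_embedding_request(
    "http://localhost:9200",           # placeholder host
    "my-multimodal-endpoint",          # placeholder inference ID
    [
        {"content": {"type": "text", "format": "text",
                     "value": "Some text to generate an embedding"}},
        {"content": {"type": "image", "format": "base64",
                     "value": "data:image/jpeg;base64,..."}},
    ],
    input_type="SEARCH",
)
```

Sending the request (for example with `urllib.request.urlopen(req)`) is left out so the sketch stays self-contained.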

Responses

  • 200 application/json
    • embeddings_bytes array[object]

      The dense embedding result object for byte representation

      • embedding array[number] Required

        Dense embedding results containing bytes are represented as dense vectors of bytes.

    • embeddings_bits array[object]

      The dense embedding result object for bit representation

      • embedding array[number] Required

        Dense embedding results containing bits are represented as dense vectors of bytes.

    • embeddings array[object]

      The dense embedding result object for float representation

      • embedding array[number] Required

        Dense embedding results are represented as dense vectors of floats.
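A response carries one of the three arrays above depending on the element type. As a hedged sketch (assuming only one of the three arrays is populated per response; the helper name is illustrative), the vectors can be extracted like this:

```python
def extract_embeddings(response_body):
    """Return the list of embedding vectors from a dense embedding response.

    Checks `embeddings` (floats), then `embeddings_bytes`, then
    `embeddings_bits`, and returns whichever array is present.
    """
    for key in ("embeddings", "embeddings_bytes", "embeddings_bits"):
        if key in response_body:
            return [item["embedding"] for item in response_body[key]]
    raise ValueError("no embedding array in response")

# A minimal float-embedding response, shaped like the 200 schema above.
sample = {"embeddings": [{"embedding": [0.1, 0.2]},
                         {"embedding": [0.3, 0.4]}]}
vectors = extract_embeddings(sample)
```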

POST /_inference/embedding/{inference_id}
curl \
 --request POST 'http://api.example.com/_inference/embedding/{inference_id}' \
 --header "Content-Type: application/json" \
 --data '{
   "input": [
     {
       "content": {
         "type": "image",
         "format": "base64",
         "value": "data:image/jpeg;base64,..."
       }
     },
     {
       "content": {
         "type": "text",
         "value": "Some text to create an embedding"
       }
     }
   ]
 }'
Request examples
Run `POST _inference/embedding/my-multimodal-endpoint` to generate embeddings from the example text and image.
{
  "input": [
      {
          "content": {
              "type": "image",
              "format": "base64",
              "value": "data:image/jpeg;base64,..."
          }
      },
      {
          "content": {
              "type": "text",
              "value": "Some text to create an embedding"
          }
      }
  ]
}
Run `POST _inference/embedding/my-text-only-endpoint` to generate embeddings from the example text.
{
  "input": ["The first text", "The second text"]
}
Response examples (200)
An abbreviated response from `POST _inference/embedding/my-multimodal-endpoint`.
{
  "embeddings": [
    {
      "embedding": [
        -0.0189209,
        -0.04174805,
        0.00854492,
        0.01556396,
        0.01928711,
        -0.00616455,
        -0.00460815,
        0.01477051,
        -0.00656128,
        0.05419922
      ]
    },
    {
      "embedding": [
        -0.01379395,
        -0.02368164,
        0.01068115,
        0.0279541,
        0.01043701,
        -7.7057E-4,
        0.04150391,
        0.00836182,
        -0.01135254,
        0.0246582
      ]
    }
  ]
}
An abbreviated response from `POST _inference/embedding/my-text-only-endpoint`.
{
  "embeddings": [
    {
      "embedding": [
        0.00854492,
        -0.00616455,
        -0.0189209,
        0.01556396,
        -0.00460815,
        0.01477051,
        -0.04174805,
        0.01928711,
        -0.00656128,
        0.05419922
      ]
    },
    {
      "embedding": [
        -0.01135254,
        0.0279541,
        -0.02368164,
        0.01068115,
        0.01043701,
        0.04150391,
        0.00836182,
        -7.7057E-4,
        -0.01379395,
        0.0246582
      ]
    }
  ]
}
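Embedding vectors like the ones in these responses are typically compared with cosine similarity. A minimal pure-Python sketch, reusing the two vectors from the multimodal example response above:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# The image and text embeddings from the multimodal example response.
image_vec = [-0.0189209, -0.04174805, 0.00854492, 0.01556396, 0.01928711,
             -0.00616455, -0.00460815, 0.01477051, -0.00656128, 0.05419922]
text_vec = [-0.01379395, -0.02368164, 0.01068115, 0.0279541, 0.01043701,
            -7.7057e-4, 0.04150391, 0.00836182, -0.01135254, 0.0246582]
score = cosine_similarity(image_vec, text_vec)
```

Note that the responses here are abbreviated; real vectors have the full dimensionality of the model, so a score computed from these truncated examples is illustrative only.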