Get an inference endpoint

Added in 8.11.0

GET /_inference/{task_type}/{inference_id}

Path parameters

  • task_type string Required

    The task type.

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The inference ID.
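
The two path parameters above combine into the request path. A minimal sketch (the helper name and the validation step are illustrative, not part of the API):

```python
# Illustrative sketch: build the request path from the two path parameters.
# VALID_TASK_TYPES mirrors the allowed values listed above.
VALID_TASK_TYPES = {
    "sparse_embedding", "text_embedding", "rerank",
    "completion", "chat_completion",
}

def inference_path(task_type: str, inference_id: str) -> str:
    """Return the /_inference path for the given task type and endpoint ID."""
    if task_type not in VALID_TASK_TYPES:
        raise ValueError(f"unsupported task_type: {task_type!r}")
    return f"/_inference/{task_type}/{inference_id}"

print(inference_path("text_embedding", "my-endpoint"))
# -> /_inference/text_embedding/my-endpoint
```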

Responses

  • 200 application/json
    • endpoints array[object] Required
      • chunking_settings object

        Additional properties are allowed.
        • max_chunk_size number

          Specifies the maximum size of a chunk in words. This value cannot be higher than 300 or lower than 20 (for sentence strategy) or 10 (for word strategy).

        • overlap number

          Specifies the number of overlapping words for chunks. Applies only to the word chunking strategy. This value cannot be higher than half of max_chunk_size.

        • sentence_overlap number

          Specifies the number of overlapping sentences for chunks. Applies only to the sentence chunking strategy. It can be either 1 or 0.

        • strategy string

          Specifies the chunking strategy. It can be either sentence or word.

      • service string Required

        The service type.

      • service_settings object Required

        Additional properties are allowed.

      • task_settings object

        Additional properties are allowed.

      • inference_id string Required

        The inference ID.

      • task_type string Required

        Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.
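
The chunking_settings constraints described above can be expressed as a small validation sketch (the function, the fallback values, and the exact field names are assumptions for illustration, not part of the API):

```python
# Illustrative sketch of the documented chunking_settings constraints.
# The fallback values passed to .get() are placeholders, not documented defaults.
def validate_chunking_settings(settings: dict) -> None:
    strategy = settings.get("strategy", "sentence")
    max_chunk_size = settings.get("max_chunk_size", 250)
    # max_chunk_size: at most 300; at least 20 (sentence) or 10 (word).
    floor = 20 if strategy == "sentence" else 10
    if not floor <= max_chunk_size <= 300:
        raise ValueError(f"max_chunk_size must be within [{floor}, 300]")
    if strategy == "word":
        # overlap: word strategy only; at most half of max_chunk_size.
        if settings.get("overlap", 0) > max_chunk_size / 2:
            raise ValueError("overlap cannot exceed half of max_chunk_size")
    if strategy == "sentence":
        # sentence_overlap: sentence strategy only; either 1 or 0.
        if settings.get("sentence_overlap", 0) not in (0, 1):
            raise ValueError("sentence_overlap must be 0 or 1")

# Passes: word strategy, overlap 40 is within half of 100.
validate_chunking_settings({"strategy": "word", "max_chunk_size": 100, "overlap": 40})
```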

GET /_inference/{task_type}/{inference_id}
curl \
 --request GET "http://api.example.com/_inference/{task_type}/{inference_id}" \
 --header "Authorization: $API_KEY"
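
The response body can be inspected with ordinary JSON tooling. A minimal sketch, assuming a hypothetical response shaped like the attributes documented above (the concrete endpoint values here are invented for illustration):

```python
import json

# Hypothetical response body; field names follow the documented response
# attributes, but all values are illustrative.
raw = """
{
  "endpoints": [
    {
      "inference_id": "my-elser",
      "task_type": "sparse_embedding",
      "service": "elasticsearch",
      "service_settings": {"num_allocations": 1},
      "chunking_settings": {"strategy": "sentence", "max_chunk_size": 250}
    }
  ]
}
"""

response = json.loads(raw)
for endpoint in response["endpoints"]:
    print(endpoint["inference_id"], endpoint["task_type"], endpoint["service"])
# -> my-elser sparse_embedding elasticsearch
```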