Get an inference endpoint

Generally available; Added in 8.11.0

GET /_inference/{task_type}/{inference_id}

All methods and paths for this operation:

GET /_inference
GET /_inference/{inference_id}
GET /_inference/{task_type}/_all
GET /_inference/{task_type}/{inference_id}

This API requires the monitor_inference cluster privilege (the built-in inference_admin and inference_user roles grant this privilege).

Path parameters

  • task_type string

    The task type of the endpoint to return.

    Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.

  • inference_id string Required

    The inference ID of the endpoint to return. Using _all or * returns all endpoints with the specified task_type if one is given, or all endpoints for every task type if no task_type is given.
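The four request paths listed above follow from which path parameters are supplied. As a rough sketch only, the mapping could be written like this in Python; `build_inference_path` is a hypothetical helper invented for illustration and is not part of any Elasticsearch client library.

```python
from typing import Optional

# Task types listed in this reference.
TASK_TYPES = {
    "sparse_embedding",
    "text_embedding",
    "rerank",
    "completion",
    "chat_completion",
}


def build_inference_path(task_type: Optional[str] = None,
                         inference_id: Optional[str] = None) -> str:
    """Return the GET path for the given combination of path parameters.

    Hypothetical helper for illustration; not a client-library function.
    """
    if task_type is not None and task_type not in TASK_TYPES:
        raise ValueError(f"unknown task_type: {task_type!r}")
    if task_type is None and inference_id is None:
        # All endpoints, all task types.
        return "/_inference"
    if task_type is None:
        # A single endpoint looked up by ID only.
        return f"/_inference/{inference_id}"
    if inference_id is None:
        # All endpoints of one task type.
        return f"/_inference/{task_type}/_all"
    # A single endpoint of a specific task type.
    return f"/_inference/{task_type}/{inference_id}"


print(build_inference_path())
print(build_inference_path(inference_id="my-elser-model"))
print(build_inference_path(task_type="sparse_embedding"))
print(build_inference_path("sparse_embedding", "my-elser-model"))
```

The helper rejects task types outside the documented set before building a path, so a typo is caught locally rather than sent to the cluster.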

Responses

  • 200 application/json
    • endpoints array[object] Required

      Represents an inference endpoint as returned by the GET API.

      • chunking_settings object

        The chunking configuration object. Applies only to the sparse_embedding and text_embedding task types. Not applicable to the rerank, completion, or chat_completion task types.

      • service string Required

        The service type.

      • service_settings object Required

        Settings specific to the service.

      • task_settings object

        Task settings specific to the service and task type.

      • inference_id string Required

        The inference ID.

      • task_type string Required

        The task type.

        Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.
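To illustrate the response shape described above, the following sketch groups the entries of the `endpoints` array by `task_type`. It assumes the response has already been parsed into a Python dict; the `group_by_task_type` helper and every value in `sample_response` (including the service names and settings) are made up for illustration and were not returned by a real cluster.

```python
from collections import defaultdict

# Fields marked Required in the response schema above.
REQUIRED_FIELDS = ("inference_id", "task_type", "service", "service_settings")


def group_by_task_type(response: dict) -> dict:
    """Map each task_type to the list of inference IDs that use it.

    Hypothetical helper for illustration; not a client-library function.
    """
    grouped = defaultdict(list)
    for endpoint in response["endpoints"]:
        missing = [name for name in REQUIRED_FIELDS if name not in endpoint]
        if missing:
            raise ValueError(f"endpoint is missing required fields: {missing}")
        grouped[endpoint["task_type"]].append(endpoint["inference_id"])
    return dict(grouped)


# Made-up sample data shaped like the documented 200 response.
sample_response = {
    "endpoints": [
        {
            "inference_id": "my-elser-model",
            "task_type": "sparse_embedding",
            "service": "example-sparse-service",   # illustrative value
            "service_settings": {},                # illustrative value
            "chunking_settings": {},               # optional field
        },
        {
            "inference_id": "my-rerank-model",
            "task_type": "rerank",
            "service": "example-rerank-service",   # illustrative value
            "service_settings": {},                # illustrative value
            "task_settings": {},                   # optional field
        },
    ]
}

print(group_by_task_type(sample_response))
```

Only the four Required fields are checked; `chunking_settings` and `task_settings` are optional and may be absent from any given endpoint.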

GET /_inference/{task_type}/{inference_id}

Console

GET _inference/sparse_embedding/my-elser-model

Python

resp = client.inference.get(
    task_type="sparse_embedding",
    inference_id="my-elser-model",
)

JavaScript

const response = await client.inference.get({
  task_type: "sparse_embedding",
  inference_id: "my-elser-model",
});

Ruby

response = client.inference.get(
  task_type: "sparse_embedding",
  inference_id: "my-elser-model"
)

PHP

$resp = $client->inference()->get([
    "task_type" => "sparse_embedding",
    "inference_id" => "my-elser-model",
]);

curl

curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_inference/sparse_embedding/my-elser-model"

Java

client.inference().get(g -> g
    .inferenceId("my-elser-model")
    .taskType(TaskType.SparseEmbedding)
);