Get an inference endpoint | Elasticsearch API documentation (v8)

Get an inference endpoint Generally available; Added in 8.11.0

GET /_inference/{task_type}/_all

Path parameters

task_type string Required

The task type of the endpoint to return

Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.
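As a minimal sketch, the request path can be built from a validated task type. The helper name `inference_all_path` is hypothetical and not part of any Elasticsearch client; the accepted values are the ones listed above.

```python
# Task types accepted by the {task_type} path parameter, as listed above.
TASK_TYPES = (
    "sparse_embedding",
    "text_embedding",
    "rerank",
    "completion",
    "chat_completion",
)


def inference_all_path(task_type: str) -> str:
    """Build the request path for one task type (hypothetical helper)."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unsupported task_type: {task_type!r}")
    return f"/_inference/{task_type}/_all"


print(inference_all_path("text_embedding"))
```

Rejecting unknown values before the request is sent is a design choice of this sketch; the server performs its own validation.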

Responses

200 application/json
object
- endpoints array[object] Required
  
  object
  
  Represents an inference endpoint as returned by the GET API
  
  chunking_settings object
  
  Chunking configuration object
  
  object
  
  max_chunk_size number
  
  The maximum size of a chunk in words. The value must be at most 300, and at least 20 for the sentence strategy or at least 10 for the word strategy.
  
  Default value is 250.0.
  
  overlap number
  
  The number of overlapping words for chunks. It applies only to the word chunking strategy. This value cannot be higher than half the max_chunk_size value.
  
  Default value is 100.0.
  
  sentence_overlap number
  
  The number of overlapping sentences for chunks. It applies only to the sentence chunking strategy. It can be either 1 or 0.
  
  Default value is 1.0.
  
  strategy string
  
  The chunking strategy: sentence or word.
  
  Default value is sentence.
  
  service string Required
  
  The service type
  
  service_settings object Required
  
  Settings specific to the service
  
  task_settings object
  
  Task settings specific to the service and task type
  
  inference_id string Required
  
  The inference ID
  
  task_type string Required
  
  The task type
  
  Values are sparse_embedding, text_embedding, rerank, completion, or chat_completion.
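The limits on chunking_settings described above can be checked on the client side. The following is a sketch only: the function name `check_chunking_settings` is hypothetical, and it assumes that a setting which does not apply to the chosen strategy is simply ignored.

```python
def check_chunking_settings(settings: dict) -> list[str]:
    """Return constraint violations for a chunking_settings object (empty list if none)."""
    # Defaults documented above: strategy=sentence, max_chunk_size=250.
    strategy = settings.get("strategy", "sentence")
    if strategy not in ("sentence", "word"):
        return [f"unknown strategy: {strategy!r}"]

    problems = []
    max_chunk_size = settings.get("max_chunk_size", 250)
    lower = 20 if strategy == "sentence" else 10
    if not lower <= max_chunk_size <= 300:
        problems.append(f"max_chunk_size must be between {lower} and 300")

    if strategy == "word":
        # overlap defaults to 100 and may not exceed half of max_chunk_size.
        overlap = settings.get("overlap", 100)
        if overlap > max_chunk_size / 2:
            problems.append("overlap must not exceed half of max_chunk_size")
    else:
        # sentence_overlap defaults to 1 and must be 0 or 1.
        sentence_overlap = settings.get("sentence_overlap", 1)
        if sentence_overlap not in (0, 1):
            problems.append("sentence_overlap must be 0 or 1")
    return problems


print(check_chunking_settings({"strategy": "word", "max_chunk_size": 100, "overlap": 60}))
```

Note that with the word strategy the default overlap of 100 only satisfies the half-size rule when max_chunk_size is at least 200, so a smaller max_chunk_size needs an explicit overlap.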

GET /_inference/{task_type}/_all

curl \
 --request GET 'http://api.example.com/_inference/{task_type}/_all'
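Once a response body is received, it can be read according to the schema under Responses. The sketch below uses a made-up sample body: the inference IDs, the service name `example_service`, and the function name `summarize_endpoints` are all hypothetical, and only the field names come from the schema.

```python
import json

# Hypothetical response body; only the field names follow the documented schema.
SAMPLE_BODY = """
{
  "endpoints": [
    {"inference_id": "example-a", "task_type": "text_embedding",
     "service": "example_service", "service_settings": {}},
    {"inference_id": "example-b", "task_type": "text_embedding",
     "service": "example_service", "service_settings": {},
     "chunking_settings": {"strategy": "word", "max_chunk_size": 120, "overlap": 40}}
  ]
}
"""

# Fields marked Required on each endpoint object.
REQUIRED_FIELDS = ("inference_id", "task_type", "service", "service_settings")


def summarize_endpoints(body: str) -> list[tuple]:
    """Return (inference_id, service, chunking strategy or None) per endpoint."""
    summary = []
    for endpoint in json.loads(body)["endpoints"]:
        missing = [name for name in REQUIRED_FIELDS if name not in endpoint]
        if missing:
            raise KeyError(f"endpoint is missing required fields: {missing}")
        chunking = endpoint.get("chunking_settings")  # optional object
        strategy = None if chunking is None else chunking.get("strategy", "sentence")
        summary.append((endpoint["inference_id"], endpoint["service"], strategy))
    return summary


for row in summarize_endpoints(SAMPLE_BODY):
    print(row)
```

The strategy is reported as None when chunking_settings is absent, rather than assuming the documented default, because the schema marks the whole object as optional.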