Create an OpenAI inference endpoint
Added in 8.12.0
Create an inference endpoint to perform an inference task with the `openai` service or `openai`-compatible APIs.
Path parameters
- `task_type` (string, Required): The type of the inference task that the model will perform. Values are `chat_completion`, `completion`, or `text_embedding`. NOTE: The `chat_completion` task type only supports streaming and only through the `_stream` API.
- `openai_inference_id` (string, Required): The unique identifier of the inference endpoint.
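As a rough illustration of how the two path parameters combine, the sketch below builds the request path and rejects task types outside the three values listed above. The helper name `inference_path` is hypothetical and is not part of any Elasticsearch client library.

```python
from urllib.parse import quote

# Task types listed for the openai service in this reference.
OPENAI_TASK_TYPES = {"chat_completion", "completion", "text_embedding"}


def inference_path(task_type: str, inference_id: str) -> str:
    """Return the path for creating an OpenAI inference endpoint (hypothetical helper)."""
    if task_type not in OPENAI_TASK_TYPES:
        raise ValueError(f"unsupported task_type: {task_type!r}")
    if not inference_id:
        raise ValueError("inference_id must not be empty")
    # Percent-encode the identifier so it is safe as a single path segment.
    return f"/_inference/{task_type}/{quote(inference_id, safe='')}"


print(inference_path("text_embedding", "openai-embeddings"))
```

Validating the task type on the client side is only a convenience; the server remains the authority on which values it accepts.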
Body
- `chunking_settings` (object)
- `service` (string, Required): Value is `openai`.
- `service_settings` (object, Required)
- `task_settings` (object)
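The body fields above could be assembled along the following lines. This is a minimal sketch: `build_openai_body` is a hypothetical helper, and the `api_key` and `model_id` keys inside `service_settings` are taken from the request example in this page rather than from a full schema.

```python
import json
from typing import Any, Optional


def build_openai_body(
    service_settings: dict[str, Any],
    task_settings: Optional[dict[str, Any]] = None,
    chunking_settings: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble a request body (hypothetical helper, not a client API)."""
    if not service_settings:
        raise ValueError("service_settings is required")
    # "service" is fixed to "openai" for this endpoint type.
    body: dict[str, Any] = {"service": "openai", "service_settings": service_settings}
    # Optional objects are left out entirely when not supplied.
    if task_settings is not None:
        body["task_settings"] = task_settings
    if chunking_settings is not None:
        body["chunking_settings"] = chunking_settings
    return body


body = build_openai_body({"api_key": "OpenAI-API-Key", "model_id": "text-embedding-3-small"})
print(json.dumps(body, indent=2))
```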
PUT /_inference/{task_type}/{openai_inference_id}
curl \
--request PUT 'http://api.example.com/_inference/{task_type}/{openai_inference_id}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '{"service": "openai", "service_settings": {"api_key": "OpenAI-API-Key", "model_id": "text-embedding-3-small", "dimensions": 128}}'
Request examples
A text embedding task
Run `PUT _inference/text_embedding/openai-embeddings` to create an inference endpoint that performs a `text_embedding` task. The embeddings created by requests to this endpoint will have 128 dimensions.
{
  "service": "openai",
  "service_settings": {
    "api_key": "OpenAI-API-Key",
    "model_id": "text-embedding-3-small",
    "dimensions": 128
  }
}
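For readers scripting this call, the text embedding request can be prepared with Python's standard library as sketched below. The cluster URL, the placeholder key, and the `ApiKey` authorization scheme are assumptions for illustration; adjust them to your deployment. The request is built but not sent.

```python
import json
import urllib.request

# Assumptions: a local cluster URL and an API key sent with the "ApiKey" scheme.
ES_URL = "http://localhost:9200"
ES_API_KEY = "your-elasticsearch-api-key"

body = {
    "service": "openai",
    "service_settings": {
        "api_key": "OpenAI-API-Key",
        "model_id": "text-embedding-3-small",
        "dimensions": 128,
    },
}

request = urllib.request.Request(
    url=f"{ES_URL}/_inference/text_embedding/openai-embeddings",
    data=json.dumps(body).encode("utf-8"),
    method="PUT",
    headers={
        "Authorization": f"ApiKey {ES_API_KEY}",
        "Content-Type": "application/json",
    },
)

# Uncomment to send the request to a running cluster:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
print(request.get_method(), request.full_url)
```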
A completion task
Run `PUT _inference/completion/openai-completion` to create an inference endpoint that performs a `completion` task.
{
  "service": "openai",
  "service_settings": {
    "api_key": "OpenAI-API-Key",
    "model_id": "gpt-3.5-turbo"
  }
}