Create an Azure OpenAI inference endpoint
Added in 8.14.0
Create an inference endpoint to perform an inference task with the `azureopenai` service.
The lists of chat completion and embeddings models that you can choose from in your Azure OpenAI deployment can be found in the Azure models documentation.
Path parameters
- `task_type` (string, Required): The type of the inference task that the model will perform. NOTE: The `chat_completion` task type only supports streaming and only through the `_stream` API; see the sketch after these parameters. Values are `completion` or `text_embedding`.
- `azureopenai_inference_id` (string, Required): The unique identifier of the inference endpoint.
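For the streaming-only `chat_completion` flow, here is a minimal sketch. It assumes an endpoint named `azure_openai_chat` has already been created with the `chat_completion` task type; the endpoint name is hypothetical and the message fields shown may vary by release. Run `POST _inference/chat_completion/azure_openai_chat/_stream` with a body such as:
{
  "messages": [
    {
      "role": "user",
      "content": "What is Elastic?"
    }
  ]
}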
Body
- `chunking_settings` (object): See the sketch after this list.
- `service` (string, Required): Value is `azureopenai`.
- `service_settings` (object, Required)
- `task_settings` (object)
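As a rough sketch of how these objects fit together, the body below adds a `chunking_settings` object and a `task_settings` object to the required fields. The specific settings shown (`strategy`, `max_chunk_size`, `sentence_overlap`, and the `user` task setting) are illustrative assumptions and may not apply to every task type or release:
{
  "service": "azureopenai",
  "service_settings": {
    "api_key": "Api-Key",
    "resource_name": "Resource-name",
    "deployment_id": "Deployment-id",
    "api_version": "2024-02-01"
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 250,
    "sentence_overlap": 1
  },
  "task_settings": {
    "user": "my-application"
  }
}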
PUT /_inference/{task_type}/{azureopenai_inference_id}
curl \
--request PUT 'http://api.example.com/_inference/{task_type}/{azureopenai_inference_id}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '"{\n \"service\": \"azureopenai\",\n \"service_settings\": {\n \"api_key\": \"Api-Key\",\n \"resource_name\": \"Resource-name\",\n \"deployment_id\": \"Deployment-id\",\n \"api_version\": \"2024-02-01\"\n }\n}"'
Request examples
A text embedding task
Run `PUT _inference/text_embedding/azure_openai_embeddings` to create an inference endpoint that performs a `text_embedding` task. You do not specify a model, as it is defined already in the Azure OpenAI deployment.
{
"service": "azureopenai",
"service_settings": {
"api_key": "Api-Key",
"resource_name": "Resource-name",
"deployment_id": "Deployment-id",
"api_version": "2024-02-01"
}
}
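Once the endpoint exists, you can call it through the perform inference API. As a minimal sketch assuming the endpoint above, run `POST _inference/text_embedding/azure_openai_embeddings` with a body like the following (the input text is illustrative):
{
  "input": "The quick brown fox jumped over the lazy dog"
}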
A completion task
Run `PUT _inference/completion/azure_openai_completion` to create an inference endpoint that performs a `completion` task.
{
"service": "azureopenai",
"service_settings": {
"api_key": "Api-Key",
"resource_name": "Resource-name",
"deployment_id": "Deployment-id",
"api_version": "2024-02-01"
}
}
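Similarly, a hedged sketch of calling the resulting completion endpoint: run `POST _inference/completion/azure_openai_completion` with a body such as the following (the prompt is illustrative):
{
  "input": "What is Elastic?"
}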