Create an inference endpoint to perform an inference task with the azureopenai service.
The list of chat completion models that you can choose from in your Azure OpenAI deployment includes:

- GPT-4 and GPT-4 Turbo models
- GPT-3.5
The list of embeddings models that you can choose from in your deployment can be found in the Azure models documentation.

## Required authorization
- Cluster privileges: `manage_inference`
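If the calling user does not already have this privilege, it can be granted through a role. A minimal sketch using the create-roles security API, with `inference_admin` as a purely illustrative role name:

PUT _security/role/inference_admin
{
  "cluster": ["manage_inference"]
}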
## Path parameters

- `task_type`: The type of the inference task that the model will perform. NOTE: The `chat_completion` task type only supports streaming and only through the _stream API. Values are `completion` or `text_embedding`.
- `azureopenai_inference_id`: The unique identifier of the inference endpoint.
PUT /_inference/{task_type}/{azureopenai_inference_id}
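For example, with `task_type` set to `completion` and `azureopenai_inference_id` set to `azure_openai_completion` (the placeholder names used in the request examples below), the path becomes:

PUT _inference/completion/azure_openai_completion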
Console
PUT _inference/text_embedding/azure_openai_embeddings
{
  "service": "azureopenai",
  "service_settings": {
    "api_key": "Api-Key",
    "resource_name": "Resource-name",
    "deployment_id": "Deployment-id",
    "api_version": "2024-02-01"
  }
}
curl \
  --request PUT 'http://api.example.com/_inference/{task_type}/{azureopenai_inference_id}' \
  --header "Content-Type: application/json" \
  --data '{
    "service": "azureopenai",
    "service_settings": {
      "api_key": "Api-Key",
      "resource_name": "Resource-name",
      "deployment_id": "Deployment-id",
      "api_version": "2024-02-01"
    }
  }'
## Request examples
A text embedding task
Run `PUT _inference/text_embedding/azure_openai_embeddings` to create an inference endpoint that performs a `text_embedding` task. You do not specify a model, as it is defined already in the Azure OpenAI deployment.
{
  "service": "azureopenai",
  "service_settings": {
    "api_key": "Api-Key",
    "resource_name": "Resource-name",
    "deployment_id": "Deployment-id",
    "api_version": "2024-02-01"
  }
}
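Once created, the endpoint is called through the inference API. A minimal usage sketch, assuming the `azure_openai_embeddings` endpoint defined above and an illustrative input string:

POST _inference/text_embedding/azure_openai_embeddings
{
  "input": "The quick brown fox jumped over the lazy dog"
}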
A completion task
Run `PUT _inference/completion/azure_openai_completion` to create an inference endpoint that performs a `completion` task. As with the text embedding example, you do not specify a model, as it is defined already in the Azure OpenAI deployment.
{
  "service": "azureopenai",
  "service_settings": {
    "api_key": "Api-Key",
    "resource_name": "Resource-name",
    "deployment_id": "Deployment-id",
    "api_version": "2024-02-01"
  }
}
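As with the embedding example, the completion endpoint is then invoked through the inference API. A minimal sketch, assuming the `azure_openai_completion` endpoint above and an illustrative prompt:

POST _inference/completion/azure_openai_completion
{
  "input": "Write a short description of the azureopenai inference service"
}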