Create an AlibabaCloud AI Search inference endpoint
Added in 8.16.0
Create an inference endpoint to perform an inference task with the `alibabacloud-ai-search` service.
Path parameters
- task_type (string, Required): The type of the inference task that the model will perform. Values are completion, rerank, sparse_embedding, or text_embedding.
- alibabacloud_inference_id (string, Required): The unique identifier of the inference endpoint.
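The two path parameters combine into the request path. A minimal sketch of that step follows, assuming the four task types used in the request examples on this page; the helper name `endpoint_path` is hypothetical and not part of any client library.

```python
from urllib.parse import quote

# Task types this page lists for the alibabacloud-ai-search service.
TASK_TYPES = {"completion", "rerank", "sparse_embedding", "text_embedding"}


def endpoint_path(task_type: str, inference_id: str) -> str:
    """Return the request path for creating an inference endpoint.

    Hypothetical helper for illustration only.
    """
    if task_type not in TASK_TYPES:
        raise ValueError(f"unsupported task_type: {task_type!r}")
    if not inference_id:
        raise ValueError("inference_id must not be empty")
    # Percent-encode the identifier so it stays a single path segment.
    return f"/_inference/{task_type}/{quote(inference_id, safe='')}"


print(endpoint_path("rerank", "alibabacloud_ai_search_rerank"))
```

Validating the task type on the client side is a design choice made here for early feedback; the server remains the authority on which values it accepts.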
Body
- chunking_settings (object)
- service (string, Required): Value is alibabacloud-ai-search.
- service_settings (object, Required)
- task_settings (object)
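The body fields above can be assembled as in the following sketch. It assumes the optional objects (`chunking_settings`, `task_settings`) are simply left out when not supplied; the function name `build_body` is hypothetical, and the setting values are the placeholders used in the request examples on this page.

```python
import json


def build_body(service_settings, task_settings=None, chunking_settings=None):
    """Assemble the request body (hypothetical helper for illustration)."""
    if not service_settings:
        raise ValueError("service_settings is required")
    body = {
        # Fixed value required by this endpoint.
        "service": "alibabacloud-ai-search",
        "service_settings": dict(service_settings),
    }
    # Optional objects are omitted rather than sent as null (an assumption).
    if task_settings is not None:
        body["task_settings"] = dict(task_settings)
    if chunking_settings is not None:
        body["chunking_settings"] = dict(chunking_settings)
    return body


body = build_body(
    {
        "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
        "api_key": "AlibabaCloud-API-Key",  # placeholder, not a real key
        "service_id": "ops-qwen-turbo",
        "workspace": "default",
    }
)
print(json.dumps(body, indent=2))
```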
PUT /_inference/{task_type}/{alibabacloud_inference_id}
curl \
--request PUT 'http://api.example.com/_inference/{task_type}/{alibabacloud_inference_id}' \
--header "Authorization: $API_KEY" \
--header "Content-Type: application/json" \
--data '{"service": "alibabacloud-ai-search", "service_settings": {"host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com", "api_key": "AlibabaCloud-API-Key", "service_id": "ops-qwen-turbo", "workspace": "default"}}'
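For readers scripting this call, the curl command maps onto Python's standard library roughly as follows. This is a sketch under stated assumptions: the base URL and key are the placeholders from the curl example, the `Authorization` value is passed through unchanged as in that example, and `build_request` is a hypothetical helper. The request is only constructed here, not sent.

```python
import json
import urllib.request


def build_request(base_url, api_key, task_type, inference_id, body):
    """Build (but do not send) the PUT request. Hypothetical helper."""
    url = f"{base_url.rstrip('/')}/_inference/{task_type}/{inference_id}"
    # Serialize the body as a JSON object, not as a quoted JSON string.
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={
            "Authorization": api_key,  # mirrors the curl header as written
            "Content-Type": "application/json",
        },
    )


req = build_request(
    "http://api.example.com",  # placeholder host from the curl example
    "API_KEY_VALUE",  # placeholder credential
    "completion",
    "alibabacloud_ai_search_completion",
    {
        "service": "alibabacloud-ai-search",
        "service_settings": {
            "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
            "api_key": "AlibabaCloud-API-Key",
            "service_id": "ops-qwen-turbo",
            "workspace": "default",
        },
    },
)
print(req.get_method(), req.full_url)
```

To actually send the request, pass `req` to `urllib.request.urlopen` and handle `urllib.error.HTTPError` for non-2xx responses.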
Request examples
A completion task
Run `PUT _inference/completion/alibabacloud_ai_search_completion` to create an inference endpoint that performs a completion task.
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "api_key": "AlibabaCloud-API-Key",
    "service_id": "ops-qwen-turbo",
    "workspace": "default"
  }
}
A rerank task
Run `PUT _inference/rerank/alibabacloud_ai_search_rerank` to create an inference endpoint that performs a rerank task.
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "AlibabaCloud-API-Key",
    "service_id": "ops-bge-reranker-larger",
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}
A sparse embedding task
Run `PUT _inference/sparse_embedding/alibabacloud_ai_search_sparse` to create an inference endpoint that performs a sparse embedding task.
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "AlibabaCloud-API-Key",
    "service_id": "ops-text-sparse-embedding-001",
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}
A text embedding task
Run `PUT _inference/text_embedding/alibabacloud_ai_search_embeddings` to create an inference endpoint that performs a text embedding task.
{
  "service": "alibabacloud-ai-search",
  "service_settings": {
    "api_key": "AlibabaCloud-API-Key",
    "service_id": "ops-text-embedding-001",
    "host": "default-j01.platform-cn-shanghai.opensearch.aliyuncs.com",
    "workspace": "default"
  }
}