Path parameters
- task_type
The type of the inference task that the model will perform.
Values are chat_completion, completion, or text_embedding.
- fireworksai_inference_id
The unique identifier of the inference endpoint.
Query parameters
- timeout
Specifies the amount of time to wait for the inference endpoint to be created.
Values are -1 or 0.
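As a minimal sketch of how the path and query parameters combine into the request URL, the Python helper below assembles it from its parts. The function name `inference_endpoint_url` and the base URL are hypothetical, and the query parameter is assumed to be named `timeout`; only the `/_inference/{task_type}/{fireworksai_inference_id}` path shape and the three task types come from this page.

```python
from urllib.parse import quote, urlencode

# Task types listed in the path parameter description.
TASK_TYPES = ("chat_completion", "completion", "text_embedding")


def inference_endpoint_url(base_url, task_type, inference_id, timeout=None):
    """Build the URL for the PUT request (hypothetical helper)."""
    if task_type not in TASK_TYPES:
        raise ValueError(f"unsupported task type: {task_type!r}")
    # Percent-encode the identifier so it stays a single path segment.
    path = f"/_inference/{task_type}/{quote(inference_id, safe='')}"
    url = base_url.rstrip("/") + path
    if timeout is not None:
        # Assumption: the wait time is passed as a `timeout` query parameter.
        url += "?" + urlencode({"timeout": timeout})
    return url
```

Calling `inference_endpoint_url("http://api.example.com", "text_embedding", "my-fireworks-embeddings")` yields a URL of the same form as the one in the curl example, with the placeholders filled in.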
Body
Required
- chunking_settings
The chunking configuration object. Applies only to the text_embedding task type. Not applicable to the completion or chat_completion task types.
- service
The type of service supported for the specified task type. In this case, fireworksai.
Value is fireworksai.
- service_settings
Settings used to install the inference model. These settings are specific to the fireworksai service.
- task_settings
Settings to configure the inference task. Applies only to the completion or chat_completion task types. Not applicable to the text_embedding task type. These settings are specific to the task type you specified.
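The applicability rules above (chunking only for text_embedding, task settings only for completion and chat_completion) can be checked on the client before a request is sent. The following Python sketch shows one way to do that. The helper `build_fireworksai_body` is hypothetical, and the body keys `chunking_settings` and `task_settings` are assumed names for the chunking configuration object and the task settings described above; `service`, `service_settings`, `api_key`, and `model_id` are taken from the request examples on this page.

```python
def build_fireworksai_body(task_type, api_key, model_id,
                           chunking_settings=None, task_settings=None):
    """Assemble a request body and enforce the per-task-type rules."""
    body = {
        "service": "fireworksai",
        "service_settings": {"api_key": api_key, "model_id": model_id},
    }
    if chunking_settings is not None:
        # Chunking applies only to the text_embedding task type.
        if task_type != "text_embedding":
            raise ValueError("chunking settings apply only to text_embedding")
        body["chunking_settings"] = dict(chunking_settings)
    if task_settings is not None:
        # Task settings apply only to completion and chat_completion.
        if task_type not in ("completion", "chat_completion"):
            raise ValueError(
                "task settings apply only to completion or chat_completion")
        body["task_settings"] = dict(task_settings)
    return body
```

Rejecting an invalid combination locally is a design choice for faster feedback; the server remains the authority on what it accepts.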
curl \
  --request PUT 'http://api.example.com/_inference/{task_type}/{fireworksai_inference_id}' \
  --header "Content-Type: application/json" \
  --data '{"service": "fireworksai", "service_settings": {"api_key": "your-api-key", "model_id": "fireworks/qwen3-embedding-8b"}}'
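For readers scripting this call, a rough Python counterpart of the curl command is sketched below using only the standard library. The helper name `build_create_request` is hypothetical. The sketch only constructs the request object; like the curl example, it sets no authentication header, which a real cluster would most likely require.

```python
import json
import urllib.request


def build_create_request(url, body):
    """Prepare (but do not send) the PUT request for endpoint creation."""
    # Serialize the body as a JSON object, not as a quoted JSON string.
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


request = build_create_request(
    "http://api.example.com/_inference/text_embedding/my-fireworks-embeddings",
    {
        "service": "fireworksai",
        "service_settings": {
            "api_key": "your-api-key",
            "model_id": "fireworks/qwen3-embedding-8b",
        },
    },
)
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a server.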
{
  "service": "fireworksai",
  "service_settings": {
    "api_key": "your-api-key",
    "model_id": "fireworks/qwen3-embedding-8b"
  }
}
{
  "service": "fireworksai",
  "service_settings": {
    "api_key": "your-api-key",
    "model_id": "fireworks/qwen3-embedding-8b",
    "dimensions": 1024,
    "similarity": "cosine",
    "rate_limit": {
      "requests_per_minute": 6000
    }
  }
}
{
  "service": "fireworksai",
  "service_settings": {
    "api_key": "your-api-key",
    "model_id": "accounts/fireworks/models/deepseek-v3p1"
  }
}
{
  "inference_id": "my-fireworks-embeddings",
  "task_type": "text_embedding",
  "service": "fireworksai",
  "service_settings": {
    "model_id": "fireworks/qwen3-embedding-8b",
    "url": "https://api.fireworks.ai/inference/v1/embeddings",
    "similarity": "cosine",
    "dimensions": 4096,
    "rate_limit": {
      "requests_per_minute": 6000
    }
  },
  "chunking_settings": {
    "strategy": "sentence",
    "max_chunk_size": 250,
    "sentence_overlap": 1
  }
}
{
  "inference_id": "my-fireworks-chat",
  "task_type": "chat_completion",
  "service": "fireworksai",
  "service_settings": {
    "model_id": "accounts/fireworks/models/deepseek-v3p1",
    "rate_limit": {
      "requests_per_minute": 6000
    }
  }
}
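The last two JSON objects, which carry `inference_id` and `task_type`, appear to be response bodies. As a small sketch of how a client might read one, the Python helper below pulls out the fields a caller is likely to check after creation. The function name `summarize_endpoint` is hypothetical, and fields that a given task type does not return (such as `dimensions` for chat_completion) are reported as `None` rather than treated as errors.

```python
import json


def summarize_endpoint(response_text):
    """Extract key fields from a create-endpoint response body."""
    doc = json.loads(response_text)
    settings = doc.get("service_settings", {})
    return {
        "inference_id": doc["inference_id"],
        "task_type": doc["task_type"],
        "model_id": settings.get("model_id"),
        # Present for the embedding example, absent for chat completion.
        "dimensions": settings.get("dimensions"),
        "requests_per_minute":
            settings.get("rate_limit", {}).get("requests_per_minute"),
    }
```

Note that neither response example on this page echoes `api_key`, so a client should not expect to read the key back from the response.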