Create inference API

This functionality is in technical preview and may be changed or removed in a future release. Elastic will work to fix any issues, but features in technical preview are not subject to the support SLA of official GA features.

Creates a model to perform an inference task.

Request

PUT /_inference/<task_type>/<model_id>

Prerequisites

Description

The create inference API enables you to create and configure an inference model to perform a specific inference task.

Path parameters

<model_id>
(Required, string) The unique identifier of the model.
<task_type>

(Required, string) The type of the inference task that the model will perform. Available task types:

  • sparse_embedding
  • text_embedding
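
The two path parameters together form the request path. For example, the sparse_embedding task type combined with the model ID my-elser-model (the values used in the example at the end of this page) produces:

PUT /_inference/sparse_embedding/my-elser-model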

Request body

service

(Required, string) The type of service supported for the specified task type. Available services:

  • elser
service_settings
(Required, object) Settings used to install the inference model. These settings are specific to the service you specified.
task_settings
(Optional, object) Settings to configure the inference task. These settings are specific to the <task_type> you specified.
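
Taken together, a create request has the following general shape. The angle-bracket placeholders stand for the path parameters above, and the contents of the two settings objects depend on the service and task type you choose; a concrete ELSER example follows in the next section.

PUT /_inference/<task_type>/<model_id>
{
  "service": "<service>",
  "service_settings": { ... },
  "task_settings": { ... }
}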

Examples

The following example shows how to create an inference model called my-elser-model that performs the sparse_embedding task.

PUT _inference/sparse_embedding/my-elser-model
{
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}

Example response:

{
  "model_id": "my-elser-model",
  "task_type": "sparse_embedding",
  "service": "elser",
  "service_settings": {
    "num_allocations": 1,
    "num_threads": 1
  },
  "task_settings": {}
}
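
After the model is created, you can use it to perform its inference task. The following sketch assumes the companion perform inference API (POST /_inference/<task_type>/<model_id>) is available in your deployment; the input text is illustrative only:

POST _inference/sparse_embedding/my-elser-model
{
  "input": "The quick brown fox jumps over the lazy dog"
}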