---
title: Semantic text field type reference
description: This page provides reference content for the semantic_text field type, including parameter descriptions, inference endpoint configuration options, chunking...
url: https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/semantic-text-reference
products:
  - Elasticsearch
applies_to:
  - Elastic Cloud Serverless: Generally available
  - Elastic Stack: Generally available since 9.0
---

# Semantic text field type reference
This page provides reference content for the `semantic_text` field type, including parameter descriptions, inference endpoint configuration options, chunking behavior, update operations, querying options, and limitations.

## Parameters for `semantic_text`

The `semantic_text` field type uses default indexing settings based on the [inference endpoint](#configuring-inference-endpoints) specified, enabling you to get started without providing additional configuration details. You can override these defaults by customizing the parameters described below.
<definitions>
  <definition term="inference_id">
    (Optional, string) Inference endpoint that will be used to generate
    embeddings for the field. If `search_inference_id` is specified, the inference
    endpoint will only be used at index time. Learn more about [configuring this parameter](#configuring-inference-endpoints).
  </definition>
</definitions>

**Updating the `inference_id` parameter**
<applies-switch>
  <applies-item title="stack: ga 9.3+" applies-to="Elastic Stack: Generally available since 9.3">
    You can update this parameter by using
    the [Update mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-mapping).
    You can update the inference endpoint if no values have been indexed or if the new endpoint is compatible with the current one.
    <important>
      When updating an `inference_id`, make sure the new inference endpoint produces embeddings compatible with those already indexed. This typically means using the same underlying model.
    </important>
  </applies-item>

  <applies-item title="stack: ga 9.0-9.2" applies-to="Elastic Stack: Generally available from 9.0 to 9.2">
    This parameter cannot be updated.
  </applies-item>
</applies-switch>

<definitions>
  <definition term="search_inference_id">
    (Optional, string) The inference endpoint that will be used to generate
    embeddings at query time. Use the [Create inference API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-put) to create the endpoint. If not specified, the inference endpoint defined by
    `inference_id` will be used at both index and query time.
    You can update this parameter by using
    the [Update mapping API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-put-mapping).
    Learn how to [use dedicated endpoints for ingestion and search](/docs/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration#dedicated-endpoints-for-ingestion-and-search).
  </definition>
  <definition term="index_options Elastic Stack: Generally available since 9.1">
    (Optional, object) Specifies the index options to override default values
    for the field. Currently, `dense_vector` and `sparse_vector` index options are supported. For text embeddings, `index_options` may match any of the allowed options linked below.
  </definition>
</definitions>

- [dense_vector index options](/docs/reference/elasticsearch/mapping-reference/dense-vector#dense-vector-index-options)
- [sparse_vector index options](/docs/reference/elasticsearch/mapping-reference/sparse-vector#sparse-vectors-params) <applies-to>Elastic Stack: Generally available since 9.2</applies-to>
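
As a hedged illustration for Elastic Stack 9.2 and later, the following mapping sketch overrides sparse vector index options. It assumes a sparse embedding endpoint with the hypothetical ID `my-sparse-endpoint`, and it assumes that the `prune` and `pruning_config` options documented for `sparse_vector` fields are nested under the `sparse_vector` key in the same way that dense options are nested under `dense_vector`. The threshold values are illustrative only.

```json
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-sparse-endpoint",
        "index_options": {
          "sparse_vector": {
            "prune": true,
            "pruning_config": {
              "tokens_freq_ratio_threshold": 10,
              "tokens_weight_threshold": 0.5
            }
          }
        }
      }
    }
  }
}
```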

<definitions>
  <definition term="chunking_settings Elastic Stack: Generally available since 9.1">
    (Optional, object) Settings for chunking text into smaller passages.
    If specified, these will override the chunking settings set in the inference
    endpoint associated with `inference_id`.
    If chunking settings are updated, they will not be applied to existing documents
    until they are reindexed. Defaults to the optimal chunking settings for [Elastic Rerank](https://www.elastic.co/docs/explore-analyze/machine-learning/nlp/ml-nlp-rerank).
    To completely disable chunking, use the `none` chunking strategy.
    <important>
      When using the `none` chunking strategy, if the input exceeds the maximum token limit of the underlying model,
      some services (such as OpenAI) may return an error. In contrast, the `elastic` and `elasticsearch` services will
      automatically truncate the input to fit within the model's limit.
    </important>
  </definition>
</definitions>
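
As a sketch of a field-level chunking override, the following mapping body uses the `sentence` strategy instead of disabling chunking. The field name, the endpoint ID, and the values chosen for `max_chunk_size` and `sentence_overlap` are assumptions for illustration; refer to the Inference API documentation for the values each strategy allows.

```json
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-text-embedding-endpoint",
        "chunking_settings": {
          "strategy": "sentence",
          "max_chunk_size": 250,
          "sentence_overlap": 1
        }
      }
    }
  }
}
```

Settings like these only affect documents indexed after the change. Existing documents keep their current chunks until they are reindexed.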


### Customizing semantic_text indexing

The following example shows how to configure `inference_id`, `index_options` and `chunking_settings` for a `semantic_text` field type:
```json

{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-text-embedding-endpoint", <1>
        "index_options": { <2>
          "dense_vector": {
            "type": "int4_flat"
          }
        },
        "chunking_settings": { <3>
          "strategy": "none"
        }
      }
    }
  }
}
```

The callouts in the example mark the following settings:
- `<1>`: The inference endpoint used to generate embeddings for the field.
- `<2>`: Overrides the default index options by specifying `int4_flat` quantization for the dense vector embeddings.
- `<3>`: Disables chunking by using the `none` chunking strategy.

<note>
  <applies-to>Elastic Stack: Generally available since 9.1</applies-to>  Newly created indices with `semantic_text` fields using dense embeddings will be
  [quantized](/docs/reference/elasticsearch/mapping-reference/dense-vector#dense-vector-quantization)
  to `bbq_hnsw` automatically as long as they have a minimum of 64 dimensions.
</note>


## Inference endpoints

The `semantic_text` field type specifies an inference endpoint identifier (`inference_id`) that is used to generate embeddings.
The following inference endpoint configurations are available:
- [Default and preconfigured endpoints](/docs/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration#default-and-preconfigured-endpoints): Use `semantic_text` without creating an inference endpoint manually.
- [ELSER on EIS](/docs/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration#using-elser-on-eis): Use the ELSER model through the Elastic Inference Service.
- [Custom endpoints](/docs/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration#using-custom-endpoint): Create your own inference endpoint using the [Create inference API](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-put) to use custom models or third-party services.

If you use a [custom inference endpoint](/docs/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration#using-custom-endpoint) through your ML node and not through Elastic Inference Service (EIS), the recommended method is to [use dedicated endpoints for ingestion and search](/docs/reference/elasticsearch/mapping-reference/semantic-text-setup-configuration#dedicated-endpoints-for-ingestion-and-search).
<applies-to>Elastic Stack: Generally available since 9.1</applies-to> If you use EIS, you don't have to set up dedicated endpoints.
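
A minimal sketch of a mapping that uses dedicated endpoints follows. It assumes two inference endpoints that already exist, with the hypothetical IDs `my-ingest-endpoint` and `my-search-endpoint`, and that both produce compatible embeddings.

```json
{
  "mappings": {
    "properties": {
      "inference_field": {
        "type": "semantic_text",
        "inference_id": "my-ingest-endpoint",
        "search_inference_id": "my-search-endpoint"
      }
    }
  }
}
```

With this mapping, `my-ingest-endpoint` is used only at index time and `my-search-endpoint` generates the embeddings at query time.
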
<warning>
  Removing an inference endpoint will cause ingestion of documents and semantic
  queries to fail on indices that define `semantic_text` fields with that
  inference endpoint as their `inference_id`. Trying
  to [delete an inference endpoint](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-delete)
  that is used on a `semantic_text` field will result in an error.
</warning>


## Chunking

Inference endpoints have a limit on the amount of text they can process. To
allow for large amounts of text to be used in semantic search, `semantic_text`
automatically splits the input into smaller passages, called chunks, when needed.
Each chunk refers to a passage of the text and the corresponding embedding
generated from it. When querying, the individual passages will be automatically
searched for each document, and the most relevant passage will be used to
compute a score.
Chunks are stored as start and end character offsets rather than as separate
text strings. These offsets point to the exact location of each chunk within the
original input text.
You can [pre-chunk content](/docs/reference/elasticsearch/mapping-reference/semantic-text-ingestions#pre-chunking) by providing text as arrays before indexing.
Refer to the [Inference API documentation](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-put#operation-inference-put-body-application-json-chunking_settings) for values for `chunking_settings` and to [Configuring chunking](https://www.elastic.co/docs/explore-analyze/elastic-inference/inference-api#infer-chunking-config) to learn about different chunking strategies.
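
To illustrate pre-chunking, the following is a sketch of a document body in which the application has already split the text into passages. It assumes a hypothetical `semantic_text` field named `my_semantic_field` whose `chunking_settings` use the `none` strategy, so that each array element is embedded as supplied rather than being split further.

```json
{
  "my_semantic_field": [
    "First passage of the document, already split by the application.",
    "Second passage of the document."
  ]
}
```

Because chunking is disabled in this setup, each passage should stay within the token limit of the underlying model. Otherwise, depending on the service, the request may fail or the input may be truncated.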

## Pre-filtering for dense vector queries

<applies-to>
  - Elastic Cloud Serverless: Generally available
  - Elastic Stack: Generally available since 9.3
</applies-to>

When you query `semantic_text` fields with dense vector embeddings, Elasticsearch automatically applies filters from Query DSL or ES|QL queries as [pre-filters](/docs/reference/query-languages/query-dsl/query-dsl-knn-query#knn-query-filtering) to the vector search. The vector search then finds the most semantically relevant results within the filtered set of documents, ensuring that the requested number of documents is returned.
The following examples in Query DSL and ES|QL syntax find the 10 most relevant documents matching "quick drying t-shirts" while restricting the results to green items.

### Query DSL example

In Query DSL, `must`, `filter`, and `must_not` queries within the parent `bool` query are used as pre-filters for `semantic_text` queries. The `term` query below will be applied as a pre-filter to the kNN search on `dense_semantic_text_field`.
```json

{
  "size" : 10,
  "query" : {
    "bool" : {
      "must" : {
        "match": { <1>
          "dense_semantic_text_field": {
            "query": "quick drying t-shirts"
          }
        }
      },
      "filter" : {
        "term" : {
          "color": {
            "value": "green"
          }
        }
      }
    }
  }
}
```

The `<1>` callout marks the `match` query, which runs as a kNN search on `dense_semantic_text_field` because the field stores dense vector embeddings.

<important>
  When you query a `semantic_text` field directly with a [kNN query](/docs/reference/query-languages/query-dsl/query-dsl-knn-query#knn-query-with-semantic-text) in Query DSL, automatic pre-filtering does not apply. The kNN query provides a direct parameter for defining pre-filters as explained in [Pre-filters and post-filters](/docs/reference/query-languages/query-dsl/query-dsl-knn-query#knn-query-filtering).
</important>
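
A hedged sketch of the direct kNN form of the query above follows, with the color filter passed explicitly through the kNN query's `filter` parameter. The `k` and `num_candidates` values are illustrative. The sketch also assumes that `query_vector_builder` can omit the model ID for a `semantic_text` field so that the field's inference endpoint is used; check the kNN query reference for your version.

```json
{
  "size": 10,
  "query": {
    "knn": {
      "field": "dense_semantic_text_field",
      "k": 10,
      "num_candidates": 100,
      "query_vector_builder": {
        "text_embedding": {
          "model_text": "quick drying t-shirts"
        }
      },
      "filter": {
        "term": {
          "color": {
            "value": "green"
          }
        }
      }
    }
  }
}
```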


### ES|QL example

The `WHERE color == "green"` clause will be applied as a pre-filter to the kNN search on `dense_semantic_text_field`.
```json

{
  "query": """
          FROM my-index METADATA _score
          | WHERE MATCH(dense_semantic_text_field, "quick drying t-shirts") <1>
          | WHERE color == "green"
          | SORT _score DESC
          | LIMIT 10
   """
}
```

The `<1>` callout marks the `MATCH` function, which runs as a kNN search on `dense_semantic_text_field`.


## Limitations

`semantic_text` field types have the following limitations:
- `semantic_text` fields are not currently supported as elements
  of [nested fields](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/nested).
- `semantic_text` fields can't currently be set as part
  of [dynamic templates](https://www.elastic.co/docs/manage-data/data-store/mapping/dynamic-templates).
- `semantic_text` fields are not supported in indices created prior to 8.11.0.
- `semantic_text` fields do not support [Cross-Cluster Replication (CCR)](https://www.elastic.co/docs/deploy-manage/tools/cross-cluster-replication).
- [Automatic pre-filtering](#pre-filtering-for-dense-vector-queries) in Query DSL does not apply to [nested queries](https://www.elastic.co/docs/reference/query-languages/query-dsl/query-dsl-nested-query). Such queries will be applied as post-filters.
- [Automatic pre-filtering](#pre-filtering-for-dense-vector-queries) in ES|QL does not apply to filters that use certain functions (like `WHERE TO_LOWER(my_field) == 'a'`). Such filters will be applied as post-filters.


## Document count discrepancy in `_cat/indices`

When an index contains a `semantic_text` field, the `docs.count` value returned by the [`_cat/indices`](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-cat-indices) API may be higher than the number of documents you indexed.
This occurs because `semantic_text` stores embeddings in [nested documents](https://www.elastic.co/docs/reference/elasticsearch/mapping-reference/nested), one per chunk. The `_cat/indices` API counts all documents in the Lucene index, including these hidden nested documents.
To count only top-level documents, excluding the nested documents that store embeddings, use one of the following APIs:
- `GET /<index>/_count`
- `GET _cat/count/<index>`