Simulate data ingestion | Elasticsearch API documentation

Simulate data ingestion Technical preview; Added in 8.12.0

POST /_ingest/{index}/_simulate

All methods and paths for this operation:

GET /_ingest/_simulate

POST /_ingest/_simulate

GET /_ingest/{index}/_simulate

POST /_ingest/{index}/_simulate

Run ingest pipelines against a set of provided documents, optionally with substitute pipeline definitions, to simulate ingesting data into an index.

This API is meant to be used for troubleshooting or pipeline development, as it does not actually index any data into Elasticsearch.

The API runs the default and final pipeline for that index against a set of documents provided in the body of the request. If a pipeline contains a reroute processor, the API follows that reroute processor to the new index and runs that index's pipelines as well, the same way that a non-simulated ingest would. No data is indexed into Elasticsearch. Instead, the transformed document is returned, along with the list of pipelines that have been run and the name of the index where the document would have been indexed if this were not a simulation. The transformed document is validated against the mappings that would apply to this index, and any validation error is reported in the result.

This API differs from the simulate pipeline API in that you specify a single pipeline for that API, and it runs only that one pipeline. The simulate pipeline API is more useful for developing a single pipeline, while the simulate ingest API is more useful for troubleshooting the interaction of the various pipelines that get applied when ingesting into an index.

By default, the pipeline definitions that are currently in the system are used. However, you can supply substitute pipeline definitions in the body of the request. These will be used in place of the pipeline definitions that are already in the system. This can be used to replace existing pipeline definitions or to create new ones. The pipeline substitutions are used only within this request.
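As a minimal sketch of the substitution mechanism described above, the Python below assembles a request body that pairs sample documents with a substitute pipeline definition. The helper name `build_simulate_body` is hypothetical and not part of any client library; it only builds a dictionary, and sending it (for example with `client.simulate.ingest`, as in the examples later on this page) is left out.

```python
import json


def build_simulate_body(docs, pipeline_substitutions=None):
    """Assemble a simulate ingest request body (hypothetical helper)."""
    body = {"docs": list(docs)}
    # Substitutions are optional: leaving the key out means the pipeline
    # definitions that are already in the system are used.
    if pipeline_substitutions:
        body["pipeline_substitutions"] = dict(pipeline_substitutions)
    return body


docs = [{"_index": "my-index", "_id": "123", "_source": {"foo": "bar"}}]
substitute = {"my-pipeline": {"processors": [{"uppercase": {"field": "foo"}}]}}

body = build_simulate_body(docs, substitute)
print(json.dumps(body, indent=2))
```

Because the substitutions travel inside the request body, nothing needs to be cleaned up afterwards: the definitions stored in the system are not modified.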

Required authorization

Index privileges: index

Path parameters

index string Required

The index to simulate ingesting into. This value can be overridden by specifying an index on each document. If you specify this parameter in the request path, it is used for any documents that do not explicitly specify an index argument.
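The fallback rule above can be pictured with a short Python sketch. The function `resolve_target_index` is a hypothetical illustration of the documented precedence (a per-document `_index` wins over the index in the request path); it is not how Elasticsearch implements it.

```python
def resolve_target_index(doc, path_index=None):
    """Return the index a document would be simulated against.

    Illustrative only: an explicit _index on the document wins,
    otherwise the index from the request path is used.
    """
    return doc.get("_index", path_index)


docs = [
    {"_id": "123", "_source": {"foo": "bar"}},
    {"_id": "456", "_index": "other-index", "_source": {"foo": "rab"}},
]
targets = [resolve_target_index(d, path_index="my-index") for d in docs]
print(targets)  # ['my-index', 'other-index']
```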

Query parameters

pipeline string

The pipeline to use as the default pipeline. This value can be used to override the default pipeline of the index.
merge_type string

The mapping merge type to use when mapping overrides are provided in mapping_addition. The index option merges mappings the way they would be merged into an existing index. The template option merges mappings the way they would be merged into a template.

Values are index or template.
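As a rough sketch of how the path and query parameters combine, the Python below builds the request path for a simulation. The helper `simulate_path` is hypothetical; it only formats a string with the standard library and rejects `merge_type` values other than the two documented ones.

```python
from urllib.parse import quote, urlencode


def simulate_path(index=None, pipeline=None, merge_type=None):
    """Build the simulate ingest request path (hypothetical helper)."""
    if index:
        path = f"/_ingest/{quote(index, safe='')}/_simulate"
    else:
        path = "/_ingest/_simulate"
    params = {}
    if pipeline:
        params["pipeline"] = pipeline
    if merge_type:
        # Only the two documented values are accepted in this sketch.
        if merge_type not in ("index", "template"):
            raise ValueError("merge_type must be 'index' or 'template'")
        params["merge_type"] = merge_type
    return f"{path}?{urlencode(params)}" if params else path


print(simulate_path("my-index", pipeline="my-pipeline", merge_type="template"))
# /_ingest/my-index/_simulate?pipeline=my-pipeline&merge_type=template
```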

application/json

Body Required

docs array[object] Required

Sample documents to test in the pipeline.
- _id string
  
  Unique identifier for the document. This ID must be unique within the _index.
- _index string
  
  Name of the index containing the document.
- _source object Required
  
  JSON body for the document.
component_template_substitutions object

A map of component template names to substitute component template definition objects.
- * object
  
  template object Required
  
  _meta object
  
  * object Additional properties
  
  version number
  
  settings object
  
  * object
  Index settings
  
  mappings object
  
  date_detection boolean
  
  dynamic_date_formats array[string]
  
  dynamic_templates array[object]
  
  numeric_detection boolean
  
  properties object
  
  runtime object
  
  enabled boolean
  
  aliases object
  
  * object Additional properties
  
  index_routing string
  
  Value used to route indexing operations to a specific shard. If specified, this overwrites the routing value for indexing operations.
  
  is_write_index boolean
  
  If true, the index is the write index for the alias.
  
  Default value is false.
  
  routing string
  
  Value used to route indexing and search operations to a specific shard.
  
  search_routing string
  
  Value used to route search operations to a specific shard. If specified, this overwrites the routing value for search operations.
  
  is_hidden boolean Generally available; Added in 7.16.0
  
  If true, the alias is hidden. All indices for the alias must have the same is_hidden value.
  
  Default value is false.
  
  lifecycle object
  
  Data stream lifecycle with rollover can be used to display the configuration including the default rollover conditions, if asked.
  
  data_stream_options object
  
  Data stream options contain the configuration of data stream level features for a given data stream, for example, the failure store configuration.
  
  version number
  
  _meta object
  
  * object Additional properties
  
  deprecated boolean
  
  created_date string | number
  
  Date and time when the component template was created. Only returned if the human query parameter is true.
  
  One of: string, or number (epoch time in milliseconds)
  
  created_date_millis number
  
  Time unit for milliseconds
  
  modified_date string | number
  
  Date and time when the component template was last modified. Only returned if the human query parameter is true.
  
  One of: string, or number (epoch time in milliseconds)
  
  modified_date_millis number
  
  Time unit for milliseconds
index_template_substitutions object

A map of index template names to substitute index template definition objects.
- * object
  
  index_patterns string | array[string] Required
  
  Wildcard (*) expressions used to match the names of data streams and indices during creation.
  
  composed_of array[string] Required
  
  An ordered list of component template names. Component templates are merged in the order specified, meaning that the last component template specified has the highest precedence.
  
  template object
  
  Template to be applied. It may optionally include an aliases, mappings, or settings configuration.
  
  aliases object
  
  Aliases to add. If the index template includes a data_stream object, these are data stream aliases. Otherwise, these are index aliases. Data stream aliases ignore the index_routing, routing, and search_routing options.
  
  * object Additional properties
  
  is_hidden boolean
  
  If true, the alias is hidden. All indices for the alias must have the same is_hidden value.
  
  Default value is false.
  
  is_write_index boolean
  
  If true, the index is the write index for the alias.
  
  Default value is false.
  
  mappings object
  
  Mapping for fields in the index. If specified, this mapping can include field names, field data types, and mapping parameters.
  
  date_detection boolean
  
  dynamic_date_formats array[string]
  
  dynamic_templates array[object]
  
  numeric_detection boolean
  
  properties object
  
  runtime object
  
  enabled boolean
  
  settings object
  
  Configuration options for the index.
  
  Index settings
  
  lifecycle object
  
  Data stream lifecycle with rollover can be used to display the configuration including the default rollover conditions, if asked.
  
  data_stream_options object
  
  Data stream options contain the configuration of data stream level features for a given data stream, for example, the failure store configuration.
  
  version number
  
  Version number used to manage index templates externally. This number is not automatically generated by Elasticsearch.
  
  priority number
  
  Priority to determine index template precedence when a new data stream or index is created. The index template with the highest priority is chosen. If no priority is specified the template is treated as though it is of priority 0 (lowest priority). This number is not automatically generated by Elasticsearch.
  
  _meta object
  
  Optional user metadata about the index template. May have any contents. This map is not automatically generated by Elasticsearch.
  
  * object Additional properties
  
  allow_auto_create boolean
  
  data_stream object
  
  If this object is included, the template is used to create data streams and their backing indices. Supports an empty object. Data streams require a matching index template with a data_stream object.
  
  hidden boolean
  
  If true, the data stream is hidden.
  
  Default value is false.
  
  allow_custom_routing boolean
  
  If true, the data stream supports custom routing.
  
  Default value is false.
  
  deprecated boolean Generally available; Added in 8.12.0
  
  Marks this index template as deprecated. When creating or updating a non-deprecated index template that uses deprecated components, Elasticsearch will emit a deprecation warning.
  
  ignore_missing_component_templates string | array[string]
  
  A list of component template names that are allowed to be absent.
  
  created_date string | number
  
  Date and time when the index template was created. Only returned if the human query parameter is true.
  
  One of: string, or number (epoch time in milliseconds)
  
  created_date_millis number
  
  Time unit for milliseconds
  
  modified_date string | number
  
  Date and time when the index template was last modified. Only returned if the human query parameter is true.
  
  One of: string, or number (epoch time in milliseconds)
  
  modified_date_millis number
  
  Time unit for milliseconds
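The two ordering rules in the index template attributes above (`composed_of` merges component templates in order with the last one winning, and the template with the highest `priority` is chosen, with a missing priority treated as 0) can be pictured with a simplified Python model. Both function names are hypothetical, and the recursive dictionary merge is an assumption made for illustration; it is not Elasticsearch's actual merge logic.

```python
def merge_in_order(definitions):
    """Deep-merge dictionaries left to right; later values win on conflict."""
    def merge(base, extra):
        out = dict(base)
        for key, value in extra.items():
            if isinstance(value, dict) and isinstance(out.get(key), dict):
                out[key] = merge(out[key], value)
            else:
                out[key] = value
        return out

    result = {}
    for definition in definitions:
        result = merge(result, definition)
    return result


def pick_by_priority(index_templates):
    """Return the name of the template with the highest priority (default 0)."""
    name, _ = max(index_templates.items(), key=lambda item: item[1].get("priority", 0))
    return name


component_templates = {
    "component_template_1": {"mappings": {"properties": {"foo": {"type": "text"}}}},
    "component_template_2": {
        "mappings": {"properties": {"foo": {"type": "keyword"}, "bar": {"type": "keyword"}}}
    },
}
composed_of = ["component_template_1", "component_template_2"]
merged = merge_in_order(component_templates[name] for name in composed_of)
print(merged["mappings"]["properties"]["foo"])  # {'type': 'keyword'}

candidates = {
    "low": {"index_patterns": ["my-index-*"]},
    "high": {"index_patterns": ["my-index-*"], "priority": 10},
}
print(pick_by_priority(candidates))  # high
```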
mapping_addition object
- all_field object
  
  analyzer string Required
  
  enabled boolean Required
  
  omit_norms boolean Required
  
  search_analyzer string Required
  
  similarity string Required
  
  store boolean Required
  
  store_term_vector_offsets boolean Required
  
  store_term_vector_payloads boolean Required
  
  store_term_vector_positions boolean Required
  
  store_term_vectors boolean Required
- date_detection boolean
- dynamic string
  
  Values are strict, runtime, true, or false.
- dynamic_date_formats array[string]
- dynamic_templates array[object]
- _field_names object
  
  enabled boolean Required
- index_field object
  
  enabled boolean Required
- _meta object
  
  * object Additional properties
- numeric_detection boolean
- properties object
- _routing object
  
  required boolean Required
- _size object
  
  enabled boolean Required
- _source object
  
  compress boolean
  
  compress_threshold string
  
  enabled boolean
  
  excludes array[string]
  
  includes array[string]
  
  mode string
  
  Supported values include:
  
  disabled
  
  stored
  
  synthetic: Instead of storing source documents on disk exactly as you send them, Elasticsearch can reconstruct source content on the fly upon retrieval.
  
  Values are disabled, stored, or synthetic.
- runtime object
  
  * object Additional properties
  
  fields object
  
  For type composite
  
  * object Additional properties
  
  fetch_fields array[object]
  
  For type lookup
  
  field
  
  format string
  
  format string
  
  A custom format for date type runtime fields.
  
  input_field string
  
  For type lookup
  
  target_field string
  
  For type lookup
  
  target_index string
  
  For type lookup
  
  script object
  
  Painless script executed at query time.
  
  params object
  
  Specifies any named parameters that are passed into the script as variables. Use parameters instead of hard-coded values to decrease compile time.
  
  options object
  
  type string Required
  
  Field type. Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.
- enabled boolean
- subobjects string
  
  Values are true or false.
- _data_stream_timestamp object
  
  enabled boolean Required
pipeline_substitutions object

Substitute pipeline definitions, keyed by pipeline name. These are used in place of the pipeline definitions that are already in the system, either replacing existing definitions or creating new ones, and they apply only within this request.
- * object Additional properties
  
  description string
  
  Description of the ingest pipeline.
  
  on_failure array[object]
  
  Processors to run immediately after a processor failure.
  
  append object
  
  attachment object
  
  bytes object
  
  circle object
  
  community_id object
  
  convert object
  
  csv object
  
  date object
  
  date_index_name object
  
  dissect object
  
  dot_expander object
  
  drop object
  
  enrich object
  
  fail object
  
  fingerprint object
  
  foreach object
  
  ip_location object
  
  geo_grid object
  
  geoip object
  
  grok object
  
  gsub object
  
  html_strip object
  
  inference object
  
  join object
  
  json object
  
  kv object
  
  lowercase object
  
  network_direction object
  
  pipeline object
  
  redact object
  
  registered_domain object
  
  remove object
  
  rename object
  
  reroute object
  
  script object
  
  set object
  
  set_security_user object
  
  sort object
  
  split object
  
  terminate object
  
  trim object
  
  uppercase object
  
  urldecode object
  
  uri_parts object
  
  user_agent object
  
  processors array[object]
  
  Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.
  
  append object
  
  attachment object
  
  bytes object
  
  circle object
  
  community_id object
  
  convert object
  
  csv object
  
  date object
  
  date_index_name object
  
  dissect object
  
  dot_expander object
  
  drop object
  
  enrich object
  
  fail object
  
  fingerprint object
  
  foreach object
  
  ip_location object
  
  geo_grid object
  
  geoip object
  
  grok object
  
  gsub object
  
  html_strip object
  
  inference object
  
  join object
  
  json object
  
  kv object
  
  lowercase object
  
  network_direction object
  
  pipeline object
  
  redact object
  
  registered_domain object
  
  remove object
  
  rename object
  
  reroute object
  
  script object
  
  set object
  
  set_security_user object
  
  sort object
  
  split object
  
  terminate object
  
  trim object
  
  uppercase object
  
  urldecode object
  
  uri_parts object
  
  user_agent object
  
  version number
  
  Version number used by external systems to track ingest pipelines.
  
  deprecated boolean
  
  Marks this ingest pipeline as deprecated. When a deprecated ingest pipeline is referenced as the default or final pipeline when creating or updating a non-deprecated index template, Elasticsearch will emit a deprecation warning.
  
  Default value is false.
  
  _meta object
  
  Arbitrary metadata about the ingest pipeline. This map is not automatically generated by Elasticsearch.
  
  * object Additional properties
  
  created_date string | number
  
  Date and time when the pipeline was created. Only returned if the human query parameter is true.
  
  One of: string, or number (epoch time in milliseconds)
  
  created_date_millis number
  
  Time unit for milliseconds
  
  modified_date string | number
  
  Date and time when the pipeline was last modified. Only returned if the human query parameter is true.
  
  One of: string, or number (epoch time in milliseconds)
  
  modified_date_millis number
  
  Time unit for milliseconds
  
  field_access_pattern string
  
  Controls how processors in this pipeline should read and write data on a document's source.
  
  Values are classic or flexible.

Responses

200 application/json
- docs array[object] Required
  
  doc object
  
  The results of ingest simulation on a single document. The _source of the document contains the results after running all pipelines listed in executed_pipelines on the document. The list of executed pipelines is derived from the pipelines that would be executed if this document had been ingested into _index.
  
  _id string Required
  
  Identifier for the document.
  
  _index string Required
  
  Name of the index that the document would be indexed into if this were not a simulation.
  
  _source object Required
  
  JSON body for the document.
  
  * object Additional properties
  
  _version
  
  executed_pipelines array[string] Required
  
  A list of the names of the pipelines executed on this document.
  
  ignored_fields array[object]
  
  A list of the fields that would be ignored at the indexing step. For example, a field whose value is larger than the allowed limit would make it through all of the pipelines, but would not be indexed into Elasticsearch.
  
  error object
  
  Any error resulting from simulating ingest on this doc. This can be an error generated by executing a processor, or a mapping validation error when simulating indexing the resulting doc.
  
  effective_mapping object
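To show how the response attributes above might be consumed, here is a small Python sketch that flattens a response into one summary per document. The function `summarize_simulation` is hypothetical, and the sample response is hand-written to match the documented shape rather than captured from a cluster.

```python
def summarize_simulation(response):
    """Flatten a simulate ingest response into one summary per document."""
    summaries = []
    for entry in response.get("docs", []):
        doc = entry.get("doc") or {}
        summaries.append({
            "id": doc.get("_id"),
            "index": doc.get("_index"),
            "pipelines": doc.get("executed_pipelines", []),
            # Optional attributes: absent when nothing was ignored or failed.
            "ignored_fields": doc.get("ignored_fields", []),
            "error": doc.get("error"),
        })
    return summaries


sample_response = {
    "docs": [
        {
            "doc": {
                "_id": "123",
                "_index": "my-index",
                "_version": -3,
                "_source": {"foo": "BAR"},
                "executed_pipelines": ["my-pipeline", "my-final-pipeline"],
            }
        }
    ]
}

for summary in summarize_simulation(sample_response):
    print(summary)
```

Checking `error` per document is useful when troubleshooting, since a processor failure or mapping validation problem is reported on the affected document.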

POST /_ingest/_simulate
{
  "docs": [
    {
      "_id": "123",
      "_index": "my-index",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_id": "456",
      "_index": "my-index",
      "_source": {
        "foo": "rab"
      }
    }
  ]
}

resp = client.simulate.ingest(
    docs=[
        {
            "_id": "123",
            "_index": "my-index",
            "_source": {
                "foo": "bar"
            }
        },
        {
            "_id": "456",
            "_index": "my-index",
            "_source": {
                "foo": "rab"
            }
        }
    ],
)

const response = await client.simulate.ingest({
  docs: [
    {
      _id: "123",
      _index: "my-index",
      _source: {
        foo: "bar",
      },
    },
    {
      _id: "456",
      _index: "my-index",
      _source: {
        foo: "rab",
      },
    },
  ],
});

response = client.simulate.ingest(
  body: {
    "docs": [
      {
        "_id": "123",
        "_index": "my-index",
        "_source": {
          "foo": "bar"
        }
      },
      {
        "_id": "456",
        "_index": "my-index",
        "_source": {
          "foo": "rab"
        }
      }
    ]
  }
)

$resp = $client->simulate()->ingest([
    "body" => [
        "docs" => array(
            [
                "_id" => "123",
                "_index" => "my-index",
                "_source" => [
                    "foo" => "bar",
                ],
            ],
            [
                "_id" => "456",
                "_index" => "my-index",
                "_source" => [
                    "foo" => "rab",
                ],
            ],
        ),
    ],
]);

curl -X POST -H "Authorization: ApiKey $ELASTIC_API_KEY" -H "Content-Type: application/json" -d '{"docs":[{"_id":"123","_index":"my-index","_source":{"foo":"bar"}},{"_id":"456","_index":"my-index","_source":{"foo":"rab"}}]}' "$ELASTICSEARCH_URL/_ingest/_simulate"

client.simulate().ingest(i -> i
    .docs(List.of(
        Document.of(d -> d
            .id("123")
            .index("my-index")
            .source(JsonData.fromJson("{\"foo\":\"bar\"}"))),
        Document.of(d -> d
            .id("456")
            .index("my-index")
            .source(JsonData.fromJson("{\"foo\":\"rab\"}")))))
);

Request examples

In this example the index `my-index` has a default pipeline called `my-pipeline` and a final pipeline called `my-final-pipeline`. Since both documents are being ingested into `my-index`, both pipelines are run using the pipeline definitions that are already in the system.

{
  "docs": [
    {
      "_id": "123",
      "_index": "my-index",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_id": "456",
      "_index": "my-index",
      "_source": {
        "foo": "rab"
      }
    }
  ]
}

In this example the index `my-index` has a default pipeline called `my-pipeline` and a final pipeline called `my-final-pipeline`, but a substitute definition of `my-pipeline` is provided in `pipeline_substitutions`. The substitute `my-pipeline` is used in place of the `my-pipeline` that is in the system, and then the `my-final-pipeline` that is already defined in the system runs.

{
  "docs": [
    {
      "_index": "my-index",
      "_id": "123",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_index": "my-index",
      "_id": "456",
      "_source": {
        "foo": "rab"
      }
    }
  ],
  "pipeline_substitutions": {
    "my-pipeline": {
      "processors": [
        {
          "uppercase": {
            "field": "foo"
          }
        }
      ]
    }
  }
}

In this example, imagine that the index `my-index` has a strict mapping with only the `foo` keyword field defined, and that this field mapping came from a component template named `my-mappings-template`. To test adding a new field, `bar`, a substitute definition of `my-mappings-template` is provided in `component_template_substitutions`. The substitute `my-mappings-template` is used in place of the existing mapping for `my-index` and in place of the `my-mappings-template` that is in the system.

{
  "docs": [
    {
      "_index": "my-index",
      "_id": "123",
      "_source": {
        "foo": "foo"
      }
    },
    {
      "_index": "my-index",
      "_id": "456",
      "_source": {
        "bar": "rab"
      }
    }
  ],
  "component_template_substitutions": {
    "my-mappings-template": {
      "template": {
        "mappings": {
          "dynamic": "strict",
          "properties": {
            "foo": {
              "type": "keyword"
            },
            "bar": {
              "type": "keyword"
            }
          }
        }
      }
    }
  }
}
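To make the effect of the strict mapping concrete, the Python sketch below checks which top-level fields of a document a strict mapping would not define. The function `unmapped_fields` is hypothetical and deliberately simplified (top-level field names only); it approximates the idea that a strict mapping rejects unknown fields and is not Elasticsearch's validation code.

```python
def unmapped_fields(source, mappings):
    """Return top-level source fields that a strict mapping does not define."""
    if mappings.get("dynamic") != "strict":
        return []  # in this sketch only strict mappings reject unknown fields
    known = mappings.get("properties", {})
    return sorted(name for name in source if name not in known)


existing = {"dynamic": "strict", "properties": {"foo": {"type": "keyword"}}}
substitute = {
    "dynamic": "strict",
    "properties": {"foo": {"type": "keyword"}, "bar": {"type": "keyword"}},
}
doc_source = {"bar": "rab"}

print(unmapped_fields(doc_source, existing))    # ['bar']
print(unmapped_fields(doc_source, substitute))  # []
```

With the existing mapping the second document's `bar` field is unknown, while the substitute definition accepts it, which is the difference the simulation is meant to surface.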

In this example, the pipeline, component template, and index template substitutions replace the existing definitions of the same names for the duration of this request.

{
  "docs": [
    {
      "_id": "id",
      "_index": "my-index",
      "_source": {
        "foo": "bar"
      }
    },
    {
      "_id": "id",
      "_index": "my-index",
      "_source": {
        "foo": "rab"
      }
    }
  ],
  "pipeline_substitutions": {
    "my-pipeline": {
      "processors": [
        {
          "set": {
            "field": "field3",
            "value": "value3"
          }
        }
      ]
    }
  },
  "component_template_substitutions": {
    "my-component-template": {
      "template": {
        "mappings": {
          "dynamic": true,
          "properties": {
            "field3": {
              "type": "keyword"
            }
          }
        },
        "settings": {
          "index": {
            "default_pipeline": "my-pipeline"
          }
        }
      }
    }
  },
  "index_template_substitutions": {
    "my-index-template": {
      "index_patterns": [
        "my-index-*"
      ],
      "composed_of": [
        "component_template_1",
        "component_template_2"
      ]
    }
  },
  "mapping_addition": {
    "dynamic": "strict",
    "properties": {
      "foo": {
        "type": "keyword"
      }
    }
  }
}

Response examples (200)

A successful response when the simulation uses pipeline definitions that are already in the system.

{
  "docs": [
    {
      "doc": {
        "_id": "123",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "field1": "value1",
          "field2": "value2",
          "foo": "bar"
        },
        "executed_pipelines": [
          "my-pipeline",
          "my-final-pipeline"
        ]
      }
    },
    {
      "doc": {
        "_id": "456",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "field1": "value1",
          "field2": "value2",
          "foo": "rab"
        },
        "executed_pipelines": [
          "my-pipeline",
          "my-final-pipeline"
        ]
      }
    }
  ]
}

A successful response when the simulation uses pipeline substitutions.

{
  "docs": [
    {
      "doc": {
        "_id": "123",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "field2": "value2",
          "foo": "BAR"
        },
        "executed_pipelines": [
          "my-pipeline",
          "my-final-pipeline"
        ]
      }
    },
    {
      "doc": {
        "_id": "456",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "field2": "value2",
          "foo": "RAB"
        },
        "executed_pipelines": [
          "my-pipeline",
          "my-final-pipeline"
        ]
      }
    }
  ]
}

A successful response when the simulation uses component template substitutions.

{
  "docs": [
    {
      "doc": {
        "_id": "123",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "foo": "foo"
        },
        "executed_pipelines": [],
        "effective_mapping": {
          "_doc": {
            "properties": {
              "foo": {
                "type": "keyword"
              }
            }
          }
        }
      }
    },
    {
      "doc": {
        "_id": "456",
        "_index": "my-index",
        "_version": -3,
        "_source": {
          "bar": "rab"
        },
        "executed_pipelines": [],
        "effective_mapping": {
          "_doc": {
            "properties": {
              "bar": {
                "type": "keyword"
              }
            }
          }
        }
      }
    }
  ]
}
