Preview a transform Generally available; Added in 7.2.0

GET /_transform/{transform_id}/_preview

Generates a preview of the results that you will get when you create a transform with the same configuration.

It returns a maximum of 100 results. The calculations are based on all the current data in the source index. It also generates a list of mappings and settings for the destination index. These values are determined based on the field types of the source index and the transform aggregations.

Required authorization

  • Index privileges: read, view_index_metadata
  • Cluster privileges: manage_transform

Path parameters

  • transform_id string Required

    Identifier for the transform to preview. If you specify this path parameter, you cannot provide transform configuration details in the request body.
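
The rule above, that a request with a path parameter cannot also carry configuration in its body, can be sketched as a small request builder. This is a minimal sketch rather than client-library code: the function name `preview_request` and its return shape are hypothetical, and the body-only route `POST /_transform/_preview` is taken from the request example on this page.

```python
from typing import Optional, Tuple


def preview_request(
    transform_id: Optional[str] = None,
    config: Optional[dict] = None,
) -> Tuple[str, str, Optional[dict]]:
    """Return (method, path, body) for a preview call.

    Exactly one of transform_id or config must be given, mirroring the rule
    that a path-parameter request cannot include configuration in its body.
    """
    if (transform_id is None) == (config is None):
        raise ValueError("provide exactly one of transform_id or config")
    if transform_id is not None:
        # Preview an existing transform: no request body is sent.
        return "GET", f"/_transform/{transform_id}/_preview", None
    # Preview an ad hoc configuration: the configuration is the body.
    return "POST", "/_transform/_preview", config
```

For instance, `preview_request(transform_id="ecommerce")` selects the path-parameter form, while passing only `config` selects the body form. Passing both raises an error before any request is made.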

Query parameters

  • timeout string

    Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.

    Values are -1 or 0.

application/json

Body

  • dest object

    The destination for the transform.

    • index string

      The destination index for the transform. The mappings of the destination index are deduced based on the source fields when possible. If alternate mappings are required, use the create index API prior to starting the transform.

    • pipeline string

      The unique identifier for an ingest pipeline.

  • description string

    Free text description of the transform.

  • frequency string

    The interval between checks for changes in the source indices when the transform is running continuously. Also determines the retry interval in the event of transient failures while the transform is searching or indexing. The minimum value is 1s and the maximum is 1h.

  • pivot object

    The pivot method transforms the data by aggregating and grouping it. These objects define the group by fields and the aggregation to reduce the data.

    • aggregations object

      Defines how to aggregate the grouped data. The following aggregations are currently supported: average, bucket script, bucket selector, cardinality, filter, geo bounds, geo centroid, geo line, max, median absolute deviation, min, missing, percentiles, rare terms, scripted metric, stats, sum, terms, top metrics, value count, weighted average.

    • group_by object

      Defines how to group the data. More than one grouping can be defined per pivot. The following groupings are currently supported: date histogram, geotile grid, histogram, terms.

      • * object Additional properties
        • date_histogram object
        • geotile_grid object
        • histogram object
        • terms object
  • source object

    The source of the data for the transform.

    • index string | array[string] Required

      The source indices for the transform. It can be a single index, an index pattern (for example, "my-index-*"), an array of indices (for example, ["my-index-000001", "my-index-000002"]), or an array of index patterns (for example, ["my-index-*", "my-other-index-*"]). For remote indices, use the syntax "remote_name:index_name". If any indices are in remote clusters, the master node and at least one transform node must have the remote_cluster_client node role.

    • query object

      A query clause that retrieves a subset of data from the source index.

      • bool object
      • boosting object
      • common object Deprecated
      • combined_fields object
      • constant_score object
      • dis_max object
      • distance_feature
      • exists object
      • function_score object
      • fuzzy object

        Returns documents that contain terms similar to the search term, as measured by a Levenshtein edit distance.

      • geo_bounding_box object
      • geo_distance object
      • geo_grid object

        Matches geo_point and geo_shape values that intersect a grid cell from a GeoGrid aggregation.

      • geo_polygon object
      • geo_shape object
      • has_child object
      • has_parent object
      • ids object
      • intervals object

        Returns documents based on the order and proximity of matching terms.

      • knn object
      • match object

        Returns documents that match a provided text, number, date or boolean value. The provided text is analyzed before matching.

      • match_all object
      • match_bool_prefix object

        Analyzes its input and constructs a bool query from the terms. Each term except the last is used in a term query. The last term is used in a prefix query.

      • match_none object
      • match_phrase object

        Analyzes the text and creates a phrase query out of the analyzed text.

      • match_phrase_prefix object

        Returns documents that contain the words of a provided text, in the same order as provided. The last term of the provided text is treated as a prefix, matching any words that begin with that term.

      • more_like_this object
      • multi_match object
      • nested object
      • parent_id object
      • percolate object
      • prefix object

        Returns documents that contain a specific prefix in a provided field.

      • query_string object
      • range object

        Returns documents that contain terms within a provided range.

      • rank_feature object
      • regexp object

        Returns documents that contain terms matching a regular expression.

      • rule object
      • script object
      • script_score object
      • semantic object
      • shape object
      • simple_query_string object
      • span_containing object
      • span_field_masking object
      • span_first object
      • span_multi object
      • span_near object
      • span_not object
      • span_or object
      • span_term object

        Matches spans containing a term.

      • span_within object
      • term object

        Returns documents that contain an exact term in a provided field. To return a document, the query term must exactly match the queried field's value, including whitespace and capitalization.

      • terms object
      • terms_set object

        Returns documents that contain a minimum number of exact terms in a provided field. To return a document, a required number of terms must exactly match the field values, including whitespace and capitalization.

      • text_expansion object Deprecated Generally available; Added in 8.8.0

        Uses a natural language processing model to convert the query text into a list of token-weight pairs which are then used in a query against a sparse vector or rank features field.

      • weighted_tokens object Deprecated Generally available; Added in 8.13.0

        Supports returning text_expansion query results by sending in precomputed tokens with the query.

      • wildcard object

        Returns documents that contain terms matching a wildcard pattern.

      • wrapper object
      • type object
    • runtime_mappings object Generally available; Added in 7.12.0

      Definitions of search-time runtime fields that can be used by the transform. For search runtime fields, all data nodes, including remote nodes, must be 7.12 or later.

      • * object Additional properties
        • fields object

          For type composite

          • * object Additional properties
        • fetch_fields array[object]

          For type lookup

        • format string

          A custom format for date type runtime fields.

        • input_field string

          For type lookup

        • target_field string

          For type lookup

        • target_index string

          For type lookup

        • script object

          Painless script executed at query time.

        • type string Required

          The field type.

          Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.

  • settings object

    Defines optional transform settings.

    • align_checkpoints boolean

      Specifies whether the transform checkpoint ranges should be optimized for performance. Such optimization can align checkpoint ranges with the date histogram interval when date histogram is specified as a group source in the transform config. As a result, fewer document updates are performed in the destination index, which improves overall performance.

      Default value is true.

    • dates_as_epoch_millis boolean

      Defines whether dates in the output are written as ISO-formatted strings or as milliseconds since the epoch. epoch_millis was the default for transforms created before version 7.11. For compatible output, set this value to true.

      Default value is false.

    • deduce_mappings boolean

      Specifies whether the transform should deduce the destination index mappings from the transform configuration.

      Default value is true.

    • docs_per_second number

      Specifies a limit on the number of input documents per second. This setting throttles the transform by adding a wait time between search requests. The default value is null, which disables throttling.

    • max_page_search_size number

      Defines the initial page size to use for the composite aggregation for each checkpoint. If circuit breaker exceptions occur, the page size is dynamically adjusted to a lower value. The minimum value is 10 and the maximum is 65,536.

      Default value is 500.

    • use_point_in_time boolean

      Specifies whether the transform checkpoint uses the Point In Time API while searching over the source index. In general, Point In Time is an optimization that reduces pressure on the source index by reducing the number of refreshes and merges, but it can be expensive if a large number of Point In Times are opened and closed for a given index. The benefits and impact depend on the data being searched, the ingest rate into the source index, and the number of other consumers searching the same source index.

      Default value is true.

    • num_failure_retries number Generally available; Added in 8.4.0

      Defines the number of retries on a recoverable failure before the transform task is marked as failed. Valid values range from 0 to 100; a value of -1 means the transform retries indefinitely. If unset, the cluster-level setting num_transform_failure_retries is used.

      This setting cannot be specified when unattended is true, because unattended transforms always retry indefinitely.

    • unattended boolean Generally available; Added in 8.5.0

      If true, the transform runs in unattended mode. In unattended mode, the transform retries indefinitely when an error occurs, which means the transform never fails. Setting a number of retries other than infinite fails validation.

      Default value is false.

  • sync object

    Defines the properties transforms require to run continuously.

    • time object

      Specifies that the transform uses a time field to synchronize the source and destination indices.

      • delay string

        The time delay between the current time and the latest input data time.

      • field string Required

        The date field that is used to identify new documents in the source. In general, it’s a good idea to use a field that contains the ingest timestamp. If you use a different field, you might need to set the delay such that it accounts for data transmission delays.

  • retention_policy object

    Defines a retention policy for the transform. Data that meets the defined criteria is deleted from the destination index.

    • time object

      Specifies that the transform uses a time field to set the retention policy.

      • field string Required

        The date field that is used to calculate the age of the document.

      • max_age string Required

        Specifies the maximum age of a document in the destination index. Documents that are older than the configured value are removed from the destination index.

  • latest object

    The latest method transforms the data by finding the latest document for each unique key.

    • sort string Required

      Specifies the date field that is used to identify the latest documents.

    • unique_key array[string] Required

      Specifies an array of one or more fields that are used to group the data.
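
Several body parameters above carry numeric or structural constraints: the frequency range, the max_page_search_size range, the num_failure_retries range and its conflict with unattended, and the Required sub-fields of sync and latest. The following is a minimal client-side sketch of those checks, assuming a hypothetical helper named `check_transform_config`. It is not how Elasticsearch validates requests, and the time parser handles only a simplified subset of time units.

```python
import re

# Simplified subset of time units (assumption: micros and nanos are not handled).
_SECONDS_PER_UNIT = {"ms": 0.001, "s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(value: str) -> float:
    """Convert a time value such as "30s" or "1h" to seconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", value)
    if match is None:
        raise ValueError(f"unsupported time value: {value!r}")
    return int(match.group(1)) * _SECONDS_PER_UNIT[match.group(2)]


def check_transform_config(config: dict) -> list:
    """Return a list of problems; an empty list means none were found."""
    problems = []

    frequency = config.get("frequency")
    if frequency is not None and not 1 <= to_seconds(frequency) <= 3600:
        problems.append("frequency must be between 1s and 1h")

    settings = config.get("settings", {})
    page_size = settings.get("max_page_search_size")
    if page_size is not None and not 10 <= page_size <= 65536:
        problems.append("max_page_search_size must be between 10 and 65536")

    retries = settings.get("num_failure_retries")
    if retries is not None:
        if not -1 <= retries <= 100:
            problems.append("num_failure_retries must be between -1 and 100")
        if settings.get("unattended"):
            problems.append("num_failure_retries cannot be set when unattended is true")

    # Sub-fields marked Required in the parameter list above.
    sync_time = config.get("sync", {}).get("time")
    if sync_time is not None and "field" not in sync_time:
        problems.append("sync.time.field is required")
    latest = config.get("latest")
    if latest is not None:
        for key in ("sort", "unique_key"):
            if key not in latest:
                problems.append(f"latest.{key} is required")

    return problems
```

Running such a check before calling the preview endpoint can surface obvious mistakes locally, although the server response remains the authority on what is valid.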

Responses

  • 200 application/json
    • generated_dest_index object Required Additional properties
      • aliases object
        • * object Additional properties
          • filter object

            Query used to limit documents the alias can access.

          • index_routing string

            Value used to route indexing operations to a specific shard. If specified, this overwrites the routing value for indexing operations.

          • is_hidden boolean

            If true, the alias is hidden. All indices for the alias must have the same is_hidden value.

            Default value is false.

          • is_write_index boolean

            If true, the index is the write index for the alias.

            Default value is false.

          • routing string

            Value used to route indexing and search operations to a specific shard.

          • search_routing string

            Value used to route search operations to a specific shard. If specified, this overwrites the routing value for search operations.

      • mappings object
        • all_field object
        • date_detection boolean
        • dynamic string

          Values are strict, runtime, true, or false.

        • dynamic_date_formats array[string]
        • dynamic_templates array[object]
        • _field_names object
        • index_field object
        • _meta object
        • numeric_detection boolean
        • properties object
        • _routing object
        • _size object
        • _source object
        • runtime object
          • * object Additional properties
        • enabled boolean
        • subobjects string

          Values are true or false.

        • _data_stream_timestamp object
      • settings object
        • index object
        • mode string
        • routing_path string | array[string]

        • soft_deletes object
        • sort object
        • number_of_shards number | string Generally available

          Default value is 1.

        • number_of_replicas number | string Generally available

          Default value is 0.

        • number_of_routing_shards number
        • check_on_startup string

          Values are true, false, or checksum.

        • codec string

          Default value is LZ4.

        • routing_partition_size
        • load_fixed_bitset_filters_eagerly boolean

          Default value is true.

        • hidden boolean | string

          Default value is false.

        • auto_expand_replicas string | null

          One of: a string, with a default value of false, or null.

          A null value is interpreted as an actual value, unlike other uses of null that are equivalent to a missing value. It is used, for example, in settings, where using a null value for a setting resets it to its default value.

        • merge object
        • refresh_interval string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • max_result_window number

          Default value is 10000.0.

        • max_inner_result_window number

          Default value is 100.0.

        • max_rescore_window number

          Default value is 10000.0.

        • max_script_fields number

          Default value is 32.0.

        • max_ngram_diff number

          Default value is 1.0.

        • max_shingle_diff number

          Default value is 3.0.

        • blocks object
        • max_refresh_listeners number
        • analyze object

          Settings to define analyzers, tokenizers, token filters and character filters. Refer to the linked documentation for step-by-step examples of updating analyzers on existing indices.

        • highlight object
        • max_terms_count number

          Default value is 65536.0.

        • max_regex_length number

          Default value is 1000.0.

        • routing object
        • gc_deletes string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • default_pipeline string
        • final_pipeline string
        • lifecycle object
        • provided_name string
        • creation_date
        • creation_date_string
        • uuid string
        • version object
        • verified_before_close boolean | string

        • format string | number

        • max_slices_per_scroll number
        • translog object
        • query_string object
        • priority number | string

        • top_metrics_max_size number
        • analysis object
        • settings object
        • time_series object
        • queries object
        • similarity object

          Configure custom similarity settings to customize how search results are scored.

        • mapping object

          Enable or disable dynamic mapping for an index.

        • indexing.slowlog object
        • indexing_pressure object

          Configure indexing back pressure limits.

        • store object

          The store module allows you to control how index data is stored and accessed on disk.

      • defaults object

        Default settings, included when the request's include_default is true.

        • index object
        • mode string
        • routing_path string | array[string]

        • soft_deletes object
        • sort object
        • number_of_shards number | string Generally available

          Default value is 1.

        • number_of_replicas number | string Generally available

          Default value is 0.

        • number_of_routing_shards number
        • check_on_startup string

          Values are true, false, or checksum.

        • codec string

          Default value is LZ4.

        • routing_partition_size
        • load_fixed_bitset_filters_eagerly boolean

          Default value is true.

        • hidden boolean | string

          Default value is false.

        • auto_expand_replicas string | null

          One of: a string, with a default value of false, or null.

          A null value is interpreted as an actual value, unlike other uses of null that are equivalent to a missing value. It is used, for example, in settings, where using a null value for a setting resets it to its default value.

        • merge object
        • refresh_interval string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • max_result_window number

          Default value is 10000.0.

        • max_inner_result_window number

          Default value is 100.0.

        • max_rescore_window number

          Default value is 10000.0.

        • max_script_fields number

          Default value is 32.0.

        • max_ngram_diff number

          Default value is 1.0.

        • max_shingle_diff number

          Default value is 3.0.

        • blocks object
        • max_refresh_listeners number
        • analyze object

          Settings to define analyzers, tokenizers, token filters and character filters. Refer to the linked documentation for step-by-step examples of updating analyzers on existing indices.

        • highlight object
        • max_terms_count number

          Default value is 65536.0.

        • max_regex_length number

          Default value is 1000.0.

        • routing object
        • gc_deletes string

          A duration. Units can be nanos, micros, ms (milliseconds), s (seconds), m (minutes), h (hours) and d (days). Also accepts "0" without a unit and "-1" to indicate an unspecified value.

        • default_pipeline string
        • final_pipeline string
        • lifecycle object
        • provided_name string
        • creation_date
        • creation_date_string
        • uuid string
        • version object
        • verified_before_close boolean | string

        • format string | number

        • max_slices_per_scroll number
        • translog object
        • query_string object
        • priority number | string

        • top_metrics_max_size number
        • analysis object
        • settings object
        • time_series object
        • queries object
        • similarity object

          Configure custom similarity settings to customize how search results are scored.

        • mapping object

          Enable or disable dynamic mapping for an index.

        • indexing.slowlog object
        • indexing_pressure object

          Configure indexing back pressure limits.

        • store object

          The store module allows you to control how index data is stored and accessed on disk.

      • data_stream string
      • lifecycle object Generally available; Added in 8.11.0

        Data stream lifecycle applicable if this is a data stream.

        • data_retention string

          If defined, every document added to this data stream will be stored at least for this time frame. Any time after this duration the document could be deleted. When empty, every document in this data stream will be stored indefinitely.

        • downsampling array[object]

          The list of downsampling rounds to execute as part of this downsampling configuration

        • downsampling_method string

          The method used to downsample the data. There are two options: aggregate and last_value. It requires downsampling to be defined. Defaults to aggregate.

          Values are aggregate or last_value.

        • enabled boolean

          If defined, it turns data stream lifecycle on/off (true/false) for this data stream. A data stream lifecycle that's disabled (enabled: false) will have no effect on the data stream.

          Default value is true.

    • preview array[object] Required
POST /_transform/_preview
curl \
 --request POST 'http://api.example.com/_transform/_preview' \
 --header "Content-Type: application/json" \
 --data '{"source":{"index":"kibana_sample_data_ecommerce"},"pivot":{"group_by":{"customer_id":{"terms":{"field":"customer_id","missing_bucket":true}}},"aggregations":{"max_price":{"max":{"field":"taxful_total_price"}}}}}'
Request example
Run `POST _transform/_preview` to preview a transform that uses the pivot method.
{
  "source": {
    "index": "kibana_sample_data_ecommerce"
  },
  "pivot": {
    "group_by": {
      "customer_id": {
        "terms": {
          "field": "customer_id",
          "missing_bucket": true
        }
      }
    },
    "aggregations": {
      "max_price": {
        "max": {
          "field": "taxful_total_price"
        }
      }
    }
  }
}
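
The request example above covers the pivot method only. As a complement, here is a hedged sketch of a body that uses the latest method instead, built from the two Required fields listed for latest (sort and unique_key). The helper name `latest_preview_body` is hypothetical, and the use of `order_date` as the sort field is an assumption about the sample data set, not something this page states.

```python
import json


def latest_preview_body(source_index: str, unique_key: list, sort_field: str) -> dict:
    """Build a preview body that keeps the latest document per unique key."""
    return {
        "source": {"index": source_index},
        "latest": {
            # One or more fields that identify a group of documents.
            "unique_key": list(unique_key),
            # Date field used to decide which document in a group is the latest.
            "sort": sort_field,
        },
    }


# Assumed field names: customer_id appears in the pivot example above,
# order_date is an assumption about kibana_sample_data_ecommerce.
body = latest_preview_body("kibana_sample_data_ecommerce", ["customer_id"], "order_date")
print(json.dumps(body, indent=2))
```

Serializing the dictionary once with `json.dumps` produces a plain JSON object that can be sent as the request body.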
Response examples (200)
An abbreviated response from `POST _transform/_preview` that contains a preview of a transform that uses the pivot method.
{
  "preview": [
    {
      "max_price": 171,
      "customer_id": "10"
    },
    {
      "max_price": 233,
      "customer_id": "11"
    },
    {
      "max_price": 200,
      "customer_id": "12"
    },
    {
      "max_price": 301,
      "customer_id": "13"
    },
    {
      "max_price": 176,
      "customer_id": "14"
    },
    {
      "max_price": 2250,
      "customer_id": "15"
    },
    {
      "max_price": 170,
      "customer_id": "16"
    },
    {
      "max_price": 243,
      "customer_id": "17"
    },
    {
      "max_price": 154,
      "customer_id": "18"
    },
    {
      "max_price": 393,
      "customer_id": "19"
    },
    {
      "max_price": 165,
      "customer_id": "20"
    },
    {
      "max_price": 115,
      "customer_id": "21"
    },
    {
      "max_price": 192,
      "customer_id": "22"
    },
    {
      "max_price": 169,
      "customer_id": "23"
    },
    {
      "max_price": 230,
      "customer_id": "24"
    },
    {
      "max_price": 278,
      "customer_id": "25"
    },
    {
      "max_price": 200,
      "customer_id": "26"
    },
    {
      "max_price": 344,
      "customer_id": "27"
    },
    {
      "max_price": 175,
      "customer_id": "28"
    },
    {
      "max_price": 177,
      "customer_id": "29"
    },
    {
      "max_price": 190,
      "customer_id": "30"
    },
    {
      "max_price": 190,
      "customer_id": "31"
    },
    {
      "max_price": 205,
      "customer_id": "32"
    },
    {
      "max_price": 215,
      "customer_id": "33"
    },
    {
      "max_price": 270,
      "customer_id": "34"
    },
    {
      "max_price": 184,
      "customer_id": "36"
    },
    {
      "max_price": 222,
      "customer_id": "37"
    },
    {
      "max_price": 370,
      "customer_id": "38"
    },
    {
      "max_price": 240,
      "customer_id": "39"
    },
    {
      "max_price": 230,
      "customer_id": "4"
    },
    {
      "max_price": 229,
      "customer_id": "41"
    },
    {
      "max_price": 190,
      "customer_id": "42"
    },
    {
      "max_price": 150,
      "customer_id": "43"
    },
    {
      "max_price": 175,
      "customer_id": "44"
    },
    {
      "max_price": 190,
      "customer_id": "45"
    },
    {
      "max_price": 150,
      "customer_id": "46"
    },
    {
      "max_price": 310,
      "customer_id": "48"
    },
    {
      "max_price": 223,
      "customer_id": "49"
    },
    {
      "max_price": 283,
      "customer_id": "5"
    },
    {
      "max_price": 185,
      "customer_id": "50"
    },
    {
      "max_price": 190,
      "customer_id": "51"
    },
    {
      "max_price": 333,
      "customer_id": "52"
    },
    {
      "max_price": 165,
      "customer_id": "6"
    },
    {
      "max_price": 144,
      "customer_id": "7"
    },
    {
      "max_price": 198,
      "customer_id": "8"
    },
    {
      "max_price": 210,
      "customer_id": "9"
    }
  ],
  "generated_dest_index": {
    "mappings": {
      "_meta": {
        "_transform": {
          "transform": "transform-preview",
          "version": {
            "created": "10.0.0"
          },
          "creation_date_in_millis": 1712948905889
        },
        "created_by": "transform"
      },
      "properties": {
        "max_price": {
          "type": "half_float"
        },
        "customer_id": {
          "type": "keyword"
        }
      }
    },
    "settings": {
      "index": {
        "number_of_shards": "1",
        "auto_expand_replicas": "0-1"
      }
    },
    "aliases": {}
  }
}
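
A response shaped like the one above can be summarized with a few dictionary lookups. The following is a minimal sketch, assuming the response has already been parsed from JSON into a Python dictionary. The function name `summarize_preview` is hypothetical, and the sample input is a shortened copy of the response example.

```python
def summarize_preview(response: dict) -> dict:
    """Summarize a preview response: row count and deduced field types."""
    rows = response["preview"]
    dest = response["generated_dest_index"]
    properties = dest.get("mappings", {}).get("properties", {})
    return {
        "row_count": len(rows),
        # Fields without a "type" key (for example, object fields) map to None.
        "field_types": {name: spec.get("type") for name, spec in properties.items()},
    }


# Shortened copy of the response example above.
sample = {
    "preview": [
        {"max_price": 171, "customer_id": "10"},
        {"max_price": 233, "customer_id": "11"},
    ],
    "generated_dest_index": {
        "mappings": {
            "properties": {
                "max_price": {"type": "half_float"},
                "customer_id": {"type": "keyword"},
            }
        },
        "settings": {"index": {"number_of_shards": "1", "auto_expand_replicas": "0-1"}},
        "aliases": {},
    },
}

summary = summarize_preview(sample)
print(summary)
```

Checking the deduced types this way is one option for deciding whether to create the destination index with alternate mappings before starting the transform.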