Find the structure of a text field in an Elasticsearch index.
This API provides a starting point for extracting further information from log messages already ingested into Elasticsearch.
For example, if you have ingested data into a very simple index that has just @timestamp and message fields, you can use this API to see what common structure exists in the message field.
The response from the API contains:
All this information can be calculated by the structure finder with no guidance. However, you can optionally override some of the decisions about the text structure by specifying one or more query parameters.
If the structure finder produces unexpected results, specify the explain query parameter and an explanation will appear in the response.
It helps determine why the returned structure was chosen.
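As a rough illustration, the request URL can be assembled as shown below. This is a minimal sketch that uses only the Python standard library. The endpoint path _text_structure/find_field_structure and the parameter names index, field, and explain are taken from the Elasticsearch API reference rather than from this page; the host, the index name my-logs, and the helper name find_field_structure_url are placeholders.

```python
from urllib.parse import urlencode


def find_field_structure_url(base_url, index, field, **overrides):
    """Build the request URL; any overrides become extra query parameters."""
    params = {"index": index, "field": field}
    for name, value in overrides.items():
        # Booleans are lowercased to match the usual query-string form.
        params[name] = str(value).lower() if isinstance(value, bool) else value
    query = urlencode(params)
    return f"{base_url.rstrip('/')}/_text_structure/find_field_structure?{query}"


# Hypothetical host and index; explain=True asks for the explanation array.
url = find_field_structure_url("http://localhost:9200", "my-logs", "message", explain=True)
print(url)
```

The resulting URL would be sent as a GET request by whatever HTTP client you already use.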
Required cluster privilege: monitor_text_structure
If format is set to delimited, you can specify the column names in a comma-separated list.
If this parameter is not specified, the structure finder uses the column names from the header row of the text.
If the text does not have a header row, columns are named "column1", "column2", "column3", and so on.
If you have set format to delimited, you can specify the character used to delimit the values in each row.
Only a single character is supported; the delimiter cannot have multiple characters.
By default, the API considers the following possibilities: comma, tab, semi-colon, and pipe (|).
In this default scenario, all rows must have the same number of fields for the delimited format to be detected.
If you specify a delimiter, up to 10% of the rows can have a different number of columns than the first row.
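The two rules above (strict field counts when the delimiter is guessed, a 10% tolerance when it is specified) can be illustrated with a small sketch. This is not the structure finder's actual algorithm, only an approximation built on Python's csv module; the function names guess_delimiter and within_tolerance are hypothetical.

```python
import csv
import io

# Default candidates named in the text: comma, tab, semicolon, pipe.
CANDIDATES = [",", "\t", ";", "|"]


def _rows(sample, delimiter):
    return [row for row in csv.reader(io.StringIO(sample), delimiter=delimiter) if row]


def guess_delimiter(sample, candidates=CANDIDATES):
    """Return the first candidate that gives every row the same field count (> 1)."""
    for delimiter in candidates:
        widths = {len(row) for row in _rows(sample, delimiter)}
        if len(widths) == 1 and widths.pop() > 1:
            return delimiter
    return None


def within_tolerance(sample, delimiter, tolerance=0.10):
    """With an explicit delimiter, allow a share of rows to differ from the first row."""
    rows = _rows(sample, delimiter)
    expected = len(rows[0])
    mismatched = sum(1 for row in rows if len(row) != expected)
    return mismatched / len(rows) <= tolerance


print(guess_delimiter("a|b|c\n1|2|3\n4|5|6\n"))
```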
The number of documents to include in the structural analysis. The minimum value is 2.
The mode of compatibility with ECS compliant Grok patterns.
Use this parameter to specify whether to use ECS Grok patterns instead of legacy ones when the structure finder creates a Grok pattern.
This setting primarily has an impact when a whole message Grok pattern such as %{CATALINALOG} matches the input.
If the structure finder identifies a common structure but cannot infer what the fields mean, generic field names such as path, ipaddress, field1, and field2 are used in the grok_pattern output.
The intention in that situation is that a user who knows the meanings will rename the fields before using them.
Values are disabled or v1.
If true, the response includes a field named explanation, which is an array of strings that indicate how the structure finder produced its result.
The field that should be analyzed.
The high level structure of the text. By default, the API chooses the format. In this default scenario, all rows must have the same number of fields for a delimited format to be detected. If the format is set to delimited and the delimiter is not set, however, the API tolerates up to 5% of rows that have a different number of columns than the first row.
Values are delimited, ndjson, semi_structured_text, or xml.
If the format is semi_structured_text, you can specify a Grok pattern that is used to extract fields from every message in the text.
The name of the timestamp field in the Grok pattern must match what is specified in the timestamp_field parameter.
If that parameter is not specified, the name of the timestamp field in the Grok pattern must match "timestamp".
If grok_pattern is not specified, the structure finder creates a Grok pattern.
The name of the index that contains the analyzed field.
If the format is delimited, you can specify the character used to quote the values in each row if they contain newlines or the delimiter character.
Only a single character is supported.
If this parameter is not specified, the default value is a double quote (").
If your delimited text format does not use quoting, a workaround is to set this argument to a character that does not appear anywhere in the sample.
If the format is delimited, you can specify whether values between delimiters should have whitespace trimmed from them.
If this parameter is not specified and the delimiter is pipe (|), the default value is true.
Otherwise, the default value is false.
The maximum amount of time that the structure analysis can take. If the analysis is still running when the timeout expires, it will be stopped.
Besides a duration string, the special values -1 and 0 are accepted.
The name of the field that contains the primary timestamp of each record in the text.
In particular, if the text was ingested into an index, this is the field that would be used to populate the @timestamp field.
If the format is semi_structured_text, this field must match the name of the appropriate extraction in the grok_pattern.
Therefore, for semi-structured text, it is best not to specify this parameter unless grok_pattern is also specified.
For structured text, if you specify this parameter, the field must exist within the text.
If this parameter is not specified, the structure finder makes a decision about which field (if any) is the primary timestamp field. For structured text, it is not compulsory to have a timestamp in the text.
The Java time format of the timestamp field in the text. Only a subset of Java time format letter groups are supported:
a, d, dd, EEE, EEEE, H, HH, h, M, MM, MMM, MMMM, mm, ss, XX, XXX, yy, yyyy, zzz
Additionally, S letter groups (fractional seconds) of length one to nine are supported, provided they occur after ss and are separated from the ss by a period (.), comma (,), or colon (:).
Spacing and punctuation are also permitted, with the exception of a question mark (?), newline, and carriage return, together with literal text enclosed in single quotes.
For example, MM/dd HH.mm.ss,SSSSSS 'in' yyyy is a valid override format.
One valuable use case for this parameter is when the format is semi-structured text, there are multiple timestamp formats in the text, and you know which format corresponds to the primary timestamp, but you do not want to specify the full grok_pattern.
Another is when the timestamp format is one that the structure finder does not consider by default.
If this parameter is not specified, the structure finder chooses the best format from a built-in set.
If the special value null is specified, the structure finder will not look for a primary timestamp in the text.
When the format is semi-structured text, this will result in the structure finder treating the text as single-line messages.
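The rules for an override format can be checked mechanically. The sketch below is an illustrative validator, not the structure finder's own code: it assumes the supported letter groups are a, d, dd, EEE, EEEE, H, HH, h, M, MM, MMM, MMMM, mm, ss, XX, XXX, yy, yyyy, and zzz, and the function name is_valid_override is hypothetical.

```python
import re

ALLOWED_GROUPS = {
    "a", "d", "dd", "EEE", "EEEE", "H", "HH", "h", "M", "MM", "MMM", "MMMM",
    "mm", "ss", "XX", "XXX", "yy", "yyyy", "zzz",
}

# A token is a quoted literal, a run of one repeated letter, or any other single character.
TOKEN = re.compile(r"'[^']*'|([A-Za-z])\1*|.", re.DOTALL)


def is_valid_override(fmt):
    tokens = [match.group(0) for match in TOKEN.finditer(fmt)]
    for i, token in enumerate(tokens):
        if len(token) >= 2 and token.startswith("'") and token.endswith("'"):
            continue  # literal text in single quotes is always allowed
        if token[0].isalpha():
            if token[0] == "S":
                # Fractional seconds: 1-9 letters, directly after "ss" plus one separator.
                follows_ss = i >= 2 and tokens[i - 2] == "ss" and tokens[i - 1] in {".", ",", ":"}
                if not (1 <= len(token) <= 9 and follows_ss):
                    return False
            elif token not in ALLOWED_GROUPS:
                return False
        elif token in {"?", "\n", "\r"}:
            return False
    return True


print(is_valid_override("MM/dd HH.mm.ss,SSSSSS 'in' yyyy"))
```

The example format from the text passes, while a format such as yyyy-MM-dd HH:mm:SSS is rejected because the S group does not follow ss.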
Description of the ingest pipeline.
Processors used to perform transformations on documents before indexing. Processors run sequentially in the order specified.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If false, the processor does not append values already present in the field.
Default value is true.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
The maximum number of characters used for extraction, to prevent huge fields.
Use -1 for no limit.
Default value is 100000.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Array of properties to select to be stored.
Can be content, title, name, author, keywords, date, content_type, content_length, language.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true, the binary field will be removed from the document.
Default value is false.
Field containing the name of the resource to decode. If specified, the processor passes this resource name to the underlying Tika library to enable Resource Name Based Detection.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
The difference between the resulting inscribed distance from center to side and the circle’s radius (measured in meters for geo_shape, unit-less for shape).
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
Values are geo_shape or shape.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
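To make the error distance concrete: for a regular polygon inscribed in a circle of radius r with n sides, the distance from the center to a side is r * cos(pi / n), so the error is r * (1 - cos(pi / n)). The sketch below solves that relation for the smallest n that stays within a given error. It assumes this inscribed-polygon reading of the description and does not reproduce any limits the processor itself may place on the number of sides; the function names are hypothetical.

```python
import math


def apothem_error(radius, sides):
    """Difference between the radius and the center-to-side distance."""
    return radius * (1 - math.cos(math.pi / sides))


def sides_for_error(radius, error_distance):
    """Smallest number of sides whose apothem error is within error_distance."""
    if not 0 < error_distance < radius:
        raise ValueError("error_distance must be between 0 and the radius")
    return math.ceil(math.pi / math.acos(1 - error_distance / radius))


n = sides_for_error(100.0, 10.0)
print(n, apothem_error(100.0, n))
```

A smaller error distance therefore yields a polygon with more sides, which is more accurate but larger to store and index.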
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Seed for the community ID hash. Must be between 0 and 65535 (inclusive). The seed can prevent hash collisions between network domains, such as a staging and production network that use the same addressing scheme.
Default value is 0.
If true and any required fields are missing, the processor quietly exits without modifying the document.
Default value is true.
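The effect of the seed can be seen in a small sketch of the hash. This follows the Community ID version 1 layout as I understand it from the public specification (seed, ordered endpoint addresses, protocol number, a padding byte, then ordered ports, hashed with SHA-1 and Base64-encoded); it covers only port-based protocols such as TCP and UDP and is not the processor's own implementation. The function name and argument names are hypothetical.

```python
import base64
import hashlib
import ipaddress
import struct


def community_id(src_ip, src_port, dst_ip, dst_port, proto=6, seed=0):
    src = ipaddress.ip_address(src_ip).packed
    dst = ipaddress.ip_address(dst_ip).packed
    # Order the endpoints so both directions of a flow produce the same hash.
    if (src, src_port) > (dst, dst_port):
        src, src_port, dst, dst_port = dst, dst_port, src, src_port
    # struct rejects seeds outside 0..65535, matching the documented range.
    data = struct.pack("!H", seed) + src + dst
    data += struct.pack("!BBHH", proto, 0, src_port, dst_port)
    digest = hashlib.sha1(data).digest()
    return "1:" + base64.b64encode(digest).decode("ascii")


forward = community_id("128.232.110.120", 34855, "66.35.250.204", 80)
reverse = community_id("66.35.250.204", 80, "128.232.110.120", 34855)
seeded = community_id("128.232.110.120", 34855, "66.35.250.204", 80, seed=1)
print(forward == reverse, forward != seeded)
```

Because the seed is part of the hashed bytes, two networks that reuse the same addresses get different IDs when they use different seeds.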
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Values are integer, long, double, float, boolean, ip, string, or auto.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Value used to fill empty fields.
Empty fields are skipped if this is not provided.
An empty field is one with no value (2 consecutive separators) or empty quotes ("").
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist, the processor quietly exits without modifying the document.
Quote character used in CSV. It must be a single-character string.
Default value is ".
Separator used in CSV. It must be a single-character string.
Default value is ,.
Trim whitespace in unquoted fields.
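The handling of empty fields is easier to see with an example. The sketch below imitates it with Python's csv module; it is an approximation, not the processor's implementation. The parameter names separator, quote, empty_value, and target_fields are assumptions borrowed from the csv processor reference, and trimming is left out because the csv module does not report whether a field was quoted.

```python
import csv


def csv_to_fields(line, target_fields, separator=",", quote='"', empty_value=None):
    """Split one CSV line and map the values onto the given field names."""
    values = next(csv.reader([line], delimiter=separator, quotechar=quote))
    doc = {}
    for name, value in zip(target_fields, values):
        if value == "":
            if empty_value is None:
                continue  # empty fields are skipped when no fill value is given
            value = empty_value
        doc[name] = value
    return doc


# Both the bare gap and the empty quotes count as empty fields.
print(csv_to_fields('a,,"",d', ["w", "x", "y", "z"]))
print(csv_to_fields('a,,"",d', ["w", "x", "y", "z"], empty_value="-"))
```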
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
An array of the expected date formats. Can be a Java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
The locale to use when parsing the date, relevant when parsing month names or week days. Supports template snippets.
Default value is ENGLISH.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
The timezone to use when parsing the date. Supports template snippets.
Default value is UTC.
The format to use when writing the date to target_field. Must be a valid Java time pattern.
Default value is yyyy-MM-dd'T'HH:mm:ss.SSSXXX.
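The following sketch shows how an ordered list of formats might be tried against a value and how the default output format renders the result. It handles only the named formats UNIX, UNIX_MS, and ISO8601 (TAI64N and Java time patterns are left out), always outputs UTC, and is an approximation rather than the processor's implementation. The function names are hypothetical.

```python
from datetime import datetime, timezone


def parse_date(value, formats):
    """Try each named format in order and return an aware UTC datetime."""
    for fmt in formats:
        try:
            if fmt == "UNIX":
                return datetime.fromtimestamp(float(value), tz=timezone.utc)
            if fmt == "UNIX_MS":
                return datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)
            if fmt == "ISO8601":
                parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
                if parsed.tzinfo is None:
                    parsed = parsed.replace(tzinfo=timezone.utc)
                return parsed.astimezone(timezone.utc)
        except ValueError:
            continue  # this format did not match; try the next one
    raise ValueError(f"unable to parse {value!r} with {formats}")


def format_default(dt):
    """Render like yyyy-MM-dd'T'HH:mm:ss.SSSXXX, where XXX prints Z for UTC."""
    return dt.strftime("%Y-%m-%dT%H:%M:%S.") + f"{dt.microsecond // 1000:03d}Z"


print(format_default(parse_date("1461585721789", ["UNIX_MS"])))
print(format_default(parse_date("2016-04-25T12:02:01.789Z", ["UNIX", "ISO8601"])))
```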
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
An array of the expected date formats for parsing dates / timestamps in the document being preprocessed. Can be a Java time pattern or one of the following formats: ISO8601, UNIX, UNIX_MS, or TAI64N.
How to round the date when formatting the date into the index name. Valid values are:
y (year), M (month), w (week), d (day), h (hour), m (minute) and s (second).
Supports template snippets.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
The format to be used when printing the parsed date into the index name. A valid Java time pattern is expected here. Supports template snippets.
Default value is yyyy-MM-dd.
A prefix of the index name to be prepended before the printed date. Supports template snippets.
The locale to use when parsing the date from the document being preprocessed, relevant when parsing month names or week days.
Default value is ENGLISH.
The timezone to use when parsing the date and when date math index support resolves expressions into concrete index names.
Default value is UTC.
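How rounding, the name format, and the prefix combine can be sketched as follows. The example computes the concrete index name directly; it assumes week rounding goes back to Monday and uses a Python strftime pattern in place of the Java yyyy-MM-dd default. The function names and the prefix my-index- are placeholders.

```python
from datetime import datetime, timedelta, timezone

UNITS = ["y", "M", "d", "h", "m", "s"]
RESETS = [("month", 1), ("day", 1), ("hour", 0), ("minute", 0), ("second", 0)]


def round_down(dt, unit):
    """Round a datetime down to the start of the given unit."""
    if unit == "w":
        dt -= timedelta(days=dt.weekday())  # assumption: weeks start on Monday
        unit = "d"
    # Reset every field that is finer than the chosen unit.
    changes = dict(RESETS[UNITS.index(unit):], microsecond=0)
    return dt.replace(**changes)


def index_name(dt, prefix, rounding, fmt="%Y-%m-%d"):
    return prefix + round_down(dt, rounding).strftime(fmt)


stamp = datetime(2016, 4, 25, 12, 2, 1, 789000, tzinfo=timezone.utc)
print(index_name(stamp, "my-index-", "M"))
```

With month rounding, every document dated in April 2016 maps to the same index name.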
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
The character(s) that separate the appended fields.
Default value is "".
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
The pattern to apply to the field.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Controls the behavior when there is already an existing nested object that conflicts with the expanded field.
When false, the processor will merge conflicts by combining the old and the new values into an array.
When true, the value from the expanded field will overwrite the existing value.
Default value is false.
The field that contains the field to expand.
Only required if the field to expand is part of another object field, because the field option can only understand leaf fields.
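The conflict behavior can be illustrated on a plain dictionary. The sketch below is a simplified model of the expansion, not the processor's implementation: it handles a top-level dotted key only and assumes that any existing parent values are objects. The function name dot_expand is hypothetical, and the override argument mirrors the true/false behavior described above.

```python
def dot_expand(doc, field, override=False):
    """Expand a dotted key such as "foo.bar" into nested objects, in place."""
    value = doc.pop(field)
    *parents, leaf = field.split(".")
    node = doc
    for key in parents:
        node = node.setdefault(key, {})
    if leaf in node and not override:
        # Merge the conflict by combining the old and new values into an array.
        existing = node[leaf]
        node[leaf] = (existing if isinstance(existing, list) else [existing]) + [value]
    else:
        node[leaf] = value
    return doc


print(dot_expand({"foo.bar": "value2", "foo": {"bar": "value1"}}, "foo.bar"))
print(dot_expand({"foo.bar": "value2", "foo": {"bar": "value1"}}, "foo.bar", override=True))
```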
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
The maximum number of matched documents to include under the configured target field.
The target_field will be turned into a JSON array if max_matches is higher than 1, otherwise target_field will become a JSON object.
In order to avoid documents getting too large, the maximum allowed value is 128.
Default value is 1.
If true, the processor updates fields that already contain a non-null value.
When set to false, such fields will not be touched.
Default value is true.
The name of the enrich policy to use.
Values are intersects, disjoint, within, or contains.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
The error message thrown by the processor. Supports template snippets.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Salt value for the hash function.
Values are MD5, SHA-1, SHA-256, SHA-512, or MurmurHash3.
If true, the processor ignores any missing fields. If all fields are missing, the processor silently exits without modifying the document.
Default value is false.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true, the processor silently exits without changing the document if the field is null or missing.
Default value is false.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.
Default value is GeoLite2-City.mmdb.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true, only the first found IP location data will be returned, even if the field contains an array.
Default value is true.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
Controls what properties are added to the target_field based on the IP location lookup.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created.
Otherwise, the download is triggered when the pipeline is used as the default_pipeline or final_pipeline in an index.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
The field to interpret as a geo-tile.
The field format is determined by the tile_type.
Values are geotile, geohex, or geohash.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
Values are geojson or wkt.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
The database filename referring to a database the module ships with (GeoLite2-City.mmdb, GeoLite2-Country.mmdb, or GeoLite2-ASN.mmdb) or a custom database in the ingest-geoip config directory.
Default value is GeoLite2-City.mmdb.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true, only the first found geoip data will be returned, even if the field contains an array.
Default value is true.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
Controls what properties are added to the target_field based on the geoip lookup.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true (and if ingest.geoip.downloader.eager.download is false), the missing database is downloaded when the pipeline is created.
Otherwise, the download is triggered when the pipeline is used as the default_pipeline or final_pipeline in an index.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Must be disabled or v1. If v1, the processor uses patterns with Elastic Common Schema (ECS) field names.
Default value is disabled.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
A map of pattern-name and pattern tuples defining custom patterns to be used by the current processor. Patterns matching existing names will override the pre-existing definition.
An ordered list of grok expressions to match and extract named captures with. Returns on the first expression in the list that matches.
When true, _ingest._grok_match_index will be inserted into your matched document’s metadata with the index into the pattern found in patterns that matched.
Default value is false.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
The pattern to be replaced.
The string to replace the matching patterns with.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Maps the document field names to the known field names of the model. This mapping takes precedence over any default mappings provided in the model configuration.
If true and any of the input fields defined in input_output are missing, those missing fields are quietly ignored; otherwise, a missing field causes a failure. Only applies when using input_output configurations to explicitly list the input fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
The separator character.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Flag that forces the parsed JSON to be added at the top level of the document.
target_field must not be set when this option is chosen.
Default value is false.
Values are replace or merge.
When set to true, the JSON parser will not fail if the JSON contains duplicate keys.
Instead, the last encountered value for any duplicate key wins.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
List of keys to exclude from document.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Regex pattern to use for splitting key-value pairs.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
List of keys to filter and insert into document. Defaults to including all keys.
Prefix to be added to extracted keys.
Default value is null.
If true, strip brackets (), <>, [] as well as quotes ' and " from extracted values.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
String of characters to trim from extracted keys.
String of characters to trim from extracted values.
Regex pattern to use for splitting the key from the value within a key-value pair.
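The interplay of the two split patterns with the key filters, prefix, trimming, and bracket stripping can be sketched in a few lines. This is an approximation using Python's re module, whose regex syntax differs in places from the Java syntax the processor uses. The parameter names field_split, value_split, include_keys, exclude_keys, prefix, trim_key, trim_value, and strip_brackets are assumptions borrowed from the kv processor reference, and the function name is hypothetical.

```python
import re


def kv_extract(text, field_split, value_split, include_keys=None, exclude_keys=(),
               prefix="", trim_key="", trim_value="", strip_brackets=False):
    """Split text into pairs, then each pair into a key and a value."""
    out = {}
    for pair in re.split(field_split, text):
        parts = re.split(value_split, pair, maxsplit=1)
        if len(parts) != 2:
            continue  # not a key-value pair
        key, value = parts
        if trim_key:
            key = key.strip(trim_key)
        if trim_value:
            value = value.strip(trim_value)
        if strip_brackets:
            # Drop one leading and one trailing bracket or quote character.
            value = re.sub(r"""^[(<\['"]|[)>\]'"]$""", "", value)
        if key in exclude_keys:
            continue
        if include_keys is not None and key not in include_keys:
            continue
        out[prefix + key] = value
    return out


print(kv_extract("ip=1.2.3.4 error=REFUSED", " ", "="))
```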
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
List of internal networks. Supports IPv4 and IPv6 addresses and ranges in CIDR notation. Also supports the named ranges listed below. These may be constructed with template snippets. Must specify only one of internal_networks or internal_networks_field.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and any required fields are missing, the processor quietly exits without modifying the document.
Default value is true.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Whether to ignore missing pipelines instead of failing.
Default value is false.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
A list of grok expressions to match and redact named captures with.
Start a redacted section with this token.
Default value is <.
End a redacted section with this token.
Default value is >.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
If true and the current license does not support running redact processors, then the processor quietly exits without modifying the document.
Default value is false.
If true, the ingest metadata _ingest._redact._is_redacted is set to true if the document has been redacted.
Default value is false.
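The start and end tokens wrap the name of whatever was matched. The sketch below shows the idea with ordinary regular expressions standing in for grok expressions, keyed by the capture name; it is an illustration, not the processor's implementation. The function name redact_text and the simplified IP pattern are hypothetical, and the returned flag only loosely mirrors the metadata value described above.

```python
import re


def redact_text(text, patterns, prefix="<", suffix=">"):
    """Replace every match with prefix + capture name + suffix."""
    redacted = False
    for name, regex in patterns.items():
        token = f"{prefix}{name}{suffix}"
        # A callable replacement avoids any escape handling in the token.
        text, count = re.subn(regex, lambda _match: token, text)
        redacted = redacted or count > 0
    return text, redacted


# Simplified IPv4 pattern used only for this example.
ip_pattern = {"client": r"\b\d{1,3}(?:\.\d{1,3}){3}\b"}
print(redact_text("55.3.244.1 GET /index.html", ip_pattern))
```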
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and any required fields are missing, the processor quietly exits without modifying the document.
Default value is true.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
A static value for the target. Can’t be set when the dataset or namespace option is set.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Object containing parameters for the script.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and value is a template snippet that evaluates to null or the empty string, the processor quietly exits without modifying the document.
Default value is false.
The media type for encoding value.
Applies only when value is a template snippet.
Must be one of application/json, text/plain, or application/x-www-form-urlencoded.
If true, the processor updates fields that already have a non-null value.
When set to false, such fields are not touched.
Default value is true.
The value to be set for the field.
Supports template snippets.
May specify only one of value or copy_from.
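The interaction between the override and empty-value options described above can be sketched as follows. This is a simplified model under stated assumptions: the helper `apply_set` is hypothetical, it works on flat dictionary keys only, and it ignores template snippets and copy_from.

```python
def apply_set(doc, field, value, override=True, ignore_empty_value=False):
    """Return a copy of doc with field set, following the documented options."""
    # With ignore_empty_value, a null or empty value leaves the document as is.
    if ignore_empty_value and (value is None or value == ""):
        return dict(doc)
    # With override disabled, a pre-existing non-null value is not touched.
    if not override and doc.get(field) is not None:
        return dict(doc)
    updated = dict(doc)
    updated[field] = value
    return updated


doc = {"env": "staging", "owner": None}
overwritten = apply_set(doc, "env", "prod")
kept = apply_set(doc, "env", "prod", override=False)
filled_null = apply_set(doc, "owner", "ops", override=False)
skipped_empty = apply_set(doc, "env", "", ignore_empty_value=True)
```

In this model a field that is missing or null is always set, because override only protects values that are already non-null.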
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Controls what user related properties are added to the field.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Values are asc or desc.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
Preserves empty trailing fields, if any.
Default value is false.
A regex which matches the separator, for example, , or \s+.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
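To make the separator and trailing-field options concrete, here is a small sketch. It assumes Python's `re` module as a stand-in for the Java regular expressions that Elasticsearch uses, so edge cases (for example empty input, or separators with capturing groups) may differ; the helper name `split_field` is hypothetical.

```python
import re


def split_field(value, separator, preserve_trailing=False):
    """Split value on a regex separator, optionally keeping empty trailing fields."""
    parts = re.split(separator, value)
    if not preserve_trailing:
        # Drop empty strings at the end only; empty fields in the middle stay.
        while parts and parts[-1] == "":
            parts.pop()
    return parts


trimmed = split_field("A,,B,,", ",")
preserved = split_field("A,,B,,", ",", preserve_trailing=True)
by_whitespace = split_field("one  two\tthree", r"\s+")
```

The first call yields three entries because the two empty trailing fields are dropped, while the second keeps all five.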
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist or is null, the processor quietly exits without modifying the document.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
If true, the processor copies the unparsed URI to <target_field>.original.
Default value is true.
If true, the processor removes the field after parsing the URI string.
If parsing fails, the processor does not remove the field.
Default value is false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Description of the processor. Useful for describing the purpose of the processor or its configuration.
Ignore failures for the processor.
Handle failures for the processor.
Identifier for the processor. Useful for debugging and metrics.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
If true and field does not exist, the processor quietly exits without modifying the document.
Default value is false.
The name of the file in the config/ingest-user-agent directory containing the regular expressions for parsing the user agent string. Both the directory and the file must be created before starting Elasticsearch. If not specified, ingest-user-agent uses the regexes.yaml from uap-core that it ships with.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Controls what properties are added to target_field.
Values are name, os, device, original, or version. Default value is ["name", "major", "minor", "patch", "build", "os", "os_name", "os_major", "os_minor", "device"].
Extracts device type from the user agent string on a best-effort basis.
Default value is false.
Values are strict, runtime, true, or false.
For type composite
For type lookup
A custom format for date type runtime fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
Values are boolean, composite, date, double, geo_point, geo_shape, ip, keyword, long, or lookup.
Values are true or false.
Path to field or array of paths. Some APIs support wildcards in the path to select multiple fields.
GET _text_structure/find_field_structure?index=test-logs&field=message
resp = client.text_structure.find_field_structure(
index="test-logs",
field="message",
)
const response = await client.textStructure.findFieldStructure({
index: "test-logs",
field: "message",
});
response = client.text_structure.find_field_structure(
index: "test-logs",
field: "message"
)
$resp = $client->textStructure()->findFieldStructure([
"index" => "test-logs",
"field" => "message",
]);
curl -X GET -H "Authorization: ApiKey $ELASTIC_API_KEY" "$ELASTICSEARCH_URL/_text_structure/find_field_structure?index=test-logs&field=message"
client.textStructure().findFieldStructure(f -> f
.field("message")
.index("test-logs")
);
{
"num_lines_analyzed" : 22,
"num_messages_analyzed" : 22,
"sample_start" : "[2024-03-05T10:52:36,256][INFO ][o.a.l.u.VectorUtilPanamaProvider] [laptop] Java vector incubator API enabled; uses preferredBitSize=128\n[2024-03-05T10:52:41,038][INFO ][o.e.p.PluginsService ] [laptop] loaded module [repository-url]\n",
"charset" : "UTF-8",
"format" : "semi_structured_text",
"multiline_start_pattern" : "^\\[\\b\\d{4}-\\d{2}-\\d{2}[T ]\\d{2}:\\d{2}",
"grok_pattern" : "\\[%{TIMESTAMP_ISO8601:timestamp}\\]\\[%{LOGLEVEL:loglevel} \\]\\[.*",
"ecs_compatibility" : "disabled",
"timestamp_field" : "timestamp",
"joda_timestamp_formats" : [
"ISO8601"
],
"java_timestamp_formats" : [
"ISO8601"
],
"need_client_timezone" : true,
"mappings" : {
"properties" : {
"@timestamp" : {
"type" : "date"
},
"loglevel" : {
"type" : "keyword"
},
"message" : {
"type" : "text"
}
}
},
"ingest_pipeline" : {
"description" : "Ingest pipeline created by text structure finder",
"processors" : [
{
"grok" : {
"field" : "message",
"patterns" : [
"\\[%{TIMESTAMP_ISO8601:timestamp}\\]\\[%{LOGLEVEL:loglevel} \\]\\[.*"
],
"ecs_compatibility" : "disabled"
}
},
{
"date" : {
"field" : "timestamp",
"timezone" : "{{ event.timezone }}",
"formats" : [
"ISO8601"
]
}
},
{
"remove" : {
"field" : "timestamp"
}
}
]
},
"field_stats" : {
"loglevel" : {
"count" : 22,
"cardinality" : 1,
"top_hits" : [
{
"value" : "INFO",
"count" : 22
}
]
},
"message" : {
"count" : 22,
"cardinality" : 22,
"top_hits" : [
{
"value" : "[2024-03-05T10:52:36,256][INFO ][o.a.l.u.VectorUtilPanamaProvider] [laptop] Java vector incubator API enabled; uses preferredBitSize=128",
"count" : 1
},
{
"value" : "[2024-03-05T10:52:41,038][INFO ][o.e.p.PluginsService ] [laptop] loaded module [repository-url]",
"count" : 1
},
{
"value" : "[2024-03-05T10:52:41,042][INFO ][o.e.p.PluginsService ] [laptop] loaded module [rest-root]",
"count" : 1
},
{
"value" : "[2024-03-05T10:52:41,043][INFO ][o.e.p.PluginsService ] [laptop] loaded module [ingest-user-agent]",
"count" : 1
},
{
"value" : "[2024-03-05T10:52:41,043][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-core]",
"count" : 1
},
{
"value" : "[2024-03-05T10:52:41,043][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-redact]",
"count" : 1
},
{
"value" : "[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [lang-painless]]",
"count" : 1
},
{
"value" : "[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [repository-s3]",
"count" : 1
},
{
"value" : "[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-analytics]",
"count" : 1
},
{
"value" : "[2024-03-05T10:52:41,044][INFO ][o.e.p.PluginsService ] [laptop] loaded module [x-pack-autoscaling]",
"count" : 1
}
]
},
"timestamp" : {
"count" : 22,
"cardinality" : 14,
"earliest" : "2024-03-05T10:52:36,256",
"latest" : "2024-03-05T10:52:49,199",
"top_hits" : [
{
"value" : "2024-03-05T10:52:41,044",
"count" : 6
},
{
"value" : "2024-03-05T10:52:41,043",
"count" : 3
},
{
"value" : "2024-03-05T10:52:41,059",
"count" : 2
},
{
"value" : "2024-03-05T10:52:36,256",
"count" : 1
},
{
"value" : "2024-03-05T10:52:41,038",
"count" : 1
},
{
"value" : "2024-03-05T10:52:41,042",
"count" : 1
},
{
"value" : "2024-03-05T10:52:43,291",
"count" : 1
},
{
"value" : "2024-03-05T10:52:46,098",
"count" : 1
},
{
"value" : "2024-03-05T10:52:47,227",
"count" : 1
},
{
"value" : "2024-03-05T10:52:47,259",
"count" : 1
}
]
}
}
}
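As a rough illustration of how a client might consume a response like the one above, the following sketch pulls out the processor order of the suggested ingest pipeline and the suggested mapping types. The `response` dictionary is a hand-trimmed copy of the example response, and the helper name `summarize` is hypothetical; no cluster call is made.

```python
# Trimmed copy of the example response; only the keys used below are kept.
response = {
    "format": "semi_structured_text",
    "timestamp_field": "timestamp",
    "mappings": {
        "properties": {
            "@timestamp": {"type": "date"},
            "loglevel": {"type": "keyword"},
            "message": {"type": "text"},
        }
    },
    "ingest_pipeline": {
        "description": "Ingest pipeline created by text structure finder",
        "processors": [
            {"grok": {"field": "message", "ecs_compatibility": "disabled"}},
            {"date": {"field": "timestamp", "formats": ["ISO8601"]}},
            {"remove": {"field": "timestamp"}},
        ],
    },
}


def summarize(resp):
    """Return the processor types in order and a field-to-type mapping."""
    # Each processor is an object with a single key naming its type.
    processors = [next(iter(p)) for p in resp["ingest_pipeline"]["processors"]]
    field_types = {
        name: spec["type"] for name, spec in resp["mappings"]["properties"].items()
    }
    return processors, field_types


processors, field_types = summarize(response)
```

Here the pipeline parses the message with grok, converts the extracted timestamp with the date processor, and then removes the temporary timestamp field, which is why timestamp is absent from the suggested mappings while @timestamp is present.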