An ingest pipeline is made up of a sequence of processors that are applied to documents as they are ingested into an index. Each processor performs a specific task, such as filtering, transforming, or enriching data.
Each successive processor depends on the output of the previous processor, so the order of processors is important. The modified documents are indexed into Elasticsearch after all processors are applied.
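For example, a pipeline that renames a field and then lowercases it could be defined as follows. This is a sketch using the elasticsearch-ruby client; the pipeline ID and field names are illustrative, not from this page:

```ruby
# Illustrative pipeline body: processors run in the order listed,
# so the lowercase processor sees the field already renamed.
pipeline = {
  description: 'Rename "prov" to "province", then lowercase it',
  processors: [
    { rename:    { field: 'prov', target_field: 'province' } },
    { lowercase: { field: 'province' } }
  ]
}

# With an elasticsearch-ruby client instance (assumed to be `client`):
# client.ingest.put_pipeline(id: 'my-pipeline', body: pipeline)
# client.index(index: 'my-index', pipeline: 'my-pipeline', body: { prov: 'ON' })
```

Reversing the two processors would fail, because the lowercase processor would run before the province field exists.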
Elasticsearch includes over 40 configurable processors. The subpages in this section contain reference documentation for each processor. To get a list of available processors, use the nodes info API.
```ruby
response = client.nodes.info(
  node_id: 'ingest',
  filter_path: 'nodes.*.ingest.processors'
)
puts response
```
Ingest processors by category
We’ve categorized the available processors on this page and summarized their functions. This will help you find the right processor for your use case.
Data enrichment processors
Refer to Enrich your data for detailed examples of how to use the enrich processor to add data from your existing indices to incoming documents during ingest.
- Uses machine learning to classify and tag text fields.
- Parses and indexes binary data, such as PDFs and Word documents.
- Converts a location field to a Geo-Point field.
- Computes the Community ID for network flow data.
- Computes a hash of the document’s content.
- Converts geo-grid definitions of grid tiles or cells to regular bounding boxes or polygons which describe their shape.
- Adds information about the geographical location of an IPv4 or IPv6 address.
- Calculates the network direction given a source IP address, destination IP address, and a list of internal networks.
- Extracts the registered domain (also known as the effective top-level domain or eTLD), sub-domain, and top-level domain from a fully qualified domain name (FQDN).
- Sets user-related details (such as authentication_type) from the current authenticated user to the current document by pre-processing the ingest.
- Parses a Uniform Resource Identifier (URI) string and extracts its components as an object.
- URL-decodes a string.
- Parses user-agent strings to extract information about web clients.
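Before wiring enrichment processors into a production pipeline, you can dry-run them against sample documents with the simulate pipeline API, which indexes nothing. A sketch with the elasticsearch-ruby client; the field names and sample document are illustrative:

```ruby
# Illustrative simulate request: run a geoip + user_agent pipeline
# against a sample document without indexing anything.
simulate_request = {
  pipeline: {
    processors: [
      { geoip:      { field: 'source_ip' } },
      { user_agent: { field: 'agent' } }
    ]
  },
  docs: [
    { _source: { source_ip: '89.160.20.128', agent: 'Mozilla/5.0' } }
  ]
}

# With a client instance (assumed to be `client`):
# response = client.ingest.simulate(body: simulate_request)
# puts response
```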
Data transformation processors
- Converts a field in the currently ingested document to a different type, such as converting a string to an integer.
- Extracts structured fields out of a single text field within a document. Unlike the grok processor, dissect does not use regular expressions, which makes dissect a simpler and often faster alternative.
- Extracts structured fields out of a single text field within a document, using the Grok regular expression dialect that supports reusable aliased expressions.
- Converts a string field by applying a regular expression and a replacement.
- Uses the Grok rules engine to obscure text in the input document matching the given Grok patterns.
- Renames an existing field.
- Sets a value on a field.
- Converts a human-readable byte value to its value in bytes (for example, 1kb becomes 1024).
- Extracts a single line of CSV data from a text field.
- Extracts and converts date fields.
- Expands a field with dots into an object field.
- Removes HTML tags from a field.
- Joins each element of an array into a single string using a separator character between each element.
- Parses messages (or specific event fields) containing key-value pairs.
- Converts a string field to lowercase or uppercase.
- Splits a field into an array of values.
- Trims whitespace from a field.
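To illustrate the dissect idea above without a cluster, here is a plain-Ruby approximation of what a pattern like %{client} %{method} %{path} extracts. This is a sketch of the behavior, not the processor itself, and the field names are illustrative:

```ruby
# Plain-Ruby approximation of the dissect pattern "%{client} %{method} %{path}":
# split on the literal delimiter (a space) and map each piece to its key.
line = '192.0.2.1 GET /search?q=pipelines'
client_ip, method, path = line.split(' ', 3)
doc = { 'client' => client_ip, 'method' => method, 'path' => path }
```

Because dissect only matches literal delimiters like this, it avoids the backtracking cost of regular expressions, which is why it is often faster than grok.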
Data filtering processors
- Drops the document without raising any errors.
- Removes existing fields.
Pipeline handling processors
- Raises an exception. Useful for when you expect a pipeline to fail and want to relay a specific message to the requester.
- Executes another pipeline.
Array/JSON handling processors
- Runs an ingest processor on each element of an array or object.
- Converts a JSON string into a structured JSON object.
- Runs an inline or stored script on incoming documents. The script runs in the ingest context of the Painless scripting language.
- Sorts the elements of an array in ascending or descending order.
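The json and foreach behaviors above can be sketched in plain Ruby. This mimics what the processors do to a document; the field names are illustrative and this is not the processors' actual implementation:

```ruby
require 'json'

# Analogue of the json processor: parse a JSON string field
# into a structured object on the document.
doc = { 'message' => '{"user":"kim","roles":["admin","dev"]}' }
doc['parsed'] = JSON.parse(doc['message'])

# Analogue of foreach: apply a processor (here, uppercase)
# to each element of an array field.
doc['parsed']['roles'].map!(&:upcase)
```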
Add additional processors
You can install additional processors as plugins.
You must install any plugin processors on all nodes in your cluster. Otherwise, Elasticsearch will fail to create pipelines containing the processor.
Mark a plugin as mandatory by setting plugin.mandatory in elasticsearch.yml. A node will fail to start if a mandatory plugin is not installed.
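For example, assuming the plugin.mandatory setting and a hypothetical ingest plugin name:

```yaml
# elasticsearch.yml — the node refuses to start unless this plugin is installed
plugin.mandatory: ingest-attachment
```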