Ingest Node Pipelines

Ingest Node Pipelines enables you to create and manage Elasticsearch pipelines that perform common transformations and enrichments on your data. For example, you might remove a field, rename an existing field, or set a new field.
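
Under the hood, each of these transformations maps to a processor in the Elasticsearch ingest pipeline API. As a minimal sketch (the pipeline name my_pipeline and the field names are hypothetical), the same set, rename, and remove operations could be expressed like this:

PUT _ingest/pipeline/my_pipeline
{
  "description": "Hypothetical example: set, rename, and remove fields",
  "processors": [
    { "set": { "field": "environment", "value": "production" } },
    { "rename": { "field": "hostname", "target_field": "host.name" } },
    { "remove": { "field": "temporary_field" } }
  ]
}

Pipelines you build in the UI are stored as definitions of this form, so anything you create below can also be inspected with the ingest pipeline APIs.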

To begin, open the main menu, then click Stack Management > Ingest Node Pipelines. With Ingest Node Pipelines, you can:

  • View a list of your pipelines and drill down into details.
  • Create a pipeline that defines a series of tasks, known as processors.
  • Test a pipeline before feeding it with real data to ensure the pipeline works as expected.
  • Delete a pipeline that is no longer needed.

Ingest node pipeline list

Required permissions

The minimum required permissions to access Ingest Node Pipelines are the manage_pipeline and cluster:monitor/nodes/info cluster privileges.

To add privileges, open the main menu, then click Stack Management > Roles.

Privileges required for Ingest Node Pipelines
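
If you manage roles through the Elasticsearch API rather than the UI, the same privileges can be granted with the create role API. A minimal sketch, assuming a hypothetical role name pipeline_manager (combine it with whatever Kibana access your users already have):

POST _security/role/pipeline_manager
{
  "cluster": [ "manage_pipeline", "cluster:monitor/nodes/info" ]
}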

Manage pipelines

From the list view, you can drill down into the details of a pipeline. To edit, clone, or delete a pipeline, use the Actions menu.

If you don’t have any pipelines, you can create one using the Create pipeline form. You’ll define processors to transform documents in a specific way. To handle exceptions, you can optionally define failure processors to execute immediately after a failed processor. Before creating the pipeline, you can verify that it produces the expected output.
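
In the underlying pipeline definition, failure handling is expressed with on_failure processor lists. A minimal sketch (the pipeline and field names are hypothetical): if the rename processor fails, the set processor under on_failure records an error message instead of rejecting the document:

PUT _ingest/pipeline/my_pipeline
{
  "processors": [
    { "rename": { "field": "provider", "target_field": "cloud.provider" } }
  ],
  "on_failure": [
    { "set": { "field": "error.message", "value": "rename of the provider field failed" } }
  ]
}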

Example: Create a pipeline

In this example, you’ll create a pipeline to handle server logs in the Common Log Format. The log looks similar to this:

212.87.37.154 - - [05/May/2020:16:21:15 +0000] "GET /favicon.ico HTTP/1.1"
200 3638 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36"

The log contains an IP address, timestamp, and user agent. You want to give each of these three items its own field in Elasticsearch for fast search and visualization. You also want to know where the request is coming from.

  1. In Ingest Node Pipelines, click Create a pipeline.
  2. Provide a name and description for the pipeline. This example uses the name access_logs.
  3. Add a grok processor to parse the log message:

    1. Click Add a processor and select the Grok processor type.
    2. Set the field input to message and enter the following grok pattern:

      %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}
    3. Click Update to save the processor.
  4. Add processors to map the date, IP, and user agent fields.

    1. Map the appropriate field to each processor type:

      • Date: timestamp
      • GeoIP: clientip
      • User agent: agent

      For the Date processor, you also need to specify the date format you want to use: dd/MMM/yyyy:HH:mm:ss Z.

      Your form should look similar to this:

      Processors for Ingest Node Pipelines

      Alternatively, you can click the Import processors link and define the processors as JSON:

      {
        "processors": [
          {
            "grok": {
              "field": "message",
              "patterns": ["%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \\[%{HTTPDATE:timestamp}\\] \"%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}"]
            }
          },
          {
            "date": {
              "field": "timestamp",
              "formats": [ "dd/MMM/YYYY:HH:mm:ss Z" ]
            }
          },
          {
            "geoip": {
              "field": "clientip"
            }
          },
          {
            "user_agent": {
              "field": "agent"
            }
          }
        ]
      }

      The four processors will run sequentially: grok, date, geoip, and user_agent. You can reorder processors using the arrow icon next to each processor.

  5. To test the pipeline and verify that it produces the expected results, click Add documents.
  6. In the Documents tab, provide a sample document for testing:

    [
      {
        "_source": {
          "message": "212.87.37.154 - - [05/May/2020:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
        }
      }
    ]
  7. Click Run the pipeline and check if the pipeline worked as expected.

    You can also view the verbose output and refresh the output from this view.

  8. If everything looks correct, close the panel, and then click Create pipeline.

    At this point, you’re ready to use the Elasticsearch index API to load the log data. If you prefer to work in the Console throughout, see the sketch of the equivalent pipeline API requests after this procedure.

  9. In the Kibana Console, index a document with the pipeline you created.

    PUT my-index/_doc/1?pipeline=access_logs
    {
      "message": "212.87.37.154 - - [05/May/2020:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
    }
  10. To verify that the document was indexed with the parsed and enriched fields, run:

    GET my-index/_doc/1
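
If you prefer to work entirely in the Console, the same pipeline can be created and tested with the ingest APIs. The sketch below mirrors the processor configuration from the walkthrough above: the PUT request is the equivalent of the Create pipeline button, and the simulate request is the equivalent of the test panel (append ?verbose=true to see per-processor output):

PUT _ingest/pipeline/access_logs
{
  "description": "Parses server logs in Common Log Format",
  "processors": [
    {
      "grok": {
        "field": "message",
        "patterns": ["%{IPORHOST:clientip} %{USER:ident} %{USER:auth} \\[%{HTTPDATE:timestamp}\\] \"%{WORD:verb} %{DATA:request} HTTP/%{NUMBER:httpversion}\" %{NUMBER:response:int} (?:-|%{NUMBER:bytes:int}) %{QS:referrer} %{QS:agent}"]
      }
    },
    {
      "date": {
        "field": "timestamp",
        "formats": [ "dd/MMM/yyyy:HH:mm:ss Z" ]
      }
    },
    {
      "geoip": {
        "field": "clientip"
      }
    },
    {
      "user_agent": {
        "field": "agent"
      }
    }
  ]
}

POST _ingest/pipeline/access_logs/_simulate
{
  "docs": [
    {
      "_source": {
        "message": "212.87.37.154 - - [05/May/2020:16:21:15 +0000] \"GET /favicon.ico HTTP/1.1\" 200 3638 \"-\" \"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.116 Safari/537.36\""
      }
    }
  ]
}

Once the pipeline exists, the index and GET requests from steps 9 and 10 work exactly as shown.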