Monitor Amazon Web Services (AWS) with Amazon Data Firehose

Amazon Data Firehose is a popular service that lets you send your VPC flow log data to Elastic in minutes, without writing a single line of code and without building or managing your own data ingestion and delivery infrastructure. Amazon Data Firehose helps you answer questions such as what percentage of your traffic is being dropped and how much traffic is being generated for specific sources and destinations.

What you’ll learn

In this tutorial, you’ll learn how to:

  • Install AWS integration in Kibana
  • Create a delivery stream in Amazon Data Firehose
  • Specify the destination settings for your Firehose stream
  • Send data to the Firehose delivery stream

Before you begin

Create a deployment using our hosted Elasticsearch Service on Elastic Cloud. The deployment includes an Elasticsearch cluster for storing and searching your data, and Kibana for visualizing and managing your data. You also need an AWS account with permissions to pull the necessary data from AWS.

Step 1: Install AWS integration in Kibana

  1. Install AWS integrations to load index templates, ingest pipelines, and dashboards into Kibana. In Kibana, navigate to Management > Integrations in the sidebar. Find the AWS Integration by browsing the catalog.
  2. Navigate to the Settings tab and click Install AWS assets. Confirm by clicking Install AWS in the popup.
  3. Install the Amazon Data Firehose integration assets in Kibana.

The Amazon Data Firehose integration is currently in beta. Make sure to enable Display beta integrations to see it in the catalog. If you prefer to script the installation, see the sketch below.
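As an alternative to the UI steps above, the Kibana Fleet API can install integration assets. The following is a minimal sketch, assuming the Fleet package-install endpoint (/api/fleet/epm/packages/<package>) accepts your credentials, that omitting the version installs the latest release, and that the package names are aws and awsfirehose; adjust these to match what your Kibana version shows.

  import requests

  # Hypothetical values: substitute your own Kibana URL and API key.
  KIBANA_URL = "https://my-deployment.kb.us-east-1.aws.elastic-cloud.com"
  API_KEY = "<encoded-api-key>"

  headers = {
      "kbn-xsrf": "true",                    # Kibana requires this header on write requests
      "Authorization": f"ApiKey {API_KEY}",
  }

  # Package names assumed here: "aws" (AWS integration) and "awsfirehose" (Amazon Data Firehose).
  for package in ("aws", "awsfirehose"):
      response = requests.post(
          f"{KIBANA_URL}/api/fleet/epm/packages/{package}",
          headers=headers,
          json={},
      )
      response.raise_for_status()
      print(f"Installed {package}: {response.status_code}")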

Step 2: Create a delivery stream in Amazon Data Firehose

  1. Go to the AWS console and navigate to Amazon Data Firehose.
  2. Click Create Firehose stream and choose the source and destination of your Firehose stream. Unless you are streaming data from Kinesis Data Streams, set source to Direct PUT and destination to Elastic.
  3. Provide a meaningful Firehose stream name that will allow you to identify this delivery stream later.

For advanced use cases, source records can be transformed by invoking a custom Lambda function. When using Elastic integrations, this should not be required.

Step 3: Specify the destination settings for your Firehose stream

  1. From the Destination settings panel, specify the following settings:

    • Elastic endpoint URL: Enter the Elastic endpoint URL of your Elasticsearch cluster. To find the Elasticsearch endpoint, go to the Elastic Cloud console and select Connection details. Here is an example of what it looks like: https://my-deployment.es.us-east-1.aws.elastic-cloud.com.
    • API key: Enter the encoded Elastic API key. To create an API key, go to the Elastic Cloud console, select Connection details and click Create and manage API keys. If you are using an API key with Restrict privileges, make sure to review the Indices privileges to provide at least "auto_configure" & "write" permissions for the indices you will be using with this delivery stream.
    • Content encoding: For better network efficiency, leave content encoding set to GZIP.
    • Retry duration: Determines how long Firehose continues retrying the request in the event of an error. A duration of 60-300s should be suitable for most use cases.
    • Parameters:

      • es_datastream_name: Elastic recommends setting the es_datastream_name parameter to logs-awsfirehose-default to leverage the routing rules defined in this integration. If this parameter is not specified, data is sent to the logs-generic-default data stream by default.
      • include_cw_extracted_fields: This parameter is optional and can be set when using a CloudWatch logs subscription filter as the Firehose data source. When set to true, extracted fields generated by the filter pattern in the subscription filter will be collected. Setting this parameter can add many fields into each record and may significantly increase data volume in Elasticsearch. As such, use of this parameter should be carefully considered and used only when the extracted fields are required for specific filtering and/or aggregation.
  2. In the Backup settings panel, it is recommended to configure S3 backup for failed records. You can then configure workflows to automatically retry failed records, for example by using the Elastic Serverless Forwarder. The sketch below shows how these destination settings come together when the stream is created programmatically.
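The following is a minimal boto3 sketch of Steps 2 and 3 combined, assuming the console's Elastic destination corresponds to an HTTP endpoint destination in the Firehose API; the stream name, IAM role and bucket ARNs, endpoint, and API key are placeholders.

  import boto3

  firehose = boto3.client("firehose", region_name="us-east-1")

  # Hypothetical values: substitute your own endpoint, API key, and S3/IAM resources.
  ELASTIC_ENDPOINT = "https://my-deployment.es.us-east-1.aws.elastic-cloud.com"
  ELASTIC_API_KEY = "<encoded-api-key>"  # needs at least auto_configure and write on the target indices

  firehose.create_delivery_stream(
      DeliveryStreamName="elastic-firehose-stream",
      DeliveryStreamType="DirectPut",                     # source: Direct PUT
      HttpEndpointDestinationConfiguration={
          "EndpointConfiguration": {
              "Url": ELASTIC_ENDPOINT,
              "Name": "Elastic",
              "AccessKey": ELASTIC_API_KEY,
          },
          "RequestConfiguration": {
              "ContentEncoding": "GZIP",                  # better network efficiency
              "CommonAttributes": [
                  {"AttributeName": "es_datastream_name",
                   "AttributeValue": "logs-awsfirehose-default"},
              ],
          },
          "RetryOptions": {"DurationInSeconds": 300},     # 60-300s suits most use cases
          "S3BackupMode": "FailedDataOnly",               # back up only records that fail delivery
          "S3Configuration": {
              "RoleARN": "arn:aws:iam::123456789012:role/firehose-backup-role",
              "BucketARN": "arn:aws:s3:::my-firehose-backup-bucket",
          },
      },
  )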

Step 4: Send data to the Firehose delivery stream

You can configure a variety of log sources to send data to Firehose delivery streams. Refer to the AWS documentation for more information. Several services support writing data directly to delivery streams, including CloudWatch Logs. Alternatively, you can use AWS Database Migration Service (DMS) to create streaming data pipelines to Firehose. For example, a typical workflow for sending VPC Flow Logs to Firehose would be the following:

  1. Install the AWS and Amazon Data Firehose integration assets in Kibana (Step 1).
  2. Create a Firehose delivery stream with the Elastic destination settings described above (Steps 2 and 3).
  3. Create a VPC flow log that publishes directly to the Firehose delivery stream (see the sketch below).
  4. Verify in Kibana that the flow log data is arriving and that the AWS integration dashboards are populated.
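As a sketch of the flow-log step in that workflow, the boto3 call below creates a VPC flow log that publishes directly to the delivery stream, assuming same-account delivery and a LogDestinationType of kinesis-data-firehose; the VPC ID and stream ARN are placeholders.

  import boto3

  ec2 = boto3.client("ec2", region_name="us-east-1")

  # Hypothetical IDs and ARN: substitute your own VPC and the stream created earlier.
  ec2.create_flow_logs(
      ResourceType="VPC",
      ResourceIds=["vpc-0123456789abcdef0"],
      TrafficType="ALL",                           # capture accepted and rejected traffic
      LogDestinationType="kinesis-data-firehose",  # publish flow logs straight to Firehose
      LogDestination="arn:aws:firehose:us-east-1:123456789012:deliverystream/elastic-firehose-stream",
  )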

For more information on Amazon Data Firehose, you can also check the Amazon Data Firehose Integrations documentation.