February 16, 2016 Engineering

Detecting DNS Tunnels with Packetbeat and Watcher

By Andrew Kroh

Updated January 12, 2017: This post was updated to reflect changes in Packetbeat 5.x and Elasticsearch 5.x.

Data observed from monitoring DNS traffic on a network can be used as an indicator of compromise (IOC). This blog post will discuss how Elasticsearch and Watcher can be used with Packetbeat to alert when possible malware activity is detected.

Packetbeat is our open source packet analyzer. It monitors the traffic on your network and indexes the DNS requests and responses into Elasticsearch where aggregations can be used to help make sense of the data.

Alerting (formerly Watcher) is part of X-Pack and it provides alerting and notifications based on changes in your data.

There are many use cases for alerting on data collected by Packetbeat, such as alerting when the response times for web requests are above a threshold or when there is a spike in HTTP errors returned by your web servers. The alerting described in this article has applications in network security. We are going to look at one specific use case -- detecting data exfiltration over DNS tunnels.

Detecting DNS Tunnels

Tunnels can be established over the DNS protocol to covertly move data or provide a command and control channel for malware. Often this technique is used to bypass the protections of corporate firewalls and proxy servers. Tunneling works by encoding data in DNS requests and responses. The client issues a query for a hostname, and that query is eventually forwarded to the authoritative name server for the domain. When an attacker controls that name server, it can decode the data carried in the hostname and send data back in its response.
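To make the encoding step concrete, here is a minimal Python sketch of how a client could pack bytes into a query name. The base32 scheme and the helper names `encode_chunk` and `decode_chunk` are assumptions chosen for illustration. Real tunneling tools use their own encodings, and this sketch does not enforce the overall length limit on a DNS name.

```python
import base64

MAX_LABEL = 63  # DNS limits each label to 63 bytes


def encode_chunk(data: bytes, domain: str) -> str:
    """Encode bytes as base32 labels prepended to the tunnel domain."""
    text = base64.b32encode(data).decode("ascii").rstrip("=").lower()
    labels = [text[i:i + MAX_LABEL] for i in range(0, len(text), MAX_LABEL)]
    return ".".join(labels + [domain])


def decode_chunk(hostname: str, domain: str) -> bytes:
    """Recover the bytes, as the attacker's name server would."""
    # Strip the trailing ".<domain>" and rejoin the data labels.
    encoded = hostname[: -len(domain) - 1].replace(".", "").upper()
    padding = "=" * (-len(encoded) % 8)  # restore the base32 padding
    return base64.b32decode(encoded + padding)


query_name = encode_chunk(b"secret data", "badguy.co.")
print(query_name)
print(decode_chunk(query_name, "badguy.co."))
```

Every distinct chunk of data yields a distinct hostname under the same domain, and that property is what the detection approach in this post relies on.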

Packetbeat Deployment Architecture

There are many techniques for detecting such traffic. We are going to look at using the number of unique hostnames for a domain as an IOC. DNS tunneling utilities must use a new hostname for each request, which leads to a much higher number of unique hostnames for malicious domains than for legitimate domains.
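The idea can be sketched in a few lines of Python before we build it in Elasticsearch. The input shape (pairs of hostname and domain) and the synthetic traffic are assumptions made for this example. The threshold of 200 matches the one used by the watch later in this post.

```python
from collections import defaultdict


def unique_hostnames_per_domain(queries):
    """Map each domain to the number of distinct hostnames queried under it.

    `queries` is an iterable of (hostname, domain) pairs, a hypothetical
    input shape chosen for this sketch.
    """
    seen = defaultdict(set)
    for hostname, domain in queries:
        seen[domain].add(hostname)
    return {domain: len(names) for domain, names in seen.items()}


# Synthetic traffic: a legitimate domain repeats a few names, while a
# tunnel generates a fresh hostname for every request.
queries = [("www.example.com.", "example.com.")] * 300
queries += [("mail.example.com.", "example.com.")] * 50
queries += [(f"chunk{i}.badguy.co.", "badguy.co.") for i in range(250)]

counts = unique_hostnames_per_domain(queries)
suspects = {d: n for d, n in counts.items() if n > 200}
print(suspects)
```

Note that example.com. receives more requests overall (350) than badguy.co. (250), yet only the tunnel domain crosses the threshold, because the count is of unique hostnames and not of requests.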

Packetbeat Setup

The first step is to install Packetbeat and configure it to collect DNS traffic. For this setup, the server running Packetbeat is connected to a port mirror so that Packetbeat can observe all the traffic between the local network and the Internet. The Packetbeat documentation has a great Getting Started guide that explains the installation and setup procedure, so I will just show the configuration used.

# /etc/packetbeat/packetbeat.yml
packetbeat.interfaces.device: en0
packetbeat.protocols.dns:
  ports: [53]
  include_authorities: true
  include_additionals: true
output.elasticsearch:
  hosts: ["localhost:9200"]

Watch your DNS Traffic

The complete watch file is stored in our elastic/examples repository along with all of the supporting files shown here. We are going to walk through the creation of this watch step-by-step.

Watch Trigger

The watch trigger specifies when the execution should start. This watch is scheduled to execute every 15 minutes.

      "trigger": {
         "schedule": {
            "interval": "15m"
         }
      },

Watch Input

The first step in creating this watch is to design a set of aggregations to be used as the input to the watch. We want to find the cardinality of the hostnames associated with each second-level domain (e.g. badguy.co.).

We start with a query that has just two components, a time window and a whitelist. The time window and whitelist can be customized. Find more on this in the Tuning the Detector section.

Next we use a terms aggregation to create buckets for each second-level domain. Then we apply a sub-aggregation to get the cardinality of the hostnames within that bucket. Finally we apply a bucket selector aggregation to select only the buckets having more than 200 unique hostnames. The watch will generate an alert when the number of unique hostnames breaks this threshold.

GET packetbeat-*/dns/_search
{
  "query": {
    "bool": {
      "filter": {
        "range": {
          "@timestamp": {
            "from": "now-4h"
          }
        }
      },
      "must_not": {
        "terms": {
          "dns.question.etld_plus_one": [
            "akadns.net.",
            "amazonaws.com.",
            "apple.com.",
            "apple-dns.net.",
            "cloudfront.net.",
            "icloud.com.",
            "in-addr.arpa.",
            "google.com.",
            "yahoo.com."
          ]
        }
      }
    }
  },
  "size": 0,
  "aggs": {
    "by_domain": {
      "terms": {
        "size": 1000,
        "field": "dns.question.etld_plus_one"
      },
      "aggs": {
        "unique_hostnames": {
          "cardinality": {
            "field": "dns.question.name"
          }
        },
        "total_bytes_in": {
          "sum": {
            "field": "bytes_in"
          }
        },
        "total_bytes_out": {
          "sum": {
            "field": "bytes_out"
          }
        },
        "high_num_hostnames": {
          "bucket_selector": {
            "buckets_path": {
              "unique_hostnames": "unique_hostnames"
            },
            "script": "params.unique_hostnames > 200"
          }
        }
      }
    }
  }
}

The query above relies on the dns.question.etld_plus_one field provided by Packetbeat 5.x to bucket all requests for a single domain. Packetbeat creates this field using an embedded copy of the Public Suffix List.
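For readers unfamiliar with the eTLD+1 concept, the following Python sketch shows the general idea of deriving a registrable domain from a hostname. It is not Packetbeat's implementation. The `PUBLIC_SUFFIXES` set is a tiny hand-picked subset assumed for illustration, and the sketch ignores the wildcard and exception rules of the real Public Suffix List.

```python
# Toy subset of public suffixes; the real list has thousands of entries
# plus wildcard and exception rules that this sketch ignores.
PUBLIC_SUFFIXES = {"com", "net", "co", "co.uk"}


def etld_plus_one(hostname: str) -> str:
    """Return the registrable domain (public suffix plus one label)."""
    labels = hostname.rstrip(".").lower().split(".")
    # Scan from the longest candidate suffix so "co.uk" wins over "uk".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            if i == 0:
                return ""  # the name is itself a public suffix
            return ".".join(labels[i - 1:]) + "."
    # Unknown suffix: treat the last label as the suffix.
    return ".".join(labels[-2:]) + "."


print(etld_plus_one("a1b2c3.t.badguy.co."))
print(etld_plus_one("www.bbc.co.uk."))
```

Grouping on this value rather than on the full query name is what puts every hostname generated by a tunnel into the same bucket.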

Watch Condition

The watch condition determines whether an alert is triggered. The condition here is simple: if the search returned any buckets, trigger the alert.

      "condition": {
         "script": {
            "inline": "ctx.payload.aggregations.by_domain.buckets.size() > 0"
         }
      },

Watch Actions

The watch actions are executed after the condition is met, and define the "output" of a watch. For this watch we are sending an email and also writing a message to the Elasticsearch log. A transform script is used to manipulate the data so that it renders better in an email.

        "transform": {
           "script": {
              "file": "dns_transform"
           }
        },
        "actions": {
           "log_domains": {
              "logging": {
                 "text": "The following domain(s) have a high number of unique hostnames: {{ctx.payload.alerts}}"
              }
           },
           "email_alert": {
              "email": {
                 "to": "'John Doe <john.doe@example.com>'",
                 "subject": "Suspected DNS Tunnel Alert",
                 "body": "The following domain(s) have a high number of unique hostnames: {{ctx.payload.alerts}}"
              }
           }
        }

Below is the dns_transform Painless script. It should be placed into the config/scripts directory of Elasticsearch.

// File: config/scripts/dns_transform.painless
def alerts = ctx.payload.aggregations.by_domain.buckets.stream().collect(Collectors.toMap(p->p.key,item->[
        "total_requests" : item.doc_count,
        "unique_hostnames" : item.unique_hostnames.value,
        "total_bytes_in" : item.total_bytes_in.value,
        "total_bytes_out" : item.total_bytes_out.value,
        "total_bytes" : item.total_bytes_in.value + item.total_bytes_out.value
]));
return ["alerts":alerts];
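If Painless is unfamiliar, the same reshaping can be sketched in Python. This is only an illustration of what the transform produces, not something Watcher executes. The function name `dns_transform` and the hand-built `payload` are assumptions, with the numbers taken from the sample alert in this post.

```python
def dns_transform(payload: dict) -> dict:
    """Reshape aggregation buckets into a map of metrics keyed by domain."""
    alerts = {}
    for bucket in payload["aggregations"]["by_domain"]["buckets"]:
        bytes_in = bucket["total_bytes_in"]["value"]
        bytes_out = bucket["total_bytes_out"]["value"]
        alerts[bucket["key"]] = {
            "total_requests": bucket["doc_count"],
            "unique_hostnames": bucket["unique_hostnames"]["value"],
            "total_bytes_in": bytes_in,
            "total_bytes_out": bytes_out,
            "total_bytes": bytes_in + bytes_out,
        }
    return {"alerts": alerts}


# Hand-built payload mimicking the shape of the search response.
payload = {
    "aggregations": {
        "by_domain": {
            "buckets": [
                {
                    "key": "badguy.co.",
                    "doc_count": 222,
                    "unique_hostnames": {"value": 222},
                    "total_bytes_in": {"value": 16716.0},
                    "total_bytes_out": {"value": 35161.0},
                }
            ]
        }
    }
}

result = dns_transform(payload)
print(result["alerts"]["badguy.co."]["total_bytes"])
```

Keying the result by domain keeps each domain's metrics together when the payload is rendered in the log message and the email body.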

Here is a sample alert. Notice that it contains some additional metrics that can be used to gauge the severity of the situation.

Date: Fri, 16 Feb 2016 11:00:01 -0500 (EST)
From: Watcher <watcher@example.com>
Message-Id: <201602161600.u0SG01ks024814@example.com>
To: John Doe <john.doe@example.com>
Subject: Suspected DNS Tunnel Alert
The following domain(s) have a high number of unique hostnames:
{badguy.co.={total_requests=222, unique_hostnames=222, total_bytes_in=16716.0,
total_bytes_out=35161.0, total_bytes=51877.0}}

When this alert is received, the recipient can take the domain and do a search using the Discover application in Kibana to find the network clients responsible for the tunnel.

Testing and Results

Unique FQDNs per Second Level Domain

This chart shows the ten domains with the highest number of unique hostnames over a period of 4 hours. This data was collected from a network with about 100 devices. During that time window I replayed a network capture containing a tunnel created by iodine. The tunnel, which operated under the fictitious domain pirate.sea, was up for just 20 seconds, and yet it has the highest number of unique hostnames.

If the whitelist from the watch is applied, the tunnel really stands out among the other domains, as seen below.

Unique FQDNs per Second Level Domain with Whitelist

Tuning the Detector

There are two variables that can be tuned -- the time window and the unique hostnames threshold. A smaller time window can be used with a smaller threshold to make the watch more sensitive to short duration tunnels. In a shorter time window, domains not being used for tunneling will generally accumulate fewer unique hostnames.

Tunnels that move data slowly can be detected using a larger time window. But using a larger time window means that valid domains with a lot of unique hostnames, such as CDNs, will cause false positives. So if you use a large time window you will likely need to add domains to the query's whitelist.
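The trade-off can be illustrated with a small Python simulation. Everything here is synthetic and assumed for the example: the `detect` helper, the event tuples, a CDN-like domain (examplecdn.net.) that sees 40 new hostnames per hour, and a slow tunnel (badguy.co.) that uses 10 new hostnames per hour.

```python
from collections import defaultdict

HOUR = 3600


def detect(events, now, window, threshold, whitelist=()):
    """Return domains with more than `threshold` unique hostnames in the window.

    `events` holds (timestamp_seconds, hostname, domain) tuples.
    """
    names = defaultdict(set)
    for ts, hostname, domain in events:
        if ts >= now - window and domain not in whitelist:
            names[domain].add(hostname)
    return {d: len(n) for d, n in names.items() if len(n) > threshold}


# 24 hours of synthetic traffic for the two domains.
events = []
for hour in range(24):
    start = hour * HOUR
    events += [(start + i, f"h{hour}-{i}.examplecdn.net.", "examplecdn.net.")
               for i in range(40)]
    events += [(start + i, f"h{hour}-{i}.badguy.co.", "badguy.co.")
               for i in range(10)]

now = 24 * HOUR
short = detect(events, now, 4 * HOUR, 200)    # misses the slow tunnel
long_ = detect(events, now, 24 * HOUR, 200)   # catches it, plus the CDN
tuned = detect(events, now, 24 * HOUR, 200, whitelist={"examplecdn.net."})
print(short, long_, tuned, sep="\n")
```

With the 4 hour window neither domain crosses the threshold. The 24 hour window flags both, and adding the CDN-like domain to the whitelist leaves only the tunnel.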

Conclusion

It was fun combining Packetbeat and Watcher to look for DNS tunnels. Remember "defense in depth" if you implement a solution like this. It is important to layer your defenses so that if one layer fails, another is in place to detect the threat.