
Integrating Bro IDS with the Elastic Stack

Cyber attacks are continually increasing in scope and complexity. Advanced persistent threats are becoming more difficult to detect, leading to what the 2015 Verizon Data Breach Investigations Report calls a detection deficit. Mandiant has found that the average time to detection for attacks is 205 days. At the core of this detection deficit is the fact that the cost, complexity, and volume of data to be analyzed all increase with the maturity of the security organization.

Most organizations collect logs from systems, applications, and network devices to generate operational statistics and/or alert on abnormal behavior. Software engineers write the code that determines what gets logged within their applications. Unfortunately, a lot of valuable data is never written to logs, making it difficult for administrators of log management systems to detect attacks quickly. The best method to detect attacks is to analyze the session and full packet capture data within the environment.

To detect a cyber attack in real-time using packet-level inspection, as well as provide historical analysis, a network security monitoring application should be used. One such option is the Bro Network Security Monitor. Bro is an open source network security monitor that has been around since 1995. Bro can inspect network traffic in real-time or analyze previously captured pcap files. Bro looks for known attacks in the same way a typical intrusion detection system would. The benefit of Bro is that all connections, sessions, and application-level data are written to an extensive set of log files for later review. This blog will take a deep look into using Elasticsearch, Logstash, and Kibana for managing and analyzing log data from Bro.
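
For reference, those two modes of operation map to two command lines. Assuming a standard installation with the bro binary on the path and the default local policy, they look like the following (the interface and file names are placeholders):

# Inspect live traffic on interface eth0
bro -i eth0 local

# Analyze a previously captured pcap file
bro -r file.pcap local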

Log Collection

Elasticsearch, Logstash, and Kibana are open source products for collection, normalization, storage, visualization, and analysis of log data. You can either install the Elastic Stack on the same system as Bro or run it on a separate server and forward logs from Bro via syslog. This blog assumes that the Elastic Stack and Bro are installed on the same server. By default, all Bro logs are written to <BroInstallDir>/logs/current and are rotated on a daily basis.

The Logstash configuration below shows how to tail the Bro log files and index data into a local instance of Elasticsearch. The full Logstash config file can be found here. In the next few sections, we do a step-by-step walkthrough of the configuration.

input {
  # Tail all Bro logs in the current log directory
  file {
    path => "/opt/bro/logs/current/*.log"
  }
}
filter {…}
output {
  # Index into the local Elasticsearch instance
  elasticsearch {
    host => "localhost"
    cluster => "elasticsearch"
  }
}

Logstash: Filters

Next, filters need to be added to normalize the logs and extract metadata such as IP addresses, file names, ports, etc. We will apply the grok filter plugin, which uses regular expressions to parse the logs and add structure. Logstash ships with a set of grok expressions; however, custom expressions are needed to parse Bro logs.

We can directly embed regular expressions in the filter {} section of the config file. However, I have found it better to simplify the configuration by moving complex regular expressions into a separate rule file in the patterns directory. Below is an example configuration that shows how.

filter {
  grok {
    # Load custom pattern definitions from the rule files in this directory
    patterns_dir => "/path/to/patterns"
    match => {
      "message" => "%{291009}"
    }
  }
}

The regular expressions for custom pattern 291009 are stored in a rule file in the /path/to/patterns directory mentioned above. Patterns for different devices can be stored in separate rule files, for example bro.rule, linux.rule, apache.rule, etc. A sample bro.rule file will contain a rule such as the one shown below.

291009 (?<start_time>\d+\.\d{6})\s+(?<uid>\S+)\s+(?:(?<evt_srcip>[\d\.]+)|(?<evt_srcipv6>[\w:]+)|-)\s+(?:(?<evt_srcport>\d+)|-)\s+(?:(?<evt_dstip>[\d\.]+)|(?<evt_dstipv6>[\w:]+)|-)\s+(?:(?<evt_dstport>\d+)|-)\s+(?<fuid>\S+)\s+(?<file_mime_type>\S+)\s+(?<file_description>\S+)\s+(?<seen_indicator>\S+)\s+(?<seen_indicator_type>[^:]+::\S+)\s+(?<seen_where>[^:]+::\S+)\s+(?<source>\S+(?:\s\S+)*)$
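
To make the pattern concrete, here is a hypothetical intel.log entry that this rule would parse (the IPs, IDs, and feed name are invented; fields in real Bro logs are tab-separated, which the \s+ separators also match):

1426262569.944023  CUM0KZ3MLUfNB0cl11  192.168.1.5  53281  198.51.100.7  80  FUYa0b3VIRqBIrMwVd  text/html  -  198.51.100.7  Intel::ADDR  Conn::IN_RESP  tor_exit_feed

This line would populate start_time, uid, the source and destination IP/port columns, seen_indicator, seen_indicator_type, seen_where, and source, which the conditionals described next can then act on.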

Logstash: Conditionals

To further enhance the data with fields such as action, status, object, and device type, users can use Logstash conditionals, i.e. IF statements, to split the grok pattern for each normalization rule into its own code block. A quick way to get the match condition for each normalization rule is to take the regular expression from the rule file and remove the named captures (the column names). For example, if the grok pattern was (?<column1>regex1)(?<column2>regex2), the match expression for the IF statement would be (regex1)(regex2). It is important to use ELSE IF statements for additional message matching so a message is not accidentally matched multiple times.

[Image: patterns.png]

Such conditional processing lets us enrich the messages with additional key-value pairs using the add_field option. For each message, users should assign a device type, object, action, status, and rule ID. The device type, object, action, and status are part of the Common Event Expression (CEE) tags that help identify similar events across multiple devices. These will help when comparing Bro log data with data from other devices (firewalls, other IDSs, system logs, etc.) that may not use the same naming conventions.

The use of rule IDs will also help performance tune Logstash going forward. Each normalized message will now be tagged with a rule ID. Since Logstash evaluates the IF statements top-down, we can place the most commonly matched normalization rules near the top of the filter section. This bypasses the processor-intensive regular expression matching for messages that are rarely seen, as shown in the sketch below.
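
Putting these pieces together, a conditional block might look like the following. This is an illustrative sketch only: the match regexes are abbreviated stand-ins for the full rule expressions with their named captures removed, the second rule ID (291010) is hypothetical, and the CEE-style values are examples rather than a fixed taxonomy.

filter {
  # Most frequently matched rule first to minimize regex work
  if [message] =~ /^\d+\.\d{6}\s+\S+\s+/ {
    grok {
      patterns_dir => "/path/to/patterns"
      match => { "message" => "%{291009}" }
      # Enrichment fields are only added when the grok match succeeds
      add_field => {
        "device_type" => "ids"
        "object" => "file"
        "action" => "seen"
        "status" => "alert"
        "rule_id" => "291009"
      }
    }
  } else if [message] =~ /^\d+\.\d{6}\s+\S+\s+\S+$/ {
    grok {
      patterns_dir => "/path/to/patterns"
      match => { "message" => "%{291010}" }
      add_field => { "rule_id" => "291010" }
    }
  }
}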

Logstash: Geolocation

While Bro can leverage the LibGeoIP library for geolocating IP addresses, I recommend moving this functionality to Logstash. Place the geoip plugin after all the IF statements. The geoip filter plugin takes a source column, a target column, the location of the GeoIP database, and any fields to be added from the GeoIP database. Logstash ships with a built-in GeoLiteCity database, but it may be useful to provide a separate one that can be updated on demand if needed. The full list of fields included in the built-in GeoLiteCity database is available here. Only latitude and longitude are required to plot coordinates on the map in Kibana; the other fields are optional for additional contextual information.

filter {
  geoip {
    source => "evt_dstip"
    target => "geoip"
    database => "/path/to/GeoLiteCity.dat"
    # Build the [geoip][coordinates] array as [longitude, latitude]
    # for plotting on the Kibana tile map.
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
    # Contextual fields such as city_name, continent_code, country_code2,
    # country_code3, country_name, dma_code, postal_code, and region_name
    # are added under the geoip target automatically.
  }
}

The default Elasticsearch template that stores the Logstash data has a built-in mapping that treats the geoip field as an object containing a geo_point. When geolocating multiple IP fields, multiple targets will need to be used. As such, the template will need to be modified to provide mappings for the new geoip fields. To get the current Elasticsearch template for Logstash, use the following command.

curl -XGET localhost:9200/_template/logstash

The geoip mapping will need to be modified or copied to add the additional geoip fields. To update the template, enter the curl -XPUT localhost:9200/_template/logstash -d '<template>' command, where <template> is the text of the template. Once this is done, multiple geoip plugins can be used in Logstash, changing the target from "geoip" to "geoip_dst" and "geoip_src" in the geoip filter code.
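
As an illustration, the added mapping for a second target might look like the snippet below. This is a sketch under one assumption: that the coordinates array built by the geoip filter above is the field you want typed as geo_point. Adapt it to whatever your cluster's current template actually contains.

"geoip_dst" : {
  "type" : "object",
  "dynamic" : true,
  "properties" : {
    "coordinates" : { "type" : "geo_point" }
  }
}

A matching geoip_src block is added the same way; each geoip filter instance then sets its target (and the field references in its add_field lines) to geoip_dst or geoip_src accordingly.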

Logstash: Log Timestamp

The final, optional piece of Logstash configuration modifies the timestamp on indexed logs. When collecting Bro logs in real-time, this is not an issue. However, if the user wants to analyze packet capture files, Bro will use the timestamps from the packet capture as the timestamps in its logs, while Logstash will use the collection time as the timestamp for the log messages. If any forensic analysis is going to be done, the original timestamp should be preserved for posterity. The date filter plugin needs to go after the IF statements in order to use the appropriate time column from the log message. The code below uses the start_time column as the timestamp of the log.

filter {
  date {
    match => ["start_time", "UNIX"]
  }
}

Threat Intelligence Integrations

Threat intelligence feeds are lists of known indicators of compromise, generally shared across industry verticals such as finance, healthcare, industrial, retail, etc. One such provider for Bro is Critical Stack Intel. The Critical Stack agent is installed on the Bro system and is configured to pull feeds from the server. Critical Stack maintains a list of more than 98 threat feeds, including malicious IP addresses, known phishing email addresses, malicious file hashes, and domains known to host malware. These feeds contain over 800,000 indicators of compromise. A free account needs to be created on the Critical Stack website to obtain an API key that enables the agent to pull data. New threat feeds can be added to the agent's lists with a simple click on the website. On the agent system, the feeds are pulled and converted into Bro scripts. To integrate these scripts into Bro, just reference the target directory at the end of the Bro command. For example: bro -r file.pcap local /path/to/criticalstack/feeds

When malicious activity is detected by the Critical Stack scripts, logs will be written to the intel.log file in the Bro log directory.

Logstash: Translate for Threat Intel

Logstash does not have a direct integration with threat intelligence providers like Critical Stack. However, the translate filter plugin allows users to perform lookups. The translate plugin takes a normalized field as a source, a destination field to populate, and a dictionary path to perform the lookup against. The dictionary is a YAML-formatted file that contains two columns. The first column is the value that is compared to the source field of the translation. If there is a match, the second column in the YAML file is placed into the destination column of the translation. If there is no match, the destination column will not be created or populated for that log message. A simple Python script (for example, see this) can pull the threat feed and transform it into a usable format for Logstash.

filter {
  translate {
    field => "evt_dstip"
    destination => "tor_exit_ip"
    dictionary_path => "/path/to/yaml"
  }
}

The example code above shows a lookup of the evt_dstip column. When a match is found, it will populate the tor_exit_ip column with the corresponding data. For reporting purposes, I have been using "IP_Address": "YES" as the format for the YAML file. This allows reports and dashboards to display any events where the translated field has a value of YES. A sample dictionary in this format is shown below.
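
For example, a dictionary of Tor exit node addresses in this format would look like the following (the IPs are made-up placeholders):

"198.51.100.7": "YES"
"203.0.113.42": "YES"
"192.0.2.118": "YES"

Any event whose evt_dstip matches a key picks up tor_exit_ip: YES; everything else is left untouched.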

There are many other publicly available threat feeds that can be utilized with Logstash translations in the same way.

Kibana Visualizations

Data is great, but it is useless unless the business can gain context from it. Kibana hooks directly into the Elasticsearch data and provides powerful visualizations for gaining context around the Bro logs. The first layer of gaining context through Kibana is searching for the valuable data on the Discover tab. An example would be to limit the search to Bro's intel.log file, which is where the Critical Stack threat intelligence alerts are written. If you are using the configuration files explained in this blog, the search would look something like path.raw=<BroInstallDir>/logs/current/intel.log. Saving this search allows us to use it for the next layer of Kibana, Visualizations.

On the Visualize tab, Kibana allows you to choose from a set of eight visualizations; each has a unique perspective on the data. I recommend playing around with each to determine which gives you the best context for your business.

[Image: Kibana2a.jpg]

In the image above, we can see the geolocation information (added via data enrichment in the Logstash configuration file) plotted on a Tile map. The lower three visualizations are pie charts breaking down the attack types and other information contained in the logs. The search bar at the top of the dashboard shows that we are searching for everything; however, the underlying visualizations are using the saved search that looks only for the intel.log data. Any further search terms put in the search bar on the Dashboard tab will search only the data visualized, which in this case is the intel.log data.

[Image: Kibana1a.jpg]

The figure above shows visualizations using a saved search looking for "MaliciousIP=YES". Based on the Logstash configurations described in this blog, this is looking for any known malicious IP addresses discovered by the Logstash translate plugin. A differentiator in this dashboard versus the previous dashboard is the use of the Source IP pie charts and the histogram (area chart visualization). The Source IP pie charts can quickly identify critical assets in your environment; any critical assets shown here should be a red flag to the business. The area chart histogram can give the business context around when attackers may think you are most vulnerable.

Conclusion

Even without trying to add packet-capture-level data for analysis, organizations are bombarded with data from system logs. By leveraging network security monitoring tools such as Bro, packet data can be analyzed and stored in real-time, or saved in packet captures for future analysis. The Elastic Stack provides a wide array of functionality to ingest, normalize, and analyze Bro logs. All of the data collected by Bro can then be enhanced with threat intelligence feeds to detect and block attacks more quickly, lowering the detection deficit and allowing organizations to detect cyber attacks before valuable data is exfiltrated. Using these tools, organizations can observe, orient, decide, and act quickly against the advanced threats facing them today.


Travis Smith is a Senior Security Analyst developing Tripwire's security and compliance solutions. He has 10+ years in the security industry in various positions, including technical support, professional services, and R&D. Travis holds a Master of Business Administration with a concentration in information security, as well as multiple industry certifications such as the Certified Information Systems Security Professional (CISSP) and GIAC Certified Penetration Tester (GPEN).