Industrial control systems security with Elastic Security and Zeek

Industrial control systems (ICS) have historically been isolated, with little interconnection. Isolation behind air gaps was one of the things that kept these systems more secure, at the cost of lost coordination and collaboration. This is rapidly changing with the rise of Industry 4.0, which brings increased interconnectivity and the integration of smart technologies like Industrial IoT (IIoT) and cloud computing into modern industrial processes.

This blog walks through the security challenges associated with ICS and how Elastic Security and Zeek can help address them, in addition to the benefits of having integrated machine learning and threat intelligence within the ICS security program.

What are industrial control systems?

NIST defines ICS as information systems that control industrial processes like manufacturing, product handling, production, and distribution. They consist of a combination of control components (e.g., electrical, mechanical, hydraulic, pneumatic) that act together to achieve industrial objectives. These systems run the primary critical infrastructure sectors, such as communication, energy, transportation, chemical industry, and nuclear.

ICS are categorized into different types depending on their mission, scale, and geographical dispersion. The most common type is Supervisory Control and Data Acquisition (SCADA), which controls geographically dispersed assets, such as oil and gas pipelines, electrical power grids, and telecommunication networks. 

ICS are typically a mixture of information technologies (IT) and operational technologies (OT) spread across the different system levels, from the field networks to the ICS operation centers, the corporate networks, and even the cloud. These technologies generate large volumes of data in different formats and structures, which have to be searched, correlated, and visualized promptly across subsystems, processes, assets, regions, and plants to support ICS operation and security.

Figure 1: ICS reference architecture

Unlike pure IT systems, ICS generally have unique performance and reliability requirements. Their real-time nature does not tolerate delay or jitter in control data, which makes these issues the root cause of the majority of ICS incidents.

Moreover, if the ICS alerting systems or the Safety Instrumented Systems (SIS) fail or are compromised, incidents may not be prevented, and the system's operators may not even see the alerts on their dashboards.

Threats against critical infrastructure, such as energy systems or electrical power grids, can damage a nation's economy and security. They can also carry significant risks to human health and safety and to the environment. There have been several significant reported ICS cyber attacks, most notably the recent Colonial Pipeline ransomware attack, which targeted the largest US fuel pipeline and shut it down for several days. This attack caused fuel shortages in the eastern US, impacted multiple industries, and ultimately led the government to declare a state of emergency.

ICS security with Elastic

In addition to being the backbone of critical national infrastructure, which makes them a principal target for cyber attacks, ICS can also involve high physical exposure of their assets and networks, which makes their security monitoring and defense much broader and more complex.

Furthermore, ICS security has more specific requirements than enterprise security, so applying common enterprise security mechanisms to an ICS can harm the system's functionality and safety.

For example, the priorities of the Confidentiality, Integrity, Availability (CIA) triad of IT security are reversed in OT, where availability typically comes first, making the goals of IT and ICS security programs very different. The operational nature of these environments also brings other challenges, such as legacy and discontinued software and the inability to patch or upgrade immediately to fix vulnerabilities. This can leave large systems at risk for months or even years unless bespoke security mechanisms that combine security and functionality are in place.

Elastic Security can be the basis of such a mechanism to deliver these requirements with high performance, scalability, and exceptional visual experience.

Keeping ICS inventory with Elastic

Keeping track of the history and accurate status of all ICS assets in a global inventory is critical, not only for purposes like maintenance, cost management, and environment optimization but also for the system's security. Well-implemented and maintained inventories are key to ICS security programs, since you can’t protect what you don’t know about. Knowing what is on the ICS network, and what should normally be there at any instant, is essential for acting on unexpected events.

The ICS inventory should contain all the OT assets with their basic attributes, such as the static information about the manufacturer, model, serial number, etc., as well as the dynamic information, such as the physical location and geo-location, IP and port configuration, alarm settings, software update version and patch status, and so forth.

The inventory should also include metadata that can provide more context about the assets, to be used for maintenance and security such as known deficiencies, possible compatible replacements, known vulnerabilities and exploits, and related threat intelligence. This information reduces the investigation time — especially in ICS field networks, which are usually operated by small teams that may have IT or OT knowledge gaps — and increases response efficiency in emergency cases.

Creating the ICS inventory in Elastic makes it possible to find relevant information about the assets we want to secure quickly and at scale, in the same place where security data is ingested and alerts are triggered. It also enables an exceptional graphical view in Kibana via Maps, Graph, and Canvas.

It is usually impossible to build and maintain the ICS inventory manually. The more effective method is to use the data flowing to Elastic, particularly the network data, to capture any new assets and update the information of the existing ones. Elastic can transform time-series data into an entity-centric view that helps track each asset and summarizes its configuration — for example, to automatically list a device's open ports and the source and destination IPs it is trying to communicate with. This way, the inventory also automatically stays up to date.
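The idea of pivoting time-series events into an entity-centric view can be sketched in a few lines of plain Python. This is only an illustration of the concept, not Elastic's transform implementation; the event layout, field names, and sample values below are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical, flattened network events. The field names and values are
# made up for illustration and do not follow any particular schema.
events = [
    {"ts": 100, "src_ip": "10.0.0.5", "dst_ip": "10.0.0.20", "dst_port": 502},
    {"ts": 105, "src_ip": "10.0.0.5", "dst_ip": "10.0.0.21", "dst_port": 502},
    {"ts": 110, "src_ip": "10.0.0.7", "dst_ip": "10.0.0.20", "dst_port": 20000},
    {"ts": 120, "src_ip": "10.0.0.5", "dst_ip": "10.0.0.20", "dst_port": 44818},
]


def pivot_by_asset(events):
    """Collapse time-series events into one summary record per source IP."""
    assets = defaultdict(lambda: {
        "dst_ips": set(),
        "dst_ports": set(),
        "first_seen": None,
        "last_seen": None,
        "event_count": 0,
    })
    for event in events:
        asset = assets[event["src_ip"]]
        asset["dst_ips"].add(event["dst_ip"])
        asset["dst_ports"].add(event["dst_port"])
        ts = event["ts"]
        if asset["first_seen"] is None or ts < asset["first_seen"]:
            asset["first_seen"] = ts
        if asset["last_seen"] is None or ts > asset["last_seen"]:
            asset["last_seen"] = ts
        asset["event_count"] += 1
    return dict(assets)


inventory = pivot_by_asset(events)
for ip, summary in sorted(inventory.items()):
    print(ip, sorted(summary["dst_ips"]), sorted(summary["dst_ports"]))
```

Re-running the pivot as new events arrive is what keeps each asset's summary current; in Elastic, a continuous transform plays that role.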

The inventory information can also be automatically enriched with data from various internal and external sources (e.g., other data indices, external databases, CMDB). Below is a screenshot of an entity-based view of the sample Modbus data. This view summarizes critical information, such as all destination IP addresses and ports that the device communicates with, and the protocol addresses for the Modbus communications. This information is continuously updated by the transform job and can be used to create alerts in case of pattern changes.

Figure 2: Pivoted Modbus data sample

Elastic machine learning can also be used to monitor the ICS inventory and keep track of all the assets. It can generate alerts about absent or newly plugged devices or those abnormally changing their location, behavior, or status (e.g., different rates of rejections, new ports opened, changes in control commands, software version updates). Such events vary from an ordinary indicator of failure to an indicator of compromise, both of which require immediate alerting and attention.
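To make the kinds of inventory changes listed above concrete, here is a minimal rule-based sketch that compares a known-good inventory with what is currently observed on the network. It is a plain set comparison, not Elastic machine learning, and the data layout (IP address mapped to a set of open ports) is an assumption chosen for the example.

```python
def inventory_drift(baseline, observed):
    """Compare two {ip: set_of_open_ports} maps and report the differences."""
    common = baseline.keys() & observed.keys()
    return {
        # Devices seen on the network that the inventory does not know about.
        "new_assets": sorted(observed.keys() - baseline.keys()),
        # Devices in the inventory that are no longer seen.
        "absent_assets": sorted(baseline.keys() - observed.keys()),
        # Known devices that expose ports not recorded in the inventory.
        "new_ports": {
            ip: sorted(observed[ip] - baseline[ip])
            for ip in sorted(common)
            if observed[ip] - baseline[ip]
        },
    }


# Illustrative data only.
baseline = {"10.0.0.5": {502}, "10.0.0.7": {20000}, "10.0.0.9": {44818}}
observed = {"10.0.0.5": {502, 23}, "10.0.0.7": {20000}, "10.0.0.30": {502}}

drift = inventory_drift(baseline, observed)
print(drift)
```

A fixed comparison like this only catches changes against a static list; an ML job can additionally learn what normal variation looks like for each asset.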

ICS network security with Elastic

Most attacks against ICS exploit the networks to which these systems are connected. This makes network security monitoring one of the key aspects of an ICS security program. However, the combination of IT and OT technologies and protocols makes it a challenging one.

Enterprise networks are the most common entry points for adversaries targeting the ICS. Elastic supports network packet capture from servers and endpoint machines and provides a long list of integrations for the most common enterprise network technologies like Cisco, Palo Alto Networks, and F5.

For the OT part of the network, Elastic integrates with network detection frameworks like Zeek, which currently supports the majority of ICS protocols, including Modbus, DNP3, EtherNet/IP, S7comm, MQTT, and more. Zeek also provides the Spicy framework, which simplifies and accelerates the creation of custom parsers for protocols that are not yet supported.

Zeek unobtrusively observes network traffic, which makes it fit the requirement of OT networks not to disrupt control data flow. It uses this traffic to generate compact, high-fidelity, and richly annotated metadata that shows sessions and protocol transactions in a curated and logical way.
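As a rough illustration of what this metadata looks like to a consumer, the sketch below parses a Zeek log in its tab-separated text format, using the `#fields` header line to name the columns. The sample log is hand-written for the example: the timestamp, UID, and addresses are made up, and the exact columns of a real modbus.log depend on the Zeek version and loaded packages.

```python
def parse_zeek_tsv(lines, unset_field="-"):
    """Parse Zeek tab-separated log lines into a list of dicts."""
    fields = []
    records = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            # Header lines start with '#'; '#fields' carries the column names.
            if line.startswith("#fields\t"):
                fields = line.split("\t")[1:]
            continue
        values = line.split("\t")
        # Zeek writes '-' (by default) for unset values; map those to None.
        records.append({
            name: (None if value == unset_field else value)
            for name, value in zip(fields, values)
        })
    return records


# Hand-written sample in the style of a Zeek modbus.log (illustrative only).
sample_log = [
    "#path\tmodbus",
    "\t".join(["#fields", "ts", "uid", "id.orig_h", "id.orig_p",
               "id.resp_h", "id.resp_p", "func", "exception"]),
    "\t".join(["1700000000.000000", "CExampleUid1", "10.0.0.57", "2387",
               "10.0.0.3", "502", "READ_COILS", "-"]),
    "\t".join(["1700000001.000000", "CExampleUid1", "10.0.0.57", "2387",
               "10.0.0.3", "502", "READ_COILS_EXCEPTION", "ILLEGAL_DATA_ADDRESS"]),
]

records = parse_zeek_tsv(sample_log)
for record in records:
    print(record["id.orig_h"], "->", record["id.resp_h"], record["func"], record["exception"])
```

Zeek can also write its logs as JSON, which is often more convenient to ship into a search platform than the tab-separated format shown here.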

Ingesting this metadata into Elastic allows for better analysis and correlation with the enterprise network data, enabling better threat detection and incident response. The screenshots below show the Zeek metadata generated from an ICS packet capture containing multiple control protocols, such as Modbus, DNP3, and EtherNet/IP, as well as IT protocols.

Figure 3: Zeek metadata generated from the analyzed ICS PCAP

Figure 4: A dashboard with the EtherNet/IP protocol metadata generated by Zeek

Figure 5: A dashboard with the DCE-RPC protocol metadata generated by Zeek

Endpoint security for ICS

At the ICS endpoint level, an endpoint detection and response (EDR) solution should replace traditional anti-malware solutions for multiple reasons:

  • The majority of ICS endpoints do not have internet access, particularly on the field networks, so they require an active solution that works efficiently offline.
  • There is no reliable signature database for ICS malware. Even with the available signatures, the ICS assets remain unprotected against zero-day attacks, particularly for aged or discontinued platforms or software.
  • ICS patch management can be very delicate for production servers. First, vulnerability scans are usually infrequent because they can disrupt industrial operations while they run. Also, even if the vendors are still maintaining the software (which might not be the case for environments that are 10–15 years old), patching the discovered vulnerabilities can cause operational errors and adversely affect the system's operation.

Elastic provides Elastic Defend, an Endpoint Security solution, which can work offline to maintain deeper visibility on the endpoint level, thereby providing detection and prevention capability for known and unknown threats. Elastic Defend includes ransomware protection, memory threat prevention, malicious behavior prevention, attack surface reduction protection, and more. This EDR solution also streamlines centralized response actions, such as isolating infected endpoints to prevent damage from spreading across the ICS network and suspension and killing of processes, among others. 

The Elastic Agent can also run on-demand and scheduled osquery queries on hosts. The osquery integration can also run on-demand YARA scans using the yara table. This is beneficial in the case of new ICS threats or campaigns, where threat signatures may be available early in threat intelligence feeds.

Machine learning for ICS security

Anomaly detection with machine learning (ML) is a very efficient detection strategy for ICS threats because the behaviors of these systems are steady and consistent over time. This simplifies the creation of accurate and reliable baselines for the different processes, assets, users, and more, so that anomalies can be detected with high confidence. Elastic ML provides both supervised and unsupervised techniques to create customized jobs that fit the exact system needs. These jobs can take into account scheduled events like regular maintenance, so that the ML model runs with higher fidelity and focuses on unexpected events.
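The reason a steady process is easy to baseline can be shown with a deliberately simple statistical check. The sketch below flags values that deviate from a baseline by more than a chosen number of standard deviations (a z-score test). It is not the algorithm Elastic ML uses; the function name, threshold, and sample values are assumptions for illustration.

```python
from statistics import mean, stdev


def find_anomalies(baseline, new_values, threshold=3.0):
    """Return (index, value) pairs whose z-score against the baseline exceeds threshold."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly constant baseline: any different value is anomalous.
        return [(i, v) for i, v in enumerate(new_values) if v != mu]
    return [
        (i, v)
        for i, v in enumerate(new_values)
        if abs(v - mu) / sigma > threshold
    ]


# Illustrative data: e.g., Modbus requests per minute from one controller.
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
new_values = [101, 99, 140, 100]

anomalies = find_anomalies(baseline, new_values)
print(anomalies)
```

Because the baseline varies so little, even a moderate jump stands out clearly. In a noisier enterprise environment the same threshold would produce far more false positives, which is why steady ICS behavior lends itself to this style of detection.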

Creating these jobs in Elastic also allows system operators to quickly identify or predict risky situations by examining abnormal behaviors in their systems, even if no alarms are triggered by the field assets themselves. This makes it possible to overcome situations where attackers use response function inhibition techniques to prevent safety functions from engaging or to keep operators from responding to failures.

Machine learning also plays an important role in detecting ICS malware variants given the lack of reliable signature databases.

ICS threat intelligence

Threat intelligence can provide knowledge about how an ICS could be compromised, who might target it, and why. It can also provide context, detections, and mitigations: timely information that plays a vital role in protecting the systems and preventing the disruptions or incidents that security threats may cause.

In 2020, MITRE released the ATT&CK for ICS matrix, a focused matrix that introduced ICS-specific adversary tactics, techniques, and procedures (TTPs). It connects enterprise techniques with ICS techniques, helping to bridge the gap between IT and OT security within an ICS environment. The matrix provides information about which data to collect, and from which sources, to create efficient detections and mitigations. It can be combined with other internal and external threat feeds to build a global intelligence picture and increase the maturity of the ICS security program.

Elastic Security integrates with many threat intelligence data sources, which allows for:

  • Enrichment of the ICS inventory with the threat intelligence data
  • Creation of indicator match detection rules mapped with MITRE techniques
  • Analysis of MITRE technique coverage across both the Enterprise and ICS matrices to make an overall assessment
  • Use of the threat intel information for threat hunting in the ICS environment 
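The indicator match idea from the list above can be sketched as a simple lookup of event fields against a set of known-bad indicators, each tagged with a technique ID. This is a conceptual illustration in plain Python, not Elastic's detection engine; the feed name, IP addresses (taken from documentation ranges), and the technique mapping are made up for the example.

```python
# Hypothetical threat intelligence indicators keyed by IP address.
# The feed name and the technique mapping are illustrative assumptions.
indicators = {
    "203.0.113.10": {"feed": "example-feed", "technique_id": "T0855"},
    "198.51.100.7": {"feed": "example-feed", "technique_id": "T0814"},
}

# Hypothetical network events.
events = [
    {"ts": 1, "src_ip": "10.0.0.5", "dst_ip": "10.0.0.20"},
    {"ts": 2, "src_ip": "203.0.113.10", "dst_ip": "10.0.0.20"},
    {"ts": 3, "src_ip": "10.0.0.7", "dst_ip": "198.51.100.7"},
]


def match_indicators(events, indicators, fields=("src_ip", "dst_ip")):
    """Return one alert per event field whose value appears in the indicators."""
    alerts = []
    for event in events:
        for field in fields:
            indicator = indicators.get(event.get(field))
            if indicator is not None:
                alerts.append({
                    "event": event,
                    "matched_field": field,
                    "indicator": indicator,
                })
    return alerts


alerts = match_indicators(events, indicators)
for alert in alerts:
    print(alert["event"]["ts"], alert["matched_field"], alert["indicator"]["technique_id"])
```

Carrying the technique ID on each alert is what makes it possible to summarize detections per technique and assess coverage across the matrices.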

The figure below shows an example of the overall solution architecture, with network and endpoint security and integrated threat intelligence, illustrated on a typical oil and gas SCADA system.

Figure 6: ICS security architecture with Elastic and Zeek

Threat hunting in ICS environments

Threat hunting in ICS environments can significantly improve the reliability and security of the system. It aims to find any evidence or hints of attacker activity, which can also lead to creating new detection rules, tuning existing ones, revealing and eliminating blind spots, and identifying critical system components (the ICS crown jewels) that need further hardening and monitoring.

The threat-hunting platform should let hunters work efficiently within the time and scale of their mission. Elastic Security provides such a platform: it can automatically baseline ICS behaviors with ML; search across various data sources with speed, scale, and relevance; and integrate internal and external threat intelligence feeds to facilitate the hunt. The Zeek network metadata makes it even more powerful, which ultimately helps speed up hunting missions and achieve their goals.

Try it out

Start now and create your free 14-day trial of Elastic Cloud to experience the latest version of Elastic. Check out this blog about the Industrial Internet of Things (IIoT) with the Elastic Stack and also watch the demonstration webinar to set yourself up for success.

Guest Author

Samir Bennacer, Octodet

CTO, Octodet

Samir Bennacer, CTO at Octodet, brings extensive security expertise as a former employee of Elastic, Splunk, and ArcSight. With a strong background in SIEM solutions and big data technologies, he leads the way in developing cutting-edge security solutions at Octodet, delivering effective protection for organizations in different business sectors.