Editor’s Note: Elastic joined forces with Endgame in October 2019, and has migrated some of the Endgame blog content to elastic.co. See Elastic Security to learn more about our integrated security solutions.
Adversarial activity is no longer described purely in terms of static Indicators of Compromise (IOCs). Focusing solely on IOCs leads to detections that are brittle and ineffective at discovering unknown attacks, because adversaries can easily modify their toolkits to evade indicator-based detections. Instead, practitioners need durable detections based on malicious behaviors. MITRE’s ATT&CK framework helps practitioners focus their defensive tradecraft on these malicious behaviors. By organizing adversary tradecraft and behaviors into a matrix of tactics and techniques, ATT&CK™ is well suited to moving detection beyond IOCs and toward behavior-based detection.
With a comprehensive and robust model of adversarial behavior in place, the next step is to build an architecture for event collection that supports hunting and real-time detection, along with a language that promotes usability and performance. We created the Event Query Language (EQL) for hunting and real-time detection, with a simple syntax that helps practitioners express complex queries without a high barrier to entry. I’ll discuss the motivation behind EQL, how it fits into the overall Endgame architecture, and provide several examples of EQL in action to demonstrate its power to drive hunting and real-time detection of adversarial behavior.
Simplifying Complex Queries
Many database and search platforms are cumbersome and unintuitive, with complex syntax and a high barrier to entry. Detecting suspicious behavior requires analysis of multiple data sources and seamless data collection, aggregation, and analysis capabilities. Searching and exploring data should be intuitive and iterative, with the flexibility to fine-tune questions and home in on suspicious behavior.
Within the Endgame platform, we build functionality to overcome these data analysis and detection challenges. Our goal is to empower users without overwhelming them. That’s why we initially created Artemis, which enables analysts to search using natural English, with intuitive phrases like “Find the process wmic.exe running on active Windows endpoints.” However, analysts also often want to search for things that are difficult to describe clearly and concisely in English. We knew that addressing this next challenge would require a new query language that supports our unique architecture and equips users to find suspicious activity. Our solution, EQL, balances usability with significantly expanded capabilities for both hunting and detection. It can be used to answer complex questions without burdening users with the inner workings of joins, transactions, aggregations, or state management that accompany many database solutions and analysis frameworks. EQL has proven effective, and we’re excited to introduce it to the community to drive real-time detection.
Endgame’s Eventing Architecture
Several architectural decisions in the Endgame platform played a pivotal role in the design and development of EQL. In the age of big data, Endgame takes an efficient and distributed approach to event collection, enrichment, storage, and analysis. Most solutions require data to be forwarded to the cloud to perform analysis, generate alerts, and take action. This introduces a delay for detection, increases network bandwidth utilization, and most importantly, implies that disconnected endpoints are less protected.
With the Endgame platform, live monitoring, collection, and analysis happen where the action happens: on the endpoint. This endpoint-focused architecture allows for rapid search at scale while minimizing bandwidth, cloud storage, and time spent waiting for results. It also enables process and endpoint enrichment, unique forms of stateful analysis—like tracking process lineage—and autonomous decisions without any need for platform or cloud connectivity. When an endpoint detects suspicious behavior, it makes a prevention or detection decision and alerts the platform of suspicious activity with the corresponding events. This decision happens without requiring a round trip to the Endgame management platform or the cloud, assuring that disconnected and connected endpoints are equally protected from suspicious behavior.
These architectural decisions drove the logical next step: structuring a language that optimizes these capabilities. As we developed EQL, we aimed to create a language that is accommodating to users, optimized for our architecture, and shareable among defenders.
Designing a Language
We wanted to ensure EQL supported sophisticated questions within a familiar syntax to limit the learning curve and maximize functionality. EQL provides abstractions that allow a user to perform stateful queries, identify sequences of events, track process ancestry, join across multiple data sources, and perform stacking. In designing EQL, we focused first on exposing the underlying data schema. Every collected event consists of an event type and a set of properties. For example, a process event has fields such as the process identifier (PID), name, time, command line, parent information, and a subtype to differentiate creation and termination events. At the most basic level, an event query matches an event type to a condition based on some Boolean logic to compare the fields. The where keyword is used to tie these two together in a query. Conditionals are combined with Boolean operators such as and, or, and not; comparison operators such as ==, !=, <=, >=, >, and in; and function calls. Numbers and strings are expressed easily, and wildcards (*) are supported. All of this leads to a simple syntax that should feel similar to Python.
When searching for a single event, it is easy to express many English questions with EQL in a readable, streamlined syntax. For example, the question:
What unique outgoing IPv4 network destinations were reached by svchost.exe to port 1337 with the IP blocks 192.168.0.0/16 or 172.16.0.0/16?
is expressed in EQL as:
network where event_subtype_full == "ipv4_connection_attempt_event" and process_name == "svchost.exe" and destination_port == 1337 and (destination_address == "192.168.*" or destination_address == "172.16.*") | unique destination_address, destination_port
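To make the semantics of that query concrete, here is a minimal Python sketch of how it could be evaluated over a list of event dictionaries. It is an illustration only, not Endgame's engine: the sample events, the helpers `net`, `matches_wildcard`, and `unique`, and the choice of case-insensitive glob matching for `*` are all assumptions made for this example.

```python
from fnmatch import fnmatchcase

def net(process_name, address, port):
    """Build a hypothetical IPv4 connection-attempt event."""
    return {
        "event_type": "network",
        "event_subtype_full": "ipv4_connection_attempt_event",
        "process_name": process_name,
        "destination_address": address,
        "destination_port": port,
    }

events = [
    net("svchost.exe", "192.168.1.10", 1337),
    net("svchost.exe", "192.168.1.10", 1337),  # duplicate destination
    net("svchost.exe", "10.0.0.5", 1337),      # outside both IP blocks
    net("svchost.exe", "172.16.5.4", 1337),
    net("svchost.exe", "192.168.1.11", 80),    # wrong port
    net("chrome.exe", "192.168.1.12", 1337),   # wrong process
]

def matches_wildcard(value, pattern):
    """Approximate the * wildcard with a case-insensitive glob (an assumption)."""
    return fnmatchcase(value.lower(), pattern.lower())

def where(event):
    """The condition after the `where` keyword, written as a Python predicate."""
    return (
        event["event_type"] == "network"
        and event["event_subtype_full"] == "ipv4_connection_attempt_event"
        and event["process_name"] == "svchost.exe"
        and event["destination_port"] == 1337
        and (matches_wildcard(event["destination_address"], "192.168.*")
             or matches_wildcard(event["destination_address"], "172.16.*"))
    )

def unique(stream, *fields):
    """Keep only the first event for each distinct combination of fields."""
    seen = set()
    for event in stream:
        key = tuple(event[f] for f in fields)
        if key not in seen:
            seen.add(key)
            yield event

results = list(unique(filter(where, events), "destination_address", "destination_port"))
```

In this sample data, only the first 192.168.1.10 event and the 172.16.5.4 event survive the condition and the `unique` step.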
EQL supports searches for multiple related events that are chained together with a sequence of event queries, as well as post-processing with pipes (|) similar to Unix pipes. Defenders can assemble these components to define simple or sophisticated behaviors without needing to understand the lower-level mechanics. These building blocks are assembled to build powerful hunting analytics and real-time detections in the Endgame platform.
In the Endgame platform, when valid EQL is entered, the query is compiled and sent to the endpoints which then execute the query and return results. This happens quickly and in parallel, allowing users to immediately see results. Let’s take a look at how EQL is expressed in the following scenarios.
IOC searching isn’t hunting, but it is an important piece of many organizations’ daily security operations. Endgame users can express a simple IOC search in EQL:
process where sha256=="551d62be381a429bb594c263fc01e8cc9f80bda97ac3787244ef16e3b0c05589"
Many Endgame users choose to use Artemis’ natural language capabilities for this sort of search by simply asking, "Search processes for 551d62be381a429bb594c263fc01e8cc9f80bda97ac3787244ef16e3b0c05589".
TIME BOUND SEARCHES
There are many times during incident response in which it is useful to know everything that happened at a specific time on an endpoint. Using a special event type called any, which matches against all events, EQL can match every event within a five-minute window.
What events occurred between 12 PM UTC and 12:05 PM UTC on April 1, 2018?
any where timestamp_utc >= "2018-04-01 12:00:00Z" and timestamp_utc <= "2018-04-01 12:05:00Z"
STACKING ON THE ENDPOINT
We can filter and process our data as it is searched, without having to pull all the raw results back for stacking, establishing situational awareness, and identifying outlier activity. The language provides data pipes, which are similar to Unix pipes, but instead of manipulating lines of input, a data pipe receives a stream of events, performs processing, and outputs another stream of events. Supported pipes enable you to:
- Count the number of times something has occurred
| count <expr>,<expr>,...
- Output events that meet a Boolean condition
| filter <condition>
- Output the first N events
| head <number>
- Output events in ascending order
| sort <expr>,<expr>,...
- Output the last N events
| tail <number>
- Remove duplicates that share properties
| unique <expr>,<expr>,...
Which users ran multiple distinct enumeration commands?
join by user_name
  [process where process_name=="whoami.exe"]
  [process where process_name=="hostname.exe"]
  [process where process_name=="tasklist.exe"]
  [process where process_name=="ipconfig.exe"]
  [process where process_name=="net.exe"]
| unique user_name
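One way to picture what this join does is as a per-user checklist. The Python sketch below is a rough illustration, assuming that a join matches when every bracketed query has been satisfied for the same user_name regardless of order; the function name `join_by_user` and the sample events are hypothetical.

```python
from collections import defaultdict

REQUIRED = ["whoami.exe", "hostname.exe", "tasklist.exe", "ipconfig.exe", "net.exe"]

def join_by_user(events, required=REQUIRED):
    """Return users who ran every required process, in any order (assumed semantics)."""
    seen = defaultdict(set)  # user_name -> required process names observed so far
    matched = []
    for event in events:
        name = event["process_name"].lower()
        if event["event_type"] == "process" and name in required:
            user = event["user_name"]
            seen[user].add(name)
            # Report each user once, as soon as the checklist is complete.
            if len(seen[user]) == len(required) and user not in matched:
                matched.append(user)
    return matched

def proc(user, name):
    """Build a hypothetical process event."""
    return {"event_type": "process", "user_name": user, "process_name": name}

events = [
    proc("alice", "net.exe"),
    proc("bob", "whoami.exe"),
    proc("alice", "whoami.exe"),
    proc("alice", "ipconfig.exe"),
    proc("bob", "hostname.exe"),
    proc("alice", "tasklist.exe"),
    proc("bob", "notepad.exe"),
    proc("alice", "hostname.exe"),
]

users = join_by_user(events)
```

Here alice completes all five commands while bob runs only two of them, so only alice is returned.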
Which network destinations first occurred after May 1st?
network where event_subtype_full=="ipv4_connection_attempt_event" | unique destination_address, destination_port | filter timestamp_utc >= "2018-05-01"
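The order of the pipes matters in the query above: `unique` runs first, so only the first event per destination reaches the `filter`. The Python sketch below models pipes as generator functions to show the difference. The function names `unique` and `filter_pipe` and the sample events are assumptions for illustration, and timestamps are simplified to ISO date strings so that plain string comparison works.

```python
def unique(stream, *fields):
    """Pass through only the first event for each distinct combination of fields."""
    seen = set()
    for event in stream:
        key = tuple(event[f] for f in fields)
        if key not in seen:
            seen.add(key)
            yield event

def filter_pipe(stream, condition):
    """Pass through only the events that satisfy the condition."""
    return (event for event in stream if condition(event))

def conn(day, address, port):
    """Build a hypothetical connection event."""
    return {"timestamp_utc": day, "destination_address": address, "destination_port": port}

events = [
    conn("2018-04-20", "10.0.0.1", 443),   # first seen before May 1st
    conn("2018-05-02", "10.0.0.1", 443),
    conn("2018-05-03", "10.0.0.2", 8080),  # first seen after May 1st
    conn("2018-05-04", "10.0.0.2", 8080),
]

def after_may_first(event):
    return event["timestamp_utc"] >= "2018-05-01"

# unique | filter: destinations whose FIRST occurrence is after May 1st
first_seen = list(filter_pipe(
    unique(events, "destination_address", "destination_port"), after_may_first))

# filter | unique: every destination contacted at any point after May 1st
contacted = list(unique(
    filter_pipe(events, after_may_first), "destination_address", "destination_port"))
```

With this data, `first_seen` contains only 10.0.0.2, while `contacted` contains both destinations, because 10.0.0.1 was already known before May 1st.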
SEQUENCE OF EVENTS
Many behaviors aren’t atomic and span multiple events. To define an ordered series of events, most query languages require elaborate joins or transactions, but EQL provides a sequence construct. Every item in the sequence is described with an event query between square brackets: [<event query>]. Sequences can optionally be constrained to a timespan with the syntax with maxspan=<duration>, expire with the syntax until [<event query>], or match values with the by keyword.
What files were created by non-system users, first ran as a non-system process, and later ran as a system-level process within an hour?
sequence with maxspan=1h
  [file where event_subtype_full=="file_create_event" and user_name!="SYSTEM"] by file_path
  [process where user_name!="SYSTEM"] by process_path
  [process where user_name=="SYSTEM"] by process_path
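The state that a sequence hides from the user can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the engine Endgame ships: the names `STAGES` and `find_sequences`, the shortened field names `type` and `subtype`, and the sample events are hypothetical, and maxspan is measured from the first event of each partial match.

```python
from datetime import datetime, timedelta

# One (predicate, join field) pair per bracketed item in the sequence.
STAGES = [
    (lambda e: e["type"] == "file" and e["subtype"] == "file_create_event"
               and e["user_name"] != "SYSTEM", "file_path"),
    (lambda e: e["type"] == "process" and e["user_name"] != "SYSTEM", "process_path"),
    (lambda e: e["type"] == "process" and e["user_name"] == "SYSTEM", "process_path"),
]

def find_sequences(events, stages=STAGES, maxspan=timedelta(hours=1)):
    """Yield event lists that complete every stage in order for one join value."""
    pending = {}  # (stages completed, join value) -> events matched so far
    for event in sorted(events, key=lambda e: e["time"]):
        # Check later stages first so one event cannot advance a match twice.
        for i in reversed(range(len(stages))):
            predicate, field = stages[i]
            if field not in event or not predicate(event):
                continue
            key = event[field]
            if i == 0:
                pending[(1, key)] = [event]   # start (or restart) a partial match
                continue
            partial = pending.pop((i, key), None)
            if partial is None:
                continue                      # earlier stages not seen yet
            if event["time"] - partial[0]["time"] > maxspan:
                continue                      # too slow: partial match is dropped
            if i + 1 == len(stages):
                yield partial + [event]
            else:
                pending[(i + 1, key)] = partial + [event]

t0 = datetime(2018, 4, 1, 12, 0)

def file_create(path, user, minutes):
    return {"type": "file", "subtype": "file_create_event", "file_path": path,
            "user_name": user, "time": t0 + timedelta(minutes=minutes)}

def process(path, user, minutes):
    return {"type": "process", "process_path": path,
            "user_name": user, "time": t0 + timedelta(minutes=minutes)}

events = [
    file_create(r"C:\tmp\a.exe", "alice", 0),
    process(r"C:\tmp\a.exe", "alice", 10),
    process(r"C:\tmp\a.exe", "SYSTEM", 30),    # completes within the hour
    file_create(r"C:\tmp\b.exe", "bob", 0),
    process(r"C:\tmp\b.exe", "bob", 10),
    process(r"C:\tmp\b.exe", "SYSTEM", 120),   # exceeds maxspan
    process(r"C:\Windows\c.exe", "SYSTEM", 5), # never created by a user
]

matches = list(find_sequences(events))
```

Only the a.exe chain is reported; the b.exe chain takes two hours to reach SYSTEM and is discarded.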
The language can even express relationships in a process ancestry. This could be used to look for anomalies that may have normal parent-child relationships, but are chained together in a suspicious way. To check if a process has a certain ancestor, the syntax descendant of [<ancestor query>] is used.
Did any descendant processes of Word ever create or modify any executable files in system32?
file where file_path=="C:\\Windows\\System32\\*" and file_name=="*.exe" and descendant of [process where process_name=="WINWORD.exe"]
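Conceptually, an ancestry check like this one is a walk up a table of parent pointers. The following Python sketch illustrates the idea under assumptions of our own: the `processes` lineage table, the `is_descendant_of` helper, and the sample file events are hypothetical, and PID reuse is ignored for brevity.

```python
def is_descendant_of(pid, is_ancestor, processes):
    """Return True if any ancestor of `pid` satisfies the `is_ancestor` predicate.

    `processes` maps pid -> {"name": ..., "ppid": ...} (a hypothetical lineage table).
    """
    visited = set()
    current = processes.get(pid, {}).get("ppid")
    # Climb parent pointers until the table runs out or a cycle is detected.
    while current in processes and current not in visited:
        visited.add(current)
        if is_ancestor(processes[current]):
            return True
        current = processes[current]["ppid"]
    return False

processes = {
    1: {"name": "explorer.exe", "ppid": 0},
    2: {"name": "WINWORD.exe", "ppid": 1},
    3: {"name": "cmd.exe", "ppid": 2},
    4: {"name": "powershell.exe", "ppid": 3},
    5: {"name": "notepad.exe", "ppid": 1},
}

file_events = [
    {"pid": 4, "file_path": r"C:\Windows\System32\evil.exe", "file_name": "evil.exe"},
    {"pid": 5, "file_path": r"C:\Windows\System32\ok.exe", "file_name": "ok.exe"},
    {"pid": 4, "file_path": r"C:\Users\a\notes.txt", "file_name": "notes.txt"},
]

def is_word(proc):
    return proc["name"].lower() == "winword.exe"

hits = [
    f for f in file_events
    if f["file_path"].lower().startswith("c:\\windows\\system32\\")
    and f["file_name"].lower().endswith(".exe")
    and is_descendant_of(f["pid"], is_word, processes)
]
```

The PowerShell process is three generations below Word, so its write to System32 is flagged even though each individual parent-child pair might look ordinary.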
Process ancestry relationships also support nesting and Boolean logic, facilitating rigorous queries. This helps home in on specific activity and filter out noise.
Did net.exe run from a PowerShell instance that made network activity and wasn’t a descendant of NoisyService.exe?
process where process_name=="net.exe" and descendant of [network where process_name=="powershell.exe" and not descendant of [process where process_name == "NoisyService.exe"]]
Functions extend EQL's capabilities: new functions can be defined and exposed without any change to the syntax. The length() function, for example, is useful for finding suspicious and rare PowerShell command lines:
What are the unique long PowerShell command lines with suspicious arguments?
process where process_name in ("powershell.exe", "pwsh.exe") and length(command_line) > 400 and (command_line=="*enc*" or command_line=="*IO.MemoryStream*" or command_line == "*iex*" or command_line=="* -e* *bypass*") | unique command_line
SORTING WITH THRESHOLDS
EQL can also be used for performing outlier analysis. Pipes such as sort, head, and tail filter data on the endpoint so only outliers are returned. With a search constructed this way, there is a well-defined upper bound on how many results are returned by an endpoint. That means there is less bandwidth used, no need for number crunching after the fact, and less time sifting through results. In other words, you don’t have to obtain and store data you don’t need.
What top five outgoing network connections transmitted more than 100MB?
network where total_out_bytes > 100000000 | sort total_out_bytes | tail 5
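The bounded-result property can be illustrated with a short Python sketch. It assumes nothing about how the endpoint actually implements the pipes; `top_n_by` and the sample byte counts are hypothetical, and `heapq.nlargest` stands in for `sort` followed by `tail` (ties may come back in a different order than a stable sort would give).

```python
import heapq

def top_n_by(events, field, n):
    """Rough equivalent of `| sort <field> | tail <n>`: the n largest, ascending."""
    largest = heapq.nlargest(n, events, key=lambda e: e[field])  # descending order
    return largest[::-1]

megabytes = [50, 150, 120, 900, 101, 300, 250, 110]
events = [{"total_out_bytes": mb * 1_000_000} for mb in megabytes]

# network where total_out_bytes > 100000000 | sort total_out_bytes | tail 5
over_threshold = (e for e in events if e["total_out_bytes"] > 100_000_000)
top5 = top_n_by(over_threshold, "total_out_bytes", 5)
```

However many events an endpoint has recorded, at most five leave it, which is what keeps bandwidth and result sets small.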
Logic is often shared between queries. Multiple detections can use macros for code reuse and consistency. For instance, a macro could identify whether a file name is associated with system or network enumeration:
macro ENUM_COMMAND(name) name in ("whoami.exe", "hostname.exe", "ipconfig.exe", "net.exe", ...)
Once defined, macros are used and expanded with the function call syntax. One useful query for hunting may be to find enumeration commands that were spawned from a command shell that is traced back to an unsigned process:
process where ENUM_COMMAND(process_name) and parent_process_name=="cmd.exe" and descendant of [process where signature_status=="noSignature"]
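One plausible way to think about macro expansion is as textual substitution before the query is compiled. The Python sketch below works under that assumption, which the platform may or may not share; `MACROS` and `expand_macros` are hypothetical names, nested macro calls are not handled, and the command list is limited to the four names spelled out above.

```python
import re

MACROS = {
    # macro name -> (parameter names, body)
    "ENUM_COMMAND": (
        ["name"],
        'name in ("whoami.exe", "hostname.exe", "ipconfig.exe", "net.exe")',
    ),
}

def expand_macros(query, macros=MACROS):
    """Replace each MACRO(arg, ...) call with its body, substituting the arguments."""
    def substitute(match):
        params, body = macros[match.group(1)]
        args = [a.strip() for a in match.group(2).split(",")]
        for param, arg in zip(params, args):
            # Whole-word match, so "name" inside "hostname.exe" is left alone.
            body = re.sub(r"\b%s\b" % re.escape(param), lambda _m, a=arg: a, body)
        return "(" + body + ")"

    pattern = r"\b(%s)\(([^()]*)\)" % "|".join(map(re.escape, macros))
    return re.sub(pattern, substitute, query)

expanded = expand_macros(
    'process where ENUM_COMMAND(process_name) and parent_process_name=="cmd.exe"'
)
```

The expanded text is wrapped in parentheses so that the macro body keeps its meaning when combined with the surrounding Boolean operators.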
Real-Time Detection With EQL
Since historical searches and real-time analytics are both described with EQL, it’s easy to check new protections for noise before deployment. This is crucial because alert fatigue is one of the most common problems faced by the defender today. When creating a new analytic, Endgame researchers use a refinement process to filter out false positives.
When detecting malicious behavior and attacker techniques, the first step is often detonating malware or a malicious script and collecting endpoint data. Once data is collected, events are explored with EQL and a new query is written. To establish a reasonable degree of confidence, we then evaluate the query against many sources of data, including Endgame’s internal network, partner data, and custom environments that are intentionally noisy. After passing these checks, a new tradecraft analytic expressed in EQL is enriched with metadata and converted to a format which Endgame’s detection engine understands. Finally, this machine representation of the query is loaded into the sensor, where new events are analyzed in real-time and an alert is generated immediately when a match is detected.
Writing Behavioral Malware Detections
EQL is not limited by the underlying data. In fact, we use EQL in our malware detonation and analysis sandbox, Arbiter® (https://www.endgame.com/blog/executive-blog/endgame-arbiter-solving-now-what-problem), which has a different underlying data schema. Expressing behavioral detections of malware with EQL in Arbiter® is painless, and our custom analysis engine performs orders of magnitude faster than other approaches we evaluated, allowing us to rapidly perform dynamic malware analysis and detect new behaviors.
PROCESS INJECTION DETECTION IN ARBITER
A traditional remote shellcode injection technique uses several well documented Windows APIs to open a handle to a remote process, allocate memory, write to the newly allocated memory, and start a thread. The code generally looks something like:
hVictim = OpenProcess(PROCESS_ALL_ACCESS, 0, victimPid);
lpDestAddress = VirtualAllocEx(hVictim, NULL, numBytes, MEM_COMMIT|MEM_RESERVE, PAGE_EXECUTE_READWRITE);
WriteProcessMemory(hVictim, lpDestAddress, lpSourceAddress, numBytes, NULL);
CreateRemoteThread(hVictim, NULL, 0, lpStart, NULL, NULL, NULL);
CloseHandle(hVictim);
To build an analytic using API events, the process handle needs to be correctly tracked. The handle hVictim is first returned by OpenProcess and then used as an argument in the calls to VirtualAllocEx, WriteProcessMemory, and CreateRemoteThread. However, if and when CloseHandle is called, the handle is invalidated and all state for that handle needs to be thrown away, because the handle value may be reused. It may sound complicated, but a stateful detection for Arbiter® is easy to create with EQL:
sequence
  [api_ret where function_name=="OpenProcess"] by return_value
  [api_call where function_name=="VirtualAllocEx"] by arguments.hProcess
  [api_call where function_name=="WriteProcessMemory"] by arguments.hProcess
  [api_call where function_name=="CreateRemoteThread"] by arguments.hProcess
until
  [api_call where function_name=="CloseHandle"] by hObject
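The bookkeeping that this sequence performs can be approximated with a small state machine keyed on the handle value. The Python sketch below is a simplification and not Arbiter's analysis engine: `detect_injection` and `INJECTION_STEPS` are hypothetical names, and each event is flattened to a `function_name` plus a single `handle` field (the return value of OpenProcess, or the handle argument of the other calls).

```python
INJECTION_STEPS = ["VirtualAllocEx", "WriteProcessMemory", "CreateRemoteThread"]

def detect_injection(api_events):
    """Return handles that went through OpenProcess and every injection step, in order,
    before being closed."""
    progress = {}  # open handle -> index of the next expected step
    alerts = []
    for event in api_events:
        name, handle = event["function_name"], event["handle"]
        if name == "OpenProcess":
            progress[handle] = 0          # start tracking the returned handle
        elif name == "CloseHandle":
            progress.pop(handle, None)    # the `until` clause: discard all state
        elif handle in progress and name == INJECTION_STEPS[progress[handle]]:
            progress[handle] += 1
            if progress[handle] == len(INJECTION_STEPS):
                alerts.append(handle)
                del progress[handle]
    return alerts

def api(function_name, handle):
    """Build a hypothetical, flattened API event."""
    return {"function_name": function_name, "handle": handle}

trace = [
    api("OpenProcess", 0x10),
    api("VirtualAllocEx", 0x10),
    api("WriteProcessMemory", 0x10),
    api("CreateRemoteThread", 0x10),   # full chain on one live handle
    api("OpenProcess", 0x20),
    api("VirtualAllocEx", 0x20),
    api("CloseHandle", 0x20),          # state for 0x20 is thrown away
    api("OpenProcess", 0x20),          # handle value reused for a new object
    api("WriteProcessMemory", 0x20),
    api("CreateRemoteThread", 0x20),
]

alerts = detect_injection(trace)
```

Only handle 0x10 raises an alert. The calls on 0x20 straddle a CloseHandle, so they belong to two different handles that happen to share a value and must not be stitched together.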
One sequence is not enough to detect all forms of process injection. There are many methods to gain arbitrary code execution in another process, each with different API calls, and each requiring another detection. Consequently, as the attacker’s playbook continues to evolve, defenders need to react quickly and find new ways to detect the latest techniques while simultaneously promoting layered detections.
Unifying Hunt and Detection
With the Event Query Language, we can describe events that correspond to adversary techniques without dealing with the mechanics of traditional databases, rule engines, or an awkward data schema. Search, hunt, and detect are unified within the Endgame platform by EQL, where exploring events is made easy without sacrificing power and flexibility. Ultimately, EQL helps Endgame and Endgame customers quickly find suspicious activity and improve detections of attacker techniques defined in MITRE’s ATT&CK™.
Advancing our collective understanding and adoption of security tools is of utmost importance to combat today’s threats. At Endgame, sharing information about capabilities we’ve built and the underlying architecture which motivated our decisions is one way for us to do that. We look forward to discussions within the security community going forward about EQL and its value in driving advanced hunt and detection in a way that is performant, robust, and most importantly, empowering to defenders.