Ransomware protection in the open: Advancing efficacy through community collaboration
Free and open access is one of the core principles upon which Elastic was originally built and continues to operate. Our products are free to use, and much of our code is accessible in public source code repositories. In recent years, this commitment to transparency and availability has extended to our security offerings. We believe that distributing more of our key security components to the masses and highlighting their inner workings will greatly enhance our product and provide a tangible benefit to researchers and practitioners alike.
In August 2022, we began distributing our EQL and YARA detection rules out in the open in our Protections Artifacts repository, as we have previously done with Security App detection rules. At that time, we also committed to continue operating in a transparent manner and share more artifacts with the community in the future. Today, Elastic is making our ransomware protection artifact available for download as part of our effort to further openness and transparency within security.
Security researchers have embraced openness when it comes to combating the existential threat that ransomware poses to internet users throughout the world. ID Ransomware and the Bleeping Computer forums are two notable resources when it comes to providing victims of ransomware attacks with tools and guidance to better triage their affected systems and, in certain cases, even recover their encrypted files. In a similar vein, our hope is that by publishing the underlying logic that drives our ransomware capabilities in our Elastic Defend integration, we will be able to solicit more meaningful and honest feedback from our users, improve our detection efficacy, and address any critical vulnerabilities that may be discovered. We also hope that this will encourage discussion about how detection capabilities need to evolve and keep pace with the threat that ransomware continues to pose.
Ransomware protection: A brief architecture overview
In order to detect ransomware as quickly as possible on a Windows host, we analyze each file modification event immediately after it has taken place. The general idea is to inspect each file event on its own, as well as within the context of other file events previously attributed to the same process. If a given process has exhibited a sufficient amount of anomalous behavior within a predetermined time frame, an alert will be generated.
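To make the "predetermined time frame" idea concrete, here is a minimal sketch of per-process tracking over a sliding time window. It is not the logic shipped in the artifact: the class name, the 30-second window, and the limit of five anomalous events are all assumptions chosen for illustration.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 30.0  # hypothetical time frame
ANOMALY_LIMIT = 5      # hypothetical number of anomalous events that triggers an alert


class ProcessWindow:
    """Tracks timestamps of anomalous file events, one queue per process."""

    def __init__(self, window=WINDOW_SECONDS, limit=ANOMALY_LIMIT):
        self.window = window
        self.limit = limit
        self.anomalies = defaultdict(deque)  # pid -> timestamps of anomalous events

    def observe(self, pid, timestamp, is_anomalous):
        """Record one file event; return True if the process should be alerted on."""
        recent = self.anomalies[pid]
        if is_anomalous:
            recent.append(timestamp)
        # Forget anomalies that have fallen out of the time frame.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) >= self.limit
```

Each event is judged on its own (the is_anomalous flag) and then in the context of what the same process did recently, which mirrors the two levels of inspection described above.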
The ransomware protection feature built into Defend is based on a previous capability. One significant innovation that our research team introduced is an artifact-based detection framework. The initial retrieval and filtering of file events still takes place within the endpoint, but the detection logic is implemented in Lua rather than a lower-level programming language. This Lua code, stored in a standalone artifact on disk, is loaded by the endpoint into a previously initialized and restricted Lua environment.
One of the primary benefits of decoupling the filtering and detection mechanisms is that we can quickly ship out new artifact updates without having to wait for an endpoint update. This way, we can rapidly respond if there are any significant issues discovered relating to detection gaps, false positives that are not easily tuned out with our current exception list capabilities, or critical vulnerabilities that may put our users at risk.
Another benefit of storing our detection logic in Lua is that it has enabled our research team to more easily build tools that aim to provide deeper insight into the current range of metrics that we utilize to discover anomalies.
Automated ransomware analysis pipeline
Chasing every ransomware family or variant manually is impractical due to the large number of samples that appear daily. To close gaps in our heuristics and maintain high detection rates, we built an in-house ransomware analysis pipeline that consumes a daily feed of malware samples, filters out samples that aren’t ransomware, and then sends the remaining samples to a malware detonation service we call “Detonate.”
Detonate runs each sample in a controlled virtual environment for a predefined period (approximately 5 minutes) while monitoring its behavior by hooking OS system calls to capture process, thread, image load, and file events, to name just a few forms of instrumentation. At the end of the analysis, all intercepted behaviors are stored in an Elasticsearch database for post-processing.
Next, worker processes kick off jobs that separate the file modification events from all the other observable events that the endpoint captured and then serialize those events to a JSON file we call an “event trace.” Here is an example of one entry inside the event trace for a sample belonging to the Babuk ransomware family:
Every file event in an event trace provides context about the file modification operation. For example, we can see the file path, the type of file operation that took place, file rename information (if applicable), the process and thread identifiers, the parent process, file size, file entropy, and so much more.
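File entropy is one of the fields listed above, and it is straightforward to compute. The sketch below uses the standard Shannon entropy formula over byte frequencies and attaches the result to a simplified event record. The field names (path, operation, pid, size, entropy) are hypothetical and do not reflect the real event trace schema.

```python
import json
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def make_event(path, operation, pid, data):
    """Build a simplified file event. Field names are hypothetical."""
    return {
        "path": path,
        "operation": operation,
        "pid": pid,
        "size": len(data),
        "entropy": round(shannon_entropy(data), 3),
    }


if __name__ == "__main__":
    # Well-encrypted content looks close to uniformly random, so its entropy
    # approaches 8 bits per byte; plain text usually scores much lower.
    event = make_event("C:\\example\\notes.txt", "creation", 1234, b"hello world")
    print(json.dumps(event))
```

High entropy alone is not proof of encryption, since compressed archives and media files also score high, which is one reason to combine it with other signals.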
We use the previously generated event traces as input to our ransomware detection module, which in turn analyzes the events and comes up with a verdict on whether the sample is malicious or not. If you’ve ever wondered why malware detection is sometimes called “conviction,” maybe it makes more sense now.
The event traces play a significant role in reproducing a ransomware attack, as we only have to detonate a sample once and save the results in our corpus. Whenever we modify our Lua codebase and enhance our detection capabilities, we can simply re-run our analysis module against all event traces without needing to involve the endpoint or detonate the samples again.
At the highest level, the workflow looks like this:
So far, we’ve discussed how ransomware is detonated to generate an event trace. Now, let’s dig into the analysis framework that contains the malware detection logic.
Ransomware analysis framework
The ransomware analysis framework is a collection of tools written in Python to easily work with our Lua artifact codebase. This Lua code contains the necessary logic to detect anomalous behavior in file events.
One of the most important features of the framework is the ability to load a Lua runtime, feed it Lua artifact code, and replay a previously recorded event trace. By doing so, we can answer simple questions like: are we able to alert on the sample in question?
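The replay idea can be sketched in a few lines. The real framework loads the Lua artifact into a Lua runtime; the sketch below swaps that for a plain Python callable so it stays self-contained. It also assumes a trace is available as an iterable of JSON strings, one event each, which may not match the actual file layout. All function names here are hypothetical.

```python
import json


def replay(trace_lines, detector):
    """Feed recorded events to `detector` in order; stop at the first alert.

    Returns the index of the event that triggered the alert, or None.
    """
    for index, line in enumerate(trace_lines):
        event = json.loads(line)
        if detector(event):
            return index
    return None


def missed_samples(corpus, make_detector):
    """Names of traces that raise no alert, using a fresh detector per trace."""
    return [
        name
        for name, lines in corpus.items()
        if replay(lines, make_detector()) is None
    ]


def make_deletion_counter(limit=2):
    """Stand-in detector: alerts after `limit` deletions. Illustration only."""
    state = {"deletions": 0}

    def detector(event):
        if event.get("operation") == "deletion":
            state["deletions"] += 1
        return state["deletions"] >= limit

    return detector
```

Because the traces are recorded once, a helper like missed_samples can be re-run over the whole corpus after every change to the detection logic, without detonating anything again.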
The example below runs the replay command over a sample from the Donut ransomware family. We can see that we have successfully generated an alert.
Our detection framework is based on a scoring system; as file modification events come in, we continue to evaluate them against our ransomware detection heuristics. Each file event may or may not increase the score depending on how anomalous it appears to be. If the score reaches a certain threshold, we raise an alert and terminate the malicious process.
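A minimal sketch of such a scoring loop follows. The metric names, their weights, and the threshold are invented for illustration and are not taken from the artifact. Terminating the process is left out. The sketch also keeps a record of which events contributed to the score, so that an alert can be explained afterwards.

```python
from collections import defaultdict

ALERT_THRESHOLD = 1.0  # hypothetical value

# Hypothetical metric names and weights, for illustration only.
METRIC_WEIGHTS = {
    "EXAMPLE_HIGH_ENTROPY": 0.25,
    "EXAMPLE_UNKNOWN_EXTENSION": 0.25,
    "EXAMPLE_RENAMED": 0.5,
}


class Scorer:
    """Accumulates a per-process score from the metrics matched by each event."""

    def __init__(self, threshold=ALERT_THRESHOLD):
        self.threshold = threshold
        self.scores = defaultdict(float)  # pid -> running score
        self.history = defaultdict(list)  # pid -> [(path, metrics, event_score)]

    def add_event(self, pid, path, metrics):
        """Add one event's matched metrics; return True once the threshold is reached."""
        event_score = sum(METRIC_WEIGHTS[name] for name in metrics)
        self.scores[pid] += event_score
        if event_score > 0:
            self.history[pid].append((path, list(metrics), event_score))
        return self.scores[pid] >= self.threshold

    def explain(self, pid):
        """Events that contributed to the score, in arrival order."""
        return list(self.history[pid])
```

An event that matches no metric leaves the score unchanged, which corresponds to the "may or may not increase the score" behavior described above.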
The next question a security researcher would want to ask is: which metrics contributed to the detection of a ransomware sample? Below we run the same command as before but with an extra flag to explain the alert score.
The output is larger than what we see in the screenshot above. For the sake of illustration, we limited the output to two events. The first event shows that the ransomware deleted the file "C:\\Python39\\include\\internal\\pycore_call.h". The second event created a file at the same path but with an extra .donut extension appended to the end of the file name. This is a common ransomware behavior pattern. The file is first read from disk and then encrypted in memory. Afterwards, the original file is deleted from disk and a newly encrypted one is created with the ransomware extension appended to the original file path.
In this example, the extension is “.donut” and the subextension is “.h”. The “metrics” field contains the list of heuristics that matched against that event. For instance, CREATE_EXTENSION_KNOWN_SUBEXTENSION_KNOWN_AND_PREVIOUSLY_DELETED means that we have created a file that contains two extensions in its filename. The initial (.h) and appended (.donut) extensions are both known to us, and the file path without the appended extension was previously deleted. This event resulted in a score of 0.425.
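The delete-then-create pattern behind this metric can be approximated as follows. This is one reading of the metric's name, not the artifact's Lua logic. The extension lists, the operation names ("deletion", "creation"), and the class name are assumptions. ntpath is used so that Windows paths are parsed the same way on any platform.

```python
import ntpath

# Hypothetical extension lists, for illustration only.
KNOWN_RANSOMWARE_EXTENSIONS = {".donut"}
KNOWN_FILE_EXTENSIONS = {".h", ".txt", ".docx"}


def split_extensions(path):
    """Return (path_without_appended_extension, subextension, extension)."""
    base, extension = ntpath.splitext(path)
    _, subextension = ntpath.splitext(base)
    return base, subextension, extension


class DeleteThenCreateTracker:
    """Flags creations of <path><ext> where <path> was deleted earlier."""

    def __init__(self):
        self.deleted = set()  # lower-cased paths, since Windows paths ignore case

    def observe(self, operation, path):
        """Return True when a creation matches the delete-then-create pattern."""
        if operation == "deletion":
            self.deleted.add(path.lower())
            return False
        if operation != "creation":
            return False
        base, subextension, extension = split_extensions(path)
        return (
            extension.lower() in KNOWN_RANSOMWARE_EXTENSIONS
            and subextension.lower() in KNOWN_FILE_EXTENSIONS
            and base.lower() in self.deleted
        )
```

For the path in the example, split_extensions returns the original header path as the base, ".h" as the subextension, and ".donut" as the extension.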
Another useful command to run is analyze trace, which performs a holistic analysis over the whole trace and generates summary information:
Above, 52% of file operations were creations and 45% were deletions, which confirms the same ransomware pattern we saw earlier. We can also see a breakdown of each unique directory:
It is very common for ransomware to drop a ransom note in each directory, and this view makes it easy to spot the ransom note (decrypt.txt).
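A summary of this kind can be approximated with a couple of counters. The sketch below assumes each event is a dictionary with hypothetical "operation" and "path" keys. It reports the share of each operation type, groups file names by directory, and lists names that recur across directories as ransom note candidates. The function names and the min_directories parameter are illustrative, not part of the framework.

```python
import ntpath
from collections import Counter, defaultdict


def summarize(events):
    """Return (operation percentages, file-name counts per directory)."""
    operations = Counter(event["operation"] for event in events)
    total = sum(operations.values())
    percentages = {
        name: round(100 * count / total) for name, count in operations.items()
    }
    by_directory = defaultdict(Counter)
    for event in events:
        directory, filename = ntpath.split(event["path"])
        by_directory[directory][filename] += 1
    return percentages, by_directory


def likely_ransom_notes(by_directory, min_directories=2):
    """File names that appear in at least `min_directories` distinct directories."""
    seen = Counter(name for files in by_directory.values() for name in files)
    return sorted(name for name, count in seen.items() if count >= min_directories)
```

A file name that shows up in every directory the process touched stands out immediately in this view, which is the same property that makes the ransom note easy to spot in the per-directory breakdown.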
More to come
As with other advanced threats, ransomware is far from being a solved problem. We plan on continuing to innovate and extend our detection capabilities further in the hopes of improving our efficacy as well as our mean time to detect. With our Lua artifact now out in the open, we will conduct this process as transparently as possible and will share our findings and successes with our users and the security research community.