Elastic Security Labs - Articles by Sergey Polzunov

Into The Weeds: How We Run Detonate

Tue, 13 Jun 2023 00:00:00 GMT

Preamble

In our first post in our Detonate series, we introduced the Detonate system and what we use it for at Elastic. We also discussed the benefits it provides our team when assessing the performance of our security artifacts.

In this publication, we will break down how Detonate works and dive deeper into the technical implementation. This includes how we’re able to create this sandboxed environment in practice, the technology that supports the overall pipeline, and how we submit information to and read information from the pipeline.

Interested in other posts on Detonate? Check out Part 1 - Click, Click…Boom! where we introduce Detonate, why we built it, explore how Detonate works, describe case studies, and discuss efficacy testing.

Architecture

Below is a high-level overview of the Detonate end-to-end architecture.

The overall system consists of a series of message queues and Python workers. Detonation tasks are created by an API server upon accepting a request with as little information as the sample file hash. The task then moves from queue to queue, picked up by workers that execute various operations along the way.
The server and workers run in a container on Amazon ECS. The pipeline can also be brought up locally using Docker Compose for early development and feature testing.

API server

The Detonate API server is a FastAPI Python application that accepts a variety of execution target requests: hashes of samples, native commands (in bash or PowerShell, with or without arguments), and uploaded files. The server also exposes endpoints for fetching alerts and raw agent telemetry from an Elastic cluster.

The API documentation is generated automatically by FastAPI and incorporated into our global API schema.
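
To make the three kinds of execution targets concrete, here is a minimal validation sketch. It uses only the Python standard library and is not the actual FastAPI code: the `DetonationRequest` class, its field names, and the choice to accept only SHA256 hashes are assumptions made for this illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List

# Assumption: hashes are SHA256 (64 hex characters); the real API may accept others.
SHA256_RE = re.compile(r"[0-9a-fA-F]{64}")
INTERPRETERS = {"bash", "powershell"}


@dataclass
class DetonationRequest:
    """Hypothetical request shape covering the three target kinds."""

    kind: str                    # "hash", "command", or "file"
    value: str                   # the hash, the command, or the uploaded file name
    interpreter: str = ""        # only used when kind == "command"
    args: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the request is not a well-formed target."""
        if self.kind == "hash":
            if not SHA256_RE.fullmatch(self.value):
                raise ValueError("expected a 64-character hex SHA256")
        elif self.kind == "command":
            if self.interpreter not in INTERPRETERS:
                raise ValueError("interpreter must be bash or powershell")
            if not self.value.strip():
                raise ValueError("command must not be empty")
        elif self.kind == "file":
            if not self.value:
                raise ValueError("file name must not be empty")
        else:
            raise ValueError(f"unknown target kind: {self.kind}")
```

In a FastAPI application, this kind of validation would more typically live in request models, so treat the class above purely as a description of the inputs.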

Interacting with the API server - CLI

We built a custom Python CLI (command-line interface) tool for interacting with our Detonate server. The tool is built with the Python library click, along with rich for readable, nicely formatted terminal output. It is particularly useful for debugging the pipeline, as it can also be run against a local pipeline setup. The tool is installed and run with Poetry, our preferred tool for managing dependencies and running scripts.

❯ DETONATE_CLI_API_ROOT_URL="${API_ENDPOINT_URL}" \
	DETONATE_CLI_API_AUTH_HEADER="${API_KEY}" \
	poetry run cli \
	--hash "${MY_FILE_HASH}"

Interacting with the API server - Web UI

Internally, we host a site called Protections Portal (written using Elastic UI components) to assist our team with research. For a more interactive experience with the Detonate API, we built a page in the Portal to interact with it. Along with submitting tasks, the Web UI allows users to see the feed of all detonations and the details of each task.

Each task can be expanded to see its full details. We provide the links to the data and telemetry collected during the detonation.

Interacting with the API server - HTTP client

If our users want to customize how they interact with the Detonate API, they can also run commands using their HTTP client of choice (such as curl, httpie, etc.). This allows them to add detonations to scripts or as final steps at the end of their own workflows.
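
For illustration, a scripted submission might look like the sketch below, using only the Python standard library. The `/detonations` path, the `hash` JSON field, and the use of the `Authorization` header are assumptions made for this example; this post does not document the real routes or authentication scheme.

```python
import json
import urllib.request


def build_detonation_request(api_root: str, api_key: str, sample_hash: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that submits a hash for detonation."""
    body = json.dumps({"hash": sample_hash}).encode("utf-8")  # field name is assumed
    return urllib.request.Request(
        url=f"{api_root.rstrip('/')}/detonations",            # hypothetical endpoint path
        data=body,
        method="POST",
        headers={
            "Authorization": api_key,                          # assumed auth header
            "Content-Type": "application/json",
        },
    )
```

Sending the request is then a matter of passing it to `urllib.request.urlopen(...)` and reading the response, which keeps the submission step easy to drop into the end of a larger script.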

Queues

The pipeline is built on a series of queues and workers. Because our requirements for a message queue engine are very basic, we decided to go with Amazon SQS. One of the many benefits of using a popular service like SQS is the availability of open-source resources and libraries we can build upon. For example, we use softwaremill/elasticmq Docker images as a queue engine when running the pipeline locally.

The queues are configured and deployed with Terraform code that covers all our production and staging infrastructure.

Workers

Each worker is a Python script that acts as both a queue consumer and a queue producer. The workers are implemented in our custom mini-framework, with the boilerplate code for error handling, retries, and monitoring built-in. Our base worker is easily extended, allowing us to add new workers and evolve existing ones if additional requirements arise.
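
The consumer/producer pattern described above can be sketched as follows. This is a simplified illustration rather than the actual mini-framework: in-memory `queue.Queue` objects stand in for SQS, and the `BaseWorker` class, the `max_attempts` retry budget, and the body of the `StartVm` example are hypothetical.

```python
import queue
from typing import Optional


class BaseWorker:
    """Consume a task from one queue, process it, and publish it to the next."""

    max_attempts = 3  # assumed retry budget

    def __init__(self, inbox: queue.Queue, outbox: Optional[queue.Queue] = None):
        self.inbox = inbox
        self.outbox = outbox

    def process(self, task: dict) -> dict:
        """Subclasses implement the actual work for one pipeline step."""
        raise NotImplementedError

    def run_once(self) -> bool:
        """Handle a single message; return False when the inbox is empty."""
        try:
            task = self.inbox.get_nowait()
        except queue.Empty:
            return False
        for attempt in range(1, self.max_attempts + 1):
            try:
                result = self.process(task)
            except Exception as exc:
                # Record the failure on the task and retry.
                task.setdefault("errors", []).append(f"attempt {attempt}: {exc}")
                continue
            if self.outbox is not None:
                self.outbox.put(result)
            return True
        # Retries exhausted: the task is dropped here. A real pipeline
        # might route it to a dead-letter queue instead.
        return True


class StartVm(BaseWorker):
    """Toy step that attaches a made-up VM name to the task."""

    def process(self, task: dict) -> dict:
        return {**task, "vm": f"vm-{task['hash'][:8]}"}
```

Chaining steps is then just a matter of using one worker's `outbox` as the next worker's `inbox`, which is what lets a task move from queue to queue.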

For monitoring, we use the Elastic APM observability solution. It is incredibly powerful, giving us a view into the execution flow and making debugging pipeline issues a breeze. Below, we can see a Detonate task move between workers in the APM UI:

These software and infrastructure components give us everything we need to perform the submission, execution, and data collection that make up a detonation.

Detonations

The pipeline can execute commands and samples in Windows, Linux, and macOS virtual machines (VMs). For Windows and Linux environments, we use VM instances in Google Compute Engine. Its wide selection of public images allows us to provision sandboxed environments with different versions of Windows, Debian, Ubuntu, CentOS, and RHEL.

For macOS environments, we use mac1.metal instances in AWS and an on-demand macOS VM provisioning solution from Veertu called Anka. Anka gives us the ability to quickly rotate multiple macOS VMs running on the same macOS bare metal instance.

Detonate currently focuses on the breadth of our OS coverage, scalability, and the collection of contextually relevant data from the pipeline. Countermeasures against sophisticated anti-analysis techniques are still being researched and engineered.

VM provisioning

To keep our footprint in the VM to a minimum, we use startup scripts for provisioning. Minimizing our footprint is important because our activities within a VM are included in the events we collect, making analysis more complicated after a run. For Windows and Linux VMs, GCP startup scripts written in PowerShell and bash are used to configure the system; for macOS VMs, we wrote custom bash and AppleScript scripts.

The startup scripts perform these steps:

Configure the system. For example, disable MS Defender, enable macro execution in MS Office, disable automatic system updates, etc.
Download and install Elastic Agent. The script verifies that the agent is properly enrolled in the Fleet Server and that the policies are applied.
Download and detonate a sample, or execute a set of commands. The execution happens in a background process, while the main script collects the STDOUT / STDERR streams and sleeps for N seconds.
Collect files from the filesystem (if needed) and upload them into the storage. This allows us to do any additional verification or debugging once the detonation is complete.
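
The detonation step above (run in the background, capture output, sleep for N seconds) can be illustrated with a short sketch. The real startup scripts are written in PowerShell and bash, so this Python version only shows the pattern; the `detonate` function is hypothetical, and killing a process that is still running after the wait is an assumption of this example.

```python
import subprocess
import sys
import tempfile
import time


def detonate(command, wait_seconds):
    """Run `command` in the background, wait, then return what it printed."""
    with tempfile.TemporaryFile() as out, tempfile.TemporaryFile() as err:
        proc = subprocess.Popen(command, stdout=out, stderr=err)
        time.sleep(wait_seconds)   # fixed observation window (the "N seconds")
        exit_code = proc.poll()    # None means the process is still running
        if exit_code is None:
            proc.kill()            # assumption: stop whatever is still running
            proc.wait()
        out.seek(0)
        err.seek(0)
        return {
            "exit_code": exit_code,
            "stdout": out.read().decode(errors="replace"),
            "stderr": err.read().decode(errors="replace"),
        }


# Example call:
# detonate([sys.executable, "-c", "print('boom')"], wait_seconds=2)
```

Writing the output to files rather than pipes means a chatty sample cannot block on a full pipe buffer while the main script is sleeping.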

The VM lifecycle is managed by the start_vm and stop_vm workers. Since we expect some detonations to break the startup script execution flow (e.g., in the case of ransomware), every VM has a TTL set, which allows the stop_vm worker to delete VMs not in use anymore.
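
A TTL-based cleanup check could look roughly like the sketch below. The `VmRecord` fields and the `expired_vms` helper are hypothetical; how the real stop_vm worker stores and reads the TTL is not described in this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import List, Optional


@dataclass
class VmRecord:
    """Hypothetical bookkeeping entry for one sandbox VM."""

    name: str
    created_at: datetime
    ttl: timedelta


def expired_vms(vms: List[VmRecord], now: Optional[datetime] = None) -> List[str]:
    """Return the names of VMs whose TTL has elapsed and that can be deleted."""
    now = now or datetime.now(timezone.utc)
    return [vm.name for vm in vms if now - vm.created_at >= vm.ttl]
```

Because the decision depends only on the creation time and the TTL, it still works when a sample (ransomware, for example) has broken the startup script and the VM never reports back.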

This clean-slate approach, with the startup script used to configure everything needed for a detonation, allows us to use vendor VM images from the Google Cloud public images catalog without any modifications!

Network configuration

Some of the samples we detonate are malicious and might produce malicious traffic, such as network scans, C2 callouts, etc. In order to keep our cloud resources and our vendor’s infrastructure safe, we limit all outgoing traffic from VMs. The instances are placed in a locked-down VPC that allows outgoing connections only to a predefined list of targets. We restrict traffic flows in the VPC using Google Cloud’s routes and firewall rules, and AWS’s security groups.
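
The default-deny idea behind this setup can be expressed in a few lines. In practice the enforcement is done by cloud firewall rules, routes, and security groups, not application code; the sketch below only illustrates the allowlist logic, and the networks shown are documentation ranges chosen for the example.

```python
import ipaddress
from typing import Iterable

# Hypothetical allowlist; these are reserved documentation ranges, not real targets.
ALLOWED_EGRESS = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.16/28"),
]


def is_egress_allowed(dest_ip: str, allowed: Iterable = ALLOWED_EGRESS) -> bool:
    """Default deny: a destination is allowed only if it falls inside an allowlisted network."""
    addr = ipaddress.ip_address(dest_ip)
    return any(addr in network for network in allowed)
```

Anything not explicitly listed, including scan targets and C2 endpoints a sample tries to reach, is rejected by default.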

We also make use of VPC Flow Logs in GCE. These logs allow us to see private network traffic initiated by sandbox VMs in our VPC.

Telemetry collection

To observe detonations, we use the Elastic Agent with the Elastic Defend integration installed with all protections in “Detect” (instead of “Protect”) mode. This allows us to collect as much information from a VM as we can, while simultaneously allowing the Elastic Security solution to produce alerts and detections.

We cover two use cases with this architecture: we can validate protections (comparing events and alerts produced for different OS versions, agent versions, security artifacts deployed, etc.) and collect telemetry for analysis (for fresh samples or novel malware) at the same time. All data collected is kept in a persistent Elastic cluster and is available for our researchers.

Running in production

Recently, we completed a full month of running the Detonate pipeline in production, under the load of multiple data integrations while serving internal users through the UI. Our record so far is 1034 detonations in a single day, and we haven’t seen any scalability or reliability issues.

The bulk of the submissions are Windows-specific samples, for now. We are working on increasing our coverage of Linux and macOS as well – stay tuned for the research blog posts coming soon!

We are constantly improving our support for various file types, making sure the detonation is as close to the intended trigger behavior as possible.

Looking at the detonations from the last month, we see that most of the tasks were completed in under 13 minutes (with a median of 515 seconds). This time includes task data preparation, VM provisioning and cleanup, sample execution, and post-detonation processing.

These are still the early days of the service, so some outliers are to be expected. Since most of the time in a task is spent waiting for a VM to provision, we can improve the overall execution time by using custom VM images, pre-starting VM instances, and optimizing the startup scripts.

What's next?

Now that you see how Detonate works, our next posts will dive into more detailed use cases of Detonate. We’ll go further into how these detonations turn into protecting more of our users, including right here at Elastic!

Click, Click… Boom! Automating Protections Testing with Detonate

Thu, 04 May 2023 00:00:00 GMT

Preamble

Imagine you are an Endpoint artifact developer. After you put in the work to ensure protection against conventional shellcode injections or ransomware innovations, how do you know it actually works before you send it out into the world?

First, you set up your end-to-end system, which involves setting up several services, the infrastructure, network configuration, and more. Then, you run some malware; the data you collect answers questions about performance and efficacy, and may be an important research resource in the future. After you spend a day testing and gathering your results, you may want to run several hundred hashes over multiple kinds of operating systems and machine types, a daunting task if done entirely manually.

To automate this process and test our protections at scale, we built Detonate, a system that is used by security research engineers to measure the efficacy of our Elastic Security solution in an automated fashion. Our goal is to have it take security researchers only a couple of clicks to test our protections against malware. (Thus: click, click… boom!)

In this series of posts, we’ll:
- Introduce Detonate and why we built it
- Explore how Detonate works and the technical implementation details
- Describe case studies on how our teams use it at Elastic
- Discuss opening our efficacy testing to the community to help the world protect their data from attack

Interested in other posts on Detonate? Check out Part 2 - Into The Weeds: How We Run Detonate where we break down how Detonate works and dive deeper into the technical implementation.

What is Detonate?

At a high level, Detonate runs malware and other potentially malicious software in a controlled (i.e., sandboxed) environment where the full suite of Elastic Security capabilities is enabled. Detonate accepts a file hash (usually a SHA256) and performs the following actions:

Prepares all files needed for detonation, including the malicious file
Provisions a virtual machine (VM) instance in a sandboxed environment, with limited connectivity to the outside world
Waits until file execution completes; this happens when, for example, an execution result file is found or the VM instance is stopped or older than a task timeout
Stops the running VM instance (if necessary) and cleans up the sandboxed environment
Generates an event summary based on telemetry and alerts produced during detonation
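
The completion conditions in the third step above can be sketched as a single check. The `DetonationState` fields, the `"STOPPED"` status string, and the function name are assumptions for illustration; the real pipeline's state model is not shown in this post.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class DetonationState:
    """Hypothetical snapshot of what is known about a running detonation."""

    started_at: datetime
    result_file_found: bool
    vm_status: str  # e.g. "RUNNING" or "STOPPED" (status names are assumed)


def is_complete(state: DetonationState, task_timeout: timedelta, now: datetime) -> bool:
    """A detonation is done when any one of the three conditions holds."""
    if state.result_file_found:               # the execution result file showed up
        return True
    if state.vm_status == "STOPPED":          # the VM instance is no longer running
        return True
    return now - state.started_at > task_timeout  # the VM is older than the task timeout
```

The timeout branch is what guarantees that every task eventually finishes, even when a sample prevents the result file from ever being written.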

The results of these detonations are made available to the team for research and development purposes. By post-processing the logs, events, and alerts collected during detonation, we can enrich them with third-party intelligence and other sources to evaluate the efficacy of new and existing Elastic Security protection features.

What does it help us with?

Measuring Efficacy

To build the best EPP on the market, we have to continuously measure the effectiveness of our product against the latest threats. Detonate is used to execute many tens of thousands of samples every month from our data feeds. Gaps in coverage are automatically identified and used to prioritize improvements to our protections.

Supporting existing protections

Many of our protections have associated artifacts (such as machine learning models and rule definitions) which receive regular updates. These updates need testing to ensure we identify and remediate regressions before they end up in a user’s environment.

Detonate provides a framework and suite of tools to automate the analysis involved in this testing process. By leveraging a corpus of hashes with known good and bad software, we can validate our protections before they are deployed to users.
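
As a rough illustration of validating against a labeled corpus, the sketch below computes a detection rate and a false positive rate and lists the misses. The `CorpusResult` record and the rule that a sample counts as detected when it produced at least one alert are assumptions made for this example, not a description of Detonate's actual scoring.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CorpusResult:
    """Hypothetical outcome of detonating one labeled sample."""

    sample_hash: str
    known_malicious: bool  # label from the corpus
    alerted: bool          # did the detonation produce at least one alert?


def efficacy_report(results: List[CorpusResult]) -> Dict[str, object]:
    """Summarize detections and false positives for a labeled corpus."""
    malicious = [r for r in results if r.known_malicious]
    benign = [r for r in results if not r.known_malicious]
    missed = [r.sample_hash for r in malicious if not r.alerted]
    false_positives = [r.sample_hash for r in benign if r.alerted]
    return {
        "detection_rate": (len(malicious) - len(missed)) / len(malicious) if malicious else None,
        "false_positive_rate": len(false_positives) / len(benign) if benign else None,
        "missed": missed,                    # coverage gaps to investigate
        "false_positives": false_positives,  # benign software that triggered alerts
    }
```

Comparing two such reports, one before and one after an artifact update, is one simple way to spot a regression before the update ships.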

Threat research

Some of our security researchers scour the internet daily for new and emerging threats. By giving them an easy-to-use platform to test malicious software they find in the wild, we better understand how Elastic Security defends against those threats or if we need to update our protections.

Evaluating new protections

Beyond testing existing protections, we also need to evaluate new ones, which run the risk of adverse interactions with our existing suite of layered capabilities. A new protection may be easily tested on its own, but such tests may hide unintended interactions or conflicts with existing protections. Detonate provides a way for us to customize the configuration of the Elastic Stack and individual protections to more easily find and identify such conflicts earlier in development.

What’s next?

In this publication, we introduced Detonate and what we use it for at Elastic. We discussed the benefits it provides our team when assessing the performance of our security artifacts.

Now that you know what it is, we will break down how Detonate works. In our next post, we’ll dive deeper into the technical implementation of Detonate and how we’re able to create this sandboxed environment in practice.