CI/CD pipeline abuse: the problem no one is watching

Preamble

In 2025 and 2026, we watched a pattern play out across the industry. Attackers stopped going after production servers directly and started targeting the automation that deploys to them. All it takes is compromised developer credentials and a modified workflow file, and suddenly every secret in a CI/CD environment is streaming to an attacker-controlled endpoint. The same sequence recurred in incidents involving major open-source projects, Fortune 500 companies, and critical infrastructure tooling.

The attack chain is deceptively simple:

Stolen developer credentials → Modified workflow file → Harvested CI secrets → Lateral movement to cloud and production

Today we are open-sourcing cicd-abuse-detector, a drop-in CI template that uses regex-based signal extraction and LLM analysis to detect suspicious changes to CI/CD pipelines. It works across GitHub Actions, GitLab CI, and Azure DevOps, and is designed around the real-world attack techniques documented in public security research.

Key takeaways

CI/CD environments are high-value targets because a single compromised workflow can exfiltrate cloud credentials, package registry tokens, code signing keys, deploy keys, and OIDC tokens simultaneously.
The tool extracts 50+ regex and metadata signals from diffs, then passes them with the full diff to Claude for structured threat analysis. No Python, no dependencies beyond bash and the Claude Code CLI.
Detection patterns were tested against offensive toolkits like Nord Stream and Gato-X, and against real incidents including ArtiPACKED and HackerBot-Claw.
The project ships with 19 malicious and 4 benign example diffs modeled after specific incidents, and an automated test suite that validates every signal.

Why CI/CD pipelines are a top target

If you spend time reviewing GitHub Actions or GitLab CI configurations, you might notice how much trust is concentrated in these files. A typical deployment workflow has access to AWS credentials, npm publish tokens, Docker Hub passwords, and a GitHub token with write permissions, all at the same time. The attack surface isn't a server with a CVE; it's a YAML file.

Credential harvesting at scale

An attacker with stolen developer credentials modifies a workflow to exfiltrate secrets available in the CI environment. The GhostAction campaign in September 2025 demonstrated this at scale, compromising 327 GitHub users across 817 repositories. In total, 3,325 secrets were stolen through injected workflow files that POSTed credentials to attacker endpoints.

The Shai-Hulud npm worm went further. This self-propagating attack harvested GitHub Personal Access Tokens via gh auth token, ran TruffleHog for secret reconnaissance, and used compromised tokens to silently inject malicious code into other packages owned by the same developer. Over 46,000 malicious packages were published in the first wave alone.

Privileged trigger exploitation

The pull_request_target trigger is one of the most dangerous features in GitHub Actions. Unlike a regular pull_request trigger, it runs workflows in the context of the base repository with access to secrets, but it can execute code from an untrusted fork. The Orca "Pull Request Nightmare" research demonstrated this against repositories maintained by Google, Microsoft, and NVIDIA.

In February 2026, an automated campaign called HackerBot-Claw systematically scanned public repositories for this exact misconfiguration. It used five exploitation techniques: poisoned Go init() functions, branch name command injection, filename-based injection, direct script injection, and AI prompt injection against Claude-based code reviewers. In the most severe case, Aqua Security's Trivy repository was fully compromised, leading to a downstream supply chain attack that exposed 33,000 secrets across nearly 7,000 machines. As documented, this supply chain attack was possible because the compromised tokens were still valid weeks after they were first stolen.

The rest of the taxonomy

Beyond credential harvesting and trigger exploitation, the threat model covers four additional categories that appear consistently in public research:

Permission escalation, where adding permissions: write-all or id-token: write broadens the blast radius of any compromise
Runner targeting, redirecting jobs to self-hosted runners that often have network access to internal infrastructure, or specifying attacker-controlled container images
Supply chain manipulation through mutable action references (using @main instead of SHA-pinned versions), remote script execution (curl | bash), lockfile registry swaps, and dependency poisoning
Defense evasion via commit timestamp manipulation, making malicious files appear old and trusted. KL4R10N documented this technique in DPRK-linked campaigns where backdated commits reference infrastructure that did not exist at the claimed date

Each of these maps to specific MITRE ATT&CK techniques: T1552 (Unsecured Credentials), T1195 (Supply Chain Compromise), T1070.006 (Timestomp), and T1059 (Command and Scripting Interpreter).
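The defense-evasion category lends itself to a simple mechanical pre-check. The following is a minimal sketch, not the detector's actual logic: the function name `is_backdated` and the 30-day threshold are assumptions made for illustration. It compares the date a commit claims against the time the CI system first observed it.

```bash
# Hypothetical pre-check for backdated commits (names and threshold are assumed).
MAX_SKEW_SECONDS=$((30 * 24 * 60 * 60))   # assumed threshold: 30 days

is_backdated() {  # is_backdated CLAIMED_EPOCH OBSERVED_EPOCH
  # Exit 0 when the claimed date is more than MAX_SKEW_SECONDS older than
  # the time the commit was actually observed (for example, push time).
  local claimed="$1" observed="$2"
  [ $((observed - claimed)) -gt "$MAX_SKEW_SECONDS" ]
}
```

In a pipeline, the claimed date could come from `git log -1 --format=%at HEAD` and the observed time from `date +%s`. Rebases and long-lived branches also produce old author dates, so a hit here is best treated as one advisory signal rather than a verdict.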

How the detector works

We wanted the templates to work without requiring Python, custom runtimes, or complex dependencies. Everything runs with standard shell utilities on a default ubuntu-latest runner, and the only installed tool is the Claude Code CLI via npm, which handles authentication, retries, and model routing.

Stage 1: Filter and diff

When a pull request is opened (or a push lands on a protected branch), the workflow identifies changed files across three tiers of CI/CD-relevant paths. The first tier covers core CI files like workflow definitions, pipeline configs, and Makefiles. The second covers build and release artifacts like Dockerfiles, package manifests, lockfiles, and signing or deploy scripts. The third tier picks up developer environment configs like .vscode/tasks.json and .devcontainer files.
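A tiered path filter like the one described above can be expressed as a single `case` statement. This is a rough sketch under assumptions: the function name `classify_ci_path` and the specific globs (for example `.gitlab-ci.yml`, `azure-pipelines.yml`, `package.json`) are illustrative stand-ins, not the project's actual path list.

```bash
# Hypothetical tier classifier; the glob list is an illustrative subset.
classify_ci_path() {  # classify_ci_path PATH -> prints tier1, tier2, tier3, or none
  case "$1" in
    # Tier 1: core CI files (workflow definitions, pipeline configs, Makefiles)
    .github/workflows/*|.gitlab-ci.yml|azure-pipelines.yml|Makefile|*/Makefile)
      echo tier1 ;;
    # Tier 2: build and release artifacts (Dockerfiles, manifests, lockfiles)
    Dockerfile|*/Dockerfile|package.json|*/package.json|package-lock.json|*/package-lock.json)
      echo tier2 ;;
    # Tier 3: developer environment configs
    .vscode/tasks.json|.devcontainer/*)
      echo tier3 ;;
    *)
      echo none ;;
  esac
}
```

Feeding it the output of `git diff --name-only` one path at a time would yield the set of files worth diffing; anything classified as `none` is skipped.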

Each file is diffed individually and capped at 10,000 characters. We do this per-file rather than globally because a single cap on the combined diff is a bypass vector. An attacker can pad a malicious workflow change with a large benign Dockerfile edit to push the exploit past the character limit.
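To make the per-file cap concrete, here is a minimal sketch of how it might look in shell. The helper names `cap_diff` and `collect_diffs` are hypothetical, and `head -c` counts bytes rather than characters, which is an approximation of the limit described above.

```bash
# Hypothetical helpers illustrating a per-file (not global) diff cap.
MAX_DIFF_CHARS=10000   # per-file limit from the text; enforced here in bytes

cap_diff() {
  # Reads one file's diff on stdin; writes at most MAX_DIFF_CHARS bytes.
  head -c "$MAX_DIFF_CHARS"
}

collect_diffs() {  # collect_diffs BASE_REF TIP_REF
  # Diffs each changed file separately so padding in one file cannot
  # push another file's changes past the limit.
  local base="$1" tip="$2" file
  git diff --name-only "$base" "$tip" | while IFS= read -r file; do
    printf '=== %s ===\n' "$file"
    git diff "$base" "$tip" -- "$file" | cap_diff
    printf '\n'
  done
}
```

Because the cap is applied inside the loop, a 50 KB Dockerfile edit consumes only its own budget, and a small workflow change in the same pull request still reaches the analysis stage in full.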

Stage 2: Signal extraction

Before the LLM sees anything, 50+ regex patterns scan each diff for known-dangerous constructs. These signals are advisory. They never gate the analysis, but they provide the LLM with a pre-screened threat summary. A few examples:

Signal	Pattern	What it catches
`secrets_context`	`${{.*secrets.`	Direct secret interpolation in workflows
`pull_request_target`	`pull_request_target`	The dangerous trigger that grants secrets to PR code
`checkout_ref`	`ref:.*github.event.pull_request.head.(sha\|ref)`	Untrusted PR code checked out in a privileged context
`double_base64`	`base64.*\|.*base64`	Double-encoding to evade log masking (Nord Stream technique)
`ld_preload`	`LD_PRELOAD`	Arbitrary code execution via environment variable injection
`vscode_auto_task`	`runOn.*folderOpen`	VS Code task that executes on folder open (Contagious Interview)

The signal list is based on real adversarial tooling, including Nord Stream and Gato-X, and tested against 19 malicious example diffs modeled after specific incidents.
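The extraction step can be approximated with `grep -E` over the added lines of a diff. The sketch below is illustrative only: the function names are hypothetical, it covers just five of the signals from the table, and the regexes use bracket expressions (such as `[|]`) in place of backslash escapes for portability.

```bash
# Sketch only: function names and the signal subset are illustrative.

signal_fires() {  # signal_fires REGEX TEXT -> exit 0 if REGEX matches any line
  printf '%s\n' "$2" | grep -Eq -- "$1"
}

extract_signals() {
  # Reads a unified diff on stdin; prints one signal name per line.
  local added name regex
  # Scan only added lines ("+..."), skipping the "+++ b/path" file headers.
  added=$(grep -E '^\+' | grep -Ev '^\+\+\+ ' || true)

  # Each heredoc line is "NAME REGEX"; the regex is everything after the name.
  while read -r name regex; do
    if signal_fires "$regex" "$added"; then
      printf '%s\n' "$name"
    fi
  done <<'EOF'
pull_request_target pull_request_target
secrets_context [$][{][{].*secrets[.]
double_base64 base64.*[|].*base64
ld_preload LD_PRELOAD
vscode_auto_task runOn.*folderOpen
EOF
}
```

Running `git diff BASE HEAD -- .github/workflows/ci.yml | extract_signals` would print the names of any matching signals. Restricting the scan to added lines means a pull request that removes a dangerous pattern does not trigger it.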

The detector runs identically across GitHub Actions, GitLab CI, and Azure DevOps.

Stage 3: LLM analysis

The signal summary, full diff, author profile, and commit metadata are bundled and sent to Claude via the Claude Code CLI. The analysis prompt walks the model through several areas:

Diff comprehension and per-file risk assessment
Signal interpretation with context (a signal alone is not a verdict)
Temporal analysis for backdated commits
Author trust assessment using account age, contribution history, and org membership
Severity calibration against a signal combination table with 60+ entries
False positive recognition (e.g., cURL for downloading known tools is not exfiltration)
Concrete, actionable recommendations ("Pin actions/setup-node@main to a specific SHA" instead of "review carefully")

The output is a structured JSON verdict containing severity, confidence, reasoning, evidence, and recommendations, all validated against a JSON Schema.

Stage 4: Alert and gate

Based on the verdict severity, the workflow posts a step summary, creates an issue, sends a Slack notification, and optionally fails the PR check if severity meets a configured threshold.
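The gating decision reduces to an ordered comparison between the verdict's severity and the configured threshold. The following sketch assumes a four-level scale (`low`, `medium`, `high`, `critical`) and hypothetical function names; the project's actual severity labels and configuration variable may differ.

```bash
# Hypothetical gate: severity labels and function names are assumptions.
severity_rank() {  # severity_rank LABEL -> prints a numeric rank (0 = unknown)
  case "$1" in
    low)      echo 1 ;;
    medium)   echo 2 ;;
    high)     echo 3 ;;
    critical) echo 4 ;;
    *)        echo 0 ;;
  esac
}

should_fail_check() {  # should_fail_check VERDICT_SEVERITY THRESHOLD
  # Exit 0 (fail the PR check) when the verdict meets or exceeds the threshold.
  [ "$(severity_rank "$1")" -ge "$(severity_rank "$2")" ]
}
```

A workflow step could then end with `if should_fail_check "$severity" "$threshold"; then exit 1; fi`. Note that an unrecognized threshold ranks as 0 in this sketch, so every verdict would fail the check; a real implementation would want to validate the configured value first.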

Alerts in Slack and GitHub Issues solve the immediate notification problem, but they don't give you a queryable history. Every verdict the detector produces (e.g., benign, suspicious, or malicious) can optionally ship to Elasticsearch as a structured document in the logs-cicd.abuse-default data stream. The workflow ships the verdict along with CI/CD metadata (platform, repository, actor, event type, run URL) into a single index that spans all three supported platforms.

This is where cross-platform correlation becomes practical. A GitHub Actions alert and a GitLab CI alert from the same actor land in the same data stream, queryable in a single ES|QL statement:

FROM logs-cicd.abuse-*
| WHERE verdict.verdict IN ("malicious", "suspicious") AND @timestamp > NOW() - 7 days
| EVAL platform = cicd.platform, repo = cicd.repository, actor = cicd.actor, severity = verdict.severity
| KEEP @timestamp, platform, repo, actor, severity
| SORT @timestamp DESC

The schema includes cicd.platform, cicd.repository, cicd.actor, and the full verdict object (verdict, severity, confidence, summary, reasons, evidence), making it straightforward to build detection rules. For example, rules can correlate a coordinated campaign that hits multiple repos within an hour, a repeat offender flagged across platforms, or a spike in critical findings that warrants an incident response page.

Validating against real attacks

To validate coverage, we compared our detection patterns against the actual source code of offensive tools, published research, and public post-mortems.

Nord Stream: verbatim payload matching

Nord Stream is Synacktiv's open-source CI/CD secret extraction tool supporting GitHub, GitLab, and Azure DevOps. We pulled the YAML generator source (nordstream/yaml/github.py) and compared its output templates against our example diffs.

The GitHub payload template uses env -0 | awk -v RS='\0' '/^secret_/ {print $0}' | base64 -w0 | base64 -w0. Our nord-stream-pipeline-exfil.diff contains this line verbatim, and our double_base64, env_null_dump, and env_secret_grep signals all fire.
The OIDC Azure template uses azure/login@v1 with id-token: write permissions followed by az account get-access-token | base64 -w0 | base64 -w0. Our diff captures this exact flow and triggers cloud_auth_action and id_token_write.
The Azure DevOps pipeline techniques (addSpnToEnvironment for SPN credential exposure, DownloadSecureFile for secure file theft, SSH task source patching via ssh.js modification) are all present in nord-stream-azure-devops.diff and detected by platform-specific signals.

ArtiPACKED: the artifact race condition

The ArtiPACKED research from Palo Alto Unit 42 showed that uploading the entire checkout directory as an artifact leaks the .git/config file containing the GITHUB_TOKEN. With the v4 artifact API allowing mid-run downloads, an attacker can extract and use the token before the job completes.

Our artifact-token-leak.diff models this exact pattern, using upload-artifact with path: . (the entire workspace). The upload_artifact signal catches it, and the LLM evaluates whether the upload scope includes the .git directory.

GITHUB_ENV injection: LD_PRELOAD to RCE

Legit Security's research on Google Firebase and Apache showed that writing untrusted input to $GITHUB_ENV allows an attacker to set arbitrary environment variables like LD_PRELOAD and NODE_OPTIONS, achieving code execution in privileged workflows.

Our github-env-injection.diff reproduces this technique with three distinct payloads: LD_PRELOAD pointing to a malicious shared object, NODE_OPTIONS with a --require injection, and $GITHUB_PATH manipulation. The github_env_write, ld_preload, and github_path_write signals all trigger as expected.

Contagious Interview: IDE config as initial access

The Contagious Interview campaign attributed to DPRK targets developers through fake job interviews, distributing repositories with .vscode/tasks.json files that auto-execute on folder open. The presentation is hidden (reveal: never, echo: false), and the payload uses curl | node for silent execution.

Our ide-config-poisoning.diff captures the full attack chain, including the auto-execute trigger (runOn: folderOpen), the hidden presentation, the curl | node payload, the files.exclude entry that hides the .vscode directory, and a trojanized postinstall hook with base64-encoded URLs and eval() for code execution. Six signals pick this up at once.

Defensive recommendations

Beyond deploying the detector, here are some hardening measures that came directly out of the attack patterns we studied:

Pin all actions to a full commit SHA, not to tags or branches. SHA-pinned references prevent retroactive tag modification attacks like tj-actions (CVE-2025-30066).
Scope secrets to individual steps rather than using job-level environment variables. Each step should only have access to the secrets it actually needs.
Use short-lived, ephemeral tokens when possible to reduce the window in which a stolen token is useful.
Avoid pull_request_target unless strictly necessary. If you must use it, never check out the PR head code in the same workflow. Use a separate workflow_run-triggered workflow for operations that need both secrets and PR context.
Set explicit permissions on every workflow because the default token permissions are far too broad. Set permissions: {} at the workflow level and add specific permissions per job.
Set persist-credentials: false on checkout, since the default behavior of actions/checkout persists the GITHUB_TOKEN in the .git directory. If you upload artifacts, this token goes with them.
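The SHA-pinning recommendation is easy to audit mechanically. Below is a minimal sketch, with a hypothetical function name, that lists `uses:` references in a workflow file that are not pinned to a 40-character hexadecimal commit SHA. It is a line-based heuristic rather than a YAML parser, and it would also flag `docker://` image references that carry a digest.

```bash
# Hypothetical audit helper: flags action references not pinned to a commit SHA.
find_unpinned_actions() {
  # Reads workflow YAML on stdin; prints each "uses: owner/repo@ref" line
  # whose ref is not a 40-character lowercase hex SHA.
  # Local actions ("uses: ./path") have no "@ref" and are ignored.
  grep -E '^[[:space:]-]*uses:[[:space:]]*[^[:space:]]+@' \
    | grep -Ev '@[0-9a-f]{40}([[:space:]]|$)' || true
}
```

For example, `find_unpinned_actions < .github/workflows/ci.yml` would print a line such as `- uses: actions/checkout@main` while staying silent on references pinned to a full SHA, including those followed by a version comment.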

Summary

CI/CD pipelines have become a major attack surface for supply chain compromise. The same automation that makes modern software delivery possible is what attackers exploit to harvest credentials, poison packages, and pivot to cloud infrastructure. Traditional code review doesn't catch these patterns well because they're subtle, platform-specific, and designed to look like legitimate DevOps changes.

Combining regex-based signal extraction with LLM reasoning lets us surface these patterns at the pull request stage, before they reach production. The repo includes the full threat model, test suite, and example diffs if you want to dig into the details or adapt it to your own environment.

To get started, check out the cicd-abuse-detector repo for setup instructions, the full threat model, and example diffs. We're always interested in hearing about new attack patterns and detection ideas. Chat with us in our community Slack, and ask questions in our Discuss forums.

CI/CD abuse through MITRE ATT&CK

We use the MITRE ATT&CK framework to map the tactics, techniques, and procedures that adversaries use against CI/CD pipelines.

Tactics

Tactic	CI/CD Relevance
Credential Access (TA0006)	Harvesting secrets from CI environments
Execution (TA0002)	Running commands in pipeline runners
Persistence (TA0003)	Scheduled triggers, cron-based workflows
Defense Evasion (TA0005)	Commit timestamp manipulation, log masking evasion
Initial Access (TA0001)	Compromised developer credentials, phishing for PATs
Lateral Movement (TA0008)	Using harvested cloud credentials to pivot

Techniques

Technique	CI/CD Application
T1552: Unsecured Credentials	Secrets exposed in CI environment variables, artifacts, and runner memory
T1195.002: Compromise Software Supply Chain	Poisoned actions, dependencies, and lockfiles
T1059: Command and Scripting Interpreter	Remote script execution (curl | bash) in pipeline steps
T1070.006: Timestomp	Backdated commit dates to evade review
T1098: Account Manipulation	Permission escalation via write-all, id-token: write
T1078: Valid Accounts	Stolen developer PATs used to modify workflows

About Elastic Security Labs

Elastic Security Labs is the threat intelligence branch of Elastic Security dedicated to creating positive change in the threat landscape. Elastic Security Labs provides publicly available research on emerging threats with an analysis of strategic, operational, and tactical adversary objectives, then integrates that research with the built-in detection and response capabilities of Elastic Security.

Follow Elastic Security Labs on Twitter @elasticseclabs and check out our research at www.elastic.co/security-labs/.