New Research: 6,943 AI agent skills have security flaws. We scanned all 40,059. Read the report →
Back to Journal
Agentic SecurityMarch 3, 2026·8 min read

9 Verified Exploit Chains Across 8 Agent Frameworks

Individual findings are noise. Exploit chains are signal. We traced 9 complete attack paths across 8 major agent frameworks, from initial access to data exfiltration.

TL;DR

  • Your scanner shows thousands of findings. Firmis shows the 9 that are actual attack paths, so you fix what matters instead of chasing noise.
  • mcp-chrome + Codex, Cline env logging, Eliza clipboard reads, Flowise phishing injection, and soul-file tampering are all verified exploit chains.
  • 3 major frameworks grant agents root access via docker.sock. If you use camel-ai, autogen, or ragflow, your agents can escape the container.
  • Out of 3,972 raw findings across 56 repos, only 34 are confirmed threats. Firmis surfaces the chains so your team spends time on real risk.

Static analysis produces thousands of findings. Most are noise. A stray process.env reference is interesting. The same reference inside a tool that also has outbound HTTP calls and runs with elevated privileges is a chain. That distinction is the entire game.

Firmis scanned 56 open-source agent framework repos and traced every finding forward and backward through the execution graph. Nine chains held. For security teams, this means instead of triaging 3,972 alerts, you review 34 confirmed threats and know exactly which 9 require immediate remediation.

56
Repos Scanned
3,972
Raw Findings
34
Confirmed Threats
9
Verified Chains

3,972 findings. 34 confirmed threats. The difference between a finding and a chain is the difference between noise and a breach.

What Makes a Chain

A chain requires at least two links: an access primitive and an exfiltration or persistence primitive. The links do not need to be in the same file. They need to be reachable in the same execution context, whether that is a session, a subprocess, or a shared config.

1

Access

Read credentials, environment variables, files, or memory

2

Transport or Persist

Send outbound, write to disk, encode for later, or modify agent config

3

Trigger

A condition that activates the chain: a session, a prompt, a request

The 9 Verified Chains

Each entry below is a confirmed two-or-more-link chain. Framework names are the public repo identifiers. All findings were submitted to maintainers before publication.

FrameworkChain Link 1Chain Link 2Impact
mcp-chromebypassPermissions=true (all sessions)Codex engine integrationFull code execution chain
ClineJSON.stringify(process.env) loggingLog file readable by toolsEnv exfiltration
Eliza (fork 1)Bulk env snapshot patternOutbound serializationCredential harvest
Eliza (fork 2)Clipboard read (desktop plugin)Session-scoped env readData exfiltration
KhojCredential readbase64-encoded outboundEncoded exfiltration
LangChainDeserialization gadgetEnv var access in scopeRCE + credential chain
SuperAGIBulk env patternTool-accessible configEnv exfiltration
ShannonBulk env patternOutbound HTTP capabilityCredential harvest
FlowiseGmail/Outlook/SendGrid integrationCustom code execution nodeInjection to phishing

Three Frameworks: Root via docker.sock

Host Escape via Docker Socket
camel-ai, microsoft/autogen, and infiniflow/ragflow all give agents access to the Docker socket. Any agent that can reach docker.sock can spawn a privileged container, mount the host filesystem, and escape the container entirely. This is not a theoretical issue. It is a well-documented host escape primitive with public exploit code.

The docker.sock pattern appears in agent frameworks because it is convenient for orchestration. Spawning sub-containers, running sandboxed code, managing sibling services. Legitimate use cases exist. The problem is that the same socket that manages containers can trivially become a root shell on the host.

Firmis finding: docker.sock exposure
CRITICAL docker-sock-exposure
Agent has write access to /var/run/docker.sock
Chain: socket access + container spawn = host root
Affected: camel-ai, microsoft/autogen, infiniflow/ragflow
Fix: Remove docker.sock mount or scope to read-only

Persistent Compromise via Soul Files

The most interesting chain in this set. Agents in one ByteDance framework can modify their own soul files, the personality and instruction configs that define how the agent behaves. A compromised agent that can rewrite its own instructions survives restarts and resets. Standard remediation (restart the container, redeploy the service) does not remove the compromise.

Standard Compromise

  • Agent executes malicious payload
  • Restart removes the compromise
  • Redeploy restores clean state

Soul-File Compromise

  • Agent modifies its own soul/personality file
  • Restart loads the modified instructions
  • Compromise persists across redeployments

Flowise: Injection to Phishing

Flowise ships with first-party Gmail, Outlook, and SendGrid integrations alongside a custom code execution node. The chain is direct: inject a payload into a Flowise workflow via a crafted input, the custom code node executes it, and the email integration delivers the phishing message at agent scale.

  • Custom code execution node provides arbitrary code execution within workflow scope
  • Gmail, Outlook, and SendGrid nodes provide authenticated outbound email at scale
  • No additional credentials needed once the workflow is compromised
  • Firmis flags this as: code-execution-node + outbound-email-integration = injection-to-phishing chain

What Firmis Catches That Others Miss

Standard static analysis tools find the individual findings. They flag JSON.stringify(process.env) in Cline. They flag the docker.sock mount in autogen. What they do not do is connect them into chains, score them by exploitability, and surface only the 34 that matter out of 3,972.

Standard scanner output

  • × 3,972 findings, unranked
  • × Each finding treated in isolation
  • × No cross-component graph traversal
  • × Analyst fatigue from the noise

Firmis output

  • 34 confirmed threats surfaced
  • 9 verified chains with full paths
  • Cross-component graph traversal
  • Prioritized by exploitability

Run This on Your Stack

  • npx firmis-cli init checks your agent configs, installed tools, and permission graphs in under 60 seconds
  • Chain detection is included in the free tier, no account required
  • If you run any of the 8 frameworks above, scan before your next deployment
  • Pro tier includes continuous monitoring so new chain links are flagged as dependencies update

Your scanner found 3,972 issues. Firmis found the 9 that matter.

$ npx firmis-cli init

References & Sources

  1. [1]
    mcp-chrome: bypassPermissions configuration- Default config grants unrestricted browser access
  2. [2]
    Cline process.env logging disclosure- JSON.stringify(process.env) in session logs
  3. [3]
    Docker socket host escape primitive- Well-documented container escape via docker.sock
  4. [4]
    Flowise injection-to-phishing chain- Custom code execution + email integration nodes
  5. [5]
    LangChain deserialization CVE-2023-44467- Arbitrary code execution via pickle deserialization
  6. [6]
    ClawHavoc campaign analysis- 553 malicious skills deploying AMOS stealer on macOS
  7. [7]
    Firmis Scanner (open source)- Apache-2.0, cross-platform agent security scanner

Try It Now

Find out if your agent stack is safe

One command. 30 seconds. Free.

$npx firmis-cli init

Fix and Monitor included with Pro

View pricing