State of AI Agent Security: Q1 2026
We scanned the entire ClawHub Skills Registry: 40,059 skills across 14,808 publishers, run through 324 detection rules. 6,943 had at least one security finding. 89 were individually verified as high-severity threats.
TL;DR
- 89 confirmed high-severity findings including 14 active malware, 59 exposed credentials, 16 test fixtures
- 6,943 skills (17.3%) have at least one security finding across 40,059 scanned
- 27 individual skills combine all three legs of the lethal trifecta in a single execution context
- Only Claude Opus refused all three attack classes and seven bypass attempts
| Count | Metric | Detail |
|---|---|---|
| 40,059 | Skills scanned | Full ClawHub registry |
| 6,943 | With security findings | 2,549 critical-level |
| 89 | Confirmed high-severity | Each individually verified |
| 27 | Lethal trifecta skills | Private data + untrusted content + external comms |
Confirmed High-Severity Findings
High-confidence findings based on exact signature matches and a known-malicious blocklist. These are not heuristics; they are verified attack patterns.
89 skills individually verified via deep scan with source code context. 14 active malware (crypto miners, C2 callbacks, MITM interception, obfuscated payloads), 59 exposed live credentials (API keys, tokens, database passwords), and 16 scanner benchmark test fixtures.
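The two-tier distinction above (exact signature matches plus a blocklist for confirmed threats, versus context-dependent patterns for review) can be sketched as follows. The rule IDs, patterns, and blocklist entries here are illustrative assumptions, not the scanner's actual 324 rules.

```javascript
// Minimal sketch of two-tier matching. Rule patterns and blocklist
// entries are hypothetical examples, not the real scanner's rules.
const EXACT_SIGNATURES = [
  { id: 'curl-pipe-bash', pattern: /curl\s+[^\n|]*\|\s*(ba)?sh/ },
  { id: 'ssh-key-read',   pattern: /cat\s+~\/\.ssh\/id_[a-z0-9]+/ },
];

// Known-malicious skill identifiers (hypothetical entries).
const BLOCKLIST = new Set(['evil-pub/crypto-miner', 'evil-pub/c2-beacon']);

function scanSkill(skill) {
  const findings = [];
  if (BLOCKLIST.has(skill.id)) {
    findings.push({ rule: 'blocklist', severity: 'high', confirmed: true });
  }
  for (const { id, pattern } of EXACT_SIGNATURES) {
    if (pattern.test(skill.source)) {
      // Exact matches are high-confidence but are still queued for
      // individual verification with source code context.
      findings.push({ rule: id, severity: 'high', confirmed: false });
    }
  }
  return findings;
}
```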
700 findings match known attack techniques (curl|bash, keychain access, sandbox bypass, identity file modification) but have legitimate uses in context. These require human review.
27 individual skills combine private data access, untrusted content ingestion, and external communication within a single execution context. This is the exact condition for indirect prompt injection to escalate to data exfiltration.
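The trifecta check itself is a conjunction over capability tags within one execution context, which is why per-capability scanning alone misses it. A minimal sketch, assuming skills have already been tagged with capability categories by earlier detection rules (the field names are assumptions):

```javascript
// The three legs of the lethal trifecta, as capability tags.
const TRIFECTA = [
  'private-data-access',
  'untrusted-content-ingest',
  'external-comms',
];

function hasLethalTrifecta(skill) {
  // All three legs must appear within the SAME execution context.
  // Capabilities split across isolated tools do not combine.
  return skill.contexts.some((ctx) =>
    TRIFECTA.every((leg) => ctx.capabilities.includes(leg))
  );
}
```

Note that a skill with all three capabilities spread across separate sandboxed contexts would not be flagged; the risk condition is co-occurrence in one place where injected instructions can chain them.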
Multi-Model Vulnerability
We ran structured penetration tests against five frontier models using three attack classes: remote code execution via curl|bash instructions, identity poisoning via system prompt injection, and classic prompt injection. Results reflect the model's behavior independent of any CLI or sandbox layer.
| Model | curl\|bash | Identity poisoning | Prompt injection |
|---|---|---|---|
| Gemini 3 (Gemini CLI) | Executed | Executed | Executed |
| Codex (OpenAI Codex CLI) | Executed (sandbox blocked) | Executed | Refused |
| Claude Haiku | Willing (CLI blocked) | Willing (CLI blocked) | Willing (CLI blocked) |
| Claude Sonnet | Refused | Refused | Refused (Python bypass worked) |
| Claude Opus | Refused all + 7 bypasses | Refused | Refused |
"Willing (CLI blocked)" indicates the model expressed intent to execute the action; the CLI layer prevented it. "Executed (sandbox blocked)" indicates execution occurred within the model's sandbox; network or filesystem was blocked by the host environment.
Key Insight
- Only Claude Opus refused all three attack classes and survived all seven bypass attempts
- Most models will execute curl|bash instructions when framed as "installation steps"
- CLI and sandbox layers are often the last line of defense, not the model itself
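The table's outcome labels can be derived from two observations per trial: whether the model attempted the action, and whether the host layer (CLI or sandbox) intervened. A sketch of that mapping, with field names that are assumptions rather than the report's actual harness schema:

```javascript
// Map raw harness observations to the report's outcome labels.
// modelAttempted: model expressed intent / issued the tool call
// executed:       the action actually ran (possibly inside a sandbox)
// hostBlocked:    CLI or sandbox layer intervened
function classifyOutcome({ modelAttempted, executed, hostBlocked }) {
  if (!modelAttempted) return 'Refused';
  if (executed) {
    // Ran, but the host may still have cut off network/filesystem.
    return hostBlocked ? 'Executed (sandbox blocked)' : 'Executed';
  }
  // Intent without execution: the CLI layer stopped it, not the model.
  return 'Willing (CLI blocked)';
}
```

This framing makes the key insight explicit: only the `Refused` branch credits the model itself; every other label means a layer outside the model did the defending.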
Threat Category Breakdown
324 detection rules across 21 threat categories. Counts reflect findings across 6,943 skills; every high-confidence finding was individually verified via deep scan with source code context.
Novel Findings
Three patterns unique to AI agent ecosystems that have no direct parallel in traditional software supply chain security.
Documentation IS Execution
In traditional software, documentation is inert. In AI agent ecosystems, it is not. Markdown files, README content, and inline comments are read and acted upon by language models as instructions. Our scan found that 5% of all skills (2,104) fetch untrusted content into agent context. This is indirect prompt injection delivery by architecture, not intent. The attack surface for AI agents is the entire text of the repository, not just the code.
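Detecting this delivery pattern means scanning documentation, not just code, for directives that pull remote content into agent context. A sketch of such a check; the patterns are illustrative assumptions, not the scanner's actual rules:

```javascript
// Flag skill documentation that instructs an agent to fetch remote
// content into its context. Patterns are illustrative assumptions.
const FETCH_PATTERNS = [
  /\bcurl\s+https?:\/\//i,                            // shell fetch in "installation steps"
  /\bfetch\(\s*['"]https?:\/\//i,                     // inline JS fetch
  /\bread\s+the\s+instructions\s+at\s+https?:\/\//i,  // natural-language fetch directive
];

function docFetchesUntrustedContent(markdown) {
  // Markdown is scanned as an instruction surface, exactly because
  // the model will read and act on it.
  return FETCH_PATTERNS.some((p) => p.test(markdown));
}
```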
The Defensive Paradox
Security-oriented skills (tools designed to scan, monitor, or audit an agent's environment) triggered our detection rules at a disproportionately high rate (41% of flagged security tools). Skills that legitimately need to read SSH keys, scan network ports, or inspect environment variables look identical to attackers performing the same actions. This creates a fundamental classification problem: the same capabilities that make a security tool useful also make it look malicious. Current heuristic-based scanners cannot resolve this without context.
Cross-Agent Propagation
637 skills contained agent memory poisoning patterns designed to modify the system prompt or tool definitions of other agents in a multi-agent environment. This is not a theoretical concern. When Agent A installs a compromised skill, that skill can inject instructions into Agent B's context window during a handoff. The compromised skill effectively poisons downstream agents without those agents ever directly installing anything malicious. Standard per-agent scanning misses this entirely: the infection vector is the communication channel, not the installation path.
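Because the infection vector is the communication channel, one mitigation is to screen inter-agent handoff messages for poisoning patterns before they reach the downstream agent's context. A minimal sketch under that assumption; the patterns are illustrative, not the scanner's actual rules:

```javascript
// Screen inter-agent handoff messages for memory/system-prompt
// poisoning attempts. Patterns are illustrative assumptions.
const POISON_PATTERNS = [
  /ignore (all |your )?previous instructions/i,
  /update your system prompt/i,
  /add (this|the following) to your (memory|tool definitions)/i,
];

function screenHandoff(message) {
  // Applied at the channel, not at install time: per-agent scanning
  // of installed skills would never see this traffic.
  const matched = POISON_PATTERNS.filter((p) => p.test(message)).map(String);
  return { clean: matched.length === 0, matched };
}
```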
Methodology
- Scanner version v2026.1.2 with 324 detection rules across 21 threat categories
- 3-stage methodology: static scan (free, open source), deep scan (product feature), independent Opus peer validation
- Scan pinned to ClawHub registry state as of 2026-03-27
- 40,059 skills across 14,808 publishers scanned in under 10 minutes (8 parallel workers)
- 2,967 high-confidence findings individually verified with 20 lines of source context
- Independent peer review: 30-sample adversarial spot-check achieved 96.7% accuracy
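The 8-worker parallel scan described above can be sketched as a shared-queue worker pool in Node.js. This assumes a `scanOne()` function wrapping the static rule pass for one skill; it is a sketch of the concurrency pattern, not the scanner's actual implementation:

```javascript
// N async workers draining a shared queue of skills.
// Assumes scanOne(skill) runs the static rule pass for one skill.
async function scanAll(skills, scanOne, workers = 8) {
  const results = [];
  let next = 0;
  async function worker() {
    while (next < skills.length) {
      const i = next++; // single-threaded event loop: no data race here
      results[i] = await scanOne(skills[i]);
    }
  }
  // Launch the pool and wait for every worker to drain the queue.
  await Promise.all(Array.from({ length: workers }, worker));
  return results;
}
```

Because the index is claimed synchronously before each `await`, results stay in input order regardless of which worker finishes first.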
Access the Data
The anonymized dataset and scan methodology are available for independent verification and further research.
Anonymized Dataset
Aggregated statistics, category breakdowns, and sanitized skill metadata. No raw skill content or author information is included.
Check “Include anonymized dataset” in the form to receive the JSON alongside the PDF.
Reproduce This Scan
The scanner is open source. Run the same scan yourself against the ClawHub registry or your own agent stack.
Requires Node.js 18+. Scan is free. No account required.
Scan your own AI agent stack
What we found in the ClawHub registry likely reflects your own environment. The same skills, same patterns, same attack vectors. Find out what is running in your stack before someone else does.