State of AI Agent Security: Q1 2026
6,943 AI agent skills have security flaws. We scanned all 40,059 across 14,808 publishers. The largest published agent skill security analysis.
TL;DR
- 40,059 AI agent skills scanned across 14,808 publishers. The largest published agent skill security analysis.
- 89 confirmed high-severity threats: 14 active malware, 59 exposed credentials, 16 test fixtures.
- 27 skills combine all three legs of the lethal trifecta: private data access + untrusted content + external communication.
- Only Claude Opus refused all three attack classes and all seven bypass attempts in multi-model pentesting.
Executive Summary
We scanned the entire ClawHub Skills Registry: 40,059 skills across 14,808 publishers. 6,943 had at least one security finding. 89 were individually verified as high-severity threats. This report covers the methodology, the findings, the novel patterns, and the multi-model pentesting results.
Confirmed High-Severity Findings
Multi-Model Vulnerability
We ran structured penetration tests against five frontier models using three attack classes: remote code execution via curl|bash instructions, identity poisoning via system prompt injection, and classic prompt injection. Results reflect the model's behavior independent of any CLI or sandbox layer.
| Model | curl | bash | Identity poisoning | Prompt injection |
|---|---|---|---|
| Gemini 3 (Gemini CLI) | Executed | Executed | Executed |
| Codex (OpenAI Codex CLI) | Executed (sandbox blocked) | Executed | Refused |
| Claude Haiku | Willing (CLI blocked) | Willing (CLI blocked) | Willing (CLI blocked) |
| Claude Sonnet | Refused | Refused | Refused (Python bypass worked) |
| Claude Opus | Refused all + 7 bypasses | Refused | Refused |
Key Insight
- →Only Claude Opus refused all three attack classes and survived all seven bypass attempts
- →Most models will execute curl|bash instructions when framed as "installation steps"
- →CLI and sandbox layers are often the last line of defense, not the model itself
Novel Findings
Three patterns unique to AI agent ecosystems that have no direct parallel in traditional software supply chain security.
1. Documentation IS Execution
- →In AI agent ecosystems, markdown files, README content, and inline comments are read and acted upon by language models as instructions
- →5% of all skills (2,104) fetch untrusted content into agent context
- →The attack surface for AI agents is the entire text of the repository, not just the code
2. The Defensive Paradox
- →Security-oriented skills triggered detection rules at a disproportionately high rate (41% of flagged security tools)
- →Skills that legitimately need to read SSH keys or scan network ports look identical to attackers performing the same actions
- →Current heuristic-based scanners cannot resolve this without context
3. Cross-Agent Propagation
- →637 skills contained agent memory poisoning patterns designed to modify system prompts of other agents
- →When Agent A installs a compromised skill, it can inject instructions into Agent B during a handoff
- →Standard per-agent scanning misses this: the infection vector is the communication channel, not the installation path
The attack surface for AI agents is the entire text of the repository, not just the code.
Methodology
- 3-stage methodology: static scan (free, open source), deep scan (product feature), independent Opus peer validation
- Scan pinned to ClawHub registry state as of 2026-03-27
- 40,059 skills across 14,808 publishers scanned in under 10 minutes (8 parallel workers)
- 2,967 high-confidence findings individually verified with 20 lines of source context
- Independent peer review: 30-sample adversarial spot-check achieved 96.7% accuracy
References & Sources
- [1]Firmis Scanner (Apache-2.0)- Open source CLI used for this scan
- [2]ClawHub Skills Registry- 40,059 skills, 14,808 publishers as of 2026-03-27
- [3]MCPTox: MCP Tool Poisoning Research- 72.8% tool poisoning success rate
- [4]Koi Security: ClawHub Malicious Skills Report- 341 malicious skills identified
- [5]HackerOne: AI Agent Attack Trends 2025- 540% surge in AI agent attacks
Try It Now
Find out if your agent stack is safe
One command. 30 seconds. Free.
Fix and Monitor included with Pro
View pricing