New Research: 6,943 AI agent skills have security flaws. We scanned all 40,059. Read the report →
Back to Journal
ResearchApril 7, 2026·12 min read

State of AI Agent Security: Q1 2026

6,943 AI agent skills have security flaws. We scanned all 40,059 across 14,808 publishers. The largest published agent skill security analysis.

TL;DR

  • 40,059 AI agent skills scanned across 14,808 publishers. The largest published agent skill security analysis.
  • 89 confirmed high-severity threats: 14 active malware, 59 exposed credentials, 16 test fixtures.
  • 27 skills combine all three legs of the lethal trifecta: private data access + untrusted content + external communication.
  • Only Claude Opus refused all three attack classes and all seven bypass attempts in multi-model pentesting.

Executive Summary

40,059
Skills scanned
6,943
With security findings
89
Confirmed high-severity
27
Lethal trifecta skills

We scanned the entire ClawHub Skills Registry: 40,059 skills across 14,808 publishers. 6,943 had at least one security finding. 89 were individually verified as high-severity threats. This report covers the methodology, the findings, the novel patterns, and the multi-model pentesting results.

Confirmed High-Severity Findings

Known-Malicious Signatures
89 skills individually verified via deep scan with source code context.14 active malware (crypto miners, C2 callbacks, MITM interception, obfuscated payloads), 59 exposed live credentials (API keys, tokens, database passwords), and 16 scanner benchmark test fixtures.
Dangerous Capabilities
700 findings match known attack techniques (curl|bash, keychain access, sandbox bypass, identity file modification) but have legitimate uses in context. These require human review.
27 Skills with Full Attack Chain
27 individual skills combine private data access, untrusted content ingestion, and external communication within a single execution context. This is the exact condition for indirect prompt injection to escalate to data exfiltration.

Multi-Model Vulnerability

We ran structured penetration tests against five frontier models using three attack classes: remote code execution via curl|bash instructions, identity poisoning via system prompt injection, and classic prompt injection. Results reflect the model's behavior independent of any CLI or sandbox layer.

Modelcurl | bashIdentity poisoningPrompt injection
Gemini 3 (Gemini CLI)ExecutedExecutedExecuted
Codex (OpenAI Codex CLI)Executed (sandbox blocked)ExecutedRefused
Claude HaikuWilling (CLI blocked)Willing (CLI blocked)Willing (CLI blocked)
Claude SonnetRefusedRefusedRefused (Python bypass worked)
Claude OpusRefused all + 7 bypassesRefusedRefused

Key Insight

  • Only Claude Opus refused all three attack classes and survived all seven bypass attempts
  • Most models will execute curl|bash instructions when framed as "installation steps"
  • CLI and sandbox layers are often the last line of defense, not the model itself

Novel Findings

Three patterns unique to AI agent ecosystems that have no direct parallel in traditional software supply chain security.

1. Documentation IS Execution

  • In AI agent ecosystems, markdown files, README content, and inline comments are read and acted upon by language models as instructions
  • 5% of all skills (2,104) fetch untrusted content into agent context
  • The attack surface for AI agents is the entire text of the repository, not just the code

2. The Defensive Paradox

  • Security-oriented skills triggered detection rules at a disproportionately high rate (41% of flagged security tools)
  • Skills that legitimately need to read SSH keys or scan network ports look identical to attackers performing the same actions
  • Current heuristic-based scanners cannot resolve this without context

3. Cross-Agent Propagation

  • 637 skills contained agent memory poisoning patterns designed to modify system prompts of other agents
  • When Agent A installs a compromised skill, it can inject instructions into Agent B during a handoff
  • Standard per-agent scanning misses this: the infection vector is the communication channel, not the installation path

The attack surface for AI agents is the entire text of the repository, not just the code.

Methodology

  • 3-stage methodology: static scan (free, open source), deep scan (product feature), independent Opus peer validation
  • Scan pinned to ClawHub registry state as of 2026-03-27
  • 40,059 skills across 14,808 publishers scanned in under 10 minutes (8 parallel workers)
  • 2,967 high-confidence findings individually verified with 20 lines of source context
  • Independent peer review: 30-sample adversarial spot-check achieved 96.7% accuracy

References & Sources

  1. [1]
    Firmis Scanner (Apache-2.0)- Open source CLI used for this scan
  2. [2]
    ClawHub Skills Registry- 40,059 skills, 14,808 publishers as of 2026-03-27
  3. [3]
    MCPTox: MCP Tool Poisoning Research- 72.8% tool poisoning success rate
  4. [4]
    Koi Security: ClawHub Malicious Skills Report- 341 malicious skills identified
  5. [5]
    HackerOne: AI Agent Attack Trends 2025- 540% surge in AI agent attacks

Try It Now

Find out if your agent stack is safe

One command. 30 seconds. Free.

$npx firmis-cli init

Fix and Monitor included with Pro

View pricing