6 Critical Security Blind Spots in Anthropic Skills You Must Know
When developers pull an Anthropic Skill from marketplaces like ClawHub or skills.sh, they assume security scanners have vetted every file. But recent research reveals a dangerous gap: scanners overlook test files, even though test runners execute them with full system access. This oversight, combined with large-scale audits showing widespread vulnerabilities, demands immediate attention. Here are six essential facts about this threat landscape.
1. The Test File Blind Spot: Why Scanners Miss the Real Threat
Anthropic Skill scanners thoroughly examine the markdown instructions, check for prompt injection, and verify shell commands in SKILL.md, and a malicious Skill can pass all of those checks cleanly. What the scanners never inspect are the .test.ts files sitting in the same directory. Because test files are not part of the agent execution surface, no publicly documented scanner (as of this writing) analyzes them. Yet these files run anyway: not through the agent but through the test runner, which gives them full access to the filesystem, environment variables, and SSH keys.

2. How Malicious Code Sneaks In via Test Discovery
Security researcher Jeevan Jutla at Gecko Security demonstrated the attack flow. When a developer runs npx Skills add, the installer copies the entire skill directory into the repository. If a malicious Skill bundles a *.test.ts file, popular test frameworks like Jest and Vitest discover it through their default recursive glob patterns. They treat it as a first-class test and execute it automatically during npm test or when an IDE triggers test runs on save. Mocha, another open-source JavaScript test framework, picks up the same files when configured to search recursively, for example with its --recursive flag. The malicious payload fires in beforeAll, before any assertions run, and the test output shows nothing unusual. In CI environments, process.env exposes deployment tokens, cloud credentials, and every secret the pipeline can access.
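To make the mechanism concrete, here is a minimal, hypothetical sketch of what such a bundled file could look like. The file name exploit.test.ts and the Vitest imports are illustrative assumptions, and the payload is deliberately inert rather than a working exfiltration; the point is only to show where the code runs and why the test output stays clean.

```typescript
// exploit.test.ts -- hypothetical sketch of the attack shape described above.
// Jest and Vitest match files like **/*.test.ts by default, so a copy of this
// file inside an installed Skill directory is discovered and run automatically.
import { beforeAll, expect, test } from "vitest";

beforeAll(() => {
  // Runs before any assertion, with the full privileges of the developer or CI user.
  // process.env already exposes CI tokens and cloud credentials; the same hook could
  // also read ~/.ssh or any other file the user can access.
  const harvested = { ...process.env };
  // A real payload would transmit `harvested` to an attacker-controlled endpoint here;
  // this sketch deliberately does nothing with it.
  void harvested;
});

// A plausible-looking test keeps the run green and the file unremarkable in the output.
test("skill metadata has a name", () => {
  expect("example-skill".length).toBeGreaterThan(0);
});
```

Nothing about the file looks out of place in a test report: the hook runs silently, the single assertion passes, and the run stays green.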
3. Why This Attack Vector Amplifies Risk in Team Workflows
Installed Skills land in a directory designed to be committed and shared across a team. Once committed, the malicious test file propagates to every teammate who clones the repository. It sits outside every scanner's detection surface, making it a persistent threat. Unlike typical supply chain attacks, the agent itself is never invoked, and the Skill scanner faithfully checks the files it was built to check while operating under the wrong threat model. The result: even after two large-scale marketplace audits, this critical blind spot remains unaddressed.
4. Comparison to Known Trust-on-Install Attacks (and Why This Is Worse)
Malicious npm postinstall scripts and pytest plugins have exploited trust-on-install for years. However, the Skill vector introduces unique dangers. Postinstall scripts run once during installation and can be caught by runtime monitoring. In contrast, test files persist in the repository, execute repeatedly on every developer's machine, and blend into routine CI runs. They bypass not just scanners but also typical security monitoring that expects threats in postinstall hooks, not test suites. This stealthy propagation makes the Skill vector more insidious for teams that share codebases.
5. The SkillScan Academic Study: 26% of Skills Have Vulnerabilities
An academic study published on January 15 analyzed 31,132 unique Anthropic Skills from two major marketplaces. The findings were stark: 26.1% of Skills contained at least one vulnerability spanning 14 distinct patterns across four categories. Data exfiltration appeared in 13.3% of Skills, and privilege escalation in 11.8%. Skills that bundled executable scripts were 2.12 times more likely to contain vulnerabilities than instruction-only Skills. This audit measured threats on the execution surface that scanners already inspect—a surface that excludes test files.
6. The Snyk ToxicSkills Audit: 13.4% of Skills Compromised
Three weeks after the academic study, Snyk published ToxicSkills, the first comprehensive security audit of the ClawHub and skills.sh marketplaces. Scanning 3,984 Skills as of February 5, Snyk found that 13.4% of all Skills contained at least one critical or high-severity vulnerability. These vulnerabilities included hardcoded secrets, unsafe shell command construction, and logic flaws that could lead to data theft. Together, the two audits document widespread problems in the Skill ecosystem—but neither examined test files, leaving the Gecko-discovered attack vector fully open.
Conclusion: The test file blind spot reveals a systemic flaw in how Anthropic Skill security is evaluated. Scanners need to expand their scope to include test files, and developers must verify all files in a Skill, not just the ones the agent uses. Until then, teams should treat any test file in a downloaded Skill as a potential threat, audit test runner configurations, and monitor for unexpected test executions in CI. The combination of high vulnerability rates across marketplaces and the stealthy test-file attack vector makes this a pressing security concern for the entire ecosystem.
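One practical way to audit and harden a test runner configuration is to exclude the installed Skill directory from test discovery entirely. The sketch below assumes Vitest and assumes Skills are copied into a .claude/skills/ folder; both the path and the config file name are illustrative and should be adjusted to wherever your installer actually places Skills.

```typescript
// vitest.config.ts -- a minimal sketch of restricting test discovery.
// Assumes installed Skills live under .claude/skills/; the added exclude pattern
// keeps any bundled *.test.ts files in that tree from being collected or executed.
import { configDefaults, defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    exclude: [...configDefaults.exclude, ".claude/skills/**"],
  },
});
```

Jest offers the equivalent through testPathIgnorePatterns, and the same idea applies to Mocha's spec globs. In all three cases, an explicit allowlist of your own test directories is a stronger posture than relying on default recursive discovery to do the right thing.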