benchmark-agents
Pass
Audited by Gen Agent Trust Hub on Mar 17, 2026
Risk Level: SAFE (EXTERNAL_DOWNLOADS, COMMAND_EXECUTION)
Full Analysis
- External Content Retrieval: The skill uses `npx` to download and execute a plugin directly from a Vercel-owned GitHub repository. This is a common pattern for bootstrapping development environments and, in this context, targets a verified and trusted source.
- Local Workspace Management: Instructions include creating and managing directories within the user's home folder (`~/dev/vercel-plugin-testing/`) and accessing debug logs in `~/.claude/debug/`. These operations are transparently defined and restricted to specific development paths for the purpose of monitoring agent performance.
- Subprocess Orchestration: The skill utilizes `wezterm cli spawn` to launch interactive terminal sessions. While this involves command execution, it is used to create isolated environments for testing the agent's behavior in real-world scenarios, which is the primary function of the skill.
- Dynamic Environment Discovery: A small Node.js snippet is used to programmatically identify the system's temporary directory. This is a standard utility pattern to ensure compatibility across different operating systems during the evaluation process.
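The report does not reproduce the skill's temporary-directory snippet, so the following is only a minimal sketch of the pattern the Dynamic Environment Discovery item describes. It assumes the snippet relies on Node's built-in `os.tmpdir()`; the helper names `resolveTempDir` and `scratchPath` are hypothetical and do not come from the audited skill.

```javascript
// Sketch only: the audited skill's actual snippet is not shown in this report.
const os = require('node:os');
const path = require('node:path');

// os.tmpdir() returns the operating system's default directory for
// temporary files as a string, so no path needs to be hard-coded per OS.
function resolveTempDir() {
  return os.tmpdir();
}

// Hypothetical helper: build a file path inside the temp directory.
function scratchPath(fileName) {
  return path.join(resolveTempDir(), fileName);
}

// Print the discovered directory; the value depends on the host system.
console.log(resolveTempDir());
```

Asking the runtime for the directory, instead of assuming a fixed location, is what lets the same instructions work across operating systems.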
Audit Metadata