mcp-builder

Pass

Audited by Gen Agent Trust Hub on Mar 6, 2026

Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The evaluation script is designed to execute local MCP server implementations for testing purposes.
  • Evidence: The script scripts/evaluation.py allows users to specify a command and arguments via -c/--command and -a/--args flags to launch a local server process through the mcp library's stdio transport. This is the intended purpose of the test harness and is limited to local execution of the developer's own code.
  • [EXTERNAL_DOWNLOADS]: The skill references official documentation and SDKs from the protocol's authoritative sources.
  • Evidence: SKILL.md contains instructions for the agent to fetch documentation from modelcontextprotocol.io and the modelcontextprotocol organization on GitHub. These are recognized as trusted, well-known services within the MCP ecosystem.
  • [PROMPT_INJECTION]: The evaluation loop processes external data which serves as a surface for indirect prompt injection.
  • Ingestion points: scripts/evaluation.py reads questions from a user-provided XML file and results from the MCP server being tested.
  • Boundary markers: The EVALUATION_PROMPT enforces the use of XML tags (<summary>, <feedback>, <response>) to structure the assistant's output and maintain separation from data.
  • Capability inventory: The script is capable of subprocess execution for local MCP servers and making network requests to the Anthropic API.
  • Sanitization: No explicit sanitization is performed on the questions or tool outputs before they are passed to the model, which is common in testing utilities of this nature.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 6, 2026, 06:38 AM