research-debug
Pass
Audited by Gen Agent Trust Hub on Mar 1, 2026
Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill uses shell commands to run training scripts (e.g.,
python scripts/train.py) and monitor logs to verify fixes during the debugging process. This is an intended capability for ML model analysis.- [PROMPT_INJECTION]: The skill exhibits an indirect prompt injection surface by ingesting external content via web searches and including it in the context for sub-agents. An attacker could theoretically influence the agent's debugging logic through malicious instructions embedded in web pages or papers. - Ingestion points: Data retrieved by the
WebSearchtool is interpolated into the prompt for thetask-planner-analyzersub-agent as described inSKILL.md. - Boundary markers: While the prompt uses markdown headers to delineate the web research context, it lacks specific delimiters or system instructions to treat that data as potentially untrusted.
- Capability inventory: The skill orchestrates sub-agents that can modify files (
modular-code-architect) and executes local scripts, meaning an injection could influence code changes. - Sanitization: The skill does not implement validation or sanitization for the content retrieved from external URLs before processing.
Audit Metadata