research-debug

Pass

Audited by Gen Agent Trust Hub on Mar 1, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill uses shell commands to run training scripts (e.g., python scripts/train.py) and monitor logs to verify fixes during the debugging process. This is an intended capability for ML model analysis.- [PROMPT_INJECTION]: The skill exhibits an indirect prompt injection surface by ingesting external content via web searches and including it in the context for sub-agents. An attacker could theoretically influence the agent's debugging logic through malicious instructions embedded in web pages or papers.
  • Ingestion points: Data retrieved by the WebSearch tool is interpolated into the prompt for the task-planner-analyzer sub-agent as described in SKILL.md.
  • Boundary markers: While the prompt uses markdown headers to delineate the web research context, it lacks specific delimiters or system instructions to treat that data as potentially untrusted.
  • Capability inventory: The skill orchestrates sub-agents that can modify files (modular-code-architect) and executes local scripts, meaning an injection could influence code changes.
  • Sanitization: The skill does not implement validation or sanitization for the content retrieved from external URLs before processing.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 1, 2026, 07:47 PM