scientific-debugging

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • PROMPT_INJECTION (HIGH): The skill is highly vulnerable to Indirect Prompt Injection (Category 8) due to its core operational logic. * Ingestion points: Reads untrusted data from external websites via the browser_subagent and local source code via view_code_item. * Boundary markers: None. There are no instructions or delimiters to help the agent distinguish between debug data and system instructions. * Capability inventory: Includes the power to modify code ('Fix' step) and interact with browsers. * Sanitization: Absent. The agent is instructed to use experimental data directly for reasoning and fixing, which could lead to executing instructions embedded in the code or websites being debugged.
  • COMMAND_EXECUTION (LOW): The skill relies on browser sub-agents and code inspection tools. While standard for its purpose, these tools provide the functional surface (write/interact) that makes an indirect prompt injection attack impactful.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 08:01 AM