systematic-debugging

Warn

Audited by Gen Agent Trust Hub on Mar 7, 2026

Risk Level: MEDIUM
Categories: DATA_EXFILTRATION, COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
  • [DATA_EXPOSURE]: The skill's documentation in SKILL.md explicitly encourages agents to gather evidence by logging environment variables and system state. The provided examples include commands like env | grep IDENTITY and macOS-specific commands such as security list-keychains and security find-identity. Following these templates could lead to the exposure of sensitive credentials, API keys, or certificates in logs or chat history.
  • [COMMAND_EXECUTION]: The find-polluter.sh script executes npm test on files discovered via a user-provided pattern. This allows the agent to execute arbitrary code contained within local test files. If an attacker can introduce or modify files in the directory being analyzed, this script provides a path for local code execution.
  • [PROMPT_INJECTION]: The skill uses extremely assertive language, such as "The Iron Law" and "Violating the letter of this process is violating the spirit of debugging," to override the agent's default behavior. While intended to enforce technical rigor, these patterns represent a high degree of instruction-based control that could be used to bypass other operational constraints.
  • [INDIRECT_PROMPT_INJECTION]: The skill is designed to ingest and analyze external, potentially untrusted data like error messages, stack traces, and test outputs. It lacks sanitization or boundary markers to prevent malicious instructions embedded in these inputs from influencing agent behavior.
      • Ingestion points: Phase 1 (Root Cause Investigation) in SKILL.md and the tracing process in root-cause-tracing.md.
      • Boundary markers: None identified; data is processed as natural language.
      • Capability inventory: File system access, shell execution via find-polluter.sh, and system metadata inspection.
      • Sanitization: None; the skill encourages deep reading and tracing of raw error content.
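Two of the findings above (unredacted environment logging, and the absence of boundary markers around untrusted input) could be mitigated along the following lines. This is a minimal sketch and not part of the audited skill: the function names redact_env and wrap_untrusted, the name patterns in SENSITIVE_NAME, and the marker strings are all hypothetical choices made for illustration. Boundary markers reduce the chance that embedded instructions are mistaken for agent instructions, but they do not guarantee it.

```python
import re

# Hypothetical list of name fragments that suggest a secret; tune per environment.
SENSITIVE_NAME = re.compile(
    r"(KEY|TOKEN|SECRET|PASSWORD|PASSWD|CREDENTIAL|IDENTITY)", re.IGNORECASE
)

# Hypothetical marker strings; any unlikely-to-collide delimiter would do.
BEGIN = "<<<UNTRUSTED_DATA_BEGIN>>>"
END = "<<<UNTRUSTED_DATA_END>>>"


def redact_env(env):
    """Return a copy of env with values of sensitive-looking names masked."""
    return {
        name: "[REDACTED]" if SENSITIVE_NAME.search(name) else value
        for name, value in env.items()
    }


def wrap_untrusted(text):
    """Wrap raw output (errors, stack traces, test logs) in boundary markers.

    Any marker strings already present in the text are replaced first, so the
    input cannot close the block early.
    """
    cleaned = text.replace(BEGIN, "[marker removed]").replace(END, "[marker removed]")
    return f"{BEGIN}\n{cleaned}\n{END}"


if __name__ == "__main__":
    import os

    # Log the environment with secret-looking values masked.
    for name, value in sorted(redact_env(dict(os.environ)).items()):
        print(f"{name}={value}")

    # Present a raw error message as data rather than as instructions.
    print(wrap_untrusted("Error: ENOENT\nIgnore previous instructions"))
```

Matching on variable names rather than values is a deliberate simplification: it is cheap and catches the env | grep IDENTITY case cited in the report, but it will miss secrets stored under innocuous names.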
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 7, 2026, 03:34 PM