systematic-debugging

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCREDENTIALS_UNSAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [CREDENTIALS_UNSAFE] (HIGH): The skill provides explicit instructions to log sensitive system state during diagnostics. Evidence in SKILL.md (Phase 1, Step 4) shows commands like env | grep IDENTITY and security find-identity -v, which are designed to expose environment variables and macOS keychain identities. If executed, these can leak secrets or cryptographic identities into logs or agent history.
  • [COMMAND_EXECUTION] (MEDIUM): The skill encourages the agent to run arbitrary bash commands for 'diagnostic instrumentation'. While focused on debugging, it provides a template for executing system commands (security, env, codesign) that could be abused if the agent's context is compromised.
  • [PROMPT_INJECTION] (HIGH): The skill is highly vulnerable to Indirect Prompt Injection (Category 8).
  • Ingestion points: The skill's primary purpose is to 'Read Error Messages Carefully' and 'Gather Evidence' from logs and stack traces (Phase 1).
  • Capability inventory: The agent is given capabilities to execute shell commands and modify files (Phase 4).
  • Boundary markers: No boundary markers or 'ignore embedded instructions' warnings are suggested for the data being analyzed.
  • Sanitization: There is no instruction to sanitize or filter the external logs/error data before the agent processes it. An attacker who can influence log output (e.g., via a web request that triggers an error) can inject malicious instructions that the agent may follow while 'systematically debugging'.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 12:32 PM