systematic-debugging

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH
Findings: CREDENTIALS_UNSAFE, COMMAND_EXECUTION, PROMPT_INJECTION, DATA_EXFILTRATION
Full Analysis
  • CREDENTIALS_UNSAFE (HIGH): The skill explicitly instructs the agent to access sensitive credentials and identities as part of its 'Phase 1: Root Cause Investigation'.
      • Evidence: SKILL.md provides example bash commands for the agent to run, including env | grep IDENTITY (extracting secret values from environment variables) and security find-identity -v (querying the macOS keychain for signing certificates).
  • PROMPT_INJECTION (HIGH): The skill is highly vulnerable to Indirect Prompt Injection (Category 8) because it mandates the ingestion of untrusted data while providing the agent with command execution capabilities.
      • Ingestion points: SKILL.md Phase 1 requires reading 'Error Messages', 'Stack traces', 'Git diffs', and logs from 'Multi-Component Systems' (e.g., CI, API, database logs).
      • Boundary markers: Absent. The skill provides no instructions for delimiting or sanitizing data from these external sources.
      • Capability inventory: The skill encourages the agent to use bash for diagnostic instrumentation, call system tools (security, codesign), and invoke other powerful skills like superpowers:root-cause-tracing.
      • Sanitization: Absent. Malicious instructions embedded in a stack trace or git commit could be parsed and executed by an agent following this debugging logic.
  • DATA_EXFILTRATION (HIGH): While no direct network exfiltration is present, the skill's methodology for 'Gathering Evidence' involves logging sensitive data (environment variables and keychain states) to the standard output or log files.
      • Evidence: SKILL.md Phase 1 explicitly instructs the agent to 'Log what data enters component' and 'Check state at each layer', creating a high-severity exposure risk.
  • COMMAND_EXECUTION (MEDIUM): The skill promotes the generation and execution of shell scripts for diagnostics.
      • Evidence: Phase 1, Step 4 contains a template for a bash script used to 'gather evidence' which is expected to be executed within the environment.
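The PROMPT_INJECTION finding notes that boundary markers are absent. As a minimal sketch of what such markers might look like, the Python helper below wraps external text (a stack trace, a git diff, a log excerpt) in delimiters carrying a random per-call tag, so the wrapped content cannot easily forge its own closing marker. The function name `wrap_untrusted` and the marker format are hypothetical and do not come from SKILL.md or from the audit. Delimiting of this kind may reduce the risk of embedded instructions being followed, but it is not a complete defense on its own.

```python
import secrets


def wrap_untrusted(text, source):
    """Wrap external text in boundary markers so it can be treated as data.

    Hypothetical helper: the name and marker format are illustrative
    assumptions, not part of the audited skill.
    """
    # A fresh random tag per call; the wrapped text cannot know it in advance.
    tag = secrets.token_hex(8)
    header = f"<<UNTRUSTED {source} {tag}>>"
    footer = f"<<END UNTRUSTED {tag}>>"
    return f"{header}\n{text}\n{footer}"
```

A caller would pass each ingested artifact through the helper before showing it to the agent, for example `wrap_untrusted(trace_text, "stack-trace")`, and instruct the agent to treat everything between the markers as inert data.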
Recommendations
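  • The automated analysis detected serious security threats: three HIGH findings (CREDENTIALS_UNSAFE, PROMPT_INJECTION, DATA_EXFILTRATION) and one MEDIUM finding (COMMAND_EXECUTION).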
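The CREDENTIALS_UNSAFE and DATA_EXFILTRATION findings both stem from secret values reaching standard output or log files. One possible mitigation, sketched below in Python, is to report only whether a matching environment variable is set and never print its value. The function name `presence_report` and the default match string are assumptions made for illustration; the match string mirrors the `env | grep IDENTITY` command cited in the evidence, but matches variable names only.

```python
def presence_report(env, needle="IDENTITY"):
    """List variables whose name contains `needle`, showing only set/empty.

    Hypothetical helper: values are never included in the output, so the
    resulting lines should be safe to log.
    """
    lines = []
    for name in sorted(env):
        if needle in name:
            state = "set" if env[name] else "empty"
            lines.append(f"{name}: {state}")
    return lines
```

Calling `presence_report(dict(os.environ))` (after `import os`) would give a diagnostic view of which identity-related variables exist without exposing their contents.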
Audit Metadata
  • Risk Level: HIGH
  • Analyzed: Feb 16, 2026, 01:32 PM