systematic-debugging

Fail

Audited by Gen Agent Trust Hub on Feb 14, 2026

Risk Level: HIGHCREDENTIALS_UNSAFECOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [CREDENTIALS_UNSAFE] (HIGH): SKILL.md (Phase 1, Step 4) explicitly instructs the agent to log sensitive environment variables using 'env | grep IDENTITY' and query MacOS keychain details via 'security list-keychains' and 'security find-identity'. This pattern directly exposes private cryptographic identities and secrets to the agent context.
  • [COMMAND_EXECUTION] (HIGH): The 'find-polluter.sh' script executes 'npm test' on arbitrary files identified by a user-provided search pattern ($TEST_PATTERN). An attacker can manipulate this pattern to execute unintended malicious files. Additionally, the skill's debugging steps encourage running low-level system commands like 'codesign' and 'security' with user-supplied variables.
  • [DATA_EXFILTRATION] (MEDIUM): The skill promotes 'Gathering Evidence in Multi-Component Systems' which involves logging entry/exit data across network boundaries (API, service). When combined with the credential exposure findings, this creates a high risk of exfiltrating discovered secrets to external logs or endpoints.
  • [PROMPT_INJECTION] (HIGH): This skill is vulnerable to Indirect Prompt Injection (Category 8). Ingestion Point: SKILL.md Phase 1 (Read Error Messages). Boundary Markers: Absent. Capability Inventory: Shell execution via bash snippets and find-polluter.sh. Sanitization: Absent. A malicious application can produce error messages or stack traces containing instructions that the agent would follow under the mandate of 'Systematic Debugging'.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 14, 2026, 10:25 AM