systematic-debugging

Pass

Audited by Gen Agent Trust Hub on Mar 10, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The script find-polluter.sh executes local test files using npm test to identify state pollution side effects. The skill also suggests using shell commands like security list-keychains and git init for diagnostic purposes in SKILL.md and root-cause-tracing.md during the debugging process.\n- [PROMPT_INJECTION]: The skill defines an indirect prompt injection surface as it instructs the agent to process external data (Ingestion points: Phase 1 error messages, stack traces, and logs in SKILL.md). It lacks boundary markers and explicit sanitization for this data. The agent's capability inventory includes shell execution and file system access.\n- [PROMPT_INJECTION]: Files test-pressure-1.md, test-pressure-2.md, and test-pressure-3.md contain test cases that use adversarial framing, such as high-stakes scenarios and urgency, to evaluate agent compliance with its instructions under pressure.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 10, 2026, 01:15 AM