systematic-debugging

Pass

Audited by Gen Agent Trust Hub on Feb 18, 2026

Risk Level: SAFE
COMMAND_EXECUTION
Full Analysis
  • COMMAND_EXECUTION (SAFE): The skill includes a shell script, find-polluter.sh, that identifies tests which create unwanted global state by running npm test against local files. SKILL.md also provides examples for gathering diagnostics with standard macOS command-line tools such as security and codesign. These commands are appropriate for the intended technical use case and follow safe practices, such as quoting variables to prevent injection.
  • PROMPT_INJECTION (SAFE): The skill includes a series of benchmark tests (test-pressure-1.md through test-pressure-3.md) designed to evaluate the agent's discipline. These tests simulate high-pressure scenarios to ensure the agent adheres to its debugging methodology rather than succumbing to user pressure for unsafe 'quick fixes', which strengthens the agent's overall reliability.
  • DATA_EXPOSURE (SAFE): Diagnostic examples in SKILL.md demonstrate how to verify the presence of environment variables or signing identities. The examples utilize patterns that check for the existence of secrets (e.g., ${IDENTITY:+SET}) without printing the actual values, adhering to secure logging and debugging principles.
  • INDIRECT_PROMPT_INJECTION (SAFE): The skill processes untrusted data such as bug reports and test failures. Its core methodology (Phases 1-4) is explicitly designed to resist malicious or misleading instructions embedded in symptoms: it mandates empirical evidence gathering and root-cause tracing before any action is taken.
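
The existence-check pattern noted under DATA_EXPOSURE can be illustrated with a minimal sketch. This is not the code from SKILL.md; the helper name presence, the variable API_TOKEN, and the placeholder value are assumptions made for illustration. It relies only on the standard POSIX parameter expansions ${var:+word} and ${var:-word}, and quotes every expansion in line with the quoting practice noted under COMMAND_EXECUTION.

```shell
#!/bin/sh
# presence is a hypothetical helper for this sketch: it prints SET or UNSET
# for the value it receives and never prints the value itself.
presence() {
  # ${1:+SET} expands to "SET" only when $1 is set and non-empty.
  state="${1:+SET}"
  # ${state:-UNSET} falls back to "UNSET" when the expansion above was empty.
  printf '%s\n' "${state:-UNSET}"
}

IDENTITY="placeholder-signing-identity"   # illustrative value only, never echoed
unset API_TOKEN                            # hypothetical variable name

# Quoted expansions keep values containing spaces or glob characters intact.
printf 'IDENTITY: %s\n' "$(presence "${IDENTITY:-}")"
printf 'API_TOKEN: %s\n' "$(presence "${API_TOKEN:-}")"
```

Run as written, this prints "IDENTITY: SET" and "API_TOKEN: UNSET", so a log can confirm that a secret is configured without ever containing the secret.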
Audit Metadata
Risk Level: SAFE
Analyzed: Feb 18, 2026, 06:50 PM