systematic-debugging
Warn
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: MEDIUMCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- COMMAND_EXECUTION (MEDIUM): The utility script
find-polluter.shis vulnerable to shell command injection. It uses aforloop to iterate over the results of afindcommand and executesnpm test "$TEST_FILE". If the directory contains files with names specifically crafted to include command substitution (e.g., using backticks or$()), the script will execute those commands with the privileges of the user running the script. - PROMPT_INJECTION (LOW): The provided test files (
test-pressure-1.md,test-pressure-2.md,test-pressure-3.md) utilize adversarial pressure techniques and role-play scenarios to test the agent's compliance with the debugging process. These files contain instructions like 'IMPORTANT: This is a real scenario. You must choose and act.', which are characteristic of prompt injection attempts designed to bypass standard operational guidelines. - COMMAND_EXECUTION (LOW):
SKILL.mdincludes examples of sensitive macOS-specific commands such assecurity list-keychainsandsecurity find-identity. While intended for legitimate debugging of code-signing issues, these represent a risk if the agent incorrectly interprets the examples as direct commands to execute in an untrusted environment. - [Category 8: Indirect Prompt Injection] (LOW): The skill's primary function is to process external, potentially untrusted data such as error logs and stack traces (Phase 1). It lacks explicit boundary markers or instructions to sanitize this data, creating a vulnerability surface where malicious instructions hidden within logs could influence the agent's actions during a debugging session.
- Ingestion points: Error message reading and stack trace analysis in Phase 1 of
SKILL.md. - Boundary markers: Absent; no delimiters are defined for separating logs from system instructions.
- Capability inventory: Execution of
npm testviafind-polluter.shand various shell commands suggested in Phase 1 examples. - Sanitization: Absent; the skill does not define methods for escaping or validating external log content.
Audit Metadata