codex-readiness-unit-test
Warn
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: MEDIUMCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- COMMAND_EXECUTION (MEDIUM): The skill's 'Execute' mode is designed to run arbitrary shell commands found in
AGENTS.mdandPLANS.md. AlthoughSKILL.mdmandates a confirmation step and mentions a command denylist, the underlying mechanism executes untrusted input from the filesystem. - PROMPT_INJECTION (LOW): The skill is highly susceptible to Indirect Prompt Injection (Category 8). It ingests untrusted markdown files from the repository and uses them to influence both the LLM's evaluation results and the generated execution plan.
- Ingestion points:
AGENTS.md,PLANS.md, and anySKILL.mdfiles referenced via$SkillNameor path patterns (processed inscripts/collect_evidence.py). - Boundary markers: None. The evaluation prompts (e.g.,
references/commands.md,references/loop_quality.md) interpolate the rawEVIDENCE_JSONdirectly into the system instructions. - Capability inventory: The skill has the ability to read files across the current directory and the user's home directory (
~/.codex/skills), and execute arbitrary shell commands via the referenced (but missing from source)run_plan.pyscript. - Sanitization: There is no evidence of sanitization or escaping of the ingested text before it is used in prompts or execution plans.
- DATA_EXFILTRATION (LOW): The
scripts/collect_evidence.pyscript accesses paths outside the current working directory, specifically~/.codex/skills, to resolve skill references. While likely intended for tool configuration, this grants the skill access to data in the user's home directory.
Audit Metadata