codex-skill
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
- [Prompt Injection] (HIGH): The skill description explicitly instructs the agent to operate in a 'non-interactive automation mode' to perform 'hands-off task execution without approval prompts.' This is a direct attempt to bypass human-in-the-loop safety constraints and agent guardrails.
- [Metadata Poisoning] (MEDIUM): The skill metadata uses deceptive claims, referencing non-existent models like 'gpt-5' and 'gpt-5.1.' This is a red flag for malicious or untrustworthy content designed to attract users through false capabilities.
- [Command Execution] (HIGH): The request for 'hands-off' execution when 'implementing' features implies that the skill will perform potentially dangerous system-level operations (e.g., file writes, command execution) while deliberately suppressing the approval mechanisms that protect the user.
Recommendations
- AI detected serious security threats
Audit Metadata