NYC

codex-skill

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
  • [Prompt Injection] (HIGH): The skill description explicitly instructs the agent to operate in a 'non-interactive automation mode' to perform 'hands-off task execution without approval prompts.' This is a direct attempt to bypass human-in-the-loop safety constraints and agent guardrails.
  • [Metadata Poisoning] (MEDIUM): The skill metadata uses deceptive claims, referencing non-existent models like 'gpt-5' and 'gpt-5.1.' This is a red flag for malicious or untrustworthy content designed to attract users through false capabilities.
  • [Command Execution] (HIGH): The request for 'hands-off' execution when 'implementing' features implies that the skill will perform potentially dangerous system-level operations (e.g., file writes, command execution) while deliberately suppressing the approval mechanisms that protect the user.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 16, 2026, 11:12 AM