manimate

Fail

Audited by Gen Agent Trust Hub on Mar 6, 2026

Risk Level: HIGH
Findings: REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, PROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION]: The skill operates by spawning a sub-agent to generate Python code for Manim scenes, which are then immediately executed on the local machine using the manim render command. This architectural pattern represents a form of dynamic code execution where the source logic is created at runtime based on external input.
      • Evidence: SKILL.md Step 5 and Step 6.
  • [COMMAND_EXECUTION]: The skill executes several system commands including python3, manim, and ffmpeg. Critically, the preflight checks in SKILL.md configure the Claude CLI with the --dangerously-skip-permissions flag. This flag is explicitly designed to suppress security warnings and permission prompts, significantly increasing the potential impact of any malicious code execution.
      • Evidence: SKILL.md Step 2 (Preflight Checks).
  • [PROMPT_INJECTION]: The skill is vulnerable to Indirect Prompt Injection because it interpolates raw user prompts into the instructions provided to the code-generation worker. A crafted prompt could manipulate the sub-agent into generating Python code that performs unauthorized file access or network communication.
      • Evidence Chain:
          • Ingestion points: User prompt is stored in .manimate/params.json and then read into the worker prompt in SKILL.md Step 5.
          • Boundary markers: The prompt includes textual rules such as "Keep the scene self-contained (no file I/O, no network)", but these are soft constraints for the LLM rather than technical enforcement.
          • Capability inventory: The skill has the capability to execute arbitrary Python code via manim, and run shell scripts via bash.
          • Sanitization: No sanitization or validation of the user's prompt is performed before it is passed to the generation agent.
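To illustrate the gap between a soft constraint and technical enforcement noted in the evidence chain, the following is a minimal sketch of a static check that could run on generated scene code before it is rendered. It is not part of the audited skill: the function name find_violations and both deny-lists are hypothetical, chosen only for illustration. A deny-list over the syntax tree is also easy to bypass (for example through getattr or dynamically built names), so it would complement, not replace, running the render step in an isolated environment.

```python
import ast

# Hypothetical deny-lists for illustration only; not taken from the audited skill.
BLOCKED_MODULES = {"os", "subprocess", "socket", "shutil", "urllib", "requests", "pathlib"}
BLOCKED_CALLS = {"open", "exec", "eval", "compile", "__import__"}


def find_violations(source: str) -> list[str]:
    """Return human-readable violations found in generated scene code.

    The source is parsed, never executed. A SyntaxError from ast.parse
    propagates to the caller, which should treat it as a rejection.
    """
    violations = []
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # "import os" or "import os.path"
            for alias in node.names:
                if alias.name.split(".")[0] in BLOCKED_MODULES:
                    violations.append(f"line {node.lineno}: import of '{alias.name}'")
        elif isinstance(node, ast.ImportFrom):
            # "from subprocess import run"
            if (node.module or "").split(".")[0] in BLOCKED_MODULES:
                violations.append(f"line {node.lineno}: import from '{node.module}'")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Direct calls by bare name, such as open(...) or eval(...)
            if node.func.id in BLOCKED_CALLS:
                violations.append(f"line {node.lineno}: call to '{node.func.id}()'")
    return violations


# Example inputs standing in for sub-agent output.
SAFE_SCENE = (
    "from manim import *\n"
    "\n"
    "class Demo(Scene):\n"
    "    def construct(self):\n"
    "        self.add(Circle())\n"
)
UNSAFE_SCENE = (
    "import os\n"
    "from manim import *\n"
    "\n"
    "class Demo(Scene):\n"
    "    def construct(self):\n"
    "        open('/etc/passwd')\n"
)

if __name__ == "__main__":
    for name, code in [("safe", SAFE_SCENE), ("unsafe", UNSAFE_SCENE)]:
        problems = find_violations(code)
        print(name, "->", problems if problems else "no violations found")
```

A pipeline using such a check would refuse to invoke the render command whenever the returned list is non-empty, turning the "no file I/O, no network" rule from an instruction to the model into a gate on execution.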
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level: HIGH
Analyzed: Mar 6, 2026, 08:42 PM