The Agent Skills Directory

[COMMAND_EXECUTION]: The skill makes extensive use of system commands and subprocesses to perform video processing tasks.
Evidence: Multiple scripts (transcribe.py, detect_content.py, compute_reframe.py, snap_boundaries.py, export.sh, preflight.sh) execute ffmpeg and ffprobe to manipulate media files, detect video properties, and extract audio.
Evidence: scripts/detect_gpu.sh executes nvidia-smi to detect local hardware capabilities.
Evidence: remotion/render.mjs executes headless Chrome via the Remotion renderer to generate video frames.
[EXTERNAL_DOWNLOADS]: The skill requires several third-party dependencies to be installed during setup.
Evidence: setup.sh uses pip to install packages including faster-whisper, mediapipe, numpy, opencv-python, and torch (from official PyPI and PyTorch registries).
Evidence: setup.sh runs npm install within the remotion/ directory to fetch Node.js packages like remotion, react, and zod from the npm registry.
[PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection through the content of processed videos.
Ingestion points: In Step 4 of SKILL.md, the agent is instructed to read transcript.json, which contains the text of the video provided by the user.
Boundary markers: There are no explicit instructions or delimiters used when Claude reads the transcript to prevent it from following commands embedded in the spoken text.
Capability inventory: The agent has access to Bash, Write, and Task tools, which could be abused if an attacker-controlled video successfully influences the agent's behavior during transcript analysis.
Sanitization: The skill performs text 'cleanup' (Step 4) to remove filler words, but this is intended for caption aesthetics rather than security sanitization.

shorts