The Agent Skills Directory

[DYNAMIC_EXECUTION]: The script scripts/media_optimizer.py uses the eval() function to parse the r_frame_rate string obtained from ffprobe. Because this string is derived from the metadata of external media files, a maliciously crafted file could potentially execute arbitrary code when processed by this script.
[COMMAND_EXECUTION]: The skill relies on subprocess.run to execute external binaries ffmpeg and ffprobe. While these are used for the legitimate purpose of media optimization, the interaction with system-level utilities increases the attack surface if input filenames or metadata are not strictly validated.
[DATA_EXPOSURE]: The find_api_key function in scripts/gemini_batch_process.py and scripts/document_converter.py implements a recursive search for .env files in parent directories (up to the project root). This behavior could lead to the unintended exposure of sensitive environment variables or credentials stored in higher-level directories.
[INDIRECT_PROMPT_INJECTION]: The skill is designed to ingest and process untrusted external data from various media formats (PDF, audio, video, images) and relay that content to an LLM. It lacks explicit boundary markers or 'ignore' instructions in its prompts to prevent the model from executing commands embedded within the analyzed documents.
Ingestion points: Files provided through the --input or --files CLI arguments in gemini_batch_process.py and document_converter.py.
Boundary markers: None identified; the prompts instruct the model on formatting but do not explicitly warn against instructions contained within the media.
Capability inventory: The skill has the ability to write to the filesystem, execute shell commands (via ffmpeg), and perform network operations (Gemini API).
Sanitization: The skill does not perform sanitization of text extracted from OCR or transcription before passing it to the next stage of the LLM pipeline.

ai-multimodal