The Agent Skills Directory

[COMMAND_EXECUTION]: The skill executes shell commands generated by an LLM based on untrusted input. In scripts/runner.py, the _setup_sandbox function iterates through scenario.setup_commands and executes them using subprocess.run. These commands are produced by scripts/scenario_generator.py, which uses an LLM to derive them from the content of the skill file being analyzed. This creates a high risk of executing arbitrary commands if a malicious skill file is provided as input.
[REMOTE_CODE_EXECUTION]: The skill performs dynamic execution of instructions generated at runtime from external sources. It uses the Claude CLI (claude -p) to process potentially malicious skill definitions and then directly executes the resulting shell commands on the host environment.
[PROMPT_INJECTION]: The skill is highly susceptible to indirect prompt injection attacks. It processes untrusted skill files and agent tool traces, interpolating them into core logic prompts without sufficient sanitization or behavioral guardrails.
Ingestion points: The skill file content provided via CLI argument (args.skill in scripts/run.py) and agent tool traces in scripts/classifier.py.
Boundary markers: Prompt templates in the prompts/ directory use basic delimiters like --- but lack explicit instructions to the model to ignore embedded or conflicting instructions within the provided data.
Capability inventory: The skill possesses significant capabilities, including file system access and the ability to spawn subprocesses via the subprocess module to interact with the host shell.
Sanitization: No validation, sanitization, or safety filtering is applied to the untrusted skill content before it is used to generate further executable shell commands.

skill-comply