The Agent Skills Directory

[REMOTE_CODE_EXECUTION]: The skill's documentation in references/guide.md includes a shell command that installs the tool by piping a remote script into a shell (curl -fsSL https://decbench.ai/install.sh | sh). This pattern runs unreviewed remote code with the user's privileges, so it is risky even though its purpose here is a legitimate installation.
[EXTERNAL_DOWNLOADS]: The skill performs external network operations to download the dec-bench CLI and to interact with the decbench.ai registry for publishing and managing evaluation scenarios.
[COMMAND_EXECUTION]: The agent is instructed to execute several CLI commands to manage the lifecycle of evaluation scenarios, including dec-bench create, validate, build, run, and registry publish, as well as checking GitHub authentication using gh auth status.
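One common way to reduce the risk of the piped install flagged under REMOTE_CODE_EXECUTION is to download the script, compare it against a digest pinned after a manual review, and only then run it. The following is a minimal sketch of that idea, not the project's documented install procedure. The function names are hypothetical, and it assumes the reviewer supplies the pinned SHA-256 value; nothing in the audit indicates that decbench.ai publishes one.

```python
import hashlib
import subprocess
import tempfile
import urllib.request

# URL quoted in the audit finding; everything else here is illustrative.
INSTALL_URL = "https://decbench.ai/install.sh"


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_digest(data: bytes, pinned_sha256: str) -> bool:
    """Check the data against a digest recorded after reviewing the script."""
    return sha256_hex(data) == pinned_sha256.strip().lower()


def install_if_verified(pinned_sha256: str, url: str = INSTALL_URL) -> None:
    """Download the script, refuse to run it unless it matches the pin."""
    with urllib.request.urlopen(url, timeout=30) as response:
        script = response.read()
    if not matches_pinned_digest(script, pinned_sha256):
        raise RuntimeError("install script does not match the pinned digest")
    # Write the verified bytes to disk so the executed file is exactly
    # what was hashed, then run it with sh.
    with tempfile.NamedTemporaryFile("wb", suffix=".sh", delete=False) as handle:
        handle.write(script)
        path = handle.name
    subprocess.run(["sh", path], check=True)
```

The pinned digest has to be refreshed whenever the upstream script legitimately changes, which forces a re-review on each update; that is the intended trade-off of this approach.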
[PROMPT_INJECTION]: The skill creates an indirect prompt injection surface: it generates Markdown prompts and TypeScript assertion scripts that the evaluation framework later ingests and executes.
Ingestion points: The skill creates files like prompts/naive.md, prompts/savvy.md, and assertions/*.ts that contain instructions processed at runtime.
Boundary markers: There are no explicit markers or safety instructions in the generated files to delimit generated content from the agent's core instructions.
Capability inventory: The framework has the ability to execute code, manage Docker containers, and perform database queries on ClickHouse and Postgres.
Sanitization: The skill does not implement validation or sanitization of the content it authors before it is used by the framework.
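The missing boundary markers and sanitization noted above could be approximated with a small pre-publish check on the generated Markdown prompts. The sketch below is illustrative only: the marker format, the function names, and the pattern list are assumptions, and nothing in the audit indicates that the dec-bench framework recognizes such markers. A pattern scan of this kind catches only obvious cases and does not replace human review.

```python
import re
import secrets

# Illustrative patterns only; a real deny-list would need to be broader.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"curl\s[^|\n]*\|\s*(sh|bash)\b", re.IGNORECASE),
]


def find_suspicious_lines(content: str) -> list[tuple[int, str]]:
    """Return (line number, text) pairs for lines matching any pattern."""
    hits = []
    for number, line in enumerate(content.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SUSPICIOUS_PATTERNS):
            hits.append((number, line.strip()))
    return hits


def wrap_generated(content: str, source: str) -> str:
    """Delimit generated Markdown with HTML comments sharing a random id.

    The random id makes it harder for the wrapped content to forge a
    matching end marker. The marker syntax is hypothetical.
    """
    nonce = secrets.token_hex(8)
    begin = f"<!-- BEGIN GENERATED CONTENT source={source} id={nonce} -->"
    end = f"<!-- END GENERATED CONTENT id={nonce} -->"
    return f"{begin}\n{content.rstrip()}\n{end}\n"
```

A generator could call find_suspicious_lines on each prompt before writing it and stop for review if anything is returned, then write the output of wrap_generated to files such as prompts/naive.md. HTML comments suit Markdown only; the assertions/*.ts files would need a TypeScript comment form instead.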

dec-bench-evals