The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: Fetches documentation and technical specifications from official LangWatch domains (langwatch.ai) to guide the setup of evaluators and experiments.
[COMMAND_EXECUTION]: Instructs the agent to execute shell commands to run evaluation scripts, including npx tsx for TypeScript and subprocess.run with jupyter nbconvert for executing Python notebooks.
[PROMPT_INJECTION]: The skill exhibits an attack surface for indirect prompt injection (Category 8) as it processes external data to generate evaluation logic.
Ingestion points: Reads the agent's codebase, package manifests (package.json, pyproject.toml), git history, and system prompts (SKILL.md).
Boundary markers: Absent; there are no specific instructions to ignore embedded commands within the analyzed codebase or prompts.
Capability inventory: The skill can create, write to, and execute local files and scripts.
Sanitization: No explicit sanitization or validation of the ingested code or prompts is described before they are interpolated into the evaluation scripts.

evaluations