The Agent Skills Directory

[COMMAND_EXECUTION]: The skill runs the local shell script ./scripts/bench.sh and invokes the jq utility to process files, so it can execute arbitrary commands within the benchmark environment.
[EXTERNAL_DOWNLOADS]: The skill uses uv sync to install Python dependencies from external package registries at runtime.
[PROMPT_INJECTION]: The skill processes untrusted data from external JSON files, which creates a potential surface for indirect prompt injection.
Ingestion points: reads data from tests/benchmark/prediction/opendataloader/evaluation.json and tests/benchmark/thresholds.json.
Boundary markers: None detected. The skill does not use delimiters to isolate processed data from agent instructions.
Capability inventory: Includes shell script execution (./scripts/bench.sh) and system command calls via jq.
Sanitization: No sanitization or validation of the JSON content is performed before the data is processed or used to generate output summaries.
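Mitigation sketch: the missing boundary markers and sanitization noted above could be addressed roughly as follows. This is a minimal illustration, not the skill's actual code. It assumes the evaluation file is a flat JSON object mapping metric names to numbers, which the audit does not confirm; the function names, the key pattern, and the <untrusted_data> delimiter are all hypothetical.

```python
import json
import math
import re

# Hypothetical rule: metric names are short identifiers, so free-form
# instruction text cannot be smuggled in through a key.
KEY_PATTERN = re.compile(r"^[A-Za-z0-9_.-]{1,64}$")


def validate_metrics(raw: str) -> dict[str, float]:
    """Parse JSON text and keep it only if it is a flat object of finite numbers.

    The flat name-to-number schema is an assumption made for this sketch.
    """
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")

    clean: dict[str, float] = {}
    for key, value in data.items():
        if not KEY_PATTERN.match(key):
            raise ValueError(f"unexpected key: {key!r}")
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"non-numeric value for key {key!r}")
        try:
            number = float(value)
        except OverflowError:
            raise ValueError(f"value out of range for key {key!r}") from None
        # json.loads accepts NaN and Infinity by default; reject them here.
        if not math.isfinite(number):
            raise ValueError(f"non-finite value for key {key!r}")
        clean[key] = number
    return clean


def wrap_untrusted(clean: dict[str, float]) -> str:
    """Re-serialize validated data between explicit boundary markers."""
    body = json.dumps(clean, sort_keys=True)
    return f"<untrusted_data>\n{body}\n</untrusted_data>"
```

With this approach, string values such as embedded instructions are rejected before any summary is generated, and whatever does reach the agent is clearly delimited as data rather than instructions.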
