The Agent Skills Directory

[COMMAND_EXECUTION]: The scripts/collect_evidence.sh script executes arbitrary lifecycle scripts (lint, typecheck, test, build, check) found in the package.json of the repository being evaluated. This execution is triggered when the --execute-checks flag is used.
[REMOTE_CODE_EXECUTION]: The collector script invokes uv run on a Python file (render_report.py) located within the path of the target repository. This allows an attacker who controls the target repository to execute arbitrary Python code on the system by modifying the contents of that script.
[DATA_EXFILTRATION]: The skill performs an exhaustive scan of the target directory using rg --files -uu, which explicitly includes hidden and ignored files. The resulting file inventory and collected evidence are stored in evaluation artifacts that may contain sensitive secrets or private information present in the repo.

evidence-heavy-evaluator