The Agent Skills Directory

[EXTERNAL_DOWNLOADS]: The skill fetches external content from arXiv via Tavily and clones code repositories from GitHub using git clone. It also uses wget, curl, and aria2c to download datasets from potentially untrusted URLs.
[COMMAND_EXECUTION]: The workflow relies heavily on shell commands for environment setup and task execution, including uv venv for virtual environments, gh for GitHub interactions, and python for running scripts.
[REMOTE_CODE_EXECUTION]: The skill's core purpose is to download and execute code from third-party repositories. This behavior inherits the security risks of the source content, as any malicious code within a cloned repository would be executed during the reproduction process.
[DYNAMIC_EXECUTION]: The agent is instructed to write custom Python scripts (e.g., train.py, smoke_forward.py, optim_factory.py) and then execute them to verify the model's behavior.
[INDIRECT_PROMPT_INJECTION]:
Ingestion points: Untrusted data enters the context through arXiv HTML/PDF extracts and GitHub repository content (READMEs, code comments).
Boundary markers: The instructions recommend verbatim extraction but lack explicit delimiters or instructions to the agent to disregard malicious directives embedded within the paper text.
Capability inventory: The agent possesses full file-system access, network capabilities, and the ability to execute arbitrary shell commands.
Sanitization: No sanitization or validation is performed on the content extracted from external papers before it is used to guide the implementation and training stages.

reproduce