The Agent Skills Directory

[COMMAND_EXECUTION]: The skill executes a variety of shell commands through the nemo-evaluator-launcher utility, including Docker operations (docker run, docker login), Slurm job submissions (sbatch, sacct), and custom benchmarking commands.
Evidence in references/custom-benchmarks.md: Framework Definition Files (FDF) allow users to define arbitrary shell command templates (e.g., python -m my_custom_eval.run ...) that the launcher executes at runtime.
Evidence in references/execution-backends.md: The Slurm executor uses SSH to execute commands on remote cluster head nodes.
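To illustrate why user-defined command templates matter, here is a minimal sketch of rendering a template into an argument list with every parameter value quoted. The $-placeholder syntax, the render_command helper, and the parameter names are assumptions made for illustration; they are not the FDF format or the launcher's actual rendering code.

```python
import shlex
from string import Template


def render_command(template: str, params: dict[str, str]) -> list[str]:
    """Fill a command template and return an argv list.

    Hypothetical helper: each value is shell-quoted before substitution,
    so a value containing ';' or spaces stays a single argument.
    """
    quoted = {key: shlex.quote(str(value)) for key, value in params.items()}
    rendered = Template(template).substitute(quoted)
    # Splitting back into argv lets the caller avoid shell=True entirely.
    return shlex.split(rendered)


argv = render_command(
    "python -m my_custom_eval.run --model $model --out $out",
    {"model": "llama; rm -rf /", "out": "/tmp/out"},
)
```

Passing the resulting list to subprocess.run without shell=True keeps the injected "; rm -rf /" inside a single argument. This only constrains parameter values; the template itself remains arbitrary code chosen by whoever writes the definition file.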
[REMOTE_CODE_EXECUTION]: The skill features a 'Custom Interceptor Discovery' mechanism that dynamically loads and executes Python code from user-defined directories or modules.
Evidence in references/adapter-system.md: The discovery configuration allows specifying modules or dirs (e.g., /path/to/custom/interceptors) from which custom executable interceptors are loaded into the evaluation pipeline.
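The risk with directory-based discovery is that importing a Python file runs its top-level code. The sketch below shows a generic standard-library loader with an allowlist guard in front of it. The load_interceptors function and the allowed_dirs parameter are hypothetical and are not taken from the skill's adapter system.

```python
import importlib.util
from pathlib import Path


def load_interceptors(directory: str, allowed_dirs: list[str]) -> dict[str, object]:
    """Import every *.py file in `directory`, but only if it is approved.

    Hypothetical guard: the directory must equal, or sit beneath, one of
    the explicitly approved directories.
    """
    root = Path(directory).resolve()
    allowed = [Path(d).resolve() for d in allowed_dirs]
    if not any(root == a or a in root.parents for a in allowed):
        raise PermissionError(f"{root} is not an approved interceptor directory")

    modules = {}
    for path in sorted(root.glob("*.py")):
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        # exec_module runs the file's top-level statements: this is the
        # point at which untrusted code would execute.
        spec.loader.exec_module(module)
        modules[path.stem] = module
    return modules
```

An allowlist limits where code can be loaded from, not what that code does, so the approved directories still need the same review as any other executable dependency.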
[EXTERNAL_DOWNLOADS]: The skill instructions involve downloading and installing external software and assets.
The instructions reference pip install nemo-evaluator-launcher for the core functionality.
Container images are downloaded from the NVIDIA Container Registry (nvcr.io).
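One way to narrow the download surface is to accept only image references that come from nvcr.io and carry an explicit tag or digest. The following is a rough sketch of such a check. The is_pinned_nvcr_image function and the example image names are hypothetical, and the parsing is deliberately simplified rather than a full image-reference parser.

```python
def is_pinned_nvcr_image(ref: str) -> bool:
    """Return True for nvcr.io references pinned by digest or non-latest tag.

    Simplified, hypothetical check for illustration only.
    """
    registry, _, rest = ref.partition("/")
    if registry != "nvcr.io" or not rest:
        return False
    if "@sha256:" in rest:
        return True  # digest-pinned references are immutable
    _name, sep, tag = rest.rpartition(":")
    # Require an explicit tag that is not the floating "latest".
    return bool(sep) and tag not in ("", "latest") and "/" not in tag


print(is_pinned_nvcr_image("nvcr.io/example-org/example-image:1.2.3"))
print(is_pinned_nvcr_image("nvcr.io/example-org/example-image:latest"))
```

The same idea applies to the Python package: installing a specific version rather than the newest release makes the downloaded artifact reproducible.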
[DATA_EXFILTRATION]: The skill includes a configurable progress_tracking interceptor that transmits evaluation data to an external endpoint.
Evidence in references/adapter-system.md: The progress_tracking_url parameter sends runtime data to a remote server, which could become a data exfiltration channel if the URL points at an untrusted or mistyped host.
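A simple mitigation is to validate the configured URL against an explicit host allowlist before enabling the interceptor. The sketch below assumes a hypothetical is_allowed_tracking_url helper and example host names; it is not part of the skill or of the launcher's configuration handling.

```python
from urllib.parse import urlparse


def is_allowed_tracking_url(url: str, allowed_hosts: set[str]) -> bool:
    """Accept only HTTPS URLs whose host is on the allowlist.

    Hypothetical pre-flight check, run before a progress_tracking_url
    value is handed to the evaluation pipeline.
    """
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    # parsed.hostname is lowercased and excludes any userinfo or port,
    # so "https://trusted@evil.example/" resolves to "evil.example".
    return parsed.hostname in allowed_hosts


ALLOWED = {"tracker.internal.example"}  # example value, an assumption
print(is_allowed_tracking_url("https://tracker.internal.example/progress", ALLOWED))
print(is_allowed_tracking_url("https://evil.example/collect", ALLOWED))
```

Rejecting unknown hosts by default turns a misconfigured URL into a visible failure instead of a silent transfer of evaluation data.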

nemo-evaluator-sdk