nemo-evaluator-sdk
Warn
Audited by Gen Agent Trust Hub on Mar 28, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS, DATA_EXFILTRATION)
Full Analysis
- [COMMAND_EXECUTION]: The skill executes a variety of shell commands through the `nemo-evaluator-launcher` utility, including Docker operations (`docker run`, `docker login`), Slurm job submissions (`sbatch`, `sacct`), and custom benchmarking commands.
  - Evidence in `references/custom-benchmarks.md`: Framework Definition Files (FDF) allow users to define arbitrary shell command templates (e.g., `python -m my_custom_eval.run ...`) that the launcher executes at runtime.
  - Evidence in `references/execution-backends.md`: The Slurm executor uses SSH to execute commands on remote cluster head nodes.
- [REMOTE_CODE_EXECUTION]: The skill features a 'Custom Interceptor Discovery' mechanism that dynamically loads and executes Python code from user-defined directories or modules.
  - Evidence in `references/adapter-system.md`: The `discovery` configuration allows specifying `modules` or `dirs` (e.g., `/path/to/custom/interceptors`) from which custom executable interceptors are loaded into the evaluation pipeline.
- [EXTERNAL_DOWNLOADS]: The skill instructions involve downloading and installing external software and assets.
  - Mentions `pip install nemo-evaluator-launcher` for the core functionality.
  - Downloads container images from the NVIDIA Container Registry (`nvcr.io`).
- [DATA_EXFILTRATION]: The skill includes a configurable `progress_tracking` interceptor that transmits evaluation data to an external endpoint.
  - Evidence in `references/adapter-system.md`: The `progress_tracking_url` parameter allows sending runtime data to a remote server, which could serve as a data transmission channel if misconfigured.
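
The REMOTE_CODE_EXECUTION and DATA_EXFILTRATION findings both depend on user-supplied values: the interceptor discovery `dirs` and the `progress_tracking_url`. A reviewer might check those values against an allowlist before a run. The following is a minimal sketch of such a check, not part of the audited skill or of `nemo-evaluator-launcher`. The function names, `TRUSTED_HOSTS`, and `TRUSTED_ROOTS` are hypothetical, and the sketch takes the values as plain arguments rather than assuming any particular config file layout.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hypothetical policy values (assumptions); replace with your own.
TRUSTED_HOSTS = {"metrics.internal.example.com"}
TRUSTED_ROOTS = [Path("/opt/eval/interceptors")]


def url_is_trusted(url, trusted_hosts=TRUSTED_HOSTS):
    """Return True only for https URLs whose host is explicitly allowlisted."""
    parsed = urlparse(url)
    return parsed.scheme == "https" and parsed.hostname in trusted_hosts


def dir_is_trusted(directory, trusted_roots=TRUSTED_ROOTS):
    """Return True if the resolved directory lies inside a trusted root.

    Resolving first collapses ".." segments and symlinks, so a path cannot
    escape a trusted root by traversal.
    """
    resolved = Path(directory).resolve()
    return any(resolved.is_relative_to(root.resolve()) for root in trusted_roots)


def review(discovery_dirs, progress_tracking_url=None):
    """Collect findings for the two user-controlled settings named in the audit."""
    findings = []
    for d in discovery_dirs:
        if not dir_is_trusted(d):
            findings.append(f"untrusted interceptor dir: {d}")
    if progress_tracking_url and not url_is_trusted(progress_tracking_url):
        findings.append(f"untrusted progress_tracking_url: {progress_tracking_url}")
    return findings


if __name__ == "__main__":
    for finding in review(
        ["/path/to/custom/interceptors"],
        "https://collector.example.net/progress",
    ):
        print(finding)
```

An allowlist is used here rather than a blocklist because both settings accept arbitrary values, so only destinations that were explicitly approved pass the check. `Path.is_relative_to` requires Python 3.9 or later.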
Audit Metadata