nemo-evaluator-sdk

Warn

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: MEDIUM (REMOTE_CODE_EXECUTION, COMMAND_EXECUTION, CREDENTIALS_UNSAFE, PROMPT_INJECTION)
Full Analysis
  • Dynamic Execution (MEDIUM): The adapter system allows discovery of custom interceptors from arbitrary directories and modules via the discovery config in adapter-system.md. This is a form of dynamic loading from computed paths, which can be exploited to execute malicious code if the configuration is manipulated.
  • Command Execution (MEDIUM): The framework definition files (FDF) in custom-benchmarks.md use shell command templates (e.g., python -m {package_name}.run). This allows execution of arbitrary commands, posing a risk if template placeholders are populated from untrusted inputs.
  • Credentials Unsafe (LOW): Documentation in execution-backends.md references the use of sensitive file paths like ~/.ssh/id_rsa for Slurm authentication and environment variable names for secrets (e.g., NGC_API_KEY). This encourages practices that could lead to accidental credential exposure.
  • Indirect Prompt Injection (LOW): The reasoning interceptor in adapter-system.md parses untrusted model responses based on delimiter tokens (<think>). This constitutes an ingestion point for untrusted data into the agent's logic, which could be exploited via injection if parsing logic is flawed.
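
The Command Execution and Indirect Prompt Injection findings above can be illustrated with a minimal Python sketch of two defensive patterns: validating a placeholder value before it is substituted into a command template, and parsing reasoning delimiters so that malformed input falls back to plain text. This is not the SDK's own code. The function names build_command and split_reasoning, the package-name pattern, and the closing </think> token are assumptions made for illustration; only the python -m {package_name}.run template and the <think> delimiter come from the report.

```python
import re
import shlex

# Assumed allow-list: a dotted Python package name and nothing else.
_PACKAGE_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")


def build_command(template: str, package_name: str) -> list[str]:
    """Fill a command template and return an argv list (hypothetical helper)."""
    if not _PACKAGE_RE.fullmatch(package_name):
        raise ValueError(f"rejected package name: {package_name!r}")
    # Split the template into tokens first, then substitute per token, so the
    # value can never add extra arguments or shell metacharacters.
    return [tok.format(package_name=package_name) for tok in shlex.split(template)]


def split_reasoning(
    response: str, start: str = "<think>", end: str = "</think>"
) -> tuple[str, str]:
    """Return (reasoning, answer) from a model response (hypothetical helper).

    If either delimiter is missing, the whole response is treated as the
    answer and no reasoning is extracted.
    """
    head, found_start, rest = response.partition(start)
    if not found_start:
        return "", response
    reasoning, found_end, tail = rest.partition(end)
    if not found_end:
        return "", response
    return reasoning.strip(), (head + tail).strip()


if __name__ == "__main__":
    print(build_command("python -m {package_name}.run", "my_pkg.bench"))
    print(split_reasoning("<think>plan the answer</think>42"))
```

The argv list returned by build_command is meant to be passed to subprocess.run without shell=True, so the operating system receives each argument verbatim rather than a string for a shell to interpret.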
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 17, 2026, 05:42 PM