nemo-evaluator
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFEEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
- EXTERNAL_DOWNLOADS (LOW): The skill installs the 'nemo-evaluator-launcher' package via pip to enable its benchmarking capabilities.
- COMMAND_EXECUTION (LOW): The skill executes 'nemo-evaluator-launcher' commands to run evaluation pipelines, which is a core and expected function.
- PROMPT_INJECTION (LOW): A surface for indirect prompt injection exists because the skill ingests data from external benchmarking harnesses that may contain untrusted instructions. * Ingestion points: Configuration files (config.yaml) and external benchmark datasets. * Boundary markers: No explicit delimiters or ignore instructions are present for separating external data from instructions. * Capability inventory: The skill triggers subprocess execution to run benchmarks. * Sanitization: No sanitization of ingested benchmark content is documented.
Audit Metadata