nemo-curator

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFE
Full Analysis
  • [Prompt Injection] (SAFE): No instructions to override behavior or bypass safety filters were found. The skill is purely technical documentation.
  • [Data Exposure & Exfiltration] (SAFE): No hardcoded secrets or access to sensitive local paths. It follows standard data processing workflows for reading and writing Parquet files.
  • [Obfuscation] (SAFE): No hidden or encoded malicious content detected.
  • [Unverifiable Dependencies & Remote Code Execution] (SAFE): Uses reputable libraries (nemo-curator, dask, rapids) and downloads models from established sources (Hugging Face, NVIDIA).
  • [Indirect Prompt Injection] (LOW): As a data curation tool, it processes untrusted web data. While this is an ingestion surface, the skill is specifically built to mitigate these risks through sanitization modules.
      • Ingestion points: S3 and local files processed via DocumentDataset.read_parquet.
      • Boundary markers: Implicitly handled by the curation logic.
      • Capability inventory: File system writing and distributed execution capabilities.
      • Sanitization: Includes PII redaction, NSFW classifiers, and quality filters to clean external data.
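The sanitization step named in the findings above can be pictured with a minimal sketch. This is plain Python and does not use NeMo Curator's own modules; the regular expressions, the `<EMAIL>`/`<PHONE>` placeholders, the `min_words` threshold, and the function names are all hypothetical, chosen only to show the idea of filtering and redacting untrusted text before it is passed downstream.

```python
import re

# Hypothetical patterns for illustration only; real PII detection
# covers far more cases than these two regular expressions.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text: str) -> str:
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("<EMAIL>", text)
    return PHONE_RE.sub("<PHONE>", text)


def passes_quality(text: str, min_words: int = 5) -> bool:
    """Toy quality filter: keep only documents with enough words."""
    return len(text.split()) >= min_words


def sanitize(docs: list[str]) -> list[str]:
    """Drop low-quality documents, then redact PII in the remainder."""
    return [redact_pii(doc) for doc in docs if passes_quality(doc)]


if __name__ == "__main__":
    raw = [
        "too short",
        "Contact me at jane.doe@example.com or call 555-123-4567 any time.",
    ]
    for doc in sanitize(raw):
        print(doc)
```

In this sketch the quality filter runs first, so documents that would be discarded anyway are never scanned for PII; a real pipeline may order or combine these stages differently.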
Audit Metadata
Risk Level: SAFE
Analyzed: Feb 17, 2026, 06:06 PM