nemo-curator
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFEEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [EXTERNAL_DOWNLOADS] (LOW): The skill downloads and installs Python packages (nemo-curator, cudf, dask, rapids) and pre-trained models from Hugging Face and NVIDIA's registries. While Hugging Face is a trusted source, the use of external code and models at runtime represents a known security surface.
- [PROMPT_INJECTION] (LOW): The skill has an indirect prompt injection surface (Category 8) because it is designed to ingest and process large, untrusted datasets from external sources. • Ingestion points: The skill reads data from 'common_crawl/.parquet' and 's3://large_dataset/.parquet' as described in SKILL.md. • Boundary markers: No specific delimiters or markers are implemented to isolate untrusted data during processing. • Capability inventory: The skill performs data filtering, deduplication, and PII redaction; no high-risk capabilities like shell command execution or system file modification were found. • Sanitization: The skill includes features like PIIRedactor and NSFWClassifier to help filter sensitive or inappropriate content, which mitigates some ingestion risks.
Audit Metadata