nanogpt

Warn

Audited by Snyk on Feb 15, 2026

Risk Level: MEDIUM
Full Analysis

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

  • Third-party content exposure (high risk: 0.90). The skill's data preparation scripts download and tokenize public web datasets: data/openwebtext/prepare.py calls datasets.load_dataset("openwebtext"), data/shakespeare_char/prepare.py fetches the Shakespeare text from raw.githubusercontent.com, and other examples ingest Wikipedia and The-Stack. The workflow therefore ingests untrusted, user-generated third-party content.
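
To make the finding concrete, here is a minimal sketch of the character-level tokenization pattern that a script like data/shakespeare_char/prepare.py is described as performing. It is an illustration under stated assumptions, not the skill's actual code: the function names (build_char_vocab, encode, decode) are hypothetical, and the download step is replaced by an in-memory string so the example runs offline. It shows that every character of the fetched text, whatever it contains, passes into the token stream unfiltered.

```python
# Hypothetical sketch of character-level tokenization over fetched text.
# Names are illustrative; the real prepare.py may be structured differently.

def build_char_vocab(text):
    """Map every distinct character in the text to an integer id."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(text, stoi):
    """Turn a string into a list of integer token ids."""
    return [stoi[ch] for ch in text]

def decode(ids, itos):
    """Turn a list of integer token ids back into a string."""
    return "".join(itos[i] for i in ids)

# Stand-in for text downloaded from a third-party host (assumption:
# in the real workflow this would come from a network fetch).
sample = "To be, or not to be"

stoi, itos = build_char_vocab(sample)
ids = encode(sample, stoi)

# The round trip is lossless: nothing in the source text is filtered out.
restored = decode(ids, itos)
```

Because the vocabulary is built directly from the downloaded text, any content the upstream host serves ends up in the training data as-is, which is the exposure the audit describes.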
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 15, 2026, 09:06 PM