The Agent Skills Directory

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

Risk score: 0.80 (high risk). The training guide explicitly shows ingestion of external, public corpora: the "Training from Python iterator" example in references/training.md calls datasets.load_dataset('wikitext','wikitext-103-raw-v1'), and other examples use input='corpus.txt'. As a result, the agent would read untrusted, user-generated web content as part of its workflow.
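For illustration, the flagged pattern might look roughly like the sketch below. This is not the guide's actual code: the helper names iter_text and train_from_wikitext, the model_prefix and vocab_size values, and the split='train' argument are assumptions made for the example. Only the load_dataset call with 'wikitext' and 'wikitext-103-raw-v1' comes from the finding itself.

```python
from typing import Iterable, Iterator, Mapping


def iter_text(rows: Iterable[Mapping[str, str]], field: str = "text") -> Iterator[str]:
    """Yield non-empty, stripped text lines from dataset-like rows.

    Hypothetical helper: every yielded string is untrusted third-party
    content and should be handled as data, never as instructions.
    """
    for row in rows:
        text = row[field].strip()
        if text:
            yield text


def train_from_wikitext(model_prefix: str = "wikitext_sp", vocab_size: int = 8000) -> None:
    """Train a SentencePiece model from a public corpus (assumed parameters)."""
    # Imported lazily so the helper above can be used without these packages.
    from datasets import load_dataset
    import sentencepiece as spm

    # This call downloads external, publicly editable web text.
    dataset = load_dataset("wikitext", "wikitext-103-raw-v1", split="train")

    # The corpus lines flow into the trainer through a Python iterator.
    spm.SentencePieceTrainer.train(
        sentence_iterator=iter_text(dataset),
        model_prefix=model_prefix,
        vocab_size=vocab_size,
    )
```

The point of the sketch is the data path: text fetched from a public source passes through the agent's workflow, which is the exposure the scanner reports. Calling train_from_wikitext() requires network access and the datasets and sentencepiece packages.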

sentencepiece