The Agent Skills Directory

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

Third-party content exposure detected (high risk: 0.90). The skill's CLI and scripts actively fetch and parse public, user-generated dataset metadata and content from open sources (e.g., search_huggingface uses https://huggingface.co/api/datasets and datasets-server.huggingface.co, search_openml calls https://www.openml.org APIs, search_github invokes the gh CLI on GitHub repos, and search_papers queries Semantic Scholar), and those external descriptions/tags/README/sample rows are used to rank, deduplicate, select configs/splits, and drive pull/detail actions — so untrusted third-party content can materially influence agent decisions and workflow.

dataset-discovery