The Agent Skills Directory

MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).

Third-party content exposure detected (high risk: 0.90). The skill ingests and acts on untrusted, LLM-generated content. It calls Genie through run_genie_query (w.genie.start_conversation) to obtain generated SQL, and it invokes LLM judges through _call_llm_for_scoring (w.serving_endpoints.query). The arbiter and other scorers parse those responses and can trigger material actions, such as auto-updating benchmark YAMLs, so external model output can indirectly inject instructions that affect tool behavior.
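One way to narrow this exposure is to treat judge output strictly as data and gate any automatic update on a well-formed verdict. The sketch below is illustrative only and is not the skill's actual code: the JSON response shape, the verdict labels, and the helper names parse_judge_verdict and may_auto_update are all assumptions.

```python
import json

# Hypothetical verdict labels; the skill's real arbiter vocabulary is not shown here.
ALLOWED_VERDICTS = frozenset({"pass", "fail", "needs_review"})


def parse_judge_verdict(raw_response: str) -> str:
    """Return a verdict from untrusted judge output, or "needs_review" if malformed."""
    try:
        payload = json.loads(raw_response)
    except (TypeError, ValueError):
        # Free-form text (including injected instructions) is never interpreted.
        return "needs_review"
    if not isinstance(payload, dict):
        return "needs_review"
    verdict = payload.get("verdict")
    if not isinstance(verdict, str) or verdict not in ALLOWED_VERDICTS:
        return "needs_review"
    return verdict


def may_auto_update(raw_response: str) -> bool:
    """Allow an automatic benchmark update only on an explicit, well-formed "pass"."""
    return parse_judge_verdict(raw_response) == "pass"
```

With this shape, anything other than an allowlisted verdict falls back to manual review, so text embedded in a model response cannot by itself trigger a benchmark YAML update.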

genie-benchmark-evaluator