genie-benchmark-evaluator
Warn
Audited by Snyk on Mar 8, 2026
Risk Level: MEDIUM
Full Analysis
MEDIUM W011: Third-party content exposure detected (indirect prompt injection risk).
- Third-party content exposure detected (high risk: 0.90). The skill ingests and acts on untrusted, LLM-generated content. It calls Genie through run_genie_query (w.genie.start_conversation) to obtain generated SQL, and it invokes LLM judges on serving endpoints through _call_llm_for_scoring (w.serving_endpoints.query). The arbiter and other scorers parse those responses and can trigger material actions, such as auto-updating benchmark YAMLs, so instructions embedded in external model output can indirectly affect tool behavior.
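One common way to narrow this kind of exposure is to treat judge output as data rather than instructions: parse it into a fixed schema, reject anything outside an allow-list, and gate material actions on explicit approval. The sketch below is illustrative only and is not the skill's code. The names `parse_judge_verdict`, `may_update_benchmark`, `Verdict`, and the verdict labels are all hypothetical, and it assumes the judge is asked to reply with a small JSON object.

```python
import json
from dataclasses import dataclass

# Hypothetical verdict labels; the skill's real arbiter vocabulary is not shown in the audit.
ALLOWED_VERDICTS = {"genie_correct", "ground_truth_correct", "both_correct", "neither_correct"}


@dataclass(frozen=True)
class Verdict:
    label: str
    score: float


def parse_judge_verdict(raw: str) -> Verdict:
    """Parse untrusted judge output into a fixed schema, rejecting anything else."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("judge output is not valid JSON") from exc
    # Exactly two keys: extra fields (for example embedded instructions) are refused.
    if not isinstance(data, dict) or set(data) != {"label", "score"}:
        raise ValueError("judge output must have exactly the keys 'label' and 'score'")
    label, score = data["label"], data["score"]
    if not isinstance(label, str) or label not in ALLOWED_VERDICTS:
        raise ValueError(f"unexpected label: {label!r}")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("score must be a number")
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be in [0, 1]")
    return Verdict(label=label, score=float(score))


def may_update_benchmark(verdict: Verdict, human_approved: bool) -> bool:
    """Allow a benchmark YAML change only with an allow-listed verdict and explicit approval."""
    return human_approved and verdict.label == "genie_correct"
```

With this shape, free-text model output never reaches the code path that writes benchmark YAMLs. Only a validated label and score do, and the write still requires a separate approval flag.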
Audit Metadata