genie-benchmark-generator

Warn

Audited by Gen Agent Trust Hub on Mar 8, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes SQL queries via spark.sql() during ground truth validation. This expected_sql can be directly provided by users or generated by an AI based on user questions. This creates a vector for SQL injection where malicious queries could be executed with the permissions of the Spark session.\n
  • Evidence: Found in scripts/benchmark_generator.py and references/gt-validation.md within the validate_ground_truth_sql function logic.\n- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection as it processes untrusted user benchmark questions to drive SQL generation and benchmark suite creation.\n
  • Ingestion points: user_questions parameter in SKILL.md and the interactive intake prompt in references/benchmark-intake-workflow.md.\n
  • Boundary markers: No specific boundary markers or 'ignore' instructions are documented for wrapping user input before LLM processing.\n
  • Capability inventory: spark.sql() execution and mlflow.genai.datasets.create_dataset for dataset synchronization.\n
  • Sanitization: No evidence of SQL sanitization or prompt filtering for user-provided strings.\n- [EXTERNAL_DOWNLOADS]: The skill depends on databricks-sdk, mlflow, and mlflow-genai-evaluation.\n
  • These are official packages within the Databricks and MLflow ecosystems and are recognized as well-known services from a trusted vendor context.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 8, 2026, 02:33 AM