genie-benchmark-evaluator
Pass
Audited by Gen Agent Trust Hub on Mar 8, 2026
Risk Level: SAFE
COMMAND_EXECUTION, EXTERNAL_DOWNLOADS
Full Analysis
- [COMMAND_EXECUTION]: Executes SQL queries generated by the Genie API and ground truth SQL defined in workspace YAML files via `spark.sql()`. This is the core functionality required to compare result sets and validate syntax.
- [COMMAND_EXECUTION]: Orchestrates Databricks Jobs and polls for status using `subprocess.run` to invoke the official Databricks CLI. This is a standard automation pattern for the platform.
- [EXTERNAL_DOWNLOADS]: Utilizes well-known and trusted libraries including `mlflow`, `databricks-sdk`, and `pyyaml`. No unverified third-party packages or remote scripts are downloaded or executed.
- [PROMPT_INJECTION]: Employs structured evaluation prompts managed via an MLflow Prompt Registry. These prompts are designed for scoring and do not contain directives to bypass AI safety guardrails.
- [DATA_EXFILTRATION]: Network activity is restricted to authenticated Databricks services, including MLflow tracking and Model Serving endpoints for judge execution. No data is sent to untrusted external domains.
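For readers unfamiliar with the job-polling pattern noted under COMMAND_EXECUTION, the following is a minimal sketch of how such a loop might look. It is not the skill's actual code. The function names, the terminal-state set, the polling interval, and the exact CLI invocation (`databricks jobs get-run <run_id> --output json`) are assumptions made for illustration. The command is passed as an argument list without `shell=True`, so the run ID is never interpreted by a shell.

```python
import json
import subprocess
import time

# Assumed set of lifecycle states that mean the run has finished.
TERMINAL_STATES = {"TERMINATED", "SKIPPED", "INTERNAL_ERROR"}


def get_run_state(run_id, run_cmd=subprocess.run):
    """Return the `state` object of a job run as a dict.

    Assumes the CLI prints the run description as JSON on stdout.
    `run_cmd` is injectable so the function can be exercised without a CLI.
    """
    proc = run_cmd(
        ["databricks", "jobs", "get-run", str(run_id), "--output", "json"],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(proc.stdout).get("state", {})


def wait_for_run(run_id, poll_seconds=30.0, timeout_seconds=3600.0,
                 run_cmd=subprocess.run, sleep=time.sleep):
    """Poll until the run reaches a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        state = get_run_state(run_id, run_cmd)
        if state.get("life_cycle_state") in TERMINAL_STATES:
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"run {run_id} did not finish in time")
        sleep(poll_seconds)
```

A caller would typically inspect `result_state` in the returned dict to distinguish success from failure before reading evaluation output.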
Audit Metadata