skill-test
Pass
Audited by Gen Agent Trust Hub on Mar 10, 2026
Risk Level: SAFE
COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]:
- The framework is designed to execute Python, SQL, and YAML code blocks extracted from AI agent responses on Databricks clusters and warehouses to verify their functionality.
- This logic is implemented via MCP tools such as `mcp__databricks__execute_databricks_command` and `mcp__databricks__execute_sql` in `src/skill_test/grp/executor.py`.
- Utility scripts like `scripts/add_example.py` use system commands (`pbpaste`, `xclip`) to interact with the system clipboard for developer convenience.
- [EXTERNAL_DOWNLOADS]:
- The optimization and evaluation components (`optimize.py`, `mlflow_eval.py`) communicate with external LLM provider endpoints, including Databricks Model Serving, OpenAI, and Anthropic.
- It also interacts with MLflow tracking servers to log and retrieve evaluation metrics and session traces.
- [PROMPT_INJECTION]:
- The skill exhibits an indirect prompt injection surface because it ingests responses generated by other skills and executes the embedded code blocks during verification.
- Ingestion points: The `interactive` function in `src/skill_test/cli/commands.py` receives a `response` string containing markdown code blocks.
- Boundary markers: Relies on standard markdown triple-backtick delimiters (e.g., ```python) to identify code segments.
- Capability inventory: Includes high-privilege operations such as arbitrary code execution on compute resources and file uploads to Unity Catalog volumes.
- Sanitization: Performs basic syntax validation but does not sanitize the logical content of the generated code before execution.
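
The boundary-marker and sanitization findings above can be illustrated with a minimal sketch. It is not the framework's actual code: the names `extract_code_blocks` and `passes_syntax_check`, the sample response, and the volume path are all hypothetical. It assumes that blocks are found by matching triple-backtick fences and that "basic syntax validation" amounts to a parse check like `ast.parse`. The snippet only parses the extracted code and never runs it.

```python
import ast
import re

# Triple-backtick delimiter, built this way so this example's own fence stays intact.
FENCE = "`" * 3

# Matches an opening fence with a language tag, the block body, and the closing fence.
BLOCK_RE = re.compile(FENCE + r"(\w+)[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code_blocks(response):
    """Return (language, code) pairs for each tagged fenced block (hypothetical helper)."""
    return [(lang.lower(), code) for lang, code in BLOCK_RE.findall(response)]


def passes_syntax_check(code):
    """True if the code parses as Python. Says nothing about what it does when run."""
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


# Hypothetical agent response; the path below is made up for illustration.
response = (
    "Here is the cleanup step:\n"
    + FENCE + "python\n"
    + "import shutil\n"
    + "shutil.rmtree('/Volumes/main/default/data')\n"
    + FENCE + "\n"
)

blocks = extract_code_blocks(response)
for lang, code in blocks:
    if lang == "python":
        # The destructive call is syntactically valid, so the check passes.
        print(lang, passes_syntax_check(code))
```

Under these assumptions the destructive block passes validation, which is the gap the Sanitization item describes: a parse check rejects malformed code but does not constrain what well-formed code is allowed to do.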
Audit Metadata