databricks-synthetic-data-gen

Pass

Audited by Gen Agent Trust Hub on Mar 5, 2026

Risk Level: SAFE | COMMAND_EXECUTION | EXTERNAL_DOWNLOADS | PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill provides instructions to install dependencies and execute Python scripts using uv or pip. This is standard behavior for developer-oriented tools.
  • [EXTERNAL_DOWNLOADS]: The skill references several standard Python libraries (faker, numpy, pandas, holidays, databricks-connect) to be installed from the official Python Package Index (PyPI). Because PyPI is a well-known and trusted service, these downloads are recorded here without being treated as a risk.
  • [PROMPT_INJECTION]: The skill's code generation templates expose a surface for indirect prompt injection. User-supplied variables such as catalog and schema names are interpolated into SQL commands within the generated scripts without validation or sanitization.
      • Ingestion points: The variables CATALOG and SCHEMA in scripts/generate_synthetic_data.py are intended to be replaced by user input.
      • Boundary markers: Absent; user input is directly embedded into the code.
      • Capability inventory: The resulting scripts have the capability to execute arbitrary SQL commands via spark.sql() and perform file system operations via Spark's write APIs.
      • Sanitization: No explicit sanitization or validation logic for SQL identifiers is implemented in the provided script templates.
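
The missing identifier validation noted above could be addressed along the following lines. This is a minimal sketch, not code from the audited skill: the helper names validate_identifier and build_create_schema_sql are hypothetical, and the allowlist (ASCII letters, digits, and underscores, not starting with a digit) is a deliberately conservative assumption that is stricter than what Databricks itself accepts for catalog and schema names.

```python
import re

# Hypothetical allowlist (an assumption, not taken from the audited skill):
# ASCII letters, digits, and underscores, not starting with a digit.
_IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def validate_identifier(name: str, max_length: int = 255) -> str:
    """Return name unchanged if it matches the allowlist, else raise ValueError."""
    if (
        not isinstance(name, str)
        or len(name) > max_length
        or not _IDENTIFIER_RE.fullmatch(name)
    ):
        raise ValueError(f"Unsafe SQL identifier: {name!r}")
    return name


def build_create_schema_sql(catalog: str, schema: str) -> str:
    """Build a CREATE SCHEMA statement from validated identifiers only."""
    catalog = validate_identifier(catalog)
    schema = validate_identifier(schema)
    # Backticks are safe here because the allowlist excludes backticks.
    return f"CREATE SCHEMA IF NOT EXISTS `{catalog}`.`{schema}`"


if __name__ == "__main__":
    # The resulting string would be passed to spark.sql() in a generated script.
    print(build_create_schema_sql("main", "synthetic_demo"))
```

Validating CATALOG and SCHEMA once, before any value reaches spark.sql(), keeps the check in a single place; a name such as "demo; DROP SCHEMA prod" is rejected with a ValueError instead of being interpolated into the statement.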
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 5, 2026, 11:06 PM