The Agent Skills Directory

[COMMAND_EXECUTION]: The skill instructs the agent to write Python code to local files and execute those files on a Databricks cluster using the run_python_file_on_databricks tool. This is the skill's core functionality and is used for managing Spark sessions, creating infrastructure, and processing data.
[EXTERNAL_DOWNLOADS]: The skill recommends installing the faker and holidays libraries from the Python Package Index (PyPI). These are well-known and trusted packages used to generate realistic synthetic data components like names and dates.
[PROMPT_INJECTION]: The skill exposes an indirect prompt injection surface (Category 8). Evidence: 1. Ingestion points: User-provided schema and catalog names. 2. Boundary markers: Absent. 3. Capability inventory: Execution of arbitrary SQL via spark.sql(), file writing to Volumes, and package installation. 4. Sanitization: Absent. The logic interpolates user input directly into SQL strings (e.g., spark.sql(f"CREATE SCHEMA IF NOT EXISTS {CATALOG}.{SCHEMA}")), so a maliciously crafted catalog or schema name could cause unauthorized SQL operations to be executed.
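
One possible mitigation for the missing sanitization is to validate catalog and schema names against a strict allowlist before they reach an f-string. The sketch below is illustrative only and is not part of the skill: the helper names safe_identifier and create_schema_sql are hypothetical, and the allowlist (letters, digits, underscores) is an assumption that is deliberately narrower than what Databricks may accept. It only builds the SQL string, so it runs without a Spark session.

```python
import re

# Assumed conservative allowlist: letters, digits, underscores,
# and the name must not start with a digit.
_IDENTIFIER_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def safe_identifier(name: str) -> str:
    """Return the name wrapped in backticks, or raise ValueError.

    Hypothetical helper: rejects anything that is not a plain identifier,
    so quotes, semicolons, spaces, and backticks never reach the SQL text.
    """
    if not isinstance(name, str) or not _IDENTIFIER_RE.fullmatch(name):
        raise ValueError(f"unsafe SQL identifier: {name!r}")
    return f"`{name}`"


def create_schema_sql(catalog: str, schema: str) -> str:
    """Build the CREATE SCHEMA statement from validated identifiers."""
    return (
        "CREATE SCHEMA IF NOT EXISTS "
        f"{safe_identifier(catalog)}.{safe_identifier(schema)}"
    )
```

With this in place, the flagged call would become spark.sql(create_schema_sql(CATALOG, SCHEMA)), and an input such as "x; DROP SCHEMA prod" raises ValueError instead of being interpolated.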

databricks-synthetic-data-generation