databricks-data-handling
Warn
Audited by Gen Agent Trust Hub on Mar 12, 2026
Risk Level: MEDIUM
Tags: COMMAND_EXECUTION, PROMPT_INJECTION, DATA_EXFILTRATION, EXTERNAL_DOWNLOADS
Full Analysis
- [COMMAND_EXECUTION]: The skill uses Python f-strings to dynamically construct Spark SQL queries in several locations, including `GDPRHandler.process_deletion_request`, `DataRetentionManager.apply_retention_policies`, and `generate_sar_report`. This pattern is susceptible to SQL injection if inputs such as `user_id`, `user_column`, or `table_name` are not properly sanitized.
- [COMMAND_EXECUTION]: The skill requests `Bash(databricks:*)` permissions, allowing the agent to execute any Databricks CLI command, which represents a high level of privilege within the target environment.
- [PROMPT_INJECTION]: The skill exposes an indirect prompt-injection surface: it ingests untrusted data (user-provided identifiers and table metadata) and interpolates it into executable SQL commands without sanitization or boundary markers.
  - Ingestion points: user-provided `user_id` in `process_deletion_request` and `generate_sar_report`; table and column tags fetched from `system.information_schema` in SKILL.md.
  - Boundary markers: none identified. SQL queries are constructed via direct string interpolation.
  - Capability inventory: uses `spark.sql()` to execute DDL and DML operations (DELETE, INSERT, ALTER) and `Bash(databricks:*)` for environment management.
  - Sanitization: no evidence of input validation, escaping, or parameterized queries for SQL execution.
- [DATA_EXFILTRATION]: The `generate_sar_report` function in the examples extracts all data associated with a user across every PII-tagged table, converting it into a local Pandas DataFrame/dictionary. While intended for GDPR compliance, this provides a mechanism for significant data exposure if misused.
- [EXTERNAL_DOWNLOADS]: The skill references official Databricks documentation for Delta Lake privacy and Unity Catalog security. These are well-known, trusted sources used for configuration guidance.
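The injection findings above suggest a concrete remediation pattern: pass user-supplied *values* through named parameter markers (supported by `spark.sql(query, args=...)` in PySpark 3.4+ and on Databricks) and validate *identifiers* against a strict allowlist pattern, since parameter markers cannot bind table or column names. A minimal sketch, assuming that API; `build_deletion_query` and `SAFE_IDENT` are hypothetical names, not part of the audited skill:

```python
import re

# Hypothetical helper showing how process_deletion_request could avoid raw
# f-string interpolation. Identifiers must match a conservative pattern;
# the user-supplied value never becomes part of the SQL text.
SAFE_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*")

def build_deletion_query(table_name: str, user_column: str, user_id: str):
    """Return (sql, args) for a parameterized DELETE, rejecting unsafe identifiers."""
    for ident in (table_name, user_column):
        if not SAFE_IDENT.fullmatch(ident):
            raise ValueError(f"unsafe SQL identifier: {ident!r}")
    # :user_id is a named parameter marker, bound separately from the SQL text.
    sql = f"DELETE FROM {table_name} WHERE {user_column} = :user_id"
    return sql, {"user_id": user_id}

# On Databricks this would be executed as: spark.sql(sql, args=args)
sql, args = build_deletion_query("main.crm.users", "user_id", "x'; DROP TABLE t--")
```

Because the hostile `user_id` value lives only in `args`, the generated SQL string stays fixed, and a malicious `table_name` such as `"users; DROP TABLE x"` is rejected before any query is built.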
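The data-exfiltration finding can likewise be narrowed rather than eliminated: a SAR export is legitimate, but it should be bounded and leave a trail. A minimal sketch of such a guard, assuming it wraps the point where `generate_sar_report` collects rows to the driver; `collect_sar_rows`, `MAX_SAR_ROWS`, and `AUDIT_LOG` are hypothetical names:

```python
from datetime import datetime, timezone

# Hypothetical guard around the SAR extraction path: cap the number of rows
# pulled locally and record who extracted what, so a misused export fails
# loudly and leaves an audit entry instead of silently copying data.
MAX_SAR_ROWS = 10_000
AUDIT_LOG: list[dict] = []

def collect_sar_rows(rows: list[dict], table: str, requester: str) -> list[dict]:
    if len(rows) > MAX_SAR_ROWS:
        raise RuntimeError(f"SAR export from {table} exceeds {MAX_SAR_ROWS} rows")
    AUDIT_LOG.append({
        "table": table,
        "requester": requester,
        "row_count": len(rows),
        "at": datetime.now(timezone.utc).isoformat(),
    })
    return rows

out = collect_sar_rows([{"user_id": "u-1"}], "main.crm.users", "dpo@example.com")
```

In a real deployment the audit entry would go to a durable sink (e.g. a Delta table) rather than an in-memory list; the point is that every extraction is counted and attributed.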
Audit Metadata