spark-structured-streaming
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
- Indirect Prompt Injection (HIGH): A significant vulnerability surface exists where the agent is instructed to process untrusted external data while possessing destructive system capabilities.
  - Ingestion points: External data enters the agent's logic via `spark.readStream` calls targeting Kafka topics and cloud storage (`cloudFiles`), as documented in `SKILL.md` and `streaming-best-practices.md`.
  - Boundary markers: Code templates lack delimiters or specific instructions to disregard embedded commands within the streamed data.
  - Capability inventory: Destructive and sensitive operations include `dbutils.fs.rm` (file deletion), `dbutils.fs.cp` (file copying), `dbutils.fs.head` (data reading), and `writeStream.start` (process execution), located in `checkpoint-best-practices.md`.
  - Sanitization: Helper functions such as `get_checkpoint_location` in `checkpoint-best-practices.md` do not include path validation, which could be exploited for path traversal if an attacker influences the table name variable.
- Command Execution (HIGH): The skill facilitates the use of powerful administrative commands. Specifically, the template for recovering from corrupted checkpoints in `checkpoint-best-practices.md` uses `dbutils.fs.rm(recurse=True)`, a high-impact operation that can cause unintended data loss if triggered by a malicious prompt injection.
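The Sanitization and Command Execution findings both come down to unvalidated paths reaching a destructive call. As a rough illustration only, the sketch below shows one way a checkpoint helper could validate its input and how a recursive delete could be gated. The report does not show the skill's actual `get_checkpoint_location` signature, so the signature here, the `CHECKPOINT_ROOT` value, and the `is_safe_to_delete` helper are all assumptions made for the example.

```python
import posixpath
import re

# Hypothetical base directory; the audited skill's real location is not shown in the report.
CHECKPOINT_ROOT = "/Volumes/main/streaming/checkpoints"

# Accept only one-, two-, or three-part table names (table, schema.table, catalog.schema.table).
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}")


def get_checkpoint_location(table_name: str, root: str = CHECKPOINT_ROOT) -> str:
    """Return a checkpoint path under root, rejecting unsafe table names."""
    if not _TABLE_NAME.fullmatch(table_name):
        raise ValueError(f"unsafe table name: {table_name!r}")
    path = posixpath.normpath(posixpath.join(root, table_name.replace(".", "_")))
    # Defense in depth: the normalized path must still sit below root.
    if not path.startswith(root.rstrip("/") + "/"):
        raise ValueError(f"path escapes checkpoint root: {path!r}")
    return path


def is_safe_to_delete(path: str, root: str = CHECKPOINT_ROOT) -> bool:
    """True only for normalized paths strictly below root (never root itself)."""
    normalized = posixpath.normpath(path)
    return normalized.startswith(root.rstrip("/") + "/")
```

Under these assumptions, a recovery template would call `dbutils.fs.rm(path, recurse=True)` only after `is_safe_to_delete(path)` returns true, so a table name or path influenced by streamed data cannot point the delete at the checkpoint root or outside it.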
Recommendations
- AI detected serious security threats
Audit Metadata