spark-structured-streaming

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH (PROMPT_INJECTION, COMMAND_EXECUTION)
Full Analysis
  • Indirect Prompt Injection (HIGH): The skill instructs the agent to process untrusted external data while it holds destructive system capabilities, which creates a significant attack surface.
      • Ingestion points: External data enters the agent's logic through spark.readStream calls that target Kafka topics and cloud storage (cloudFiles), as documented in SKILL.md and streaming-best-practices.md.
      • Boundary markers: The code templates contain no delimiters around streamed data and no instructions to disregard commands embedded in it.
      • Capability inventory: Destructive and sensitive operations include dbutils.fs.rm (file deletion), dbutils.fs.cp (file copying), dbutils.fs.head (data reading), and writeStream.start (starting a streaming query), all in checkpoint-best-practices.md.
      • Sanitization: Helper functions such as get_checkpoint_location in checkpoint-best-practices.md perform no path validation, so an attacker who can influence the table name variable could exploit them for path traversal.
  • Command Execution (HIGH): The skill's templates use powerful administrative commands. In particular, the template for recovering from corrupted checkpoints in checkpoint-best-practices.md calls dbutils.fs.rm(recurse=True), a high-impact operation that can cause unintended data loss if a prompt injection triggers it.
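The sanitization and command-execution findings above could be mitigated along the following lines. This is a minimal sketch, not the skill's actual code: CHECKPOINT_ROOT, the table-name policy, and the helper names safe_checkpoint_location, is_under_root, and remove_checkpoint are all hypothetical, and the report does not show the real get_checkpoint_location. It assumes plain absolute paths without a URI scheme such as s3://.

```python
import posixpath
import re

# Hypothetical checkpoint root; the audited skill's real layout is not shown in the report.
CHECKPOINT_ROOT = "/Volumes/main/streaming/checkpoints"

# Assumed policy: one to three dot-separated identifiers (catalog.schema.table).
_TABLE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*){0,2}")


def safe_checkpoint_location(table_name: str, root: str = CHECKPOINT_ROOT) -> str:
    """Build a checkpoint path, rejecting names that could escape the root."""
    if not _TABLE_NAME.fullmatch(table_name):
        raise ValueError(f"Rejected table name: {table_name!r}")
    return posixpath.join(root, table_name.replace(".", "/"))


def is_under_root(path: str, root: str = CHECKPOINT_ROOT) -> bool:
    """Return True only if the normalized path lies strictly below the root."""
    normalized = posixpath.normpath(path)
    return normalized.startswith(posixpath.normpath(root) + "/")


def remove_checkpoint(path: str, delete_fn, root: str = CHECKPOINT_ROOT) -> str:
    """Recursively delete a checkpoint directory, but only inside the root.

    delete_fn is injected so the guard can be tested without a cluster;
    on Databricks it would be dbutils.fs.rm, called as rm(path, True).
    """
    if not is_under_root(path, root):
        raise ValueError(f"Refusing to delete outside checkpoint root: {path!r}")
    normalized = posixpath.normpath(path)
    delete_fn(normalized, True)
    return normalized
```

On a cluster, the guarded call would look like remove_checkpoint(safe_checkpoint_location("main.sales.orders"), dbutils.fs.rm). Because the table name is validated before it reaches the path and the delete is confined to one root, a value taken from streamed data cannot redirect the recursive delete elsewhere.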
Recommendations
  • AI detected serious security threats
Audit Metadata
  Risk Level: HIGH
  Analyzed: Feb 16, 2026, 04:19 PM