spark-python-data-source

Pass

Audited by Gen Agent Trust Hub on Feb 27, 2026

Risk Level: SAFE
Findings: PROMPT_INJECTION, COMMAND_EXECUTION, EXTERNAL_DOWNLOADS, CREDENTIALS_UNSAFE
Full Analysis
  • [PROMPT_INJECTION]: The skill facilitates reading from external APIs and databases, creating an attack surface for indirect prompt injection via the ingested data payloads.
  • Ingestion points: Data is ingested through network requests in the read methods of the DataSourceReader and DataSourceStreamReader components.
  • Boundary markers: There are no explicit boundary markers around the data payloads, since the skill is designed for structured data integration rather than direct LLM interaction.
  • Capability inventory: The skill provides templates for network operations via the requests library, file system writes for error tracking in references/error-handling.md, and subprocess calls via the poetry build tool.
  • Sanitization: The implementation patterns in references/production-patterns.md include validation logic using regular expressions to verify hostnames, IP addresses, and SQL identifiers.
  • [COMMAND_EXECUTION]: The instructions include development commands such as poetry run pytest and poetry run ruff. The templates also provide patterns for file system interaction, including directory creation and local file writes for dead letter queue management.
  • [EXTERNAL_DOWNLOADS]: The documentation references standard Python libraries such as pyspark, azure-identity, and requests. It also provides links to several external GitHub repositories containing reference implementations for Spark connectors.
  • [CREDENTIALS_UNSAFE]: The skill includes patterns for handling sensitive authentication data such as API keys and client secrets. It provides specific remediation advice, such as implementing redaction for logs and using managed secret storage like Databricks Secrets to avoid hardcoding sensitive information.
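The validation logic noted in the sanitization finding can be sketched as follows. This is a minimal illustration of regex-based checks for hostnames, IP addresses, and SQL identifiers; the function names and exact patterns are assumptions, not the contents of references/production-patterns.md.

```python
import ipaddress
import re

# Illustrative patterns only; the skill's actual rules may differ.
_HOSTNAME_RE = re.compile(
    r"^(?=.{1,253}$)(?!-)[A-Za-z0-9-]{1,63}(?<!-)"
    r"(\.(?!-)[A-Za-z0-9-]{1,63}(?<!-))*$"
)
_SQL_IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]{0,127}$")


def is_valid_hostname(host: str) -> bool:
    """Accept RFC-1123-style hostnames: dot-separated labels, no
    leading/trailing hyphens, 253 characters overall."""
    return bool(_HOSTNAME_RE.match(host))


def is_valid_ip(addr: str) -> bool:
    """Accept IPv4 or IPv6 literals via the stdlib parser."""
    try:
        ipaddress.ip_address(addr)
        return True
    except ValueError:
        return False


def is_valid_sql_identifier(name: str) -> bool:
    """Reject anything that is not a plain unquoted identifier,
    blocking injection via table or column names."""
    return bool(_SQL_IDENTIFIER_RE.match(name))
```

Allow-list validation of this kind rejects malformed input outright rather than attempting to escape it, which is the safer default for identifiers that end up interpolated into SQL.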
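The local file writes flagged under the command-execution finding follow a dead-letter-queue pattern: records that fail processing are persisted to disk so the batch can continue. A hedged sketch of that pattern, assuming a simple one-file-per-record JSON layout (the actual layout in references/error-handling.md may differ):

```python
import json
import time
from pathlib import Path


def write_to_dead_letter_queue(
    record: dict, error: str, dlq_dir: str = "/tmp/dlq"
) -> Path:
    """Persist a failed record locally instead of aborting the batch.

    Hypothetical helper; directory creation and local writes are the
    file-system operations noted in the audit.
    """
    path = Path(dlq_dir)
    path.mkdir(parents=True, exist_ok=True)  # create the DLQ directory on first use
    out = path / f"failed-{int(time.time() * 1000)}.json"
    out.write_text(json.dumps({"record": record, "error": error}))
    return out
```

Keeping the failed payload alongside its error message lets an operator replay or inspect records after the run, rather than losing them to a log line.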
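The log-redaction remediation in the credentials finding can be implemented with a standard logging.Filter. This is a minimal sketch assuming key=value-style secrets in log messages; the pattern list is illustrative, not the skill's actual redaction rules.

```python
import logging
import re


class SecretRedactingFilter(logging.Filter):
    """Mask values that look like API keys, client secrets, or tokens
    before the record reaches any handler."""

    # Illustrative pattern; extend for the credential formats you log.
    _PATTERN = re.compile(
        r"(?i)\b(api[_-]?key|client[_-]?secret|token)\s*[=:]\s*\S+"
    )

    def filter(self, record: logging.LogRecord) -> bool:
        # Rewrite the rendered message in place, keeping the record.
        record.msg = self._PATTERN.sub(r"\1=[REDACTED]", record.getMessage())
        record.args = ()
        return True
```

Attaching the filter with logger.addFilter(SecretRedactingFilter()) masks secrets at the source, which complements (but does not replace) keeping credentials in managed secret storage such as Databricks Secrets.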
Audit Metadata
Risk Level: SAFE
Analyzed: Feb 27, 2026, 07:55 PM