spark-python-data-source
Pass
Audited by Gen Agent Trust Hub on Feb 27, 2026
Risk Level: SAFE
PROMPT_INJECTION · COMMAND_EXECUTION · EXTERNAL_DOWNLOADS · CREDENTIALS_UNSAFE
Full Analysis
- [PROMPT_INJECTION]: The skill facilitates reading from external APIs and databases, creating an attack surface for indirect prompt injection via the ingested data payloads.
  - Ingestion points: Data is ingested through network requests in the `read` methods of the `DataSourceReader` and `DataSourceStreamReader` components.
  - Boundary markers: There are no explicit boundary markers for the data payloads, as the skill is designed for structured data integration rather than direct LLM interaction.
  - Capability inventory: The skill provides templates for network operations via the `requests` library, file system writes for error tracking in `references/error-handling.md`, and subprocess calls via the `poetry` build tool.
  - Sanitization: The implementation patterns in `references/production-patterns.md` include validation logic using regular expressions to verify hostnames, IP addresses, and SQL identifiers.
- [COMMAND_EXECUTION]: The instructions include development commands such as `poetry run pytest` and `poetry run ruff`. The templates also provide patterns for file system interaction, including directory creation and local file writes for dead-letter-queue management.
- [EXTERNAL_DOWNLOADS]: The documentation references standard Python libraries such as `pyspark`, `azure-identity`, and `requests`. It also links to several external GitHub repositories containing reference implementations for Spark connectors.
- [CREDENTIALS_UNSAFE]: The skill includes patterns for handling sensitive authentication data such as API keys and client secrets. It provides specific remediation advice, such as redacting secrets from logs and using managed secret storage like Databricks Secrets instead of hardcoding sensitive values.
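The ingestion surface flagged above is the partitioned `read` path of a Python data source. A minimal sketch of that shape, with the caveat that the real skill subclasses `pyspark.sql.datasource.DataSourceReader` — here the pyspark dependency is omitted and the network call is injected, so everything (class and record layout included) is illustrative:

```python
import json
from typing import Callable, Iterator, Tuple

class ApiBatchReader:
    """Hypothetical stand-in for a DataSourceReader over a paged HTTP API."""

    def __init__(self, fetch_page: Callable[[int], str]):
        # fetch_page is the injected network call (in the real skill, a
        # requests.get wrapper); injecting it keeps the sketch offline.
        self._fetch_page = fetch_page

    def read(self, partition: int) -> Iterator[Tuple[str, int]]:
        # Each partition fetches one page and yields rows as tuples,
        # mirroring how DataSourceReader.read(partition) emits records.
        payload = json.loads(self._fetch_page(partition))
        for record in payload["items"]:
            yield (record["name"], record["value"])

# Offline usage with a stubbed page; this payload is where untrusted,
# potentially injected content enters the pipeline.
stub = lambda page: json.dumps({"items": [{"name": "a", "value": page}]})
rows = list(ApiBatchReader(stub).read(partition=7))  # [("a", 7)]
```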
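The sanitization bullet describes regex validation of hostnames, IP addresses, and SQL identifiers. A sketch in that style, with the caveat that the exact helpers in `references/production-patterns.md` may differ:

```python
import ipaddress
import re

# RFC-952/1123-style hostname: dot-separated alphanumeric labels,
# hyphens allowed internally, 253 chars total.
HOSTNAME_RE = re.compile(
    r"^(?=.{1,253}$)([a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)*"
    r"[a-zA-Z0-9]([a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?$"
)
# Conservative SQL identifier: letter/underscore start, no quoting chars.
SQL_IDENTIFIER_RE = re.compile(r"^[A-Za-z_][A-Za-z0-9_]{0,127}$")

def is_valid_host(value: str) -> bool:
    # Accept a syntactically valid hostname or a literal IP address.
    if HOSTNAME_RE.match(value):
        return True
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False

def is_valid_sql_identifier(value: str) -> bool:
    # Reject anything that could smuggle quoting or statement separators.
    return bool(SQL_IDENTIFIER_RE.match(value))
```

Allow-list validation like this is preferable to escaping when identifiers come from ingested (and therefore untrusted) payloads.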
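The COMMAND_EXECUTION finding mentions directory creation and local file writes for dead-letter-queue management. A minimal sketch of that pattern — the function name, file layout, and entry schema are assumptions, not the skill's actual API:

```python
import json
import time
from pathlib import Path

def write_to_dlq(dlq_dir: Path, record: dict, error: Exception) -> Path:
    """Append a failed record to a local JSONL dead-letter file."""
    dlq_dir.mkdir(parents=True, exist_ok=True)  # the directory creation the audit notes
    path = dlq_dir / "failed_records.jsonl"
    entry = {"ts": time.time(), "error": str(error), "record": record}
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
    return path
```

Routing malformed rows to a side file instead of raising lets the read continue; the trade-off is an audited local write capability.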
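The CREDENTIALS_UNSAFE remediation (redacting secrets from logs) can be sketched as a stdlib `logging.Filter`; the key names and regex are illustrative assumptions, not the skill's actual pattern:

```python
import logging
import re

# Matches 'api_key=...', 'client_secret: ...', 'token=...' style pairs.
SECRET_PATTERN = re.compile(
    r"(api[_-]?key|client[_-]?secret|token)\s*[=:]\s*\S+", re.IGNORECASE
)

class RedactSecretsFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # Mask the secret value but keep the key name for debuggability.
        record.msg = SECRET_PATTERN.sub(r"\1=***REDACTED***", str(record.msg))
        return True  # keep the record, just with secrets masked
```

Attach it with `logger.addFilter(RedactSecretsFilter())`. Redaction is a backstop only; as the audit notes, managed secret storage such as Databricks Secrets should keep credentials out of code and logs in the first place.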
Audit Metadata