The Agent Skills Directory

[PROMPT_INJECTION]: The skill is designed to ingest and process data from external sources such as REST APIs and various file formats including Parquet, CSV, and JSON. This creates a surface for indirect prompt injection if the processed data contains malicious instructions intended to influence the agent's logic or downstream operations.
Ingestion points: External data enters the agent context via references/extraction-patterns.md (API responses using httpx), references/polars-patterns.md (scan_parquet), and references/pyspark-patterns.md (spark.read.parquet).
Boundary markers: The skill currently lacks explicit boundary markers or instructions for the agent to disregard instructions embedded within the ingested data.
Capability inventory: The skill provides patterns for network operations via httpx, extensive file system access through Polars, Pandas, and PySpark, and access to environment variables via os.environ.
Sanitization: While the skill emphasizes rigorous data validation using Pydantic, Pandera, and Great Expectations to ensure schema adherence and data quality, these methods do not specifically sanitize content against prompt injection attacks targeting the LLM.

python-data-engineering