The Agent Skills Directory

[PROMPT_INJECTION] (MEDIUM): The skill's core functionality of evaluating agents against test cases creates an indirect prompt injection surface. (1) Ingestion points: The skill ingests untrusted natural language data through Dataset.from_file in unit_testing.py and compare_models.py. (2) Boundary markers: No explicit boundary markers or 'ignore' instructions are used to delimit untrusted input within the test cases. (3) Capability inventory: The skill has the capability to write to the file system (dataset.to_file in add_custom_evaluators.py), execute arbitrary agent tasks (evaluate_sync), and perform network operations for LLM and observability services. (4) Sanitization: There is no evidence of sanitization or filtering for the natural language content within the ingested datasets.
[DATA_EXFILTRATION] (LOW): The skill is configured to transmit execution traces and evaluation results to Logfire via logfire.configure. This constitutes network activity to an external service provider outside the predefined whitelist of trusted domains.

pydantic-evals