The Agent Skills Directory

Prompt Injection (LOW): The skill is designed to transform user requirements into synthetic training examples. This ingestion of untrusted data into the agent's generation pipeline represents a surface for indirect prompt injection (Category 8a, 8d), where malicious instructions could be embedded in the generated output.\n
Ingestion points: User-provided domain requirements and dataset specifications (README.md).\n
Boundary markers: None explicitly defined in the templates to isolate user instructions from the model's generation logic.\n
Capability inventory: Text generation and JSON formatting; no direct file-write or network capabilities are present in the provided skill files.\n
Sanitization: No sanitization or filtering logic is present in the provided templates or documentation.\n- Unverifiable Dependencies & Remote Code Execution (SAFE): The README and quality-validation documentation refer to Python scripts in a scripts/ directory (e.g., validate_chatml.py) that were not included in the provided file list. While the documentation describes these as utility scripts, the absence of their source code prevents a security audit for potential malicious behavior such as unauthorized command execution.\n- Command Execution (SAFE): The documentation contains a bash snippet using jq for JSON validation. This is a standard use of a common utility and does not constitute a security risk in this context.

fine-tuning-data-generator