The Agent Skills Directory

[PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection during the dataset creation phase.
Ingestion points: Raw text data is ingested from external ePub files as documented in Phase 1 and Phase 2 of SKILL.md and the segment_text function in scripts/pipeline_example.py.
Boundary markers: The INSTRUCTION_PROMPT in SKILL.md and scripts/pipeline_example.py interpolates extracted text chunks directly into LLM prompts without explicit delimiters or instructions to ignore potential commands embedded in the source text.
Capability inventory: The pipeline has the capability to perform external LLM API calls for metadata generation and initiates training operations on the Tinker platform, which involves network communication and model weight updates.
Sanitization: No sanitization, validation, or filtering is applied to the book content before it is processed by the instruction generation LLM.
[EXTERNAL_DOWNLOADS]: The skill fetches necessary machine learning artifacts, including model weights and tokenizer configurations, from established and well-known services such as Hugging Face.

book-sft-pipeline