book-sft-pipeline
Pass
Audited by Gen Agent Trust Hub on Mar 8, 2026
Risk Level: SAFEPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
- [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection during the dataset creation phase.
- Ingestion points: Raw text data is ingested from external ePub files as documented in Phase 1 and Phase 2 of
SKILL.mdand thesegment_textfunction inscripts/pipeline_example.py. - Boundary markers: The
INSTRUCTION_PROMPTinSKILL.mdandscripts/pipeline_example.pyinterpolates extracted text chunks directly into LLM prompts without explicit delimiters or instructions to ignore potential commands embedded in the source text. - Capability inventory: The pipeline has the capability to perform external LLM API calls for metadata generation and initiates training operations on the Tinker platform, which involves network communication and model weight updates.
- Sanitization: No sanitization, validation, or filtering is applied to the book content before it is processed by the instruction generation LLM.
- [EXTERNAL_DOWNLOADS]: The skill fetches necessary machine learning artifacts, including model weights and tokenizer configurations, from established and well-known services such as Hugging Face.
Audit Metadata