book-sft-pipeline

Pass

Audited by Gen Agent Trust Hub on Mar 8, 2026

Risk Level: SAFEPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection during the dataset creation phase.
  • Ingestion points: Raw text data is ingested from external ePub files as documented in Phase 1 and Phase 2 of SKILL.md and the segment_text function in scripts/pipeline_example.py.
  • Boundary markers: The INSTRUCTION_PROMPT in SKILL.md and scripts/pipeline_example.py interpolates extracted text chunks directly into LLM prompts without explicit delimiters or instructions to ignore potential commands embedded in the source text.
  • Capability inventory: The pipeline has the capability to perform external LLM API calls for metadata generation and initiates training operations on the Tinker platform, which involves network communication and model weight updates.
  • Sanitization: No sanitization, validation, or filtering is applied to the book content before it is processed by the instruction generation LLM.
  • [EXTERNAL_DOWNLOADS]: The skill fetches necessary machine learning artifacts, including model weights and tokenizer configurations, from established and well-known services such as Hugging Face.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 8, 2026, 07:40 AM