simpo-training

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFE, EXTERNAL_DOWNLOADS, PROMPT_INJECTION
Full Analysis
  • [External Downloads] (LOW): The skill instructions involve cloning the alignment-handbook repository from GitHub (huggingface/alignment-handbook). As huggingface is a trusted organization, this finding is downgraded to LOW per the [TRUST-SCOPE-RULE].
  • [Indirect Prompt Injection] (LOW): The skill facilitates workflows that ingest untrusted datasets from Hugging Face for model training.
      • Ingestion points: Dataset names in YAML configurations, such as HuggingFaceH4/ultrafeedback_binarized.
      • Boundary markers: None explicitly defined in the provided configuration snippets to isolate data from instructions.
      • Capability inventory: Executes training scripts via accelerate launch.
      • Sanitization: Implicitly handled by the underlying training frameworks.
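One way to narrow the ingestion point noted in the prompt-injection finding is to check the dataset names in a training config against a reviewed allowlist before launching training. The sketch below is illustrative only and is not part of the audited skill. It assumes the YAML has already been parsed into a dict and that dataset names sit under a `dataset_mixer` key; the allowlist, the function name `untrusted_datasets`, and the second dataset name are hypothetical.

```python
from typing import Any, Mapping

# Hypothetical allowlist: replace with the datasets you have actually reviewed.
TRUSTED_DATASETS = frozenset({"HuggingFaceH4/ultrafeedback_binarized"})


def untrusted_datasets(
    config: Mapping[str, Any],
    allowlist: frozenset = TRUSTED_DATASETS,
    key: str = "dataset_mixer",  # assumed config key; adjust to your recipe
) -> list:
    """Return, sorted, the dataset names under `key` that are not allowlisted."""
    mixer = config.get(key) or {}
    return sorted(name for name in mixer if name not in allowlist)


# A config dict shaped as a YAML loader might produce it (assumed layout).
example_config = {
    "dataset_mixer": {
        "HuggingFaceH4/ultrafeedback_binarized": 1.0,
        "someone/unreviewed-dataset": 0.5,  # made-up name for illustration
    }
}

flagged = untrusted_datasets(example_config)
if flagged:
    print("Review before training:", ", ".join(flagged))
```

A check like this only confirms where the data comes from; it does not inspect dataset contents, so it complements rather than replaces sanitization in the training framework.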
Audit Metadata
Risk Level: SAFE
Analyzed: Feb 17, 2026, 04:59 PM