simpo-training
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFE, EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION
Full Analysis
- EXTERNAL_DOWNLOADS (LOW): The skill instructions involve cloning an external repository to obtain training scripts.
  - Evidence: `git clone https://github.com/huggingface/alignment-handbook.git` in SKILL.md.
  - Trust Status: Hugging Face (`huggingface`) is a recognized trusted organization. The finding is downgraded per [TRUST-SCOPE-RULE].
- REMOTE_CODE_EXECUTION (LOW): The skill executes code from a recently cloned repository and installs it as a package.
  - Evidence: `python -m pip install .` followed by `accelerate launch ... scripts/run_simpo.py` in SKILL.md.
  - Context: The execution is performed on code from a trusted source (Hugging Face).
- INDIRECT_PROMPT_INJECTION (LOW): The training process ingests untrusted datasets from the Hugging Face Hub (e.g., `ultrafeedback_binarized`).
  - Ingestion points: `dataset_mixer` entries in YAML configurations within SKILL.md.
  - Boundary markers: Absent; training processes typically ingest raw text.
  - Capability inventory: The skill triggers model training via `accelerate launch`, which is a high-compute capability.
  - Sanitization: Not explicitly implemented in the provided documentation, relying on the underlying training scripts.
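The ingestion-point finding can be illustrated with a minimal sketch of a pre-training check on `dataset_mixer` entries. This is not part of the audited skill or of the audit tooling. It assumes the YAML has already been parsed into a mapping of Hub dataset ids to sampling proportions; the function name `untrusted_datasets`, the allowlist, and the example ids are hypothetical.

```python
from typing import Dict, List, Set


def untrusted_datasets(dataset_mixer: Dict[str, float], trusted_orgs: Set[str]) -> List[str]:
    """Return dataset ids whose Hub namespace is not in trusted_orgs.

    Assumption: dataset_mixer maps "namespace/name" ids to proportions,
    as it would look after loading a training YAML config.
    """
    flagged = []
    for dataset_id in dataset_mixer:
        # Split "namespace/name"; ids without a namespace are flagged as well.
        namespace, sep, _ = dataset_id.partition("/")
        if not sep or namespace not in trusted_orgs:
            flagged.append(dataset_id)
    return flagged


# Hypothetical parsed config and allowlist, for illustration only.
example_mixer = {
    "HuggingFaceH4/ultrafeedback_binarized": 1.0,
    "some-user/custom_prefs": 0.5,
}
example_trusted = {"HuggingFaceH4", "huggingface"}

flagged = untrusted_datasets(example_mixer, example_trusted)
print(flagged)
```

A namespace check of this kind only narrows where data comes from. It does not sanitize the dataset contents, so the "Sanitization" observation above would still apply.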
Audit Metadata