hugging-face-model-trainer

Pass

Audited by Gen Agent Trust Hub on Mar 11, 2026

Risk Level: SAFE
Full Analysis
  • [Data Ingestion and Processing]: The skill processes external datasets from the Hugging Face Hub, which is a standard ingestion point for machine learning workflows.
      • Ingestion points: Datasets are loaded in 'scripts/train_sft_example.py' and analyzed in 'scripts/dataset_inspector.py'.
      • Boundary markers: The skill relies on standard dataset loading libraries without additional boundary markers between data and instructions.
      • Capability inventory: The training environment includes capabilities for network communication and subprocess execution (e.g., in 'scripts/convert_to_gguf.py') to support model uploads and tool compilation.
      • Sanitization: Input data is handled using standard ML practices without specific instruction-filtering logic, which is typical for training tasks.
  • [Command Execution for Environment Setup]: The skill utilizes 'subprocess.run' within scripts like 'scripts/convert_to_gguf.py' to manage system dependencies and build essential tools.
      • Evidence: Commands such as 'git clone', 'cmake', and 'pip install' are used to prepare the environment and compile the 'llama-quantize' utility.
      • Context: These operations are performed to facilitate model conversion and are directed at well-known repositories like 'llama.cpp'.
  • [External Resource Integration]: The skill fetches training scripts and utility tools from the Hugging Face Hub and GitHub.
      • Evidence: References to 'https://github.com/huggingface/trl/' and 'https://huggingface.co/datasets/mcp-tools/' are present in 'SKILL.md'.
      • Context: These resources originate from the vendor's own infrastructure or established open-source projects, aligning with expected behavior for this skill's domain.
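For readers unfamiliar with the pattern noted under Command Execution, the following is a minimal sketch of how setup commands of this kind are commonly passed to 'subprocess.run' as argument lists (no shell), with an executable allowlist checked before anything runs. It is not taken from 'scripts/convert_to_gguf.py'; the function names, the allowlist, the placeholder repository URL, and the exact flags are all assumptions made for illustration.

```python
import shlex
import subprocess
from typing import List, Sequence

# Hypothetical allowlist; the audited script's real checks are not shown in this report.
ALLOWED_EXECUTABLES = {"git", "cmake", "pip"}


def build_setup_commands(repo_url: str, dest: str) -> List[List[str]]:
    """Return clone/configure/build commands as argument lists (illustrative only)."""
    build_dir = f"{dest}/build"
    return [
        ["git", "clone", "--depth", "1", repo_url, dest],
        ["cmake", "-S", dest, "-B", build_dir],
        ["cmake", "--build", build_dir, "--target", "llama-quantize"],
    ]


def check_command(cmd: Sequence[str]) -> None:
    """Reject empty commands and executables outside the allowlist."""
    if not cmd or cmd[0] not in ALLOWED_EXECUTABLES:
        raise ValueError(f"command not allowed: {list(cmd)!r}")


def run_setup(commands: Sequence[Sequence[str]], dry_run: bool = True) -> List[str]:
    """Validate each command, then run it unless dry_run is set.

    Returns the shell-quoted form of every command, for logging or review.
    """
    rendered = []
    for cmd in commands:
        check_command(cmd)
        rendered.append(shlex.join(cmd))
        if not dry_run:
            # Argument list with no shell=True, so arguments are not shell-interpreted.
            subprocess.run(list(cmd), check=True)
    return rendered


if __name__ == "__main__":
    # Placeholder URL, not the real repository location.
    plan = build_setup_commands("https://example.com/llama.cpp.git", "llama.cpp")
    for line in run_setup(plan, dry_run=True):
        print(line)
```

Running with 'dry_run=True' only prints the planned commands, which is one way a reviewer could inspect what a setup step intends to execute before allowing it.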
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 11, 2026, 06:20 PM