hugging-face-model-trainer
Pass
Audited by Gen Agent Trust Hub on Apr 26, 2026
Risk Level: SAFE (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION)
Full Analysis
- [COMMAND_EXECUTION]: The script `scripts/convert_to_gguf.py` uses `subprocess.run` to call system utilities, including `git` for cloning repositories, `cmake` for build configuration, and `pip` for installing dependencies derived from external sources.
- [REMOTE_CODE_EXECUTION]: The model conversion process involves cloning the `llama.cpp` repository from GitHub and executing its internal Python scripts and compiled binaries. Additionally, the skill uses the `trust_remote_code=True` parameter when loading models, allowing for the execution of custom code provided within the model's repository.
- [DATA_EXFILTRATION]: The skill is designed to upload trained model weights, configuration files, and training logs to the Hugging Face Hub. This process requires a user-supplied `HF_TOKEN` with write permissions, passed through the environment.
- [PROMPT_INJECTION]: The skill ingests and processes external datasets from the Hugging Face Hub, which could contain adversarial instructions intended to influence the training process or the agent's behavior.
  - Ingestion points: Data is loaded via `datasets.load_dataset()` in several training templates, such as `scripts/train_sft_example.py` and `scripts/train_dpo_example.py`.
  - Boundary markers: No specific delimiters or validation logic is implemented to identify or ignore instructions embedded within the dataset text fields.
  - Capability inventory: The skill can execute shell commands locally via `subprocess` and submit scripts for remote execution on managed infrastructure using the `hf_jobs` tool.
  - Sanitization: The skill performs standard data loading and formatting without specific sanitization or filtering of the content retrieved from external datasets.
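
To make the "Boundary markers" and "Sanitization" findings concrete, the following is a minimal sketch of what delimiter wrapping and a simple content screen for dataset text fields could look like. It is not taken from the audited skill: the marker strings, the pattern list, the `text` field name, and the helper names `wrap_untrusted`, `looks_suspicious`, and `screen_rows` are all hypothetical, and a short regular expression is nowhere near a complete defense against adversarial text.

```python
import re

# Hypothetical boundary markers; the audited skill defines none.
BEGIN_MARK = "[[BEGIN_DATASET_TEXT]]"
END_MARK = "[[END_DATASET_TEXT]]"

# Illustrative and deliberately small pattern list (an assumption, not exhaustive).
SUSPICIOUS = re.compile(
    r"ignore (all |any )?(previous|prior) instructions|disregard the system prompt",
    re.IGNORECASE,
)


def wrap_untrusted(text):
    """Wrap dataset text in boundary markers, removing marker look-alikes first."""
    cleaned = text.replace(BEGIN_MARK, "").replace(END_MARK, "")
    return f"{BEGIN_MARK}\n{cleaned}\n{END_MARK}"


def looks_suspicious(text):
    """Return True if the text matches one of the illustrative patterns."""
    return SUSPICIOUS.search(text) is not None


def screen_rows(rows, field="text"):
    """Split an iterable of dict rows into (kept, flagged) lists."""
    kept, flagged = [], []
    for row in rows:
        value = str(row.get(field, ""))
        if looks_suspicious(value):
            flagged.append(row)
        else:
            kept.append(row)
    return kept, flagged
```

In a setup like this, flagged rows would go to manual review rather than being silently dropped, and `wrap_untrusted` would apply only where dataset text is shown to an agent, not to the training examples themselves.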
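
The COMMAND_EXECUTION finding describes `subprocess.run` calls to tools such as `git`. As a hedged illustration of that call pattern (not the actual contents of `scripts/convert_to_gguf.py`), the sketch below builds the command as an argument list so that no shell is involved, and raises on a non-zero exit code. The helper names `clone_command` and `run_checked` are hypothetical, and the repository URL is left to the caller.

```python
import subprocess
import sys
from typing import List, Optional


def clone_command(repo_url: str, dest: str, ref: Optional[str] = None) -> List[str]:
    """Build a shallow `git clone` command as an argument list (hypothetical helper)."""
    cmd = ["git", "clone", "--depth", "1"]
    if ref:
        # --branch accepts a branch or tag name, which pins the checkout.
        cmd += ["--branch", ref]
    cmd += [repo_url, dest]
    return cmd


def run_checked(cmd: List[str], cwd: Optional[str] = None) -> str:
    """Run a command without a shell; raise CalledProcessError on failure."""
    result = subprocess.run(
        cmd, cwd=cwd, check=True, capture_output=True, text=True
    )
    return result.stdout


if __name__ == "__main__":
    # Harmless demonstration that does not touch the network.
    print(run_checked([sys.executable, "-c", "print('ok')"]))
```

Passing an argument list rather than a single shell string means values such as a repository URL or branch name are not interpreted by a shell, which narrows, but does not remove, the risk described in the finding.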
Audit Metadata