grpo-rl-training
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFE
Full Analysis
- [EXTERNAL_DOWNLOADS] (SAFE): The script loads standard models and datasets from Hugging Face Hub (
Qwen/Qwen2.5-1.5B-Instructandopenai/gsm8k). These are reputable sources. - [COMMAND_EXECUTION] (SAFE): No arbitrary command execution or shell spawning detected. Logic is restricted to Python-native training loops.
- [DATA_EXFILTRATION] (SAFE): No sensitive file access or unauthorized network calls. The optional use of Weights & Biases (
wandb) for experiment tracking is standard industry practice. - [PROMPT_INJECTION] (SAFE): The system prompt used in the template provides formatting instructions for reasoning/answer tags and does not attempt to bypass agent safety filters or override system instructions.
- [INDIRECT_PROMPT_INJECTION] (LOW): As a training template, the skill ingests data from external datasets (
load_dataset). While this represents a surface area for indirect injection if a malicious dataset is used, the logic is standard for the intended ML training use case.
Audit Metadata