grpo-rl-training

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFE
Full Analysis
  • [EXTERNAL_DOWNLOADS] (SAFE): The script loads standard models and datasets from Hugging Face Hub (Qwen/Qwen2.5-1.5B-Instruct and openai/gsm8k). These are reputable sources.
  • [COMMAND_EXECUTION] (SAFE): No arbitrary command execution or shell spawning detected. Logic is restricted to Python-native training loops.
  • [DATA_EXFILTRATION] (SAFE): No sensitive file access or unauthorized network calls. The optional use of Weights & Biases (wandb) for experiment tracking is standard industry practice.
  • [PROMPT_INJECTION] (SAFE): The system prompt used in the template provides formatting instructions for reasoning/answer tags and does not attempt to bypass agent safety filters or override system instructions.
  • [INDIRECT_PROMPT_INJECTION] (LOW): As a training template, the skill ingests data from external datasets (load_dataset). While this represents a surface area for indirect injection if a malicious dataset is used, the logic is standard for the intended ML training use case.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 17, 2026, 06:06 PM