NYC

grpo-rl-training

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFEEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [EXTERNAL_DOWNLOADS] (LOW): The script 'templates/basic_grpo_training.py' downloads the 'Qwen/Qwen2.5-1.5B-Instruct' model and the 'openai/gsm8k' dataset from Hugging Face Hub. Per [TRUST-SCOPE-RULE], downloads from this trusted source are downgraded to LOW severity for visibility.
  • [PROMPT_INJECTION] (LOW): The skill has an indirect prompt injection surface because it processes untrusted dataset content during the training loop. 1. Ingestion point: 'load_dataset' call in 'templates/basic_grpo_training.py'; 2. Boundary markers: A system prompt is defined but lacks robust escaping for external user data; 3. Capability inventory: The script executes a complex training loop and writes output to the local filesystem; 4. Sanitization: No explicit filtering or sanitization of the dataset inputs is implemented.
  • [DATA_EXFILTRATION] (SAFE): The skill does not access sensitive credentials or local files like SSH keys. Network activity is confined to trusted model hubs and experiment tracking (Weights & Biases).
  • [COMMAND_EXECUTION] (SAFE): No arbitrary command execution, shell spawning, or unsafe dynamic evaluation was found in the code.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 17, 2026, 05:59 PM