grpo-rl-training
Warn
Audited by Socket on Feb 15, 2026
1 alert found:
Category: Security (MEDIUM)
examples/reward_functions_library.py
This module's parsing and scoring helpers are largely benign, but it contains one high-risk component: run_test_cases executes extracted code via exec() and then calls a 'solution' function, with no sandboxing, timeouts, or resource restrictions. If completions are untrusted, this creates a clear supply-chain and runtime risk: arbitrary code execution, data exfiltration, or system compromise are all possible. Use strict sandboxing, or avoid executing untrusted code in-process.
Confidence: 85%
Severity: 70%
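As an illustration of the "avoid executing untrusted code in-process" recommendation, the sketch below moves execution into a child Python interpreter with a wall-clock timeout. It is not the package's code: the name run_test_cases_isolated, the harness, and the test-case shape (dicts with "args" and "expected") are assumptions made for this example, and only the 'solution' entry point comes from the finding above. A subprocess with a timeout is a partial mitigation, not a sandbox. It adds no filesystem, network, or memory restrictions, and the child can still print forged results, so untrusted completions still call for a container or a dedicated sandbox.

```python
import json
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical harness; it runs in the child interpreter, never in the trainer.
_HARNESS = """
import json, sys
namespace = {}
exec(open(sys.argv[1]).read(), namespace)
cases = json.loads(sys.argv[2])
results = []
for case in cases:
    try:
        results.append(namespace["solution"](*case["args"]) == case["expected"])
    except Exception:
        results.append(False)
print(json.dumps(results))
"""


def run_test_cases_isolated(code, test_cases, timeout_s=5.0):
    """Return the fraction of passing test cases; 0.0 on timeout or crash.

    Assumes each test case is a dict with "args" (list) and "expected"
    (JSON-serializable value). This shape is an assumption of the sketch.
    """
    if not test_cases:
        return 0.0
    with tempfile.TemporaryDirectory() as tmp:
        code_path = Path(tmp) / "candidate.py"
        code_path.write_text(code)
        try:
            proc = subprocess.run(
                # -I: isolated mode (ignores PYTHON* env vars and user site dir)
                [sys.executable, "-I", "-c", _HARNESS,
                 str(code_path), json.dumps(test_cases)],
                capture_output=True,
                text=True,
                timeout=timeout_s,  # child is killed when this expires
                cwd=tmp,
            )
        except subprocess.TimeoutExpired:
            return 0.0
    if proc.returncode != 0:
        return 0.0
    try:
        # The candidate may print too; only the last line is parsed.
        results = json.loads(proc.stdout.strip().splitlines()[-1])
    except (ValueError, IndexError):
        return 0.0
    if not isinstance(results, list):
        return 0.0
    return sum(r is True for r in results) / len(test_cases)
```

A reward function could call run_test_cases_isolated(extracted_code, cases) in place of an in-process exec(), so that an infinite loop or a crash in a completion costs one child process rather than the training job.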
Audit Metadata