openrlhf-training
Warn
Audited by Socket on Feb 15, 2026
1 alert found:
Anomaly (low severity): references/custom-rewards.md
This file is documentation and examples for implementing reward functions and agent logic. It contains one high-risk pattern: executing model-generated code via subprocess.run(pytest) after writing it to a tempfile, which enables arbitrary code execution on the host and potential data exfiltration or system modification. Other examples are benign algorithmic reward computations or use of evaluation models, but logging and model-loading can leak data or cause network activity. No signs of obfuscated or intentionally malicious code were found, but the code-execution example constitutes a significant security hazard if used without sandboxing and careful privilege, network, and logging controls.
Confidence: 90%
Severity: 60%
Audit Metadata