grpo-rl-training

Audit Result: Fail

Audited by Gen Agent Trust Hub on Mar 28, 2026

Risk Level: HIGH
Tags: COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The file 'examples/reward_functions_library.py' includes a 'run_test_cases' function that uses the 'exec()' built-in to execute Python code strings. This function supports the 'code_execution_reward' by testing whether model-generated code is valid.
  • [REMOTE_CODE_EXECUTION]: The skill allows for the execution of untrusted code generated by the AI model during runtime. While training, a model might produce scripts that perform unauthorized file access, network requests, or other malicious actions. The current implementation lacks sandboxing (e.g., Docker or gVisor), making the host environment vulnerable to full compromise by the generated code.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection through the processing of untrusted training data. Ingestion points include training datasets loaded via 'load_dataset' in 'templates/basic_grpo_training.py' and CSV files in 'SKILL.md'. XML-style tags are used for structure but do not sanitize the content within them. The 'exec()' function provides a high-privilege execution environment. No input validation or code analysis is performed before the 'exec()' call.
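The flagged pattern might look like the following minimal sketch. The function name 'run_test_cases' comes from the audit, but its exact signature and body are assumptions here; the point is that 'exec()' runs the untrusted string with full interpreter privileges:

```python
def run_test_cases(code_str, test_cases):
    """Execute model-generated code and check it against test cases.

    WARNING: exec() runs the untrusted string with the host
    interpreter's full privileges -- this mirrors the unsandboxed
    pattern flagged in the findings above.
    """
    namespace = {}
    exec(code_str, namespace)  # untrusted code runs here, unsandboxed
    # Grab the first user-defined callable from the executed namespace.
    fn = next(v for k, v in namespace.items()
              if callable(v) and not k.startswith("__"))
    return all(fn(*args) == expected for args, expected in test_cases)
```

Any model output reaching 'code_str' can perform file access, network requests, or process spawning before the test cases are even evaluated.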
Recommendations
  • Execute model-generated code only inside an OS-level sandbox (e.g., Docker or gVisor) rather than via 'exec()' on the host.
  • Validate or statically analyze generated code before any execution path reaches it.
  • Treat training datasets ('load_dataset' inputs, CSV files) as untrusted and sanitize their content before it can influence high-privilege execution.
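A first step consistent with these recommendations is process isolation with a timeout. The helper name and interface below are illustrative, and this is deliberately weaker than the Docker/gVisor sandboxing the audit calls for: it limits runtime and crash impact only, not file or network access:

```python
import os
import subprocess
import sys
import tempfile


def run_test_cases_isolated(code_str, test_snippet, timeout=5):
    """Run untrusted code in a separate interpreter process with a timeout.

    Mitigation sketch only: a child process in isolated mode (-I) with a
    time limit. A real deployment should add an OS-level sandbox
    (e.g., Docker or gVisor) around this, per the recommendations above.
    """
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code_str + "\n" + test_snippet)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, "-I", path],  # -I ignores env vars and user site
            capture_output=True, text=True, timeout=timeout,
        )
        return proc.returncode == 0  # nonzero exit = failed assertions/crash
    except subprocess.TimeoutExpired:
        return False  # runaway code is killed rather than hanging training
    finally:
        os.unlink(path)
```

A hung or crashing generated script then costs one child process instead of the training run itself.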
Audit Metadata
  • Risk Level: HIGH
  • Analyzed: Mar 28, 2026, 06:07 PM