reinforcement-learning

Warn

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: MEDIUM (EXTERNAL_DOWNLOADS, REMOTE_CODE_EXECUTION)
Full Analysis
  • Unverifiable Dependencies (LOW): The skill references standard machine learning packages including gymnasium, stable-baselines3, tensorboard, and optuna. These are widely used and trusted in the RL community.
  • Dynamic Execution (MEDIUM): The code snippets in SKILL.md and references/algorithms.md call PPO.load(), DQN.load(), and similar methods. Stable-Baselines3 uses zipfile and pickle/cloudpickle to restore model weights and metadata, so loading a malicious model file (e.g., ppo_cartpole.zip) could result in arbitrary code execution on the user's machine.
  • Indirect Prompt Injection (INFO): The skill defines patterns for ingesting data from environments (observations). While these are typically numeric arrays in the provided examples, environments that return text-based observations represent a theoretical ingestion point for untrusted data into an agent's reasoning loop.
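
One way to reduce the Dynamic Execution risk is to check a model file against a known-good digest before handing it to a loader. The sketch below is illustrative only: the helper names sha256_of and verify_model_file are hypothetical, and it assumes a trusted SHA-256 digest for the file was obtained through a separate channel. It does not make unpickling safe; it only confirms the file is the one you expected.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_model_file(path, expected_sha256):
    """Hypothetical helper: raise ValueError unless the file matches the digest.

    Returns the path as a pathlib.Path so it can be passed on to a loader.
    """
    actual = sha256_of(path)
    if actual != expected_sha256.lower():
        raise ValueError(
            f"Digest mismatch for {path}: expected {expected_sha256}, got {actual}"
        )
    return Path(path)


# Assumed usage with Stable-Baselines3 (not imported here):
#   from stable_baselines3 import PPO
#   model = PPO.load(verify_model_file("ppo_cartpole.zip", TRUSTED_DIGEST))
```

TRUSTED_DIGEST in the usage comment is a placeholder for a digest published by whoever produced the model. A matching digest shows the file was not altered in transit; it says nothing about whether the original author is trustworthy.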
Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 16, 2026, 01:32 PM