numerai-experiment-design
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFE
Full Analysis
- [COMMAND_EXECUTION] (SAFE): The skill executes local Python modules (e.g., agents.code.modeling) to automate model training and analysis. This is the primary and intended function of the skill within its research context.
- [REMOTE_CODE_EXECUTION] (SAFE): No patterns for downloading or executing scripts from remote, untrusted, or external URLs were found.
- [DATA_EXFILTRATION] (SAFE): Network interaction is limited to model deployment via specific MCP tools (create_model, upload_model) intended for the Numerai platform. No unauthorized data transfer patterns were observed.
- [PROMPT_INJECTION] (SAFE): The instructions do not contain attempts to override system prompts, bypass safety filters, or disclose internal instructions.
- [INDIRECT_PROMPT_INJECTION] (LOW): The skill processes user-defined model ideas and experiment results. While it lacks explicit boundary markers for this data, the ingestion is performed within a controlled data science workflow, presenting minimal risk.
Audit Metadata