devtu-self-evolve

Warn

Audited by Gen Agent Trust Hub on Mar 26, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, PROMPT_INJECTION)
Full Analysis
  • [COMMAND_EXECUTION]: Uses system commands including git for branch management and history rewriting (git push --force-with-lease), gh for pull request orchestration, and the ruff linter for code validation.
  • [COMMAND_EXECUTION]: To verify tool behavior, executes Python CLI commands (python3 -m tooluniverse.cli run) with arguments potentially sourced from untrusted agent reports.
  • [REMOTE_CODE_EXECUTION]: Implements a self-evolution loop that involves dynamically modifying Python source files and subsequently executing them via the Python interpreter to validate fixes and new tool registrations.
  • [PROMPT_INJECTION]: Subject to indirect prompt injection because the skill acts on bug reports and instructions generated by 'Researcher Persona Agents' that ingest data from dozens of external biological and pharmacological APIs.
  • [PROMPT_INJECTION]: Mandatory Evidence Chain for Category 8:
      1. Ingestion points: Agent-generated issue reports defined in references/persona-template.md.
      2. Boundary markers: Absent; instructions do not use specific delimiters or safety headers for untrusted data.
      3. Capability inventory: Subprocess calls (python3, git, gh, ruff), file system modification (writing fixes), and network operations (git push, API requests) across SKILL.md.
      4. Sanitization: The skill implements a verification step, but this step involves executing the code with agent-supplied JSON arguments without formal sanitization.
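The two gaps noted in the evidence chain (no boundary markers, no formal sanitization of agent-supplied JSON arguments) could be narrowed along the following lines. This is a minimal sketch, not the skill's actual code: the helper names, the tool-name pattern, the size cap, the marker strings, and the positional layout after "run" are all assumptions made for illustration. Only the command prefix python3 -m tooluniverse.cli run comes from the audit.

```python
import json
import re

# Hypothetical allow-list pattern and size cap; neither is taken from the skill.
TOOL_NAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,63}")
MAX_ARG_BYTES = 4096

# Hypothetical boundary markers for untrusted report text.
BEGIN_MARK = "<<<UNTRUSTED_REPORT>>>"
END_MARK = "<<<END_UNTRUSTED_REPORT>>>"


def build_verify_argv(tool_name: str, raw_json_args: str) -> list:
    """Validate agent-supplied input and return an argv list, or raise ValueError."""
    if not TOOL_NAME_RE.fullmatch(tool_name):
        raise ValueError(f"rejected tool name: {tool_name!r}")
    if len(raw_json_args.encode("utf-8")) > MAX_ARG_BYTES:
        raise ValueError("arguments too large")
    try:
        parsed = json.loads(raw_json_args)
    except json.JSONDecodeError as exc:
        raise ValueError(f"arguments are not valid JSON: {exc}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("arguments must be a JSON object")
    # Re-serialize so only canonical JSON reaches the command line.
    canonical = json.dumps(parsed, sort_keys=True)
    # An argv list (passed to subprocess.run without shell=True) keeps shell
    # metacharacters inert. The argument order after "run" is an assumption.
    return ["python3", "-m", "tooluniverse.cli", "run", tool_name, canonical]


def wrap_untrusted(report_text: str) -> str:
    """Fence untrusted report text between markers it cannot forge."""
    # Neutralize any marker-like runs so the text cannot close the fence early.
    cleaned = report_text.replace("<<<", "(((").replace(">>>", ")))")
    return f"{BEGIN_MARK}\n{cleaned}\n{END_MARK}"
```

The resulting list would be handed to subprocess.run as-is, so a value such as "x; rm -rf /" stays a plain string inside the JSON payload rather than being interpreted by a shell. Validation of this kind limits command injection only; it does not stop a well-formed but malicious report from steering the fixes the skill writes.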
Audit Metadata
Risk Level: MEDIUM
Analyzed: Mar 26, 2026, 04:24 PM