Self-Referential Self-Improvement

Warn

Audited by Gen Agent Trust Hub on Apr 28, 2026

Risk Level: MEDIUM
COMMAND_EXECUTION, DATA_EXFILTRATION, PROMPT_INJECTION, REMOTE_CODE_EXECUTION
Full Analysis
  • [DYNAMIC_EXECUTION]: The skill implements an 'evolve loop' in which the agent generates code mutations and executes them for scoring.
      • Evidence: The 'Implementation Pattern' section describes a cycle of creating git worktrees, modifying targets, and running 'fitness evaluations' on the mutated code.
  • [INDIRECT_PROMPT_INJECTION]: The skill is designed to ingest and 'improve' external code artifacts and agent prompts, which creates an injection surface.
      • Ingestion points: SKILL.md files, agent prompts, and source code identified via metadata file patterns.
      • Boundary markers: Absent; the skill gives no guidance to isolate or disregard instructions embedded in the targets being improved.
      • Capability inventory: The meta-agent can modify its own system prompt, tool definitions, and underlying logic based on external inputs.
      • Sanitization: Absent; the skill does not specify any validation for the content it incorporates into its own logic.
  • [COMMAND_EXECUTION]: The skill instructs the agent to perform shell-level operations to manage its evolutionary environment.
      • Evidence: The instructions explicitly require the use of git worktree to create sandboxed copies of the codebase for mutations.
  • [PERSISTENCE_MECHANISMS]: The skill allows the agent to modify its own core configuration, enabling permanent changes to its behavior.
      • Evidence: The 'Everything is Mutable' section states that the agent can modify tool definitions, selection strategies, and even its own system prompt.
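The findings above can be made concrete with a small sketch. The Python below is not the skill's code; it is a minimal illustration, under stated assumptions, of the mutate-and-score cycle the audit describes, together with two mitigations the audit reports as absent. Every name here (`worktree_cmd`, `wrap_untrusted`, `evolve`, the `PROTECTED` deny-list, and the marker strings) is hypothetical. The git command is only built as an argument list, never executed.

```python
from typing import Callable, List, Tuple

# Hypothetical deny-list (assumption): targets the loop refuses to rewrite,
# addressing the PERSISTENCE_MECHANISMS finding.
PROTECTED = ("system_prompt", "tool_definitions")

# Hypothetical boundary marker (assumption) for ingested content.
END_MARK = "UNTRUSTED_TARGET>>>"


def worktree_cmd(path: str, branch: str) -> List[str]:
    """Build, but do not run, the git command that creates a worktree."""
    return ["git", "worktree", "add", "-b", branch, path]


def wrap_untrusted(text: str) -> str:
    """Delimit ingested content so it can be treated as data, not instructions."""
    # Neutralize any embedded closing marker so the content cannot end the block early.
    safe = text.replace(END_MARK, "[removed]")
    return "<<<UNTRUSTED_TARGET\n" + safe + "\n" + END_MARK


def evolve(
    target: str,
    seed: str,
    mutate: Callable[[str], str],
    fitness: Callable[[str], float],
    generations: int,
) -> Tuple[str, float]:
    """Greedy mutate-and-score loop: keep a candidate only if it scores higher."""
    if target in PROTECTED:
        raise PermissionError("refusing to mutate protected target: " + target)
    best, best_score = seed, fitness(seed)
    for _ in range(generations):
        candidate = mutate(best)
        score = fitness(candidate)  # in the audited skill, this step runs mutated code
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score
```

In this sketch the `fitness` callable stands in for the 'fitness evaluation' step, which is where the audited skill executes generated code. Note that a git worktree gives each mutation a separate working directory but does not isolate the processes that run in it, so a real deployment would likely need an additional execution sandbox.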
Audit Metadata
Risk Level: MEDIUM
Analyzed: Apr 28, 2026, 10:10 PM