skill-evolution-manager
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH
Tags: PROMPT_INJECTION, COMMAND_EXECUTION
Full Analysis
- [Indirect Prompt Injection] (HIGH): The core functionality of this skill is to ingest untrusted user feedback from the conversation and persist it into the instruction files (SKILL.md) of other skills. This creates a 'Knowledge Poisoning' attack surface where an attacker can permanently alter the behavior of the AI agent's entire skill library.
- Ingestion points: User feedback strings and session history context during the '/evolve' command or 'Review & Extract' phase.
- Boundary markers: None. The skill documentation does not mention delimiters or sanitization of user-provided 'best practices'.
- Capability inventory: The skill uses Python subprocess calls to run scripts that perform file-write operations on local skill directories.
- Sanitization: None documented. The skill instructions encourage taking 'non-structured user feedback' and converting it directly into persistent instructions.
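Given the missing boundary markers called out above, one mitigation is to wrap untrusted feedback in explicit delimiters and reject instruction-like content before it is persisted. The sketch below is illustrative only; the marker strings, `sanitize_feedback` helper, and pattern list are assumptions, not part of the audited skill:

```python
import re

# Hypothetical boundary markers (not defined by the skill itself).
BOUNDARY_OPEN = "<untrusted-user-feedback>"
BOUNDARY_CLOSE = "</untrusted-user-feedback>"

# Phrases that often signal an injected instruction rather than feedback.
SUSPECT_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"you are now",
    r"system prompt",
]

def sanitize_feedback(feedback: str) -> str:
    """Strip marker strings so feedback cannot escape its boundary,
    then reject feedback that looks like an injected instruction."""
    feedback = feedback.replace(BOUNDARY_OPEN, "").replace(BOUNDARY_CLOSE, "")
    for pattern in SUSPECT_PATTERNS:
        if re.search(pattern, feedback, re.IGNORECASE):
            raise ValueError(f"feedback rejected: matched {pattern!r}")
    # Only the delimited block would ever be written into SKILL.md.
    return f"{BOUNDARY_OPEN}\n{feedback.strip()}\n{BOUNDARY_CLOSE}"

print(sanitize_feedback("Prefer tabulated output for long lists."))
```

A denylist like this is a weak filter on its own; the boundary markers matter more, because they let the consuming model treat everything inside them as data rather than instructions.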
- [Command Execution] (HIGH): The skill triggers the execution of local Python scripts (merge_evolution.py, smart_stitch.py) using command-line arguments derived from user context, such as <skill_path> and <json_string>. Without strict validation, this could lead to path traversal or command injection.
- [Persistence Mechanisms] (HIGH): While the persistence is intended for 'evolution', it effectively provides a mechanism for malicious instructions to survive across sessions and even across version updates (via the Align workflow), making it an ideal vector for a persistent jailbreak.
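The path traversal and injection risks in the Command Execution finding can be narrowed by confining `<skill_path>` to the skills root and passing arguments as a list (no shell). This is a minimal sketch assuming a skills root of `/agent/skills`; the `run_merge` wrapper and that location are hypothetical, not taken from the skill:

```python
import json
import subprocess
from pathlib import Path

# Assumed install location of the skill library (hypothetical).
SKILLS_ROOT = Path("/agent/skills").resolve()

def run_merge(skill_path: str, payload: dict) -> None:
    """Resolve the target path, reject anything that escapes the
    skills root, then invoke the script with a list argv so the
    JSON payload is passed as a single opaque argument."""
    target = (SKILLS_ROOT / skill_path).resolve()
    # Blocks traversal such as "../../etc/passwd".
    if not target.is_relative_to(SKILLS_ROOT):
        raise ValueError(f"path escapes skills root: {skill_path}")
    subprocess.run(
        ["python", "merge_evolution.py", str(target), json.dumps(payload)],
        check=True,
    )
```

Using a list argv rather than a shell string means `<json_string>` is never re-parsed by a shell, which removes the command-injection channel even when the payload contains quotes or metacharacters.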
Recommendations
- AI detected serious security threats; the skill should not be used as-is.
- Wrap user-provided feedback in explicit boundary markers and sanitize it before persisting it into any SKILL.md file.
- Validate <skill_path> against the skill library root and pass <json_string> as data (never through a shell) when invoking merge_evolution.py or smart_stitch.py.
- Require explicit user confirmation before persisting any behavioral change across sessions or through the Align workflow.