meta-skill
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
- PROMPT_INJECTION (HIGH): The skill is vulnerable to Indirect Prompt Injection (Category 8) because it uses conversation history to drive the 'evolution' of the agent's skill-set. An attacker can embed instructions in the chat to manipulate the meta-analysis process into creating or modifying skills in a way that introduces backdoors or bypasses safety.
- Ingestion points: Conversation transcripts analyzed via
analyze_transcripts.pyand summarized inSKILL.mdworkflows. - Boundary markers: None; the instructions do not include delimiters or warnings to ignore instructions embedded within the processed transcripts.
- Capability inventory: File writing and appending to
~/.claude/skills/EVOLUTION.mdand potential creation of new skill files in~/.claude/skills/or~/.claude/rules/. - Sanitization: None; the skill directly processes natural language insights from untrusted input to suggest system-level changes.
- COMMAND_EXECUTION (MEDIUM): The skill explicitly calls for the execution of a local Python script to process user data.
- Evidence:
python3 ~/.claude/evaluation/scripts/analyze_transcripts.py --days 30 --output markdowninSKILL.md. - EXTERNAL_DOWNLOADS (LOW): The skill references official Anthropics repositories for skill updates.
- Evidence: References to
https://raw.githubusercontent.com/anthropics/skills/inreferences/evaluation-frameworks.md. - Trust Status: Downgraded to LOW per [TRUST-SCOPE-RULE] as the 'anthropics' organization is a verified trusted source.
Recommendations
- AI detected serious security threats
Audit Metadata