self-improvement
Pass
Audited by Gen Agent Trust Hub on Mar 10, 2026
Risk Level: SAFE
PROMPT_INJECTION, COMMAND_EXECUTION
Full Analysis
- [PROMPT_INJECTION]: The skill implements a self-improvement loop that processes untrusted data from user corrections and tool outputs. This architecture is vulnerable to indirect prompt injection, as malicious instructions could be logged and subsequently 'promoted' to permanent instruction files such as `CLAUDE.md` or `AGENTS.md`.
  - Ingestion points: Data enters via user dialogue and the `CLAUDE_TOOL_OUTPUT` environment variable processed by `scripts/error-detector.sh`.
  - Boundary markers: The skill uses markdown headers and structured fields for logging, which provide weak delimitation against adversarial input.
  - Capability inventory: The agent has the capability to write files, execute shell commands, and modify its own workspace instructions.
  - Sanitization: There is no evidence of content validation or sanitization before data is logged or promoted to core files.
- [COMMAND_EXECUTION]: The skill includes several bash scripts (`scripts/activator.sh`, `scripts/error-detector.sh`, `scripts/extract-skill.sh`) that are intended to be executed by the agent or user. While these scripts perform legitimate scaffolding and monitoring tasks, they represent an additional attack surface within the agent's operating environment. The `extract-skill.sh` script includes basic regex-based validation for the skill name to mitigate basic injection risks.
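
The two mitigations touched on in the findings above, validating a skill name and screening a logged entry before it is promoted to an instruction file, could look roughly like the sketch below. This is an illustration only, not the skill's code. The name pattern, the phrase list, the length limit, and the function names `is_valid_skill_name` and `screen_for_promotion` are all assumptions, and the actual regex in `scripts/extract-skill.sh` is not shown in this audit.

```python
import re

# Assumed pattern: lowercase letters and digits, with single hyphens between
# segments. The real pattern used by scripts/extract-skill.sh is not known here.
SKILL_NAME_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")

# Hypothetical deny-list of phrases suggesting an entry is trying to issue
# instructions or run commands. A real deployment would need a broader policy.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"\bsystem prompt\b", re.IGNORECASE),
    re.compile(r"\$\(|\|\s*(?:sh|bash)\b"),  # command substitution, pipe to shell
]


def is_valid_skill_name(name: str) -> bool:
    """Accept only names that match the assumed pattern in full."""
    return SKILL_NAME_RE.fullmatch(name) is not None


def screen_for_promotion(entry: str, max_len: int = 500) -> tuple[bool, str]:
    """Return (ok, reason) for a logged entry that is a promotion candidate."""
    if len(entry) > max_len:
        return False, "entry too long"
    # Reject control characters other than newline and tab.
    if any(ord(ch) < 32 and ch not in "\n\t" for ch in entry):
        return False, "control characters"
    for pattern in SUSPICIOUS_PATTERNS:
        if pattern.search(entry):
            return False, f"matched {pattern.pattern!r}"
    return True, "ok"
```

A deny-list of this kind is easy to evade, so it would at best complement a manual review step before anything is written to `CLAUDE.md` or `AGENTS.md`.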
Audit Metadata