self-improving-agent

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHPROMPT_INJECTION
Full Analysis
  • Indirect Prompt Injection (HIGH): The skill implements a 'Learning Review' mechanism that echoes logged interactions back to the agent at the start of new sessions. • Ingestion points: Captures user_message in correction-detector.js and capability-tracker.js; tool_input and error results captured in enhanced-failure-logger.js. • Boundary markers: No delimiters or ignore instructions are used when displaying learnings during the session start review. • Capability inventory: The review process influences the agent's reasoning and behavioral context based on historical, potentially attacker-controlled data. • Sanitization: No sanitization or filtering of the logged content is performed before presentation to the agent.
  • Data Exposure (MEDIUM): The skill logs interaction data, including tool inputs and full user messages, to persistent local files (~/.claude/logs/). This creates a local cache of potentially sensitive information (credentials passed to tools, private data) in plaintext.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 05:58 AM