self-improvement

Pass

Audited by Gen Agent Trust Hub on Mar 10, 2026

Risk Level: SAFE | PROMPT_INJECTION | COMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: The skill implements a workflow for promoting "learnings" and "errors" to project instruction files like CLAUDE.md, AGENTS.md, and .github/copilot-instructions.md. This creates a persistent indirect prompt injection surface.
    • Ingestion points: Untrusted data enters the context via user corrections, tool error messages, and API responses, as specified in the "Situation" table and "Detection Triggers" section of SKILL.md.
    • Boundary markers: The skill documentation defines no delimiters or "ignore" instructions for content written to the learning files or the promoted instruction files.
    • Capability inventory: The skill can write to local markdown files and project-wide instruction files, which guide agent behavior in future sessions.
    • Sanitization: The documentation does not mention sanitization, escaping, or validation of ingested content before it is logged or promoted.
  • [COMMAND_EXECUTION]: The skill uses standard system utilities (mkdir, grep) to create directories and search the .learnings/ logs.
  • [COMMAND_EXECUTION]: The skill instructs users to set up automation hooks in .claude/settings.json and .codex/settings.json. These hooks execute local scripts (activator.sh, error-detector.sh, extract-skill.sh) on specific triggers such as prompt submission or tool use.
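
The boundary-marker and sanitization gaps noted above could be narrowed with a small filter applied before a learning is promoted to an instruction file. The following is a minimal sketch under stated assumptions, not part of the audited skill: the delimiter strings, the function names (`sanitize_learning`, `is_suspicious`, `wrap_for_promotion`), the length cap, and the pattern list are all hypothetical, and the patterns are illustrative rather than an exhaustive injection filter.

```python
import re

# Hypothetical delimiters (assumption): the audited skill defines none.
BEGIN_MARK = "<!-- BEGIN UNTRUSTED LEARNING -->"
END_MARK = "<!-- END UNTRUSTED LEARNING -->"

# Illustrative patterns only (assumption); real injections vary widely.
SUSPICIOUS = [
    re.compile(r"ignore (all )?(previous|prior|above) instructions", re.I),
    re.compile(r"^\s*(system|assistant)\s*:", re.I | re.M),
]


def sanitize_learning(text: str, max_len: int = 500) -> str:
    """Drop control characters, neutralise comment delimiters, and truncate."""
    # Keep newlines and tabs; remove every other non-printable character.
    cleaned = "".join(ch for ch in text if ch in "\n\t" or ch.isprintable())
    # Stop ingested text from opening or closing the HTML-comment markers.
    cleaned = cleaned.replace("<!--", "&lt;!--").replace("-->", "--&gt;")
    return cleaned[:max_len].strip()


def is_suspicious(text: str) -> bool:
    """Return True if the text matches any illustrative injection pattern."""
    return any(pattern.search(text) for pattern in SUSPICIOUS)


def wrap_for_promotion(text: str) -> str:
    """Sanitize a learning and wrap it in explicit boundary markers."""
    cleaned = sanitize_learning(text)
    if is_suspicious(cleaned):
        raise ValueError("learning rejected: looks like an embedded instruction")
    return f"{BEGIN_MARK}\n{cleaned}\n{END_MARK}"
```

A caller would pass each candidate learning through `wrap_for_promotion` and append only the returned string to the instruction file. Rejected entries raise `ValueError` so they can be routed to manual review instead of being promoted silently.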
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 10, 2026, 10:09 AM