os-learning-loop

Pass

Audited by Gen Agent Trust Hub on Mar 19, 2026

Risk Level: SAFE, PROMPT_INJECTION, COMMAND_EXECUTION
Full Analysis
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it reads and processes system logs containing records of previous agent and user activity, which malicious external inputs could influence. The agent uses this information to autonomously or semi-autonomously modify core system files like CLAUDE.md and create new skill files.
      • Ingestion points: Reads log data from ${CLAUDE_PROJECT_DIR}/context/events.jsonl and ${CLAUDE_PROJECT_DIR}/context/memory/hook-errors.log during Phase 1 (Context Gathering).
      • Boundary markers: The skill contains logic for an "AUTO-APPLY ZONE" and "SANDBOX PROTECTION RULE" (Phase 3), but lacks explicit delimiters or instructions to ignore embedded commands within the ingested log data.
      • Capability inventory: The agent can execute arbitrary shell commands via the Bash tool and write to critical configuration files (CLAUDE.md, SKILL.md, memory.md) using the Write tool.
      • Sanitization: While the skill includes an evaluation step (eval_runner.py) and requires user approval for most changes, there is no automated sanitization or filtering of the ingested log content to prevent injection payloads from being interpreted as legitimate friction points.
  • [COMMAND_EXECUTION]: The skill frequently uses the Bash tool to interact with the underlying system.
      • It executes local Python scripts such as ${CLAUDE_PROJECT_DIR}/context/kernel.py for state management and event emission.
      • It runs ${CLAUDE_PLUGIN_ROOT}/skills/skill-improvement-eval/scripts/eval_runner.py to evaluate proposed changes to the system configuration.
      • It performs filesystem operations like cp for creating backups and git stash for its safe writing protocol.
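The missing boundary markers and sanitization noted under PROMPT_INJECTION could be addressed along the following lines. This is a minimal sketch, not the skill's actual code: the function names, the marker strings, and the regex patterns are all hypothetical, and it assumes only that events.jsonl holds one JSON object per line. It wraps accepted records in explicit delimiters and sets aside any record that looks like an embedded instruction.

```python
import json
import re

# Hypothetical patterns that may signal an embedded instruction rather than
# ordinary log content. Illustrative only, not an exhaustive filter.
SUSPICIOUS_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"\b(run|execute)\b.*\b(bash|shell|curl|rm)\b", re.IGNORECASE),
    re.compile(r"\b(overwrite|modify|edit)\b.*\bCLAUDE\.md\b", re.IGNORECASE),
]

# Hypothetical delimiters marking everything between them as data, not commands.
BEGIN_MARKER = "<<<UNTRUSTED_LOG_DATA"
END_MARKER = "UNTRUSTED_LOG_DATA>>>"


def is_suspicious(text: str) -> bool:
    """Return True if the text matches any instruction-like pattern."""
    return any(pattern.search(text) for pattern in SUSPICIOUS_PATTERNS)


def wrap_log_lines(lines):
    """Return (delimited_block, flagged_records) for an iterable of JSONL lines.

    Records that fail to parse, match a suspicious pattern, or contain the
    delimiter strings themselves are withheld from the block and returned
    separately for human review.
    """
    kept, flagged = [], []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            flagged.append(raw)  # malformed line: never passed to the agent
            continue
        serialized = json.dumps(record, sort_keys=True)
        if (
            is_suspicious(serialized)
            or BEGIN_MARKER in serialized
            or END_MARKER in serialized
        ):
            flagged.append(serialized)
        else:
            kept.append(serialized)
    block = "\n".join([BEGIN_MARKER, *kept, END_MARKER])
    return block, flagged
```

Pattern matching of this kind is easy to evade, so it would complement rather than replace the user-approval step the audit describes. The delimiters only help if the skill's instructions also tell the agent to treat everything between them as data.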
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 19, 2026, 11:28 PM