auto-improvement

Pass

Audited by Gen Agent Trust Hub on Apr 2, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION]: The skill uses imperative and restrictive language to override agent autonomy and prevent the deactivation of its procedures. Key phrases include "ALWAYS active", "cannot be disabled", "mandatory", and the use of "STOP" markers to force adherence to specific instrumentation steps regardless of context.
  • [PROMPT_INJECTION]: The skill implements a workflow vulnerable to indirect prompt injection by persisting potentially untrusted external data into long-term memory files.
  • Ingestion points: Data is pulled from task outputs, user-provided corrections, and various project context files (e.g., package.json, source code).
  • Boundary markers: The instructions lack any requirement for boundary markers or warnings to ignore embedded instructions within the data being saved to the memory/ directory.
  • Capability inventory: The skill requires the agent to perform file-write operations to update persistent markdown logs and pattern files.
  • Sanitization: There is no requirement for sanitization, validation, or escaping of data before it is persisted, which could allow malicious instructions in project files or user input to become part of the agent's permanent operational guardrails.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 2, 2026, 12:42 AM