instinct-system

Pass

Audited by Gen Agent Trust Hub on Apr 2, 2026

Risk Level: SAFE, PROMPT_INJECTION
Full Analysis
  • [SAFE]: The skill manages internal project state by reading and writing to local files such as .claude/instincts.md and MEMORY.md. There is no evidence of data exfiltration, privilege escalation, or unauthorized external connections.
  • [PROMPT_INJECTION]: The system's core functionality involves deriving behavioral rules from project code, which presents a surface for indirect prompt injection if an attacker places malicious instructions in the source code.
  • Ingestion points: Codebase files accessed through developer tools and the project-specific configuration file .claude/instincts.md.
  • Boundary markers: Absent; the skill relies on its confidence-scoring logic (0.3 to 0.9) to distinguish between noise and valid conventions.
  • Capability inventory: Reading and searching the local filesystem; writing configuration and memory files.
  • Sanitization: No sanitization is performed on patterns learned from the code.
  • Mitigation: The 'Observe-Hypothesize-Confirm' cycle requires multiple occurrences of a pattern across different files, and the 'Promotion Protocol' mandates explicit user approval before an instinct becomes a permanent rule in MEMORY.md.
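
The mitigation described above can be illustrated with a minimal Python sketch. It is not the skill's actual implementation: the audit states only that confidence ranges from 0.3 to 0.9, that a pattern must recur across different files, and that promotion requires explicit user approval. The names `Instinct`, `observe`, `can_promote`, the per-file increment `STEP`, and the `min_files` default are all hypothetical.

```python
from dataclasses import dataclass, field

# The 0.3 and 0.9 bounds come from the audit; everything else is assumed.
MIN_CONFIDENCE = 0.3
MAX_CONFIDENCE = 0.9
STEP = 0.2  # hypothetical confidence gain per additional distinct file


@dataclass
class Instinct:
    """A candidate convention observed in the codebase (hypothetical model)."""
    pattern: str
    files: set[str] = field(default_factory=set)

    @property
    def confidence(self) -> float:
        # No observations yet: nothing to score.
        if not self.files:
            return 0.0
        # Confidence grows with the number of distinct files, capped at 0.9.
        score = MIN_CONFIDENCE + STEP * (len(self.files) - 1)
        return round(min(score, MAX_CONFIDENCE), 2)


def observe(instinct: Instinct, path: str) -> None:
    """Record one occurrence; repeats within the same file do not add weight."""
    instinct.files.add(path)


def can_promote(instinct: Instinct, user_approved: bool, min_files: int = 2) -> bool:
    """Promotion needs multi-file evidence, full confidence, and user approval."""
    return (
        user_approved
        and len(instinct.files) >= min_files
        and instinct.confidence >= MAX_CONFIDENCE
    )
```

Under these assumptions, an instruction planted in a single file never rises above the 0.3 floor, and even a pattern repeated across many files is not written to MEMORY.md until the user approves it. The sketch does not address the absence of boundary markers or sanitization noted above; it covers only the gating step.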
Audit Metadata
Risk Level: SAFE
Analyzed: Apr 2, 2026, 05:48 AM