cognitive-memory
Audited by Gen Agent Trust Hub on Feb 12, 2026
The 'cognitive-memory' skill was analyzed in full across all provided files, including Markdown templates, design documents, and shell scripts. Its primary function is to establish and manage a memory system for an AI agent, built on local file creation and modification with version control via Git.
1. Prompt Injection: No direct prompt injection attempts were found. On the contrary, the skill explicitly implements guardrails against prompt injection, such as '⛔ STOP. Do NOT proceed until user responds.' and 'APPROVAL FIRST' before any system file changes. The SOUL.md template also includes '🛑 System Change Guardrails (MANDATORY)' to prevent unauthorized modifications.
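The 'APPROVAL FIRST' gate described above can be sketched as a small shell helper. The function name and prompt wording here are illustrative, not the skill's actual implementation:

```shell
#!/usr/bin/env sh
# Hypothetical sketch of an approval-first gate: no system file is touched
# until the user answers "y" explicitly. Any other answer refuses the change.
confirm_change() {
  printf '⛔ STOP. %s -- proceed? [y/N] ' "$1"
  read -r answer
  [ "$answer" = "y" ] || [ "$answer" = "Y" ]
}

# Example call site (hypothetical): only edit SOUL.md after approval.
# confirm_change "This will modify SOUL.md" || exit 1
```

Defaulting to refusal (anything but an explicit "y" aborts) is what makes this a guardrail rather than a prompt.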
2. Data Exfiltration: No evidence of data exfiltration was detected. The shell scripts (init_memory.sh, upgrade_to_1.0.6.sh, upgrade_to_1.0.7.sh) perform standard local file system operations (mkdir, cp, echo) and Git commands (git init, git add, git commit). These operations are confined to the user's workspace and do not involve network requests to untrusted domains or access to sensitive system files (e.g., ~/.aws/credentials, ~/.ssh/id_rsa). The SOUL.md template explicitly states 'No public actions (emails, tweets, posts) without explicit approval', which is a strong internal policy against exfiltration.
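As a sketch of the audited pattern, the shell operations listed above (mkdir, cp, echo, git init/add/commit) all stay inside a single workspace directory. The directory layout and file names below are hypothetical, not the skill's actual structure:

```shell
#!/usr/bin/env sh
# Illustrative reconstruction of the local-only pattern: everything happens
# under one workspace directory; no network calls, no reads outside it.
set -eu

WORKSPACE="${1:-./memory}"

mkdir -p "$WORKSPACE/episodes" "$WORKSPACE/semantic"

# Seed a template only if absent -- never overwrite existing user state.
[ -f "$WORKSPACE/SOUL.md" ] || printf '# SOUL\n' > "$WORKSPACE/SOUL.md"

# Git as a local audit ground truth; no remotes are ever configured,
# so nothing can be pushed off the machine.
cd "$WORKSPACE"
git init -q
git add -A
git -c user.name=audit -c user.email=audit@example.invalid \
  commit -q -m "init memory workspace" || true  # no-op if nothing staged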
3. Obfuscation: No obfuscation techniques (e.g., Base64 encoding, zero-width characters, homoglyphs, URL/hex/HTML encoding) were found in any of the analyzed files.
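A check for the obfuscation markers listed above can be sketched as follows; the patterns and the 40-character Base64 threshold are assumptions, not the auditor's actual tooling:

```shell
#!/usr/bin/env sh
# Hypothetical obfuscation scan: flag long Base64-like runs and zero-width
# spaces (U+200B, matched here via its UTF-8 byte sequence E2 80 8B).
scan_obfuscation() {
  dir="$1"
  # Unbroken Base64-ish runs of 40+ characters often hide encoded payloads.
  grep -rEn '[A-Za-z0-9+/]{40,}={0,2}' "$dir" || true
  # Zero-width space, invisible in most editors.
  grep -rn "$(printf '\342\200\213')" "$dir" || true
}
```

A hit is a lead for manual review, not proof of malice: long tokens (hashes, keys in examples) can trip the Base64 heuristic legitimately.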
4. Unverifiable Dependencies: The skill's scripts do not install external packages or download code from untrusted sources. The upgrade scripts use python3 for JSON manipulation, but this is a common system utility, and its execution is guarded by a command -v python3 check, with a manual fallback provided if Python is not available. This is a safe approach and does not introduce unverifiable dependencies.
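The guard described above can be reconstructed roughly as below. The file name, JSON key, and version string are illustrative; only the `command -v python3` check-with-manual-fallback shape is taken from the audited scripts:

```shell
#!/usr/bin/env sh
# Sketch: mutate a JSON metadata file with python3 only if it is installed;
# otherwise print manual instructions instead of failing or downloading tools.
bump_version() {
  meta="$1"
  if command -v python3 >/dev/null 2>&1; then
    python3 - "$meta" <<'PY'
import json, sys

path = sys.argv[1]
with open(path) as f:
    data = json.load(f)
data["version"] = "1.0.7"  # illustrative target version
with open(path, "w") as f:
    json.dump(data, f, indent=2)
PY
  else
    echo "python3 not found: manually edit $meta and set \"version\" to \"1.0.7\""
  fi
}
```

The fallback branch is what keeps the dependency verifiable: the script degrades to instructions rather than fetching an interpreter or package.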
5. Privilege Escalation: No commands attempting privilege escalation (e.g., sudo, doas, chmod 777 on system files, service installations) were found. All script operations are designed to run within the user's current permissions and workspace.
6. Persistence Mechanisms: The skill does not attempt to establish persistence mechanisms (e.g., modifying ~/.bashrc, crontab, authorized_keys). The use of Git for an 'Audit ground truth' and audit.log serves internal version control and logging within the skill's defined workspace; it is not a mechanism for persistent unauthorized access.
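A combined static scan for the escalation and persistence primitives named in findings 5 and 6 might look like this; the pattern list is an assumption, not the auditor's actual checklist:

```shell
#!/usr/bin/env sh
# Hypothetical scan for privilege-escalation and persistence primitives.
# Uses GNU grep's \b word boundaries; a hit warrants manual review.
scan_danger() {
  dir="$1"
  grep -rEn \
    -e '\bsudo\b|\bdoas\b|chmod[[:space:]]+777' \
    -e 'crontab|\.bashrc|authorized_keys' \
    "$dir" || true
}
```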
7. Metadata Poisoning: The SKILL.md and _meta.json files contain standard descriptive metadata without any malicious instructions or hidden content. The Git commit URLs in _meta.json point to github.com/clawdbot/skills, the owner's organization and a trusted domain; they serve as references, not as a source for downloading code.
8. Indirect Prompt Injection: As with any LLM-based skill that processes user-provided or dynamically generated content (e.g., sub-agent proposals, episode logs), there is an inherent, general risk of indirect prompt injection if malicious data is introduced into the memory stores. However, the skill includes explicit guardrails, such as strict scope rules for reflection ('NEVER: code, configs, transcripts, outside memory/') and user approval for memory consolidation, which mitigate this risk. This is an informational warning about a general LLM vulnerability, not a specific flaw in the skill's code.
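The 'outside memory/' scope rule mentioned above can be enforced mechanically. A sketch, assuming GNU `realpath -m` (which also resolves paths that do not exist yet):

```shell
#!/usr/bin/env sh
# Hypothetical scope check: a write target is accepted only if its resolved
# path sits at or under the memory root, defeating ../ traversal.
in_scope() {
  root=$(realpath -m "$1")
  target=$(realpath -m "$2")
  case "$target" in
    "$root"|"$root"/*) return 0 ;;
    *)                 return 1 ;;
  esac
}
```

Resolving both paths before comparing prefixes is the important step: a naive string check would accept `memory/../SOUL.md`.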
9. Time-Delayed / Conditional Attacks: No time-delayed or conditional attacks were identified. Features like memory decay and scheduled reflection (e.g., 'Cron job during off-peak hours') are part of the skill's intended, benign functionality and are subject to user approval ('Never auto-run without permission').
Conclusion: The 'cognitive-memory' skill demonstrates a strong commitment to security through its design and implementation. The use of local file operations, Git for auditing, and explicit user approval for critical actions significantly reduces potential attack vectors. The skill is primarily composed of declarative Markdown and safe shell scripts, making it inherently less prone to code-based vulnerabilities. The identified risks are either general to LLM interactions or are explicitly mitigated by the skill's design.