agent-native-architecture

Fail

Audited by Gen Agent Trust Hub on Feb 21, 2026

Risk Level: HIGHCOMMAND_EXECUTIONDATA_EXFILTRATIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [DATA_EXFILTRATION] (HIGH): In 'references/refactoring-to-prompt-native.md', the instructions explicitly advocate for removing safety filters on tools like 'read_file' (e.g., 'Agent can read anything'), which allows an attacker to exfiltrate sensitive files via prompt injection.
  • [COMMAND_EXECUTION] (HIGH): The 'references/self-modification.md' file provides implementation details for 'self_deploy' and 'restart' tools that execute shell commands like 'npm run build' and 'git merge' based on agent-driven decisions.
  • [REMOTE_CODE_EXECUTION] (HIGH): The skill describes a 'Self-Modification' architecture where an agent is granted permission to write to its own source code files and trigger restarts, creating a high-risk path for persistent backdoors.
  • [PROMPT_INJECTION] (LOW): The 'references/dynamic-context-injection.md' file outlines an indirect prompt injection surface. Evidence: 1. Ingestion points: 'availableBooks', 'recentActivity' (dynamic context). 2. Boundary markers: Absent. 3. Capability inventory: 'read_file', 'write_file', 'web_search'. 4. Sanitization: Absent.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 21, 2026, 02:06 PM