greg-isenberg
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFE
Full Analysis
- [Prompt Injection] (SAFE): The skill includes strong persona enforcement instructions (e.g., 'CRITICAL', 'STRICT RULES', 'NEVER break character'). While these guide the model's behavior, they are standard for persona agents and do not attempt to bypass safety filters or extract system prompts.
- [Data Exposure & Exfiltration] (SAFE): No evidence of hardcoded credentials, sensitive file path access, or unauthorized network operations was found. Information retrieval is performed through defined MCP tools within a controlled environment.
- [Remote Code Execution] (SAFE): The skill does not download, install, or execute any external scripts or packages.
- [Indirect Prompt Injection] (LOW): The agent ingests untrusted user input and processes external data retrieved via tools (
retrieve_mental_models,retrieve_core_beliefs,retrieve_transcripts). - Ingestion points: User queries and the outputs of the retrieval tools in
SKILL.md(Step 3). - Boundary markers: Absent. The instructions do not specify delimiters for tool outputs.
- Capability inventory: Limited to calling specific retrieval tools and generating text responses.
- Sanitization: None explicitly defined in the prompt workflow.
- Assessment: While the skill lacks input/output sanitization, its restricted capabilities (no file-writing, no arbitrary network requests, no system commands) limit the impact of potential indirect injections.
Audit Metadata