The Agent Skills Directory

Prompt Injection (SAFE): The skill uses strong directives such as 'CRITICAL' and 'NEVER break character', but these are strictly for persona maintenance and linguistic styling. There are no instructions to bypass safety filters or ignore system-level constraints.
Data Exposure & Exfiltration (SAFE): The skill does not access sensitive local file paths (e.g., ~/.ssh or .env). Communication is limited to predefined MCP tools for retrieving mental models and transcripts. No hardcoded credentials or unauthorized network operations were found.
Unverifiable Dependencies & RCE (SAFE): No external package installations (pip, npm) or remote script executions (curl|bash) are present in the skill definition.
Indirect Prompt Injection (LOW): The skill retrieves data from external sources via MCP tools and synthesizes it for the user.
Ingestion points: Data enters the context from mcp__persona-agent__retrieve_mental_models, retrieve_core_beliefs, and retrieve_transcripts (SKILL.md, Step 3).
Boundary markers: Absent. There are no specific delimiters used to separate retrieved tool data from the system instructions.
Capability inventory: The agent is limited to text synthesis and conversational output. It has no file-write, shell-exec, or direct network capabilities.
Sanitization: None specified. While the agent could theoretically be influenced by malicious content inside the retrieved transcripts, the lack of high-privilege capabilities restricts the impact to conversational bias.

instantly-ai