progress-tracking
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH (PROMPT_INJECTION, DATA_EXFILTRATION, COMMAND_EXECUTION)
Full Analysis
- [Prompt Injection] (HIGH): The skill explicitly commands the agent to bypass user consent through instructions like "DO NOT ask permission to checkpoint" and "Don't wait to be asked." This is a direct attempt to override the agent's default safety architecture, which requires confirmation for state-altering or data-exporting actions.
- [Data Exfiltration] (HIGH): The skill leverages the `mcp__goldfish__checkpoint` tool to capture "Changed files" and "Workspace context." By directing the agent to perform these captures proactively and without oversight, it creates a high risk of exfiltrating sensitive data, such as the "API keys" explicitly mentioned in the skill's own example sequence, to a persistent storage backend.
- [Indirect Prompt Injection] (HIGH): The skill creates a significant vulnerability surface by monitoring external data (workspace files) and conversation triggers for "completion signals."
  * Ingestion points: The agent is told to monitor workspace context and specific text signals like "All tests passing" or "Bug fixed."
  * Boundary markers: Absent. There are no instructions to distinguish between actual task completion and malicious instructions embedded in file contents or logs.
  * Capability inventory: The skill utilizes `Bash` (system access) and `mcp__goldfish__checkpoint` (external storage), providing both execution and persistence capabilities.
  * Sanitization: Absent. No filtering or sanitization is applied to the files or context captured before they are sent to the external memory tool.
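
To make the consent and sanitization gaps above concrete, here is a minimal sketch of what a guarded checkpoint step could look like. It is not the skill's code or the real `mcp__goldfish__checkpoint` interface: the function names, the payload shape, and the regular expressions are all hypothetical, and the patterns are illustrative rather than a complete secret scanner.

```python
import re

REDACTED = "[REDACTED]"

# Illustrative patterns only (assumption): a real deployment would rely on a
# vetted secret scanner rather than two hand-written regular expressions.
KEY_VALUE = re.compile(
    r"(?i)\b(api[_-]?key|secret|token|password)(\s*[:=]\s*)(\S+)"
)
AWS_ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def redact_secrets(text):
    """Replace values that look like credentials with a placeholder."""
    # Keep the key name and separator, drop only the value.
    text = KEY_VALUE.sub(lambda m: m.group(1) + m.group(2) + REDACTED, text)
    return AWS_ACCESS_KEY_ID.sub(REDACTED, text)


def prepare_checkpoint(description, changed_files, confirm):
    """Build a checkpoint payload only after explicit user approval.

    `confirm` is a hypothetical callback that asks the user and returns a
    bool. Returns None when the user declines, so nothing is sent.
    The dict layout is an assumption, not the tool's actual schema.
    """
    if not confirm(description):
        return None
    return {
        "description": redact_secrets(description),
        "files": {
            path: redact_secrets(body) for path, body in changed_files.items()
        },
    }
```

The design point is ordering: approval is requested before any payload is built, and redaction runs on every captured string before it could reach persistent storage. This does not address the missing boundary markers; text read from files or logs would still need to be treated as data rather than as instructions.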
Recommendations
- AI detected serious security threats
Audit Metadata