state-management
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
- PROMPT_INJECTION (HIGH): Indirect prompt injection vulnerability through state file ingestion.
  - Ingestion points: The skill reads project state from `.flow/FLOW.md` and item-specific state from `.flow/items/ITEM-XXX.md`.
  - Boundary markers: The skill relies on markdown headers and XML-like tags (e.g., `<task>`) but lacks explicit security boundaries or instructions to ignore embedded malicious directions.
  - Capability inventory: The skill maps project 'Capabilities' to specific plugins/commands in the `Capabilities Cache` and executes arbitrary shell commands defined in the `<verify>` tags of tasks.
  - Sanitization: No sanitization or validation of the content within the markdown state files is described.
  - Risk: An attacker could modify the state files to inject malicious commands into the `verify` sections or alter the `Capabilities Cache` to redirect the agent's high-level logic to malicious scripts or endpoints.
- COMMAND_EXECUTION (MEDIUM): Execution of commands derived from markdown templates.
  - Evidence: The `ITEM-XXX.md` template in `templates.md` includes `<verify>` blocks containing strings like `npm run build` and `npm test -- AuthService` which are intended for execution.
  - Risk: While these are common development commands, the mechanism treats strings inside markdown files as executable logic. Without strict validation, this allows for Remote Code Execution (RCE) if the project files are untrusted.
Recommendations
- AI detected serious security threats
Audit Metadata