state-management

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • PROMPT_INJECTION (HIGH): Indirect prompt injection vulnerability through state file ingestion.
  • Ingestion points: The skill reads project state from .flow/FLOW.md and item-specific state from .flow/items/ITEM-XXX.md.
  • Boundary markers: The skill relies on markdown headers and XML-like tags (e.g., <task>) but lacks explicit security boundaries or instructions to ignore embedded malicious directions.
  • Capability inventory: The skill maps project 'Capabilities' to specific plugins/commands in the Capabilities Cache and executes arbitrary shell commands defined in the <verify> tags of tasks.
  • Sanitization: No sanitization or validation of the content within the markdown state files is described.
  • Risk: An attacker could modify the state files to inject malicious commands into the verify sections or alter the Capabilities Cache to redirect the agent's high-level logic to malicious scripts or endpoints.
  • COMMAND_EXECUTION (MEDIUM): Execution of commands derived from markdown templates.
  • Evidence: The ITEM-XXX.md template in templates.md includes <verify> blocks containing strings like npm run build and npm test -- AuthService which are intended for execution.
  • Risk: While these are common development commands, the mechanism treats strings inside markdown files as executable logic. Without strict validation, this allows for Remote Code Execution (RCE) if the project files are untrusted.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 05:58 AM