planning-with-files
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- [Indirect Prompt Injection] (HIGH): The primary workflow encourages the agent to ingest untrusted data from external sources and store it in persistent files that are later used as 'working memory' to guide decisions.
- Ingestion points: Untrusted research findings are written to
notes.mdandtask_plan.md(documented inSKILL.mdandexamples.md). - Boundary markers: Absent. The templates use standard markdown headers but provide no delimiters or instructions to ignore potentially embedded commands within the ingested data.
- Capability inventory: The skill utilizes extensive file read/write/edit capabilities and explicitly demonstrates
bashcommand execution for file management inexamples.md. - Sanitization: There is no mechanism described for sanitizing or filtering external content before it is processed or re-read into the prompt context.
- [Command Execution] (MEDIUM): The skill examples frequently use
bashcode blocks to perform file operations. While functional, this establishes a pattern where the agent is encouraged to use a shell environment, which increases the potential impact of a successful prompt injection attack. - [Metadata Poisoning] (LOW): The
reference.mdfile contains hallucinated or misleading claims regarding a $2 billion acquisition in 'December 2025'. While not directly executable, this deceptive metadata may attempt to falsely establish authority or reliability to influence agent behavior.
Recommendations
- AI detected serious security threats
Audit Metadata