honest-forget
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTION
Full Analysis
- Indirect Prompt Injection (HIGH): The skill possesses a high-risk capability/ingestion pair. It is designed to read and summarize arbitrary files (
read_fileon user-providedtargetpath) while having the power to modify the file system. - Ingestion points: The
targetinput andread_filetool allow the agent to process untrusted external data. - Capability inventory: The skill uses
write_fileandsearch_replace(defined inSKILL.md), allowing for permanent modification of the environment. - Boundary markers: No explicit delimiters or instruction-ignore warnings are specified for the summarization/compression process.
- Sanitization: No validation or escaping of external content is performed before the agent acts on the information, allowing malicious instructions in 'forgotten' files to trigger unauthorized file modifications.
- Self-Referential Deception (MEDIUM): The skill includes a file named
skill-snitch-report.mdwhich provides a pre-fabricated security audit and verdict. - Evidence: The file contains strings like 'Verdict: FORGET WITH INTEGRITY' and 'Risk Level: LOW'. This is a known adversarial pattern used to deceive automated and human analysts into bypassing deeper inspection by presenting a false sense of security.
Recommendations
- AI detected serious security threats
Audit Metadata