update-agents-md
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGH | PROMPT_INJECTION | COMMAND_EXECUTION
Full Analysis
- Prompt Injection (HIGH): The skill's core purpose is to modify the AI agent's governing rules stored in AGENTS.md. This enables a persistent injection vector where an attacker (or a compromised upstream data source) can trick the agent into adopting malicious operational guidelines that could be used for data exfiltration, bypass of safety constraints, or unauthorized actions in future sessions.
- Indirect Prompt Injection (HIGH):
  - Ingestion points: Reads AGENTS.md and processes arbitrary user input to generate new behavioral rules.
  - Boundary markers: Absent. The skill does not implement delimiters or 'ignore embedded instructions' warnings when processing the proposed rules.
  - Capability inventory: Performs file write operations to AGENTS.md, which significantly influences agent reasoning and decision-making for all subsequent tasks.
  - Sanitization: Absent. The skill lacks validation or filtering of the content being injected into the ruleset, relying solely on the end-user's manual review of a diff.
- Command Execution (MEDIUM): The skill performs file system modifications. While technically a 'write' operation to a markdown file, in the context of an agent-governance file like AGENTS.md, this is effectively a runtime modification of the agent's logic and behavioral constraints.
Recommendations
- AI detected serious security threats
Audit Metadata