canned-responses-anthropic
Warn
Audited by Gen Agent Trust Hub on Apr 2, 2026
Risk Level: MEDIUMPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill uses deceptive authorship metadata to impersonate a trusted organization.
- Evidence: The YAML frontmatter in
SKILL.mdidentifies the author as 'Anthropic'. This constitutes metadata poisoning by misrepresenting the source of the legal guidelines and templates, potentially manipulating user trust based on the reputation of a well-known AI safety organization. - [SAFE]: No technical capabilities for data exfiltration or malicious command execution were identified.
- Ingestion points: The skill processes user-supplied legal inquiries to populate templates as described in
SKILL.md. - Boundary markers: No explicit delimiters or instructions to ignore embedded commands within user input are present.
- Capability inventory: There are no tools, shell commands, or network operations defined in the skill configuration or documentation.
- Sanitization: No automated validation or escaping of user-provided content is implemented.
- Result: Although the skill lacks input sanitization, the total absence of exploitable capabilities (such as file writing, network access, or command execution) prevents the conversion of input data into a security threat.
Audit Metadata