llm-ops

Pass

Audited by Gen Agent Trust Hub on Apr 5, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION]: The rag_query function in SKILL.md performs a semantic search and interpolates the retrieved document content directly into a system prompt. This creates a surface for indirect prompt injection if the source documents contain malicious instructions designed to hijack the LLM context.
  • [PROMPT_INJECTION]: The evaluate_response function in SKILL.md accepts untrusted input in the actual parameter and places it into an evaluation prompt. An attacker-controlled input could attempt to manipulate the evaluation score or reasoning by providing embedded instructions.
  • [PROMPT_INJECTION]: The cot_analysis function in SKILL.md interpolates the problem variable directly into a reasoning chain. If the input problem contains specific injection sequences, it could bypass the intended step-by-step logic.
  • Ingestion points: rag_query (retrieved document content), evaluate_response (external response content), cot_analysis (user input problem).
  • Boundary markers: Minimal text-based markers (e.g., 'Contexto:', 'PERGUNTA:') are used, which are insufficient to reliably separate data from instructions in adversarial scenarios.
  • Capability inventory: The skill snippets demonstrate use of the anthropic client to execute model calls based on these interpolated strings.
  • Sanitization: No sanitization, validation, or escaping logic is present in the provided code snippets to protect against instruction injection.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 5, 2026, 10:31 AM