advanced-evaluation

Pass

Audited by Gen Agent Trust Hub on Apr 30, 2026

Risk Level: SAFE (flagged: PROMPT_INJECTION)
Full Analysis
  • [PROMPT_INJECTION]: The skill's primary functionality creates an attack surface for Indirect Prompt Injection.
  • Ingestion points: Untrusted data enters the agent context through the {prompt}, {response}, {response_a}, and {response_b} placeholders in the evaluation prompt templates found in SKILL.md.
  • Boundary markers: The templates utilize Markdown headers (e.g., ## Original Prompt, ## Response to Evaluate) to delimit untrusted content from the evaluation instructions.
  • Capability inventory: The skill facilitates LLM text generation based on these templates, as described in references/implementation-patterns.md and demonstrated in scripts/evaluation_example.py.
  • Sanitization: There is no evidence of input sanitization or explicit "ignore embedded instructions" warnings within the templates to further mitigate adversarial input.
Audit Metadata
Risk Level: SAFE
Analyzed: Apr 30, 2026, 10:01 PM