AGENT LAB: SKILLS

agentic-eval

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [Prompt Injection] (LOW): The skill describes patterns for 'LLM-as-judge' and self-reflection, creating a surface for indirect prompt injection.
  • Ingestion points: task and output variables are interpolated directly into prompts in reflect_and_refine, evaluate, and evaluate_outcome functions.
  • Boundary markers: Absent. The prompts do not use delimiters (like triple quotes or XML tags) or system instructions to prevent the evaluator from obeying commands embedded in the data being evaluated.
  • Capability inventory: The skill involves calling an LLM in a loop and potentially executing code via run_tests.
  • Sanitization: None provided.
  • [Command Execution] (LOW): The CodeReflector pattern (Pattern 3) explicitly suggests a workflow that executes dynamically generated Python code and tests via a run_tests function. While the skill only provides the pattern and not the implementation of run_tests, it encourages the execution of untrusted AI-generated content which requires strict sandboxing to avoid host compromise.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 17, 2026, 04:48 PM