NYC
skills/smithery/ai/agentic-eval/Gen Agent Trust Hub

agentic-eval

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHREMOTE_CODE_EXECUTIONPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [REMOTE_CODE_EXECUTION] (HIGH): The 'Code-Specific Reflection' pattern (Pattern 3) generates Python code and unit tests from an untrusted 'spec' and immediately executes them.
  • Evidence: File SKILL.md contains result = run_tests(code, tests).
  • Risk: If the input specification contains malicious instructions, the LLM may generate code that performs unauthorized system operations (e.g., file deletion or network access) which are then executed on the host system.
  • [PROMPT_INJECTION] (HIGH): All provided prompt templates directly interpolate external, untrusted data into the LLM context without any boundary markers or sanitization.
  • Evidence: Templates like llm(f"Complete this task:\n{task}") and llm(f"Write Python code for: {spec}") allow user-provided strings to take control of the LLM's instructions.
  • Risk: An attacker can use 'ignore previous instructions' techniques within the task or spec variables to bypass intended evaluation logic or force the generation of malicious payloads.
  • [INDIRECT PROMPT INJECTION] (HIGH): This skill is a primary target for Category 8 attacks because it combines untrusted data ingestion with high-privilege execution capabilities.
  • Ingestion points: The variables task, spec, and output in all three patterns.
  • Boundary markers: None. No delimiters (like XML tags or triple quotes) are used to separate instructions from data.
  • Capability inventory: The skill possesses the ability to execute code via the run_tests function and parse structured data via json.loads.
  • Sanitization: None. Content is passed directly to the LLM and the execution environment.
  • [COMMAND_EXECUTION] (MEDIUM): The patterns facilitate the execution of arbitrary commands by treating LLM-generated strings as executable code logic.
  • Evidence: The CodeReflector class automate a loop of writing, testing, and fixing code based on error messages, which can be manipulated into a persistent exploit loop.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 09:23 AM