The Agent Skills Directory

[REMOTE_CODE_EXECUTION] (HIGH): The 'Code-Specific Reflection' pattern (Pattern 3) generates Python code and unit tests from an untrusted 'spec' and immediately executes them.
Evidence: File SKILL.md contains result = run_tests(code, tests).
Risk: If the input specification contains malicious instructions, the LLM may generate code that performs unauthorized system operations (e.g., file deletion or network access) which are then executed on the host system.
[PROMPT_INJECTION] (HIGH): All provided prompt templates directly interpolate external, untrusted data into the LLM context without any boundary markers or sanitization.
Evidence: Templates like llm(f"Complete this task:\n{task}") and llm(f"Write Python code for: {spec}") allow user-provided strings to take control of the LLM's instructions.
Risk: An attacker can use 'ignore previous instructions' techniques within the task or spec variables to bypass intended evaluation logic or force the generation of malicious payloads.
[INDIRECT PROMPT INJECTION] (HIGH): This skill is a primary target for Category 8 attacks because it combines untrusted data ingestion with high-privilege execution capabilities.
Ingestion points: The variables task, spec, and output in all three patterns.
Boundary markers: None. No delimiters (like XML tags or triple quotes) are used to separate instructions from data.
Capability inventory: The skill possesses the ability to execute code via the run_tests function and parse structured data via json.loads.
Sanitization: None. Content is passed directly to the LLM and the execution environment.
[COMMAND_EXECUTION] (MEDIUM): The patterns facilitate the execution of arbitrary commands by treating LLM-generated strings as executable code logic.
Evidence: The CodeReflector class automate a loop of writing, testing, and fixing code based on error messages, which can be manipulated into a persistent exploit loop.

agentic-eval