The Agent Skills Directory

Indirect Prompt Injection (MEDIUM): The skill processes 'trace' data containing untrusted external content for evaluation. This creates an attack surface where malicious payloads in the traces could influence the LLM judge's scores or behavior. \n
Ingestion points: scripts/evaluator.py (via trace scoring commands).\n
Boundary markers: Not visible in provided files.\n
Capability inventory: Network access to Langfuse and OpenRouter APIs; execution of local Python scripts.\n
Sanitization: Not visible in provided files.\n- Metadata Poisoning (MEDIUM): The skill description claims to use a non-existent 'GPT-5-nano' model. This is deceptive metadata that misleads users regarding the skill's actual capabilities and technical foundation.\n- Command Execution (LOW): The skill executes local Python scripts using python3 {baseDir}/scripts/evaluator.py. While standard for its stated purpose, it relies on the integrity of the external script which is not provided in the analyzed context.

llm-evaluator