mixseek-evaluator-config
Warn
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: MEDIUM (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis
- COMMAND_EXECUTION (LOW): The skill triggers the execution of a local Python validation script (validate-config.py) using the uv tool. While this is an operational step for configuration integrity, it represents host-level command execution.
- PROMPT_INJECTION (MEDIUM): An indirect prompt injection surface (Category 8) is identified within the LLMPlain metric configuration.
  1. Ingestion points: User requirements for evaluation metrics (Step 1).
  2. Boundary markers: Absent; user input is directly interpolated into the system_instruction field.
  3. Capability inventory: Writing configuration files (evaluator.toml, judgment.toml) and executing validation scripts.
  4. Sanitization: Absent; no filtering or sanitization of the instructional text is mentioned.
  This allows potentially malicious instructions to influence the behavior of the downstream LLM agent that consumes these configuration files.
Audit Metadata