llm-judge
Warn
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
- COMMAND_EXECUTION (MEDIUM): The skill executes potentially dangerous shell commands to analyze repositories and run tests. Evidence:
references/repo-agent.mdcontains logic to rungit,pytest,npm test, andgo testwithin the provided$REPO_PATH. Risk: Malicious repositories can exploit these tools to execute arbitrary code on the host machine viapackage.jsonscripts,conftest.pyfiles, or similar build/test hooks.\n- PROMPT_INJECTION (LOW): The skill is vulnerable to Indirect Prompt Injection (Category 8). Ingestion points: Phase 1 agents read untrusted code from the repository and the specification document as defined inreferences/repo-agent.md. Boundary markers: None identified. The prompt templates inSKILL.mdinterpolate external content ($SPEC_CONTENT and repository data) directly into instructions without delimiters or safety warnings. Capability inventory: The agent possesses capabilities to run shell commands and read local file systems. Sanitization: No sanitization, escaping, or validation is performed on the ingested code or specification text before it is processed by the LLM.
Audit Metadata