neuro-symbolic-reasoning
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGHCOMMAND_EXECUTIONEXTERNAL_DOWNLOADS
Full Analysis
- Privilege Escalation (HIGH): The file
references/packages.mdinstructs the agent to executesudo apt-get install prover9for Linux systems. Explicitly requesting the agent to use root privileges to modify the host system is a severe security risk that could be leveraged for full system compromise. - Unverifiable Dependencies & Remote Code Execution (MEDIUM): The skill directs the agent to install multiple external dependencies through standard package managers (
pip install z3-solver nltk anthropic,brew install prover9). Automated installation of external software and binaries at runtime is a critical attack vector that can be exploited for supply chain attacks or arbitrary code execution during the setup phase. - Indirect Prompt Injection (LOW): The core functionality defined in
SKILL.mdandreferences/logic-llm.mdinvolves taking an untrusted natural language 'NL Problem' from a user and converting it into a 'Logic Program' for execution. - Ingestion points: User input enters the system via the 'NL Problem' field in the Core Pipeline.
- Boundary markers: There are no explicit delimiters or system instructions provided to ensure the agent ignores malicious directives embedded within the natural language problems.
- Capability inventory: The skill possesses the capability to write files (
SKILL.mdpolicy), execute system commands (sudo apt,brew), and execute solver logic via thenltkandz3libraries. - Sanitization: No sanitization or validation of the generated logic programs is present before they are passed to the symbolic solvers (Prover9/Z3).
Recommendations
- AI detected serious security threats
Audit Metadata