neuro-symbolic-reasoning

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGH (COMMAND_EXECUTION, EXTERNAL_DOWNLOADS)
Full Analysis
  • Privilege Escalation (HIGH): The file references/packages.md instructs the agent to execute sudo apt-get install prover9 for Linux systems. Explicitly requesting the agent to use root privileges to modify the host system is a severe security risk that could be leveraged for full system compromise.
  • Unverifiable Dependencies & Remote Code Execution (MEDIUM): The skill directs the agent to install several external dependencies through standard package managers (pip install z3-solver nltk anthropic, brew install prover9). Installing external software and binaries automatically at runtime, with no pinned versions or integrity checks in the quoted commands, exposes the setup phase to supply chain attacks and arbitrary code execution.
  • Indirect Prompt Injection (LOW): The core functionality defined in SKILL.md and references/logic-llm.md involves taking an untrusted natural language 'NL Problem' from a user and converting it into a 'Logic Program' for execution.
      • Ingestion points: User input enters the system via the 'NL Problem' field in the Core Pipeline.
      • Boundary markers: No explicit delimiters or system instructions are provided to ensure the agent ignores malicious directives embedded in the natural language problems.
      • Capability inventory: The skill can write files (SKILL.md policy), execute system commands (sudo apt, brew), and execute solver logic via the nltk and z3 libraries.
      • Sanitization: Generated logic programs are not sanitized or validated before they are passed to the symbolic solvers (Prover9/Z3).
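To make the missing boundary markers and sanitization concrete, the following is a minimal sketch of what such checks could look like. It is not taken from the audited skill: the delimiter strings, the function names wrap_untrusted and validate_formula, the character allow-list, and the MAX_LEN limit are all hypothetical choices made for illustration, and a character-level filter is only a rough stand-in for a real parser of Prover9 or Z3 input.

```python
import re

# Hypothetical delimiters; the audited skill defines none.
BEGIN, END = "<<<NL_PROBLEM", "NL_PROBLEM>>>"

# Hypothetical allow-list covering Prover9-style formulas:
# identifiers, whitespace, parentheses, and the operators - & | -> <-> = !=
ALLOWED = re.compile(r"[A-Za-z0-9_\s(),.&|<>=!\-]+")
MAX_LEN = 2000  # assumed upper bound on formula length


def wrap_untrusted(nl_problem: str) -> str:
    """Fence user text so the prompt can mark it as data, not instructions."""
    if BEGIN in nl_problem or END in nl_problem:
        # Reject rather than strip, so a delimiter cannot be reassembled.
        raise ValueError("input contains a reserved delimiter")
    return f"{BEGIN}\n{nl_problem}\n{END}"


def validate_formula(formula: str) -> bool:
    """Return True only if the generated formula passes basic checks."""
    if not formula or len(formula) > MAX_LEN:
        return False
    if ALLOWED.fullmatch(formula) is None:
        return False
    # Parentheses must be balanced and never close before they open.
    depth = 0
    for ch in formula:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0
```

In this sketch, a formula such as all x (man(x) -> mortal(x)). passes, while text containing characters outside the allow-list (for example a semicolon or a quote) is rejected before it reaches a solver. Rejecting delimiter look-alikes outright, instead of removing them, avoids the case where stripping one occurrence reassembles another.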
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 17, 2026, 06:40 PM