NYC

langchain

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHREMOTE_CODE_EXECUTIONEXTERNAL_DOWNLOADSCOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [REMOTE_CODE_EXECUTION] (HIGH): The 'calculator' tool implementation in SKILL.md uses the eval() function on the expression argument. Because this argument is supplied by the LLM based on user input, it enables arbitrary Python code execution (e.g., system commands via os.system) on the environment running the agent.- [EXTERNAL_DOWNLOADS] (LOW): The skill requires installing langchain, langchain-openai, langchain-anthropic, and langgraph. These are well-known packages from the langchain-ai organization. Under [TRUST-SCOPE-RULE], these downloads are considered low risk.- [COMMAND_EXECUTION] (HIGH): Execution of dynamic strings via eval() within the Python code blocks represents a high-risk operation.- [PROMPT_INJECTION] (LOW): The RAG and agent patterns present an indirect prompt injection surface. Evidence: 1. Ingestion points: retriever results and user question are interpolated into prompts. 2. Boundary markers: No delimiters or instructions to ignore embedded commands are present in the templates. 3. Capability inventory: The agent has access to eval() and mock search tools. 4. Sanitization: No sanitization of retrieved context is performed.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 05:42 PM