guidance

Fail

Audited by Gen Agent Trust Hub on Mar 28, 2026

Risk Level: HIGHCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill contains multiple examples of 'ReAct Agent' implementations that use Python's eval() function to execute tool logic based on model-generated input.\n
  • Evidence: In SKILL.md (Pattern 4) and references/examples.md (Agent Systems and Tool Use with Validation), the calculator tool is defined as lambda expr: eval(expr).\n
  • Risk: The input passed to eval() is generated by the LLM via the gen('action_input') command. If the model is manipulated to produce malicious code instead of a mathematical expression, that code will be executed with the permissions of the agent process.\n- [EXTERNAL_DOWNLOADS]: The documentation instructs the user to install third-party dependencies from standard package registries.\n
  • Evidence: The skill references pip install guidance, transformers, and llama_cpp.\n
  • Context: These are well-known open-source libraries provided by reputable organizations and are considered safe for the intended use of this skill.\n- [PROMPT_INJECTION]: The skill includes patterns that are vulnerable to indirect prompt injection due to a lack of input sanitization and boundary enforcement.\n
  • Ingestion points: The skill ingests untrusted text data through the question argument in the react_agent function and the text argument in extract_entities.\n
  • Boundary markers: The implementation lacks specific markers (such as XML tags or explicit instructions) to prevent the LLM from following commands embedded within the processed data.\n
  • Capability inventory: The agent has high-impact capabilities, specifically the ability to execute code via the eval() tool function.\n
  • Sanitization: There is no validation or sanitization logic applied to the generated action_input before it is processed by the execution tool.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Mar 28, 2026, 06:07 PM