guidance
Fail
Audited by Gen Agent Trust Hub on Mar 28, 2026
Risk Level: HIGHCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill contains multiple examples of 'ReAct Agent' implementations that use Python's
eval()function to execute tool logic based on model-generated input.\n - Evidence: In
SKILL.md(Pattern 4) andreferences/examples.md(Agent Systems and Tool Use with Validation), thecalculatortool is defined aslambda expr: eval(expr).\n - Risk: The input passed to
eval()is generated by the LLM via thegen('action_input')command. If the model is manipulated to produce malicious code instead of a mathematical expression, that code will be executed with the permissions of the agent process.\n- [EXTERNAL_DOWNLOADS]: The documentation instructs the user to install third-party dependencies from standard package registries.\n - Evidence: The skill references
pip install guidance,transformers, andllama_cpp.\n - Context: These are well-known open-source libraries provided by reputable organizations and are considered safe for the intended use of this skill.\n- [PROMPT_INJECTION]: The skill includes patterns that are vulnerable to indirect prompt injection due to a lack of input sanitization and boundary enforcement.\n
- Ingestion points: The skill ingests untrusted text data through the
questionargument in thereact_agentfunction and thetextargument inextract_entities.\n - Boundary markers: The implementation lacks specific markers (such as XML tags or explicit instructions) to prevent the LLM from following commands embedded within the processed data.\n
- Capability inventory: The agent has high-impact capabilities, specifically the ability to execute code via the
eval()tool function.\n - Sanitization: There is no validation or sanitization logic applied to the generated
action_inputbefore it is processed by the execution tool.
Recommendations
- AI detected serious security threats
Audit Metadata