openai-agents
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFE
Full Analysis
- [Prompt Injection] (SAFE): No malicious instructions were detected. On the contrary, the skill includes specialized templates (`agent-guardrails-input.ts`) designed to teach developers how to detect and block prompt injection and jailbreak attempts.
- [Data Exposure & Exfiltration] (SAFE): The templates correctly handle sensitive data by using environment variables for API keys and strictly advising against exposing primary credentials to client-side code. Reference implementations for backend proxies and ephemeral session tokens are provided.
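The backend-proxy and ephemeral-token advice above can be sketched as follows. This is an illustrative reconstruction, not the skill's actual template code: the names `mintSessionToken`, `isSessionValid`, and `SESSION_TTL_MS` are assumptions. The point is that the primary API key stays in `process.env` on the server, while the browser only ever receives a short-lived token.

```typescript
import { randomBytes } from "node:crypto";

// Ephemeral sessions: tokens expire quickly, limiting the blast radius
// if one leaks from the client. (TTL value is illustrative.)
const SESSION_TTL_MS = 60_000;
const sessions = new Map<string, number>(); // token -> expiry timestamp (ms)

export function mintSessionToken(now: number = Date.now()): string {
  const token = randomBytes(16).toString("hex");
  sessions.set(token, now + SESSION_TTL_MS);
  return token;
}

export function isSessionValid(token: string, now: number = Date.now()): boolean {
  const expiry = sessions.get(token);
  return expiry !== undefined && expiry > now;
}

// A proxy handler would first check isSessionValid(token), then call the
// OpenAI API server-side using process.env.OPENAI_API_KEY -- the primary
// credential never reaches client-side code.
```

In this shape, the only secret the client can lose is a token that expires on its own, which is the property the audit credits the templates with demonstrating.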
- [Indirect Prompt Injection] (SAFE): While the agents are designed to ingest untrusted user data (a necessary surface for LLM agents), the skill provides robust evidence of mitigation strategies.
- Ingestion points: The `run()` method in almost all agent templates (e.g., `agent-basic.ts`).
- Boundary markers: Demonstrated in the guardrails templates, which use secondary LLM calls for validation.
- Capability inventory: Tools include informational (weather) and high-stakes (refunds, account deletion) actions, with the latter requiring explicit human-in-the-loop approval.
- Sanitization: Provided via the `OutputGuardrail` and `InputGuardrail` patterns.
- [Remote Code Execution] (SAFE): Dependencies are restricted to standard, well-known libraries (`@openai/agents`, `zod`). No instances of unsafe dynamic execution (eval/exec) or piped remote script execution were found.
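The input-guardrail pattern the audit credits can be sketched as below. This is an illustrative reconstruction, not the skill's template code: the guardrail object mirrors the general shape used by `@openai/agents` (a named guardrail whose `execute` returns a tripwire flag), but the `detectJailbreak` helper and its keyword heuristic are stand-ins for the secondary LLM call the real templates use for validation.

```typescript
// Hypothetical heuristic standing in for a secondary LLM classification call.
export function detectJailbreak(input: string): RegExp | undefined {
  const suspicious = [
    /ignore (all )?previous instructions/i,
    /pretend you have no rules/i,
  ];
  return suspicious.find((re) => re.test(input));
}

// Guardrail in the tripwire shape: when tripwireTriggered is true,
// the agent run is aborted before the model acts on the input.
const jailbreakInputGuardrail = {
  name: "jailbreak-detector",
  async execute({ input }: { input: string }) {
    const hit = detectJailbreak(input);
    return {
      tripwireTriggered: hit !== undefined,
      outputInfo: hit ? `matched pattern: ${String(hit)}` : "clean",
    };
  },
};
```

A template would attach this via the agent's input-guardrail list, so untrusted user data is screened at the ingestion point (`run()`) rather than inside tool logic.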
Audit Metadata