approvals
Pass
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: SAFE
Full Analysis
- [Prompt Injection] (SAFE): The skill instructions do not contain attempts to override or bypass safety filters. Instead, they reinforce a strict governance model where the agent is explicitly 'untrusted' and must justify all actions.
- [Data Exposure & Exfiltration] (SAFE): The skill includes explicit warnings against hardcoding secrets or PII in metadata and context fields. All data access is mediated through governed tools (
decision_read,decision_write) rather than direct network or file system calls. - [Indirect Prompt Injection] (LOW): The skill describes patterns for ingesting untrusted data from 'Data Products.' This represents an attack surface for indirect prompt injection.
- Ingestion points:
decision_readresults are ingested into the agent context. - Boundary markers: The instructions suggest using clear context and reasoning, though they do not provide specific delimiter characters for the data itself.
- Capability inventory: The skill allows for writes and approvals, but only within a governed lifecycle.
- Sanitization: The skill relies on the underlying TraceMem platform to handle validation and policy checks (
decision_evaluate). - [Unverifiable Dependencies] (SAFE): No external package installations or remote script executions are requested. The skill uses a set of internal/core MCP tools.
- [Privilege Escalation] (SAFE): While an 'override' automation mode is defined, the instructions clarify that this requires explicit permission and human oversight, adhering to a 'Fail Closed' philosophy.
Audit Metadata