approvals

Pass

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: SAFE
Full Analysis
  • [Prompt Injection] (SAFE): The skill instructions do not contain attempts to override or bypass safety filters. Instead, they reinforce a strict governance model where the agent is explicitly 'untrusted' and must justify all actions.
  • [Data Exposure & Exfiltration] (SAFE): The skill includes explicit warnings against hardcoding secrets or PII in metadata and context fields. All data access is mediated through governed tools (decision_read, decision_write) rather than direct network or file system calls.
  • [Indirect Prompt Injection] (LOW): The skill describes patterns for ingesting untrusted data from 'Data Products.' This represents an attack surface for indirect prompt injection.
  • Ingestion points: decision_read results are ingested into the agent context.
  • Boundary markers: The instructions suggest using clear context and reasoning, though they do not provide specific delimiter characters for the data itself.
  • Capability inventory: The skill allows for writes and approvals, but only within a governed lifecycle.
  • Sanitization: The skill relies on the underlying TraceMem platform to handle validation and policy checks (decision_evaluate).
  • [Unverifiable Dependencies] (SAFE): No external package installations or remote script executions are requested. The skill uses a set of internal/core MCP tools.
  • [Privilege Escalation] (SAFE): While an 'override' automation mode is defined, the instructions clarify that this requires explicit permission and human oversight, adhering to a 'Fail Closed' philosophy.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 17, 2026, 06:12 PM