confidence-honesty

Pass

Audited by Gen Agent Trust Hub on Feb 21, 2026

Risk Level: SAFENO_CODE
Full Analysis
  • Prompt Injection (SAFE): The skill uses strong instructional markers like 'CRITICAL' and emoji alerts, but these are context-appropriate for enforcing reasoning constraints and do not attempt to override system safety protocols.
  • Indirect Prompt Injection (SAFE): The skill suggests the agent fetch external data (logs, code) to verify claims. While this introduces a potential vulnerability surface if external data contains instructions, the skill does not implement the fetching tools or bypass existing security boundaries. Evidence: 1. Ingestion: codebase search, file fetching. 2. Boundaries: Absent. 3. Capabilities: search, fetch, spawn agent. 4. Sanitization: Absent.
  • Data Exposure & Exfiltration (SAFE): No hardcoded credentials, sensitive file paths, or unauthorized network operations were identified.
  • Unverifiable Dependencies (SAFE): No third-party packages or remote execution patterns were found.
Audit Metadata
Risk Level
SAFE
Analyzed
Feb 21, 2026, 01:21 PM