verify-evidence-loop
Pass
Audited by Gen Agent Trust Hub on Apr 19, 2026
Risk Level: SAFEPROMPT_INJECTIONDATA_EXFILTRATION
Full Analysis
- [PROMPT_INJECTION]: The skill processes untrusted user claims and external web content, presenting a surface for indirect prompt injection attacks.
- Ingestion points: Technical claims via $ARGUMENTS and web content retrieved via WebSearch and WebFetch (SKILL.md).
- Boundary markers: Employs specific delimiters like ---CLAIM-START--- and tags to isolate data from instructions.
- Capability inventory: Uses WebSearch, WebFetch, Write, AskUserQuestion, and Agent calls to process data and generate reports.
- Sanitization: Implements comprehensive hardening including stripping tag characters (<, >), truncating input to 500 characters, and replacing markdown headers in evidence with [H] to prevent prompt structure pollution. Subagents and reviewers are explicitly instructed to ignore commands found within input data.
- [DATA_EXFILTRATION]: The skill performs network operations to external search engines which may transmit technical keywords from the claim to third-party providers.
- This behavior is documented with a mandatory user warning in the Step 0: HITL section.
- The skill implements Seed Hygiene rules to extract only general technical keywords for dissent searches rather than using verbatim claim text, minimizing sensitive data exposure.
Audit Metadata