academic-paper-reviewer

Pass

Audited by Gen Agent Trust Hub on May 19, 2026

Risk Level: SAFE
Full Analysis
  • [PROMPT_INJECTION]: The skill implements a 'Sprint Contract Protocol' (v3.6.2) that enforces a strict separation between a content-blind Phase 1 (establishing criteria) and a content-visible Phase 2 (performing the review). This architecture prevents the content of a paper from overriding the agent's instructions or criteria. Specific agent files, such as agents/eic_agent.md and agents/devils_advocate_reviewer_agent.md, include explicit defensive directives to treat prior outputs and paper segments as data rather than instructions. These directives ensure that any 'ignore previous instructions' strings found within user-provided papers are ignored by the model. The static detector hits are false positives triggered by this defensive terminology.- [DATA_EXFILTRATION]: The skill uses an optional 'cross-model verification' feature (ARS_CROSS_MODEL) that sends data to different LLM providers to ensure diverse feedback. This is a primary, documented feature of the review pipeline and does not constitute unauthorized exfiltration. No hardcoded credentials or sensitive file paths (e.g., .ssh, .env) were found.- [COMMAND_EXECUTION]: The skill references a local Python script scripts/check_sprint_contract.py and JSON schemas for validating contract integrity. These are benign internal tools used for maintaining the skill's operational standards. There are no attempts to execute arbitrary shell commands or remote scripts.
Audit Metadata
Risk Level
SAFE
Analyzed
May 19, 2026, 03:42 AM