paper-revision
Pass
Audited by Gen Agent Trust Hub on Feb 22, 2026
Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
- PROMPT_INJECTION (LOW): The skill is vulnerable to Indirect Prompt Injection because it ingests and acts upon untrusted reviewer feedback ($0).
- Ingestion points: Reviewer comments input ($0) defined in
SKILL.mdand processed throughout the workflow. - Boundary markers: Absent. The prompts in
references/revision-prompts.md(e.g., in the 'Targeted Edit Prompts' and 'For Missing Experiment Concerns' sections) interpolate the{reviewer_comment}directly into the agent's instructions without delimiters or clear separation between system instructions and untrusted data. - Capability inventory: The skill is capable of modifying local files (paper drafts) and invoking the
experiment-codeskill, which involves dynamic code generation and execution. - Sanitization: No sanitization or validation logic is present to filter malicious instructions embedded within reviewer feedback.
- COMMAND_EXECUTION (LOW): The skill workflow explicitly directs the agent to generate and run code using the
experiment-codeskill when "additional experiments" are requested by reviewers. While this is part of the intended autonomous research use case, it creates a risk if the reviewer feedback contains malicious instructions designed to exploit the code execution environment. - DATA_EXPOSURE (SAFE): The skill accesses local paper drafts ($1) and skill-specific reference files. No evidence of hardcoded credentials, sensitive system file access, or unauthorized network exfiltration was detected.
Audit Metadata