close-session

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGH · COMMAND_EXECUTION · REMOTE_CODE_EXECUTION · PROMPT_INJECTION
Full Analysis
  • Indirect Prompt Injection (HIGH): The skill identifies and processes untrusted data from steps.md, tasks.md, and project files to determine 'work completion' and 'session reports'.
      • Ingestion points: steps.md, tasks.md, and root folder file scans.
      • Boundary markers: None detected. The agent is instructed to read these files directly and interpret their contents as state.
      • Capability inventory: File deletion (delete .md files), file modification (Edit tool), and command execution (python bulk-complete.py).
      • Sanitization: None detected. Malicious instructions embedded in task names or project files could manipulate the 'bulk-complete' script arguments or the file cleanup logic.
  • Command Execution (MEDIUM): The skill executes an external Python script (bulk-complete.py) with interpolated arguments (--project [ID]).
      • Evidence: python 00-system/skills/bulk-complete/scripts/bulk-complete.py --project [ID] --all --no-confirm in SKILL.md.
      • Risk: If the [ID] or other parameters are derived from untrusted project names or metadata, it could lead to command argument injection.
  • Data Modification & Deletion (MEDIUM): The 'Temp File Cleanup' feature performs automated and interactive deletion of files in the root folder.
      • Evidence: 'Clean temp files (delete .md files not in system folders)' in the workflow steps.
      • Risk: High potential for accidental or malicious data loss if the logic for determining 'system folders' is bypassed or if a malicious file name triggers a logic error in the cleanup scan.
  • Prompt Injection (LOW): The skill uses strong imperative language to force its own execution ('CRITICAL: This skill is AUTO-TRIGGERED', 'Never skip this skill'). While typical for system-level automation, this behavior overrides agent autonomy and could be used to hide malicious activity within a 'mandatory' cleanup routine.
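
The argument-injection risk in the Command Execution finding can be illustrated with a minimal sketch of validating the interpolated [ID] before it reaches the command line. The script path and the --project, --all, and --no-confirm flags are taken from the evidence above; the ID pattern and the helper name build_bulk_complete_argv are assumptions made for illustration, since the audit does not state what a valid project ID looks like.

```python
import re
import sys

# Assumption: project IDs are short tokens of letters, digits, "_" and "-"
# that start with a letter or digit. The audited skill does not document
# its real ID format, so adjust this pattern to match it.
PROJECT_ID_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9_-]{0,63}")

# Path as quoted in the audit evidence.
SCRIPT_PATH = "00-system/skills/bulk-complete/scripts/bulk-complete.py"


def build_bulk_complete_argv(project_id: str) -> list[str]:
    """Return an argv list for bulk-complete.py, rejecting suspicious IDs.

    Hypothetical helper: it refuses anything that is not a plain token, so
    an ID cannot smuggle in spaces, shell metacharacters, or extra flags.
    """
    if not PROJECT_ID_PATTERN.fullmatch(project_id):
        raise ValueError(f"refusing unsafe project ID: {project_id!r}")
    # Each argument is its own list element; no shell ever parses the string.
    return [sys.executable, SCRIPT_PATH, "--project", project_id,
            "--all", "--no-confirm"]


if __name__ == "__main__":
    # Show the command that would be run for an example ID (not executed here).
    print(build_bulk_complete_argv("01-demo"))
```

Passing the returned list to subprocess.run without shell=True keeps the ID as a single argument. Whether --no-confirm should remain in an automated call at all is a separate question that this sketch does not settle.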
Recommendations
  • AI detected serious security threats
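
One possible way to reduce the data-loss risk noted for the 'Temp File Cleanup' step is to separate the scan from the deletion and default to a dry run. The sketch below is an assumption-laden illustration, not the skill's actual logic: the function names, the system_dirs parameter, and the choice of "00-system" as a protected folder in the usage note are all hypothetical.

```python
from pathlib import Path


def find_cleanup_candidates(root: Path, system_dirs: set[str]) -> list[Path]:
    """List .md files under root that sit outside the named system folders.

    Nothing is deleted here; the caller reviews the list first.
    """
    root = root.resolve()
    candidates = []
    for path in sorted(root.rglob("*.md")):
        if path.is_symlink() or not path.is_file():
            continue  # skip symlinked entries and anything that is not a regular file
        relative = path.relative_to(root)
        if relative.parts[0] in system_dirs:
            continue  # top-level folder is protected
        candidates.append(path)
    return candidates


def delete_files(paths: list[Path], dry_run: bool = True) -> list[Path]:
    """Delete the given files only when dry_run is False; return the paths."""
    for path in paths:
        if not dry_run:
            path.unlink()
    return paths
```

A caller might run find_cleanup_candidates(Path("."), {"00-system"}), show the resulting list to the user, and only then call delete_files(..., dry_run=False). Keeping deletion behind an explicit flag means file names or contents read during the scan cannot trigger removal on their own.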
Audit Metadata
Risk Level: HIGH
Analyzed: Feb 16, 2026, 02:08 AM