loki-mode

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHCOMMAND_EXECUTIONEXTERNAL_DOWNLOADSREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • COMMAND_EXECUTION (HIGH): The skill instructions and autonomous runner scripts (CLAUDE.md, autonomy/README.md, scripts/loki-wrapper.sh) mandate the use of the --dangerously-skip-permissions flag. This configuration allows the AI agent to execute shell commands, modify files, and access the network without any human intervention or approval, effectively granting full control over the user system.- REMOTE_CODE_EXECUTION (HIGH): The workflow involves processing external markdown files (PRDs) to generate and execute tasks. Because permissions are skipped, a malicious PRD containing shell commands disguised as requirements can be executed by the agent. The scripts/loki-wrapper.sh script specifically facilitates this by piping generated prompts directly into the Claude Code engine.- EXTERNAL_DOWNLOADS (HIGH): In INSTALLATION.md, the setup instructions provide curl commands to download core logic files (SKILL.md and reference documentation) from an untrusted personal GitHub repository (asklokesh/loki-mode). These sources are not part of the trusted external sources list and could be modified to deliver malicious instructions.- PROMPT_INJECTION (LOW): The skill is highly vulnerable to Indirect Prompt Injection (Category 8) because its primary function is to ingest and act upon untrusted data from PRD files. Evidence Chain for Category 8: 1. Ingestion points: autonomy/run.sh and loki-wrapper.sh read user-supplied PRD files. 2. Boundary markers: No explicit delimiters or 'ignore' instructions were found to separate the PRD content from system instructions. 3. Capability inventory: The skill has full system access via the skipped permissions flag. 4. Sanitization: No sanitization of the PRD content was detected before it is processed by the agent swarms.- DYNAMIC_EXECUTION (MEDIUM): Multiple utility scripts, including scripts/export-to-vibe-kanban.sh and benchmarks/prepare-submission.sh, use heredocs to execute dynamic Python code. Additionally, the generated benchmark solution 160.py includes the use of eval() on strings, which is a high-risk coding practice.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 04:48 PM