loki-mode
Fail
Audited by Gen Agent Trust Hub on Feb 17, 2026
Risk Level: HIGH (COMMAND_EXECUTION, REMOTE_CODE_EXECUTION, EXTERNAL_DOWNLOADS, PROMPT_INJECTION)
Full Analysis
- COMMAND_EXECUTION (HIGH): The skill mandates bypassing standard security safeguards by requiring the --dangerously-skip-permissions flag for Claude Code, as documented in CLAUDE.md and scripts/loki-wrapper.sh. This grants autonomous agents full capability to execute shell commands and modify system files without user oversight.
- REMOTE_CODE_EXECUTION (HIGH): Automated development workflows in references/sdlc-phases.md and autonomy/run.sh perform npm installations and execute generated test/build scripts, creating a direct vector for running unverified code on the host.
- COMMAND_EXECUTION (MEDIUM): The screenshot utility in scripts/take-screenshots.js launches Puppeteer with the --no-sandbox flag, which disables the Chromium process sandbox and significantly increases risk if the browser handles untrusted content.
- PROMPT_INJECTION (LOW): Core instructions in references/core-workflow.md, such as "NEVER ask questions" and "NEVER wait for confirmation", make the agent extremely vulnerable to indirect prompt injection via the PRD files it ingests.
- COMMAND_EXECUTION (MEDIUM): Generated code in benchmarks/results/humaneval-loki-solutions/160.py uses the eval() function, a pattern that becomes a security risk if it appears in production code generated by the system.
- EXTERNAL_DOWNLOADS (MEDIUM): Multiple scripts trigger the download and installation of external binary tools and packages from sources like npm and GitHub during the initialization and benchmarking phases.
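To illustrate the eval() finding, here is a minimal sketch of one common alternative: parsing the expression with Python's ast module and evaluating only whitelisted arithmetic nodes. It assumes the flagged code only needs to evaluate simple numeric expressions; the audit does not show the contents of 160.py, so the function name `safe_arith` and the chosen operator set are hypothetical.

```python
import ast
import operator

# Whitelisted binary operators (an assumed set); anything else is rejected.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.Pow: operator.pow,
}


def safe_arith(expr: str):
    """Evaluate a numeric arithmetic expression without calling eval()."""

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept plain int/float literals only (bool and str are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
            return -_eval(node.operand)
        # Names, calls, attribute access, etc. all end up here.
        raise ValueError(f"disallowed expression element: {type(node).__name__}")

    return _eval(ast.parse(expr, mode="eval"))


print(safe_arith("2 + 3 * 4 - 5"))
```

Function calls such as `__import__('os')` parse to an `ast.Call` node and are rejected with `ValueError` instead of being executed. This sketch does not bound the size of operands, so very large exponents could still consume excessive CPU or memory.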
Recommendations
- AI detected serious security threats
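As a rough illustration of how the string-level patterns named in the analysis could be flagged during review, the sketch below scans text for them line by line. The pattern strings come from the findings in this report; the labels, the `RISKY_PATTERNS` table, and the `scan_text` helper are hypothetical and are not part of the audited skill or the auditing tool.

```python
import re

# Patterns taken from the findings in this report; labels are illustrative.
RISKY_PATTERNS = {
    "permission bypass flag": re.compile(r"--dangerously-skip-permissions"),
    "browser sandbox disabled": re.compile(r"--no-sandbox"),
    "eval() call": re.compile(r"\beval\s*\("),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, label) pairs for every risky pattern found."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for label, pattern in RISKY_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_no, label))
    return hits


sample = "claude --dangerously-skip-permissions\nx = 1\nresult = eval(expr)\n"
for line_no, label in scan_text(sample):
    print(f"line {line_no}: {label}")
```

A text match like this only surfaces candidates for manual review; it cannot tell whether a flagged call is actually reachable or handles untrusted input.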