auto-optimize

Warn

Audited by Gen Agent Trust Hub on Feb 21, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
  • [COMMAND_EXECUTION] (MEDIUM): The skill executes a binary (./orchestrator) from a hardcoded absolute path (/Users/apple/Desktop/code/AI/tool/auto-run-agent). This relies on a specific local environment and executes external code not bundled with the skill.
  • [REMOTE_CODE_EXECUTION] (MEDIUM): The skill dynamically generates and executes validation commands (e.g., go test, pytest, cargo test, npx eslint) based on the technical stack detected in the target project. If the target project contains malicious configurations (e.g., in a pytest.ini or package.json), it could lead to arbitrary code execution.
  • [PROMPT_INJECTION] (LOW): The skill is highly vulnerable to Indirect Prompt Injection (Category 8).
  • Ingestion points: Target project files including README, CLAUDE.md, TODO/FIXME tags, and core source code (identified in SKILL.md Phase 1).
  • Boundary markers: None present to distinguish untrusted project content from agent instructions during task generation.
  • Capability inventory: Execution of shell commands (go, python, npx, cargo), execution of the local orchestrator binary, and extensive file system modifications via git and directory creation.
  • Sanitization: No sanitization or validation is performed on ingested project data before it is used to structure the TASKS.md and CONTEXT.md files which drive the automation.
  • [DATA_EXFILTRATION] (LOW): The skill's primary purpose involves deep exploration and reading of local project files, including architecture and source code. While intended for optimization, this capability could be abused to exfiltrate sensitive IP if the agent is compromised via prompt injection.
  • [EXTERNAL_DOWNLOADS] (LOW): The use of npx in the TypeScript validation rules can trigger the download and execution of remote packages from the npm registry at runtime.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Feb 21, 2026, 12:22 PM