craft-autoresearch

Pass

Audited by Gen Agent Trust Hub on May 1, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill requires the agent to execute a 'run harness,' which is a user-defined shell command used to evaluate the target artifact. This occurs during the baseline measurement and throughout the mutation loop (Steps 3 and 4).
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it processes and evaluates content from external 'Target artifacts' and 'Test inputs' without explicit boundary markers or sanitization.
  • Ingestion points: The target artifact file path and test inputs are ingested as primary data for the optimization loop.
  • Boundary markers: There are no instructions to use XML tags or specific delimiters to separate untrusted artifact content from agent instructions during evaluation.
  • Capability inventory: The agent can modify local files (mutations), execute arbitrary shell commands (run harness), and manage state in the ~/.craftkit/ directory.
  • Sanitization: The instructions do not define any sanitization or validation logic for the content of the artifacts or test inputs.
  • [COMMAND_EXECUTION]: The skill utilizes Git-based version control or file snapshots to manage mutations and perform rollbacks, which involves executing local CLI tools (Step 5).
Audit Metadata
Risk Level
SAFE
Analyzed
May 1, 2026, 04:16 AM