prompt-test

Fail

Audited by Gen Agent Trust Hub on Apr 1, 2026

Risk Level: HIGHCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill constructs and executes shell commands by interpolating user-supplied variables directly into Python execution strings. This allows for arbitrary command execution if a user provides input containing shell metacharacters.
  • Evidence: The process section in SKILL.md describes multiple instances of unsafe command construction, such as python scripts/prompt-ab-tester.py --brand {slug} --action create-test --test-name "{name}" and python scripts/eval-runner.py --brand {slug} --action run-quick --text "{content_or_path}".
  • Vulnerability: An attacker can provide a value like test"; touch /tmp/pwned; # for the {name} variable to execute arbitrary system commands alongside the intended script.
  • [DATA_EXFILTRATION]: The skill accesses and reads data from a hidden directory in the user's home folder, which may contain sensitive brand profiles, marketing strategies, and agency standard operating procedures.
  • Evidence: The skill reads from paths including ~/.claude-marketing/brands/, ~/.claude-marketing/brands/{slug}/profile.json, and ~/.claude-marketing/sops/.
  • [PROMPT_INJECTION]: The skill is susceptible to indirect prompt injection because it processes untrusted external content as part of its core evaluation logic.
  • Ingestion points: The {content_or_path} and {evidence_path} variables in SKILL.md (Step 3) are used to ingest external data for quality evaluation.
  • Boundary markers: No boundary markers or instructions to ignore embedded commands are present in the skill definition.
  • Capability inventory: The skill possesses the capability to execute shell commands via subprocess calls to local Python scripts.
  • Sanitization: There is no evidence of input validation or sanitization before the external content is processed or interpolated into command-line arguments.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Apr 1, 2026, 01:18 AM