assessment-architect

Fail

Audited by Gen Agent Trust Hub on Feb 17, 2026

Risk Level: HIGHCOMMAND_EXECUTIONREMOTE_CODE_EXECUTION
Full Analysis
  • COMMAND_EXECUTION (HIGH): The procedures in 'references/generation-procedures.md' involve direct interpolation of variables like '{PATH}' and '{SLUG}' into shell commands (e.g., 'ls {PATH}/*.md' and 'pandoc ... {SLUG}-exam.md'). This allows for arbitrary command injection if these variables are user-controlled and not strictly validated by the agent.
  • REMOTE_CODE_EXECUTION (MEDIUM): In 'references/generation-procedures.md', the skill utilizes a 'Task' tool to spawn autonomous subagents for parallel processing. This multi-agent execution model can be an attack vector to bypass security constraints or execute logic outside the primary agent's sandbox.
  • DATA_EXPOSURE (LOW): The skill performs directory listings and file reads based on a user-specified path ('ls {PATH}/*.md'). Without path restriction, this could lead to the exposure of sensitive files on the host system.
  • INDIRECT_PROMPT_INJECTION (LOW): The skill is designed to ingest and process untrusted lesson content without sanitization or boundary markers. Evidence Chain: 1. Ingestion points: Phase 1.1 in 'references/generation-procedures.md'. 2. Boundary markers: Absent. 3. Capability inventory: Subprocess calls ('ls', 'wc', 'grep', 'pandoc') and subagent spawning via the 'Task' tool across 'references/generation-procedures.md'. 4. Sanitization: Absent.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 17, 2026, 06:15 PM