personal-benchmark

Pass

Audited by Gen Agent Trust Hub on Apr 29, 2026

Risk Level: SAFE
Full Analysis
  • [INDIRECT_PROMPT_INJECTION]: The skill is designed to ingest and process user-provided 'real work' artifacts, including documents and data piles, to generate benchmarks. This creates an attack surface where instructions embedded in these third-party files could influence the agent's behavior during the synthesis phase. The instructions include no explicit boundary markers or sanitization steps for this ingested data.
    • Ingestion point: User-provided files and text in the 'The Interview' section.
    • Capability inventory: Writing synthesized benchmark files to the local benchmarks/ directory.
    • Boundary markers: Absent in instructions.
    • Sanitization: Absent in instructions.
  • [COMMAND_EXECUTION]: The skill instructs the agent to create a directory structure and write multiple markdown, YAML, and data files to the local filesystem (benchmarks/). While this is the skill's primary function, writing files whose content derives from user-influenced input is noted as a capability.
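
The two findings above point at two common mitigations: fencing ingested artifacts with boundary markers, and confining file writes to the benchmarks/ directory. The following is a minimal sketch of both, not part of the audited skill. The marker strings and the function names wrap_untrusted and resolve_inside are hypothetical, chosen only for illustration.

```python
from pathlib import Path

# Hypothetical marker strings; the audited skill defines none.
BEGIN = "<<<UNTRUSTED_ARTIFACT"
END = "UNTRUSTED_ARTIFACT>>>"


def wrap_untrusted(text: str, label: str) -> str:
    """Fence user-supplied content so it can be treated as data, not instructions."""
    # Neutralize marker look-alikes embedded in the artifact itself,
    # so the content cannot close the fence early.
    cleaned = text.replace(BEGIN, "[marker removed]").replace(END, "[marker removed]")
    return f"{BEGIN} source={label!r}\n{cleaned}\n{END}"


def resolve_inside(root: Path, relative: str) -> Path:
    """Return an absolute path under root, or raise if it would escape root."""
    base = root.resolve()
    target = (base / relative).resolve()
    # Path.is_relative_to requires Python 3.9 or later.
    if not target.is_relative_to(base):
        raise ValueError(f"refusing to write outside {base}: {relative}")
    return target
```

A caller would pass each ingested file through wrap_untrusted before synthesis and resolve every output path with resolve_inside(Path("benchmarks"), name) before writing. Boundary markers reduce, but do not eliminate, the chance that embedded instructions are followed.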
Audit Metadata
  • Risk Level: SAFE
  • Analyzed: Apr 29, 2026, 06:42 PM