evaluate-report

Pass

Audited by Gen Agent Trust Hub on Apr 21, 2026

Risk Level: SAFECOMMAND_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill utilizes shell commands including cat, jq, find, and ls to locate and read benchmark files. The file paths are dynamically constructed using the $ARGUMENTS variable (e.g., <plugin>/skills/<skill>/eval-results/benchmark.json). If the user-provided target argument contains path traversal sequences like ../, the skill may attempt to access files outside the intended evaluation directories.
  • [PROMPT_INJECTION]: The skill is vulnerable to indirect prompt injection (Category 8) because it ingests and processes untrusted data from external JSON files (benchmark.json, history.json, plugin-benchmark.json).
  • Ingestion points: Files are read from the filesystem using cat and jq in Step 2 and the 'Agentic Optimizations' section.
  • Boundary markers: The skill lacks explicit delimiters or warnings to the agent to ignore instructions embedded within the JSON content.
  • Capability inventory: The skill has access to shell execution (Bash), file system navigation (find, ls, Glob), and file reading (Read, cat, Grep).
  • Sanitization: There is no evidence of validation or sanitization of the JSON content before it is displayed to the agent, allowing malicious metrics or descriptions to potentially influence downstream logic.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 21, 2026, 01:17 AM