code-optimization

Warn

Audited by Gen Agent Trust Hub on Mar 1, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONREMOTE_CODE_EXECUTIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill's primary workflow requires the agent to execute a variety of shell commands to compile and run code for benchmarking and performance measurement.
  • Evidence: The 'Step 2: Compile and Execute' section in SKILL.md explicitly directs the agent to use commands such as g++, python3, javac, rustc, and ./benchmark.
  • [REMOTE_CODE_EXECUTION]: The agent is instructed to read code from the filesystem, potentially modify it, and then execute the resulting binary or script. This behavior allows for the execution of arbitrary logic provided in the input files.
  • Evidence: The workflow in SKILL.md (Steps 1 and 2) involves reading files using read_file and subsequently executing them via system-level compilers and interpreters.
  • [PROMPT_INJECTION]: The skill possesses a surface for indirect prompt injection because it ingests untrusted external data (user-provided code) and processes it to identify performance bottlenecks without sufficient sanitization or boundary markers.
  • Ingestion points: The skill reads external files using read_file as seen in the 'Optimization Workflow' section of SKILL.md.
  • Boundary markers: No delimiters or instructions to ignore embedded commands within the code content are present.
  • Capability inventory: The skill has the capability to execute shell commands (g++, python3, etc.) and write files to the system (write_file).
  • Sanitization: No evidence of input validation or content filtering is provided in the instructions.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Mar 1, 2026, 12:33 AM