arize-evaluator

Pass

Audited by Gen Agent Trust Hub on Apr 2, 2026

Risk Level: SAFECOMMAND_EXECUTIONEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill executes 'ax' CLI commands to manage Arize resources and utilizes Python one-liners for JSON processing.
  • [EXTERNAL_DOWNLOADS]: The skill instructs users to install the 'arize-ax-cli' package and 'certifi' via standard package managers like pip or uv.
  • [PROMPT_INJECTION]: The skill implements LLM-as-judge evaluation logic which interpolates external data into prompt templates, representing an indirect prompt injection surface.
  • Ingestion points: External data is retrieved via 'ax spans export' and 'ax experiments export' (SKILL.md).
  • Boundary markers: Example prompt templates lack explicit delimiters for variable placeholders.
  • Capability inventory: The system can execute shell commands and interpret local data using Python across all provided files.
  • Sanitization: No data sanitization or validation is specified for ingested content.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 2, 2026, 01:04 AM