exploring-llm-evaluations

Pass

Audited by Gen Agent Trust Hub on Apr 14, 2026

Risk Level: SAFEPROMPT_INJECTIONCOMMAND_EXECUTION
Full Analysis
  • [SAFE]: The skill's primary functionality—including evaluation lifecycle management, trace inspection, and SQL querying—is implemented through official vendor tools (posthog:) and is consistent with the documented purpose of monitoring and debugging LLM generations.\n- [PROMPT_INJECTION]: The skill presents a surface for indirect prompt injection via the llm-analytics-evaluation-summary-create tool, which generates AI summaries from event data that could contain adversarial instructions.\n
  • Ingestion points: $ai_generation and $ai_evaluation events are processed by the summarization tool (as described in SKILL.md).\n
  • Boundary markers: The instructions do not define specific delimiters or warnings to ignore instructions embedded within the processed events.\n
  • Capability inventory: The agent has access to powerful tools including execute-sql, evaluation-create, and evaluation-update.\n
  • Sanitization: There is no mention of sanitizing or filtering external content before it is interpolated into the summary prompt.\n- [COMMAND_EXECUTION]: The skill supports the definition and execution of "Hog" code for deterministic, rule-based evaluations. Users can provide code strings for testing via evaluation-test-hog or for production use via evaluation-create. This execution is scoped to the PostHog platform's internal ingestion pipeline.
Audit Metadata
Risk Level
SAFE
Analyzed
Apr 14, 2026, 05:20 PM