exploring-llm-evaluations

Warn

Audited by Gen Agent Trust Hub on Apr 14, 2026

Risk Level: MEDIUMCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
  • [COMMAND_EXECUTION]: The skill implements tools like posthog:execute-sql and posthog:evaluation-test-hog that allow the agent to execute arbitrary database queries and platform-specific logic. This provides a mechanism for dynamic code and query execution on the PostHog backend.
  • [DATA_EXFILTRATION]: Use of the posthog:execute-sql tool enables the retrieval of potentially sensitive event data from the ClickHouse database. An agent could be manipulated to exfiltrate specific data points or entire datasets through these queries.
  • [PROMPT_INJECTION]: The skill exhibits an indirect prompt injection attack surface, primarily in the posthog:llm-analytics-evaluation-summary-create tool.
  • Ingestion points: Untrusted data enters the agent context via AI generations ($ai_generation events) which are retrieved and summarized.
  • Boundary markers: There are no explicit instructions for the agent to use delimiters or ignore instructions embedded within the generation data.
  • Capability inventory: The skill has significant capabilities including database querying (posthog:execute-sql), evaluation creation/modification (posthog:evaluation-create, posthog:evaluation-update), and running evaluations (posthog:evaluation-run).
  • Sanitization: There is no mention of sanitizing, escaping, or validating the content of external AI generations before they are processed by the summarization LLM.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Apr 14, 2026, 01:03 PM