exploring-llm-evaluations
Warn
Audited by Gen Agent Trust Hub on Apr 14, 2026
Risk Level: MEDIUMCOMMAND_EXECUTIONDATA_EXFILTRATIONPROMPT_INJECTION
Full Analysis
- [COMMAND_EXECUTION]: The skill implements tools like
posthog:execute-sqlandposthog:evaluation-test-hogthat allow the agent to execute arbitrary database queries and platform-specific logic. This provides a mechanism for dynamic code and query execution on the PostHog backend. - [DATA_EXFILTRATION]: Use of the
posthog:execute-sqltool enables the retrieval of potentially sensitive event data from the ClickHouse database. An agent could be manipulated to exfiltrate specific data points or entire datasets through these queries. - [PROMPT_INJECTION]: The skill exhibits an indirect prompt injection attack surface, primarily in the
posthog:llm-analytics-evaluation-summary-createtool. - Ingestion points: Untrusted data enters the agent context via AI generations (
$ai_generationevents) which are retrieved and summarized. - Boundary markers: There are no explicit instructions for the agent to use delimiters or ignore instructions embedded within the generation data.
- Capability inventory: The skill has significant capabilities including database querying (
posthog:execute-sql), evaluation creation/modification (posthog:evaluation-create,posthog:evaluation-update), and running evaluations (posthog:evaluation-run). - Sanitization: There is no mention of sanitizing, escaping, or validating the content of external AI generations before they are processed by the summarization LLM.
Audit Metadata