The Agent Skills Directory

[PROMPT_INJECTION]: The skill ingests untrusted metadata from PostHog entities, which could contain malicious instructions designed to hijack the agent's logic during the audit process.
Ingestion points: Untrusted data enters the context through read_data and list_data calls targeting experiment and feature flag objects in the PostHog environment.
Boundary markers: The instructions lack delimiters or specific system-level warnings to the agent to disregard instructions found within the description or metrics fields of the data being processed.
Capability inventory: The skill utilizes create_notebook to generate report artifacts. While it does not have direct access to shell execution or network tools, the ability to generate structured output from untrusted input is a known vector for indirect injection.
Sanitization: No sanitization or validation of the fetched data fields is performed before the content is interpreted by the agent for check evaluation.

auditing-experiments-flags