hypogenic
Warn
Audited by Gen Agent Trust Hub on Feb 15, 2026
Risk Level: MEDIUMPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION] (MEDIUM): The configuration defines several prompt templates (e.g.,
observations,inference,adaptive_refinement) that directly interpolate external content into the LLM context via placeholders like{data_samples},{sample_text}, and{challenging_examples}. \n - Ingestion points: Data is pulled from
data/train.json(and other JSON paths) as well as thepapers/directory (which expects PDF files). \n - Boundary markers: There are no delimiters (like XML tags or triple backticks) or instructions to the model to ignore embedded commands within the interpolated data. \n
- Capability inventory: The skill uses high-capability models (GPT-4) to perform reasoning, classification (
label_extraction), and writes results to anoutput/directory. \n - Sanitization: No sanitization logic is evident in the configuration template. A malicious data sample or a crafted PDF could include instructions to override the agent's task, leading to incorrect classifications or biased hypothesis generation.
Audit Metadata