hypogenic

Warn

Audited by Gen Agent Trust Hub on Feb 15, 2026

Risk Level: MEDIUMPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION] (MEDIUM): The configuration defines several prompt templates (e.g., observations, inference, adaptive_refinement) that directly interpolate external content into the LLM context via placeholders like {data_samples}, {sample_text}, and {challenging_examples}. \n
  • Ingestion points: Data is pulled from data/train.json (and other JSON paths) as well as the papers/ directory (which expects PDF files). \n
  • Boundary markers: There are no delimiters (like XML tags or triple backticks) or instructions to the model to ignore embedded commands within the interpolated data. \n
  • Capability inventory: The skill uses high-capability models (GPT-4) to perform reasoning, classification (label_extraction), and writes results to an output/ directory. \n
  • Sanitization: No sanitization logic is evident in the configuration template. A malicious data sample or a crafted PDF could include instructions to override the agent's task, leading to incorrect classifications or biased hypothesis generation.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Feb 15, 2026, 10:15 PM