golden-dataset-curation

Fail

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: HIGHPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
  • PROMPT_INJECTION (HIGH): The skill is highly susceptible to Indirect Prompt Injection (Category 8) due to its core architecture.
  • Ingestion points: Untrusted data enters the pipeline through the FETCH AGENT in references/annotation-patterns.md, which retrieves content from arbitrary URLs.
  • Boundary markers: Absent. Analysis prompts in references/annotation-patterns.md (e.g., Quality Evaluator Agent, Difficulty Classifier Agent) interpolate {content_preview} and {section_titles} directly into the instruction block without XML tags, triple quotes, or instructions to the LLM to ignore embedded commands.
  • Capability inventory: The pipeline has the capability to classify content, generate test queries, and assign quality scores that determine if a document is included in the 'golden dataset'. A malicious document could contain instructions to force an 'include' decision or generate biased test queries.
  • Sanitization: There is no evidence of content sanitization, escaping, or filtering before the external data is passed to the LLM sub-agents.
  • EXTERNAL_DOWNLOADS (MEDIUM): The skill performs network operations to untrusted sources.
  • Evidence: The FETCH AGENT described in references/annotation-patterns.md and the fetch_url function mentioned in the trace logic initiate requests to external URLs provided as input. This presents a risk of Server-Side Request Forgery (SSRF) and exposes the agent to malicious content hosted on attacker-controlled domains.
Recommendations
  • AI detected serious security threats
Audit Metadata
Risk Level
HIGH
Analyzed
Feb 16, 2026, 01:23 AM