golden-dataset-curation
Fail
Audited by Gen Agent Trust Hub on Feb 16, 2026
Risk Level: HIGHPROMPT_INJECTIONEXTERNAL_DOWNLOADS
Full Analysis
- PROMPT_INJECTION (HIGH): The skill is highly susceptible to Indirect Prompt Injection (Category 8) due to its core architecture.
- Ingestion points: Untrusted data enters the pipeline through the
FETCH AGENTinreferences/annotation-patterns.md, which retrieves content from arbitrary URLs. - Boundary markers: Absent. Analysis prompts in
references/annotation-patterns.md(e.g.,Quality Evaluator Agent,Difficulty Classifier Agent) interpolate{content_preview}and{section_titles}directly into the instruction block without XML tags, triple quotes, or instructions to the LLM to ignore embedded commands. - Capability inventory: The pipeline has the capability to classify content, generate test queries, and assign quality scores that determine if a document is included in the 'golden dataset'. A malicious document could contain instructions to force an 'include' decision or generate biased test queries.
- Sanitization: There is no evidence of content sanitization, escaping, or filtering before the external data is passed to the LLM sub-agents.
- EXTERNAL_DOWNLOADS (MEDIUM): The skill performs network operations to untrusted sources.
- Evidence: The
FETCH AGENTdescribed inreferences/annotation-patterns.mdand thefetch_urlfunction mentioned in the trace logic initiate requests to external URLs provided as input. This presents a risk of Server-Side Request Forgery (SSRF) and exposes the agent to malicious content hosted on attacker-controlled domains.
Recommendations
- AI detected serious security threats
Audit Metadata