human-in-the-loop-training
Pass
Audited by Gen Agent Trust Hub on Mar 18, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill implements a learning pipeline that is vulnerable to indirect prompt injection.
- Ingestion points: The
HumanReviewQueue.submit_reviewandIncrementalLearner.correction_bufferinSKILL.mdaccept arbitrary human-provided feedback. - Boundary markers: The
IncrementalLearner.generate_with_correctionsfunction interpolates corrections into a prompt using only text headers without delimiters or instructions to ignore embedded commands. - Capability inventory: The skill uses
llm.generateto produce bot responses and derive rules based on these inputs. - Sanitization: No sanitization or validation of the correction content is performed to prevent instructions from being treated as commands.
Audit Metadata