llm-testing

Warn

Audited by Gen Agent Trust Hub on Feb 16, 2026

Risk Level: MEDIUMEXTERNAL_DOWNLOADSPROMPT_INJECTION
Full Analysis
  • EXTERNAL_DOWNLOADS (MEDIUM): Recommends installing third-party packages 'deepeval' and 'ragas' which are not within the defined trusted organization scope.
  • PROMPT_INJECTION (LOW): The skill processes untrusted LLM outputs for evaluation, creating an indirect prompt injection surface. 1. Ingestion points: 'actual_output' in 'LLMTestCase' and RAGAS 'Dataset' structures. 2. Boundary markers: Absent in provided snippets. 3. Capability inventory: Evaluation scores and pass/fail results influence agent reasoning and automated quality gates. 4. Sanitization: Proactively includes a specific test pattern (test_injection_attempt) for prompt injection resistance.
Audit Metadata
Risk Level
MEDIUM
Analyzed
Feb 16, 2026, 01:13 AM