risk-assessment-frameworks

Pass

Audited by Gen Agent Trust Hub on Mar 4, 2026

Risk Level: SAFE (flagged category: PROMPT_INJECTION)
Full Analysis
  • [PROMPT_INJECTION]: Indirect prompt injection surface detected.
  • Ingestion points: The skill processes untrusted external data including media articles (media_article), online extremist content (online_extremist_content), and general document content (document_content) within the violence_risk_indicators and DemocraticBackslidingDetector logic.
  • Boundary markers: No delimiters mark the ingested text as data, and the agent is given no instruction to disregard instructions embedded in that text.
  • Capability inventory: The skill uses Python data analysis libraries (pandas, scikit-learn) and complex SQL queries to generate risk scores and stability predictions. These outputs are intended to guide the agent's analysis and reporting.
  • Sanitization: While the code uses SQL parameterization to prevent traditional injection, no filtering or sanitization is applied to the natural language content of media or online documents to prevent instruction injection or manipulation of the resulting risk assessments.
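
The boundary-marker and sanitization gaps noted above could be narrowed along the following lines. This is a minimal sketch, not the skill's actual code: the delimiter strings, the pattern list, and the function names wrap_untrusted and flag_suspicious are all hypothetical, and pattern matching of this kind is a heuristic that will miss many injection attempts.

```python
import re

# Hypothetical delimiter strings; the audited skill defines none.
BEGIN = "<<<UNTRUSTED_DATA"
END = "UNTRUSTED_DATA>>>"

# Illustrative, non-exhaustive patterns for instruction-like phrasing.
SUSPICIOUS = re.compile(
    r"ignore (all |any )?(previous|prior|above) instructions"
    r"|disregard (the )?(system|above)"
    r"|you are now",
    re.IGNORECASE,
)


def wrap_untrusted(text: str, source: str) -> str:
    """Wrap external text in delimiters so a prompt can label it as data."""
    # Replace delimiter look-alikes with a placeholder so the content
    # cannot close the block early or open a fake one.
    cleaned = text.replace(BEGIN, "[removed]").replace(END, "[removed]")
    return f"{BEGIN} source={source}\n{cleaned}\n{END}"


def flag_suspicious(text: str) -> bool:
    """Return True if the text contains instruction-like phrasing."""
    return SUSPICIOUS.search(text) is not None
```

In such a setup, each media_article, online_extremist_content, or document_content value would pass through wrap_untrusted before reaching the agent, and the agent's prompt would state that text inside the delimiters is data only. Flagged items could be routed to review rather than silently dropped, so that the risk assessment itself is not skewed by the filter.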
Audit Metadata
Risk Level: SAFE
Analyzed: Mar 4, 2026, 03:38 AM