mlflow-genai-evaluation

Pass

Audited by Gen Agent Trust Hub on Mar 8, 2026

Risk Level: SAFEPROMPT_INJECTION
Full Analysis
  • [PROMPT_INJECTION]: Indirect prompt injection surface identified in custom scorer implementations where untrusted agent outputs are evaluated by a 'judge' LLM.
  • Ingestion points: The inputs and outputs dictionaries in cost_accuracy_judge (references/custom-scorer-patterns.md) and the _extract_response_text helper (scripts/evaluation_helpers.py) ingest data directly from agent execution results.
  • Boundary markers: Absent. The example evaluation prompts in references/custom-scorer-patterns.md use Python f-strings to embed the query and response_text variables directly into the judge's instructions without delimiters (e.g., XML tags or triple backticks) or instructions to ignore embedded commands.
  • Capability inventory: The _call_llm_for_scoring function (scripts/evaluation_helpers.py) uses the Databricks SDK (WorkspaceClient) to perform network requests to model serving endpoints (w.serving_endpoints.query).
  • Sanitization: Absent. The skill does not perform escaping, filtering, or validation on the extracted text before passing it to the scoring LLM.
Audit Metadata
Risk Level
SAFE
Analyzed
Mar 8, 2026, 02:33 AM