production-monitoring
Pass
Audited by Gen Agent Trust Hub on Mar 8, 2026
Risk Level: SAFEPROMPT_INJECTION
Full Analysis
- [PROMPT_INJECTION]: The skill implements LLM-based evaluators that are vulnerable to indirect prompt injection during the assessment of production traces. \n
- Ingestion points: The
relevance_scorerandguidelines_scorerfunctions inscripts/register_production_scorers.pyandreferences/registered-scorers.mdprocess untrusted data from agent inputs and outputs. \n - Boundary markers: The LLM judge prompts (e.g.,
judge_prompt) lack robust delimiters or specific instructions to ignore malicious commands embedded within the evaluated trace content. \n - Capability inventory: The skill uses
mlflow.models.register_scorerto deploy these evaluators for continuous monitoring andmlflow.genai.assessfor ad-hoc evaluations. \n - Sanitization: No input validation or escaping is applied to the trace data before it is interpolated into the judge prompts, allowing potential instruction override by the data being evaluated.
Audit Metadata