MLflow GenAI Foundation Patterns

When to Use

Use this skill when:

  • Creating new Databricks GenAI agents
  • Implementing MLflow tracing for agents
  • Setting up agent evaluation pipelines
  • Managing prompts with MLflow Prompt Registry
  • Troubleshooting AI Playground compatibility issues
  • Understanding foundational MLflow GenAI concepts

⚠️ CRITICAL: ResponsesAgent is MANDATORY for AI Playground

Databricks recommends ResponsesAgent over ChatAgent for all new agents.

Without proper model signatures, your agent will NOT work in AI Playground, Agent Evaluation, or Mosaic AI features.

Key Points:

  • MLflow auto-infers a compatible model signature for ResponsesAgent subclasses
  • Manual signatures break AI Playground compatibility
  • Use the input key (not messages) in input examples
  • Return ResponsesAgentResponse objects, not plain dicts (see the sketch below)
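
A minimal sketch of that shape (create_text_output_item and the request/response types are part of the MLflow 3 ResponsesAgent API; see the responses-agent-patterns skill for a full implementation):

import mlflow
from mlflow.pyfunc import ResponsesAgent
from mlflow.types.responses import ResponsesAgentRequest, ResponsesAgentResponse

class EchoAgent(ResponsesAgent):
    @mlflow.trace(span_type="AGENT")
    def predict(self, request: ResponsesAgentRequest) -> ResponsesAgentResponse:
        # request.input is a list of messages; take the latest user turn
        user_text = request.input[-1].content
        # Return a ResponsesAgentResponse, never a plain dict
        return ResponsesAgentResponse(
            output=[self.create_text_output_item(text=f"Echo: {user_text}", id="msg-1")]
        )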

See: responses-agent-patterns skill for complete implementation guide.


⚠️ CRITICAL: NO LLM Fallback for Data Queries

When an agent uses Genie Spaces or data retrieval tools, NEVER fall back to LLM when tools fail.

Why: when a tool fails, an LLM answering from its own knowledge produces plausible-looking numbers and records that are completely fabricated.

Correct Pattern:

  • Return explicit error messages when tools fail
  • Include "I will NOT generate fake data" statement
  • Log errors to trace spans for visibility

See: multi-agent-genie-orchestration skill for complete pattern.
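
A minimal sketch of the pattern, assuming a hypothetical query_genie_space tool:

import mlflow

@mlflow.trace(name="genie_query", span_type="TOOL")
def safe_genie_query(question: str) -> str:
    try:
        return query_genie_space(question)  # hypothetical Genie tool call
    except Exception as exc:
        # Record the failure on the active trace span for visibility
        span = mlflow.get_current_active_span()
        if span is not None:
            span.set_attributes({"tool_error": str(exc)})
        # Explicit error instead of letting the LLM improvise an answer
        return (
            f"ERROR: the data tool failed ({exc}). "
            "I will NOT generate fake data for this query."
        )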


Model Signatures Overview

Why Signatures Matter

"Azure Databricks uses MLflow Model Signatures to define agents' input and output schema. Product features like the AI Playground assume that your agent has one of a set of supported model signatures." — Microsoft Docs

The Golden Rule

"If you follow the recommended approach to authoring agents, MLflow will automatically infer a signature for your agent that is compatible with Azure Databricks product features, with no additional work required on your part." — Microsoft Docs

What Breaks Compatibility

| Issue | Impact |
|---|---|
| Manual signature with wrong schema | ❌ AI Playground won't load |
| PythonModel instead of ResponsesAgent | ❌ No signature inference |
| messages input instead of input | ❌ Request format mismatch |
| Legacy dict output instead of ResponsesAgentResponse | ❌ Response parsing fails |
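
For example, the two request formats side by side (values illustrative):

# Compatible: ResponsesAgent format uses the "input" key
good_example = {"input": [{"role": "user", "content": "What is the status?"}]}

# Breaks AI Playground: legacy chat format uses the "messages" key
bad_example = {"messages": [{"role": "user", "content": "What is the status?"}]}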

See Model Signatures for detailed explanation.


Tracing Fundamentals

Automatic Tracing with autolog

Enable autolog at module level for automatic tracing:

import mlflow

# At the TOP of your main module
mlflow.langchain.autolog(
    log_traces=True,
    log_models=True,
    log_input_examples=True,
    log_model_signatures=True,
)

Manual Tracing with Decorators

Use @mlflow.trace for custom functions:

import mlflow

@mlflow.trace(name="my_function", span_type="AGENT")
def my_agent_function(query: str) -> dict:
    """Function is automatically traced."""
    result = process(query)
    return result

Manual Span Creation

For fine-grained control:

import mlflow

def complex_operation(data):
    with mlflow.start_span(name="outer_operation") as span:
        span.set_inputs({"data": data})

        # Nested spans appear as children in the trace tree
        with mlflow.start_span(name="inner_step", span_type="LLM") as inner:
            inner.set_inputs({"prompt": "..."})
            result = llm.invoke(...)  # llm is your chat-model client (placeholder)
            inner.set_outputs({"response": result})

        span.set_outputs({"result": result})
        span.set_attributes({"custom_metric": 0.95})

    return result

See Tracing Patterns for complete guide.


Evaluation Basics

Built-in Scorers

import mlflow
from mlflow.genai.scorers import Guidelines, RelevanceToQuery, Safety

results = mlflow.genai.evaluate(
    data=evaluation_data,
    predict_fn=run_agent,  # callable invoked with each row's inputs as kwargs
    scorers=[
        RelevanceToQuery(),
        Safety(),
        Guidelines(
            name="style_rules",
            guidelines=[
                "Include time context",
                "Format costs as USD",
                "Cite sources",
            ],
        ),
    ],
)
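
For reference, evaluation rows are dicts whose inputs keys are passed to predict_fn as keyword arguments; a minimal hypothetical dataset:

evaluation_data = [
    {
        "inputs": {"query": "What is the current status?"},
        "expectations": {"expected_facts": ["status includes a timestamp"]},
    },
]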

Custom Scorers with @scorer

from mlflow.entities import Feedback
from mlflow.genai.scorers import scorer

@scorer
def custom_judge(inputs: dict, outputs: dict, expectations: dict = None) -> Feedback:
    """Custom judge for domain-specific accuracy."""
    response_text = _extract_response_text(outputs)  # helper sketched below
    score_value = calculate_score(response_text)

    return Feedback(
        value=score_value,
        rationale="Explanation of score",
    )
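
The helper above is not an MLflow API; a sketch assuming ResponsesAgent-style output payloads:

def _extract_response_text(outputs) -> str:
    """Pull the assistant text out of a ResponsesAgent-style output dict."""
    if isinstance(outputs, dict):
        for item in outputs.get("output", []):
            for part in item.get("content", []) or []:
                if isinstance(part, dict) and part.get("type") == "output_text":
                    return part.get("text", "")
    return str(outputs)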

Production Monitoring

For lightweight production checks, run a scorer directly on a single request/response pair (MLflow 3 scorers are callable):

from mlflow.genai.scorers import RelevanceToQuery

feedback = RelevanceToQuery()(
    inputs={"query": query},
    outputs=response,
)

if feedback.value != "yes":
    trigger_quality_alert()

See Evaluation Basics for complete guide.


Prompt Registry Basics

Register Prompts

import mlflow

mlflow.genai.register_prompt(
    name="my_app_assistant_prompt",
    template="""You are a helpful assistant.

User context: {{user_context}}
Query: {{query}}""",
    commit_message="Initial assistant prompt",
)

Note: prompt templates use double-brace {{variable}} placeholders.

Load Prompts by Alias

# Load the version behind the production alias
prompt = mlflow.genai.load_prompt(
    "prompts:/my_app_assistant_prompt@production"
)

# Load a specific version
prompt_v1 = mlflow.genai.load_prompt(
    "prompts:/my_app_assistant_prompt/1"
)
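
Loaded prompts render with format(); keyword names match the double-brace placeholders in the template:

rendered = prompt.format(
    user_context="region: EMEA",
    query="What is the status?",
)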

Set Aliases

import mlflow

mlflow.genai.set_prompt_alias(
    name="my_app_assistant_prompt",
    alias="production",
    version=2,
)

See Prompt Registry Basics for complete guide.


Agent Logging Patterns

ResponsesAgent (Recommended)

import mlflow
from mlflow.pyfunc import ResponsesAgent

agent = MyResponsesAgent()

# CRITICAL: Set model before logging
mlflow.models.set_model(agent)

# Input example in ResponsesAgent format
input_example = {
    "input": [{"role": "user", "content": "What is the status?"}]
}

with mlflow.start_run():
    # DO NOT pass a signature parameter - it is auto-inferred!
    logged_agent_info = mlflow.pyfunc.log_model(
        artifact_path="agent",
        python_model=agent,
        input_example=input_example,
        # signature=...  # ❌ NEVER include this!
        registered_model_name="my_agent",
        pip_requirements=[
            "mlflow>=3.0.0",
            "databricks-sdk>=0.28.0",
        ],
    )
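
A quick smoke test before deployment, using the ModelInfo returned by log_model:

# Reload the logged agent and invoke it the way serving will
loaded = mlflow.pyfunc.load_model(logged_agent_info.model_uri)
print(loaded.predict(input_example))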

See: responses-agent-patterns skill for complete implementation.


Common Mistakes Quick Reference

| Mistake | Impact | Fix |
|---|---|---|
| Using PythonModel instead of ResponsesAgent | ❌ No signature inference | Use ResponsesAgent |
| Manual signature definition | ❌ Breaks AI Playground | Let MLflow auto-infer |
| Using messages instead of input | ❌ Format mismatch | Use the input key |
| Returning a dict instead of ResponsesAgentResponse | ❌ Parsing fails | Return ResponsesAgentResponse |
| Missing set_model() before log_model() | ⚠️ May fail | Call set_model() first |
| LLM fallback for data queries | ❌ Hallucinated data | Return explicit errors |

Validation Checklist

🔴 ResponsesAgent & Model Signatures (CRITICAL)

  • Agent class inherits from mlflow.pyfunc.ResponsesAgent
  • predict method accepts single request parameter
  • predict returns ResponsesAgentResponse object
  • Input example uses input key (NOT messages)
  • NO signature parameter in log_model() call
  • Agent loads successfully in AI Playground

Tracing

  • mlflow.langchain.autolog() enabled at module level
  • All custom functions decorated with @mlflow.trace
  • Span types specified (AGENT, LLM, TOOL, etc.)
  • Inputs and outputs set for manual spans
  • Traces tagged with user_id, session_id, environment
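
Trace tagging is not shown in the examples above; a minimal sketch, assuming mlflow.update_current_trace and a hypothetical run_agent entry point:

import mlflow

@mlflow.trace(span_type="AGENT")
def handle_request(query: str, user_id: str, session_id: str) -> str:
    # Attach identifying tags to the active trace for later filtering
    mlflow.update_current_trace(tags={
        "user_id": user_id,
        "session_id": session_id,
        "environment": "production",
    })
    return run_agent(query)  # hypothetical agent entry point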

Evaluation

  • Built-in scorers used where appropriate
  • Custom judges return Feedback objects
  • Evaluation metrics logged to MLflow
  • Production checks run scorers on live request/response pairs

Prompts

  • All prompts logged to registry
  • Production alias set for deployment
  • Prompts loaded by alias in production code

Agent Logging

  • ResponsesAgent interface implemented (not ChatAgent)
  • set_model() called before log_model()
  • Model registered with proper name
  • Aliases set for dev/staging/production

References

Official Documentation

  • Azure Databricks / Microsoft Learn guidance on agent authoring and MLflow Model Signatures (quoted above)

Related Skills

  • responses-agent-patterns — complete ResponsesAgent implementation and signature guide
  • multi-agent-genie-orchestration — Genie orchestration and the no-LLM-fallback pattern
