# OpenAI Agents SDK

A lightweight, powerful framework for building multi-agent AI workflows in Python.
## Installation

```bash
pip install openai-agents

# With LiteLLM for multi-provider support
pip install "openai-agents[litellm]"
```

Set your API key:

```bash
export OPENAI_API_KEY=your-key
```
## Quick Start

```python
from agents import Agent, Runner

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant",
)

result = Runner.run_sync(agent, "Hello!")
print(result.final_output)
```
## Core Primitives
| Primitive | Purpose |
|---|---|
| Agent | LLM with instructions and tools |
| Tools | Python functions agents can call |
| Handoffs | Delegate to specialized agents |
| Guardrails | Validate inputs/outputs |
| Sessions | Maintain conversation history |
| Runner | Execute agent workflows |
## Agent with Tools

```python
from agents import Agent, Runner, function_tool

@function_tool
def get_weather(city: str) -> str:
    """Get weather for a city."""
    return f"Weather in {city}: Sunny, 72F"

agent = Agent(
    name="Weather Bot",
    instructions="Help with weather questions",
    tools=[get_weather],
)

result = Runner.run_sync(agent, "What's the weather in NYC?")
```
## Structured Output

```python
from pydantic import BaseModel
from agents import Agent, Runner

class WeatherReport(BaseModel):
    city: str
    temperature: float
    conditions: str

agent = Agent(
    name="Weather Reporter",
    instructions="Extract weather data",
    output_type=WeatherReport,
)

result = Runner.run_sync(agent, "NYC is sunny and 72 degrees")
report: WeatherReport = result.final_output
```
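Because `output_type` validates the model's answer against the Pydantic schema, `final_output` comes back as a typed instance rather than raw text:

```python
print(f"{report.city}: {report.temperature} ({report.conditions})")
```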
## Multi-Agent Handoffs

```python
from agents import Agent

billing = Agent(name="Billing", instructions="Handle billing issues")
technical = Agent(name="Technical", instructions="Handle tech issues")

triage = Agent(
    name="Triage",
    instructions="Route to billing or technical support",
    handoffs=[billing, technical],
)
```
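Handoffs resolve at run time: run the triage agent as usual, and if it delegates, the specialist produces the final answer. A short sketch continuing from the example above (`last_agent` on the run result reports which agent finished):

```python
from agents import Runner

result = Runner.run_sync(triage, "I was charged twice this month")
print(result.last_agent.name)  # e.g. "Billing" if triage handed off
print(result.final_output)
```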
## Sessions (Conversation Memory)

```python
import asyncio

from agents import Agent, Runner, SQLiteSession

session = SQLiteSession("user_123", "chats.db")
agent = Agent(name="Assistant", instructions="Be helpful")

async def main():
    # Remembers across calls
    await Runner.run(agent, "My name is Alice", session=session)
    result = await Runner.run(agent, "What's my name?", session=session)
    print(result.final_output)

asyncio.run(main())
```
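Sessions also support direct memory operations (covered in `sessions.md` below). A minimal sketch, assuming the `Session` protocol's `get_items` / `pop_item` / `clear_session` methods:

```python
# Run inside an async context; method names per the Session protocol.
items = await session.get_items()  # stored conversation history
last = await session.pop_item()    # remove the most recent item
await session.clear_session()      # wipe this session's history
```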
## Custom OpenAI-Compatible Providers

Use `OpenAIChatCompletionsModel` for custom OpenAI-compatible APIs (Google Gemini, local models, etc.):

```python
import os

from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel, set_tracing_disabled

# Disable tracing if not needed
set_tracing_disabled(True)

# Configure client for Google's Generative Language API
client = AsyncOpenAI(
    api_key=os.getenv("GEMINI_API_KEY"),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)

# Create model with custom client
llm_model = OpenAIChatCompletionsModel(
    model="gemini-2.5-flash",
    openai_client=client,
)

# Use in agent
agent = Agent(
    name="Gemini Agent",
    instructions="You are a helpful assistant",
    model=llm_model,
    tools=[...],
)
```
## Multi-Provider (LiteLLM)

Alternative approach using LiteLLM for multi-provider support (requires the `litellm` extra from the install step):

```python
from agents import Agent

# Anthropic Claude
claude = Agent(
    name="Claude",
    instructions="Be helpful",
    model="litellm/anthropic/claude-sonnet-4-20250514",
)

# Google Gemini via LiteLLM
gemini = Agent(
    name="Gemini",
    instructions="Be helpful",
    model="litellm/gemini/gemini-2.5-flash",
)
```
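LiteLLM reads each provider's credentials from that provider's standard environment variable; for the two agents above that would be:

```bash
export ANTHROPIC_API_KEY=your-key
export GEMINI_API_KEY=your-key
```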
## Fallback Model Pattern

Create a custom model that falls back to a secondary provider when the primary API is exhausted:

```python
import os
from collections.abc import AsyncIterator

from openai import AsyncOpenAI, RateLimitError, APIStatusError
from agents import Agent, OpenAIChatCompletionsModel, ModelSettings
from agents.models.interface import Model, ModelTracing
from agents.agent_output import AgentOutputSchemaBase
from agents.handoffs import Handoff
from agents.items import ModelResponse, TResponseInputItem, TResponseStreamEvent
from agents.tool import Tool

class FallbackModel(Model):
    """Model wrapper that falls back to a secondary provider on rate limit errors."""

    def __init__(self, primary_model: Model, fallback_model: Model):
        self.primary_model = primary_model
        self.fallback_model = fallback_model
        self._use_fallback = False

    async def get_response(
        self,
        system_instructions: str | None,
        input: str | list[TResponseInputItem],
        model_settings: ModelSettings,
        tools: list[Tool],
        output_schema: AgentOutputSchemaBase | None,
        handoffs: list[Handoff],
        tracing: ModelTracing,
        *,
        previous_response_id: str | None,
        conversation_id: str | None,
        prompt=None,
    ) -> ModelResponse:
        if self._use_fallback:
            return await self.fallback_model.get_response(
                system_instructions, input, model_settings, tools,
                output_schema, handoffs, tracing,
                previous_response_id=previous_response_id,
                conversation_id=conversation_id,
                prompt=prompt,
            )
        try:
            return await self.primary_model.get_response(
                system_instructions, input, model_settings, tools,
                output_schema, handoffs, tracing,
                previous_response_id=previous_response_id,
                conversation_id=conversation_id,
                prompt=prompt,
            )
        except (RateLimitError, APIStatusError) as e:
            is_quota_error = (
                isinstance(e, RateLimitError)
                or (isinstance(e, APIStatusError) and e.status_code in (429, 503))
                or "quota" in str(e).lower()
            )
            if is_quota_error:
                print("[Fallback] Primary API exhausted, switching to fallback")
                self._use_fallback = True
                return await self.fallback_model.get_response(
                    system_instructions, input, model_settings, tools,
                    output_schema, handoffs, tracing,
                    previous_response_id=previous_response_id,
                    conversation_id=conversation_id,
                    prompt=prompt,
                )
            raise

    def stream_response(self, *args, **kwargs) -> AsyncIterator[TResponseStreamEvent]:
        # Implement similar fallback logic for streaming; see the sketch below.
        raise NotImplementedError

# Usage: Gemini primary, DeepSeek fallback
gemini_client = AsyncOpenAI(
    api_key=os.getenv("GEMINI_API_KEY"),
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/",
)
deepseek_client = AsyncOpenAI(
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com",
)

gemini_model = OpenAIChatCompletionsModel(model="gemini-2.5-flash", openai_client=gemini_client)
deepseek_model = OpenAIChatCompletionsModel(model="deepseek-chat", openai_client=deepseek_client)

fallback_model = FallbackModel(gemini_model, deepseek_model)

agent = Agent(
    name="MyAgent",
    instructions="You are a helpful assistant",
    model=fallback_model,
    tools=[...],
)
```
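The streaming side is left unimplemented above. One way to fill it in is sketched below, assuming `stream_response` takes the same parameters as `get_response` (check the `Model` interface in your SDK version). A stream that has already yielded events cannot be restarted safely, so the sketch only falls back when nothing has been emitted yet:

```python
# Sketch only: parameters assumed to mirror get_response per the Model interface.
async def stream_response(
    self, system_instructions, input, model_settings, tools,
    output_schema, handoffs, tracing, *,
    previous_response_id, conversation_id, prompt=None,
) -> AsyncIterator[TResponseStreamEvent]:
    model = self.fallback_model if self._use_fallback else self.primary_model
    started = False
    try:
        async for event in model.stream_response(
            system_instructions, input, model_settings, tools,
            output_schema, handoffs, tracing,
            previous_response_id=previous_response_id,
            conversation_id=conversation_id, prompt=prompt,
        ):
            started = True
            yield event
    except (RateLimitError, APIStatusError):
        # Only retry if we were on the primary and nothing was emitted yet.
        if started or model is self.fallback_model:
            raise
        self._use_fallback = True
        async for event in self.fallback_model.stream_response(
            system_instructions, input, model_settings, tools,
            output_schema, handoffs, tracing,
            previous_response_id=previous_response_id,
            conversation_id=conversation_id, prompt=prompt,
        ):
            yield event
```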
## DeepSeek Integration

DeepSeek offers two main models via an OpenAI-compatible API:

| Model | Use Case | Tool Support |
|---|---|---|
| `deepseek-chat` | General chat, tool calling | Yes |
| `deepseek-reasoner` | Complex reasoning (requires handling the `reasoning_content` field) | Limited |

**Important:** Use `deepseek-chat` for agents with tools. The `deepseek-reasoner` model requires special handling for tool calls.
```python
import os

from openai import AsyncOpenAI
from agents import Agent, OpenAIChatCompletionsModel

deepseek_client = AsyncOpenAI(
    api_key=os.getenv("DEEPSEEK_API_KEY"),
    base_url="https://api.deepseek.com",
)

# Use deepseek-chat for tool-calling agents
model = OpenAIChatCompletionsModel(
    model="deepseek-chat",  # NOT deepseek-reasoner for tool calls
    openai_client=deepseek_client,
)

agent = Agent(
    name="DeepSeek Agent",
    instructions="You are helpful",
    model=model,
    tools=[my_tool],  # my_tool: any @function_tool-decorated function
)
```
## Reference Documentation
For detailed guidance on specific topics:
- agents.md - Agent creation, dynamic instructions, context injection, lifecycle hooks
- tools.md - Function tools, hosted tools, Pydantic validation, error handling
- handoffs.md - Multi-agent delegation, input filters, conditional handoffs
- guardrails.md - Input/output validation, tripwires, execution modes
- sessions.md - SQLite, SQLAlchemy, encrypted sessions, memory operations
- running.md - Runner class, streaming, configuration, exception handling
- models.md - Model settings, LiteLLM integration, multi-provider workflows
- patterns.md - Customer support, research assistant, RAG, human-in-the-loop
## Common Patterns

### Customer Support

```python
from agents import Agent
from agents.extensions.handoff_prompt import RECOMMENDED_PROMPT_PREFIX

# billing_agent, tech_agent, sales_agent defined elsewhere
triage = Agent(
    name="Triage",
    instructions=f"""{RECOMMENDED_PROMPT_PREFIX}
    Route customers to billing, technical, or sales.""",
    handoffs=[billing_agent, tech_agent, sales_agent],
)
```
### Input Validation

```python
from agents import Agent, input_guardrail, GuardrailFunctionOutput

@input_guardrail
async def block_harmful(ctx, agent, input_text):
    is_harmful = check_content(input_text)  # your own moderation check
    return GuardrailFunctionOutput(
        output_info={"harmful": is_harmful},
        tripwire_triggered=is_harmful,
    )

agent = Agent(
    name="Safe Agent",
    instructions="Be helpful",
    input_guardrails=[block_harmful],
)
```
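When the tripwire fires, the run stops with an exception you can catch; a minimal sketch, assuming the SDK's `InputGuardrailTripwireTriggered` export:

```python
from agents import Runner, InputGuardrailTripwireTriggered

try:
    result = Runner.run_sync(agent, "some user input")
    print(result.final_output)
except InputGuardrailTripwireTriggered:
    print("Input blocked by guardrail")
```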
### Streaming Responses

```python
from openai.types.responses import ResponseTextDeltaEvent
from agents import Agent, Runner

async def stream():
    # run_streamed returns immediately; events arrive as the model responds
    result = Runner.run_streamed(agent, "Write a story")
    async for event in result.stream_events():
        if event.type == "raw_response_event" and isinstance(event.data, ResponseTextDeltaEvent):
            print(event.data.delta, end="", flush=True)
```
## Key Classes

| Class | Import | Purpose |
|---|---|---|
| `Agent` | `from agents import Agent` | Create agents |
| `Runner` | `from agents import Runner` | Execute agents |
| `function_tool` | `from agents import function_tool` | Decorator for tools |
| `handoff` | `from agents import handoff` | Custom handoff config |
| `input_guardrail` | `from agents import input_guardrail` | Input validation |
| `output_guardrail` | `from agents import output_guardrail` | Output validation |
| `SQLiteSession` | `from agents import SQLiteSession` | Conversation memory |
| `RunConfig` | `from agents import RunConfig` | Execution settings |
| `OpenAIChatCompletionsModel` | `from agents import OpenAIChatCompletionsModel` | Custom OpenAI-compatible models |
| `set_tracing_disabled` | `from agents import set_tracing_disabled` | Disable tracing |
| `Model` | `from agents.models.interface import Model` | Base class for custom models |
| `ModelTracing` | `from agents.models.interface import ModelTracing` | Tracing configuration enum |
| `ModelResponse` | `from agents.items import ModelResponse` | Model response type |
| `TResponseInputItem` | `from agents.items import TResponseInputItem` | Input item type |
| `TResponseStreamEvent` | `from agents.items import TResponseStreamEvent` | Stream event type |