open-responses

Installation
SKILL.md

Open Responses

Open Responses is an open-source specification defining a unified HTTP protocol for multi-provider LLM interactions. It standardizes how clients and servers communicate — messages, tool calls, streaming, multimodal inputs, reasoning — so that code written against one provider works with any compliant provider.

This is the protocol standard itself, not any specific SDK. Open Responses is provider-agnostic. Any LLM provider (OpenAI, Anthropic, Gemini, Databricks, Hugging Face, Ollama, etc.) can implement a compliant API.

Stateless by default, stateful where needed. The core protocol does not require server-side session persistence. Multi-turn conversations can be threaded via previous_response_id, which instructs the server to reconstruct context from prior responses. However, providers may offer stateful features (e.g., server-side storage, conversation objects) as extensions. The spec notes that item states "do not necessarily mean they are stateful in the sense of being persisted to disk or stored long-term."

Design Principles

  • Multi-provider compatibility — one schema, any provider
  • Stateless-first protocol — context reconstruction via previous_response_id; providers may optionally offer persistence
  • Polymorphic items — all model outputs share a common item structure discriminated by type
  • Semantic streaming — SSE events map directly to state machine transitions
  • Extensible without fragmentation — vendor-prefixed extensions prevent namespace collisions

Specification: https://www.openresponses.org/specification


Reference Files

For detailed schemas, JSON examples, and complete event catalogs, load the appropriate reference file:

File Contents When to Load
references/protocol-and-items.md HTTP protocol, item types, content types, control parameters, error handling Implementing or debugging request/response structure
references/state-machines-and-streaming.md State machine diagrams, streaming event catalog, complete SSE sequences for text and tool use Implementing or debugging streaming, state transitions
references/extensions.md Custom items, custom events, schema extensions, governance path Extending the spec with provider-specific features

To search references for specific topics: grep for function_call, streaming, tool_choice, previous_response_id, vendor:, or other keywords.


Core Concepts

Endpoint and Transport

All requests go to POST /v1/responses with Authorization: Bearer <token> and Content-Type: application/json. Non-streaming responses return JSON. Streaming responses use SSE (text/event-stream) terminated by data: [DONE].

Items

Items are polymorphic atomic units discriminated by type. Output items (those emitted by the model in a response) must include id, type, and status fields. Core output types: message, function_call, reasoning. Providers extend with vendor-prefixed types (e.g., acme:web_search_call).

Input items (those sent by the client in a request) have different requirements per type. Content types like input_text, input_image, and input_file do not carry id or status. function_call_output items require call_id and output but treat id and status as optional.

Message roles: user, assistant, system, developer. The system role is distinct from the instructions parameter — it is an inline message item in the input array. The developer role is a separate role that providers may handle differently from system.

State Machines and Event Emission

The response and item lifecycles are both finite state machines. Each state constrains which events can be emitted.

Response Lifecycle — Events Emitted Per State

stateDiagram-v2
    [*] --> created : response.created
    created --> queued : response.queued
    queued --> in_progress : response.in_progress

    state in_progress {
        direction LR
        note right of in_progress
            Events emittable while in_progress:
            ─────────────────────────────────
            response.output_item.added
            response.content_part.added
            response.output_text.delta
            response.output_text.done
            response.function_call_arguments.delta
            response.function_call_arguments.done
            response.reasoning_summary_text.delta
            response.reasoning_summary_text.done
            response.content_part.done
            response.output_item.done
            vendor:custom_event

            All delta events carry: sequence_number,
            output_index, item_id
            Content-level events also carry: content_index
        end note
    }

    in_progress --> completed : response.completed
    in_progress --> incomplete : response.incomplete\n(item hit token budget)
    in_progress --> failed : response.failed
    completed --> [*]
    incomplete --> [*]
    failed --> [*]

Note: If any item ends in incomplete status, the containing response MUST also be incomplete.

Item Lifecycle — Events Emitted Per State

stateDiagram-v2
    [*] --> in_progress : response.output_item.added

    state in_progress {
        direction LR
        note right of in_progress
            Events emittable while item is in_progress:
            ──────────────────────────────────────────
            Message items:
              response.content_part.added
              response.output_text.delta  (repeated)
              response.output_text.done
              response.content_part.done

            Function call items:
              response.function_call_arguments.delta  (repeated)
              response.function_call_arguments.done

            Reasoning items:
              response.reasoning_summary_text.delta  (repeated)
              response.reasoning_summary_text.done
        end note
    }

    in_progress --> completed : response.output_item.done
    in_progress --> incomplete : response.output_item.done
    completed --> [*]
    incomplete --> [*]

    note right of completed : Terminal — no further deltas
    note right of incomplete : Terminal — token budget exhausted

Event Validity Summary

Response State Valid Events
created (transient — response object just created)
queued (waiting for model availability)
in_progress All delta events, all custom events, item lifecycle events
completed (terminal — no more events except [DONE])
incomplete (terminal — no more events except [DONE])
failed (terminal — no more events except [DONE])
Item State Valid Events
in_progress Content deltas (.delta), content completion (.done), part lifecycle
completed (terminal — no further deltas for this item)
incomplete (terminal — no further deltas for this item)

All delta and item events carry sequence_number (monotonically increasing), output_index (position in response output array), and item_id. Content-level events (text, reasoning summary) additionally carry content_index (position within a content part). Servers SHOULD NOT use the SSE id field.

Streaming Events

Two categories of SSE events:

  • Delta events — incremental content: response.output_text.delta, response.function_call_arguments.delta, response.output_item.added, response.output_item.done, etc.
  • Lifecycle events — state transitions: response.created, response.queued, response.in_progress, response.completed, response.incomplete, response.failed

Rule: the event SSE header must match the type field inside the JSON body.


Tools

Open Responses defines two tool categories based on execution location.

Externally-hosted tools — implementation lives outside the provider's system. The model requests invocation via function_call items, and the developer must supply results as function_call_output items in a follow-up request. Note that "externally hosted" does not always mean the developer executes the tool locally — MCP tools are externally hosted (the implementation lives on external servers), but control is not necessarily yielded back to the developer first. Examples: function tools, MCP server tools.

Internally-hosted tools — implementation lives inside the provider's system. The provider executes without yielding control and returns results as provider-specific item types within the same response. These items must be losslessly round-trippable in follow-up requests. Examples: file search, code interpreter, web search.

Tool Definition

{
  "type": "function",
  "name": "get_weather",
  "description": "Get current weather for a location",
  "parameters": {
    "type": "object",
    "properties": {
      "location": {"type": "string", "description": "City name"},
      "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
    },
    "required": ["location"]
  }
}

Tool Control

The tool_choice parameter controls whether and how the model uses tools:

tool_choice value Purpose
"auto" Model decides whether to call tools (default)
"required" Model must invoke at least one tool
"none" No tool calls permitted
{"type": "function", "name": "..."} Force a specific tool
{"type": "allowed_tools", "tools": [...]} Restrict which tools the model may invoke

The allowed_tools form is nested inside tool_choice, not a separate top-level parameter:

{
  "tool_choice": {
    "type": "allowed_tools",
    "tools": [
      {"type": "function", "name": "get_weather"}
    ]
  }
}

The model MUST restrict its tool calls to the subset named in allowed_tools. Servers MUST enforce this as a hard constraint. Tool definitions remain in the model's context, preserving prompt cache.


Agentic Loop Pattern

The agentic loop is the core pattern for multi-step, tool-augmented workflows.

Flow

  Client                     Provider                    Model
    |                           |                          |
    |-- POST /v1/responses ---->|                          |
    |                           |--- prompt to model ----->|
    |                           |<-- output items ---------|
    |                           |                          |
    |            [external tool calls needing               |
    |             client-supplied results?]                 |
    |                           |                          |
    |              YES                                     |
    |<-- response with --------|                          |
    |   function_call items     |                          |
    |                           |                          |
    |   [client satisfies       |                          |
    |    tool calls]            |                          |
    |                           |                          |
    |-- POST /v1/responses ---->|                          |
    |   previous_response_id +  |                          |
    |   function_call_output    |--- prompt + context ---->|
    |   items in input          |<-- output items ---------|
    |                           |                          |
    |              NO: no client-satisfied calls remain     |
    |<-- completed response ----|                          |
    |   (may contain message,   |                          |
    |    reasoning, hosted-tool |                          |
    |    items, etc.)           |                          |

Key Principles

  1. Stateless-first iteration — Each loop iteration is a new HTTP request. The server reconstructs context from previous_response_id. Providers may optionally persist state, but the protocol does not require it.

  2. Developer controls external tool execution — For externally-hosted function tools, the developer decides when to execute, what results to return, and whether to continue. For MCP tools (also externally hosted), execution may happen without first yielding control to the developer.

  3. Parallel tool calls — The model may emit multiple function_call items in a single response. Execute all of them and return all results in one follow-up request.

  4. Loop termination — The loop ends when no client-satisfied external tool calls remain in the response. The final response may contain not just message items but also reasoning items, internally-hosted tool items, and other non-message output items.

  5. Provider handles internal tools — For internally-hosted tools, the provider executes within the same request and returns provider-specific item types. No developer loop required.

Example: Multi-Tool Agent

Turn 1 — Request with tools:

{
  "model": "provider/model-name",
  "input": [{"type": "message", "role": "user", "content": "Compare the weather in Paris and Tokyo."}],
  "tools": [{"type": "function", "name": "get_weather", "description": "Get current weather for a city", "parameters": {"type": "object", "properties": {"location": {"type": "string"}}, "required": ["location"]}}]
}

Turn 1 — Model emits two parallel function_call items:

{
  "id": "resp_100",
  "status": "completed",
  "output": [
    {"id": "item_101", "type": "function_call", "name": "get_weather", "call_id": "call_paris", "arguments": "{\"location\":\"Paris\"}", "status": "completed"},
    {"id": "item_102", "type": "function_call", "name": "get_weather", "call_id": "call_tokyo", "arguments": "{\"location\":\"Tokyo\"}", "status": "completed"}
  ]
}

Turn 2 — Developer returns tool results:

{
  "model": "provider/model-name",
  "previous_response_id": "resp_100",
  "input": [
    {"type": "function_call_output", "call_id": "call_paris", "output": "{\"temperature\":18,\"condition\":\"partly cloudy\"}"},
    {"type": "function_call_output", "call_id": "call_tokyo", "output": "{\"temperature\":24,\"condition\":\"sunny\"}"}
  ],
  "tools": [...]
}

Turn 2 — Model synthesizes final answer (no function_call items = loop ends):

{
  "id": "resp_101",
  "status": "completed",
  "output": [
    {"id": "item_200", "type": "message", "role": "assistant", "status": "completed", "content": [{"type": "output_text", "text": "Paris is currently 18°C and partly cloudy. Tokyo is warmer at 24°C with sunny skies."}]}
  ]
}

Multi-Turn Conversations

Multi-turn conversations use previous_response_id to chain context. The server reconstructs the full conversation by walking the response chain (providers may also support server-side persistence as an extension):

Server loads: previous_response.input + previous_response.output + new_input
// Turn 1
{"model": "provider/model-name", "input": [{"type": "message", "role": "user", "content": "What is the population of France?"}]}
// Response: {"id": "resp_200", ...}

// Turn 2 — references Turn 1
{"model": "provider/model-name", "previous_response_id": "resp_200", "input": [{"type": "message", "role": "user", "content": "And what about Germany?"}]}

Extensions

Open Responses supports four extension mechanisms, all using vendor-prefixed names to prevent collisions. For full details with examples, load references/extensions.md.

Mechanism Naming Pattern Required Fields Constraint
Custom Items vendor:type_name id, type, status Must follow item state machine, must round-trip
Custom Events vendor:event_name type, sequence_number Must not alter core semantics or token order
Schema Extensions vendor-prefixed fields N/A (optional fields) Must not break clients ignoring unknown fields
Governance Path N/A N/A Broad adoption -> TSC proposal -> core spec

Clients must silently ignore unknown item types and event types — this is the forward-compatibility contract.


Compliance

An API is Open Responses-compliant if it implements the spec directly or is a proper superset. The published acceptance test suite is available at https://www.openresponses.org/.

Core Compliance Tests

Test Validates
Basic Text Response ResponseResource schema, item structure, usage
Streaming Response SSE events, correct ordering, final structure
System Prompt instructions parameter, system role handling
Tool Calling Function tool definition, function_call output, round-tripping
Image Input Image URL in user content
Multi-turn Conversation Message history, assistant + user turns

Server Implementation Checklist

  • POST /v1/responses endpoint with Authorization header
  • Valid output items with id, type, status; input items per their type requirements
  • Item state machine: in_progress -> completed / incomplete
  • Response state machine: created -> queued -> in_progress -> completed / incomplete / failed
  • Emit all 6 lifecycle events: response.created, .queued, .in_progress, .completed, .incomplete, .failed
  • Response incomplete when any item ends incomplete
  • Non-streaming JSON and streaming SSE with event/type matching
  • data: [DONE] terminal marker
  • Function tools: function_call items, function_call_output round-tripping
  • previous_response_id for conversation continuation
  • Error objects: type, code, param, message with correct HTTP status codes
  • Vendor-prefixed extensions (if applicable)

Client Implementation Checklist

  • Send Authorization and Content-Type headers
  • Parse polymorphic items by type field
  • Track item and response state machines
  • Process SSE: parse event: + data: lines, handle [DONE]
  • Implement agentic loop for externally-hosted tools
  • Silently ignore unknown item types and event types
  • Support previous_response_id for multi-turn conversations
  • Handle parallel tool calls in a single response

Quick Reference

Streaming Event Types

Event Category
response.created Lifecycle
response.queued Lifecycle
response.in_progress Lifecycle
response.completed Lifecycle
response.incomplete Lifecycle
response.failed Lifecycle
response.output_item.added / .done Delta
response.content_part.added / .done Delta
response.output_text.delta / .done Delta
response.function_call_arguments.delta / .done Delta
response.reasoning_summary_text.delta / .done Delta
vendor:custom_event Custom

Item Types

Type Category
message Core
function_call Core
function_call_output Core
reasoning Core
vendor:custom_type Extension

State Summary

Object States Terminal
Response created -> queued -> in_progress -> completed / incomplete / failed completed, incomplete, failed
Item in_progress -> completed / incomplete completed, incomplete

If any item ends incomplete, the containing response MUST also be incomplete.

Error Types

Type HTTP Retry
invalid_request 400 No
not_found 404 No
too_many_requests 429 Yes
server_error 500 Yes
model_error 500 Maybe
Weekly Installs
41
GitHub Stars
110
First Seen
1 day ago