Gemini 3 Thought Signature Tool Use Fix

Problem

When routing LLM requests through Google's Gemini 3 models via an OpenAI-compatible proxy, tool use (function calling) produces empty responses on the follow-up turn. Simple text messages work fine. The bot uses the tool successfully but then returns nothing when it should synthesize the tool result into a response.

Context / Trigger Conditions

  • Empty responses only after tool use: Non-tool messages (greetings, questions) work perfectly; only messages involving tools (web_fetch, exec, browser, etc.) produce empty responses
  • Google returns 400: "Function call is missing a thought_signature" in error body
  • Using Gemini 3 models: gemini-3-flash-preview, gemini-3-pro, or similar Gemini 3 family models
  • OpenAI-compatible client: Any client that serializes conversation history using standard OpenAI format (strips non-standard fields like extra_content)
  • Second turn fails: The initial tool_call response from Google works; the follow-up request submitting tool results back fails
  • Provider cascade fails: All fallback providers fail because they all hit the same issue or aren't configured for tool use
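The tell-tale 400 from Google quotes the missing-signature message directly. The exact error envelope varies by endpoint, so treat the wrapper fields below as illustrative; only the message string is taken from observed behavior:

```json
{
  "error": {
    "code": 400,
    "message": "Function call is missing a thought_signature.",
    "status": "INVALID_ARGUMENT"
  }
}
```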

Root Cause

Gemini 3 models use mandatory "thinking" mode that produces opaque thought_signature tokens alongside tool calls. The flow breaks like this:

1. Client → Proxy → Google: Request with tools array                    ✅
2. Google → Proxy → Client: Response with tool_calls + thought_signature ✅
3. Client strips extra_content (non-standard OpenAI field)               ❌
4. Client → Proxy → Google: Tool results WITHOUT thought_signature       ❌
5. Google returns 400: "Function call is missing a thought_signature"    ❌
6. Proxy cascade fails → empty response to user                         ❌

Key detail: Google's response includes the signature in a non-standard location:

```json
{
  "tool_calls": [{
    "extra_content": { "google": { "thought_signature": "<opaque-base64>" } },
    "function": { "name": "web_fetch", "arguments": "..." },
    "id": "...", "type": "function"
  }]
}
```

Standard OpenAI clients (including clawdbot) strip extra_content when rebuilding conversation history because it's not part of the OpenAI API spec. Google then rejects the follow-up because it can't verify the thinking chain without the signature.

Why you can't just disable thinking: Google docs state "Reasoning cannot be turned off for Gemini 3 models." This is a hard requirement.

Solution: Proxy-Level Model Routing

Route requests with tools to a Gemini 2.x model (no thought signatures needed) and requests without tools to Gemini 3 (latest model for text quality).

Add this to the proxy's buildProviderBody() function:

```js
if (provider.name === 'google') {
  delete clone.store;

  // Gemini 3 requires thought_signature for tool use, which clients
  // strip from conversation history. Fall back to 2.5-flash for tool
  // requests where this would break multi-turn function calling.
  if (clone.tools && clone.tools.length > 0) {
    clone.model = 'gemini-2.5-flash';
    console.log('[proxy] Tool use detected → downgrade to gemini-2.5-flash');
  }
}
```

Why This Approach

| Approach | Pros | Cons |
| --- | --- | --- |
| Smart model routing (chosen) | 5 lines, transparent to client, no state | Tool requests use older model |
| Proxy-level signature injection | Preserves Gemini 3 for tools | Extremely complex: buffer SSE, parse streaming, cache per-session, inject on follow-up |
| Strip tools for Google | Simple | User loses tool use entirely |
| Use only 2.5-flash | No routing needed | Loses Gemini 3 quality for text |
| Fix the client | Ideal long-term | Client is closed-source npm package |

Verification

After deploying the fix:

  1. Tool use test: Send a message requiring tool use (e.g., "read this URL: ...") → bot should use the tool AND produce a non-empty text response
  2. Non-tool test: Send "Hello" → should respond normally (via Gemini 3)
  3. Provider status: Google provider should show failures: 0 and lastSuccess set
  4. Debug logs: Should show "Tool use detected → downgrade to gemini-2.5-flash" for tool-bearing requests
  5. Circuit breaker: Google circuit should remain CLOSED

Diagnostic Steps

If you suspect this issue:

  1. Check provider-status endpoint: Does Google show lastSuccess: null despite requests?
  2. Check proxy logs for 400 errors from Google containing "thought_signature"
  3. Compare tool vs non-tool messages: Do non-tool messages work while tool messages fail?
  4. Check the model name: Is it a Gemini 3 model? (Gemini 2.x doesn't have this issue)

Notes

  • Gemini 2.5-flash does not require thought signatures for function calling — it works with standard OpenAI-compatible tool_call format
  • This may be resolved in future Gemini versions if Google makes thought_signature optional or adds it to the standard response format. Check Google AI Studio changelog.
  • The check is clone.tools && clone.tools.length > 0 — this catches requests that have tools defined, not just requests that are submitting tool results. Both scenarios need the older model because the conversation may involve tool calls.
  • Related skill: See multi-provider-llm-proxy-debugging for general proxy chain debugging patterns (field stripping, URL construction, error body visibility)
