# Gemini 3 Thought Signature Tool Use Fix

## Problem
When routing LLM requests through Google's Gemini 3 models via an OpenAI-compatible proxy, tool use (function calling) produces empty responses on the follow-up turn. Simple text messages work fine. The bot uses the tool successfully but then returns nothing when it should synthesize the tool result into a response.
## Context / Trigger Conditions
- Empty responses only after tool use: Non-tool messages (greetings, questions) work perfectly; only messages involving tools (web_fetch, exec, browser, etc.) produce empty responses
- Google returns 400: `"Function call is missing a thought_signature"` in the error body
- Using Gemini 3 models: `gemini-3-flash-preview`, `gemini-3-pro`, or similar Gemini 3 family models
- OpenAI-compatible client: Any client that serializes conversation history using standard OpenAI format (strips non-standard fields like `extra_content`)
- Second turn fails: The initial tool_call response from Google works; the follow-up request submitting tool results back fails
- Provider cascade fails: All fallback providers fail because they all hit the same issue or aren't configured for tool use
## Root Cause
Gemini 3 models use a mandatory "thinking" mode that produces opaque `thought_signature` tokens alongside tool calls. The flow breaks like this:
1. Client → Proxy → Google: Request with tools array ✅
2. Google → Proxy → Client: Response with tool_calls + thought_signature ✅
3. Client strips extra_content (non-standard OpenAI field) ❌
4. Client → Proxy → Google: Tool results WITHOUT thought_signature ❌
5. Google returns 400: "Function call is missing a thought_signature" ❌
6. Proxy cascade fails → empty response to user ❌
Key detail: Google's response includes the signature in a non-standard location:
```json
{
  "tool_calls": [{
    "extra_content": { "google": { "thought_signature": "<opaque-base64>" } },
    "function": { "name": "web_fetch", "arguments": "..." },
    "id": "...", "type": "function"
  }]
}
```
Standard OpenAI clients (including clawdbot) strip extra_content when rebuilding
conversation history because it's not part of the OpenAI API spec. Google then rejects
the follow-up because it can't verify the thinking chain without the signature.
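The lossy rebuild step can be sketched like this (an illustrative reconstruction, not clawdbot's actual source — the function name is hypothetical). Copying only the OpenAI-spec fields silently drops Google's nested `extra_content.google.thought_signature` before the second turn:

```javascript
// Sketch of how a spec-strict client rebuilds an assistant tool-call message.
// Only the documented OpenAI fields are copied, so extra_content (and the
// thought_signature inside it) never reaches the follow-up request.
function rebuildAssistantMessage(response) {
  return {
    role: 'assistant',
    content: response.content ?? null,
    tool_calls: (response.tool_calls ?? []).map((tc) => ({
      id: tc.id,
      type: tc.type,
      function: { name: tc.function.name, arguments: tc.function.arguments },
      // tc.extra_content is not copied — the signature is lost right here
    })),
  };
}
```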
Why you can't just disable thinking: Google docs state "Reasoning cannot be turned off for Gemini 3 models." This is a hard requirement.
## Solution: Proxy-Level Model Routing
Route requests with tools to a Gemini 2.x model (no thought signatures needed) and requests without tools to Gemini 3 (latest model for text quality).
Add this to the proxy's `buildProviderBody()` function:

```javascript
if (provider.name === 'google') {
  delete clone.store;
  // Gemini 3 requires thought_signature for tool use, which clients
  // strip from conversation history. Fall back to 2.5-flash for tool
  // requests where this would break multi-turn function calling.
  if (clone.tools && clone.tools.length > 0) {
    clone.model = 'gemini-2.5-flash';
    console.log('[proxy] Tool use detected → downgrade to gemini-2.5-flash');
  }
}
```
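The routing rule can also be expressed as a pure function so it is unit-testable outside the proxy (a sketch; `TOOL_SAFE_MODEL` and `routeGoogleModel` are illustrative names, not part of the actual proxy):

```javascript
// Pure routing decision: tool-bearing requests go to the tool-safe model,
// everything else keeps the requested Gemini 3 model.
const TOOL_SAFE_MODEL = 'gemini-2.5-flash';

function routeGoogleModel(body) {
  const hasTools = Array.isArray(body.tools) && body.tools.length > 0;
  return hasTools ? TOOL_SAFE_MODEL : body.model;
}
```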
## Why This Approach
| Approach | Pros | Cons |
|---|---|---|
| Smart model routing (chosen) | 5 lines, transparent to client, no state | Tool requests use older model |
| Proxy-level signature injection | Preserves Gemini 3 for tools | Extremely complex: buffer SSE, parse streaming, cache per-session, inject on follow-up |
| Strip tools for Google | Simple | User loses tool use entirely |
| Use only 2.5-flash | No routing needed | Loses Gemini 3 quality for text |
| Fix the client | Ideal long-term | Client is closed-source npm package |
## Verification
After deploying the fix:
- Tool use test: Send a message requiring tool use (e.g., "read this URL: ...") → bot should use the tool AND produce a non-empty text response
- Non-tool test: Send "Hello" → should respond normally (via Gemini 3)
- Provider status: Google provider should show `failures: 0` and `lastSuccess` set
- Debug logs: Should show "Tool use detected → downgrade to gemini-2.5-flash" for tool-bearing requests
- Circuit breaker: Google circuit should remain CLOSED
## Diagnostic Steps
If you suspect this issue:
- Check the provider-status endpoint: Does Google show `lastSuccess: null` despite requests?
- Check proxy logs for 400 errors from Google containing "thought_signature"
- Compare tool vs non-tool messages: Do non-tool messages work while tool messages fail?
- Check the model name: Is it a Gemini 3 model? (Gemini 2.x doesn't have this issue)
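For the log check, a small triage helper can classify whether an error response matches this failure mode (a sketch; `isThoughtSignatureError` is a hypothetical name, and the substring match is an assumption based on the error text quoted above):

```javascript
// Returns true when a Google error looks like the missing-thought_signature
// 400 described in this document, false for unrelated failures.
function isThoughtSignatureError(status, errorBody) {
  return status === 400 &&
    typeof errorBody === 'string' &&
    errorBody.includes('thought_signature');
}
```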
## Notes
- Gemini 2.5-flash does not require thought signatures for function calling — it works with standard OpenAI-compatible tool_call format
- This may be resolved in future Gemini versions if Google makes thought_signature optional or adds it to the standard response format. Check Google AI Studio changelog.
- The check is `clone.tools && clone.tools.length > 0` — this catches requests that have tools defined, not just requests that are submitting tool results. Both scenarios need the older model because the conversation may involve tool calls.
- Related skill: See `multi-provider-llm-proxy-debugging` for general proxy chain debugging patterns (field stripping, URL construction, error body visibility)
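A more defensive variant of that check (a sketch; `needsToolSafeModel` is a hypothetical helper) also inspects the message history, in case a client resubmits tool results without re-sending the tools array. Field names follow the standard OpenAI chat format:

```javascript
// Route to the tool-safe model when the request defines tools OR when its
// history already contains tool activity (tool-result messages or assistant
// tool_calls), since either means the thinking chain may be in play.
function needsToolSafeModel(body) {
  if (Array.isArray(body.tools) && body.tools.length > 0) return true;
  return (body.messages ?? []).some(
    (m) => m.role === 'tool' || (Array.isArray(m.tool_calls) && m.tool_calls.length > 0)
  );
}
```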