gemini
Gemini API
Use the Google Gemini API via REST for text generation, multimodal analysis, image generation, function calling, search grounding, and context caching.
Prerequisites
- Environment variable GOOGLE_API_KEY must be set
- API endpoint: https://generativelanguage.googleapis.com/v1beta
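The prerequisites can be checked before any request is sent. The sketch below is a hypothetical wrapper, not part of the original skill: the names gemini_post and GEMINI_BASE_URL are assumptions made for illustration. It refuses to run when GOOGLE_API_KEY is empty, and it passes the key in the x-goog-api-key header, which the API accepts as an alternative to the ?key= query parameter and which keeps the key out of logged URLs.

```shell
# Hypothetical helper: POST a JSON body to a Gemini model.
# Usage: gemini_post MODEL JSON_BODY [METHOD]
GEMINI_BASE_URL="https://generativelanguage.googleapis.com/v1beta"

gemini_post() {
  local model=$1
  local body=$2
  local method=${3:-generateContent}

  # Fail early instead of sending an unauthenticated request.
  if [ -z "${GOOGLE_API_KEY:-}" ]; then
    echo "GOOGLE_API_KEY is not set" >&2
    return 1
  fi

  # The key travels in a header rather than in the query string.
  curl -sS "${GEMINI_BASE_URL}/models/${model}:${method}" \
    -H "x-goog-api-key: ${GOOGLE_API_KEY}" \
    -H 'Content-Type: application/json' \
    -X POST \
    -d "$body"
}
```

A call would look like gemini_post gemini-2.5-flash '{"contents":[{"parts":[{"text":"Hello"}]}]}'.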
Available Models
| Model | Use Case |
|---|---|
| gemini-2.5-flash | Fast text generation (default) |
| gemini-2.5-pro | High quality text generation |
| gemini-3-flash-preview | Latest flash model |
| gemini-3-pro-preview | Latest pro model |
| gemini-2.5-flash-image | Image generation (Nano Banana) |
| gemini-3-pro-image-preview | Advanced image generation with thinking & search |
Workflow
Phase 1: Determine Task Type
Based on the user's request, identify which capability to use:
- Text Generation: Basic prompts, chat, Q&A
- Multimodal Analysis: Analyze images, videos, or audio
- Image Generation: Create or edit images (Nano Banana)
- Function Calling: Execute custom functions
- Search Grounding: Real-time web search integration
Phase 2: Execute API Call
Use the appropriate curl command based on task type.
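One way to make the task-to-command choice explicit is a small lookup. This is a sketch only: the function name model_for_task and the task labels are hypothetical, and the mapping simply mirrors the models used in this document's examples.

```shell
# Hypothetical mapping from a task label to the model used in the examples.
model_for_task() {
  case "$1" in
    text|multimodal|function-calling|search)
      echo "gemini-2.5-flash" ;;
    image)
      echo "gemini-2.5-flash-image" ;;
    image-pro)
      echo "gemini-3-pro-image-preview" ;;
    *)
      echo "unknown task type: $1" >&2
      return 1 ;;
  esac
}
```

For example, model_for_task image prints gemini-2.5-flash-image, which can then be substituted into the request URL.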
1. Text Generation
Basic Prompt
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [{"text": "Your prompt here"}]
}]
}'
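Pasting a prompt directly into the single-quoted JSON body breaks as soon as the prompt contains a double quote, a backslash, or a line break. The sketch below shows one way to build the body from an arbitrary string using only sed and awk. The function names json_escape and text_payload are hypothetical, and the escaping covers only backslashes, double quotes, and newlines; other control characters such as tabs are not handled, so a JSON tool like jq would be a more robust choice where it is available.

```shell
# Escape a string for use inside a JSON string literal.
# Handles backslashes, double quotes, and newlines only.
json_escape() {
  printf '%s' "$1" \
    | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' \
    | awk 'NR > 1 { printf "%s", "\\n" } { printf "%s", $0 }'
}

# Build a minimal generateContent body around one text prompt.
text_payload() {
  printf '{"contents":[{"parts":[{"text":"%s"}]}]}' "$(json_escape "$1")"
}
```

The result can be passed to curl with -d "$(text_payload "$PROMPT")" in place of the literal body.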
With Configuration
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [{"text": "Your prompt here"}]
}],
"generationConfig": {
"temperature": 0.9,
"maxOutputTokens": 2000,
"stopSequences": ["END"]
}
}'
Multi-turn Chat
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [
{"role": "user", "parts": [{"text": "First message"}]},
{"role": "model", "parts": [{"text": "Model response"}]},
{"role": "user", "parts": [{"text": "Follow-up question"}]}
]
}'
System Instructions
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"system_instruction": {
"parts": [{"text": "You are a helpful assistant that speaks like a pirate."}]
},
"contents": [{
"parts": [{"text": "Hello!"}]
}]
}'
JSON Mode Output
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [{"text": "List 3 colors as JSON array"}]
}],
"generationConfig": {
"response_mime_type": "application/json"
}
}'
Streaming Response
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:streamGenerateContent?alt=sse&key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [{"text": "Write a long story"}]
}]
}'
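With alt=sse the body arrives as server-sent events: each chunk is a line starting with "data: " followed by a JSON object, and events are separated by blank lines. A minimal sketch for turning that stream into one JSON object per line is shown here; the function name sse_payloads is hypothetical.

```shell
# Read an SSE stream on stdin and print only the JSON payloads,
# one per line. Carriage returns are dropped in case the server
# sends CRLF line endings.
sse_payloads() {
  tr -d '\r' | sed -n 's/^data: //p'
}
```

It would be used as a filter on the streaming call, for example curl -sN ... | sse_payloads, where -N turns off curl's output buffering so chunks appear as they arrive.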
Safety Settings
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [{"text": "Your prompt"}]
}],
"safetySettings": [
{"category": "HARM_CATEGORY_DANGEROUS_CONTENT", "threshold": "BLOCK_ONLY_HIGH"},
{"category": "HARM_CATEGORY_HARASSMENT", "threshold": "BLOCK_ONLY_HIGH"}
]
}'
2. Multimodal Analysis
Image Analysis (Base64 Inline)
# First encode the image to base64 on a single line (works with both GNU and BSD/macOS base64)
BASE64_IMAGE=$(base64 < image.jpg | tr -d '\n')
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [
{"text": "Describe this image in detail"},
{"inline_data": {"mime_type": "image/jpeg", "data": "'$BASE64_IMAGE'"}}
]
}]
}'
Video Analysis (File API)
Step 1: Upload Video
# Get upload URL
UPLOAD_URL=$(curl -s "https://generativelanguage.googleapis.com/upload/v1beta/files?key=$GOOGLE_API_KEY" \
-H "X-Goog-Upload-Protocol: resumable" \
-H "X-Goog-Upload-Command: start" \
-H "X-Goog-Upload-Header-Content-Length: $(wc -c < video.mp4 | tr -d ' ')" \
-H "X-Goog-Upload-Header-Content-Type: video/mp4" \
-H "Content-Type: application/json" \
-d '{"file": {"display_name": "video.mp4"}}' \
-D - | grep -i "x-goog-upload-url" | cut -d' ' -f2 | tr -d '\r')
# Upload file
curl "$UPLOAD_URL" \
-H "X-Goog-Upload-Offset: 0" \
-H "X-Goog-Upload-Command: upload, finalize" \
-H "Content-Type: video/mp4" \
--data-binary @video.mp4
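The final upload request returns a JSON description of the file, and its uri field is the value to use as file_uri in the query step. The sketch below pulls a string field out of that response with grep and sed. The function name json_string_field and the sample response in the usage note are hypothetical, and the extraction is deliberately simplistic: it assumes the value contains no escaped quotes and takes the first match, so a real JSON parser is preferable when one is installed.

```shell
# Print the first string value stored under key $1 in the JSON on stdin.
# Simplistic: assumes the value contains no escaped double quotes.
json_string_field() {
  grep -o "\"$1\": *\"[^\"]*\"" \
    | head -n 1 \
    | sed 's/^[^:]*: *"//; s/"$//'
}
```

Assuming the upload output was saved with -o file_info.json, FILE_URI=$(json_string_field uri < file_info.json) captures the URI. The same helper can read the state field; video files may need a short wait until the state is ACTIVE before they can be queried.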
Step 2: Query with Video
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [
{"text": "Describe what happens in this video"},
{"file_data": {"mime_type": "video/mp4", "file_uri": "FILE_URI_FROM_UPLOAD"}}
]
}]
}'
Audio Analysis
As with video, upload the file via the File API first, then query it:
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [
{"text": "Transcribe and summarize this audio"},
{"file_data": {"mime_type": "audio/mp3", "file_uri": "FILE_URI_FROM_UPLOAD"}}
]
}]
}'
3. Image Generation (Nano Banana)
Basic Image Generation
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"parts": [{"text": "Create a photorealistic image of a cat wearing a hat"}]}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"]
}
}'
With Aspect Ratio Control
Supported ratios: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"parts": [{"text": "Create a landscape scene"}]}],
"generationConfig": {
"responseModalities": ["IMAGE"],
"imageConfig": {
"aspectRatio": "16:9"
}
}
}'
Image Editing (Character Consistency)
BASE64_IMAGE=$(base64 < original.jpg | tr -d '\n')
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [
{"text": "Put this character in a tropical forest"},
{"inline_data": {"mime_type": "image/jpeg", "data": "'$BASE64_IMAGE'"}}
]
}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"]
}
}'
High Resolution (Pro Model - 2K/4K)
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"parts": [{"text": "A photo of an oak tree in all four seasons"}]}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"],
"imageConfig": {
"aspectRatio": "1:1",
"imageSize": "4K"
}
}
}'
Image Generation with Search Grounding (Pro)
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-3-pro-image-preview:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"parts": [{"text": "Visualize the current weather forecast for Tokyo as a chart"}]}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"],
"imageConfig": {
"aspectRatio": "16:9"
}
},
"tools": [{"google_search": {}}]
}'
Multi-Image Fusion
BASE64_IMG1=$(base64 < image1.jpg | tr -d '\n')
BASE64_IMG2=$(base64 < image2.jpg | tr -d '\n')
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash-image:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"parts": [
{"text": "Combine these two characters in a fantasy world"},
{"inline_data": {"mime_type": "image/jpeg", "data": "'$BASE64_IMG1'"}},
{"inline_data": {"mime_type": "image/jpeg", "data": "'$BASE64_IMG2'"}}
]
}],
"generationConfig": {
"responseModalities": ["TEXT", "IMAGE"]
}
}'
4. Function Calling
Define and Call Functions
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{
"role": "user",
"parts": [{"text": "What movies are playing in Mountain View?"}]
}],
"tools": [{
"function_declarations": [{
"name": "find_movies",
"description": "Find movies playing in theaters",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string", "description": "City and state"},
"genre": {"type": "string", "description": "Movie genre"}
},
"required": ["location"]
}
}]
}]
}'
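The model does not run find_movies itself. Its reply contains a functionCall part with a name and args, and the caller is expected to execute the matching code and send the result back as a functionResponse. A minimal sketch of that local dispatch step follows. Both function bodies are hypothetical: the find_movies stub returns canned data instead of querying a real service, and dispatch_function_call is an illustrative name.

```shell
# Hypothetical local implementation of the declared function.
# $1 is the raw args JSON from the model; a real handler would parse it.
find_movies() {
  echo '{"movies": ["Barbie", "Oppenheimer"]}'
}

# Route a functionCall (name plus args JSON) to the matching local handler
# and print the JSON that goes into functionResponse.response.
dispatch_function_call() {
  local name=$1
  local args=$2
  case "$name" in
    find_movies)
      find_movies "$args" ;;
    *)
      echo "{\"error\": \"unknown function: $name\"}"
      return 1 ;;
  esac
}
```

The printed JSON is what would be placed in the response field of the follow-up request.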
Provide Function Response
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [
{"role": "user", "parts": [{"text": "What movies are playing in Mountain View?"}]},
{"role": "model", "parts": [{"functionCall": {"name": "find_movies", "args": {"location": "Mountain View, CA"}}}]},
{"role": "function", "parts": [{"functionResponse": {"name": "find_movies", "response": {"movies": ["Barbie", "Oppenheimer"]}}}]}
],
"tools": [{
"function_declarations": [{
"name": "find_movies",
"description": "Find movies playing in theaters",
"parameters": {
"type": "object",
"properties": {
"location": {"type": "string"},
"genre": {"type": "string"}
},
"required": ["location"]
}
}]
}]
}'
5. Search Grounding
Real-time web search integration:
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"contents": [{"parts": [{"text": "What is the current Google stock price?"}]}],
"tools": [{"google_search": {}}]
}'
The response includes groundingMetadata, which lists the sources used.
6. Context Caching
For repeated queries on the same large content:
Create Cache
curl "https://generativelanguage.googleapis.com/v1beta/cachedContents?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"model": "models/gemini-2.5-flash",
"contents": [{"parts": [{"text": "LARGE_DOCUMENT_TEXT_HERE"}]}],
"ttl": "3600s"
}'
Use Cache
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GOOGLE_API_KEY" \
-H 'Content-Type: application/json' \
-X POST \
-d '{
"cachedContent": "cachedContents/CACHE_ID",
"contents": [{"parts": [{"text": "Summarize the document"}]}]
}'
7. Model Information
List All Models
curl "https://generativelanguage.googleapis.com/v1beta/models?key=$GOOGLE_API_KEY"
Get Specific Model
curl "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash?key=$GOOGLE_API_KEY"
Response Handling
Text Response Structure
{
"candidates": [{
"content": {
"parts": [{"text": "Response text here"}],
"role": "model"
},
"finishReason": "STOP"
}],
"usageMetadata": {
"promptTokenCount": 10,
"candidatesTokenCount": 50,
"totalTokenCount": 60
}
}
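The finishReason field tells whether the text is complete: STOP means the model finished normally, while a value such as MAX_TOKENS or SAFETY means the output was cut short or withheld. A small sketch for checking it in a script is given here. The function name check_finish_reason and its exit codes are hypothetical choices, and only the first candidate's reason is inspected.

```shell
# Read a generateContent response on stdin and report how generation ended.
# Exit codes (arbitrary): 0 = STOP, 2 = MAX_TOKENS, 3 = field missing, 1 = anything else.
check_finish_reason() {
  local reason
  reason=$(grep -o '"finishReason": *"[A-Z_]*"' \
    | head -n 1 \
    | sed 's/.*"\([A-Z_]*\)"$/\1/')
  case "$reason" in
    STOP)
      return 0 ;;
    MAX_TOKENS)
      echo "response truncated: consider raising maxOutputTokens" >&2
      return 2 ;;
    "")
      echo "no finishReason found in response" >&2
      return 3 ;;
    *)
      echo "generation ended early: $reason" >&2
      return 1 ;;
  esac
}
```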
Image Response Structure
When using image generation, the response includes base64-encoded images:
{
"candidates": [{
"content": {
"parts": [
{"text": "Here is your image:"},
{"inlineData": {"mimeType": "image/png", "data": "BASE64_IMAGE_DATA"}}
]
}
}]
}
To save the image:
# Extract and decode image from response
echo "BASE64_DATA" | base64 -d > output.png
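Copying the base64 string by hand is impractical for real images, so the extraction and decoding can be combined. The sketch below assumes the response was saved to a file (for example with curl -o response.json) and writes the first inlineData payload to disk. The function name save_inline_image is hypothetical, and the grep-based extraction relies on the data value being an unbroken base64 string, which holds for the response shape shown above.

```shell
# Decode the first inline image in a saved response.
# $1: path to the JSON response, $2: output file path.
save_inline_image() {
  grep -o '"data": *"[^"]*"' "$1" \
    | head -n 1 \
    | sed 's/^"data": *"//; s/"$//' \
    | base64 --decode > "$2"
}
```

For example, save_inline_image response.json output.png. The long option --decode is used because it is accepted by both GNU and macOS versions of base64.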
Error Handling
| Error | Cause | Solution |
|---|---|---|
| 400 | Invalid request | Check JSON syntax |
| 401 | Invalid API key | Verify GOOGLE_API_KEY |
| 429 | Rate limit | Wait and retry |
| 500 | Server error | Retry with exponential backoff |
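The retry advice for 429 and 500 can be sketched as a generic wrapper that doubles the wait after each failed attempt. The function name retry_with_backoff and its argument order are hypothetical. It only sees the command's exit status, so curl must be run with --fail for HTTP errors to count as failures; note that this also retries 400-class errors, which will not succeed on retry, so a production version would inspect the status code. curl's built-in --retry option is an alternative that limits retries to transient errors.

```shell
# Run a command until it succeeds, doubling the delay between attempts.
# Usage: retry_with_backoff MAX_ATTEMPTS INITIAL_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
  local max_attempts=$1
  local delay=$2
  shift 2
  local attempt=1

  while :; do
    # Success ends the loop immediately.
    "$@" && return 0

    # Give up once the attempt budget is spent.
    if [ "$attempt" -ge "$max_attempts" ]; then
      return 1
    fi

    sleep "$delay"
    delay=$((delay * 2))
    attempt=$((attempt + 1))
  done
}
```

For example, retry_with_backoff 5 1 curl -sS --fail ... makes up to five attempts, waiting 1, 2, 4, and 8 seconds between them.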
Best Practices
- Use appropriate model: Flash for speed, Pro for quality
- Set temperature: Lower (0.1-0.3) for factual, higher (0.7-1.0) for creative
- Limit output tokens: Set maxOutputTokens to avoid excessive responses
- Use caching: For repeated queries on large documents
- Handle streaming: For long responses, use streamGenerateContent
- Image generation tips: Use detailed, descriptive prompts for best results