gemini-files

Gemini File API

Upload and manage files for use with Gemini models through executable scripts, supporting images, audio, video, PDFs, and other file types.

When to Use This Skill

Use this skill when you need to:

  • Upload images for multimodal analysis
  • Upload videos for content processing
  • Upload PDFs for document analysis
  • Upload audio for transcription or processing
  • Pre-upload files for batch operations
  • Check file processing status
  • List and manage uploaded files
  • Use files with other Gemini skills (text, image, etc.)

Available Scripts

scripts/upload.js

Purpose: Upload files to the Gemini File API

When to use:

  • Uploading any file for Gemini processing
  • Preparing files for multimodal generation
  • Uploading documents for analysis
  • Batch file preparation

Key parameters:

| Parameter | Description | Example |
|---|---|---|
| path | File path (required) | image.jpg |
| --name, -n | Display name | "my-document" |
| --wait, -w | Wait for processing | Flag |

Output: File name, URI, and status information

Workflows

Workflow 1: Basic File Upload

node scripts/upload.js image.jpg
  • Best for: Quick uploads, simple files
  • Output: File name and URI for API use
  • State: PROCESSING or ACTIVE

Workflow 2: Upload with Custom Name

node scripts/upload.js document.pdf --name "Quarterly Report Q4 2026"
  • Best for: Organizing files, tracking uploads
  • Use when: Original filename not descriptive enough
  • Display name appears in file listings

Workflow 3: Upload and Wait for Processing

node scripts/upload.js video.mp4 --wait
  • Best for: Large files, videos, audio
  • Waits until the file reaches the ACTIVE state
  • Use when: You need to use file immediately after upload

Workflow 4: Upload Image for Analysis

# 1. Upload image
node scripts/upload.js photo.png --name "product-shot"

# 2. Use with gemini-text for analysis
node skills/gemini-text/scripts/generate.js "Describe this image" --image photo.png
  • Best for: Image analysis, captioning, visual Q&A
  • Combines with: gemini-text for multimodal processing

Workflow 5: Upload PDF for Content Extraction

# 1. Upload PDF
node scripts/upload.js research-paper.pdf --name "AI-Research-Paper" --wait

# 2. Extract content with gemini-text
node skills/gemini-text/scripts/generate.js "Extract key findings from this document" --image research-paper.pdf
  • Best for: Document processing, content extraction
  • Combines with: gemini-text for analysis

Workflow 6: Upload Multiple Files for Batch

# 1. Upload multiple files
for file in *.jpg; do
    node scripts/upload.js "$file"
done

# 2. Create batch job using uploaded files (gemini-batch skill)
  • Best for: Preparing files for batch processing
  • Combines with: gemini-batch for bulk operations

Workflow 7: Upload Audio for Transcription

# 1. Upload audio
node scripts/upload.js interview.mp3 --name "interview-001" --wait

# 2. Process with gemini-text (if transcription available)
node skills/gemini-text/scripts/generate.js "Transcribe and summarize this audio" --image interview.mp3
  • Best for: Audio processing, transcription, podcast analysis
  • Combines with: gemini-text for audio analysis

Workflow 8: Upload Video for Content Analysis

# 1. Upload video (may take time)
node scripts/upload.js product-demo.mp4 --name "demo-video" --wait

# 2. Analyze with gemini-text
node skills/gemini-text/scripts/generate.js "Analyze this product demo video" --image product-demo.mp4
  • Best for: Video analysis, content summarization
  • Note: Videos may require significant processing time

Parameters Reference

Supported File Types

| Type | Extensions | Max Size | Processing Time |
|---|---|---|---|
| Images | jpg, jpeg, png, gif, webp | 20 MB | Seconds |
| Audio | mp3, wav, aac, flac | 25 MB | Seconds to minutes |
| Video | mp4, mov, avi, webm | 2 GB | Minutes to hours |
| Documents | pdf, txt | 50 MB | Seconds to minutes |
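The limits in this table can be checked locally before an upload is attempted. The following is a minimal sketch, not the logic of scripts/upload.js: `checkUpload` and `SIZE_LIMITS` are hypothetical names, and treating 1 MB as 1024 × 1024 bytes is an assumption.

```javascript
// Limits copied from the table above. 1 MB = 1024 * 1024 bytes is an assumption.
const MB = 1024 * 1024;
const SIZE_LIMITS = {
  image: { extensions: ["jpg", "jpeg", "png", "gif", "webp"], maxBytes: 20 * MB },
  audio: { extensions: ["mp3", "wav", "aac", "flac"], maxBytes: 25 * MB },
  video: { extensions: ["mp4", "mov", "avi", "webm"], maxBytes: 2048 * MB },
  document: { extensions: ["pdf", "txt"], maxBytes: 50 * MB },
};

// Hypothetical helper: validates a file name and size against the table.
function checkUpload(fileName, sizeBytes) {
  const dot = fileName.lastIndexOf(".");
  const ext = dot === -1 ? "" : fileName.slice(dot + 1).toLowerCase();
  for (const [category, rule] of Object.entries(SIZE_LIMITS)) {
    if (!rule.extensions.includes(ext)) continue;
    if (sizeBytes > rule.maxBytes) {
      return { ok: false, category, reason: "File too large" };
    }
    return { ok: true, category, reason: null };
  }
  return { ok: false, category: null, reason: "Unsupported file type" };
}
```

The size can come from `fs.statSync(path).size`; the helper takes it as a plain number so that it stays free of file-system access.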

MIME Types

Script auto-detects based on extension:

  • Images: image/jpeg, image/png, image/gif, image/webp
  • Audio: audio/mpeg, audio/wav
  • Video: video/mp4, video/quicktime, video/webm
  • Documents: application/pdf, text/plain
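Extension-based detection of this kind can be sketched as a lookup table. This is an illustration only: `detectMimeType` is a hypothetical name, the script's real implementation may differ, and the map covers only the MIME types listed above (so extensions such as flac or avi return null here).

```javascript
// Extension -> MIME type, limited to the MIME types listed above.
const MIME_BY_EXTENSION = new Map([
  ["jpg", "image/jpeg"], ["jpeg", "image/jpeg"], ["png", "image/png"],
  ["gif", "image/gif"], ["webp", "image/webp"],
  ["mp3", "audio/mpeg"], ["wav", "audio/wav"],
  ["mp4", "video/mp4"], ["mov", "video/quicktime"], ["webm", "video/webm"],
  ["pdf", "application/pdf"], ["txt", "text/plain"],
]);

// Hypothetical helper: returns the MIME type for a path, or null if unknown.
function detectMimeType(filePath) {
  const baseName = filePath.split(/[\\/]/).pop();
  const dot = baseName.lastIndexOf(".");
  if (dot <= 0) return null; // no extension, or a dotfile such as ".env"
  const ext = baseName.slice(dot + 1).toLowerCase();
  return MIME_BY_EXTENSION.get(ext) ?? null;
}
```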

File States

| State | Description | Ready for Use |
|---|---|---|
| PROCESSING | File is being analyzed | No |
| ACTIVE | File is ready | Yes |
| FAILED | Processing failed | No |

Output Interpretation

Upload Response

Uploading photo.png...
Uploaded: files/abc123...
URI: gs://generation-tmp/abc123...
State: PROCESSING
  • File name: Use in API calls
  • URI: Internal Google Cloud Storage reference
  • State: PROCESSING = wait, ACTIVE = ready
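When the script is called from another program, the three labelled lines can be pulled out of its standard output. The sketch below assumes the output format is exactly as shown above (`Uploaded:`, `URI:`, `State:` prefixes); `parseUploadOutput` is a hypothetical helper, not part of the skill.

```javascript
// Hypothetical helper: extracts name, URI, and state from the script's stdout.
// Assumes the "Label: value" line format shown in the sample output.
function parseUploadOutput(stdout) {
  const result = { name: null, uri: null, state: null };
  for (const line of stdout.split(/\r?\n/)) {
    const match = line.match(/^(Uploaded|URI|State):\s*(.+)$/);
    if (!match) continue;
    const [, label, value] = match;
    if (label === "Uploaded") result.name = value.trim();
    if (label === "URI") result.uri = value.trim();
    if (label === "State") result.state = value.trim(); // the last State line wins
  }
  return result;
}
```

A caller can then branch on `result.state`: wait or poll while it is PROCESSING, and proceed once it is ACTIVE.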

With --wait Flag

Uploading video.mp4...
Uploaded: files/xyz789...
URI: gs://generation-tmp/xyz789...
State: PROCESSING
Waiting for processing...
Still processing...
File ready!
  • Script polls until state is ACTIVE
  • Use for large files requiring processing
  • May take minutes for videos
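The polling behaviour described here can be sketched as a loop over the file states. This is an assumption about how such a wait could work, not the script's actual code: `waitUntilActive`, the 5-second interval, and the attempt cap are all hypothetical, and the function takes a `getFile` callback so it does not depend on any particular SDK call.

```javascript
// Hypothetical helper: polls getFile(name) until the state is ACTIVE.
// Rejects if the state becomes FAILED or the attempt limit is reached.
async function waitUntilActive(getFile, name, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const file = await getFile(name);
    if (file.state === "ACTIVE") return file;
    if (file.state === "FAILED") {
      throw new Error(`Processing failed for ${name}`);
    }
    if (attempt < maxAttempts) {
      // Still PROCESSING: pause before asking again.
      await new Promise((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  throw new Error(`Timed out waiting for ${name} to become ACTIVE`);
}
```

With the `@google/genai` client, the callback could be `(n) => client.files.get({ name: n })`. The cap on attempts keeps a stuck file from blocking the caller indefinitely.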

Using Uploaded Files

Once uploaded, reference file by name:

# With gemini-text
node skills/gemini-text/scripts/generate.js "Analyze" --image <uploaded-file-path>

Common Issues

"google-genai not installed"

npm install @google/genai@latest dotenv@latest

"File not found"

  • Verify file path is correct
  • Use absolute paths if relative paths fail
  • Check file extension matches supported types

"File too large"

  • Check size limits for file type
  • Compress images/videos if possible
  • Split large files into smaller parts

"Unsupported file type"

  • Check supported extensions
  • Convert to supported format if possible
  • Images: jpg, png, gif, webp
  • Videos: mp4, mov, avi, webm

"Processing failed"

  • Check file is not corrupted
  • Try re-uploading the file
  • Verify file format is valid
  • Check API quota limits

"File still processing" (without --wait)

  • File state is PROCESSING, not ACTIVE
  • Use --wait flag or check status later
  • Large files (especially videos) take time
  • Processing can take minutes to hours

Best Practices

Upload Strategy

  • Use --wait for files you'll use immediately
  • Skip --wait for batch uploads to save time
  • Use descriptive --name for organization
  • Keep track of file names for later use

File Organization

  • Use consistent naming conventions
  • Include dates or versions in names
  • Group related files together
  • Document file names in your code
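One way to keep display names consistent is to generate them from a label, a date, and an optional version. The sketch below is only a suggestion: `buildDisplayName` and the `label_YYYY-MM-DD_vN` pattern are hypothetical conventions, not something the skill prescribes.

```javascript
// Hypothetical helper: builds a display name such as "demo-video_2026-01-29_v2".
function buildDisplayName(label, { date = new Date(), version } = {}) {
  const slug = label
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse spaces and punctuation into hyphens
    .replace(/^-+|-+$/g, "");    // trim leading and trailing hyphens
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  const parts = [slug, day];
  if (version !== undefined) parts.push(`v${version}`);
  return parts.join("_");
}
```

The result can be passed to the upload script through `--name`.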

Performance Tips

  • Upload multiple files in parallel (separate processes)
  • Pre-upload files for batch operations
  • Check file state before using in API calls
  • Delete old files to manage storage
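Parallel uploads are easier to control with a cap on how many run at once. The following sketch shows one way to do that under stated assumptions: `uploadAll` is a hypothetical helper, and `uploadOne` is a caller-supplied async function (for example, one that spawns `node scripts/upload.js` for a single path).

```javascript
// Hypothetical helper: runs uploadOne(path) for every path, at most
// `concurrency` at a time, and records success or failure per file.
async function uploadAll(paths, uploadOne, concurrency = 3) {
  const results = new Array(paths.length);
  let next = 0; // index of the next path to claim

  async function worker() {
    while (next < paths.length) {
      const index = next++;
      const path = paths[index];
      try {
        results[index] = { path, ok: true, value: await uploadOne(path) };
      } catch (error) {
        results[index] = { path, ok: false, error };
      }
    }
  }

  const workerCount = Math.min(concurrency, paths.length);
  await Promise.all(Array.from({ length: workerCount }, () => worker()));
  return results; // same order as the input paths
}
```

Collecting failures instead of rejecting on the first one lets the caller retry only the files that did not upload.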

Error Handling

  • Check return state after upload
  • Retry failed uploads
  • Verify file integrity before upload
  • Log file names for audit trails
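Retrying failed uploads can be sketched as a small wrapper with exponential backoff. This is one possible approach rather than behaviour built into the scripts: `withRetry`, the retry count, and the base delay are hypothetical choices.

```javascript
// Hypothetical helper: runs task() and retries on failure, doubling the
// delay after each failed attempt (baseDelayMs, 2x, 4x, ...).
async function withRetry(task, { retries = 3, baseDelayMs = 1000 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError; // every attempt failed
}
```

Errors that will not go away on retry, such as an unsupported file type, are better detected before the upload than retried.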

Integration with Other Skills

  • gemini-text: Multimodal analysis, document processing
  • gemini-image: Generate images based on uploaded reference
  • gemini-batch: Use uploaded files in batch jobs
  • gemini-embeddings: Create embeddings from file content

File Lifecycle

  • Upload → PROCESSING → ACTIVE → Use in API
  • Delete old files to free storage
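  • Files expire automatically: the Gemini File API deletes them 48 hours after upload
  • Keep local copies of important files; uploaded files cannot be downloaded back from the File API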
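Choosing which files to delete can be sketched as a filter over file metadata. The example assumes each file object carries a `createTime` field holding an ISO 8601 timestamp, which should be verified against the SDK's file type; `pickStaleFiles` is a hypothetical helper.

```javascript
// Hypothetical helper: returns the files created more than maxAgeMs ago.
// Assumes file.createTime is an ISO 8601 timestamp string.
function pickStaleFiles(files, maxAgeMs, now = Date.now()) {
  return files.filter((file) => {
    const created = Date.parse(file.createTime);
    // Skip entries with a missing or unparseable timestamp.
    return !Number.isNaN(created) && now - created > maxAgeMs;
  });
}
```

Each selected file's `name` can then be passed to the delete call of the file management API.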

Related Skills

  • gemini-text: Analyze uploaded files with text generation
  • gemini-image: Create images based on uploaded references
  • gemini-batch: Use uploaded files in batch processing
  • gemini-embeddings: Generate embeddings from file content

Quick Reference

# Basic upload
node scripts/upload.js image.jpg

# With custom name
node scripts/upload.js document.pdf --name "My Document"

# Wait for processing
node scripts/upload.js video.mp4 --wait

# Multiple files
for file in *.jpg; do node scripts/upload.js "$file"; done

File Management API

The scripts do not cover file management, but you can list, inspect, and delete files directly with the JavaScript SDK:

import { GoogleGenAI } from "@google/genai";

const client = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

// List all files (files.list() resolves to an async-iterable pager)
const pager = await client.files.list();
for await (const file of pager) {
  console.log(`${file.name}: ${file.displayName} (${file.state})`);
}

// Get file info
const file = await client.files.get({ name: "files/abc123..." });
console.log(`State: ${file.state}`);

// Delete file
await client.files.delete({ name: "files/abc123..." });
