feishu-doc-reader

Feishu Document Reader

This skill reads and extracts content from Feishu (Lark) documents through the official Feishu Open API.

Configuration

Set Up the Skill

Create the configuration file at ./reference/feishu_config.json with your Feishu app credentials:

{
  "app_id": "your_feishu_app_id_here",
  "app_secret": "your_feishu_app_secret_here"
}

Make sure the scripts are executable:

chmod +x scripts/read_doc.sh
chmod +x scripts/get_blocks.sh

Security Note: Keep the configuration file out of version control and restrict its permissions so that only your user can read it (chmod 600 ./reference/feishu_config.json).
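As a rough illustration of how credentials could be resolved, the sketch below checks environment variables first and falls back to the JSON config file. The variable names FEISHU_APP_ID and FEISHU_APP_SECRET and the function load_credentials are assumptions made for this example; the skill's own scripts may use different names or a different lookup order.

```python
import json
import os
from pathlib import Path


def load_credentials(config_path="./reference/feishu_config.json", env=None):
    """Return (app_id, app_secret) from environment variables or the config file.

    FEISHU_APP_ID / FEISHU_APP_SECRET are assumed names, not confirmed by the skill.
    """
    env = os.environ if env is None else env
    app_id = env.get("FEISHU_APP_ID")
    app_secret = env.get("FEISHU_APP_SECRET")
    if app_id and app_secret:
        return app_id, app_secret

    path = Path(config_path)
    if not path.is_file():
        raise FileNotFoundError(
            f"No credentials in environment and no config file at {path}"
        )

    with path.open(encoding="utf-8") as fh:
        config = json.load(fh)

    # Fail early with a message that names the missing keys, never their values.
    missing = [key for key in ("app_id", "app_secret") if not config.get(key)]
    if missing:
        raise KeyError(f"Config file {path} is missing: {', '.join(missing)}")
    return config["app_id"], config["app_secret"]
```

The error messages mention only key names and paths, so a failed lookup does not leak the secret into logs.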

Usage

Basic Document Reading

To read a Feishu document, you need the document token (found in the URL: https://example.feishu.cn/docx/DOC_TOKEN).
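If you start from a full URL rather than a bare token, a small helper can pull the token out of the path. This is a minimal sketch: the "docx" path segment comes from the URL pattern above, while the other segments in KNOWN_KINDS and the function name extract_doc_token are assumptions for illustration.

```python
from urllib.parse import urlparse

# Path segments that precede a token in Feishu document URLs.
# "docx" comes from the URL pattern above; the others are assumptions.
KNOWN_KINDS = ("docx", "docs", "sheets", "slides")


def extract_doc_token(url):
    """Return (kind, token) parsed from a Feishu document URL."""
    segments = [part for part in urlparse(url).path.split("/") if part]
    for index, segment in enumerate(segments[:-1]):
        if segment in KNOWN_KINDS:
            return segment, segments[index + 1]
    raise ValueError(f"No document token found in URL: {url}")
```

Query strings and fragments are ignored because only the URL path is inspected.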

Using the shell script (recommended):

# Make sure credentials are configured first (config file or environment variables)
./scripts/read_doc.sh "your_doc_token_here"

# Or specify document type explicitly
./scripts/read_doc.sh "docx_token" "doc"
./scripts/read_doc.sh "sheet_token" "sheet"

Get Detailed Document Blocks (NEW)

For complete document structure with all blocks, use the dedicated blocks script:

# Get full document blocks structure
./scripts/get_blocks.sh "docx_AbCdEfGhIjKlMnOpQrStUv"

# Get specific block by ID
./scripts/get_blocks.sh "docx_token" "block_id"

# Get blocks with children
./scripts/get_blocks.sh "docx_token" "" "true"

Using Python directly for blocks:

python scripts/get_feishu_doc_blocks.py --doc-token "your_doc_token_here"
python scripts/get_feishu_doc_blocks.py --doc-token "docx_token" --block-id "block_id"
python scripts/get_feishu_doc_blocks.py --doc-token "docx_token" --include-children
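For readers curious what a blocks request might look like under the hood, here is a hedged sketch of paginated fetching. It assumes the Open API's docx blocks listing endpoint (/open-apis/docx/v1/documents/{document_id}/blocks) and its page_token / has_more pagination fields; iter_blocks and make_fetcher are hypothetical names, and the skill's scripts are not confirmed to work this way.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://open.feishu.cn/open-apis"


def iter_blocks(fetch_page, page_size=500):
    """Yield every block, following page_token until has_more is false.

    fetch_page(page_token, page_size) must return the parsed "data" object
    of one response page.
    """
    page_token = None
    while True:
        data = fetch_page(page_token, page_size)
        yield from data.get("items", [])
        if not data.get("has_more"):
            break
        page_token = data.get("page_token")
        if not page_token:
            break


def make_fetcher(doc_token, access_token):
    """Build a fetch_page callable that queries the (assumed) blocks endpoint."""

    def fetch_page(page_token, page_size):
        params = {"page_size": page_size}
        if page_token:
            params["page_token"] = page_token
        query = urllib.parse.urlencode(params)
        url = f"{API_BASE}/docx/v1/documents/{doc_token}/blocks?{query}"
        request = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {access_token}"}
        )
        with urllib.request.urlopen(request, timeout=30) as response:
            body = json.load(response)
        if body.get("code") != 0:
            raise RuntimeError(f"Feishu API error {body.get('code')}: {body.get('msg')}")
        return body["data"]

    return fetch_page
```

Separating the pagination loop from the HTTP call keeps the loop testable without network access: list(iter_blocks(make_fetcher(doc_token, access_token))) would perform real requests, while a stub fetch_page can stand in during tests.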

Supported Document Types

Docx documents (new Feishu docs): Full content extraction with blocks, metadata, and structure
Doc documents (legacy): Basic metadata and limited content
Sheets: Full spreadsheet data extraction with sheet navigation
Slides: Basic metadata (content extraction requires additional permissions)

Features

Enhanced Content Extraction

Structured output: Clean JSON with document metadata, content blocks, and hierarchy
Complete blocks access: Full access to all document blocks including text, tables, images, headings, lists, etc.
Block hierarchy: Proper parent-child relationships between blocks
Text extraction: Automatic text extraction from complex block structures
Table support: Proper table parsing with row/column structure
Image handling: Image URLs and metadata extraction
Link resolution: Internal and external link extraction
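The block hierarchy mentioned above can be illustrated with a short sketch that nests a flat block list into a tree. It assumes each block carries a block_id and an optional children list of child IDs, which matches the general shape of docx block responses; build_block_tree is a hypothetical helper, not part of the skill.

```python
def build_block_tree(blocks):
    """Nest a flat block list into a tree using each block's "children" IDs."""
    by_id = {block["block_id"]: block for block in blocks}
    child_ids = {cid for block in blocks for cid in block.get("children", [])}

    def expand(block):
        # Copy every field except the ID list, then attach the expanded children.
        node = {key: value for key, value in block.items() if key != "children"}
        node["children"] = [
            expand(by_id[cid]) for cid in block.get("children", []) if cid in by_id
        ]
        return node

    # Roots are blocks that no other block lists as a child.
    return [expand(block) for block in blocks if block["block_id"] not in child_ids]
```

Child IDs that do not appear in the input are skipped, so a partial page of blocks still produces a usable tree.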

Block Types Supported

text: Plain text and rich text content
heading1/2/3: Document headings with proper hierarchy
bullet/ordered: List items with nesting support
table: Complete table structures with cells and formatting
image: Image blocks with tokens and metadata
quote: Block quotes
code: Code blocks with language detection
equation: Mathematical equations
divider: Horizontal dividers
page: Page breaks (in multi-page documents)
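To show how plain text might be recovered from the block types listed above, the following sketch walks each block's payload and joins its text runs. It assumes text-bearing blocks store their content under a type-named key holding an elements list of text_run objects; block_text and document_text are hypothetical helpers, and other element kinds (mentions, equations) are simply ignored here.

```python
def block_text(block):
    """Concatenate the text_run contents of one block, or return "" if it has none."""
    for payload in block.values():
        if isinstance(payload, dict) and isinstance(payload.get("elements"), list):
            return "".join(
                element.get("text_run", {}).get("content", "")
                for element in payload["elements"]
            )
    return ""


def document_text(blocks):
    """Join the non-empty text of all blocks, one block per line."""
    lines = (block_text(block) for block in blocks)
    return "\n".join(line for line in lines if line)
```

Blocks without text, such as dividers or images, contribute nothing to the output.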

Error Handling & Diagnostics

Detailed error messages: Clear explanations for common issues
Permission validation: Checks required permissions before making requests
Token validation: Validates document tokens before processing
Retry logic: Automatic retries for transient network errors
Rate limiting: Handles API rate limits gracefully
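The retry behaviour described above could look roughly like the sketch below, which uses exponential backoff with jitter. This is one common approach and not necessarily what the skill's scripts implement; TransientError and with_retries are hypothetical names introduced for the example.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical marker for errors worth retrying (timeouts, HTTP 429/5xx)."""


def with_retries(func, attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call func(), retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return func()
        except TransientError:
            if attempt == attempts - 1:
                raise  # Out of attempts: surface the last error to the caller.
            delay = base_delay * (2 ** attempt)
            # Jitter spreads out retries from concurrent clients.
            sleep(delay + random.uniform(0, delay / 2))
```

Passing sleep as a parameter lets tests record the delays instead of actually waiting.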

Security Features

Secure credential storage: Supports both environment variables and secure file storage
No credential logging: Credentials never appear in logs or output
Minimal permissions: Uses only required API permissions
Access token caching: Efficient token reuse to minimize API calls
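Access token caching can be sketched as a small wrapper that reuses a token until shortly before it expires. The class TokenCache and the fetch_token callable (assumed to return a token and its lifetime in seconds) are illustrative only; the skill's actual caching mechanism is not documented here.

```python
import time


class TokenCache:
    """Reuse an access token until it is close to expiry."""

    def __init__(self, fetch_token, clock=time.monotonic, safety_margin=60):
        self._fetch_token = fetch_token  # Callable returning (token, ttl_seconds).
        self._clock = clock
        self._safety_margin = safety_margin
        self._token = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            token, ttl_seconds = self._fetch_token()
            self._token = token
            # Refresh slightly early so a token never expires mid-request.
            self._expires_at = now + ttl_seconds - self._safety_margin
        return self._token
```

With a two-hour token lifetime, a long-running process would request a new token roughly once every two hours instead of once per API call.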

Command Line Options

Main Document Reader

# Python script options
python scripts/read_feishu_doc.py --help

# Shell script usage
./scripts/read_doc.sh <doc_token> [doc|sheet|slide]

Blocks Reader (NEW)

# Get full document blocks
./scripts/get_blocks.sh <doc_token>

# Get specific block
./scripts/get_blocks.sh <doc_token> <block_id>

# Include children blocks
./scripts/get_blocks.sh <doc_token> "" true

# Python options
python scripts/get_feishu_doc_blocks.py --help

API Permissions Required

Your Feishu app needs the following permissions:

docx:document:readonly - Read document content
doc:document:readonly - Read legacy document content
sheets:spreadsheet:readonly - Read spreadsheet content

Error Handling

Common errors and solutions:

403 Forbidden: Check app permissions and document sharing settings
404 Not Found: Verify document token is correct and document exists
Token expired: Access tokens are valid for 2 hours; refresh as needed
App ID/Secret invalid: Double-check your credentials in Feishu Open Platform
Insufficient permissions: Ensure your app has the required API permissions
99991663: Application doesn't have permission to access the document
99991664: Document doesn't exist or has been deleted
99991668: Token expired, need to refresh
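The list above can be turned into a simple lookup so that scripts print a hint alongside the raw code. The hints below are copied from this list as written and have not been checked against Feishu's error code reference; ERROR_HINTS and explain_error are hypothetical names.

```python
# Hints copied from the error list above; verify codes against Feishu's documentation.
ERROR_HINTS = {
    403: "Check app permissions and document sharing settings",
    404: "Verify document token is correct and document exists",
    99991663: "Application doesn't have permission to access the document",
    99991664: "Document doesn't exist or has been deleted",
    99991668: "Token expired, need to refresh",
}


def explain_error(code):
    """Return a human-readable hint for an HTTP status or Feishu error code."""
    default = f"Unrecognized error code {code}; see the Feishu Open Platform documentation"
    return ERROR_HINTS.get(code, default)
```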

Examples

Extract document with full structure

# Read document
./scripts/read_doc.sh "docx_AbCdEfGhIjKlMnOpQrStUv"

Get complete document blocks (NEW)

# Get all blocks with full structure
./scripts/get_blocks.sh "docx_AbCdEfGhIjKlMnOpQrStUv"

# Get specific block details
./scripts/get_blocks.sh "docx_AbCdEfGhIjKlMnOpQrStUv" "blk_xxxxxxxxxxxxxx"

Process spreadsheet data

./scripts/read_doc.sh "sheet_XyZ123AbCdEfGhIj" "sheet"

Extract only text content (Python script)

python scripts/read_feishu_doc.py --doc-token "docx_token" --extract-text-only

Security Notes

Never commit credentials: Keep app secrets out of version control
Use minimal permissions: Only request permissions your use case requires
Secure file permissions: Set proper file permissions on secret files (chmod 600)
Environment isolation: Use separate apps for development and production
Audit access: Regularly review which documents your app can access
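The file permission advice above can be checked programmatically. This minimal sketch, assuming a POSIX system, reports whether a secret file is readable only by its owner; is_private is a hypothetical helper name.

```python
import os
import stat


def is_private(path):
    """Return True if group and others have no permission bits on the file (POSIX)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return mode & (stat.S_IRWXG | stat.S_IRWXO) == 0
```

A script could call is_private("./reference/feishu_config.json") at startup and warn when it returns False.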

Troubleshooting

Authentication Issues

Verify your App ID and App Secret in Feishu Open Platform
Ensure the app has been published with required permissions
Check that environment variables or config files are properly set
Test with the test_auth.py script to verify credentials

Document Access Issues

Ensure the document is shared with your app or in an accessible space
Verify the document token format (should start with docx_, doc_, or sheet_)
Check if the document requires additional sharing permissions

Network Issues

Ensure your server can reach open.feishu.cn
Check firewall rules if running in restricted environments
The script includes retry logic for transient network failures

Blocks-Specific Issues

Empty blocks response: Document might be empty or have no accessible blocks
Missing block types: Some block types require additional permissions
Incomplete hierarchy: Use the --include-children flag to get the complete block tree

Repository

zephyrwang6/myskill

GitHub Stars

289

First Seen

Mar 25, 2026
