Excel Spreadsheet Parsing

Parse Excel spreadsheets (.xlsx) into per-sheet CSV files, JSON metadata, and formula artifacts using four extraction methods built on pandas and openpyxl.

Usage

Run the parsing script directly:

./scripts/parse_xlsx.py <path_to_file.xlsx> <output_dir>

Example:

./scripts/parse_xlsx.py ~/data/workbook.xlsx ./parsed/

The script runs four extraction methods, each writing to its own output subdirectory:

  • pandas (basic) - CSV exports + schema
  • pandas (detailed) - Data types + statistics
  • openpyxl (structure) - Workbook structure JSON
  • openpyxl (formulas) - Formula extraction
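
The sketch below illustrates the kind of calls two of these methods could rest on. It is not the code in parse_xlsx.py. The function names export_sheets and extract_formulas and the shape of the returned dictionaries are assumptions made for illustration; only the pandas and openpyxl calls themselves are real library APIs.

```python
from pathlib import Path

import pandas as pd
from openpyxl import load_workbook


def export_sheets(xlsx_path, out_dir):
    """Write each sheet to <out_dir>/<sheet>.csv and return a simple schema.

    Hypothetical helper: the real script's schema format is not documented here.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # sheet_name=None loads every sheet into a {sheet name: DataFrame} dict.
    sheets = pd.read_excel(xlsx_path, sheet_name=None)
    schema = {}
    for name, df in sheets.items():
        df.to_csv(out / f"{name}.csv", index=False)
        schema[name] = {
            "rows": len(df),
            "columns": {str(col): str(dtype) for col, dtype in df.dtypes.items()},
        }
    return schema


def extract_formulas(xlsx_path):
    """Return {sheet: {cell reference: formula text}} for formula cells.

    Hypothetical helper: the layout of formulas.json may differ.
    """
    # data_only=False keeps the formula text instead of the cached result.
    wb = load_workbook(xlsx_path, data_only=False)
    formulas = {}
    for ws in wb.worksheets:
        found = {
            cell.coordinate: cell.value
            for row in ws.iter_rows()
            for cell in row
            if cell.data_type == "f"  # "f" marks a formula cell in openpyxl
        }
        if found:
            formulas[ws.title] = found
    return formulas
```

pandas reads only the values cached in the file, while openpyxl with data_only=False sees the formula text, which is one plausible reason for running both libraries over the same workbook.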

Output Structure

output_dir/
├── file.xlsx/
│   ├── parsing_summary.json
│   ├── pandas_basic/
│   │   ├── Sheet1.csv
│   │   ├── Sheet2.csv
│   │   └── metadata.json
│   ├── pandas_detailed/
│   │   ├── Sheet1.csv
│   │   ├── Sheet2.csv
│   │   ├── statistics.json
│   │   └── data_types.json
│   ├── openpyxl_structure/
│   │   └── workbook.json
│   └── openpyxl_formulas/
│       └── formulas.json
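
To check a run against this layout, the expected paths can be computed up front. This is a minimal sketch that assumes the top-level folder is named after the input file (as "file.xlsx/" in the tree suggests) and that CSV files are named after their sheets; expected_outputs is a hypothetical helper, not part of the script.

```python
from pathlib import Path


def expected_outputs(xlsx_path, output_dir, sheet_names):
    """List the files the tree above implies for one workbook (assumed layout)."""
    # Assumption: the per-workbook folder reuses the input file name.
    root = Path(output_dir) / Path(xlsx_path).name
    paths = [root / "parsing_summary.json"]
    # Both pandas methods write one CSV per sheet.
    for method in ("pandas_basic", "pandas_detailed"):
        paths += [root / method / f"{sheet}.csv" for sheet in sheet_names]
    paths += [
        root / "pandas_basic" / "metadata.json",
        root / "pandas_detailed" / "statistics.json",
        root / "pandas_detailed" / "data_types.json",
        root / "openpyxl_structure" / "workbook.json",
        root / "openpyxl_formulas" / "formulas.json",
    ]
    return paths


if __name__ == "__main__":
    outputs = expected_outputs("~/data/workbook.xlsx", "./parsed/", ["Sheet1", "Sheet2"])
    missing = [p for p in outputs if not p.exists()]
    print(f"{len(missing)} of {len(outputs)} expected files are missing")
```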

Script Features

  • Exports each sheet as a separate CSV file
  • Captures data types and statistics
  • Extracts formulas with cell references
  • Handles multiple sheets per workbook
  • Provides schema information for each sheet