digitaliza-data-extractor
Digitaliza Data Extractor
Extract client data from folders to generate digital business cards for digitalizaweb.vercel.app.
Workflow
Client Folder → Extract Data → Extract Colors → Generate JSON → Review
Step 1: Scan Client Folder
ls -la <client_folder>/
Expected files:
- datos_extraier.md: Scraped HTML from LinkTree/profile
- *.png / *.jpg: Screenshots, logo images
- logo.*: Brand logo
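Before running the extractor, it can help to confirm which of the expected files a folder actually contains. The following is a minimal sketch, not part of the skill's scripts; the function name `scan_client_folder` and the shape of its return value are assumptions made for illustration.

```python
from pathlib import Path

# Image extensions named in the expected-files list above.
IMAGE_SUFFIXES = {".png", ".jpg"}


def scan_client_folder(folder):
    """Report which of the expected inputs are present in a client folder."""
    root = Path(folder)
    files = [p for p in root.iterdir() if p.is_file()]
    return {
        # The scraped LinkTree/profile HTML, under the filename the skill expects.
        "scraped_html": [p.name for p in files if p.name == "datos_extraier.md"],
        # Screenshots and logo images.
        "images": sorted(p.name for p in files if p.suffix.lower() in IMAGE_SUFFIXES),
        # Any file matching logo.* regardless of extension.
        "logo": sorted(p.name for p in files if p.stem.lower() == "logo"),
    }
```

An empty list under any key is a hint that the folder needs manual attention before extraction.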
Step 2: Extract Data
Run extraction:
python scripts/extract_client_data.py <client_folder> --pretty
Batch all clients:
python scripts/extract_client_data.py <base_folder> --scan-all --output all_clients.json
Extracts: business name, WhatsApp, links (with icons), locations.
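The extracted WhatsApp number ends up both as a bare digit string and inside a wa.me link. How extract_client_data.py normalizes numbers is not documented here, so the helpers below are a hypothetical sketch of one reasonable approach: strip everything that is not a digit.

```python
import re


def normalize_whatsapp(raw):
    """Keep only the digits of a phone number (drops '+', spaces, dashes)."""
    return re.sub(r"\D", "", raw)


def whatsapp_link(raw):
    """Build a wa.me link from a raw phone number string."""
    return f"https://wa.me/{normalize_whatsapp(raw)}"
```

For example, `whatsapp_link("+51 014-711-000")` yields `https://wa.me/51014711000`.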
Step 3: Extract Brand Colors
python scripts/extract_colors.py <image_path> --num-colors 5 --output json
Returns: customPrimaryColor, customSecondaryColor, customAccentColor, suggestedTheme.
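The internals of extract_colors.py are not shown here. As a rough illustration of how dominant colors can be turned into these fields, the sketch below buckets RGB pixels into a coarse grid and keeps the most frequent buckets. The function names, the `step` parameter, and the bucketing technique are all assumptions; decoding the image into pixel tuples and deriving suggestedTheme are left out.

```python
from collections import Counter


def to_hex(rgb):
    """Format an (r, g, b) tuple as a lowercase hex color string."""
    return "#{:02x}{:02x}{:02x}".format(*rgb)


def dominant_colors(pixels, num_colors=5, step=32):
    """Return the most frequent coarse color buckets as hex strings.

    Each channel is rounded down to a multiple of `step`, so near-identical
    shades are counted together.
    """
    buckets = Counter()
    for r, g, b in pixels:
        buckets[(r // step * step, g // step * step, b // step * step)] += 1
    return [to_hex(rgb) for rgb, _ in buckets.most_common(num_colors)]


def to_color_fields(colors):
    """Map the first three colors onto the Digitaliza color field names."""
    keys = ["customPrimaryColor", "customSecondaryColor", "customAccentColor"]
    return dict(zip(keys, colors))
```

Rounding down to the bucket corner shifts colors slightly darker, so a real implementation would more likely average the pixels in each bucket or use a clustering method.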
Step 4: Generate Final JSON
Combine into Digitaliza format:
{
  "slug": "doomo-saltado",
  "name": "Doomo Saltado",
  "phone": "+51014711000",
  "whatsapp": "51014711000",
  "address": "Local en Surco, Lima",
  "logoUrl": "logo.png",
  "theme": "custom",
  "customPrimaryColor": "#dc2626",
  "customSecondaryColor": "#b91c1c",
  "backgroundStyle": "mesh",
  "links": [
    {"title": "Reservar", "url": "https://wa.me/51014711000", "icon": "whatsapp", "order": 0, "isActive": true},
    {"title": "Instagram", "url": "https://instagram.com/doomo", "icon": "instagram", "order": 1, "isActive": true}
  ]
}
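The combining step can be sketched as a small function that merges the extracted data with the color fields. This is an illustration only: `build_card`, `slugify`, `card_to_json`, and the assumed shape of the `data` dict are hypothetical, and the fixed `"theme": "custom"` and `"backgroundStyle": "mesh"` values simply mirror the example above.

```python
import json
import re
import unicodedata


def slugify(name):
    """Lowercase, strip accents, and join words with hyphens."""
    ascii_name = unicodedata.normalize("NFKD", name).encode("ascii", "ignore").decode("ascii")
    return re.sub(r"[^a-z0-9]+", "-", ascii_name.lower()).strip("-")


def build_card(data, colors):
    """Merge extracted client data and color fields into one card dict.

    `data` is assumed to hold name, phone, whatsapp, address, logoUrl and a
    list of links, each with title, url and icon.
    """
    return {
        "slug": slugify(data["name"]),
        "name": data["name"],
        "phone": data["phone"],
        "whatsapp": data["whatsapp"],
        "address": data["address"],
        "logoUrl": data["logoUrl"],
        "theme": "custom",
        **colors,
        "backgroundStyle": "mesh",
        # Number the links in the order they were extracted and mark them active.
        "links": [
            {**link, "order": i, "isActive": True}
            for i, link in enumerate(data["links"])
        ],
    }


def card_to_json(card):
    """Serialize a card for review, keeping accented characters readable."""
    return json.dumps(card, indent=2, ensure_ascii=False)
```

Check the result against references/digitaliza_schema.md, since that file, not this sketch, defines the fields.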
Manual Completion
Verify after extraction:
| Field | Source |
|---|---|
| name | Screenshots or website |
| whatsapp | Country code + number |
| address | Google Maps or screenshot |
| description | 1-2 sentences about business |
| theme | general, italian, mexican, japanese, coffee, hamburguesa, barber, spa, salon |
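Part of this review can be automated. The sketch below flags empty fields and unknown themes in a generated card; it is not part of the skill's scripts. The name `review_card`, the decision to treat all five fields as required, and the digits-only check on whatsapp are assumptions, and `"custom"` is accepted as a theme because the example JSON uses it.

```python
# Preset themes listed in the table above, plus "custom" from the example JSON.
KNOWN_THEMES = {
    "general", "italian", "mexican", "japanese", "coffee",
    "hamburguesa", "barber", "spa", "salon", "custom",
}
FIELDS_TO_VERIFY = ["name", "whatsapp", "address", "description", "theme"]


def review_card(card):
    """Return a list of human-readable issues found in a card dict."""
    issues = [f"missing {field}" for field in FIELDS_TO_VERIFY if not card.get(field)]
    whatsapp = card.get("whatsapp", "")
    if whatsapp and not whatsapp.isdigit():
        issues.append("whatsapp should be country code + number, digits only")
    theme = card.get("theme")
    if theme and theme not in KNOWN_THEMES:
        issues.append(f"unknown theme: {theme}")
    return issues
```

An empty list means nothing was flagged automatically; the values still need a human check against the listed sources.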
Link Icons
| Service | Icon |
|---|---|
| WhatsApp | whatsapp |
| Instagram | instagram |
| Facebook | facebook |
| TikTok | tiktok |
| Google Maps | location |
| UberEats | ubereats |
| Rappi | rappi |
| Menu/Carta | menu |
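An icon can often be chosen from the link's domain. The sketch below is one hypothetical way to do that; the function `icon_for_url` and the domain list are assumptions (only wa.me and instagram.com appear in the example JSON), and the extraction script may detect icons differently. Menu links have no fixed domain, so they are left for manual assignment.

```python
from urllib.parse import urlparse

# Assumed domain-to-icon mapping; extend it as new services appear.
ICON_BY_DOMAIN = {
    "wa.me": "whatsapp",
    "instagram.com": "instagram",
    "facebook.com": "facebook",
    "tiktok.com": "tiktok",
    "maps.google.com": "location",
    "maps.app.goo.gl": "location",
    "ubereats.com": "ubereats",
    "rappi.com": "rappi",
}


def icon_for_url(url):
    """Return the icon name for a link URL, or None if the domain is unknown."""
    host = urlparse(url).netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    for domain, icon in ICON_BY_DOMAIN.items():
        # Match the domain itself or any subdomain of it.
        if host == domain or host.endswith("." + domain):
            return icon
    return None
```

Links that return `None` are candidates for the menu icon or for manual review.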
Batch Checklist
- Scan all: python scripts/extract_client_data.py . --scan-all
- Review extraction notes
- Extract colors for folders with logos
- Flag incomplete data for manual review
Schema Reference
See references/digitaliza_schema.md for complete field definitions.