Programmatic EDA
Quick Start
Execute systematic data quality checks, distribution analysis, and correlation detection on any dataset with automated sanity checks.
Context Requirements
Before starting EDA, Claude needs:
- Dataset Access: The data file or database connection
- Business Context: What this data represents and what decisions it informs
- Quality Thresholds (optional): What % missing/outliers are acceptable
Context Gathering
If the dataset is not yet loaded:
"Please provide your dataset. I can work with:
- CSV/Excel files (upload or provide path)
- Database connection details
- Pandas DataFrame (if already loaded in notebook)"
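The file and DataFrame options above could be handled by a small loader along these lines. This is a minimal sketch, not the skill's actual implementation: the function name `load_dataset` is hypothetical, file type is inferred from the extension only, and database connections are left out (they would typically go through `pandas.read_sql` with a connection the user supplies).

```python
from pathlib import Path

import pandas as pd


def load_dataset(source):
    """Return a DataFrame from an in-memory DataFrame or a CSV/Excel path.

    Hypothetical helper for illustration; database sources are not handled here.
    """
    # Already-loaded notebook data: work on a copy so the caller's frame is untouched.
    if isinstance(source, pd.DataFrame):
        return source.copy()

    path = Path(source)
    suffix = path.suffix.lower()
    if suffix == ".csv":
        return pd.read_csv(path)
    if suffix in {".xlsx", ".xls"}:
        # Reading Excel requires an engine such as openpyxl to be installed.
        return pd.read_excel(path)
    raise ValueError(f"Unsupported file type: {suffix!r}")
```

Rejecting unknown extensions with an explicit error keeps the failure visible, so the user can be asked for a supported format instead of getting a silently mis-parsed file.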
If business context is missing:
"To provide relevant insights, I need to understand:
- What does this dataset represent? (customers, transactions, events, etc.)
- What business question are you trying to answer?
- What time period does this cover?
- Are there any known data quality issues I should be aware of?"
For quality thresholds (if not provided, use defaults):
"I'll use standard thresholds unless you specify otherwise:
- Missing values: Flag a column if >5% of its values are missing (warn if >30%)
- Outliers: Flag values more than 1.5 × IQR below Q1 or above Q3 (IQR method)
- Duplicates: Flag if >1% of rows are duplicates
Do these work for your use case, or should I adjust?"
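The default thresholds above can be sketched as a single check over a pandas DataFrame. The names `DEFAULTS` and `quality_report` are hypothetical, and a few choices here are assumptions rather than anything the skill specifies: missing values are measured per column, "flag" and "warn" are treated as two severity labels, outliers are counted only for numeric columns, and duplicates means fully identical rows.

```python
import pandas as pd

# Default thresholds from the prompt above, expressed as fractions.
DEFAULTS = {"missing_flag": 0.05, "missing_warn": 0.30, "iqr_k": 1.5, "dup_flag": 0.01}


def quality_report(df, thresholds=None):
    """Summarize missing values, IQR outliers, and duplicate rows (illustrative sketch)."""
    t = {**DEFAULTS, **(thresholds or {})}
    report = {"missing": {}, "outliers": {}, "duplicates": {}}

    # Missing values: fraction of nulls per column, labeled by severity.
    for col, frac in df.isna().mean().items():
        if frac > t["missing_warn"]:
            report["missing"][col] = ("warn", float(frac))
        elif frac > t["missing_flag"]:
            report["missing"][col] = ("flag", float(frac))

    # Outliers: count values outside [Q1 - k*IQR, Q3 + k*IQR] in numeric columns.
    for col in df.select_dtypes(include="number").columns:
        s = df[col].dropna()
        if s.empty:
            continue
        q1, q3 = s.quantile(0.25), s.quantile(0.75)
        iqr = q3 - q1
        mask = (s < q1 - t["iqr_k"] * iqr) | (s > q3 + t["iqr_k"] * iqr)
        n_outliers = int(mask.sum())
        if n_outliers:
            report["outliers"][col] = n_outliers

    # Duplicates: fraction of rows that repeat an earlier row exactly.
    dup_frac = float(df.duplicated().mean()) if len(df) else 0.0
    report["duplicates"] = {"fraction": dup_frac, "flag": dup_frac > t["dup_flag"]}
    return report
```

Passing a partial dictionary such as `quality_report(df, {"missing_flag": 0.10})` overrides only that threshold and keeps the other defaults, which matches the "adjust if needed" question in the prompt.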
Workflow
1. Data Loading & Overview
Repository: nimrodfisher/da…s-skills