Data Cleaning Skill

Overview

Master data cleaning and preprocessing techniques essential for reliable analytics.

Topics Covered

  • Missing value handling (imputation, deletion)
  • Outlier detection and treatment
  • Data type conversion and validation
  • Duplicate identification and removal
  • String cleaning and normalization
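
As a rough illustration of the last three topics, the following sketch normalizes strings, converts a numeric column, and drops duplicates using only the Python standard library. The helper names (`normalize_text`, `to_float`, `drop_duplicates`) and the sample rows are hypothetical and exist only for this example; a real project would more likely use a dataframe library.

```python
import unicodedata
from typing import Optional


def normalize_text(value: str) -> str:
    """Unicode-normalize, trim, collapse internal whitespace, and casefold."""
    text = unicodedata.normalize("NFKC", value)
    return " ".join(text.split()).casefold()


def to_float(value: str) -> Optional[float]:
    """Convert a string to float; return None when it is blank or not numeric."""
    try:
        # Assumption: commas are thousands separators, as in "1,200.50".
        return float(value.replace(",", "").strip())
    except ValueError:
        return None


def drop_duplicates(rows, key_fields):
    """Keep the first row for each normalized key; later matches are dropped."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(normalize_text(row[field]) for field in key_fields)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample data: two spellings of the same person and one bad amount.
raw_rows = [
    {"name": "  Alice  Smith ", "amount": "1,200.50"},
    {"name": "alice smith", "amount": "1200.5"},
    {"name": "Bob", "amount": "n/a"},
]

cleaned = [
    {"name": normalize_text(row["name"]), "amount": to_float(row["amount"])}
    for row in drop_duplicates(raw_rows, ["name"])
]
print(cleaned)
```

Duplicates are compared on the normalized key rather than the raw string, so rows that differ only in case or spacing collapse to one. Unparseable amounts become `None` instead of raising, which leaves them available for the missing-value step.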

Learning Outcomes

  • Clean messy datasets
  • Handle missing data appropriately
  • Detect and treat outliers
  • Verify data quality with explicit checks on types, ranges, and uniqueness
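
One possible way to practice the missing-data and outlier outcomes is sketched below with the standard `statistics` module. Median imputation and the 1.5 × IQR rule are common defaults chosen here for illustration, not methods prescribed by this skill; the function names and the `ages` sample are hypothetical.

```python
import statistics


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def clip(values, low, high):
    """Treat outliers by capping them at the fences instead of deleting rows."""
    return [min(max(v, low), high) for v in values]


# Hypothetical sample: two missing ages and one implausible entry.
ages = [34, 29, None, 41, 38, 250, 31, None, 36]

filled = impute_median(ages)
low, high = iqr_bounds(filled)
outliers = [v for v in filled if v < low or v > high]
treated = clip(filled, low, high)

print(filled)
print(outliers)
print(treated)
```

The median is used because it is barely affected by the extreme value, whereas a mean would be pulled toward it. Whether to cap, remove, or keep an outlier depends on the dataset; capping is shown only as one option.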

Error Handling

| Error Type | Cause | Recovery |
| --- | --- | --- |
| Memory error | Dataset too large | Use chunking or sampling |
| Type conversion failed | Invalid data format | Apply preprocessing first |
| Encoding issues | Wrong character encoding | Detect and specify encoding |
| Validation failure | Data doesn't meet schema | Review and adjust validation rules |
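
The first three recoveries in the table could look roughly like the sketch below, which uses only the standard library. Trying a short list of candidate encodings is a simple stand-in for real encoding detection, and the helper names, the candidate list, the chunk size, and the in-memory sample are all assumptions made for this example.

```python
import csv
import io
from itertools import islice


def decode_with_fallback(raw, encodings=("utf-8", "cp1252")):
    """Try each candidate encoding in order; return (text, encoding_used)."""
    for encoding in encodings:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # latin-1 maps every byte to a character, so this last resort cannot fail.
    return raw.decode("latin-1"), "latin-1"


def iter_chunks(rows, size):
    """Yield lists of at most `size` rows so only one chunk is held in memory."""
    while True:
        chunk = list(islice(rows, size))
        if not chunk:
            return
        yield chunk


# Hypothetical file contents: cp1252 bytes with a currency symbol and a bad value.
raw = "id,price\n1,9.99\n2,\u20ac12.50\n3,abc\n4,7\n".encode("cp1252")

text, used = decode_with_fallback(raw)
reader = csv.DictReader(io.StringIO(text))

total = 0.0
rejected = []
for chunk in iter_chunks(reader, size=2):
    for row in chunk:
        # Preprocess before converting: trim and strip leading currency symbols.
        candidate = row["price"].strip().lstrip("\u20ac$")
        try:
            total += float(candidate)
        except ValueError:
            rejected.append(row["id"])  # keep the id so the row can be reviewed

print(used, round(total, 2), rejected)
```

With a real file, the same loop would wrap a reader over a file opened with the detected encoding, so rows stream from disk rather than being loaded at once. Collecting rejected ids instead of raising lets the run finish and leaves the bad rows for later review.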

Related Skills

  • programming (for automation)
  • foundations (for data quality concepts)
  • databases-sql (for SQL-based cleaning)