clean-data-xls
Clean Data
Clean messy data in the active sheet or a specified range.
Preflight: Dependency Check
Before starting, verify required libraries are installed and install any that are missing.
python3 -c "import openpyxl" 2>/dev/null || python3 -m pip install openpyxl
Important: Do not skip this step — the workflow below will fail without these libraries.
Environment
- If running inside Excel (Office Add-in / Office JS): Use Office JS directly. Read via
range.values, write helper-column formulas viarange.formulas = [["=TRIM(A2)"]]. The in-place vs helper-column decision still applies. - If operating on a standalone
.xlsxfile: Use Python andopenpyxl.
Workflow
Step 1: Scope
- If a range is given, such as
A1:F200, use it. - Otherwise use the full used range of the active sheet.
- Profile each column: detect its dominant type, text vs number vs date, and identify outliers.
Step 2: Detect issues
| Issue | What to look for |
|---|---|
| Whitespace | Leading/trailing spaces, double spaces |
| Casing | Inconsistent casing in categorical columns like usa, USA, Usa |
| Number-as-text | Numeric values stored as text; stray $, ,, % in number cells |
| Dates | Mixed formats in the same column like 3/8/26, 2026-03-08, March 8 2026 |
| Duplicates | Exact-duplicate rows and near-duplicates caused by case or whitespace differences |
| Blanks | Empty cells in otherwise-populated columns |
| Mixed types | A column that is mostly numbers but has a few text entries |
| Encoding | Mojibake, non-printing characters |
| Errors | #REF!, #N/A, #VALUE!, #DIV/0! |
Step 3: Propose fixes
Show a summary table before changing anything:
| Column | Issue | Count | Proposed Fix |
|---|
Step 4: Apply
- Prefer formulas over hardcoded cleaned values. Where the cleaned output can be expressed as a formula, such as
=TRIM(A2),=VALUE(SUBSTITUTE(B2,"$","")),=UPPER(C2), or=DATEVALUE(D2), write the formula in an adjacent helper column rather than computing the result in Python and overwriting the original. - Only overwrite in place with computed values when the user explicitly asks for it, or when no sensible formula equivalent exists, such as encoding or mojibake repair.
- For destructive operations like removing duplicates, filling blanks, or overwriting originals, confirm with the user first.
- After each category of fix, whitespace, casing, number conversion, dates, dedup, show a sample of what changed and get confirmation before moving to the next category.
- Report a before/after summary of what changed.
More from fivetaku/claude-office-skills
audit-xls
Audit a spreadsheet for formula accuracy, errors, and common mistakes. Scopes to a selected range, a single sheet, or the entire model, including financial-model integrity checks like BS balance, cash tie-out, and logic sanity. Triggers on "audit this sheet", "check my formulas", "find formula errors", "QA this spreadsheet", "sanity check this", "debug model", "model check", "model won't balance", "something's off in my model", and "model review".
3ib-check-deck
Investment banking presentation quality checker. Reviews a pitch deck or client-ready presentation for (1) number consistency across slides, (2) data-narrative alignment, (3) language polish against IB standards, (4) visual and formatting QC. Use whenever the user asks to review, check, QC, proof, or do a final pass on a deck, pitch, or client materials — including requests like "check my numbers", "reconcile figures across slides", "is this client-ready", or "what am I missing before I send this out".
3lbo-model
This skill should be used when completing LBO (Leveraged Buyout) model templates in Excel for private equity transactions, deal materials, or investment committee presentations. The skill fills in formulas, validates calculations, and ensures professional formatting standards that adapt to any template structure.
3dcf-model
Real DCF (Discounted Cash Flow) model creation for equity valuation. Retrieves financial data from SEC filings and analyst reports, builds comprehensive cash flow projections with proper WACC calculations, performs sensitivity analysis, and outputs professional Excel models with executive summaries. Use when users need to value a company using DCF methodology, request intrinsic value analysis, or ask for detailed financial modeling with growth projections and terminal value calculations.
3deck-refresh
Updates a presentation with new numbers — quarterly refreshes, earnings updates, comp rolls, rebased market data. Use whenever the user asks to "update the deck with Q4 numbers", "refresh the comps", "roll this forward", "swap in the new earnings", "change all the $485M to $512M", or any request to swap figures across an existing deck without rebuilding it.
33-statement-model
Complete, populate and fill out 3-statement financial model templates (Income Statement, Balance Sheet, Cash Flow Statement). Use when asked to fill out model templates, complete existing model frameworks, populate financial models with data, complete a partially filled IS/BS/CF framework, or link integrated financial statements within an existing template structure. Triggers include requests to fill in, complete, or populate a 3-statement model template
3