cohort-analysis
When to use
- A stakeholder asks "are we retaining users better than last quarter?"
- You need to measure N-day, weekly, or monthly retention for a product or feature
- You want to compare how different acquisition cohorts (by channel, plan, or signup date) perform over their lifetime
- You're investigating churn and need to identify at which period users typically leave
Process
- Define the cohort and activity: clarify the cohort grouping (signup month, first purchase date, etc.) and the retention event (login, purchase, feature use). Document both in the report header.
- Pull or build the data: if starting from a database, use `scripts/cohort_query.sql` as the starting point. Adapt the `cohort_date` and `activity_date` columns to your schema.
- Build the cohort table: run `scripts/cohort_builder.py` to produce a cohort × period membership table from event data. Output is a CSV with `user_id`, `cohort_period`, `activity_period`.
- Compute the retention matrix: run `scripts/retention_matrix.py` on the cohort table to generate the period-over-period retention rates. Output is an N×M matrix (cohort × period).
- Visualise: run `scripts/cohort_visualizer.py` to render a heatmap of the retention matrix and a time-series of retention curves per cohort.
- Interpret findings: consult `references/retention_metrics_glossary.md` for metric definitions and `references/cohort_definition_patterns.md` for pattern recognition.
- Write the report: fill `assets/cohort_report_template.md`. For a visual deliverable, fill in the `assets/retention_matrix.html` heatmap template.
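To make the cohort table and retention matrix steps concrete, here is a minimal pandas sketch of the underlying calculation. It is not the contents of `scripts/cohort_builder.py` or `scripts/retention_matrix.py`, which are not shown here. The function name `build_retention_matrix`, the monthly granularity, and the toy events are assumptions for illustration. It also assumes every user has at least one event in their cohort month (for example the signup itself), so cohort size can be counted from the event data.

```python
import pandas as pd


def build_retention_matrix(events: pd.DataFrame) -> pd.DataFrame:
    """Return a cohort x period matrix of retention rates (monthly buckets).

    Hypothetical helper: expects columns user_id, cohort_date, activity_date.
    """
    df = events.copy()
    # Bucket both dates into calendar months.
    cohort = pd.to_datetime(df["cohort_date"]).dt.to_period("M")
    activity = pd.to_datetime(df["activity_date"]).dt.to_period("M")
    df["cohort_period"] = cohort.astype(str)
    # Whole months elapsed between the cohort month and the activity month.
    df["period_index"] = (activity.dt.year - cohort.dt.year) * 12 + (
        activity.dt.month - cohort.dt.month
    )
    # Drop activity recorded before the cohort date (usually a data issue).
    df = df[df["period_index"] >= 0]

    # Distinct active users per cohort and period.
    active = (
        df.groupby(["cohort_period", "period_index"])["user_id"]
        .nunique()
        .unstack("period_index")
    )
    # Cohort size: distinct users ever seen in each cohort.
    cohort_size = df.groupby("cohort_period")["user_id"].nunique()
    # Retention rate = active users / cohort size, row by row.
    return active.div(cohort_size, axis=0)


# Toy events, made up for illustration only.
events = pd.DataFrame(
    {
        "user_id": [1, 1, 2, 3, 3],
        "cohort_date": ["2024-01-05", "2024-01-05", "2024-01-20", "2024-02-03", "2024-02-03"],
        "activity_date": ["2024-01-05", "2024-02-10", "2024-01-20", "2024-02-03", "2024-03-01"],
    }
)
matrix = build_retention_matrix(events)
print(matrix)
```

Cells with no recorded activity come out as `NaN`. Whether to treat them as 0% retention or as periods not yet observed depends on how recent the cohort is, so the sketch leaves that choice to the caller.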
Inputs the skill needs
- Required: event data with `user_id`, `cohort_date` (e.g. `signup_date`), `activity_date`
- Required: cohort grouping granularity (daily / weekly / monthly)
- Required: retention event definition — what counts as "active" or "retained"?
- Optional: minimum cohort size (recommend ≥ 100 users; smaller cohorts have noisy rates)
- Optional: number of periods to track (e.g. 12 months)
- Optional: cohort attributes to segment by (acquisition channel, plan tier, geography)
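The minimum cohort size recommendation can be illustrated with a rough margin-of-error calculation. The sketch below uses the normal approximation to a binomial proportion, which is a standard textbook formula rather than anything the scripts are known to compute. The helper name `retention_margin` and the 40% retention figure are assumptions chosen for illustration.

```python
import math


def retention_margin(retained: int, cohort_size: int, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a retention rate.

    Hypothetical helper using the normal approximation: z * sqrt(p * (1 - p) / n).
    """
    if cohort_size <= 0:
        raise ValueError("cohort_size must be positive")
    p = retained / cohort_size
    return z * math.sqrt(p * (1 - p) / cohort_size)


# Same observed retention (40%) at three cohort sizes.
for n in (25, 100, 1000):
    margin = retention_margin(retained=n * 2 // 5, cohort_size=n)
    print(f"cohort of {n}: 40% retained, margin of error {margin:.1%}")
```

Under these assumptions a 40% rate measured on 100 users carries a margin of roughly ±10 percentage points, and the margin widens quickly below that size. This is why differences between small cohorts are hard to tell apart from noise.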
Output
- `assets/cohort_report_template.md` (filled): narrative interpretation and retention figures
- `assets/retention_matrix.html` (filled): colour-coded retention heatmap
- `scripts/retention_matrix.py` output CSV: raw retention rates for downstream use
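One possible downstream use of the raw retention rates is locating the period where a cohort loses the most users. The sketch below assumes a single cohort's retention curve has already been read into a list of rates, with period 0 first. The function name `steepest_drop` and the example values are hypothetical and do not come from any real dataset.

```python
def steepest_drop(curve: list[float]) -> tuple[int, float]:
    """Return (period, drop) for the largest period-over-period fall in retention.

    The returned period is the one users failed to reach, so a result of
    (1, 0.45) means 45 points of retention were lost between period 0 and 1.
    """
    if len(curve) < 2:
        raise ValueError("need at least two periods")
    # Drop between each period and the one before it.
    drops = [curve[i - 1] - curve[i] for i in range(1, len(curve))]
    worst = max(range(len(drops)), key=drops.__getitem__)
    return worst + 1, drops[worst]


# Illustrative retention curve for one cohort (made-up values).
example_curve = [1.0, 0.55, 0.45, 0.40, 0.38]
period, drop = steepest_drop(example_curve)
print(f"largest drop is {drop:.0%} going into period {period}")
```

Running the same check per cohort shows whether the steepest drop always falls in the same period or shifts between cohorts.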