data-analytics-reporter
Data Analytics & Reporting Guide
Overview
This guide covers the process of transforming raw data into actionable business insights: from data quality validation through statistical analysis, dashboard creation, and strategic reporting. It includes SQL patterns, Python analysis code, and a worked report example.
Critical Rules
- Validate data accuracy and completeness before any analysis.
- Document data sources, transformations, and assumptions.
- Include statistical significance testing and confidence levels for every conclusion; label any claim that lacks significance testing as a directional observation, not a conclusion.
- Connect every analysis to business outcomes and actionable recommendations.
- Design dashboards for specific stakeholder needs and decision contexts.
- Name every data source with its query date range, row count, and completeness percentage.
- Dashboards should include a "last refreshed" timestamp, data freshness SLA, and a link to the underlying query for each metric.
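The data-source documentation these rules require can be captured in a small, reusable record. A minimal sketch (the `DataSource` class and field names here are illustrative, not part of any script in this repo):

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class DataSource:
    """Documents one data source: name, query date range, row count, completeness."""
    name: str                 # e.g. a warehouse table or API endpoint
    query_start: date
    query_end: date
    row_count: int
    completeness_pct: float   # share of non-null required fields, 0-100

    def citation(self) -> str:
        """One-line citation suitable for a report footnote or dashboard tooltip."""
        return (f"{self.name}, {self.query_start} - {self.query_end}, "
                f"{self.row_count:,} rows, {self.completeness_pct:.1f}% complete")

orders = DataSource("analytics.orders", date(2026, 1, 1), date(2026, 3, 31),
                    142_300, 99.7)
print(orders.citation())
# → analytics.orders, 2026-01-01 - 2026-03-31, 142,300 rows, 99.7% complete
```

Emitting the citation alongside every chart makes the "name every data source" rule mechanical rather than a manual checklist item.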
Workflow
- Data Discovery -- Assess data quality, identify key metrics and stakeholder requirements, establish significance thresholds.
- Analysis -- Build reproducible pipelines, apply appropriate statistical methods, calculate confidence intervals.
- Visualization & Reporting -- Create interactive dashboards with drill-down, write executive summaries with findings and recommendations.
- Impact Measurement -- Track recommendation implementation, measure business outcome correlation, iterate.
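As a concrete example of the significance testing the Analysis step calls for, here is a minimal two-proportion z-test in plain Python (stdlib only, normal approximation; the conversion counts are made up for illustration):

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.
    Returns (difference, z statistic, p-value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)          # pooled rate under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal survival function.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_b - p_a, z, p_value

diff, z, p = two_proportion_ztest(conv_a=480, n_a=10_000, conv_b=550, n_b=10_000)
print(f"lift={diff:.4f}, z={z:.2f}, p={p:.4f}")
# Per the critical rules: if p >= 0.05, report the lift as a
# directional observation, not a conclusion.
```

For small samples or rates near 0/1, swap the normal approximation for an exact test (e.g. Fisher's) before drawing conclusions.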
See SQL Patterns for executive dashboard and marketing attribution queries.
See Report Example for a full Q1 2026 worked report and Python RFM segmentation code.
Report Structure
Report Quality Checklist
- All statistical claims include confidence intervals at a minimum 95% confidence level.
- Every data source is named with its query date range, row count, and completeness percentage (e.g., "Snowflake analytics.orders, Jan 1 - Mar 31 2026, 142,300 rows, 99.7% complete").
- Recommendations include projected ROI with explicit assumptions stated (e.g., "Assumes 12% reactivation rate based on historical win-back campaign performance of 7-15%").
- Reports are reproducible: all SQL queries and Python scripts are included and can be re-run against the documented data sources to regenerate all figures and tables.
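A 95% confidence interval of the kind the first checklist item requires can be computed with the standard library alone. A sketch using the normal approximation (the sample values are placeholders; in a real report they would come from the documented data source):

```python
import statistics

def mean_ci_95(values):
    """95% CI for the mean via the normal approximation (reasonable for n >~ 30)."""
    n = len(values)
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / n ** 0.5   # standard error of the mean
    margin = 1.96 * sem                          # 1.96 ~ 97.5th percentile of N(0,1)
    return mean - margin, mean + margin

# Placeholder order values, repeated to simulate n=50 observations.
order_values = [42.0, 55.5, 38.0, 61.0, 47.5, 52.0, 44.0, 58.5, 49.0, 53.5] * 5
low, high = mean_ci_95(order_values)
print(f"mean order value 95% CI: ({low:.2f}, {high:.2f})")
```

For small n, use the t-distribution's critical value instead of 1.96, and state which was used in the report's assumptions.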
Reference
Capabilities Reference
- Statistical analysis: regression, forecasting, A/B testing, segmentation, correlation
- Dashboard and report creation (Tableau, Power BI, Looker, custom)
- SQL query optimization and data warehouse management
- Python/R for analysis, modeling, and automation
- Customer analytics: lifetime value, churn prediction, RFM segmentation
- Marketing attribution and campaign ROI measurement
- Financial modeling and business performance analysis
- Data quality assurance and GDPR/CCPA compliance in analytics
Scripts
The following scripts are available in the scripts/ directory for data analysis:
scripts/analyze_csv.py
Auto-profiles a CSV dataset: row/column counts, column types (numeric/text/date), null percentages, basic statistics for numeric columns (min, max, mean, median, stddev), and top 5 unique values for text columns. Outputs markdown or JSON.
python scripts/analyze_csv.py data.csv
python scripts/analyze_csv.py data.csv --json
scripts/check_data_quality.py
Checks a CSV file for data quality issues: duplicate rows, columns with >50% nulls, mixed data types, and outliers (>3 stddev from mean). Reports a quality score (0-100) and specific issues found.
python scripts/check_data_quality.py data.csv
python scripts/check_data_quality.py data.csv --stddev 2.5 --json
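The outlier rule check_data_quality.py describes (values more than 3 standard deviations from the mean) can be sketched in a few lines; this is an illustration of the rule, not the script's actual implementation:

```python
import statistics

def find_outliers(values, threshold=3.0):
    """Return values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    stdev = statistics.stdev(values)
    if stdev == 0:
        return []          # constant column: nothing can be an outlier
    return [v for v in values if abs(v - mean) / stdev > threshold]

print(find_outliers([10] * 15 + [12] * 15 + [500]))
# → [500]
```

Note one caveat of the z-score rule: with a sample standard deviation, no point can exceed (n-1)/sqrt(n) deviations, so a 3-sigma threshold can never flag anything in samples of fewer than about 11 rows; that is one reason the script exposes `--stddev` as a tunable.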